Advent of YARV: Part 23 - Primitive
This blog series is about how the CRuby virtual machine works. If you’re new to the series, I recommend starting from the beginning. This post is about the Primitive module and the invokebuiltin instruction.
In October of 2019, Koichi Sasada proposed adding a new way for methods that were defined in C to be declared in Ruby. You can read about it in the initial proposal or view the RubyKaigi presentation.
If the code being compiled is in a core Ruby file, there are two ways to use the new syntax. You can call either __builtin_xxx or Primitive.xxx where xxx is a function name. This will compile in the invokebuiltin instruction which will call the associated C function when it is executed. Once the functions are called, another script (tool/mk_builtin_loader.rb) will run over those files, find all of the builtin calls, and create a C function that exports them.
There are two optimizations that can be applied to the invokebuiltin instruction. The first is opt_invokebuiltin_delegate. This optimization gets compiled when the arguments to the invokebuiltin instruction are the same as the arguments to the C function. The second is opt_invokebuiltin_delegate_leave which gets compiled when invokebuiltin is followed by a leave instruction. This optimization will call the C function and immediately perform the same actions as the leave instruction.
To actually see these instructions in instruction sequences, you need to use the RubyVM::InstructionSequence::of method. This method will return a RubyVM::InstructionSequence object that corresponds to the given method. You can then call disasm on that object to get back the disassembly as a string. If we call it on a method that uses the Primitive module, we can see the invokebuiltin instruction and its specializations.
Array#sample
As an example, let’s take a look at the Array#sample method. Here’s how it is declared in Ruby:
class Array
def sample(n = (ary = false), random: Random)
if Primitive.mandatory_only?
Primitive.ary_sample0
else
Primitive.ary_sample(random, n, ary)
end
end
end
This disassembles to:
== disasm: #<ISeq:sample@<internal:array>:60 (60,2)-(68,5)> (catch: false)
local table (size: 4, argc: 0 [opts: 1, rest: -1, post: 0, block: -1, kw: 1@0, kwrest: -1])
[ 4] n@0<Opt=0> [ 3] random@1 [ 2] ?@2 [ 1] ary@3
0000 putobject false ( 60)[Li]
0002 dup
0003 setlocal_WC_0 ary@3
0005 setlocal_WC_0 n@0
0007 checkkeyword 4, 0
0010 branchif 16
0012 opt_getconstant_path <ic:0 Random>
0014 setlocal_WC_0 random@1
0016 getlocal_WC_0 random@1 ( 66)[LiCa]
0018 getlocal_WC_0 n@0
0020 getlocal_WC_0 ary@3
0022 invokebuiltin <builtin!ary_sample/3>
0024 leave ( 68)[Re]
You can see the invokebuiltin instruction gets compiled in here to call ary_sample with 3 arguments.
Dir::open
Let’s take a look at another example. This time, we’ll look at the Dir::open method. Here’s how it is declared in Ruby:
class Dir
def self.open(name, encoding: nil, &block)
dir = Primitive.dir_s_open(name, encoding)
if block
begin
yield dir
ensure
Primitive.dir_s_close(dir)
end
else
dir
end
end
end
This disassembles to:
== disasm: #<ISeq:open@<internal:dir>:97 (97,2)-(108,5)> (catch: true)
== catch table
| catch type: ensure st: 0010 ed: 0014 sp: 0001 cont: 0018
| == disasm: #<ISeq:ensure in open@<internal:dir>:103 (103,8)-(103,34)> (catch: true)
| local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
| [ 1] $!@0
| 0000 getlocal_WC_1 dir@4 ( 103)[Li]
| 0002 invokebuiltin <builtin!dir_s_close/1>
| 0004 pop
| 0005 getlocal_WC_0 $!@0
| 0007 throw 0
|------------------------------------------------------------------------
local table (size: 5, argc: 1 [opts: 0, rest: -1, post: 0, block: 3, kw: 1@0, kwrest: -1])
[ 5] name@0<Arg>[ 4] encoding@1 [ 3] ?@2 [ 2] block@3<Block>[ 1] dir@4
0000 opt_invokebuiltin_delegate <builtin!dir_s_open/2>, 0 ( 98)[LiCa]
0003 setlocal_WC_0 dir@4
0005 getblockparamproxy block@3, 0 ( 99)[Li]
0008 branchunless 19
0010 getlocal_WC_0 dir@4 ( 101)[Li]
0012 invokeblock <calldata!argc:1, ARGS_SIMPLE>
0014 opt_invokebuiltin_delegate <builtin!dir_s_close/1>, 4( 103)[Li]
0017 pop
0018 leave ( 108)[Re]
0019 getlocal_WC_0 dir@4 ( 106)[Li]
0021 leave ( 108)[Re]
Here you can see opt_invokebuiltin_delegate is compiled in for dir_s_open and dir_s_close.
Integer#-@
As a final example, let’s look at Integer#-@. Here’s how it is declared in Ruby:
class Integer
def -@
Primitive.attr! 'inline'
Primitive.cexpr! 'rb_int_uminus(self)'
end
end
In YARV:
== disasm: #<ISeq:-@@<internal:numeric>:88 (88,2)-(91,5)> (catch: false)
0000 opt_invokebuiltin_delegate_leave <builtin!_bi0/0>, 0 ( 90)[LiCa]
0003 leave ( 91)[Re]
You can see that it’s only necessary to have a single instruction compiled in here, opt_invokebuiltin_delegate_leave.
Wrapping up
In this post we looked at invokebuiltin and its associated optimizations. There was a lot of work that went into the proposal and implementation, and it’s worth taking a glance at the issue and watching the video. This post only scratches the surface of what’s possible with invokebuiltin. A couple of things to remember from this post:
invokebuiltinis an instruction that allows CRuby developers to declare methods in Ruby that are implemented in C.- It’s possible that
invokebuiltincan enable some optimizations in the future, such as inlining.
The next post will be the last post in the blog series! We’ll wrap up the series with the final four instructions we haven’t covered yet.
← Back to home