Advent of YARV: Part 11 - Class and instance variables

This blog series is about how the CRuby virtual machine works. If you’re new to the series, I recommend starting from the beginning. This post is about how YARV handles class and instance variables.

In Ruby, objects need to be able to store state. This is done through the use of class and instance variables. This post is not about what these variables are and how to use them, I’m assuming you know that already. Instead, this post focuses on how they are implemented in YARV. There are four instructions in YARV corresponding to class and instance variables. They are the subject of today’s post.

getclassvariable

Whenever you use a class variable, YARV will compile a getclassvariable instruction. The instruction has two operands: the name of the class variable as a symbol and an inline cache.

The inline cache is used to cache the lookup to find where the class variable value is stored - not the value of the class variable itself. The cache key is the value of ruby_vm_global_cvar_state, a global variable used by the entire Ruby runtime that is incremented any time something happens in the VM that could result in a different lookup (e.g., a module being included).1

Once the storage location has been determined either through the cache or a fresh lookup, the value at that location is retrieved and pushed onto the stack.

getclassvariable

The diagram here leaves a bit to be desired, since the stack is not the most complex part of this instruction. If we were to look at this in Ruby:

class GetClassVariable
  attr_reader :name, :cache

  def initialize(name, cache)
    @name = name
    @cache = cache
  end

  def call(vm)
    clazz =
      cache.fetch do
        _self = vm._self
        _self.is_a?(Class) ? _self : _self.class
      end

    vm.stack.push(clazz.class_variable_get(name))
  end
end

In @@foo disassembly:

== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,5)> (catch: false)
0000 getclassvariable                       :@@foo, <is:0>            (   1)[Li]
0003 leave

setclassvariable

The setclassvariable is compiled whenever you assign to a class variable. It has the same operands as getclassvariable, and performs the same lookup to determine where the value is stored. Once the location is determined, the value is popped off the stack and stored there.

One interesting thing to note is that if the same class variable is being referenced within the same instruction sequence, it will use the same inline cache.

setclassvariable

In Ruby:

class SetClassVariable
  attr_reader :name, :cache

  def initialize(name, cache)
    @name = name
    @cache = cache
  end

  def call(vm)
    clazz =
      cache.fetch do
        _self = vm._self
        _self.is_a?(Class) ? _self : _self.class
      end

    clazz.class_variable_set(name, vm.stack.pop)
  end
end

In @@foo = 1 disassembly:

== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,9)> (catch: false)
0000 putobject_INT2FIX_1_                                             (   1)[Li]
0001 dup
0002 setclassvariable                       :@@foo, <is:0>
0005 leave

getinstancevariable

Instance variables used to work in a very similar way to class variables. They would cache the lookup of the storage location in the inline cache and then retrieve the value from that location. This has changed recently in Ruby 3.2 with the introduction of object shapes.

The implementation details of the object shapes mechanism are far outside the scope of this post, but the gist is that we can store instance variables in an array instead of a hash by caching the index of the instance variable in the object shape. This results in much faster access to instance variables. The value of the cache is both an object shape and the index of the instance variable in the object shape.

getinstancevariable

I’m not going to attempt to translate this into Ruby because any example I gave you would be quite a bit different from how it’s actually implemented. However, for a disassembly example, here’s @foo:

== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,4)> (catch: false)
0000 getinstancevariable                    :@foo, <is:0>             (   1)[Li]
0003 leave

setinstancevariable

Similar to getinstancevariable, setinstancevariable has an inline cache that caches the object shape and the index of the instance variable in the object shape. The value is popped off the stack and stored at the storage location indicated by these two factors.

Note that for instance variables because they use object shapes now, the inline cache is no longer shared between instructions.

setinstancevariable

Again, I’m going to forgo translating this into Ruby because it would be quite different from the actual implementation. For disassembly, here’s @foo = 1:

== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,8)> (catch: false)
0000 putobject_INT2FIX_1_                                             (   1)[Li]
0001 dup
0002 setinstancevariable                    :@foo, <is:0>
0005 leave

Wrapping up

In this post we discussed class and instance variables. We saw they are stored and use caches to improve lookup speeds. A couple of things to remember from this post:

In the next post we’ll continue looking at types of variables by digging into global variables.


  1. The inline cache for class variables didn’t come into play until Ruby 3.0. Prior to that, the instruction only had one operand and the class to look up the variable was determined every time the instruction ran. This led to many blog posts and linters deciding that class variables were bad and should be avoided on account of performance. Thanks to Eileen Uchitelle and Aaron Patterson, this is no longer the case. But you still may want to avoid them for other reasons. 

← Back to home