Ruby type conversion

Let’s talk about type conversion in Ruby.

First, we’re going to need some definitions:

With these definitions in mind, we can define the term “dynamically-typed”. Ruby is considered a dynamically-typed language because the types of the objects that are live during the execution of a Ruby program are not known at compile time; they are only known at run time. It follows that any method can receive any type of object for any argument. When a method receives an argument of a type that it doesn’t expect, it can do one of three things:

By far the worst option is the first one, as it’s abdicating responsibility to the next developer. The second option is somewhat better, in that it enforces the expected type. The issue with the second one is that it’s enforcing the nominal type (the class) as opposed to the structural type (the interface). It precludes the possibility that the object could be converted into the desired type, and instead halts execution.
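To make the contrast between enforcing a class and attempting a conversion concrete, here is a minimal sketch. The Scores class and the sum_strict and sum_lenient methods are hypothetical names invented for this illustration. The first method checks the nominal type and raises, while the second asks the argument to convert itself through Array.try_convert, which calls #to_ary when it is defined.

```ruby
# Hypothetical wrapper class that knows how to convert itself into an Array.
class Scores
  def initialize(values)
    @values = values
  end

  def to_ary
    @values
  end
end

# Enforces the nominal type: only real Array instances are accepted.
def sum_strict(values)
  raise TypeError, "expected Array, got #{values.class}" unless values.is_a?(Array)

  values.sum
end

# Attempts a conversion first and raises only when that is impossible.
def sum_lenient(values)
  array = Array.try_convert(values)
  raise TypeError, "can't convert #{values.class} into Array" if array.nil?

  array.sum
end

sum_lenient(Scores.new([1, 2, 3])) # => 6

begin
  sum_strict(Scores.new([1, 2, 3]))
rescue TypeError => error
  error.message # => "expected Array, got Scores"
end
```

Both methods accept a plain Array. Only the lenient one also accepts objects that merely behave like one.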

The third option is the subject of this blog post. Ruby is rife with examples of this kind of type conversion, especially within the standard library. This pattern has propagated into other popular Ruby projects, including Ruby on Rails. To start out, let’s look at some examples.

Type coercion in a method

Let’s look at Array#flatten. For those unfamiliar, #flatten is a method on Array that concatenates nested arrays into a single array. For example:

[1, [2, 3], [4], [5, [6, [7]]]].flatten
# => [1, 2, 3, 4, 5, 6, 7]

It also accepts as an argument the “level” of flattening (i.e., the number of times it should recursively flatten). That looks like:

[1, [2, 3], [4], [5, [6, [7]]]].flatten(2)
# => [1, 2, 3, 4, 5, 6, [7]]

(Notice that the final array containing the integer 7 is not flattened because it was three levels deep.)

This is the base case, and works just fine in the examples provided above. However, there are times when the values inside the array are not so primitive. Let’s consider an example where we have a custom class that functions as a list of elements (and maintains its own internal array). In that case, if we call #flatten on an array of those objects, we’ll get back the same array, as in:

class List
  def initialize(elems)
    @elems = elems
  end
end

[List.new([1, 2]), List.new([3, 4]), List.new([5, 6])].flatten
# => [#<List @elems=[1, 2]>, #<List @elems=[3, 4]>, #<List @elems=[5, 6]>]

However, if we define the #to_ary type coercion method (remember that’s the implicit option), then Ruby will happily call this method for us, as in:

class List
  def to_ary
    @elems
  end
end

[List.new([1, 2]), List.new([3, 4]), List.new([5, 6])].flatten
# => [1, 2, 3, 4, 5, 6]

You can see how it’s implemented in various Ruby implementations like CRuby, TruffleRuby, and Rubinius. You can also see how it’s implemented in the Sorbet type system, which handles these kinds of special conversions in what it calls “intrinsic” methods.

This is an example of type coercion that happens because of a standard library method call. But as mentioned above in the definition of type coercion, it can also happen as a result of syntax.

Type coercion from syntax

Let’s say we have an array of integers and we want to convert them into an array of strings. We can accomplish that with the #map method that accepts a block and a type cast (the explicit option), as in:

[1, 2, 3].map { |value| value.to_s }
# => ["1", "2", "3"]

We can also create a proc that will be used for the mapping and then pass it as the block for the method by using the & unary operator, as in:

mapping = proc { |value| value.to_s }
[1, 2, 3].map(&mapping)

From the perspective of the #map method these two snippets are equivalent, since in both cases it receives a block. There’s also the more terse version of this type conversion that most Rubyists prefer, which is to instead pass a symbol that represents the mapping, as in:

mapping = :to_s
[1, 2, 3].map(&mapping)
# => ["1", "2", "3"]

In the above snippet, we’re taking advantage of the fact that Symbol has the #to_proc implicit type conversion method defined on it, which Ruby will take advantage of when using the & operator. In fact, it will do this on any object that has that conversion method defined, as in:

class Double
  def to_proc
    proc { |value| value * 2 }
  end
end

[1, 2, 3].map(&Double.new)
# => [2, 4, 6]

This is an example of syntax implicitly calling methods on objects in order to achieve a type conversion.

Type conversion interfaces

Some languages have explicit interface constructs that allow the developer to define a set of methods that an object should respond to in order to say they “implement” that interface. (These are also sometimes called traits.) In Ruby they are implicit, though this doesn’t make them any less common. Here is a list of the most common ones that I could find in use within the standard library:

There are also the relatively new pattern matching interfaces:

As well as some very esoteric ones, like sleep accepting anything that responds to divmod.

All of these methods can be used for type coercion or type casting, depending on the context. Sometimes Ruby will trigger these method calls implicitly but any developer can also call these methods explicitly. Below I’ll go into some more details about where you can find these conversions and how they’re used.

to_a

Used by the Kernel.Array and Queue#initialize methods, but much more commonly used by the splat operator. For example, if you have an object that responds to #to_a, it can be splatted into an array, as in:

class List
  def initialize(elems)
    @elems = elems
  end

  def to_a
    @elems
  end
end

[1, *List.new([2, 3]), 4]
# => [1, 2, 3, 4]

to_ary

List of methods
  • Array#&
  • Array#+
  • Array#-
  • Array#<=>
  • Array#==
  • Array#[]=
  • Array#concat
  • Array#difference
  • Array#flatten
  • Array#flatten!
  • Array#intersection
  • Array#join
  • Array#product
  • Array#replace
  • Array#to_h
  • Array#transpose
  • Array#union
  • Array#zip
  • Array.initialize
  • Array.new
  • Array.try_convert
  • Enumerable#flat_map
  • Enumerable#collect_concat
  • Enumerable#to_h
  • Enumerable#zip
  • Hash#to_h
  • Hash.[]
  • IO#puts
  • Kernel.Array
  • Proc#===
  • Proc#call
  • Proc#yield
  • Process.exec
  • Process.spawn
  • String#%
  • Struct#to_h

#to_ary is used in syntax for destructuring and multiple assignment. For example, let’s say you have a Point class that represents a point in 2D space, and you want to print out just the x coordinate from a list of points. You could define the class such as:

class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end
end

Then when you loop through to print the x coordinate, you would:

[Point.new(1, 2), Point.new(3, 4)].each { |point| puts point.x }

This works, but you can take advantage of the fact that block arguments can destructure values by defining #to_ary, as in:

class Point
  def to_ary
    [x, y]
  end
end

[Point.new(1, 2), Point.new(3, 4)].each { |(x, y)| puts x }

Similarly, you can destructure within multiple assignment as in:

x, y = Point.new(1, 2)
x
# => 1

to_hash

List of methods
  • Enumerable#tally
  • Hash#merge
  • Hash#replace
  • Hash#update
  • Hash.[]
  • Hash.try_convert
  • Kernel.Hash
  • Process.spawn

#to_hash is used with the double splat (**) operator. For example, if you have some kind of object that represents parameters being sent to an HTTP endpoint, you can:

class Parameters
  def initialize(params)
    @params = params
  end

  def to_hash
    @params
  end
end

class Job
  def initialize(foo:, bar:); end
end

parameters = Parameters.new(foo: "foo", bar: "bar")
Job.new(**parameters)

In the above example, we’re taking advantage of the implicit type conversion performed by the double splat operator in order to call #to_hash on our Parameters object.

to_s

List of methods
  • Array#inspect
  • Array#pack
  • Array#to_s
  • Exception#to_s
  • File#printf
  • Hash#to_s
  • IO#binwrite
  • IO#print
  • IO#puts
  • IO#syswrite
  • IO#write
  • IO#write_nonblock
  • Kernel#String
  • Kernel#warn
  • Kernel.sprintf
  • String#%
  • String#gsub
  • String#sub

#to_s is called implicitly whenever an object is used within string interpolation. (It’s slightly inconsistent in that most of the time in Ruby #to_str is used to implicitly convert to a String.) So, for example, if you have:

"#{123}"

this is equivalent to calling:

123.to_s

This is why most Ruby linters will report a violation for the code "#{123.to_s}": the explicit call is redundant.
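The same applies to your own classes. As a small sketch, the hypothetical Money class below defines #to_s, and interpolation picks it up without any explicit call.

```ruby
# Hypothetical class used only to illustrate interpolation calling #to_s.
class Money
  def initialize(cents)
    @cents = cents
  end

  def to_s
    format("$%.2f", @cents / 100.0)
  end
end

"Total: #{Money.new(1999)}" # => "Total: $19.99"
```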

to_str

List of methods
  • Array#*
  • Array#join
  • Array#pack
  • Binding#local_variable_defined?
  • Binding#local_variable_set
  • Dir.chdir
  • ENV.[]
  • ENV.[]=
  • ENV.assoc
  • ENV.rassoc
  • ENV.store
  • Encoding.Converter.asciicompat_encoding
  • Encoding.Converter.new
  • Encoding.default_external=
  • Encoding.default_internal=
  • File#delete
  • File#path
  • File#to_path
  • File#unlink
  • File.chmod
  • File.join
  • File.new
  • File.split
  • IO#each
  • IO#each_line
  • IO#gets
  • IO#read
  • IO#readlines
  • IO#set_encoding
  • IO#sysread
  • IO#ungetbyte
  • IO#ungetc
  • IO.for_fd
  • IO.foreach
  • IO.new
  • IO.open
  • IO.pipe
  • IO.popen
  • IO.printf
  • IO.readlines
  • Kernel#gsub
  • Kernel#instance_variable_get
  • Kernel#open
  • Kernel#remove_instance_variable
  • Kernel#require
  • Kernel#require_relative
  • Kernel.`
  • Module#alias_method
  • Module#attr
  • Module#attr_accessor
  • Module#attr_reader
  • Module#attr_writer
  • Module#class_eval
  • Module#class_variable_defined?
  • Module#class_variable_get
  • Module#class_variable_set
  • Module#const_defined?
  • Module#const_get
  • Module#const_set
  • Module#const_source_location
  • Module#method_defined?
  • Module#module_eval
  • Module#module_function
  • Module#protected_method_defined?
  • Module#remove_const
  • Process.getrlimit
  • Process.setrlimit
  • Process.spawn
  • Regexp.union
  • String#%
  • String#+
  • String#<<
  • String#<=>
  • String#==
  • String#===
  • String#[]=
  • String#casecmp
  • String#center
  • String#chomp
  • String#concat
  • String#count
  • String#crypt
  • String#delete
  • String#delete_prefix
  • String#delete_suffix
  • String#each_line
  • String#encode
  • String#encode!
  • String#force_encoding
  • String#include?
  • String#index
  • String#initialize
  • String#insert
  • String#lines
  • String#ljust
  • String#partition
  • String#prepend
  • String#replace
  • String#rjust
  • String#rpartition
  • String#scan
  • String#split
  • String#squeeze
  • String#sub
  • String#tr
  • String#tr_s
  • String#unpack
  • String.try_convert
  • Thread#name=
  • Time#getlocal
  • Time#localtime
  • Time.gm
  • Time.local
  • Time.mktime
  • Time.new

This is the main conversion method for objects into strings. It’s used all over the place in the standard library. This interface is very popular and can be seen even in recent pull requests to Ruby on Rails.
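As a rough illustration, the hypothetical Name class below defines #to_str, which lets it stand in for a String in a few of the methods listed above. The class and its values are made up for this sketch.

```ruby
# Hypothetical class that can be used wherever a String is expected.
class Name
  def initialize(first, last)
    @first = first
    @last = last
  end

  def to_str
    "#{@first} #{@last}"
  end
end

name = Name.new("Yukihiro", "Matsumoto")

# String#+ converts its argument with #to_str.
"Hello, " + name # => "Hello, Yukihiro Matsumoto"

# So does String#include?.
"Hello, Yukihiro Matsumoto".include?(name) # => true

# String.try_convert returns the converted String, or nil when the
# object does not respond to #to_str.
String.try_convert(name) # => "Yukihiro Matsumoto"
String.try_convert(123)  # => nil
```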

to_sym

This is a relatively common way to convert strings into symbols, but doesn’t get a ton of usage internally. The only method I could find in the standard library that converted an argument into a symbol using #to_sym was TracePoint.new.

to_proc

Another one that doesn’t get used a ton internally. The only place I could find that used this in a method call was Hash#default_proc= (which will convert its only argument into a callable proc if it isn’t already one). #to_proc does get triggered through syntax, however, when passing a block argument (see the example above).

to_io

List of methods
  • File.directory?
  • File.size
  • File.size?
  • FileTest.directory?
  • IO#reopen
  • IO.select
  • IO.try_convert

This one is less well-known, but gets used a lot within the IO and File class to convert method arguments into objects that can be used as IO-like objects.
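Here is a minimal sketch of the idea, using a hypothetical Connection class that wraps one end of a pipe. Both IO.try_convert and IO.select appear in the list above; the class name and the "ping" payload are assumptions for illustration.

```ruby
# Hypothetical wrapper that exposes its underlying IO through #to_io.
class Connection
  def initialize(io)
    @io = io
  end

  def to_io
    @io
  end
end

reader, writer = IO.pipe
connection = Connection.new(reader)

# IO.try_convert calls #to_io and hands back the wrapped IO object.
IO.try_convert(connection).equal?(reader) # => true

writer.write("ping")
writer.close

# IO.select accepts the wrapper directly because it responds to #to_io.
ready, = IO.select([connection], nil, nil, 1)

reader.read # => "ping"
```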

to_i

List of methods
  • Complex#to_i
  • File#printf
  • Kernel#Integer
  • Kernel.Integer
  • Kernel.sprintf
  • Numeric#to_int
  • String#%

This begins a series of conversion methods for Numeric subtypes, which all convert into each other. #to_i converts into an Integer object. It’s not all that commonly used in implicit ways compared to #to_int, but still gets some usage in the methods listed above. Much more commonly this is the method that is called on a string to convert it into an integer (and it accepts a radix as an argument for this purpose).
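A few quick calls show the string case, including the radix argument and the lenient parsing that distinguishes String#to_i from the stricter Kernel#Integer:

```ruby
"42".to_i      # => 42
"ff".to_i(16)  # => 255
"1010".to_i(2) # => 10

# String#to_i is forgiving: it reads as many leading digits as it can.
"12abc".to_i # => 12
"abc".to_i   # => 0

# Kernel#Integer is strict and raises on input it cannot fully parse.
begin
  Integer("12abc")
rescue ArgumentError => error
  error.class # => ArgumentError
end
```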

to_int

List of methods
  • Array#*
  • Array#[]
  • Array#[]=
  • Array#at
  • Array#cycle
  • Array#delete_at
  • Array#drop
  • Array#fetch
  • Array#fill
  • Array#first
  • Array#flatten
  • Array#hash
  • Array#initialize
  • Array#insert
  • Array#last
  • Array#pack
  • Array#pop
  • Array#rotate
  • Array#sample
  • Array#shift
  • Array#shuffle
  • Array#slice
  • Array.new
  • Encoding::Converter#primitive_convert
  • Enumerable#cycle
  • Enumerable#drop
  • Enumerable#each_cons
  • Enumerable#each_slice
  • Enumerable#first
  • Enumerable#take
  • Enumerable#with_index
  • File#chmod
  • File#printf
  • File.fnmatch
  • File.fnmatch?
  • File.umask
  • IO#gets
  • IO#initialize
  • IO#lineno=
  • IO#pos
  • IO#putc
  • IO#tell
  • IO.for_fd
  • IO.foreach
  • IO.new
  • IO.open
  • IO.readlines
  • Integer#*
  • Integer#+
  • Integer#-
  • Integer#<<
  • Integer#>>
  • Integer#[]
  • Integer#allbits?
  • Integer#anybits?
  • Integer#nobits?
  • Integer#round
  • Kernel#Integer
  • Kernel#exit!
  • Kernel#exit
  • Kernel#putc
  • Kernel.Integer
  • Kernel.exit!
  • Kernel.exit
  • Kernel.putc
  • Kernel.rand
  • Kernel.sprintf
  • Kernel.srand
  • MatchData#begin
  • MatchData#end
  • Process.getrlimit
  • Process.setrlimit
  • Random#seed
  • Random.rand
  • Range#first
  • Range#last
  • Range#step
  • Regexp.last_match
  • String#%
  • String#*
  • String#[]
  • String#[]=
  • String#byteslice
  • String#center
  • String#index
  • String#insert
  • String#ljust
  • String#rindex
  • String#rjust
  • String#setbyte
  • String#slice
  • String#split
  • String#sum
  • String#to_i
  • Time#getlocal
  • Time#localtime
  • Time.at
  • Time.gm
  • Time.local
  • Time.mktime
  • Time.new
  • Time.utc

This is a very commonly used conversion method for converting to Integer. A ton of methods will call this on arguments passed in to allow any kind of object to be used. It can also be triggered implicitly by assigning to the $. global variable (the number of the last line read from input). I have no idea why that’s in there, but it is.
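As a small sketch of what this looks like in practice, the hypothetical Position class below defines #to_int and can then be passed to several of the methods listed above in place of an Integer.

```ruby
# Hypothetical class that can be used wherever an Integer is expected.
class Position
  def initialize(index)
    @index = index
  end

  def to_int
    @index
  end
end

letters = %w[a b c d]
position = Position.new(2)

letters[position]       # => "c"
letters.first(position) # => ["a", "b"]
"ab" * position         # => "abab"
Integer(position)       # => 2
```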

to_f

List of methods
  • Complex#to_f
  • File#printf
  • Integer#coerce
  • Kernel#Float
  • Kernel.Float
  • Kernel.sprintf
  • Math.cos
  • Numeric#ceil
  • Numeric#coerce
  • Numeric#fdiv
  • Numeric#floor
  • Numeric#round
  • Numeric#truncate
  • String#%

A method of converting an object into a float. Not all that commonly used except for when a developer wants to avoid integer division.
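For instance, dividing two integers truncates, while converting one operand with #to_f first does not. The sketch below also shows Kernel#Float calling #to_f on a hypothetical Celsius class that exists only for this example.

```ruby
7 / 2      # => 3
7.to_f / 2 # => 3.5

# Hypothetical class that converts itself into a Float on request.
class Celsius
  def initialize(degrees)
    @degrees = degrees
  end

  def to_f
    @degrees.to_f
  end
end

Float(Celsius.new(21)) # => 21.0
```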

to_c

One of the most esoteric interfaces, if not the most esoteric, that I could find in the standard library. This is a way of converting an object into a Complex number type, which is only used by the Kernel.Complex method.

to_r

List of methods
  • Complex#to_r
  • Numeric#denominator
  • Numeric#numerator
  • Numeric#quo
  • Time#+
  • Time#-
  • Time#getlocal
  • Time#localtime
  • Time#new
  • Time.at

The final numeric conversion interface. #to_r is used to convert an object into a rational number. It gets most of its usage in the Time class.

to_regexp

A less-used interface for converting an object into a regular expression, this only gets used internally within the Regexp class in the Regexp.try_convert and Regexp.union methods.
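A minimal sketch of both methods, using a hypothetical Extension class that converts itself into a pattern matching a file extension:

```ruby
# Hypothetical class that can convert itself into a Regexp.
class Extension
  def initialize(name)
    @name = name
  end

  def to_regexp
    /\.#{Regexp.escape(@name)}\z/
  end
end

pattern = Regexp.try_convert(Extension.new("rb"))
pattern.match?("app.rb") # => true
pattern.match?("app.py") # => false

# Objects that do not respond to #to_regexp yield nil.
Regexp.try_convert("rb") # => nil

# Regexp.union converts each argument before combining them.
union = Regexp.union(Extension.new("rb"), Extension.new("py"))
union.match?("app.py") # => true
```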

to_path

List of methods
  • Dir#initialize
  • Dir.[]
  • Dir.chdir
  • Dir.children
  • Dir.chroot
  • Dir.each_child
  • Dir.entries
  • Dir.foreach
  • Dir.glob
  • Dir.mkdir
  • File#path
  • File#to_path
  • File.ftype
  • File.join
  • File.mkfifo
  • File.new
  • File.realpath
  • File::Stat#initialize
  • IO#reopen
  • IO#sysopen
  • IO.copy_stream
  • IO.foreach
  • IO.read
  • IO.readlines
  • Kernel#autoload
  • Kernel#open
  • Kernel#require
  • Kernel#require_relative
  • Kernel#test
  • Module#autoload
  • Process.spawn

I like this one a lot because it’s one of the few on this list that is named after the role that the converted object will fulfill as opposed to the type of object that is expected. That is to say, #to_path converts an object into a String that will function as the representation of a filepath. It’s used mostly within the Dir, File, and IO classes.
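To illustrate, here is a sketch with a hypothetical Upload class whose #to_path returns the location where the file would live. The directory layout is an assumption made up for the example, and nothing is read from disk.

```ruby
# Hypothetical class that knows which filepath it represents.
class Upload
  def initialize(name)
    @name = name
  end

  def to_path
    "uploads/#{@name}"
  end
end

upload = Upload.new("avatar.png")

File.join("public", upload) # => "public/uploads/avatar.png"
File.basename(upload)       # => "avatar.png"
File.extname(upload)        # => ".png"
```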

to_enum

Enumerable#zip interestingly allows you to pass any object that responds to #to_enum. This was the only mention of this method that I could find in the standard library.

to_open

Similarly to #to_enum, #to_open is also only used in one place: Kernel#open. Anything that you pass to that method that responds to #to_open will be converted implicitly.
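A short sketch of that behavior, using a hypothetical Greeting class whose #to_open returns an in-memory StringIO rather than a real file:

```ruby
require "stringio"

# Hypothetical class that decides for itself what "opening" it means.
class Greeting
  def to_open(*)
    StringIO.new("hello")
  end
end

# Kernel#open sees that the argument responds to #to_open and returns
# whatever that method returns.
io = open(Greeting.new)
io.read # => "hello"
```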

Pattern matching

When pattern matching was introduced into Ruby, we got two additional methods for implicit type conversion: deconstruct and deconstruct_keys.

deconstruct

If you’re matching against an object as if it were an array, then deconstruct will be called implicitly. For example:

class List
  def initialize(elems)
    @elems = elems
  end

  def deconstruct
    @elems
  end
end

case List.new([1, 2, 3])
in 1, 2, 3
  # we've matched here successfully!
in *, 2, *
  # we would match here successfully too!
end

deconstruct_keys

If you’re matching against an object as if it were a hash, then deconstruct_keys will be called implicitly. For example:

class Parameters
  def initialize(params)
    @params = params
  end

  # keys is an array of the requested keys, or nil when every key is needed.
  def deconstruct_keys(keys)
    keys ? @params.slice(*keys) : @params
  end
end

case Parameters.new(foo: 1, bar: 2)
in foo: Integer
  # we've matched here successfully!
in foo: 1, bar: 2
  # we would match here successfully too!
end

coerce

No discussion of type conversions in Ruby would be complete without mentioning the coerce method. coerce is an interesting little method that is used for converting between different numeric types. It allows you to effectively hook into methods like Integer#* without having to monkey-patch it. Say, for example, you were defining your own special class that you wanted to support numeric computations:

class Value
  attr_reader :number

  def initialize(number)
    @number = number
  end

  def *(other)
    Value.new(number * other)
  end
end

value = Value.new(2)
value * 3
# => #<Value @number=6>

This works well. However, if you reverse the operands for the * operator, Ruby breaks it down to 3.*(value), which results in TypeError (Value can't be coerced into Integer). If you define the coerce method, however, this can be accomplished, as in:

class Value
  def *(other)
    case other
    when Numeric
      Value.new(number * other)
    when Value
      Value.new(number * other.number)
    else
      if other.respond_to?(:coerce)
        self_equiv, other_equiv = other.coerce(self)
        self_equiv * other_equiv
      else
        raise TypeError, "#{other.class} can't be coerced into #{self.class}"
      end
    end
  end

  def coerce(other)
    case other
    when Numeric
      [Value.new(other), self]
    when Value
      [other, self]
    else
      raise TypeError, "#{self.class} can't be coerced into #{other.class}"
    end
  end
end

For good measure I’ve changed * to attempt to coerce its argument as well. Now all of the numeric types play nicely and you can run 3 * value without it breaking. You can see another example of this kind of coerce implementation in the standard library in the Matrix class as well as in Ruby on Rails in the Duration class.
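To see the round trip end to end, here is a condensed, self-contained variant of the Value class. It is a sketch that keeps only the branches needed to show how Integer#* and Float#* fall back to coerce. It is not a drop-in replacement for the fuller version.

```ruby
class Value
  attr_reader :number

  def initialize(number)
    @number = number
  end

  def *(other)
    case other
    when Value then Value.new(number * other.number)
    when Numeric then Value.new(number * other)
    else raise TypeError, "#{other.class} can't be coerced into #{self.class}"
    end
  end

  # Called by Integer#* and Float#* when they receive a Value they do not
  # understand. Returns a pair whose first element responds to * itself.
  def coerce(other)
    case other
    when Numeric then [Value.new(other), self]
    else raise TypeError, "#{self.class} can't be coerced into #{other.class}"
    end
  end
end

value = Value.new(2)

(value * 3).number   # => 6
(3 * value).number   # => 6
(1.5 * value).number # => 3.0
```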

Conclusion

Type conversion is a subtle art baked into the very syntax of the Ruby programming language. As with most everything programming related, use it with caution and with the context of the team that will be maintaining the software you’re writing. Especially with type coercion, it’s very easy to write code that is very difficult to reason about.

That said, using the existing interfaces from the standard library and defining interfaces within your own applications can lead to very beautiful code that never needs to perform type checks.
