Ruby Association Intermediate Report
In accordance with the Ruby Association’s timeline, this is an intermediate report on the Ruby formatter project.
When I first proposed the project, I listed the following deliverables:
- A definitive representation of the Ruby AST based on ripper. It would be an additional parser shipped with Ruby (like Ripper::SexpBuilderPP). The difference is that every node has location information available on it. It will also involve documentation of every node type being shipped along with the parser.
- Updates and enhancements to the prettyprint gem. prettyprint does not currently support all of the various node types that will be necessary, so another pull request will merge additional functionality into the prettyprint gem. This has the added benefit of allowing other developers to build formatters with the same infrastructure.
- The formatter itself that will convert the nodes from the ripper parser into the
- A CLI for formatting files (this could be baked into the Ruby CLI as well). This will trigger the formatter on each of the files given.
- A language server that supports the formatOnSave option. The idea here is to trigger the formatter whenever the developer hits save and watch everything snap into place.
Progress for each bullet is detailed below.
I’ve created an additional ripper subclass here: https://github.com/kddnewton/syntax_tree/blob/main/lib/syntax_tree.rb. This file lives within the published syntax_tree gem. Each node contains an instance of a SyntaxTree::Location object that can be used to get definitive information about where it appears in the source. Each node also provides attr_reader methods for each of the child nodes, which are all documented.
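To illustrate the general idea of attaching location information to nodes, here is a minimal sketch built directly on ripper. It is not the syntax_tree implementation: LocatedParser, Node, and Location are hypothetical names invented for this example, and it only records identifier tokens. It relies on Ripper#lineno and Ripper#column, which report the position of the token currently being handled.

```ruby
require "ripper"

# Hypothetical illustration only; these are not the syntax_tree classes.
class LocatedParser < Ripper
  Location = Struct.new(:line, :column)
  Node = Struct.new(:type, :value, :location)

  attr_reader :idents

  def initialize(source)
    super(source)
    @idents = []
  end

  # Scanner event fired for every identifier token. lineno and column
  # describe where the current token starts in the source.
  def on_ident(value)
    node = Node.new(:ident, value, Location.new(lineno, column))
    @idents << node
    node
  end
end

parser = LocatedParser.new("foo = bar\n")
parser.parse
parser.idents.each do |node|
  puts "#{node.value} at line #{node.location.line}, column #{node.location.column}"
end
```

A full tree builder would do the same for every scanner and parser event, so that each node carries the span of source it came from.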
As a part of this work, I’ve also added documentation to all of the various node types that ship with ripper here: https://kddnewton.com/ripper-docs/. Ideally, I’d like to upstream both the syntax_tree AST builder and the ripper documentation to make it easier for others to contribute and maintain it as a part of CRuby.
In order to support all of the necessary formatting capabilities of a Ruby language formatter, I’ve opened a pull request (https://github.com/ruby/ruby/pull/5163) against Ruby that adds a bunch of new functionality to the prettyprint gem. That pull request explains in detail why the changes are necessary and how the gem is impacted.
The formatter itself is baked into the syntax_tree gem. Each node has its own corresponding format method (that functions in the same spirit as the pretty_print method convention of accepting a PrettyPrint object). For example, here is the code that handles formatting an ARef node (a node in the syntax tree that corresponds to accessing a collection at an index like collection[index]).
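The snippet below is not the syntax_tree implementation. It is a minimal sketch of the same pattern using only the prettyprint library that ships with Ruby: a hypothetical ARefSketch struct whose format method accepts a PrettyPrint object and describes where the output is allowed to break. The struct, its fields, and the render helper are all assumptions made for illustration.

```ruby
require "prettyprint"

# Hypothetical stand-in for a syntax tree node; not the real ARef class.
ARefSketch = Struct.new(:collection, :index) do
  # Mirrors the pretty_print convention: receive the printer and emit
  # text plus optional break points into it.
  def format(q)
    q.group do
      q.text(collection)
      q.text("[")
      q.nest(2) do
        q.breakable("") # becomes a newline only if the group does not fit
        q.text(index)
      end
      q.breakable("")
      q.text("]")
    end
  end
end

# Helper (also hypothetical) that renders a node at a given maximum width.
def render(node, width)
  q = PrettyPrint.new(+"", width)
  node.format(q)
  q.flush
  q.output
end

node = ARefSketch.new("collection", "index")
puts render(node, 80) # fits on one line
puts render(node, 12) # too narrow, so the group breaks across lines
```

The point of the design is that the node only declares groups and break points; the printer decides whether a group fits on one line for the configured width.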
As of the latest commit on the main branch, syntax_tree supports all of the Ruby 3.1 syntax. As an additional guarantee of stability, I’ve added a test to the test suite that formats all of the files shipped with Ruby twice to check for idempotency.
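The idempotency property being tested is that formatting already-formatted code changes nothing. A minimal sketch of such a check follows. It takes any callable as the formatter, so it does not depend on syntax_tree; the non_idempotent helper and the two toy formatters are hypothetical and exist only to make the example runnable.

```ruby
# Returns the names of sources whose formatted output changes when it is
# formatted a second time. `sources` maps a name to its contents and
# `formatter` is anything that responds to call(source) with a string.
def non_idempotent(sources, formatter)
  sources.reject do |_name, source|
    once = formatter.call(source)
    formatter.call(once) == once
  end.keys
end

# Toy formatters, for illustration only.
strip_trailing = ->(source) { source.gsub(/[ \t]+$/, "") } # stable
append_newline = ->(source) { source + "\n" }              # never stable

sources = { "a.rb" => "a = 1  \n", "b.rb" => "b = 2\n" }

p non_idempotent(sources, strip_trailing)
p non_idempotent(sources, append_newline)
```

With the gem installed, the formatter argument could be the gem's own entry point (assuming it exposes something like SyntaxTree.format(source)) and the sources could be read from the Ruby source tree.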
The syntax_tree gem now ships with an stree executable that functions as a CLI for formatting files. It provides a lot of additional functionality as well (like displaying the syntax tree or the doc node print tree). One additional nicety is the ability to run stree check **/*, which will exit 1 if any files are not formatted as expected (which allows running this in a continuous integration environment).
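The logic behind a check mode like this is small: format each file, compare against what is on disk, and turn the result into an exit status. The sketch below shows that logic only. It is not the stree source; the check helper and the toy formatter are hypothetical, and the sources are passed in as a hash rather than read from disk to keep the example self-contained.

```ruby
# Returns 0 if every source is already formatted and 1 otherwise,
# printing a warning for each source that would change.
def check(sources, formatter)
  unformatted = sources.reject { |_path, source| formatter.call(source) == source }.keys
  unformatted.each { |path| warn("[warn] #{path}") }
  unformatted.empty? ? 0 : 1
end

# Toy formatter for illustration: strips trailing whitespace.
strip_trailing = ->(source) { source.gsub(/[ \t]+$/, "") }

status = check({ "a.rb" => "a = 1  \n", "b.rb" => "b = 2\n" }, strip_trailing)
puts "exit status would be #{status}"
```

A real CLI would pass the returned value to exit so that a CI job fails when any file is not formatted.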
Recently, I added language server support to the syntax_tree project to support integrating with editors that support the language server protocol. The code for that lives here. Currently it supports the textDocument/formatting request type, which allows formatOnSave functionality (you can turn this on in your editor of choice today by manually bundling the language server).
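For readers unfamiliar with the protocol, a textDocument/formatting response is a JSON-RPC message whose result is a list of text edits. The sketch below builds one edit that replaces the whole document with the formatted text and frames it with the Content-Length header the protocol requires. It is an illustration, not the syntax_tree language server: formatting_response, frame, and the toy formatter are hypothetical names.

```ruby
require "json"

# Builds a response to a textDocument/formatting request. The single
# TextEdit spans from the start of the document to a line at or past its
# end; the protocol clamps out-of-range line numbers to the document end.
def formatting_response(id, source, formatter)
  {
    jsonrpc: "2.0",
    id: id,
    result: [
      {
        range: {
          start: { line: 0, character: 0 },
          end: { line: source.lines.size, character: 0 }
        },
        newText: formatter.call(source)
      }
    ]
  }
end

# Frames a message the way the base protocol expects: a Content-Length
# header in bytes, a blank line, then the JSON body.
def frame(message)
  body = JSON.generate(message)
  "Content-Length: #{body.bytesize}\r\n\r\n#{body}"
end

# Toy formatter for illustration: strips trailing whitespace.
strip_trailing = ->(source) { source.gsub(/[ \t]+$/, "") }

puts frame(formatting_response(1, "a = 1  \nb = 2\n", strip_trailing))
```

An editor with formatOnSave enabled sends this request on every save and applies the returned edits to the buffer.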
One additional piece of functionality that the language server provides is the custom syntaxTree/visualizing request. This request returns the syntax tree of the file corresponding to the given URI, which allows the requesting editor to display a tree-like structure inline with the code being edited. In VSCode, if you execute syntaxTree.visualize, it will now open a side-by-side tab with the displayed tree.
I still have lots of functionality I’d like to bring to syntax_tree and its related projects. I also have some far-flung dreams that may or may not come to fruition. First, here are the things that I definitely intend to complete before the end of this project:
- I’d like to upstream the syntax_tree AST builder, the ripper documentation, and the prettyprint updates. I’ll be asking for feedback on them soon, but ideally all of this would ship with CRuby to make it easier for it to stay up-to-date with syntax changes as they are built.
- I’d like to enhance the language server to not only provide the syntax tree as a custom request but also support it on hover, so that you can hover over any syntactic structure in your code and have it explained. I think this will help both new programmers coming to Ruby learn the syntax and veteran Rubyists learn new syntax as it comes out.
- I’d like to enhance the language server to better support incremental changes. At the moment, each time a file is changed the entire file is reparsed. This isn’t necessary because the change request comes with the changed ranges. We can instead reparse only the subset of the file that changed and replace the enclosing nodes in the tree that correspond to those changes.
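As a rough illustration of the incremental idea in the last bullet, the sketch below covers only the first step: applying a ranged change from the editor to the text the server already holds, instead of receiving the whole file again. Mapping the changed range back to enclosing tree nodes is not shown. The helper names are hypothetical, and the sketch assumes ASCII-only source so that the protocol's character offsets line up with Ruby string indices.

```ruby
# Converts a zero-based (line, character) position into a string offset.
# Assumes ASCII-only text; the protocol counts characters in UTF-16 code
# units, which this sketch does not handle.
def offset_for(source, line, character)
  source.lines.first(line).sum(&:length) + character
end

# Applies one ranged change, shaped like an entry in the contentChanges
# list of a textDocument/didChange notification, to the buffered source.
def apply_change(source, change)
  range = change.fetch(:range)
  from = offset_for(source, range[:start][:line], range[:start][:character])
  to = offset_for(source, range[:end][:line], range[:end][:character])
  source[0...from] + change.fetch(:text) + source[to..]
end

source = "foo = 1\nbar = 2\n"
change = {
  range: { start: { line: 1, character: 6 }, end: { line: 1, character: 7 } },
  text: "42"
}

puts apply_change(source, change)
```

Once the server knows the byte range that changed, it can look for the smallest node whose location covers that range and reparse only that portion.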
That’s the extent of the work that corresponds to the grant proposal. However, I have additional goals for future work beyond the scope of this grant. These include:
- A well-defined interface for programmatic code modification functionality. Currently you can replace nodes and have them formatted correctly, meaning you can programmatically change Ruby code. However, doing this is definitely not easy and requires a lot of knowledge of syntax_tree internals. Ideally this would be a lot easier.
- A backend for the parser gem. Ideally I’d like to create an interface layer that would convert syntax_tree nodes into their parser gem counterparts. I’d like to do this because it would make it trivial for gems that are consumers of the parser gem to switch to using syntax_tree as the parser for some additional speed boosts. Note that this wouldn’t mean switching off the parser gem, it would just mean that the parsing would be faster.
- Decoupling the parsing functionality (the ripper subclass) from the syntax_tree node definitions. In the far future, this could potentially mean being able to switch out the parser backing ripper to some other tool but maintaining all of the functionality built into the various node types.
The final report for this grant is due March 18th, and I will be publishing it here. If you’re interested in following this project, you can watch the syntax_tree repository or check/subscribe to this blog.