Ruby is a single pass parser! Eeck!

Okay, I’ve been using Ruby for a while now and I really like it, so don’t flame me. BUT! Ruby appears to be a single pass parser. This means that if you have a class that uses another class in the same file, you have to define them in the correct order. Most languages fixed this a long time ago with a two pass parsing strategy, where you collect the symbols in the first pass and then verify and compile in the second pass (or some variation on that). So I’m a bit annoyed at Ruby for missing the boat on that one. Hopefully they fix this.

One solution is to define the class in a separate file and use a require to include it. This seems to be the best solution at the moment and will support modules as well, but really Ruby should handle both types of declarations regardless of ordering if it intends to let you define multiple classes in a single file.
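
As a rough sketch of that workaround, the snippet below puts a helper class in its own file and pulls it in with `require` before anything calls it. The file name `test2.rb`, the class names, and the temp-directory setup are assumptions made only so the example runs as one script; in a real project the class would simply live in its own file next to the caller.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical layout: Test2 lives in its own file, test2.rb.
# The file is generated here only to keep this sketch self-contained.
dir = Dir.mktmpdir
File.write(File.join(dir, "test2.rb"), <<~RUBY)
  class Test2
    def bar
      "bar"
    end
  end
RUBY

# require runs the file top to bottom, so Test2 exists from this line on.
require File.join(dir, "test2")

class Test1
  def foo
    Test2.new.bar
  end
end

result = Test1.new.foo
puts result # prints bar

# Clean up the temporary directory created for the sketch.
FileUtils.remove_entry(dir)
```

The `require` has to run before the first call that needs `Test2`; putting it at the top of the file is the usual way to guarantee that.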

6 thoughts on “Ruby is a single pass parser! Eeck!”

  1. Ruby is an interpreted language. Therefore you’ll have problems when you evaluate code that causes the evaluation of unknown symbols. The code below runs fine, though.

    class Account
      def initialize
        @withdrawals = []
      end
    
      def withdraw(amount)
        @withdrawals
      end
    end
    


  2. Comment fixed somewhat. The code still isn’t complete, so I’m not sure what it did exactly. You should definitely use PRE tags rather than CODE tags for code snippets.

    Whether or not a language is interpreted, the parser still has to parse an entire file before execution. It needs to build the AST. Ruby doesn’t parse the entire file but appears to parse only until it sees the class in question and then stops, meaning that the AST doesn’t have all the symbols from the file (eeck!). This could be a performance thing, but I doubt it.

    I think the code snippet you supplied illustrates two things, and both happen at runtime, not parse time (it got truncated so I’m not certain). First, that variables in a Ruby class can be declared anywhere without problems, because the class is not fixed and you can add new variables and methods at any time. And second, that a class is always initialized by calling the initialize method.


  3. The code I posted was just an example of referring to another class before its definition without any problem.

    An interpreter doesn’t necessarily have to parse the entire file. Typically, it reads a line, then evaluates it. That is why that code blew up. Here’s a shorter code sample to illustrate that it isn’t Ruby’s problem; it’s not a problem at all, really.

    # python
    f()  # raises NameError: f is not defined yet when this line runs
    def f():
        print("f() called")


  4. Hmmm, not sure how you got that to work if the classes are in the same file. Something like this blows chunks for me:

    class Test1
      def foo
        puts Test2.new.bar
      end
    end
    
    t = Test1.new.foo
    
    class Test2
      def bar
        puts "bar"
      end
    end
    

    I get this error

    test.rb:3:in `foo': uninitialized constant Test1::Test2 (NameError)
            from test.rb:7

    The issue is that in order to parse this code the interpreter starts at the top of the file and starts parsing. It encounters the class Test1 and probably (I haven’t looked at the code) builds a complete AST/symbol-table for it. Next it sees the plain line of code and parses it, building another AST. Since that line of code belongs to the global space, it is executed. This in turn executes the Test1 method called foo, which tries to create a Test2. Since the parser never got to the Test2 definition, explosions. If Ruby used a two pass parser it would build the AST/symbol-table for both classes and the global scope and then execute the global scoped code. Then when it encountered the Test2 reference it would already have the AST/symbols for it and could execute it.

    Of course I’m guessing a lot at the implementation, but most programming languages, whether compiled or interpreted, build symbol tables and syntax trees and all that jazz before anything is executed.


  5. The thing about Ruby is that it’s mostly interpreted line-by-line in a very naive manner. This makes it possible to use things like attr_accessor/include/private/module_function.

    So, going through it line by line gives you something like: http://p.ramaze.net/17497, though that only covers a small part of what’s made possible by this approach. What is most important, in my opinion, is the fact that every class is just another object assigned to a constant.

    Some people have experimented with using const_missing, which returns a symbol of the requested constant and resolves it at a later point when needed, but that wasn’t too successful.
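
    To make the line-by-line point a bit more concrete, here is a small sketch. The class names are borrowed from the example earlier in the thread, while `Test3` and the `results` array are assumptions added only to collect what happens. A constant inside a method body is only looked up when the method actually runs, and a `class` statement assigns its constant at the moment that statement executes.

    ```ruby
    results = []

    class Test1
      def foo
        Test2.new.bar # Test2 is looked up when foo runs, not when foo is defined
      end
    end

    # At this point Test1 is an assigned constant and Test2 is not.
    results << Object.const_defined?(:Test1)
    results << Object.const_defined?(:Test2)

    begin
      Test1.new.foo # too early: the class Test2 statement has not executed yet
    rescue NameError => e
      results << e.class
    end

    class Test2
      def bar
        "bar"
      end
    end

    # Same method, same file, but now the constant exists.
    results << Test1.new.foo

    # A class is an ordinary object held in a constant, so it can also be
    # built with Class.new and assigned by hand.
    Test3 = Class.new do
      def baz
        "baz"
      end
    end
    results << Test3.new.baz

    p results
    ```

    So the order of the definitions matters less than whether a definition has been executed by the time something calls into it.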

