Wednesday, April 12, 2006

Assignment in Ruby - Simple Scoped Assignment

The largest class of assignments in Ruby is the simple scoped assignment. Ruby represents assignment to a name in a particular scope with its own scope assignment node. I won't go into what nodes mean to the VM in this article, but think of them as named expressions. The following table shows the simple scoped assignments and their related code and abstract syntax trees (AST).

Scope     AST Name   Code      AST
Local     :lasgn     a=nil     [:lasgn, :a, [:nil]]
Instance  :iasgn     @a=nil    [:iasgn, :@a, [:nil]]
Class     :cvasgn    @@a=nil   [:cvasgn, :@@a, [:nil]]
Global    :gasgn     $a=nil    [:gasgn, :$a, [:nil]]

Each of these ASTs was generated using ParseTree 1.4.1 and the following command:
echo "code" | parse_tree_show -f


As you can see, each of these has the same structure in the AST. By using different types of nodes for each one, it is simpler for the Ruby VM to determine where to look for a variable and set it.
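To make the table concrete, here is a minimal sketch in plain Ruby of how a distinct node type per scope makes the lookup trivial. It works on bare nested arrays, so it runs without ParseTree. The constant SCOPE_BY_NODE and the method scope_of are names made up for this illustration; they are not part of ParseTree or the Ruby VM.

```ruby
# Hypothetical lookup table (not part of ParseTree): maps each simple
# assignment node type to the scope it writes to.
SCOPE_BY_NODE = {
  :lasgn  => :local,
  :iasgn  => :instance,
  :cvasgn => :class,
  :gasgn  => :global
}

# Returns [scope, variable_name] for a simple scoped assignment node,
# or nil when the node is some other kind of expression.
def scope_of(node)
  type, name, _value = node
  scope = SCOPE_BY_NODE[type]
  scope ? [scope, name] : nil
end

p scope_of([:iasgn, :@a, [:nil]])
p scope_of([:lit, 1])
```

Because the node type alone decides the scope, nothing about the variable name (the @, @@ or $ prefix) has to be inspected.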

The code below is a straight through SexpProcessor using the ParseTree gem and the included SexpProcessor as a base. This processor doesn't do anything exciting but by explicitly writing the process_type methods, we have exposed interaction points where we could do things like store a list of assigned variables, output funny messages or generate metrics.

Sexp stands for S-expression. ParseTree represents S-expressions in Ruby as nested arrays of arrays. S-expressions are particularly well known for their use in most Lisp-like languages to represent code and data. If you would like to know more about S-expressions, Google is your friend.

SexpProcessors have a few rules that you must observe in order to get a correct traversal of the expression. First, you need to understand that the dispatch method process(exp) is initially called for every pair of matched brackets. This is important because in our process_foo methods, we need to call process() on any members of exp that are arrays. If we don't, we essentially prune that piece of the S-expression from our processing. Usually, that is wrong. It is also important to note that process(exp) only takes arrays as input. If you call process(exp.shift) and the shifted element is a literal, you will have an error on your hands. process(exp) will then pass control on to process_foo(exp), where foo is the literal symbol that appears as the first element of the expression. (In our processor, because of the auto_shift_type in initialize, the first element of exp is shifted off before control passes to process_foo. This results in slightly cleaner, easier-to-read code in process_foo. Instead of s(exp.shift, exp.shift, process(exp.shift)), we get the slightly clearer s(:iasgn, exp.shift, process(exp.shift)). This is obviously a personal preference.)

Another basic rule of SexpProcessor is that what comes in should be what comes out, or, in the case of auto_shift_type, what would have come out if auto_shift_type were false. This is best illustrated with an example.

process([:lasgn, :a, [:nil]]) should return [:lasgn, :a, [:nil]]

With auto_shift_type = false
process([:lasgn, :a, [:nil]]) calls process_lasgn([:lasgn, :a, [:nil]])

With auto_shift_type = true
process([:lasgn, :a, [:nil]]) calls process_lasgn([:a, [:nil]])

In either case process_lasgn() must return [:lasgn, :a, [:nil]].
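The dispatch and the effect of auto_shift_type can be sketched in a few lines of plain Ruby. MiniProcessor below is a toy stand-in written for this article, not the real SexpProcessor, and it only knows about :lasgn and :nil nodes. It is enough to show that both settings hand back the same expression.

```ruby
# A toy stand-in for SexpProcessor (NOT the real class), just enough to
# show dispatch on the node type and the effect of auto_shift_type.
class MiniProcessor
  attr_accessor :auto_shift_type

  def initialize(auto_shift_type)
    @auto_shift_type = auto_shift_type
  end

  # Dispatch on the first element: [:lasgn, ...] goes to process_lasgn.
  def process(exp)
    exp = exp.dup                      # leave the caller's array alone
    handler = "process_#{exp.first}"
    exp.shift if auto_shift_type       # drop the type before the hand-off
    send(handler, exp)
  end

  def process_lasgn(exp)
    type = auto_shift_type ? :lasgn : exp.shift
    [type, exp.shift, process(exp.shift)]
  end

  def process_nil(exp)
    exp.shift unless auto_shift_type
    [:nil]
  end
end

input = [:lasgn, :a, [:nil]]
p MiniProcessor.new(true).process(input)
p MiniProcessor.new(false).process(input)
```

Each handler rebuilds its node from the pieces it shifts off, which is why the nested [:nil] has to go back through process() rather than being copied across by hand.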

Still another rule: in a process_foo(exp) method, exp should be empty before the method returns. Failure to empty the exp is considered bad form (and is usually wrong too) and therefore raises an error. This is the quickest way to catch coding errors like
# process_lasgn([:a, [:nil]])
def process_lasgn(exp)
  s(:lasgn,     # auto_shift_type = true or this would be s(exp.shift,
    exp.shift)  # :a
  # exp now == [[:nil]]
end

This will really help you when you get the structure wrong for the s-expression representing a node.
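Here is a rough sketch of what such a check could look like. StrictProcessor is a toy class invented for this example, and the error message is mine; the real SexpProcessor has its own implementation and wording.

```ruby
# Toy dispatcher (an assumption, not SexpProcessor's real code) that
# refuses to continue when a handler leaves part of exp unconsumed.
class StrictProcessor
  def process(exp)
    exp = exp.dup
    type = exp.shift                   # behaves like auto_shift_type = true
    result = send("process_#{type}", exp)
    unless exp.empty?
      raise "exp not empty after process_#{type}: #{exp.inspect}"
    end
    result
  end

  # Buggy on purpose: forgets to process the value expression.
  def process_lasgn(exp)
    [:lasgn, exp.shift]
  end
end

begin
  StrictProcessor.new.process([:lasgn, :a, [:nil]])
rescue RuntimeError => e
  puts e.message
end
```

The leftover [[:nil]] in the message points straight at the piece of the expression the handler forgot.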

The final bit of magic to understand before venturing off to write your own processor is s(). s(*args) is shorthand for Sexp.new(*args). It was added to keep things easier to read and as a user of ParseTree & SexpProcessor, I am sure you will appreciate it.

# A Straight Through SexpProcessor exposing
# the process_foo simple assignment methods.

# A one-line "require ... rescue LoadError" would not catch the failure,
# because the rescue modifier only handles StandardError.
begin
  require 'rubygems'
rescue LoadError
  # No RubyGems; assume the libraries are already on the load path.
end
require 'parse_tree'
require 'sexp_processor'

class PassThroughProcessor < SexpProcessor
  def initialize
    super
    self.auto_shift_type = true
  end

  def process_lasgn(exp)
    s(:lasgn, exp.shift, process(exp.shift))
  end

  def process_iasgn(exp)
    s(:iasgn, exp.shift, process(exp.shift))
  end

  def process_cvasgn(exp)
    s(:cvasgn, exp.shift, process(exp.shift))
  end

  def process_gasgn(exp)
    s(:gasgn, exp.shift, process(exp.shift))
  end
end


You would invoke this processor on an unsuspecting class with the following magic incantation.

PassThroughProcessor.new.process(ParseTree.new.parse_tree(klass))


If you are a glutton for punishment, you could run it against PassThroughProcessor. (*Note: As is, this wouldn't create any output at all but the skeleton is all there. Enjoy yourself.*)
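As an example of what one of those interaction points could do, the sketch below stores a list of assigned variables. To keep it runnable without ParseTree it uses its own tiny walker over nested arrays instead of subclassing SexpProcessor; AssignmentRecorder and the sample tree are made up for this illustration.

```ruby
# Self-contained sketch: a toy walker (not SexpProcessor) that records
# the name of every simple scoped assignment it passes through.
class AssignmentRecorder
  ASSIGNMENTS = [:lasgn, :iasgn, :cvasgn, :gasgn]

  attr_reader :assigned

  def initialize
    @assigned = []
  end

  def process(exp)
    type, *rest = exp
    @assigned << [type, rest.first] if ASSIGNMENTS.include?(type)
    # Recurse into every child that is itself an expression, and
    # rebuild the node so that what comes in is what comes out.
    [type, *rest.map { |child| child.is_a?(Array) ? process(child) : child }]
  end
end

tree = [:block,
        [:lasgn, :a, [:nil]],
        [:iasgn, :@b, [:lasgn, :c, [:lit, 1]]],
        [:gasgn, :$d, [:nil]]]

recorder = AssignmentRecorder.new
result = recorder.process(tree)
p recorder.assigned
```

In a real SexpProcessor subclass the same bookkeeping would go inside process_lasgn and friends, just before the s(...) that rebuilds the node.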

I have more than used up the time and space allotted for this article. I hope this was educational and I look forward to seeing you next time with Assignment in Ruby.

Monday, April 10, 2006

Assignment in Ruby

Would you believe Ruby has a dozen different ways to represent assignment internally? This may be something you don't expect to find when you go poking around the internals of things like the Ruby implementation or ParseTree. I know I didn't.

Over the coming weeks, I will:

  • document the 12 different representations
  • discuss what they are and what is unique about them
  • show you what Ruby code creates each one and the AST (abstract syntax tree) generated by ParseTree for each
  • show you a code example of how you would write your own process_foo method in a SexpProcessor if you wanted to track something about each one
  • show you what we have done (if anything) with each one in CheckR and explain why


Join me in a journey under the covers as I explore assignment in Ruby.

Saturday, April 08, 2006

Ping Pong Programming - No Pairing

The paradigm of ping pong pairing can be used without pairing. I have been ping pong programming remotely with Pat Eyler for a couple weeks now and it has been tremendously effective. Basically, the process for ping ponging remotely is the same as ping pong pairing but instead of passing a keyboard back and forth, we check our code into Subversion. A typical scenario looks something like this:

Pat updates his code to make sure everything is current. He runs the test suite to see what the currently failing test is. After some analysis, he fixes the code and makes the test pass. Now, he checks his code into Subversion. Next, he writes a new test and confirms that it fails. Because he has to communicate the change to me, he checks his failing test into Subversion with a comment about the current test. Finally, he notifies me that the ball is in my court. It is my turn to repeat the process.

This style of development is more powerful than simple Test Driven Development for the same reasons I mentioned before. By posing tests for your opponent, you develop stronger tests than you do for yourself.

Remote ping pong does have its shortcomings though. In particular, it doesn’t deal with time imbalances very well. While Pat is implementing my test, I am doing nothing. If I have several hours to devote to coding and Pat only has a few minutes, I will spend a lot of time waiting for a test when I could be coding.

I want to address the time disparity concern by figuring out what works in ping pong and creating a model with all the benefits but without the time-constrained challenges. In addition, I would like to figure out how to apply the principles of ping pong to teams of more than two people. A tentative name for this model of development is Test Exchange.

I will describe Test Exchange in a later post.

Ping Pong Pair Programming

I don’t know who first coined the name Ping Pong Pair Programming but I know from the first time I heard of it, I was convinced. It seems likely that PPP (ping pong pairing) arose out of a need to address some of the challenges of pair programming, namely the difficulty of maintaining attention when you are not the one at the keyboard.

PPP is also a derivative of Test Driven Development. In TDD, you write your test first and you never write code for a feature that is not in a failing test already. It is a great way to ensure a robust test suite for all of your development. TDD works like this: 1) write a test, 2) confirm that it fails, 3) write code to make the test pass, 4) confirm the test passes, 5) confirm no other tests are broken, 6) refactor mercilessly. TDD is very effective but there are challenges. It is very easy to shortcut your tests a little by writing more than you really need. It is also easy to avoid testing the really hard cases.

Ping pong pairing works with two people. Let us call them Bob and Tom. The day starts with Bob at the keyboard. All the tests pass. Bob writes a test. Bob confirms it fails. The keyboard passes to Tom. Tom works on making the test pass with Bob at his side, helping him along. Tom finishes his implementation and confirms all the tests pass. Tom writes a new test and confirms it fails. The keyboard returns to Bob.

PPP is very similar to TDD in that it is made up of the same steps. The difference is that after a failing test is implemented, the other developer takes control. This can develop into a friendly game. Bob tries to write a good test that forces Tom to implement real and valuable functionality. Tom tries to pass Bob’s test with the simplest thing that could possibly work. The harder Bob and Tom try to push the difficult work to the other person, the more robust the test suite and the finished code become.
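Two rounds of that game might look something like the sketch below. The add method and the check helper are invented for the example; in real life the tests would live in a proper test framework rather than a three-line helper.

```ruby
# A tiny stand-in for a test framework so the sketch runs on its own.
def check(description)
  puts "#{yield ? 'pass' : 'FAIL'}: #{description}"
end

# Round 1. Bob's test: add(2, 2) must be 4.
# Tom answers with the simplest thing that could possibly work.
def add(_a, _b)
  4
end
check('add(2, 2) is 4') { add(2, 2) == 4 }

# Round 2. Tom's test for Bob: add(1, 2) must be 3.
# It fails against the constant, so Tom has pushed the real work back.
check('add(1, 2) is 3') { add(1, 2) == 3 }

# Bob now has to write the real implementation to pass both tests.
def add(a, b)
  a + b
end
check('add(2, 2) is 4') { add(2, 2) == 4 }
check('add(1, 2) is 3') { add(1, 2) == 3 }
```

Neither player wrote more code than the current test demanded, yet after two rounds the cheap answer is no longer possible.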

Ping pong pairing addresses the psychological challenges in both pair programming and test driven development. Because control switches so frequently, partners stay more engaged in what is going on. Because you don’t have to implement the code for your own tests, you are much less likely to take shortcuts just to allow yourself to get by.

Ping pong pairing is hard work and has been very rewarding when I have done it. It results in much better code and more robust tests. It also helps to keep things moving at a good pace. If you pair program, I strongly recommend you try your hand at PPP.

StL.rb Hacking Nights

Seattle.rb has a meeting every week. Once a month they have presentations and the other weeks are dedicated to hacking nights. I have envied this for quite a while and I finally decided to do something about it.

Tuesday, April 4 was the first ever StL.rb hacking night. After announcing it a couple weeks in advance, I spent 3 hours at Borders and 2 other members of StL.rb made it. Borders wasn’t the best place to hold the meetings so next week we are going to try St. Louis Bread Company. Most of you might know it as Panera Bread Company. They didn’t change the name in St. Louis. I guess people might have gotten upset.

I’ve made myself available for hacking any night people want but the weekly hacking nights are going to be on the same weeknight as our regular meeting. Since that is Tuesday, I will be too busy to be jealous of Seattle.rb’s meetings on the same night.

CheckR

I started CheckR with a friend. It hopes to be a static source analysis tool for Ruby code, along the lines of lint for C/C++. We are actively developing on it in a remote, ping pong, test driven development way. We are very close to our first alpha release. The first release will hopefully add some small value and be able to catch a few errors earlier in the development cycle. I guess we need autocheckr too. And we definitely need to run checkr against itself.

Thursday, April 06, 2006

Emacs + ZenTest = ZenMacs?

Good friends make you a better person. Because of the recent example of Ryan Davis, Eric Hodel and especially Pat Eyler, and the less recent example of Gus Mueller, I am a better programmer who does better things. The following is one of those things.

I have been learning Emacs and the ZenTest suite at the same time. This morning I came up with the following bit of goodness.



M-x autotest from a buffer whose current dir is the root of your project results in:



You now have a shell with autotest running in the bottom left pane, an eshell ready for your subversion commands in the bottom right pane and you can work with your project in the top pane.

This is due to a macro I wrote to create this setup for me automagically. The macro does the following: (*warning emacs speak*)

C-x 2, C-x o, C-x 3
M-x shell
autotest
M-x set-variable
comint-scroll-to-bottom-on-output
all
C-x o
M-x eshell
M-x set-variable
comint-scroll-to-bottom-on-output
all


I have included this little macro in my .emacs file with the following bit of elisp:

(fset 'autotest
[?\C-x ?2 ?\C-x ?o ?\C-x ?3 ?\M-x ?s ?h ?e ?l ?l ?\C-m ?a ?u ?t ?o ?t ?e ?s ?t ?\C-m ?\M-x ?s ?e ?t ?- ?v ?a ?r ?i ?a ?b ?l ?e ?\C-m ?c ?o ?m ?i ?n ?t ?- ?s ?c ?r ?o ?l ?l ?- ?t ?o ?- ?b ?o ?t ?t ?o ?m ?- ?o ?n ?- ?o ?u ?t ?p ?u ?t ?\C-m ?a ?l ?l ?\C-m ?\C-x ?o ?\M-x ?e ?s ?h ?e ?l ?l ?\C-m ?\M-x ?s ?e ?t ?- ?v ?a ?r ?i ?a ?b ?l ?e ?\C-m ?c ?o ?m ?i ?n ?t ?- ?s ?c ?r ?o ?l ?l ?- ?t ?o ?- ?b ?o ?t ?t ?o ?m ?- ?o ?n ?- ?o ?u ?t ?p ?u ?t ?\C-m ?a ?l ?l ?\C-m])


Enjoy and let me know if you can think of any improvements. In particular, I would like to prompt for a directory path at the beginning and I need to figure out how to handle pre-existing *shell* or *eshell* buffers. Still, as is, this does what I need.

Wednesday, April 05, 2006

___Pandora____Just____Rocks____

For the first time ever, I find myself contemplating the use of a blink tag. Discretion has prevented it but it was a close race.

Pandora is incredible. My tastes in music have always been fairly eclectic. I like stuff from all over the board but I don't really understand why I like what I like. I have been using Pandora for a couple of days and I already understand more about my taste in music than 30+ years of developing that taste taught me.

As you listen to a custom play list based on Pandora's analysis of something you said you like (for example, I am listening to the Evanescence station right now), take the time to explore your options. If you hear something you really like, give it a thumbs up and the qualities of that track will be used to select future tracks. If you hear something you don't like, give it a thumbs down. You will never hear that track again on that station. Also, Pandora seems to learn from that, and maybe the one difference that track had from other tracks will be thrown out. In a frighteningly short amount of time, artists I like start cross-pollinating channels and emerging spontaneously.

At any time if you want to know why Pandora suggested a track, you can ask it. It will give a quick summary of the musical properties that contributed to its selection. If you keep the process up, fairly quickly you will start to see common themes. If you pay attention to those descriptions, you start to see a pattern emerge. It turns out I really like minor key tonality, mixed acoustic and electric instrumentation, subtle use of vocal harmony and mild rhythmic syncopation. Woah. When I look back, almost everything that has ever knocked my socks off has combined a few of these features. Suddenly the most disparate of my tastes come together and make perfect sense.

I was totally unprepared to learn about myself from a random music link I found on the net. If you haven't tried it yet, why are you reading this tripe?! Expand your horizons and get yourself over to Pandora.

Tuesday, April 04, 2006

Opening Pandora's Box

Wow. Just bloody freaking wow. I don't know how I got there but you have got to check out Pandora if you like music, any music, even just a little tiny bit. The interface isn't great. I have no idea where they got it but it really does suck. Still, if you can help me find more music in some of the obscure areas I like, then you are golden and Pandora does it in spades.

Monday, April 03, 2006

unit_diff is your friend

Pat introduced me to unit_diff a while ago. At the time, it didn't resonate very well. There isn't much documentation and I didn't get what it was supposed to do. Now, I know exactly what it does and I can give first hand testimony to how great it is.

unit_diff is one of the tools included in ZenTest, along with zentest, multiruby and autotest. If you haven't installed it yet and you are using Test/Unit (You are using Test/Unit, right?), then you absolutely must grab the gem.
$ sudo gem install ZenTest

unit_diff runs diff on the expected and actual output of failed assert_equal tests in your test cases. Pipe your test output into unit_diff and relax with all the extra time you now have since you don't have to search through pages and pages of output to find the 2 characters that were different.

Actual command example I used recently:
$ ruby -Ilib test/test_parse_tree.rb | unit_diff
Loaded suite test/test_parse_tree
Started
.....FF.........................
Finished in 0.347 seconds.

1) Failure:
test_case_stmt2(TestParseTree) [(eval):1]:
6,10c6,8
< [:case,
< nil,
< [:when, [:array, [:lit, 1]], [:lit, 2]],
< [:when, [:array, [:lit, 3]], [:lit, 4]],
< [:lit, 5]]]]]
---
> [:when, [:array, [:lit, 1]], [:lit, 2]],
> [:when, [:array, [:lit, 3]], [:lit, 4]],
> [:lit, 5]]]]

2) Failure:
test_class(TestParseTree) [test/test_parse_tree.rb:370]:
Must return a lot of shit.
76,80c76,78
< [:case,
< nil,
< [:when, [:array, [:lit, 1]], [:lit, 2]],
< [:when, [:array, [:lit, 3]], [:lit, 4]],
< [:lit, 5]]]]],
---
> [:when, [:array, [:lit, 1]], [:lit, 2]],
> [:when, [:array, [:lit, 3]], [:lit, 4]],
> [:lit, 5]]]],

32 tests, 32 assertions, 2 failures, 0 errors


Without unit_diff, this was nearly 700 lines of output. Imagine how hard that was to process.
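To see why comparing line by line helps so much, here is a rough sketch of the idea in plain Ruby. It is only an illustration: unit_diff hands the two texts to diff, while pretty_lines and rough_diff below are names I made up and the comparison is a crude set difference rather than a real diff.

```ruby
require 'pp'

# Pretty-print an object into an array of lines, wrapped narrowly so
# nested structures spread over several lines.
def pretty_lines(obj, width = 40)
  PP.pp(obj, String.new, width).lines.map(&:chomp)
end

# Crude line comparison (illustration only, not a real diff): report
# the lines found on one side but not on the other.
def rough_diff(expected, actual)
  exp = pretty_lines(expected)
  act = pretty_lines(actual)
  (exp - act).map { |line| "< #{line}" } +
    (act - exp).map { |line| "> #{line}" }
end

expected = [:case, nil, [:when, [:array, [:lit, 1]], [:lit, 2]], [:lit, 5]]
actual   = [:when, [:array, [:lit, 1]], [:lit, 2]]
puts rough_diff(expected, actual)
```

Only the lines that differ are printed, which is the whole point: the hundreds of matching lines in a big AST never reach your eyes.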

Debugging ParseTree (or Walking in Great Footsteps)

In the course of creating CheckR, Pat Eyler and I discovered the following strangeness:

First example. Case with expression followed by multiple whens and an else.
$ echo 'case 1; when 2; 3; when 4; 5; else; 6;end' | parse_tree_show -f
[[:class,
:Example,
:Object,
[:defn,
:example,
[:scope,
[:block,
[:args],
[:case,
[:lit, 1],
[:when, [:array, [:lit, 2]], [:lit, 3]],
[:when, [:array, [:lit, 4]], [:lit, 5]],
[:lit, 6]]]]]]]
$

This results in a clear AST (abstract syntax tree).

Second example. Case with no expression followed by identical whens and else.

$ echo 'case; when 2; 3; when 4; 5; else; 6;end' | parse_tree_show -f
[[:class,
:Example,
:Object,
[:defn,
:example,
[:scope, [:block, [:args], [:when, [:array, [:lit, 2]], [:lit, 3]]]]]]]
$


Since the only change was making 'case 1;' into 'case;', this AST is far from clear and is obviously missing things. There is no case node, no second when node and no else value. Unfortunately, for CheckR, we need the when nodes for our testing. Something has to be done.
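The missing pieces are easy to confirm mechanically. The sketch below counts nodes by type in the two trees printed above, typed in as plain nested arrays; count_nodes is a helper written for this post, not something ParseTree provides.

```ruby
# Count how many nodes of a given type appear anywhere in a
# nested-array AST. A node is an array whose first element is its type.
def count_nodes(tree, type)
  return 0 unless tree.is_a?(Array)
  own = tree.first == type ? 1 : 0
  tree.inject(own) { |total, child| total + count_nodes(child, type) }
end

with_expr = [[:class, :Example, :Object,
  [:defn, :example,
    [:scope,
      [:block,
        [:args],
        [:case,
          [:lit, 1],
          [:when, [:array, [:lit, 2]], [:lit, 3]],
          [:when, [:array, [:lit, 4]], [:lit, 5]],
          [:lit, 6]]]]]]]

without_expr = [[:class, :Example, :Object,
  [:defn, :example,
    [:scope, [:block, [:args], [:when, [:array, [:lit, 2]], [:lit, 3]]]]]]]

[:case, :when].each do |type|
  puts "#{type}: #{count_nodes(with_expr, type)} vs #{count_nodes(without_expr, type)}"
end
```

A check along these lines is also roughly what CheckR needs from the tree, which is why the vanishing when node matters to us.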

Pat is on very good terms with drbrain (Eric Hodel) and zenspider (Ryan Davis) as the three of them ran Seattle.rb together. A few IMs later the bug was confirmed against the development branch of ParseTree. drbrain recorded a bug on the ParseTree RubyForge bug tracker and assigned it to zenspider.

Because I want to be a responsible netizen and (let's admit it) because I want to be cool like the big boys, I decided to create a test case for inclusion in the ParseTree test suite. Then, because it would be even cooler, and because Pat is always there egging me on, I thought I would take a whack at fixing ParseTree myself, with Pat's help.

Because of Pat's relationship with zenspider, we had access to a snapshot of the development tips of ParseTree. zenspider is not using RubyForge to host the dev code, so not everyone has access to this. As a result, I will document the rest of this story as if we had access to the same things the world has access to. All references to source code will be to the released code for ParseTree-1.3.7.

First, the test case. This is very likely to be wrong since I don't have a correct AST of the case;when syntax but it is a start. ParseTree uses two files to dynamically generate its test cases, something.rb and test_parse_tree.rb. something.rb contains method and class definitions to be parsed. test_parse_tree.rb contains expected ASTs to be compared against the ASTs generated during the tests. All it takes to add a new test case is to add the method definition to something.rb and the expected AST to test_parse_tree.rb.

Before we go mucking things up, we need to know where the test suite stands as released. We have to make sure we don't break anything else when we add our own tests.

The following results were produced on Cygwin. It was also necessary on Cygwin to comment out $TESTING = true in test_composite_sexp_processor.rb and test_sexp_processor.rb in order to get RubyInline to work correctly when compiling for the test. Many thanks to drbrain for his patience and help with that particular problem.

$ cd /usr/lib/ruby/gems/1.8/gems/ParseTree-1.3.7
$ make test
Loaded suite test/test_all
Started
.....F.F..F......F..............................F................................
Finished in 0.328 seconds.

-- output snipped due to length --

81 tests, 94 assertions, 5 failures, 0 errors
make: *** [test] Error 1
$


If I recall, there were only 4 failures on Linux when I ran this same test. Either way, it is very important to know about the failing tests before we start. We don't want to assume we broke more than we intended when we add our tests. (As an aside, the dev branch doesn't have these failures, allowing us to be even more confident in what we do.) Now, on to the test.

in test/something.rb
def case_stmt2
  case
  when 1
    2
  when 3
    4
  else
    5
  end
end


This is fairly simple but contains everything I know about the problem to date. Namely, ParseTree seems to eat the case, when 3 and else.

in test/test_parse_tree.rb
@@case_stmt2 = [:defn, :case_stmt2,
  [:scope,
    [:block,
      [:args],
      [:case,
        [:when, [:array, [:lit, 1]], [:lit, 2]],
        [:when, [:array, [:lit, 3]], [:lit, 4]],
        [:lit, 5]]]]]


At first glance, it seems fairly difficult to generate this and it would have been if I had done it by hand. The easy way was to run the following command, steal its output and munge it just a little bit.

$ echo 'class Blah; def case_stmt2; case true; when 1; 2; when 3; 4; else; 5; end; end; end' | parse_tree_show
[[:class,
:Blah,
:Object,
[:defn,
:case_stmt2,
[:scope,
[:block,
[:args],
[:case,
[:true],
[:when, [:array, [:lit, 1]], [:lit, 2]],
[:when, [:array, [:lit, 3]], [:lit, 4]],
[:lit, 5]]]]]]]


Looking at the other expected trees, we see they all start with the method definition. Also, we need to remove the [:true] because it doesn't appear in our test method. I want to reiterate: the resulting expectation, recorded above in test_parse_tree.rb, may not be correct but it is a place to start.
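I did the munging by hand in an editor, but the two steps are simple enough to express in Ruby. The sketch below starts from the parse_tree_show output typed in as a literal; the variable names are mine.

```ruby
require 'pp'

# The output of parse_tree_show from above, as a plain nested array.
shown = [[:class, :Blah, :Object,
  [:defn, :case_stmt2,
    [:scope,
      [:block,
        [:args],
        [:case,
          [:true],
          [:when, [:array, [:lit, 1]], [:lit, 2]],
          [:when, [:array, [:lit, 3]], [:lit, 4]],
          [:lit, 5]]]]]]]

# Step 1: keep only the method definition, dropping the class wrapper.
defn = shown.first.find { |node| node.is_a?(Array) && node.first == :defn }

# Step 2: walk down defn -> scope -> block -> case and remove the
# [:true] expression, since our test method has a bare case.
case_node = defn[2][1][2]
case_node.delete_at(1)

pp defn
```

What is left is the expectation recorded in test_parse_tree.rb, still with the caveat that it is my guess at what the correct tree should be.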

Finally, we run our test again.

$ make test
ruby -w -Ilib:bin:../../RubyInline/dev test/test_all.rb
Loaded suite test/test_all
Started
.....F.F..FF......F..............................F................................
Finished in 0.354 seconds.

-- output snipped due to length --

3) Failure:
test_case_stmt2(TestParseTree) [(eval):1]:
<[:defn,
:case_stmt2,
[:scope,
[:block,
[:args],
[:case,
[:when, [:array, [:lit, 1]], [:lit, 2]],
[:when, [:array, [:lit, 3]], [:lit, 4]],
[:lit, 5]]]]]> expected but was
<[:defn,
:case_stmt2,
[:scope, [:block, [:args], [:when, [:array, [:lit, 1]], [:lit, 2]]]]]>.

-- output snipped due to length --

82 tests, 95 assertions, 6 failures, 0 errors
make: *** [test] Error 1
$


One new test failure and it's our test. Perfect. Also, if you look closely, you will see our new test is included in the massive output of test_class. That's important because even if we started with no test failures at all, our addition of one test would cause two failures. test_class is an integration test of the class as a whole (including all methods) and ensures that working methods still parse in a sane way as a class and also that the AST for the class definition itself is correct.

Now, any sane person would stop here, submit the test to the ParseTree developers and call it a day. Unfortunately, neither Pat nor I appears to be sane, so we started chasing the cause of the failure.

To Be Continued....