Tuesday, June 3, 2008

Asymmetrical database structures between production and non-production environments

So long time ago, before I was born as a consultant, there was some specific needs for the database tables. What we need was a table to mock external databases that we don't have access to in the development environment but the production does. What we did was creating a table right in the test cases and deal with it there.

Problems followed.

For one, MySQL and Rails doesn't have support for nested transaction.
For two, All tests written are in transaction that is rolled back every time the method ends.
For three, we need to ensure that the implementation employs the transaction mechanism somehow.

Whenever we create a new table, the transaction stops working. Whatever we did to the database, it was showing in the later test methods. Therefore there is no way we can ensure that when the transaction was supposed to be rolled back, it was indeed rolled back.

Solution:

The best we could do at that time was clearing out the tables ourselves. Andy Kotlinski wrote an assertion called assert_in_transaction to work around the problem. Talk to him if you would like more information on it.

Several months passed. The project was shelved for whatever reason.
Yet other months passed. The client decided to dig the zombie out of the grave.

All hell broke lose.

Somehow, Rails 2.0 doesn't allow us to work around it the way we did any more. The reason was something I couldn't remember any more because I forgot to blog about it at the time of digging that grave. What do we do?

Solution:

Instead of creating the tables in the test methods, I thought of a different way.

I want to have the transaction working.
I hate putting Data Description Language into the tests.
I like the "Elegance: Simple and Powerful", quoting Dr. David Matuszek from the University of Pennsylvania.

It turns out that we completely overlooked the power of Ruby. Database migrations are Ruby files. It means we can do pretty much anything a Ruby file can do. So we employ a simple condition to the migration

if RAILS_ENV == 'test'
create_your_fancy_test_tables
else
do_some_funny_stuff
end

Not bad, but still, not good. Why? Because we also want it in the development database too. And what happens when you have 8 different environments and you have to hand pick them?

Take two:

unless ['production'].include? RAILS_ENV do
create_your_fancy_test_tables
else
do_some_funny_stuff
end

Better. But still, what happens when you have 20 different migrations ahead?

Take three:

Define a module like so:

module SkipEnv
def skip_env
unless ['production'].include? RAILS_ENV do
yield
end
end
end

Put the module in the lib directory. Then, in the migration:

class << self do; include SkipEnv; end

def self.up
skip_env do
create_your_fancy_test_tables
end
end

def self.down
skip_env do
drop_your_fancy_test_tables
end
end

Hmm.. something's missing: the do_your_fancy_stuff. It turns out, our do_your_fancy_stuff does absolutely nothing, so we just cut it out.

It has to be a class method because the migration up and down methods are class methods.

Alright, looks good. We can create test tables anywhere except production, or any other environments you put into the array, and we can choose to skip environments in anything having access to the lib folder of the rails app, not only migrations. Awesome.

Further refinement would be to put it as a plugin, and install it in any Rails app you happen to have the need for such mechanism.

We didn't do that, because the zombie was buried back in the ground... again. This time, the reason is that to fix it and make it works for Rails 2 and have ActiveResource support, we would need to take a month and work on it full time.

Thursday, May 8, 2008

Pagination with ActiveResource

Ok, so we have a RESTful web service written in Ruby on Rails which has thousands of records to display. Viewing them all at once times out the client badly. What do we do? Pagination comes into the rescue, of course.

How do we accomplish that?

First try:
We added will_paginate to both the REST server and the Rails client. We render using the erb to include the total_entries and per_page attributes required by the will_paginate in the client side.

Problems arise with this. Hash conversions extension of Rails only expects that an array type node has only one kind of child, and it's the list of what it represents. However, we put the two attributes into the returned array too. What happened? Since XML is parsed out as a hash in the first place, and hash is unordered. The conversions extensions will pick any value from the hash whose key is not 'type'. So, sometimes it gets the correct things, other times, it doesn't. The only error message it sends out is "can't typecast #{entries}" where entries is mistakenly the value of either our total_entries or per_page.

It worked in ALL of our entities we paginated, except the last one. With the same code, doing only one different thing in one different cases, I pulled 75 hairs out of my ugly beard. What could possibly go wrong? Is it the REST server? Is it the find? NO! They behave in other entities, except this one. ... ok... stack trace.. ok, meta programming, we can't go any further... oh my god... it's the conversions!!!

To plow ahead and go forward with this approach, we can change the conversions.rb to detect anything NOT 'type', 'per_page', or 'total_entries'. BUT my pair over here is very strongly against changing rails. So came his idea ...

Take 2:
Apparently it's quite hard to have an array with a custom instance method. So, we don't do it. On the REST side, we don't include will_paginate at all. On the client side, we call the REST like so:

Model.find(:all, :params => {:limit => blah, :offset => blah})

blah's are calculated according to which page it is and how many per page we want.

On the REST side, we have to process the params a little bit. The params are sent in with strings as keys, all we need to do is change them all to symbols.

Model.find(:all, {:limit => params[:limit], :offset => params[:offset]})

As you can see here, you can access the params using the symbol as the key, but it is actually stored as strings, and the error you see is:

can't recognize key limit, offset

Not very obvious, and I lost 20 hair of my beard figuring this out.
Also, you need an extra action on the REST side to tell how many record it has. Call that from the client and keep it as a param for the WillPaginate::Collection creation


This way, we get only what we need from the REST, the big half purpose of pagination. (The lesser half is the usability.) Now, we construct the WillPaginate::Collection like so:

@entities = WillPaginate::Collection.create(page, per_page, count, returned_array)

Finally, in the view, you paginate the @entities as you would a paginate collection you got from the ActiveRecord counterpart.

Thanks Arnold for coming up with this idea for me to blog and brag about solving this. LOL