Hi All,
Thanks, John, for your notes from the Jan. 27th class.
I wanted to let you all know that there is a new version of RubyRx in HerokuGarden. The programs are called rubyrx.rb and summary_statistics.rb, and they are in app/models.
This latest version is about 4 times faster than the previous one. RubyRx has now gone through 2 major iterations (and many minor ones), and each of the major ones gave us a roughly 4x performance improvement. The first version took nearly 60 seconds to turn 12,000 demographics records into a simple summary table. The second version did the same thing in about 15 seconds. The latest version does it in under 4 seconds. This is probably about where we need it to be, at least for now. It's true that we will have this tool work on bigger, more complicated tables, but it seems to me that the performance will still be good enough. (Of course, we'll need to test that.) In any case, clinical studies usually have fewer than 12,000 patients, often a great deal fewer, so I think we are at a good point performance-wise.
One way I improved the performance was to use lazy evaluation (memoization) in the statistical methods. Here's an example of lazy evaluation:
def n
  @n ||= @arr.size
end
The n method checks whether the @n instance variable is nil or false. If it is neither, the method simply returns the stored value. If it is nil or false, the method evaluates the code to the right of the ||=, stores the result in @n, and returns that value. This means that a separate iteration step to calculate the stats (as in the previous versions) is no longer necessary. It also means that each statistic gets calculated only when it is used, and only once. This saved a great deal of calculation time. Thanks, David, for that tip.
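As a slightly fuller sketch of the same idea, the class below memoizes several statistics that build on one another. The class name SummaryStats, the sample data, and the mean_calls counter are made up for illustration and are not taken from summary_statistics.rb. Array#sum assumes Ruby 2.4 or later.

```ruby
# Hypothetical class for illustration only; not the real summary_statistics.rb.
class SummaryStats
  attr_reader :mean_calls

  def initialize(arr)
    @arr = arr
    @mean_calls = 0 # counts how many times the mean is actually computed
  end

  # Computed on the first call, then served from @n.
  def n
    @n ||= @arr.size
  end

  def sum
    @sum ||= @arr.sum
  end

  def mean
    @mean ||= begin
      @mean_calls += 1
      sum.to_f / n
    end
  end

  # Sample variance; calls mean once per element, but mean is only
  # computed the first time.
  def variance
    @variance ||= @arr.sum { |x| (x - mean)**2 } / (n - 1)
  end
end

stats = SummaryStats.new([2, 4, 4, 4, 5, 5, 7, 9])
puts stats.mean       # 5.0
puts stats.variance
puts stats.mean_calls # 1
```

One caveat: ||= recomputes whenever the cached value is nil or false, so it suits methods like these that always return a number.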
One feature that I added is that each Results object now holds a reference to the ActiveRecord class that you want to use. So DM is no longer hardcoded into the class; you pass the name of the table into the Results.new method when creating a new Results object. For example:
dm = Results.new('dms')
This creates a Results object called dm that holds an ActiveRecord class pointing to the demographics table.
Here's another example:
ae = Results.new('aes')
This creates a Results object with an ActiveRecord class that points to the AE table in the database.
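To make the idea concrete, here is a minimal sketch of how a class could turn a table name into a model class. It is not the actual RubyRx code: the Dm and Ae stand-in classes, the MODELS lookup hash, and the model reader are assumptions made so that the example runs without a database. In a Rails app the lookup might instead use ActiveSupport's classify and constantize string methods.

```ruby
# Stand-ins for ActiveRecord models (hypothetical; real models would
# inherit from ActiveRecord::Base and be backed by database tables).
class Dm
  def self.table_name
    'dms'
  end
end

class Ae
  def self.table_name
    'aes'
  end
end

class Results
  # Hypothetical mapping from table name to model class.
  MODELS = { 'dms' => Dm, 'aes' => Ae }.freeze

  attr_reader :model

  def initialize(table_name)
    # Look up the model class once and keep it on the object.
    @model = MODELS.fetch(table_name) do
      raise ArgumentError, "unknown table: #{table_name}"
    end
  end
end

dm = Results.new('dms')
ae = Results.new('aes')
puts dm.model.table_name # dms
puts ae.model.table_name # aes
```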
By the way, the Results class was formerly the Output class. I renamed it because its objects hold instance variables with the results of the statistical calculations, so Results is the more accurate name. Output did not make sense, because one of the things I really insisted on is keeping all of the output functionality in the erb templates. So the results are stored in the instance variables of the Results class, specifically in the nested hashes that you've seen in previous classes. The erb templates use those instance variables in whatever ways they see fit (ways that the Results class needs to know nothing whatsoever about).
Please let me know if you have any questions about the new code. It's a lot cleaner to read, because a lot of the fluff is gone. Also, I think that RubyRx has reached the point where the core part is not going to change a lot from week to week. The core will be pretty stable. That is good news, because not only does it make it much easier to see what's going on (and so works a lot better as a teaching example), but it also means that RubyRx is ready to move to the next level.
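Here is a rough sketch of that separation, using Ruby's standard erb library. The StubResults class, the shape of its stats hash, the numbers in it, and the render_summary helper are all invented for illustration; the real Results class and templates may look quite different.

```ruby
require 'erb'

# Hypothetical stand-in for the Results class: it only stores numbers
# and knows nothing about how they will be displayed.
class StubResults
  attr_reader :stats

  def initialize
    # Nested hash: variable name => statistic name => value (made-up data).
    @stats = {
      'age'    => { 'n' => 3, 'mean' => 41.0 },
      'weight' => { 'n' => 3, 'mean' => 70.5 }
    }
  end
end

# All formatting decisions live in the template.
TEMPLATE = "<% results.stats.each do |var, s| %>" \
           "<%= var %>: n=<%= s['n'] %>, mean=<%= s['mean'] %>\n" \
           "<% end %>"

# Renders the template with the local variable `results` in scope.
def render_summary(results)
  ERB.new(TEMPLATE).result(binding)
end

puts render_summary(StubResults.new)
```

Changing the layout of the table then means editing only the template, not the class that holds the numbers.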
More on moving to the next level in a future post.
Thanks,
Glenn