Monday, October 6, 2014

June TMIL - Ruby Under a Microscope

This post marks the halfway point of fulfilling my New Year's resolution to post about coding once a month this year. As the published date suggests, summer got in the way of timeliness! The curious reader can calculate the stats on how many days after the first day of the month it's taken to post. (Internal lawyer says: The resolution was to create a post for each month, not publish them on time.) Alright, let's get down to business.

In June, I completed Pat Shaughnessy's book, Ruby Under a Microscope. The writing is clear and concise, and the layout of each chapter is extremely helpful in reinforcing the material. The chapters are interspersed with illustrations, "Experiments" (usually benchmarking code) and definitions (that start simple and grow more complete as nuances are explained). The book is also a great way to gain comfort with fundamental programming concepts, like how a compiler works. Below are a few highlights from my notes.

The book starts with an explanation of Ruby's tokenization and parsing process (take a peek with the Ripper class), and how code is compiled into instructions that YARV can execute. While the first few chapters go pretty deep into topics such as how Ruby puts values on the stack frame and how the environment and stack pointers deal with scope, there are still things to play with. If you run the following snippet in irb and would like to know what the output means, this book is your guide.

stuff = <<STR
5.times do
  puts "foo!!!"
end
STR
puts RubyVM::InstructionSequence.compile(stuff).disasm
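
For the tokenizing and parsing steps that come before compilation, here is a minimal sketch using Ripper from the standard library. The `code` and `tokens` variable names are just mine for illustration, and the exact shape of the parse tree can vary between Ruby versions, so I only peek at its root.

```ruby
require 'ripper'

code = "5.times do\n  puts \"foo!!!\"\nend\n"

# Ripper.lex returns one entry per token: [[line, column], type, text, ...]
tokens = Ripper.lex(code).map { |token| [token[1], token[2]] }
p tokens.first(3)
# => [[:on_int, "5"], [:on_period, "."], [:on_ident, "times"]]

# Ripper.sexp returns the parse tree as nested arrays rooted at :program
sexp = Ripper.sexp(code)
p sexp.first
# => :program
```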


After the deep dive down to the metal, the author walks through the Ruby object model and its representation in C structures, then method and constant lookup, until we come to "The Hash Table: The Workhorse of Ruby Internals." Like the early chapters, it presents a fundamental computer science concept through the lens of Ruby, enhancing understanding of both. For example, when you create

hash = {}
hash[:some_key] = "foo"


the basic idea is that Ruby takes the key, runs it through a hash function, and puts the entry in a "bin" chosen by taking the hashed key modulo the number of bins. Then, when retrieving values, Ruby recalculates the hashed value and looks in the corresponding bin for the item you're retrieving.

Eventually, two keys will end up in the same bin, resulting in a hash collision. Ruby has built-in constants that determine when to allocate more bins to keep this rare, and Pat's experiments and corresponding graphs show the spike in milliseconds when this happens. (They also show a spike after inserting the 7th item into a hash, since Ruby 2 stores up to six items in an array instead of a pointer to the hash table. It then just compares the keys when looking things up, instead of using the hashing function to find the right bin to compare keys in.) Pat generously posted a draft of this chapter while writing it that you can find here. There's also a great explanation (using Ruby code to rebuild the basic functionality) of how hashing works posted here.
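
To make the bin idea concrete, here is a toy sketch in plain Ruby. It is not how MRI implements Hash in C: `ToyHash`, the starting bin count of 11 and the `MAX_DENSITY` threshold of 5 are my own illustrative assumptions, and collisions are handled by simply keeping a small list of pairs in each bin.

```ruby
# Hypothetical toy class for illustration only; MRI's real Hash lives in C.
class ToyHash
  MAX_DENSITY = 5 # assumed threshold: average entries per bin before growing

  def initialize(bin_count = 11)
    @bins = Array.new(bin_count) { [] }
    @size = 0
  end

  def []=(key, value)
    bin = bin_for(key)
    pair = bin.find { |k, _| k.eql?(key) }
    if pair
      pair[1] = value # key already present: overwrite its value
    else
      bin << [key, value]
      @size += 1
      grow if @size > MAX_DENSITY * @bins.size
    end
  end

  def [](key)
    # Recompute the bin, then compare keys inside it
    pair = bin_for(key).find { |k, _| k.eql?(key) }
    pair && pair[1]
  end

  def bin_count
    @bins.size
  end

  private

  # The hashed key modulo the number of bins picks the bin
  def bin_for(key, bins = @bins)
    bins[key.hash % bins.size]
  end

  # Allocate more bins and redistribute every pair (the expensive step)
  def grow
    new_bins = Array.new(@bins.size * 2 + 1) { [] }
    @bins.each do |bin|
      bin.each { |pair| bin_for(pair[0], new_bins) << pair }
    end
    @bins = new_bins
  end
end

hash = ToyHash.new
hash[:some_key] = "foo"
puts hash[:some_key]
```

Inserting enough keys to push the average past the threshold triggers `grow`, which is the toy version of the rehashing spike in Pat's graphs.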

And finally, a few other fun snippets that get unpacked and explained in the book:
  • Each level on the stack can have a different self
  • instance_eval uses self and an environment pointer to access variables in different scopes - when you call instance_eval, a closure and a new lexical scope are created, and self becomes the receiver
  • Disabling garbage collection when running benchmarks can help avoid skewed results
  • There are lots of goodies in Ruby core and standard library that don't get much attention, like ObjectSpace
  • Object structures like RString, RArray and friends are defined in include/ruby/ruby.h 
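
A few of the bullets above are easy to poke at in irb. This is a minimal sketch, not code from the book: the `Greeter` class and the variable names are made up for illustration, and the timing and object counts will differ on every machine.

```ruby
require 'benchmark'

# Hypothetical class, only here to give instance_eval a receiver.
class Greeter
  def initialize
    @name = "Ruby"
  end
end

greeting = "Hello" # local variable in the calling scope

# Inside the block, self is the Greeter instance (so @name resolves there),
# while the closure still sees the local variable `greeting`.
message = Greeter.new.instance_eval do
  "#{greeting}, #{@name}"
end
puts message
# => Hello, Ruby

# Turn GC off around the measured code so a collection doesn't skew the timing.
GC.disable
elapsed = Benchmark.realtime { 100_000.times { "foo" + "bar" } }
GC.enable
puts elapsed

# ObjectSpace can report how many objects of each internal type exist.
string_count = ObjectSpace.count_objects[:T_STRING]
puts string_count
```
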
This book provides much more than this post can do justice to. If you've been wondering what happens behind the scenes when you create a lambda and how it's later called, been confused about class variables (shared with subclasses) versus class instance variables (not shared; each class or subclass has its own), wondered how JRuby and Rubinius fit into the picture, or been curious about the similarities between the RObject, RString and other C structs that make up Ruby objects, then this is the book for you. And if you haven't wondered about those things, there's no better way to get started!
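
The class variable versus class instance variable distinction is easiest to see side by side. A minimal sketch, with `Base`, `Child` and the accessor names invented for illustration:

```ruby
class Base
  @@shared = "set in Base" # class variable: one copy for Base and its subclasses
  @own = "Base's own"      # class instance variable: belongs to Base alone

  def self.shared
    @@shared
  end

  def self.shared=(value)
    @@shared = value
  end

  def self.own
    @own
  end

  def self.own=(value)
    @own = value
  end
end

class Child < Base
end

Child.shared = "changed via Child"
puts Base.shared   # => changed via Child
p Child.own        # => nil
Child.own = "Child's own"
puts Base.own      # => Base's own
```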

Tuesday, July 22, 2014

May #TMIL - A Touch of eval()

In a Rails app, I've got a Plan that belongs to a Subscription. The Plan has a duration, and the Subscription has an expiration. The expiration gets set in an after_create callback using the current time (DateTime.now) plus the Plan's duration. Then, there's an expired? method on the Subscription which checks if self.expiration < DateTime.now. Pretty straightforward, right? (code snippet below)

When updating seeds for the Plan, I was hoping to get away with using ActiveSupport conveniences, like 2.weeks, for the Plan's duration. However, 2.weeks is evaluated right away, i.e. it is not stored as "2.weeks" in the database. Storing the value of 2.weeks.to_s (1209600, or the number of seconds in two weeks) also wouldn't work, since DateTime.now + 1209600 gives an unexpected result: DateTime treats a plain number as days, so it adds that many days.
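
The days-versus-seconds gotcha shows up in plain Ruby too, no Rails required. A small sketch using a fixed timestamp so the result is repeatable (the variable names are mine):

```ruby
require 'date'

now = DateTime.new(2014, 7, 22, 17, 48, 52)
seconds = 14 * 24 * 60 * 60 # 1209600, the number of seconds in two weeks

# DateTime#+ interprets a plain number as days, so this lands thousands of years out
wrong = now + seconds
puts wrong.year

# Converting seconds to a fraction of days gives the intended two weeks
right = now + Rational(seconds, 86_400)
puts right
# => 2014-08-05T17:48:52+00:00
```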

The above makes sense, but this is Ruby, so there's gotta be another way. How about serializing 2.weeks and storing it as a lambda in the database? This seemed to work in a console:

[5] pry(main)> now = DateTime.now
=> Tue, 22 Jul 2014 17:48:52 -0400
[6] pry(main)> duration = -> {2.weeks}
=> #<Proc:0x007fc18b3a14a8@(pry):2 (lambda)>
[7] pry(main)> now + duration.call
=> Tue, 05 Aug 2014 17:48:52 -0400

Next, I added serialize :duration to my model, and did the RAILS_ENV=test rake db:drop db:create db:migrate db:seed rigmarole (since the helper tasks for this never seem to work as thoroughly for me) ... BOOM! I pry'ed in where the Plans were getting created and tried to manually create one, and it appears that ActiveRecord won't let you serialize a proc (or lambda, a flavor of proc). While it was tempting to go down the rabbit hole of why, I knew there had to be an easier way than converting all my DateTimes into seconds, doing some other conversions, etc.

The solution? Store the Plan duration as a string, then use eval(). Since the Plans will rarely change or get newly created, and tests are in place, this seemed like an appropriate use of what many consider to be an evil method. Ahh, I love writing in Ruby. Snippet below:
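
Here is a minimal sketch of the idea in plain Ruby so it runs without Rails. Everything in it is an assumption made for illustration: the `Integer#weeks` patch is only a stand-in for ActiveSupport's 2.weeks, `Plan` and `Subscription` are bare Ruby objects rather than ActiveRecord models, the expiration is set in the constructor instead of an after_create callback, and Time is used in place of DateTime so that adding seconds behaves.

```ruby
# Stand-in for ActiveSupport's Integer#weeks so the sketch runs without Rails.
class Integer
  def weeks
    self * 7 * 24 * 60 * 60 # seconds
  end
end

# Hypothetical plain-Ruby versions of the models.
Plan = Struct.new(:duration) # duration is stored as a string, e.g. "2.weeks"

class Subscription
  attr_reader :expiration

  def initialize(plan, now = Time.now)
    # eval runs the stored string as Ruby code. That is only reasonable here
    # because the strings come from trusted seed data, never from user input.
    @expiration = now + eval(plan.duration)
  end

  def expired?(now = Time.now)
    expiration < now
  end
end

plan = Plan.new("2.weeks")
start = Time.utc(2014, 7, 22, 17, 48, 52)
subscription = Subscription.new(plan, start)
puts subscription.expiration
# => 2014-08-05 17:48:52 UTC
```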

Sunday, June 29, 2014

April #TMIL - RSpec

As expected, this series of posts has drifted far off schedule. At least the drafts have been on time! Here are a few RSpec tips, preferences and best practices I've been accumulating after spending the last year using it in my daily work, finally realizing that it's much more art than science.
  • Start by writing an outline of your specs so that when you run them with the --format documentation, or -fd, option, you get a readable outline that clearly explains what the code should (or shouldn't) be doing. This can be more high level, or have a line for the happy/sad/edge paths of each method, or whatever works for you. I've found that using documentation format as a guide, along with organizing your tests using context/describe blocks, helps keep the code SOLID and intentions clear. You can make this the default, and enhance readability with --color, by adding a line for each one to a .rspec file.
  • Along with the above, driving from the outside in has been extremely useful in many situations for me. This Ruby Tapas episode provides a great example of this approach. I've found that starting with BDD leads to more reliable code and helps me to see relationships between objects and scenarios, though I try to avoid mocks (or get rid of them once the design is driven out). I like to start by describing what should happen as a series of Given-When-Then statements, e.g. Given I'm on the home page, When I sign in with valid credentials, Then I should see something awesome, then convert those into grouped specs, extract helper methods, etc. While Cucumber more clearly enforces that syntax, the additional overhead of setting it up gets in the way, and I stick with Capybara in RSpec feature specs.
  • Add complete coverage, but don't overdo it. If it feels like you're rewriting your code in the specs or you're testing parts of your framework, you're doing it wrong. This video has a great discussion of the idea that testing should be "as little as possible to achieve a given level of confidence."
  • Don't forget about Mini Test as an alternative to RSpec. After all, it's in the standard library, complete with examples involving cheeseburgers! Brandon Hilkert's post also reminds us of other benefits, like using fixtures (especially if your factories' associations and traits are getting out of control).
  • Get to know VCR for recording HTTP interactions and Puffing Billy to record browser interactions. This way your specs never make real web requests, which makes the tests more deterministic, spares you from "stubbing the internet" and lets you run your test suite without an internet connection.
  • Be careful with mocks. I find them helpful when it's absolutely essential to know a message on another object gets called as a result of what you're testing. More succinctly, "stubs are for queries and mocks are for commands." They can also be useful for driving out the design of collaborator objects (and for encouraging decoupled code via dependency injection as suggested here). For example, you might mock out a Song class when writing tests for your Playlist class, then convert the mocks to real objects (and non-stubbed methods) once you've built your Song class. More often, I've found that when there's too much mocked or stubbed behavior necessary to set up a test, it's usually a sign the test is focused on the wrong thing, in the wrong place, or the code is too coupled. Opinions on the use and misuse of mocks vary greatly; I've found more success avoiding them when possible. More discussion on the use/misuse of mocks here, here and here. And here. This post by James Golick demonstrates some benefits of the mockist style and reminds one of the importance of having integration tests (which you should have regardless of your testing style) to avoid the common problems with mocking (i.e. mocked behavior allows false positives when the underlying interface has changed).
  • Redo these exercises every now and then and see how much better and faster you're getting.
  • Guard, Guard::LiveReload, and Guard::RSpec are essential for front end development. The basic setup of your Guardfile found in the Railscast will get you going. Just add the gems and bundle, then install the LiveReload browser extension (for Chrome, Safari or Firefox). You can add more plugins, like Guard::CoffeeScript, by running `guard init pluginname`. A list of plugins can be found on the Guard wiki. After enabling the extension in your browser while your development server is running, when you change a file, Guard will refresh your browser window and kick off your specs. Nice.
  • There are lots of other must-haves in your development group; a good overview can be found here. I also like to liberally use Pry and so-called "REPL Driven Development" as described by Pry core contributor Conrad Irwin. (Video | Slides)
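
To illustrate the "stubs are for queries and mocks are for commands" idea with the Playlist and Song example from the list above, here is a rough sketch that deliberately avoids any test framework so it runs on its own. `FakeSong`, its `play_count` and the `Playlist` methods are all hypothetical names; in RSpec the query side would typically be an `allow(song).to receive(:duration)` stub and the command side an `expect(song).to receive(:play)` expectation.

```ruby
# Hand-rolled test double standing in for a Song class that doesn't exist yet.
class FakeSong
  attr_reader :play_count

  def initialize(duration)
    @duration = duration
    @play_count = 0
  end

  # Query: just hands back a canned answer (the "stub" role)
  def duration
    @duration
  end

  # Command: records that the message was sent (the "mock" role)
  def play
    @play_count += 1
  end
end

class Playlist
  # Songs are injected, so the test decides which collaborators to pass in
  def initialize(songs)
    @songs = songs
  end

  def total_duration
    @songs.sum(&:duration)
  end

  def play_all
    @songs.each(&:play)
  end
end

songs = [FakeSong.new(180), FakeSong.new(200)]
playlist = Playlist.new(songs)

puts playlist.total_duration
# => 380

playlist.play_all
puts songs.all? { |song| song.play_count == 1 }
# => true
```

Once a real Song class exists, the fakes can be swapped for real objects and the Playlist code doesn't change.
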
That's all for now, happy spec'ing!