Using MemCached to speed up fragment caching

Like any web 2.0 site, leefjedoel.nl is currently in beta. During this phase we’re trying to find bottlenecks, fix the last few bugs and optimize heavy parts of the site.

During development we already prepared caching for all pages, mostly fragment caching. To expire cache that’s no longer current, we use sweepers that get called when something relevant is updated. These sweepers only sweep the caches that get outdated.

Because we were unfamiliar with caching and needed to see the result of our fragment caching, we used the file_store to store the generated caches. These files are stored on disk, so you can easily see how many caches get generated and what they contain.

Regex and file_store == FAIL

To sweep caches we used regular expressions, so we could easily sweep all relevant caches at once. This was a bad idea, as you can read here and here.

During the beta phase the size of the site steadily increased: more users, more goals, more groups. There was a noticeable delay whenever you updated or created something. A short investigation pointed to the cache sweepers as the culprit.

The file_store isn’t exactly the fastest way to store your cache, but when you combine it with regex sweepers, things really slow down. Whenever you do a regex sweep, all files in the cache directory get returned (not that surprising if you think about it), and the regex is run against each of them. So even if you do a sweep on /goals, it will also return files in /users and /groups. As a result, updating your profile could take 15 seconds.
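To make the cost concrete, here’s a minimal sketch of what a regex sweep over a file-based cache boils down to. It is not Rails’ actual implementation; the regex_sweep and build_sample_cache helpers and the directory layout are made up for illustration. The point is that every file gets visited, even when only a few of them match.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper (not the Rails implementation): walks every file
# under cache_dir and deletes the ones whose relative path matches pattern.
# Returns [files_scanned, files_deleted].
def regex_sweep(cache_dir, pattern)
  scanned = 0
  deleted = 0
  Dir.glob(File.join(cache_dir, '**', '*')).each do |path|
    next unless File.file?(path)
    scanned += 1
    relative = path[(cache_dir.length + 1)..-1]
    if relative =~ pattern
      File.delete(path)
      deleted += 1
    end
  end
  [scanned, deleted]
end

# Builds a throwaway cache directory with one fragment per section.
def build_sample_cache(cache_dir)
  %w[goals users groups].each do |section|
    FileUtils.mkdir_p(File.join(cache_dir, section))
    File.write(File.join(cache_dir, section, 'index.cache'), '<div>cached</div>')
  end
end

Dir.mktmpdir do |dir|
  build_sample_cache(dir)
  scanned, deleted = regex_sweep(dir, %r{\Agoals/})
  puts "scanned #{scanned} files to delete #{deleted}"
end
```

With only three files this is instant, but the scan grows with the total number of cached fragments, not with the number you actually want to expire.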

MemCached

We’d been planning on moving the cache to memcached all along, so this seemed a good opportunity to do it. In the next few paragraphs I’ll describe how to install memcached, get the correct Rails plugin to make memcached play nice with fragment caching and how to configure your Rails application so it uses your memcache server.

Installation of MemCached on GNU/Linux

First of all, you’ll need the memcached daemon. Assuming you have a nice Linux distro you can:

sudo apt-get install memcached

This will work on Debian, Ubuntu and other Debian-based distros. On Gentoo you can:

emerge memcached

The great thing about memcached is its simplicity: it requires no configuration after installation, just run it.

All about the gems baby

Now we’ll get the gem to allow Ruby to talk to memcache. There are two gems that do this, Ruby-MemCache and memcache-client. memcache-client is supposed to be faster, so I used that.

sudo gem install memcache-client

Plugin to play nice with rails

Rails’ fragment caching doesn’t work with memcached out of the box; you’ll need a plugin. This plugin also adds a nice bonus to the cache method in views.

script/plugin install svn://rubyforge.org/var/svn/zventstools/projects/extended_fragment_cache

Environment setup

Now we need to configure your Rails app to use the memcached server. You’ll need to edit your config/environments/production.rb

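# The memcache-client gem is loaded with require 'memcache'
require 'memcache'

memcache_options = {
  :c_threshold => 10_00,
  :compression => true,
  :debug => false,
  :namespace => 'yourappname_or_anything_you_like',
  :readonly => false,
  :urlencode => false
}
# Point the client at the local memcached daemon on its default port
CACHE = MemCache.new(memcache_options)
CACHE.servers = '127.0.0.1:11211'
# Tell Rails to store fragments in memcached
config.action_controller.fragment_cache_store = CACHE, {}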

That’s all folks!

That’s it, you’re all done, the ‘cache’ method in your views will now use the memcache server.

Oh wait, there’s an encore

But there’s more: using memcached you can set expiry times for caches. I edited the plugin to use a default expiry time of 1 day. In vendor/plugins/extended_fragment_cache/lib/extended_fragment_cache.rb look for def write(key,content,options=nil) and change it to:

def write(key, content, options = nil)
  # Use the :expire option when given, otherwise fall back to one day
  expiry = (options && options[:expire]) || 1.day
  begin
    set(key, content, expiry)
  rescue MemCache::MemCacheError => err
    # Log the failure instead of letting it break the request
    ActiveRecord::Base.logger.error("MemCache Error: #{err}")
  end
end

You can change the 1.day to anything you want. To override this default behaviour, you can use the following code in your views

cache('goals/large_cloud', {:expire => 30.minutes.to_i}) do

This makes the cache called ‘goals/large_cloud’ expire 30 minutes after it was created.
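If the interplay between the default and the :expire override isn’t obvious, here’s a small sketch of the same rule in plain Ruby. ExpiringStore is a made-up in-memory stand-in, not memcached and not the plugin’s code; the injectable clock is only there so you can fast-forward time when trying it out.

```ruby
# Hypothetical in-memory stand-in for a memcached-backed fragment store.
class ExpiringStore
  DEFAULT_EXPIRY = 24 * 60 * 60 # one day, in seconds

  # clock is any callable returning the current time in seconds
  def initialize(clock = lambda { Time.now.to_f })
    @clock = clock
    @data = {}
  end

  # Same rule as the edited write method: :expire wins, else one day.
  def write(key, content, options = nil)
    expiry = (options && options[:expire]) || DEFAULT_EXPIRY
    @data[key] = [content, @clock.call + expiry]
    content
  end

  # Returns the content, or nil once the entry has expired.
  def read(key)
    content, expires_at = @data[key]
    return nil if content.nil?
    if @clock.call >= expires_at
      @data.delete(key)
      return nil
    end
    content
  end
end

store = ExpiringStore.new
store.write('goals/large_cloud', '<div>cloud</div>', :expire => 30 * 60)
store.write('goals/list', '<ul>goals</ul>') # falls back to one day
```

The tag cloud would disappear from this store after 30 minutes, while the goal list sticks around for a day unless something expires it earlier.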

There are a few important things to consider when you move to memcached:

1. MemCached doesn’t support regex based expiry of caches. You need to manually enter every cache you want to expire. You can do this in some nice methods of course. Here’s ours for expiring the cache when a user gets updated.

def expire_user_fragments(user)
  fragments = %w[author_icon author_link side_block_friends ..snip...]
  fragments.each do |f|
    expire_fragment("user/#{user.id}/#{f}")
  end
end

2. Your application will fail when the memcached server becomes unavailable. If you ever restart memcached, or if it crashes (haven’t seen that happen yet), you need to restart your mongrel-cluster/thin/ebb.

3. When you restart memcached, the entire cache is cleared (and, as noted above, your app servers need a restart too).
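One way to soften the second point is to treat a cache error as a cache miss. The sketch below is not what our setup does and not part of the plugin; ForgivingCache and DownStore are hypothetical names, and the only assumption about the real client is that it responds to get and set.

```ruby
# Hypothetical wrapper: treats any cache error as a cache miss, so a page
# can still be rendered (slowly) while the cache server is down.
class ForgivingCache
  # store:  anything responding to get(key) and set(key, value)
  # logger: optional callable that receives an error message
  def initialize(store, logger = nil)
    @store = store
    @logger = logger
  end

  # Returns the cached value, or runs the block and tries to store its result.
  def fetch(key)
    cached = safely { @store.get(key) }
    return cached unless cached.nil?
    fresh = yield
    safely { @store.set(key, fresh) }
    fresh
  end

  private

  # Runs the block and swallows errors, returning nil when it fails.
  def safely
    yield
  rescue StandardError => e
    @logger.call("cache unavailable: #{e.message}") if @logger
    nil
  end
end

# Stand-in for a cache client whose server is unreachable.
class DownStore
  def get(_key)
    raise IOError, 'connection refused'
  end

  def set(_key, _value)
    raise IOError, 'connection refused'
  end
end

cache = ForgivingCache.new(DownStore.new)
html = cache.fetch('user/1/author_icon') { '<img src="arie.png">' }
```

The trade-off is that every request falls through to the slow path while the server is down, so you’d still want to notice the log messages quickly.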

This guide only talks about fragment caching. Over at Ben Curtis’ blog you can read all about action caching.

Yet another Ruby server

Ebb bench

Ruby Inside just posted a pretty impressive performance graph of a new Ruby server called Ebb. The graph was taken from the homepage of Ebb.

Now what is Ebb?

The design is similar to the Evented Mongrel web server; except instead of using EventMachine (a ruby binding to libevent), the Ebb web server is written in C and uses the libev event loop library.

Connections are processed as follows:

  1. libev loops and waits for incoming connections.
  2. When Ebb receives a connection, it passes the request into the mongrel state machine which securely parses the headers.
  3. When the request is complete, Ebb passes the information to a user supplied callback.
  4. The Ruby binding supplying this callback transforms the request into a Rack compatible env hash and passes it on a Rack adapter.
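For anyone who hasn’t seen Rack yet, the last step is less magical than it sounds. A Rack application is just an object that responds to call(env) and returns a status, headers and a body. The sketch below builds a heavily trimmed env hash by hand; HELLO_APP and sample_env are made-up names, and a real server such as Ebb fills in many more keys.

```ruby
# A Rack application: responds to call(env) and returns
# [status, headers, body], where body responds to each.
HELLO_APP = lambda do |env|
  body = "You asked for #{env['PATH_INFO']} via #{env['REQUEST_METHOD']}\n"
  headers = {
    'Content-Type' => 'text/plain',
    'Content-Length' => body.bytesize.to_s
  }
  [200, headers, [body]]
end

# Hand-built env hash for illustration; the server normally creates this
# from the parsed HTTP request.
def sample_env(path)
  {
    'REQUEST_METHOD' => 'GET',
    'PATH_INFO' => path,
    'QUERY_STRING' => '',
    'SERVER_NAME' => 'localhost',
    'SERVER_PORT' => '3000'
  }
end

status, headers, body = HELLO_APP.call(sample_env('/goals'))
puts status
body.each { |chunk| print chunk }
```

Because the interface is this small, the same application can sit behind Mongrel, Thin or Ebb without changes.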

The graph describes the performance of a simple Rack application and compares Ebb to Mongrel, Evented Mongrel and Thin. I’m more interested in performance with a Rails application, so I decided to run a benchmark for that.

In my benchmark I used the same application I used for my previous benchmark, only this time I benchmarked some extra pages.

Page 1 is a heavily cached page with few dynamic elements

Page 2 is a less cached page with a few more dynamic elements

Page 3 is a non-cached page with an N+1 performance issue.
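In case the term N+1 is new to you: it means one query to fetch a list, followed by one extra query for every item in that list. The sketch below fakes this with an in-memory “database” that counts queries. QueryCountingDb and the sample data are invented for illustration and have nothing to do with the real application’s models.

```ruby
# Hypothetical in-memory "database" that counts how many queries it gets.
class QueryCountingDb
  attr_reader :query_count

  def initialize(goals, users)
    @goals = goals
    @users = users
    @query_count = 0
  end

  def all_goals
    @query_count += 1
    @goals
  end

  def find_user(id)
    @query_count += 1
    @users[id]
  end

  # Fetches several users in one query, returned as an id => user hash.
  def find_users(ids)
    @query_count += 1
    @users.select { |id, _user| ids.include?(id) }
  end
end

# N+1: one query for the goals, then one query per goal for its author.
def authors_n_plus_one(db)
  db.all_goals.map { |goal| db.find_user(goal[:user_id])[:name] }
end

# Eager: one query for the goals, one query for all authors at once.
def authors_eager(db)
  goals = db.all_goals
  users = db.find_users(goals.map { |goal| goal[:user_id] }.uniq)
  goals.map { |goal| users[goal[:user_id]][:name] }
end

def sample_db
  users = { 1 => { :name => 'Arie' }, 2 => { :name => 'Test User' } }
  goals = [{ :user_id => 1 }, { :user_id => 2 }, { :user_id => 1 }]
  QueryCountingDb.new(goals, users)
end
```

With three goals the first version issues four queries and the second one two; the gap widens with every extra row on the page.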

Ebb was tested using version 0.0.3, while Thin was on version 0.7.0. Both were run in a cluster of 4 behind nginx as a load balancer.

Ebb vs Thin benchmark

Interestingly, Ebb managed to outperform Thin by about 10% on every page.

Ruby web server performance

I’m currently helping a colleague to build a rather large community website using Ruby on Rails. As most Rails developers are well aware, Ruby isn’t exactly the quickest language you can use to build web applications. To get a rough idea of how many users our current code would handle, I decided to run some performance tests.

First, here’s a picture of the page that was used.

Leef je Doel index page

The page has a tagcloud, site statistics and a conditional menu in the upper right corner.

For the benchmark I’m using the following tools

ApacheBench 2.0.40. To fire off a large number of requests at the web server.

Mongrel 1.1.3. One of the Ruby web servers

Thin 0.6.4. An alternative to Mongrel.

nginx 0.5.26. A lightweight HTTP server, which will act as a proxy/balancer in front of the Ruby application servers

For the benchmarks I settled on the following scenarios.

Single instance of Mongrel in development mode

Single instance of Mongrel in production mode

Cluster of 4 Mongrel instances behind nginx

Cluster of 4 Thin instances behind nginx

In a real production environment, you’d never see the single Mongrel instance, and certainly not in development mode. The graph will show you why.

The red bar is for 50 concurrent requests, with a total of 10000. The blue bar is for 10 concurrent requests, with a total of 10000.

Benchmark graph

The raw numbers:

                          10 concurrent   50 concurrent
Mongrel (development)     4.88 req/s      4.59 req/s
Mongrel (production)      77.15 req/s     67.18 req/s
Mongrel cluster + nginx   130.97 req/s    122.04 req/s
Thin + nginx              156.21 req/s    160.89 req/s
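To put the raw numbers in perspective, here’s a quick calculation of the relative differences. The figures are copied from the table above; RESULTS and percent_faster are just names I picked for this sketch.

```ruby
# Requests per second, keyed by setup and concurrency level.
RESULTS = {
  'Mongrel (development)'   => { 10 => 4.88,   50 => 4.59 },
  'Mongrel (production)'    => { 10 => 77.15,  50 => 67.18 },
  'Mongrel cluster + nginx' => { 10 => 130.97, 50 => 122.04 },
  'Thin + nginx'            => { 10 => 156.21, 50 => 160.89 }
}

# Percentage by which candidate beats baseline at a concurrency level.
def percent_faster(candidate, baseline, concurrency)
  ratio = RESULTS[candidate][concurrency] / RESULTS[baseline][concurrency]
  ((ratio - 1.0) * 100).round(1)
end

[10, 50].each do |concurrency|
  gain = percent_faster('Thin + nginx', 'Mongrel cluster + nginx', concurrency)
  puts "#{concurrency} concurrent: Thin is #{gain}% faster than the Mongrel cluster"
end
```

That works out to roughly 19% at 10 concurrent requests and roughly 32% at 50, so the gap between Thin and the Mongrel cluster grows under heavier load.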

I’m pleased with the results. We still have a lot of stuff we could cache and database queries that could be optimized, so there’s room for improvement. The server that will eventually host the application runs an Apache proxy going to mongrel_cluster. I’m probably going to run some benchmarks later, to see how the Apache proxy holds up against nginx.

Book on BDD in Ruby with RSpec

RSpec book

I was browsing through this excellent presentation given by Dave Astels and David Chelimsky at RubyConf 2007. Turns out that David Chelimsky and Aslak Hellesøy are working on a book about Behaviour Driven Development using RSpec.

The presentation is an excellent introduction for anyone thinking about moving to RSpec. It tells you a bit about the (rather short) history of the framework and the differences between BDD and TDD.

It also shows off the awesome plain-text stories I posted about earlier.

Story-based acceptance testing

In my previous post I talked a bit about Behaviour Driven Development and RSpec’s beautiful syntax for writing specifications.

This week I stumbled upon acceptance testing in RSpec and a very pretty way of writing those tests.

A simple acceptance test would look like this:

Story: A user sends an invitation

The invitation page should allow users to invite friends

Scenario: Sending an invitation
Given a new unused email
When the user goes to /invite/create
And the user types 'Test' into the invite_name field
And the user types 'User' into the invite_surname field
And the user types 'someemail@email.com' into the invite_email field
And the user clicks the commit button
Then the page should contain the text 'Test User was invited'

Scenario: Sending an invitation to an already used email address
Given an already used email address someemail@email.com
When the user goes to /invite/create
And the user types 'Test' into the invite_name field
And the user types 'User' into the invite_surname field
And the user types 'someemail@email.com' into the invite_email field
And the user clicks the commit button
Then the page should contain the text 'Email already exists'

Now these tests don’t do anything by themselves; you need to run them against the application.
This is where Selenium comes in. Using RSpec’s story runner, you can feed these plain-text stories to Selenium.
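How do plain-text steps end up as actions? The rough idea is that each line is matched against a pattern and the captured values are handed to a block. The sketch below shows that idea in plain Ruby. StepRegistry and demo_registry are made-up names; this is not the RSpec story runner API, and a real setup would drive Selenium instead of filling a hash.

```ruby
# Hypothetical mini step registry, for illustration only.
class StepRegistry
  def initialize
    @steps = []
  end

  # Registers a pattern together with the block to run when it matches.
  def step(pattern, &block)
    @steps << [pattern, block]
  end

  # Strips the Given/When/Then/And keyword, finds the first matching
  # pattern and calls its block with the world plus the captured values.
  def run(line, world)
    text = line.strip.sub(/\A(?:Given|When|Then|And)\s+/, '')
    @steps.each do |pattern, block|
      match = pattern.match(text)
      return block.call(world, *match.captures) if match
    end
    raise ArgumentError, "no step matches: #{line}"
  end
end

# Two steps from the invitation story, recorded into a hash instead of
# being sent to a browser.
def demo_registry
  registry = StepRegistry.new
  registry.step(/\Athe user goes to (\S+)\z/) do |world, path|
    world[:path] = path
  end
  registry.step(/\Athe user types '(.*)' into the (\w+) field\z/) do |world, value, field|
    world[:fields][field] = value
  end
  registry
end

world = { :fields => {} }
registry = demo_registry
registry.run('When the user goes to /invite/create', world)
registry.run("And the user types 'Test' into the invite_name field", world)
```

After these two lines the hash holds the visited path and the value typed into invite_name, which is the kind of thing a real step would pass on to the browser.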

Selenium will open your web application in a browser and go to URLs, enter data, click buttons/checkboxes/radiobuttons. It’s kinda spooky to see your browser navigate to pages by itself 😉

Here’s another test that checks the login function

Story: A user logs in

Scenario: Logging in fails
Given a user with username 'Arie' and password 'test'
When the user goes to /login
And the user types 'Arie' into the login field
And the user types 'not-test' into the password field
And the user clicks the commit button
Then the page should contain the text 'Unable to login'

Scenario: Logging in successfully
Given a user with username 'Arie' and password 'test'
When the user goes to /login
And the user types 'Arie' into the login field
And the user types 'test' into the password field
And the user clicks the commit button
Then the page should contain the text 'Logged in'

For now I’m not planning on covering my entire application with tests like this. I’m just going to make a few tests that I can show to my school when I finish my internship.

Behaviour Driven Development

I’m currently doing an internship as the final part of my studies. Luckily I’ve been able to find a company where I can use Rails to create a new application.

One of the choices I made for the development is the use of BehaviourDrivenDevelopment (BDD), which is a variation on TestDrivenDevelopment (TDD).

RSpec is a BDD framework for Ruby that integrates with Rails. It features a beautiful way of expressing the expected behaviour of your application.

Here’s an example of the expected behaviour of a user controller

  it "should flash a notice after succesful signup" do
    User.should_receive(:new).with({"name" => 'Arie'}).and_return(@user)
    @user.should_receive(:save!)
    post 'signup', {:user => {:name => 'Arie'}}
    flash[:notice].should eql(_('Thanks for signing up, you will have to activate your account before you can log in'))
  end

And an example of expected behaviour of a user model

  specify "should be invalid with invalid zip code" do
    @user.attributes = valid_user_attributes.except(:zip)
    @user.zip = 'tralalalalala'
    @user.should_not be_valid
    @user.errors.on(:zip).should eql(_('is invalid'))
    @user.zip = '1234AB'
    @user.should be_valid
  end

The main idea of TDD and BDD is that you write these tests or specifications before writing the actual code. The basic workflow while doing BDD/TDD is:

1. Write tests/specifications

2. Run tests/specifications

3. See them fail because there’s no code yet

4. Code until tests no longer fail

It takes some discipline to strictly follow this pattern, because a future piece of code might seem so trivial, that you want to code it immediately.
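The loop above can be sketched in plain Ruby, without RSpec, to show the order of events. The names zip_spec and valid_zip? are made up for this example, and the zip pattern (four digits followed by two capital letters) is my assumption based on the ‘1234AB’ value in the model spec; it is not the application’s real validation.

```ruby
# Step 1: write the specification first.
def zip_spec
  examples = [
    ['1234AB', true],
    ['tralalalalala', false]
  ]
  examples.all? { |zip, expected| valid_zip?(zip) == expected }
end

# Steps 2 and 3: run it and watch it fail, because valid_zip? does not
# exist yet.
FIRST_RUN = begin
  zip_spec
rescue NoMethodError
  :failed
end

# Step 4: write code until the specification passes.
# Assumed format: four digits followed by two capital letters.
def valid_zip?(zip)
  !!(zip =~ /\A\d{4}[A-Z]{2}\z/)
end

SECOND_RUN = zip_spec

puts "first run: #{FIRST_RUN.inspect}, second run: #{SECOND_RUN.inspect}"
```

The first run fails for the boring reason that the method is missing, which is exactly the point: the specification exists before the code does.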

rcov results

Writing the specifications like this takes some time, but you’ll easily make up for it while you’re coding your application. You can use Rcov to check if your specifications cover all your code. When your specifications cover all your code with sensible tests, you can easily refactor your code, because the specifications will make sure your new code works properly.

Also, you can generate human-readable specifications from these RSpec files. Here’s a snippet of how that looks:

A user
– should be invalid without a username
– should have unique login name
– should be invalid without an email
– should be invalid if email is not between 3 and 100 characters in length
– should have unique email address

The UserController
– should redirect to profile after successfull activation
– should flash a notice when a valid activation code was used
– should flash an error when an invalid activation code was used

In combination with AutoTest, BDD is already saving me a lot of time during development.

Whenever I’ve changed some of my source code or specifications, the specifications get tested automatically. If something breaks, there’s a small error popup, so you know you should check the AutoTest window for the test that failed.

autotest error

Spot the typo

‘The User Controller should flash an error when an invalid activation code was used‘ FAILED
expected “Unable to activate the account.”, got “Unable to activate the accnt.” (using .eql?)

./spec/controllers/user_controller_spec.rb:29:
script/spec:4:

If all tests pass you get a cheerful message.

AutoTest success