Moving from Opscode Chef to Rightscale

It all started with, “Hey Anthony, since we’re launching in a few weeks we should probably make the Bright Score infrastructure scalable.  We might get a few users.”

Anthony, the CTO, wasn’t worried: “No problem — we’ll just use Rightscale.  They recently released support for Softlayer (our hosting provider).  Rightscale also supports Chef, so we can just import our existing Chef scripts and be done.”

This is our ensuing tale of failures and triumphs.

The Bright Score at its simplest takes a job description and a resume, and calculates a score.  We needed to scale that up to millions of resumes and millions of job descriptions while maintaining low latency.  Big data today usually refers to Hadoop, but our data is a variant of a bipartite graph problem, so Hadoop is not a good fit (think of the Bright Score as an edge between a job node and a person node).  We have more of a traditional batch analytics problem.  Since our tasks are largely CPU-limited (resumes and job descriptions are rather sparse), the application scales nearly horizontally: adding more nodes adds capacity.
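
To make the "edge between a job node and a person node" picture concrete, here is a minimal Ruby sketch.  The toy_score function is a made-up stand-in (plain word overlap), not the actual Bright Score, and the round-robin partition helper and sample data are likewise hypothetical.  The only point is that each edge can be computed independently of every other, so the pairs can be dealt out to as many workers as are available.

```ruby
require 'set'

# Hypothetical stand-in for the real scoring function: the fraction of
# job-description words that also appear in the resume (0.0 to 1.0).
def toy_score(job_text, resume_text)
  job_words = job_text.downcase.scan(/\w+/).to_set
  return 0.0 if job_words.empty?
  resume_words = resume_text.downcase.scan(/\w+/).to_set
  (job_words & resume_words).size.to_f / job_words.size
end

# Every (job, person) pair is one edge of the bipartite graph.
def edges(jobs, resumes)
  jobs.keys.product(resumes.keys)
end

# Deal the edges out round-robin to `worker_count` workers.
def partition(pairs, worker_count)
  buckets = Array.new(worker_count) { [] }
  pairs.each_with_index { |pair, i| buckets[i % worker_count] << pair }
  buckets
end

# Score one worker's share of the edges.
def score_bucket(bucket, jobs, resumes)
  bucket.map do |job_id, person_id|
    [[job_id, person_id], toy_score(jobs[job_id], resumes[person_id])]
  end.to_h
end

jobs    = { j1: "ruby chef devops", j2: "physics data analysis" }
resumes = { p1: "devops engineer, ruby and chef", p2: "data scientist with physics phd" }

buckets = partition(edges(jobs, resumes), 2)
scores  = buckets.map { |b| score_bucket(b, jobs, resumes) }.reduce({}, :merge)
```

In this sketch the two buckets never need to talk to each other; merging the per-worker results at the end is the only coordination step.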

Rightscale is built for such problems.  You create a Server template that bootstraps nodes using either RightScripts (bash scripts) or Ruby-based Chef scripts.  Coincidentally, the Bright Score project already used Chef as its configuration management tool.  Rightscale upsizes or downsizes your Server Array (a cluster of VM instances that perform a common task) based on various factors, in our case the CPU load of the servers.  Nodes can vote to grow or shrink the Server Array based on user-defined criteria (e.g. CPU usage).
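
The voting idea can be sketched in a few lines of Ruby.  Everything here is an illustrative assumption rather than Rightscale's actual implementation or API: the CPU thresholds, the 51% decision threshold, the one-node resize step and the method names are all made up for the example.

```ruby
# Each node looks at its own CPU load and casts a vote.
# The thresholds are illustrative assumptions, not Rightscale defaults.
def cast_vote(cpu_percent, grow_above: 80.0, shrink_below: 20.0)
  return :grow   if cpu_percent > grow_above
  return :shrink if cpu_percent < shrink_below
  :none
end

# The array resizes by one node only when a large enough share of nodes agree.
# Growing is checked first, which matters only if the decision threshold is
# set at or below 50%.
def next_array_size(cpu_loads, min: 1, max: 10, decision_threshold: 0.51)
  return min if cpu_loads.empty?

  votes = cpu_loads.map { |cpu| cast_vote(cpu) }
  size  = cpu_loads.size
  grow_share   = votes.count(:grow).to_f / size
  shrink_share = votes.count(:shrink).to_f / size

  if grow_share >= decision_threshold
    [size + 1, max].min
  elsif shrink_share >= decision_threshold
    [size - 1, min].max
  else
    size
  end
end
```

With these assumed settings, next_array_size([90, 85, 95, 30]) asks for a fifth node because three of the four nodes vote to grow, while a mixed array such as [90, 10, 50, 50] stays at four.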

Although this migration eventually worked, we learned a couple of lessons that we’re documenting here for those who follow in our footsteps:

  • Rightscale Chef works well with a large number of operating systems and hosting providers, as long as they are Ubuntu and Amazon Web Services.  The nodes that we wanted to clone were Fedora 16.  While building the Bright Score, we tried various flavors of CentOS and Fedora, but for the particular combination of packages we preferred, Fedora 16 was the only OS that worked seamlessly.  We were running Fedora 16 at Softlayer, so surely it should have been trivial to spin up a Fedora 16 node using Rightscale and manage it using Chef … wrong.  The Rightscale/Chef combination only works if you have a specially built “RightImage”, a server image with special packages that enable it to communicate with RightScale.  The RightImage contains the operating system as well as Ruby and a dated version of Chef solo (version 0.09 of Chef, which is deprecated).  The available RightImages at Softlayer were both limited and rather old (Ubuntu 10.04 and CentOS 5.5), and did not support our stack (which is known to work on Ubuntu 11.10 or later, or Fedora 15 or later).  We tried to build a Fedora 16 RightImage, but after a week we abandoned this and moved our efforts to Amazon.  At Amazon, we were greeted with hundreds of available RightImages.  We were up and running within hours.  This is a classic chicken-and-egg problem: so many people are using RightScale with AWS that it works better there.
  • Rightscale Chef is only vaguely related to Opscode/Open Source Chef.  The scripts look the same, but the way they are used is fundamentally different.  The workflow we learned from the Jedi Chef Master was to use Knife to upload Chef scripts to the Chef server and then, again with Knife, bootstrap a new node with a run list and desired environment from the Chef server.  The last recipe would add a cron job that runs chef-client at regular intervals.  Rightscale Chef gets rid of many fundamental concepts that we rely on, like environments, data bags and attributes.  Here are the key differences:
  1. Chef recipes can only be loaded into Rightscale via GitHub, as far as we can tell.  You hook up your GitHub repository to RightScale, click a button, and all your cookbooks are imported.
  2. Recipes are only run automatically at boot time.
  3. The “Run List” is created manually using the web site (this is rather tedious).
  4. Recipes can be run individually at later times using the dashboard.
  5. There is no possibility of asking RightScale nodes to “phone home” by calling chef-client.
  6. Knife is not needed except to create the metadata file.
  7. Conversely, we discovered after a day or two of puzzlement that the metadata.json file needs to be there and needs to be correct.  Each sub-recipe needs to be specified.
  8. Chef attributes are replaced by Rightscale inputs that can be set in the browser.  This introduces a dependency in the Chef recipe on Rightscale-specific recipes, which means that recipes using attributes NEED TO BE REWRITTEN FOR RIGHTSCALE.
  9. We were forced to move our Databag items into RightScale inputs.
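
A check along the following lines could catch the metadata problem in item 7 before a cookbook is imported.  It is plain Ruby, not a Rightscale or Knife feature: the undeclared_recipes helper and the brightscore cookbook name are hypothetical.  The sketch assumes the usual cookbook layout (recipe files under recipes/) and that metadata.json carries a "recipes" hash keyed by "cookbook::recipe", with a bare cookbook name standing for its default recipe.

```ruby
require 'json'

# Return the recipes that exist as files but are not declared in metadata.json.
# `recipe_files` is a list of paths such as "recipes/default.rb".
def undeclared_recipes(cookbook_name, metadata_json, recipe_files)
  keys = JSON.parse(metadata_json).fetch('recipes', {}).keys

  # Normalise: a bare cookbook name is taken to mean its default recipe.
  declared = keys.map { |k| k.include?('::') ? k : "#{k}::default" }

  expected = recipe_files.map do |path|
    "#{cookbook_name}::#{File.basename(path, '.rb')}"
  end

  expected - declared
end

# Hypothetical metadata for an imaginary "brightscore" cookbook.
metadata = <<~JSON
  {
    "name": "brightscore",
    "recipes": {
      "brightscore": "Installs the base packages",
      "brightscore::worker": "Configures a scoring worker"
    }
  }
JSON

files   = ['recipes/default.rb', 'recipes/worker.rb', 'recipes/cron.rb']
missing = undeclared_recipes('brightscore', metadata, files)
```

Here missing holds the one recipe file, cron.rb, that has no entry in the metadata.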

If none of this is news to you, you need to come work for us.  These are things that happen when a Scientist (Bright Score 33 for that position) and CTO try to do a devops’ job.

David Hardtke and Anthony Duerr

One Comment on “Moving from Opscode Chef to Rightscale”

  1. Ryan J. Geyer
    July 20, 2012 at 8:48 am #

    Dave & Anthony,

    Thanks for sharing your experiences getting your infrastructure up and running on RightScale. I wanted to share a few tips and tricks I’ve picked up being a seasoned RightScale user, which may be applicable for you. (I’m also now a RightScale employee, but opinions expressed here are my own, your mileage may vary, etc., etc.)

    Operating system support is an ongoing challenge, but the good news is that you can create your own “RightImage”. The only real magic there is the inclusion of the “RightLink” agent which can be installed on a variety of operating systems with a package. Below is a link to instructions on installing it and creating your own “RightImage”. They are unfortunately for AWS (chicken and egg, you said so yourself ;-) ) but are more-or-less applicable for any cloud including SoftLayer.
    http://support.rightscale.com/12-Guides/RightLink/04-Creating_RightScale-enabled_Images_with_RightLink/RightLink_Installer_for_RedHat_Enterprise_Linux_(RHEL)

    Better still, the latest RightScale images using RightLink 5.8 actually include Chef 0.10.10, so that’s no longer deprecated.

    On the Chef front, you can also pull in your code from SVN or a zip file. RightScale does cache your cookbook code much the same as Chef Server. The difference is that when it’s uploaded to RightScale, it’s replicated to a distributed content delivery network for availability and performance.

    The recipes are also run when a server is rebooted, but it’s true that you can only define that runlist using the UI (or the API, which could be a bit tedious as well). Rightscale is looking into various approaches for supporting something akin to roles, and I’m sure the product manager would love to talk to you. :)

    Inputs that are set in the RightScale dashboard do get passed into the Chef run as node attributes, but that shouldn’t necessarily introduce any dependencies unless you’re using the RightScale specific cookbooks or definitions (like rs_utils_marker etc). You can simply write your cookbook to have default attributes in the cookbook, and use them with “normal” chef server/client by overriding those attributes in a role or at the chef server for a specific node.

    Databags are another sticking point indeed, and there are some discussions going on there as well. In the interim, I’ve written some VERY preliminary POC code which monkey patches the “search” keyword to give databag-like functionality by pulling complex data from a node attribute containing JSON. I’ll be happy to share this with you if you’d like, though it’d need some work to be genuinely useful.

    Lastly, if you want to have chef-client like functionality as far as continuously converging is concerned, there is a RightScale cookbook/recipe for that. You can specify a list of recipes to be re-run on a cadence (currently 15 minutes) so you can bring your nodes back to the specified configuration. Check it out at the URL below.
    https://github.com/rightscale/rightscale_cookbooks/blob/master/cookbooks/sys/recipes/do_reconverge_list_enable.rb

    Sorry for the long winded response! I really enjoyed reading this and appreciate that you documented your experience. I’d be happy to provide additional feedback if there’s anything you left out. :)

    -Ryan J. Geyer-
