RailsConf Recap II: Fuzed

Posted by timgoh
on Sunday, June 08

This is the second in a series of RailsConf notes. The first one, on custom nginx modules, includes background on the raison d’être for these notes and can be found here.


Build Your Own Distributed, Self-Configuring Rails Cluster

Speakers: Dave Fayram and Tom Preston-Werner of Powerset

Primary Materials: Additional Resources:

Rating: A: Hell Yes, B: Yes, C: Somewhat, D: Yes (see previous post for what this means)

Dave and Tom started the talk with two choice quotes. One from an Evan Phoenix tweet:

“If humans can land robots on Mars, then there is no reason we can’t make a fast Ruby.”

Naturally they pointed out the rather large difference in budgets!

And one of their own:

“No matter how fast Ruby is, we will always need a way to scale it!”

They then launched into a description of traditional scaling.

Simple case: nginx -> proxy-pass to different mongrels

More: nginx -> proxy-pass to mongrels on different boxes

Even more: load balancer -> multiple nginxes on multiple boxen -> multiple mongrels
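To make the simple case concrete, here is a minimal Ruby sketch of the round-robin choice a proxy makes when it spreads requests over a fixed list of Mongrels. The `RoundRobin` class and the backend addresses are hypothetical, picked only for illustration; in a real setup nginx does this itself.

```ruby
# Round-robin selection over a fixed backend list (illustrative only).
class RoundRobin
  def initialize(backends)
    raise ArgumentError, "need at least one backend" if backends.empty?
    @backends = backends
    @index = 0
  end

  # Return the next backend in order, wrapping around at the end.
  def next_backend
    backend = @backends[@index]
    @index = (@index + 1) % @backends.size
    backend
  end
end

# Hypothetical Mongrel addresses.
mongrels = RoundRobin.new(%w[127.0.0.1:8000 127.0.0.1:8001 127.0.0.1:8002])
picks = 4.times.map { mongrels.next_backend }
puts picks.inspect
```

Note that the backend list is static: adding a box means editing the list, which is the kind of manual step the talk goes on to question.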

This was followed by a list of questions which they answered later with regard to Fuzed. I shall practice DRY by not listing the questions here.

Fuzedemo

With that, time for the demonstration to begin. First, on Tom’s laptop:

$ fuzed start -n master volcano.local
$ fuzed frontend -z volcano.local -r test/app/public -s 'kind=rails' -n1@volcano.local -j api

At this point, static assets are up, but dynamic requests can’t be handled yet. Just one more command:

$ fuzed rails -z volcano.local --rails-root=test/app -n n1@volcano.local -c3

Now we have three Rails workers running on Rack, and dynamic requests can be served.

(More elaboration on these commands can be found in the README on GitHub.)

Then Dave joined in the fun, adding his machine to the mix. The new node was connected to the same cluster, and the master “self-assembled”. They then demonstrated the self-healing ability of Fuzed, taking down a node. The master gave feedback instantly, and the node was replaced.

For the next point, I have an exclamation in my notes: “reports over HTTP!”. Basically, status information can be queried over HTTP. In their example, the root page of localhost:9001 was reporting “There are 1 pool(s) attached to this master with a total of 3 worker(s)”

And this naturally extends to localhost:9001/status/rails/4 reporting “Expected at least 4 worker(s) in the 1 rails pool(s). Found 3.”

Now why the exclamation? Because HTTP support provides so much flexibility (Jacob Kaplan-Moss’s elaboration of how significant that is for CouchDB is particularly enlightening). Also, the obvious application of this is that the nodes running inside the cluster can easily be made aware of the bigger picture.
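As a sketch of how a node or a monitoring script might consume these reports, the Ruby below parses the two status messages quoted above into numbers. Only the message texts come from the talk; the helper names and the regex approach are my own assumptions. Against a running master, the text could be fetched with something like `Net::HTTP.get(URI("http://localhost:9001/"))`.

```ruby
# Parse Fuzed status messages as quoted in these notes.
# The helper names are hypothetical; only the message texts come from the talk.

# "There are 1 pool(s) attached to this master with a total of 3 worker(s)"
def parse_summary(text)
  m = text.match(/There are (\d+) pool\(s\).*total of (\d+) worker\(s\)/)
  m && { pools: m[1].to_i, workers: m[2].to_i }
end

# "Expected at least 4 worker(s) in the 1 rails pool(s). Found 3."
def parse_check(text)
  m = text.match(/Expected at least (\d+) worker\(s\).*Found (\d+)\./)
  m && { expected: m[1].to_i, found: m[2].to_i, ok: m[2].to_i >= m[1].to_i }
end

summary = parse_summary("There are 1 pool(s) attached to this master with a total of 3 worker(s)")
check   = parse_check("Expected at least 4 worker(s) in the 1 rails pool(s). Found 3.")
puts summary.inspect
puts check.inspect
```

Both helpers return `nil` for text they do not recognize. The notes only record the "not enough workers" message, so the wording of a passing check is not handled here.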

And it was about to get cooler.

But first, the “Fuzedvantages” (their word, not mine!), in the form of answering some de facto scalability questions mentioned earlier:

Fuzedvantages

Q: What happens if a machine fails?

A: It will automatically be taken out of the rotation, and rejoin once back up

Q: What happens if you need to add more hardware fast?

A: No configuration changes necessary. Just add more nodes. They mentioned that Powerset was running Fuzed on a lot of machines; Erlang can handle it

Q: What happens if you have a mixture of very fast and very slow pages?

A: Fuzed uses “next available resource” queueing. Nodes with long-running requests will not be available so they won’t be overloaded by new ones

Q: What do you do about a staging setup?

A: Run multiple versions of your app in the same cloud!

Q: How do you deal with scaling in a flexible cloud (like EC2)?

A: Puh-leeze. Fuzed was made for a dynamic cloud environment

Q: What happens if you change your hosting environment? (cloud to colo, etc)

A: Change a few hostnames in start-up commands, and you’re done
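The “next available resource” answer can be sketched with a thread-safe queue of idle workers: a worker is checked out for one request and handed back only when it finishes, so a busy worker never receives new work. This is my own illustration in Ruby, not Fuzed’s Erlang implementation; `WorkerPool` and the worker names are hypothetical.

```ruby
require "thread" # Queue is built in on modern Rubies; harmless elsewhere

# "Next available resource" dispatch, sketched with Ruby's thread-safe Queue.
class WorkerPool
  def initialize(worker_names)
    @idle = Queue.new
    worker_names.each { |name| @idle << name }
  end

  # Block until some worker is idle, run the request on it, then hand it back.
  def handle(request)
    worker = @idle.pop            # waits if every worker is busy
    begin
      yield worker, request
    ensure
      @idle << worker             # worker rejoins the idle queue
    end
  end

  # Number of workers currently waiting for work.
  def idle_count
    @idle.size
  end
end

pool = WorkerPool.new(%w[rails-1 rails-2 rails-3])
result = pool.handle("/slow-report") do |worker, path|
  "#{worker} served #{path}"
end
puts result
puts pool.idle_count
```

Contrast this with plain round-robin, where a worker stuck on a slow page still gets its turn and requests pile up behind it.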

More advantages given outside of the Q&A format:

  • insanely flexible: multiple dimensions of horizontal scalability
  • front-end and back-end can scale independently
      • one master node
      • faceplate (ports connected to the internet): static assets, portal to dynamic assets
      • worker (Rails nodes, other types)

And things that Fuzed does not help you with:

  • scaling your DB (too application specific)
  • raw speed – this is about scaling horizontally, not vertically

FuzedGuts

Basic Fuzed architecture

master node: a single Erlang process
faceplate: any Erlang process that knows how to translate the outside world into Erlang requests
resource_manager: a monitor over Unix pipes, talking between masters and workers through ports
  • master creates resource pool, with ‘ports’ that correspond to resource_manager ports
  • faceplate obtains a direct connection to the machine that hosts the resource through the resource manager, which translates the response from the worker from Ruby to Erlang

Then a colorful description of a Fuzed Chassis:

Like the River Styx only where Erlang is the Earth and Ruby is Hades.

Chassises (chasses? chassi?) are language-agnostic constructs that handle deployment of different platforms (Rails, Merb, Django, etc).

The request is made Rack-compliant and sent to the Rails Rack handler, which produces an HTML response. This response is changed to a “yaws-style response” (Yaws is an Erlang web server; it expects deeply nested arrays), sent out to a faceplate, and then to the client.
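For readers who have not met Rack: an app is any object that responds to `call(env)` and returns `[status, headers, body]`, where the body responds to `each`. The sketch below shows a tiny Rack-style app and a conversion of its response into nested arrays. The nested layout is invented for illustration; my notes do not record the actual yaws-style shape Fuzed uses, and `HELLO_APP` and `to_nested_response` are hypothetical names.

```ruby
# A Rack-style app: takes an env hash, returns [status, headers, body].
HELLO_APP = lambda do |env|
  body = "<h1>Hello from #{env['PATH_INFO']}</h1>"
  [200, { "Content-Type" => "text/html" }, [body]]
end

# Hypothetical conversion of a Rack response into nested arrays.
# The real yaws-style layout is not shown in these notes.
def to_nested_response(rack_response)
  status, headers, body = rack_response
  html = String.new
  body.each { |chunk| html << chunk }   # Rack bodies only promise #each
  [:response,
   [[:status, status],
    [:headers, headers.map { |name, value| [name, value] }],
    [:html, html]]]
end

env = { "REQUEST_METHOD" => "GET", "PATH_INFO" => "/posts" }
puts to_nested_response(HELLO_APP.call(env)).inspect
```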

Have a look at how few lines of code the Rails Chassis is.

Code for a simple chassis they used:

require 'chassis'
class AdderNode < Chassis
  kind "calculon" 
  handle(:crash) do |args|
    raise "You asked me to crash, so I did." 
  end
end
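The real `chassis` library ships with Fuzed, so the snippet above will not run on its own. To show how a DSL with `kind` and `handle` can work in plain Ruby, here is a self-contained toy stand-in. `ToyChassis`, `ToyAdderNode`, `dispatch` and the `:add` handler are all hypothetical and say nothing about how Fuzed implements its chassis.

```ruby
# Toy stand-in for a chassis-style DSL; not the Fuzed library itself.
class ToyChassis
  class << self
    # With an argument, record the node kind; without one, read it back.
    def kind(name = nil)
      @kind = name if name
      @kind
    end

    # Register a block to run for a named request.
    def handle(name, &block)
      handlers[name] = block
    end

    # Handlers are stored per subclass.
    def handlers
      @handlers ||= {}
    end
  end

  # Look up the handler for +name+ and call it with +args+.
  def dispatch(name, args = {})
    handler = self.class.handlers.fetch(name) do
      raise ArgumentError, "no handler for #{name.inspect}"
    end
    handler.call(args)
  end
end

class ToyAdderNode < ToyChassis
  kind "calculon"

  # Hypothetical handler, added so the node does something besides crash.
  handle(:add) do |args|
    args[:a] + args[:b]
  end

  handle(:crash) do |_args|
    raise "You asked me to crash, so I did."
  end
end

node = ToyAdderNode.new
puts ToyAdderNode.kind
puts node.dispatch(:add, { a: 2, b: 3 })
```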

Next they had a bonus presentation on a scalable JSON server, using curl to make a JSON request and getting JSON back.

Finally, they elaborated on their to-do list:

  • eliminate master node as single point of failure
  • include additional dispatch details: URL paths, HTTP headers, flexible error handling strategies
  • more chassis modules

All in all this was a vastly interesting talk that had the audience oohing and aahing at times. I don’t know if Dave and Tom have presented together before but they handled it with aplomb—both got a good amount of air time and they didn’t step on each other’s toes.

It’s cool that this RailsConf has shown so many new options for scaling that are language-agnostic as well (CitrusByte’s own Pool Party, which is specialized for AWS, is also language-agnostic). It’s a strong signal to the rest of the web developer community that the Ruby community is by no means insular. More on this some other time…

TechCrunch on Twitter: Translation

Posted by timgoh
on Friday, May 23

I’m eventually going to write a more formal piece on my company blog, but in the meantime snark will suffice.

TechCrunch has been slamming Twitter and Rails. Let’s have a look at what they’re really saying.

(With apologies to Mark Pilgrim and John Gruber)


[Twitter’s] small team contains a handful of engineers, with only a person or two committed to infrastructure and architecture. He goes on to point out that at Digg the team for network and systems alone is bigger than the total engineering team at Twitter

I have never read the Mythical Man Month. More people are better.

I’ll also ignore the fact that Scribd has more web traffic than Twitter with only 11 total employees, information which I can get from my own affiliate site.

The problems at Twitter are often attributed to their use of RubyOnRails, a web development framework.

This sentence gives me an easy way of saying I never blamed RoR for Twitter. But of course it’s Rails’ fault, it’s “often attributed”! I will also ignore Evan Williams’s statement ‘Lots of our code is not in RoR, already, though.’

Twitter is almost certainly the largest site running on Rails

There are various ways of interpreting “almost certainly the largest”. One of which is ‘6th largest’.

Utilizing a framework that has never conquered large-scale territory [...]

I don’t consider the 474th or 775th ranked Alexa sites ‘large-scale territory’.

As an out-of-the box framework, Rails certainly doesn’t lend itself to large-scale application development.

I shall now pull terms with no meaning like “out-of-the-box framework” out of my ass. Oh and in case you still haven’t gotten my point, Twitter doesn’t scale because Rails doesn’t scale. Marvel at my skilful use of the non causa pro causa and cum hoc ergo propter hoc logical fallacy combo!

[...] But the old adage of “Good, Fast, Cheap – pick two” certainly applies;

Will trotting out a cliche let me get away with making subjective assertions? Hell yeah!

Rails would do itself no harm by conceding that it isn’t a platform that can compete with Java or C when it comes to intensive tasks

I shall now compare this “out-of-the-box framework” to two languages, without any substantiation. And look at how I wriggled out of that one… I have the “would do itself no harm by conceding” qualifier! I never said it isn’t a competitive platform. And if you read the comments, I’ll show you the intensive tasks I’m talking about that are total CPU-killers, such as waiting on database requests.

What we see at Twitter today is a very useful and popular service, but one with very complex underlying technical challenges to overcome. Twitter will require not only a new architecture approach and a big injection of the best minds they can find [...]

Damn it, I still haven’t hit my required word count. Here, enjoy this poo-poo platter of my best verbiage.

And later in the comments

It wouldn’t surprise me if the DB is falling over on those requests. Also judging by message ID’s and what I have seen so far, their data is in no way segmented – so you have one BFT (big fucking table)

Sorry, did I say their problem was Rails? It’s actually their database architecture! No wait, all of the above! Oh and my superpower is divining a site’s database architecture based on the IDs it uses. Why, just the other day I saw an ID on Google Docs and it hit me: “Those guys use BigTable”.

I’ll make random unsubstantiated guesses about their technology instead of asking other sites about their usage of Twitter’s API and what they’ve found out, or benchmarking Starling.

You want proper investigative journalism or insightful technical analysis? Sorry, you’re in the wrong place. This is the “hearsay and hyperbole” section. You’re looking for somewhere else.