It is rather late for a set of RailsConf recap posts, but I did promise friends who couldn’t make it that I’d take notes and give them what I’ve got. I decided that I might as well post these notes while I’m at it. I realize that in most cases the slides will be available, but I believe there’s enough value added (it’s easier to read prose than slides, some slides are just one phrase that was expanded on in speech, picture slides would be explained, links are clickable, etc.) to justify the time spent writing it.
I also recognize the code use may be an issue, so:
All code in this post courtesy of Adam Wiggins
Here’s a quick format for how I’ll be rating each talk. I dislike arbitrary X out of Y scores, so I’ll rate each talk against the following criteria:
- A. Was it entertaining/interesting?
- B. Did I learn anything?
- C. Was I given more resources to follow up?
- D. Did this help me make a decision?
The first three are fairly straightforward. The last means the talk should help me choose a direction: whether I’m interested in learning more about the subject matter, or convinced I never want to touch it or hear about it again. Either is fine, as long as I leave the talk with an opinion.
You’ll notice these are binary yes/no questions, but I’ll be employing fuzzy logic when necessary.
Enough of the preamble, on with the show:
Custom Nginx Modules: Accelerate Rails, HTTP Tricks
Speaker: Adam Wiggins of Heroku
Primary Materials:
Additional resources:
Rating: Yes for criteria A, C, D. Emphatic Yes! for B.
Adam first takes us through a brief introduction of Nginx and its place in the
future:
Why Nginx? Adam’s case for Nginx replacing Apache: it is faster, has a smaller
memory footprint, is more stable under load, and is more secure.
More importantly, it is a better fit for Rails. mod_proxy is a bolt-on that
doesn’t fit well with Apache’s architecture. Apache is the right tool for
mod_php, or for owning your own server hardware, but that era is now coming to a close.
What’s the new paradigm?
One element of the new paradigm is standalone long running processes. The
frontend server just proxies to these processes. Another element is cloud
computing, which Adam defines as
“transient and horizontal scalable resources”
A smart proxy can track resources in real time and route each request to where
it needs to go.
This is where nginx shines. Proxying is nginx’s primary (in fact, only)
mechanism for serving dynamic content. It embraces the constraint of keeping
application VMs out of the front-end web server. This allows it to be leaner,
more focused, faster, and more stable. Adam pulled out the canonical City
Slickers quote here: “One thing. Just one thing. You stick to that and the
rest don’t mean shit.”
“Wait, there’s something funny going on here”
So if nginx is an intermediary, is it still right to call it a web server? No,
because that is a leftover term from a previous era. Adam calls nginx an “HTTP
router”.
Here Adam used a Back to the Future metaphor: lightning striking the clock tower
just as Doc is connecting the cables. Doc Brown is the HTTP router, and
lightning is the HTTP request. The important thing to take away is that your primary goal is just-in-time handling of requests.
Adam then took us through some sample setups [1]:
Basic nginx conf
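upstream myapp_mongrels {
    # each backend is declared with the "server" directive
    server 127.0.0.1:3000;
    server 127.0.0.1:3001;
}
# the location block lives inside a server { } block
location / {
    proxy_pass http://myapp_mongrels;
}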
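By default nginx hands requests to the servers of an upstream block in round-robin order. As a rough illustration of that behaviour (not nginx's actual implementation), here is a minimal Ruby sketch. The `RoundRobin` class is hypothetical, and it ignores weights and failed backends, which real nginx does account for.

```ruby
# Hypothetical sketch of round-robin selection over a fixed list of backends.
class RoundRobin
  def initialize(backends)
    @backends = backends
    @next = 0
  end

  # Return the next backend in rotation, wrapping around at the end of the list.
  def pick
    backend = @backends[@next % @backends.size]
    @next += 1
    backend
  end
end

upstream = RoundRobin.new(["127.0.0.1:3000", "127.0.0.1:3001"])
picks = 4.times.map { upstream.pick }
puts picks.inspect
```

Each Mongrel gets every second request, which is all the basic config above asks of the front end.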
With memcached in front
location / {
    set $memcached_key $uri;
    memcached_pass 127.0.0.1:11211;
    error_page 404 502 = @myapp;
}
location @myapp {
    internal;
    proxy_pass http://myapp_mongrels;
}
You’re proxying two completely different services, but from one place. But you
don’t want to cache POST requests, so you get:
Memcached in front with method filter
location / {
    if ($request_method = GET) {
        set $memcached_key $uri;
        memcached_pass 127.0.0.1:11211;
        error_page 404 502 = @myapp;
        break;
    }
    proxy_pass http://myapp_mongrels;
}
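The control flow of that last config can be restated as a small Ruby sketch: only GET requests consult the cache, keyed by URI, and a miss falls through to the application. The `CACHE` hash standing in for memcached, the `APP` lambda standing in for the Mongrel upstream, and the `route` method are all hypothetical names for illustration. Note that nginx's memcached module only reads from the cache, so the application is assumed to be responsible for writing entries into it.

```ruby
# Hypothetical stand-ins: a Hash plays memcached, a lambda plays the upstream app.
CACHE = { "/about" => "<h1>About (cached)</h1>" }
APP   = ->(method, uri) { "app handled #{method} #{uri}" }

# Mirrors the nginx config: GETs try the cache first (key = URI);
# cache misses and all non-GET requests are proxied to the app.
def route(method, uri)
  if method == "GET"
    cached = CACHE[uri]
    return cached if cached
  end
  APP.call(method, uri)
end

puts route("GET", "/about")    # served from the cache
puts route("GET", "/missing")  # cache miss, falls through to the app
puts route("POST", "/about")   # never touches the cache
```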
Next up, Adam tackles filtering, which we’re all familiar with: mod_ssl,
mod_rewrite, mod_gzip. He defines filtering as
“reaching in and tinkering with requests and responses”
He then extended the concept to every portion of a request or response
(headers, bodies, etc.). The main philosophy is that the
server only needs to worry about serving, and endpoints can be ignorant of
what’s going on.
How to use filters for separate concerns:
- filters in your app for business logic
- filters in your http router for server infrastructure
- filters in either for application infrastructure
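To make the idea of a filter concrete, here is a minimal Ruby sketch of one that wraps an endpoint and touches both the request and the response while the endpoint stays ignorant of it. It follows the Rack calling convention (`call(env)` returning status, headers, and body) but does not need the rack gem. The `RequestIdFilter` class, the header names, and the values are hypothetical, not something from the talk.

```ruby
# Hypothetical filter: wraps any endpoint that responds to call(env).
class RequestIdFilter
  def initialize(app)
    @app = app
  end

  def call(env)
    env["HTTP_X_REQUEST_ID"] ||= "req-1"        # tinker with the request
    status, headers, body = @app.call(env)
    headers["X-Served-By"] = "router-sketch"   # tinker with the response
    [status, headers, body]
  end
end

# The endpoint only sees the env it is handed; it knows nothing about the filter.
endpoint = ->(env) { [200, { "Content-Type" => "text/plain" }, ["id=#{env['HTTP_X_REQUEST_ID']}"]] }

stack = RequestIdFilter.new(endpoint)
status, headers, body = stack.call({})
puts status, headers.inspect, body.inspect
```

Because the filter and the endpoint share one small interface, the same filter could in principle sit in the app or be reimplemented in the HTTP router, which is the separation the list above describes.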
Adam subsequently “got hardcore”, showing us how an authentication filter
(application infrastructure) would be done as a custom nginx module.
The goal is granular user access control. This is normally done with a
before_filter in Rails; the aim here is to get it without any Rails code.
Solution? An input filter module in Nginx, ngx_heroku_gate.
Rails version
1. before_filter :authorize
2.
3. def authorize
4.   @user = User.find_by_id(session[:user_id])  # nil when nobody is logged in
5.   @resource = request.env['REQUEST_URI']
6.
7.   # stop at the first failed check so only one redirect is issued
8.   return redirect_to('/login') unless @user
9.   return redirect_to('/access_denied') unless @user.can_access(@resource)
10. end
Nginx version
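// corresponding to line 1: register the handler in the access phase
static ngx_int_t ngx_heroku_gate_init(ngx_conf_t *cf)
{
    ngx_http_core_main_conf_t *cmcf = ngx_http_conf_get_module_main_conf(cf, ngx_http_core_module);
    ngx_http_handler_pt *h = ngx_array_push(&cmcf->phases[NGX_HTTP_ACCESS_PHASE].handlers);
    if (h == NULL) {
        return NGX_ERROR;
    }
    *h = ngx_heroku_gate_handler;
    return NGX_OK;
}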
The rest:
static void write_heroku_user_header(ngx_http_request_t *r, u_char *user);

static ngx_int_t ngx_heroku_gate_handler(ngx_http_request_t *req) {
    u_char *user, *app_name; // helper return types are assumed here

    // corresponding to lines 4-5
    user = get_logged_in_user(req->headers_in.cookies);
    app_name = get_app_name(req->headers_in.host->value.data);
    if (!user) // corresponding to line 8
    {
        redirect_to(req, "/login");
        return NGX_HTTP_MOVED_TEMPORARILY; // short circuits everything else
    }
    if (!user_can_access(user, app_name)) // corresponding to line 9
    {
        redirect_to(req, "/access_denied");
        return NGX_HTTP_MOVED_TEMPORARILY;
    }
    else
    {
        write_heroku_user_header(req, user); // app level code
        return NGX_OK;
    }
}
#define X_HEROKU_USER "X-Heroku-User"
static void write_heroku_user_header(ngx_http_request_t *r, u_char *user)
{
    ngx_table_elt_t *h;

    h = ngx_list_push(&r->headers_in.headers);
    if (h == NULL) {
        return; // allocation failed; leave the request untouched
    }
    h->hash = 1;
    h->key.len = sizeof(X_HEROKU_USER) - 1; // exclude the trailing NUL
    h->key.data = (u_char *) X_HEROKU_USER;
    h->value.len = strlen((char *)user);
    h->value.data = user;
}
The helper function code for get_logged_in_user and redirect_to can be found in Adam’s follow-up blog
post.
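On the application side, the header written by the module would be read like any other request header. The sketch below assumes a Rack-style server, which exposes a request header such as X-Heroku-User as `env["HTTP_X_HEROKU_USER"]`. The helper names `header_env_key` and `current_heroku_user` are hypothetical and are not part of Adam's code.

```ruby
# Rack-style servers expose request headers as "HTTP_" + upper-cased name,
# with dashes turned into underscores.
def header_env_key(name)
  "HTTP_" + name.upcase.tr("-", "_")
end

# Return the user name set by the front end, or nil when it is absent or empty.
def current_heroku_user(env)
  user = env[header_env_key("X-Heroku-User")]
  user unless user.nil? || user.empty?
end

env = { "HTTP_X_HEROKU_USER" => "adam" }
puts current_heroku_user(env)
```

An app that relies on this header is trusting the front end completely, so this only makes sense if the router is the sole way in and overwrites any client-supplied value of the same header.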
Performance-wise, Adam described nginx modules as “insanely faster”. When apps
get big, they’re prone to abuse: spam, malicious spiders, etc. With a custom
nginx module for authentication, you get to turn away bad requests at very low
cost (almost free).
Adam’s closing thoughts:
- an HTTP router is what nginx is evolving into, and the idea can be taken a lot further
- Rack: stacking handlers in a framework-independent way
- the Merb router is very powerful
- having inter-process HTTP routing gets you horizontal scaling
- HTTP is the enabling protocol for the era of cloud computing
- apps should be changed to take advantage of cloud computing; since web
  requests are inherently stateless, they’re very well suited to it
This was exactly the kind of talk I love: no dumbing down, lots of code, and practical applications. Thanks Adam!
[1] If I had known these would be available, I wouldn’t have typed like a madman to get all these in. They’re now available in the slides, but I’ll go ahead and copy-paste from my notes anyway to justify my keyboard abuse during the conference!