A Second Here, A Second There…

A Mysterious Performance Problem

In production, we were seeing lots of requests to our backend services taking a long time. They would typically be over 1000ms when we would expect them to be much, much faster. It was intermittent too. Sometimes the same endpoints were fast, sometimes they were slow.

When we investigated, we were able to reduce the performance problem to the following mystifying test case, which should happen on your system too.

With this Sinatra app:

require 'sinatra'

post "/foo" do
  status 200
  "There are #{params.length} params"

Making this request:

curl -d "a=1" "localhost:3000/foo"

takes almost exactly 0 seconds.

(Note, we use libcurl to talk to services instead of Ruby’s Net::HTTP, so this is very close to a real service request from one of our applications.)

On the other hand, making this request:

curl -id "a=$A_KB_OF_DATA" "localhost:8080/foo"

takes almost exactly 1 second.

Now, it shouldn’t take 1 second to make such a simple request just because I’m sending 1K of data.

So what on earth is happening here? If you already know the answer then much respect to you; but this was a head scratcher for me.

After spending a long time in the Sinatra, Rack and Unicorn stack, I was able to trace the long wait to the socket read call in the Unicorn request object.

At which point, Graham (our Head of Tech Ops) suggested that we dump the packets to examine what Curl was actually sending to the server and receiving back. The command we used for this was:

tcpdump -i lo -s 0 port 8080

(Which says, capture all packets on the loopback interface, in their entirety, to and from port 8080.)

Inspecting the contents of these packets led to us (or at least me) learning some new things about HTTP.

I Suck At HTTP

Here’s something I didn’t know about HTTP, from the RFC. An HTTP client can omit the request body and add this header to the request:

Expect: 100-continue

Which is a way of saying “I’m not sure if you want this request body, do you?”

If the server does want the request body, it should respond with a 100 response, and the client can then upload the body and the request/response continues as normal from there.

If it does not want the request body, it can respond with whatever other HTTP code.

Now, this is handy if there are circumstances where you want to reject a request before you go to the trouble of uploading a perhaps massive request body. (Maybe an enormous image file being posted to an endpoint that is not accepting uploads right now.)

It’s entirely up to the client whether to include this header. You can not bother at all, or only bother for very large files, or do it all the time. All valid approaches.

Curl says: if your post data is larger than 1K, I’ll send the Expect header. I say: OK Curl. 1 kilobyte seems pretty small, but whatever, fine.

So far so good.

Here’s the problem: Ruby web servers do not implement 100 Continue automatically.

And they shouldn’t, either. It’s up to the application to make that choice, receive or not receive, so application authors need to implement that themselves in general. And we didn’t.

So what happens? Curl sits there waiting. “Come on, come on, do you want this request body or not?

Curl makes a Weird Choice

So true, our app was technically broken. It happens. And we would have fixed it if Curl had timed out waiting for the 100, as you might expect. Because that would have been an error we would have noticed and fixed.

But Curl doesn’t timeout waiting. If it has sent the Expect header, Curl will wait for the 100 response, but only for exactly one second. After one second has passed, it will send the request body anyway.

So from our apps point of view, everything is fine. It was just a bit slow. No errors raised or reported. Just a slow service. Sometimes.

I guess from the point of view of Curl, it’s best to make every effort to have a successful request, even with performance degradation. From our point of view, failing hard would have been better than months of silent performance regression.

Working Around It

So we now have two options. The first and best is to actually implement the 100 Continue behaviour correctly. The second is to make Curl not behave like this.

Because we control all the clients to our internal services and wanted a quick fix, we decided to go with the second approach, which I’ll describe first.

If you set the Expect header yourself, even to a blank string, that overrides Curl’s behaviour and forces it to send the entire request immediately:


We did this and fixed all the slow requests we were seeing instantly.

Fixing it Properly (With Unicorn)

It’s not entirely obvious how to implement the required behaviour in a Sinatra application. How would you respond with 100 and then continue processing a request in Sinatra?

Fortunately, Unicorn has support for 100 Continue built in. If you respond with 100 from your application, it will return that to the client, and then immediately turn around and call your app a second time with the full data.

So we can make our application detect the Expect header and respond with 100 if appropriate.

We’d like to be able to make this decision from Sinatra, but I believe that Sinatra eagerly reads all of the request, so the delay is triggered before your action block is run.

So instead, include the logic in a Rack middleware, like this. Although here there is no logic, we simply request the body always and immediately if we detect that the client is waiting for 100 response:

class AlwaysRequestBody
  def initialize(app)
    @app = app

  def call(env)
    if env["HTTP_EXPECT"] =~ /100-continue/
      [100, {}, [""]]

use AlwaysRequestBody
run Sinatra::Application

With this in place, the previously slow curl request returns instantly, and you can see here Curl logging the intermediate 100 response as well.

$ time curl -id "a=$A_KB_OF_DATA" "localhost:8080/foo"
HTTP/1.1 100 Continue

HTTP/1.1 200 OK
Date: Tue, 27 Nov 2012 17:32:52 GMT
Status: 200 OK
Connection: close
X-Frame-Options: sameorigin
Content-Type: text/html;charset=utf-8
X-XSS-Protection: 1; mode=block
Content-Length: 18
There are 1 params

real    0m0.008s
user    0m0.008s
sys     0m0.008s

Problem solved.


Hence my initial assertion that your app is probably broken. If a client sends the Expect header it has every right to expect a correct response,