Re: Throughput degradation after upgrading haproxy from 1.3.22 to 1.4.1

From: Erik Gulliksson <erik.gulliksson#diino.net>
Date: Mon, 15 Mar 2010 10:27:38 +0100


Hi Willy,

Thanks for your detailed answer.

> Did you observe anything special about the CPU usage ? Was it lower
> than with 1.3 ? If so, it would indicate some additional delay somewhere.
> If it was higher, it could indicate that the Transfer-encoding parser
> takes too many cycles but my preliminary tests proved it to be quite
> efficient.

I did not notice anything special about CPU usage; it is around 2-4% with both versions. When checking the Munin graphs this morning, however, I did notice that the "connection resets received" counter from "netstat -s" is increasing a lot more with 1.4.

This led me to look at the logs more closely, and there are a lot of new errors that look something like this:

  w.x.y.z:4004 [15/Mar/2010:09:50:51.190] fe_xxx be_yyy/upload-srvX 0/0/0/-1/62 502 391 - PR-- 9/6/6/3/0 0/0 "PUT /dav/filename.ext HTTP/1.1"

This only happens for a few of the PUT requests; most requests seem to get proxied successfully. I will try to reproduce this in a more controlled lab setup where I can sniff the HTTP headers to see what is actually sent in the request.
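For that lab test I have something roughly like the following in mind (just a sketch; the address, port and path are placeholders, not our real setup). It sends a single PUT with a chunked body, where each chunk is a hex size line followed by the data and a CRLF, which if I understand your explanation below is exactly what haproxy 1.4 now has to parse to find the end of the request body:

  #include <stdio.h>
  #include <string.h>
  #include <unistd.h>
  #include <netdb.h>
  #include <sys/socket.h>
  #include <sys/types.h>

  int main(void)
  {
      struct addrinfo hints, *res;
      memset(&hints, 0, sizeof(hints));
      hints.ai_socktype = SOCK_STREAM;

      /* placeholder address/port for the haproxy frontend in the lab */
      if (getaddrinfo("127.0.0.1", "8080", &hints, &res) != 0)
          return 1;

      int fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
      if (fd < 0 || connect(fd, res->ai_addr, res->ai_addrlen) < 0)
          return 1;

      /* request line + headers, then two chunks and the terminating
       * zero-size chunk; each chunk is "<hex size>\r\n<data>\r\n" */
      const char *req =
          "PUT /dav/test.bin HTTP/1.1\r\n"
          "Host: lab\r\n"
          "Transfer-Encoding: chunked\r\n"
          "\r\n"
          "5\r\nhello\r\n"
          "5\r\nworld\r\n"
          "0\r\n\r\n";
      write(fd, req, strlen(req));

      /* dump whatever comes back so the 502/PR-- case is easy to spot */
      char buf[4096];
      ssize_t n;
      while ((n = read(fd, buf, sizeof(buf))) > 0)
          fwrite(buf, 1, n, stdout);

      close(fd);
      freeaddrinfo(res);
      return 0;
  }

That should let me compare what our client sends with this minimal case and see whether the request itself is what triggers the PR-- errors.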

> No, I've run POST requests (very similar to PUT), except that there
> was no Transfer-Encoding in the requests. It's interesting that you're
> doing that in the request, because Apache removed support for TE:chunked
> a few years ago because there was no user. Also, most of my POST tests
> were not performance related.

Interesting. We do use Apache for parts of this application on the backend side, although PUT requests are handled by an in-house developed Erlang application.

> A big part has changed, in previous version, haproxy did not care
> at all about the payload. It only saw headers. Now with keepalive
> support, it has to find requests/responses bounds and as such must
> parse the transfer-encoding and content-lengths. However, transfer
> encoding is nice to components such as haproxy because it's very
> cheap. Haproxy reads a chunk size (one line), then forwards that
> many bytes, then reads a new chunk size, etc... So this is really
> a cheap operation. My tests have shown no issue at gigabit/s speeds
> with just a few bytes per chunk.
>
> I suspect that the application tries to use the chunked encoding
> to simulate a bidirectional access. In this case, it might be
> waiting for data pending in the kernel buffers which were sent by
> haproxy with the MSG_MORE flag, indicating that more data are
> following (and so you should observe a low CPU usage).
>
> Could you please do a small test : in src/stream_sock.c, please
> comment out line 616 :
>
>   615                          /* this flag has precedence over the rest */
>   616                     //     if (b->flags & BF_SEND_DONTWAIT)
>   617                                  send_flag &= ~MSG_MORE;
>
> It will unconditionally disable use of MSG_MORE. If this fixes the
> issue for you, I'll probably have to add an option to disable this
> packet merging for very specific applications.

I tried commenting out the line above as instructed, but it made no noticeable change. As stated above, I will try to reproduce the problem in a lab setup. This may be an issue with our application rather than haproxy.
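For my own understanding, this is roughly what I believe MSG_MORE does at the socket level (a contrived sketch, not haproxy's actual code): with the flag set the kernel may hold the bytes back and merge them with the following send(), so a small chunk can sit in the kernel buffer until something is written without the flag.

  #include <string.h>
  #include <sys/socket.h>
  #include <sys/types.h>

  /* Send one chunk as three writes. With MSG_MORE on the first two, the
   * kernel is free to coalesce them with the next write instead of
   * emitting a tiny packet per piece; only the last send() of the last
   * chunk omits the flag and lets the data go out immediately. */
  static ssize_t send_chunk(int fd, const char *sizeline,
                            const char *data, size_t len, int last)
  {
      send(fd, sizeline, strlen(sizeline), MSG_MORE);
      send(fd, data, len, MSG_MORE);
      return send(fd, "\r\n", 2, last ? 0 : MSG_MORE);
  }

If that picture is correct, disabling the flag should have made every write go out immediately, which is why I no longer suspect the packet merging itself.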

Best regards
Erik

-- 
Erik Gulliksson, erik.gulliksson#diino.net
System Administrator, Diino AB
http://www.diino.com