Re: HAProxy Linux kernel 2.6 tuning for high load/connections

From: Willy Tarreau <w#1wt.eu>
Date: Sat, 1 Aug 2009 08:22:47 +0200


Hello,

On Wed, Jul 29, 2009 at 12:17:47AM -0700, xx yy wrote: (...)
> But let's do it one more time, and this time we will let it run a little longer
>
> root@lighttpdubu:~# httperf --server 127.0.0.1 --port 80 --uri /index.html --rate 250 --num-conn 100000 --num-call 1 --timeout 5
(...)
> Errors: total 39141 client-timo 31987 socket-timo 0 connrefused 0 connreset 0
> Errors: fd-unavail 7154 addrunavail 0 ftab-full 0 other 0
>
> ====== I had to stop it because it timed out with the same settings ==========
>
> So it is just a matter of time until it times out, because the TCP ports are not
> being reused; this is why I need some advice on tuning the system stack.
>
> This is what I have until now:
>
> net.ipv4.tcp_fin_timeout = 1

This one is definitely wrong! You're not even allowing the other side to lose the last ACK and retransmit it! Please increase this value to at least 30s.
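For instance (30s is just the suggestion above, not a value taken from an actual test):

    sysctl -w net.ipv4.tcp_fin_timeout=30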

> net.ipv4.ip_local_port_range="1024 65536"

This one is wrong too. Port 65536 does not exist. I don't know what the system will do with such a range. Maybe it ignores it, maybe it will randomly fail to bind to the source port, maybe it internally limits it and will not be affected... Lots of maybes.

Also you need to set net.ipv4.tcp_tw_reuse to 1, so that the client can reuse a source port whose previous connection is still in TIME_WAIT once all ports have been used.
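Put together, the two fixes would look like this (65535 being the highest valid port; the lower bound of 1024 is just an example):

    sysctl -w net.ipv4.ip_local_port_range="1024 65535"
    sysctl -w net.ipv4.tcp_tw_reuse=1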

> sysctl -w fs.file-max=2000000
>
> Are my tests wrong, or can't the system handle more than 200 requests/second?

There is no such limit. I regularly test above 75000 requests/s between a client and a server.

> > You also need to adjust haproxy settings according to your webservers'
> > capabilities (maxconn/weight/minconn/inter).
>
> Can you please exemplify this?

What Benoit meant is that if your servers are configured for 200 concurrent connections, you must not exceed that in haproxy's configuration (server's "maxconn" parameter). If your servers are unbalanced in terms of capacity, you should reflect this in the "weight" parameter, etc...
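As a purely hypothetical illustration (server names and addresses are made up), a backend with one server capped at 200 concurrent connections and a weaker one capped at 100 could look like this:

    backend web
        balance roundrobin
        server web1 192.168.0.10:80 check inter 2000 maxconn 200 weight 100
        server web2 192.168.0.11:80 check inter 2000 maxconn 100 weight 50

Here "inter" is the health check interval in milliseconds; "minconn" (not shown) additionally lets haproxy raise a server's limit dynamically.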

> > A basic apache2 memory footprint is around 10 MB/process, so to answer 100
> > concurrent connections it needs 1 GB of memory
>
> This is why I quit Apache. I am looking for a testing methodology to
> determine the maximum requests/second that a server can handle - I know it is
> not fully HAProxy-related but any suggestions are very welcome.

The number of requests/s is easily checked by requesting a small object, so that you don't account for the network transfer time.
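For instance, reusing the httperf flags from your test against a hypothetical small object, something like this measures the request path rather than the data transfer:

    httperf --server 192.168.0.1 --port 80 --uri /small.txt \
            --rate 1000 --num-conn 10000 --num-call 10 --timeout 5

With --num-call 10, each connection carries 10 requests, so the connection setup cost is amortized as well.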

> > I would recommend tuning the software stack *before* tuning the system
> > stack.

I agree a lot here! Too often we see people doing crap on system tunables instead of fixing the application. The system must only be tuned when you know *why* you have to tune it.

> I followed your advice and tuned the webservers a little, changing to epoll,
> sendfile and noatime, but Willy Tarreau did not post his sysctl parameters and
> HAProxy settings used in the last test

That's precisely because there is nothing magic here: I have not changed them from the defaults my system boots with (though I agree that those defaults were already tuned a bit a long time ago). Basically, they just consist in increasing the source port range, increasing the TIME-WAIT buckets and max_orphans, setting tw_reuse to 1, and increasing somaxconn and max_syn_backlog in order to accept 10000-20000 unacknowledged connections. When I unpack the machines and boot them again I can give you the specific values, as I don't have them all in mind, but once again, nothing really magic here.
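In addition to the port range and tw_reuse settings shown earlier, that tuning would look roughly like this; the values below are illustrative assumptions, not the ones used in the test:

    # allow many more sockets in TIME-WAIT, and more orphaned sockets
    sysctl -w net.ipv4.tcp_max_tw_buckets=1048576
    sysctl -w net.ipv4.tcp_max_orphans=262144
    # queues large enough for 10000-20000 not-yet-accepted connections
    sysctl -w net.core.somaxconn=16384
    sysctl -w net.ipv4.tcp_max_syn_backlog=16384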

> and I find it hard to beleive that a 2.6 kernel can handle 38000 concurrent
> connections without at least increasing the tcp_backlog queue or decreasing
> the fin_timeout.

The backlog and fin_timeout have nothing to do with concurrent connections; they are related to the connection rate. The source port range will impact concurrent connections, and so will the max number of sockets (fs.file-max). And BTW I just rechecked: in this test I reached 38000 connections/s, those were not concurrent connections, so yes, backlog and fin_timeout have to be slightly adjusted. But if haproxy never takes more than 10ms between two accept() cycles, even a very small backlog of 380 is OK (38000 connections/s x 0.01 s = 380 pending connections at most). The fin_timeout issue is properly handled by the tw_reuse setting and the fact that haproxy does a setsockopt(SO_REUSEADDR).
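If you want to check whether your backlog really overflows, the kernel keeps counters for it; on a 2.6 system something like this shows them (the exact wording of the output varies between kernel versions):

    netstat -s | grep -i listen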

Regards,
Willy

Received on 2009/08/01 08:22
