Haproxy timing issues

From: Erik Torlen <erik.torlen#apicasystem.com>
Date: Tue, 1 Nov 2011 20:16:23 +0000


Hi,

I am currently (and have been from time to time the last weeks) doing some heavy loadtesting against haproxy with stud in front of it handling the ssl.

My loadtest has been focused on loadtesting SSL traffic through stud against haproxy on amazon ec2.

Our current problem is that we cannot get more then ~30k active connections (~150 conns/s) until we starting to see increased response time (>10-60s) on the client side. Running with 38k connections now and seeing much higher response time.

The setup is:
1 instance running haproxy + stud
2 instances running varnish server 3 cached images

Varnish has 100% cache hit ratio so nothing goes to the backend.

We have tried using m1.xlarge and the c1.xlarge. The m1.xlarge uses almost 100% cpu when doing the loadtests while c1.xlarge has a lot of resources left (stud using a few percent per process) and haproxy ~60-70%cpu. The only difference is that c1.xlarge gives quite better response time before the actual problem happens where resp times are increasing.

Haproxy is running with nbproc=1
Stud is running with n=6 and shared session cache. (Tried it with n=3 as well

From the logging in haproxy I could see that the time it takes to establish a connection against the backend and receive the data:

Haproxy.log
Nov 1 18:40:35 127.0.0.1 haproxy[18511]: x.x.x.x:54113 [01/Nov/2011:18:39:40.273] varnish varnish/varnish1 4519/0/73/50215/54809 200 2715 - - ---- 238/236/4/5/0 0/0 "GET /assets/images/icons/elite_logo_beta.png HTTP/1.1" Nov 1 18:40:35 127.0.0.1 haproxy[18511]: x.x.x.x:55635 [01/Nov/2011:18:39:41.547] varnish varnish/varnish1 3245/0/81/50207/53535 200 1512 - - ---- 238/236/3/4/0 0/0 "GET /assets/images/icons/favicon.ico HTTP/1.1" ...
Nov 1 18:40:44 127.0.0.1 haproxy[18511]: x.x.x.x:34453 [01/Nov/2011:18:39:25.330] varnish varnish/varnish1 3082/0/225/32661/79559 200 1512 - - ---- 234/232/1/2/0 0/0 "GET /assets/images/icons/favicon.ico HTTP/1.1" Nov 1 18:40:44 127.0.0.1 haproxy[18511]: x.x.x.x:53731 [01/Nov/2011:18:39:25.036] varnish varnish/varnish1 3377/0/216/32669/79854 200 1725 - - ---- 233/231/0/1/0 0/0 "GET /assets/images/create/action_btn.png HTTP/1.1"

Haproxy.err (NOTE: 504 error here)

Nov 1 18:40:11 127.0.0.1 haproxy[18511]: x.x.x.x:34885 [01/Nov/2011:18:39:07.597] varnish varnish/varnish1 4299/0/27/-1/64330 504 194 - - sH-- 10916/10914/4777/2700/0 0/0 "GET /assets/images/icons/favicon.ico HTTP/1.1" Nov 1 18:40:12 127.0.0.1 haproxy[18511]: x.x.x.x:58878 [01/Nov/2011:18:39:12.621] varnish varnish/varnish2 314/0/55/-1/60374 504 194 - - sH-- 3692/3690/3392/1623/0 0/0 "GET /assets/images/icons/favicon.ico HTTP/1.1"

Nov 1 18:40:18 127.0.0.1 haproxy[18511]: x.x.x.x:35505 [01/Nov/2011:18:39:42.670] varnish varnish/varnish1 3515/0/22078/10217/35811 200 1512 - - ---- 1482/1481/1238/710/1 0/0 "GET /assets/images/icons/favicon.ico HTTP/1.1" Nov 1 18:40:18 127.0.0.1 haproxy[18511]: x.x.x.x:40602 [01/Nov/2011:18:39:42.056] varnish varnish/varnish1 4126/0/22081/10226/36435 200 1512 - - ---- 1475/1474/1231/703/1 0/0 "GET /assets/images/icons/favicon.ico HTTP/1.1"

Here is the logs from running haproxy with varnish as a backend on the local machine:

Haproxy.log
Nov 1 20:00:52 127.0.0.1 haproxy[18953]: x.x.x.x:38552 [01/Nov/2011:20:00:45.157] varnish varnish/local_varnish 7513/0/0/0/7513 200 1725 - - ---- 4/3/0/1/0 0/0 "GET /assets/images/create/action_btn.png HTTP/1.1" Nov 1 20:00:54 127.0.0.1 haproxy[18953]: x.x.x.x:40850 [01/Nov/2011:20:00:48.219] varnish varnish/local_varnish 6524/0/0/0/6524 200 1725 - - ---- 2/1/0/1/0 0/0 "GET /assets/images/create/action_btn.png HTTP/1.1"

Haproxy.err
Nov 1 20:00:38 127.0.0.1 haproxy[18953]: x.x.x.x:39649 [01/Nov/2011:20:00:08.665] varnish varnish/local_varnish 7412/0/22090/23/29525 200 1511 - - ---- 15700/15698/267/268/1 0/0 "GET /assets/images/icons/favicon.ico HTTP/1.1" Nov 1 20:00:38 127.0.0.1 haproxy[18953]: x.x.x.x:54565 [01/Nov/2011:20:00:12.255] varnish varnish/local_varnish 3823/0/22090/23/25936 200 1511 - - ---- 15700/15698/266/267/1 0/0 "GET /assets/images/icons/favicon.ico HTTP/1.1"

I see on all these tests that haproxy-stats is showing %1 idle but "top" shows that haproxy are using ~70% cpu?

The "jungle" aka amazon and its internal network are causing a lot of latency when running varnish on external machine. The response times get better when running varnish locally (~0.5s improvement).
But there is still very high response times in haproxy.err when running varnish locally?

I have played around with sysctl values and found some that improved my performance. My feeling is that I need to tune some more values in order to go beyond this level, suggestions?

Kind Regards
Erik Received on 2011/11/01 21:16

This archive was generated by hypermail 2.2.0 : 2011/11/01 21:30 CET