Re: Using HAproxy in web server benchmark configuration (e.g. SPECweb)?

From: Willy Tarreau <w#1wt.eu>
Date: Mon, 5 Jan 2009 22:50:18 +0100


On Mon, Jan 05, 2009 at 01:22:52PM -0800, Hsin, Chih-fan wrote:
> Thank you for your quick response. The SPECweb2005 and diagram can be found at http://www.spec.org/web2005/docs/designdocument.html

OK, approximately what I expected.

> What you described is basically correct.
> The primary client is a PC which does all the benchmark control stuff. There are actually several client machines that send the web request to SUT. SUT communicates with the backend simulator (which simulates the backend application server). SUT will send the web response back the corresponding client machine.
>
> The distribution part from the real client web requests to SUT (a set of web servers) seems to be straight-forward. I can use HAproxy and use the round-robin method for now. The web responses from the web servers to the real clients should be the direct communication (without going through HAproxy). I guess, but not sure. The web responses need to have the source address as the virtual common SUT address though. Does HAproxy support this?

No, you cannot proceed like this. Haproxy receives a TCP connection from a client and establishes another TCP connection to a server. Those are two distinct TCP connections (it's a proxy), so the server's response cannot return directly to the client. It's not just a matter of source address (haproxy can use the client's IP as its source when connecting to the server), it's really a matter of different TCP sessions (sequence numbers, options, window, etc.). You'll have the same limitation with any proxy-based load-balancer. On the other hand, only proxy-based load-balancers are able to take layer 7 contents into account (e.g. URL, file name and extension).
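To make the "two distinct TCP connections" point concrete, here is a minimal sketch in Python. It is only an illustration, not how haproxy is implemented: the helper names (start_backend, start_proxy, demo) are made up for this example, everything runs on 127.0.0.1, and the toy proxy relays a single request and a single response. The backend reports the source port of whoever connected to it, and that turns out to be the proxy's own outgoing socket, not the client's.

```python
import socket
import threading

def start_backend():
    """Toy server: answers one request with the source port of its peer."""
    listener = socket.create_server(("127.0.0.1", 0))
    addr = listener.getsockname()

    def run():
        conn, peer = listener.accept()
        with conn:
            conn.recv(1024)                       # read the request
            conn.sendall(str(peer[1]).encode())   # report who connected
        listener.close()

    threading.Thread(target=run, daemon=True).start()
    return addr

def start_proxy(backend_addr, seen):
    """Toy proxy: one accepted connection, one new connection to the backend."""
    listener = socket.create_server(("127.0.0.1", 0))
    addr = listener.getsockname()

    def run():
        client_side, _ = listener.accept()                     # connection 1
        server_side = socket.create_connection(backend_addr)   # connection 2
        with client_side, server_side:
            seen["proxy_out_port"] = server_side.getsockname()[1]
            server_side.sendall(client_side.recv(1024))   # forward the request
            client_side.sendall(server_side.recv(1024))   # forward the response
        listener.close()

    threading.Thread(target=run, daemon=True).start()
    return addr

def demo():
    seen = {}
    proxy_addr = start_proxy(start_backend(), seen)
    with socket.create_connection(proxy_addr, timeout=5) as client:
        client_port = client.getsockname()[1]
        client.sendall(b"GET / HTTP/1.0\r\n\r\n")
        port_seen_by_backend = int(client.recv(1024).decode())
    return client_port, seen["proxy_out_port"], port_seen_by_backend
```

Calling demo() returns the client's source port, the proxy's outgoing source port, and the port the backend saw. The last two match, which is why the backend has no TCP session with the client over which it could answer directly.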

Also, I see in SPECweb2005 that the server is Java-based. From my experience on real applications, the ratio of CPU usage between Java apps and haproxy is around 160. That means that with 16 processors at 100% running an application in a JVM, you observe about 10% of one processor used by haproxy. It will be somewhat similar for other event-based load-balancers, they generally consume very few resources on the machine they're installed on.
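The factor of 160 follows directly from the figures above. A small worked check in Python (cpu_ratio is a hypothetical helper for this example; utilisation is expressed in whole percent to keep the arithmetic exact):

```python
def cpu_ratio(app_cores, app_util_pct, proxy_cores, proxy_util_pct):
    """Ratio of total CPU consumed by the application to CPU consumed by the proxy."""
    app_total = app_cores * app_util_pct        # 16 cores at 100% -> 1600
    proxy_total = proxy_cores * proxy_util_pct  # 1 core at 10%    -> 10
    return app_total / proxy_total

ratio = cpu_ratio(app_cores=16, app_util_pct=100, proxy_cores=1, proxy_util_pct=10)
print(ratio)
```

With 16 cores at 100% against 10% of one core, the ratio is 1600 / 10 = 160.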

> I am trying to see how to set up the Primary Client and Backend Simulator communications to the corresponding web server in the SUT set since the original SPECweb configuration seems to use only 1 SUT machine. I am talking to some SPECweb expert, and will update you my findings.

Your architecture should look like this:

[ clients ] -.                  ,-> [ web server ] -.
[ clients ] --+-> [ haproxy ] -+--> [ web server ] --+-> [ BeSim ]
[ clients ] -'                  `-> [ web server ] -'

All clients are configured to send their requests to haproxy's address:port. Haproxy knows about all servers' address:port.

When a request from any client goes to haproxy, it selects a server and forwards the request there. The server in turn delegates some processing to the backend simulator (BeSim).
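The server selection step can be sketched as follows, assuming the round-robin method mentioned in the quoted text. This is a simplified illustration in Python, not haproxy's actual algorithm (it ignores weights and health checks), and the class name and the 192.0.2.x addresses are placeholders:

```python
import itertools

class RoundRobin:
    """Hands out servers in a fixed rotation, one per incoming request."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(list(servers))

    def pick(self):
        return next(self._cycle)

# Placeholder address:port pairs standing in for the web servers.
balancer = RoundRobin(["192.0.2.11:80", "192.0.2.12:80", "192.0.2.13:80"])
picks = [balancer.pick() for _ in range(4)]
print(picks)
```

The fourth request wraps around to the first server again, so over time each web server receives the same number of requests.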

In fact, my real concern here is the BeSim. It's not clear whether SPECweb2005 permits multiple BeSims. If it does not, the BeSim could very quickly become the bottleneck. This already appears to be the case in published reports, which show similar overall performance for 8, 16 and 24 cores on the web server.

> Do you know any other public-accepted web server benchmarks that HAproxy can be easily inserted?

As I said, I'm not familiar with all those "well-known" references. I tend to observe behaviours on real applications and real hardware, and only use very basic benchmarking tools in order to optimize system tuning and to verify that nothing is wrong at the lower levels (e.g. network losses).

What precisely are you trying to benchmark? From your initial description, I understood that you wanted to publish a big SPECweb result using lots of web servers, but now I'm confused: it seems that in fact you'd like to benchmark haproxy. If the latter is the case, I can help you tweak it (as well as the OS, which takes the longest), and provide you with some tools to run the benchmark. It will not be a publicly accepted benchmark, but it will ensure that your system is ready to run almost any benchmark of your choice. Also, if you want to benchmark simple components like haproxy, I'd suggest contacting Spirent. Their platform is a well-established reference in this area, and some unpublished benchmarks of haproxy have already been run with their solution.

Regards,
Willy

Received on 2009/01/05 22:50
