Dealing with a server going down

From: Martin Goldman <me#mgoldman.com>
Date: Sun, 18 Nov 2007 23:33:22 -0500


Hi folks,

I'm new to haproxy, and I'm working on a simple SMTP cluster with 2 servers to learn how it works. I've got a quick question:

When both SMTP servers are up and I telnet to the cluster on port 25, I'm connected immediately and receive the SMTP greeting banner, sometimes from one server and sometimes from the other (demonstrating that load balancing is working). If I stop the SMTP service on one server and retry, I'm still connected immediately, and the banner now always comes from the other server. But if I take one server off the network entirely and connect to the cluster again, there's very often a delay before the greeting banner appears (its duration depends on the value of contimeout). I suppose this means that haproxy is sending some of my connections to the server that's down, then waiting for the connection attempt to time out before trying the other server.
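
For concreteness, the setup described above corresponds roughly to a configuration like the sketch below. The section name, server names, and addresses are placeholders, and the timeout values (in milliseconds) are illustrative only; any retry or redispatch settings are left out.

```
# Hypothetical two-server SMTP cluster; names and addresses are placeholders.
listen smtp_cluster 0.0.0.0:25
    mode tcp                 # SMTP is balanced as plain TCP
    balance roundrobin       # alternate between the two servers
    contimeout 5000          # how long to wait for a server to accept the connection
    clitimeout 60000         # client-side inactivity timeout
    srvtimeout 60000         # server-side inactivity timeout
    server smtp1 192.0.2.11:25
    server smtp2 192.0.2.12:25
```

With no health checking on the server lines, nothing in this configuration tells haproxy that a server has dropped off the network, which would match the delay of up to contimeout I'm seeing.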

Now, I could be wrong, but it seems to me that this arrangement doesn't provide the highest quality of service when one of the nodes in your cluster goes down -- although all requests are completed eventually, some are sent to the bad node and have to time out before they can be serviced by another node. It seems like it should be possible to determine when a node goes down, and to have haproxy stop sending requests to that node until it comes back up.

So I guess my question is: is there a way to do this? Or am I completely misunderstanding something?
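
In case it helps, what I'm after might look something like the sketch below. It assumes that the `check` keyword on a `server` line enables periodic health checks, with `inter` setting the probe interval in milliseconds and `rise` / `fall` setting how many consecutive successes or failures are needed before a server is marked up or down. The section name, server names, and addresses are placeholders, and the values are illustrative rather than recommendations.

```
# Hypothetical sketch: same cluster, with per-server health checks enabled.
listen smtp_cluster 0.0.0.0:25
    mode tcp
    balance roundrobin
    contimeout 5000
    clitimeout 60000
    srvtimeout 60000
    # Probe each server every 2 s; mark it down after 3 failed probes,
    # and up again after 2 successful ones.
    server smtp1 192.0.2.11:25 check inter 2000 rise 2 fall 3
    server smtp2 192.0.2.12:25 check inter 2000 rise 2 fall 3
```

If that works the way I assume, a server that falls off the network would stop receiving new connections once it had failed three probes, so only the connections made in that short window would hit the timeout.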

Thanks!

Regards,
Martin Goldman

Received on 2007/11/19 05:33
