From: Willy Tarreau <w#1wt.eu>
Date: Tue, 1 Jul 2008 22:30:09 +0200

On Mon, Jun 30, 2008 at 01:31:48PM -0400, James wrote:
> Willy Tarreau wrote:
> >On Mon, Jun 23, 2008 at 06:33:50PM -0400, James wrote:
> >
> >>I have a few questions I would like to pose to the HAProxy list.
> >>
> >>First, is there a listing in the documentation as to what exactly
> >>triggers the various Warnings and Errors that come up in the stats
> >>output? When one of these conditions occurs, does it come up in the
> >>haproxy logs? I am seeing a lot of retry warnings on 3 out of 5 servers
> >>defined in one of my backends (the numbers tend to keep increasing on
> >>every reload).
> >>
> >
> >The retries indicate that haproxy failed to establish a connection to a
> >server within the "contimeout". It might be because the server is already
> >overloaded and does not respond, because there is a poorly configured
> >ip_conntrack somewhere, or because you don't have enough source ports on
> >your haproxy machine. There are many reasons for this to happen. You
> >should check with tcpdump that the packets are correctly going out of
> >the machine, and observe if the other end responds.
> >
>
> The servers typically show very low load, even during high load times. I
> even see this issue when the number of clients connecting to the LB is
> low as well, so I don't think it's load related (though the higher the
> load, the faster the rate of resp warnings increases). This weekend, I got
> rid of ip_conntrack on the backend servers by modifying iptables to not
> load it, but I am still seeing the same issues. I do not think it's switch
> related either, as this group of servers used to be on another switch and
> the same issue occurred. Removing ip_conntrack from the LB itself is the
> obvious next step, but I'm not sure how much it will help. Any other
> suggestions?

Well, once you remove ip_conntrack everywhere and you're 100% sure it's neither the network nor the servers, you'll have to sniff the network to find what is wrong.
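
Before reaching for a full packet capture, a quick connect probe run from the haproxy machine can show whether plain TCP connects to a backend are slow or failing. The following is only a rough sketch of that idea, not anything haproxy ships: the function name probe_connect, its parameters, and the address 192.0.2.10:80 are all made up for illustration, and the timeout should be set to whatever your "contimeout" is.

```python
import socket
import time


def probe_connect(host, port, attempts=20, timeout=1.0, pause=0.0):
    """Attempt `attempts` TCP connects to host:port.

    Returns (failures, timings), where timings holds the handshake
    duration in seconds for each successful connect.
    """
    failures = 0
    timings = []
    for _ in range(attempts):
        start = time.monotonic()
        try:
            # create_connection completes the TCP handshake or raises
            # OSError (refused, unreachable, or timed out).
            with socket.create_connection((host, port), timeout=timeout):
                timings.append(time.monotonic() - start)
        except OSError:
            failures += 1
        time.sleep(pause)
    return failures, timings


if __name__ == "__main__":
    # Placeholder backend address (documentation range); replace it
    # with one of the servers that shows retries in the stats page.
    failed, times = probe_connect("192.0.2.10", 80, attempts=5, timeout=1.0)
    slowest = max(times) if times else None
    print(f"failed={failed} ok={len(times)} slowest={slowest}")
```

If the probe shows failures or connect times close to the timeout while the server itself is idle, that points at the path between the two machines, and a tcpdump on both ends is the next step.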

> >>Also, I've noticed that at times of high usage, the number of active
> >>connections between the backend servers is uneven by a large margin. The
> >>servers that run linux seem to end up with higher numbers than machines
> >>with freebsd installed on them. Could this have something to do with
> >>socket issues on the backend servers? Any way to go about debugging this?
> >>
> >
> >Possibly, or also network packet losses between the linux servers. Wouldn't
> >you be running gigabit ethernet ports connected to forced-100 Mbps
> >switches ?
> >In this case, you would most likely end up with a 100-half machine. Check
> >with ethtool.
> >
>
> Ethtool shows that everything is fine (all interfaces are gigabit
> ethernet). I have a feeling that the higher number of connections is due
> to the issue above, as these issues are only occurring with Linux based
> servers.

So running tcpdump on one of those servers would really help.

Regards,
Willy

Re: Retr Warnings and Socket Issues