Willy Tarreau wrote:
> On Mon, Jun 23, 2008 at 06:33:50PM -0400, James wrote:
>
>> I have a few questions I would like to pose to the HAProxy list.
>>
>> First, is there a listing in the documentation as to what exactly
>> triggers the various Warnings and Errors that come up in the stats
>> output? When one of these conditions occurs, does it come up in the
>> haproxy logs? I am seeing a lot of retry warnings on 3 out of 5 servers
>> defined in one of my backends (the numbers tend to keep increasing on
>> every reload).
>>
>
> The retries indicate that haproxy failed to establish a connection to a
> server within the "contimeout". It might be because the server is already
> overloaded and does not respond, because there is a poorly configured
> ip_conntrack somewhere, or because you don't have enough source ports on
> your haproxy machine. There are many reasons for this to happen. You
> should check with tcpdump that the packets are correctly going out of
> the machine, and observe if the other end responds.
>
The servers typically show very low load, even during peak traffic. I also see this issue when the number of clients connecting to the LB is low, so I don't think it's load related (though the higher the load, the faster the resp warnings increase). This weekend I got rid of ip_conntrack on the backend servers by modifying iptables so that it is no longer loaded, but I am still seeing the same issue. I don't think it's switch related either, as this group of servers used to be on another switch and the same issue occurred there. Removing ip_conntrack from the LB itself is the obvious next step, but I'm not sure how much it will help. Any other suggestions?
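To complement tcpdump, the connection attempts can be reproduced from the LB itself. The following is only a rough sketch in Python, not anything haproxy provides: the function name `probe_connect`, the 4-second timeout, and the attempt count are assumptions chosen for illustration. It opens plain TCP connections to one backend and counts how many fail or time out, which is roughly the condition that a retry in the stats page reflects.

```python
import socket
import time


def probe_connect(host, port, timeout=4.0, attempts=20):
    """Try `attempts` TCP connects; return (failures, latencies_in_seconds).

    `timeout` is a hypothetical stand-in for haproxy's "contimeout";
    set it to the value used in the real configuration.
    """
    failures = 0
    latencies = []
    for _ in range(attempts):
        start = time.monotonic()
        try:
            # create_connection raises OSError (socket.timeout included)
            # when the handshake fails or does not finish within `timeout`.
            with socket.create_connection((host, port), timeout=timeout):
                latencies.append(time.monotonic() - start)
        except OSError:
            failures += 1
    return failures, latencies
```

Calling `probe_connect` with a backend's address and port, once for a Linux server and once for a FreeBSD server, gives two sets of numbers to compare. If some successful connects take whole seconds instead of milliseconds, that would suggest the first SYN was dropped and retransmitted, which points at packet loss or connection tracking rather than at server load.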
>> Also, I've noticed that at times of high usage, the number of active
>> connections between the backend servers is uneven by a large margin. The
>> servers that run Linux seem to end up with higher numbers than machines
>> with FreeBSD installed. Could this have something to do with
>> socket issues on the backend servers? Any way to go about debugging this?
>>
>
> Possibly, or network packet loss between the Linux servers. Are you
> perhaps running gigabit ethernet ports connected to switches forced to
> 100 Mbps? In that case, you would most likely end up with a machine
> running at 100 Mbps half duplex. Check with ethtool.
>
Ethtool shows that everything is fine (all interfaces are gigabit ethernet). I have a feeling that the higher number of connections is due to the issue above, as these issues are only occurring with Linux-based servers.
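For checking many Linux machines at once, the negotiated speed and duplex can also be read without ethtool. This is a minimal sketch that assumes a reasonably recent Linux kernel exposing the `speed` and `duplex` attributes under /sys/class/net; the function names and the 1000 Mbps expectation are illustrative assumptions, and ethtool remains the authoritative check.

```python
import os


def read_link_settings(iface, sysfs_root="/sys/class/net"):
    """Return (speed_mbps, duplex) for iface; None marks an unreadable value."""

    def read_attr(name):
        try:
            with open(os.path.join(sysfs_root, iface, name)) as f:
                return f.read().strip()
        except OSError:
            # Reading can fail when the interface is down, missing,
            # or does not report this attribute.
            return None

    speed = read_attr("speed")
    duplex = read_attr("duplex")
    try:
        speed_mbps = int(speed) if speed is not None else None
    except ValueError:
        speed_mbps = None
    return speed_mbps, duplex


def needs_attention(speed_mbps, duplex, expected_mbps=1000):
    """Flag a link that is half duplex, slower than expected, or unknown."""
    return duplex != "full" or speed_mbps is None or speed_mbps < expected_mbps
```

A port that negotiated 100 Mbps half duplex against a forced switch would come back as `(100, "half")` and be flagged by `needs_attention`, while a healthy gigabit port would come back as `(1000, "full")`.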