Willy Tarreau wrote:
>>> What state are they in ? I would suspect FIN_WAIT_1 or 2, meaning
>>> that haproxy would have closed upon timeout, but the socket is maintained
>>> in the system as long as the client has not ACKed. I have already observed
>>> very long-lasting FIN_WAIT_2 sockets on OpenBSD with no obvious solution.
>>> Maybe Hugo Silva would have hints on the subject.
>> Right, most were (>99%) in FIN_WAIT_2 state. On the other side (Debian
>> Linux nodes) sockets were in LAST_ACK state.
I thought Haproxy is the client in this case and the Linux HTTP nodes are the servers? Nevertheless, this is a good explanation of where these dead sockets come from.
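For anyone who wants to reproduce such numbers, a minimal sketch for tallying TCP states from plain `netstat -an` output could look like the following. The function name `count_tcp_states` and the sample lines are made up for illustration; the only assumption is that TCP lines start with `tcp` and end with the state name, which should hold for both the FreeBSD and the Linux output format.

```python
import re
from collections import Counter

# A state is an upper-case token such as FIN_WAIT_2, LAST_ACK or ESTABLISHED.
STATE_RE = re.compile(r"[A-Z][A-Z0-9_]+")


def count_tcp_states(netstat_output):
    """Count TCP sockets per state in the text printed by `netstat -an`."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, UDP and unix-domain sockets.
        if len(fields) < 2 or not fields[0].startswith("tcp"):
            continue
        # The state is the last column of a TCP line.
        if STATE_RE.fullmatch(fields[-1]):
            counts[fields[-1]] += 1
    return counts


# Made-up sample in FreeBSD format (documentation addresses, not real hosts).
SAMPLE = """\
tcp4       0      0 192.0.2.10.51234       192.0.2.20.80          FIN_WAIT_2
tcp4       0      0 192.0.2.10.51235       192.0.2.20.80          FIN_WAIT_2
tcp4       0      0 192.0.2.10.51236       192.0.2.20.80          ESTABLISHED
udp4       0      0 *.514                  *.*
"""

if __name__ == "__main__":
    for state, n in count_tcp_states(SAMPLE).most_common():
        print(state, n)
```

Feeding it the real output of `netstat -an` on the haproxy box and on one of the Debian nodes should show the FIN_WAIT_2 and LAST_ACK shares directly.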
> It is a bit surprising that this happens in this direction. Maybe I
> misunderstood the role of the Debian nodes, are they servers in fact ?
> The problem is that there is no reason for a client which receives
> a FIN to maintain the connection longer. However this normally happens
> on a server until it has sent the whole remaining data.
The Debian systems are servers, yes. Apache is running on them. Another thing to mention: a stunnel instance runs on the Haproxy server to encrypt/decrypt traffic and forward requests to Haproxy. But stunnel should behave like a normal client, so this shouldn't matter.
> My guess is that you have a stateful equipment in between, such as a
> firewall, with very short FIN timeouts, and that it quickly removes
> the session from its table when the debian node remains silent for
> a few seconds. This may be caused by lost packets, in which case the
> firewall would be on the side of the haproxy box (maybe pf on the box
> itself ?). Then, once a session is dropped, the last ACK cannot pass
> anymore and the FIN_WAIT_2 state remains.
The systems are connected through a simple Gigabit switch. There is no firewall between them, neither hardware nor software. This also means no pf or iptables on the systems themselves.
>>>> So I think haproxy does not close all the sockets correctly or gets
>>>> somehow in trouble with the FreeBSD sockets system. As for the moment
>>>> I'll try to raise the maxsockets and see if the problem disappears.
>>> If raising the sockets fixes the problem, it means that the sockets
>>> get purged late, and that you need to cover the time-frame between
>>> the close() and the expiration, times the connection rate. Also, are
>>> you sure that your timeouts are correctly set on haproxy ? That's
>>> particularly important for the client timeout.
>> That is exactly what I've done yesterday. I changed the following sysctl
>> values:
>>
>> net.inet.tcp.keepidle=10000
>> net.inet.tcp.keepintvl=5000
As this is not a test system I'm not able to do this at the moment. Maybe I can test this later in maintenance time.
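The rule of thumb quoted earlier (the time between the close() and the expiration, times the connection rate) can be put into numbers with a small sketch. The connection rate and the linger time below are purely hypothetical values, not measurements from this setup, and `lingering_sockets` is just an illustrative name.

```python
def lingering_sockets(conn_rate_per_s, linger_s):
    """Rough steady-state count of sockets closed by haproxy but not yet purged."""
    return conn_rate_per_s * linger_s


if __name__ == "__main__":
    rate = 200    # hypothetical: new connections per second
    linger = 60   # hypothetical: seconds until the kernel purges a closed socket
    # The socket limit has to cover this on top of the active connections.
    print(lingering_sockets(rate, linger))
```

The point of the estimate is that shortening the linger time and raising the socket limit are two ways of covering the same product.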
>> Now a dead unused socket will get closed more quickly (less than a
>> minute). This works fine, there are only a few connections left in the
>> FIN_WAIT_2 state.
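The "less than a minute" figure is consistent with a rough calculation from the two sysctl values. The sketch below assumes that FreeBSD interprets both values in milliseconds and that the kernel gives up after 8 unanswered probes; the probe count is an assumption and is not stated anywhere in this thread.

```python
KEEPIDLE_MS = 10000    # net.inet.tcp.keepidle from the quoted sysctl settings
KEEPINTVL_MS = 5000    # net.inet.tcp.keepintvl from the quoted sysctl settings
ASSUMED_PROBES = 8     # assumption: unanswered probes before the socket is dropped


def dead_peer_timeout_s(keepidle_ms, keepintvl_ms, probes):
    """Idle time plus all probe intervals, converted to seconds."""
    return (keepidle_ms + probes * keepintvl_ms) / 1000.0


if __name__ == "__main__":
    # 10 s idle + 8 probes * 5 s = 50 s
    print(dead_peer_timeout_s(KEEPIDLE_MS, KEEPINTVL_MS, ASSUMED_PROBES))
```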
Hmm, can I use TCP keepalives with simple HTTP nodes? I thought these settings were meant for TCP connections other than HTTP.
>> Here are my timeout settings in haproxy.conf:
>> clitimeout 150000
>> srvtimeout 30000
>> contimeout 2000
>>
>> Could these settings be the problem? What settings would you recommend?
I'll increase contimeout to 4000.
Thanks for your help. If you need any further details just ask me. I'm wondering if any other FreeBSD user has similar problems with haproxy and Linux nodes.
Matthias

Received on 2008/04/30 08:28
This archive was generated by hypermail 2.2.0 : 2008/04/30 08:32 CEST