Re: haproxy randomly fails on config reload

From: Pablo Escobar <pescobar#cipf.es>
Date: Mon, 9 Jun 2008 19:07:52 +0200


Hi,

I haven't disappeared, I have just been away from the keyboard for some days :)

I have been doing some testing and still can't find the reason for the failed reloads.

I tried disabling the ip_conntrack module with no luck :( Still the same behaviour. Connecting directly to haproxy on port 81 I see the backends up, but they are not available through apache until they reach EXACTLY 1 min of uptime.

I think I need to do some tuning of the system network parameters ( /proc/net ) but I am not sure where to start.

Something strange that I found is that I have a lot of TIME_WAIT connections on my haproxy machine.

[root]$ netstat -anp | grep TIME_WAIT | wc -l
2236
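
A per-state breakdown can be more telling than the TIME_WAIT count alone. The following is only a rough sketch: the function name count_tcp_states is made up for illustration, and it assumes the usual Linux netstat layout where the state is the 6th column of lines starting with "tcp".

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): count TCP sockets per state.
# Reads `netstat -ant`-style output on stdin. Assumes the state is the
# 6th whitespace-separated column of every line whose first column
# starts with "tcp" (header lines are skipped by that match).
count_tcp_states() {
    awk '$1 ~ /^tcp/ { states[$6]++ }
         END { for (s in states) print s, states[s] }' | sort
}

# Usage on the haproxy machine:
#   netstat -ant | count_tcp_states
```

Running it a few times around a reload should show whether some state other than TIME_WAIT piles up during the bad minute.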

I have been reading about this on Google and it seems this shouldn't matter, but I am not really sure. Does anyone have any suggestion about which network settings to modify under /proc/net, or what to look at?

Many thanks in advance for any help.

I will continue trying to find the reason for this and I hope I can write to the list with the solution... I hope :)

Pablo.

On Friday 30 May 2008 14:55:17 Willy Tarreau wrote:
> Hi Pablo,
>
> On Fri, May 30, 2008 at 02:22:02PM +0200, Pablo Escobar wrote:
> > On Friday 30 May 2008 06:43:34 Willy Tarreau wrote:
> > > On Fri, May 30, 2008 at 01:22:56AM +0200, Pablo Escobar wrote:
> > > > Hi Krzysztof and Willy,
> > > >
> > > > I have compiled haproxy 1.3.15.1 for x86_64
> > > >
> > > > With dmesg I found some of these errors:
> > > > ip_conntrack: table full, dropping packet.
> > > >
> > > > So I thought I had found the problem, but I was wrong. I doubled the
> > > > ip_conntrack_max value and the dmesg error disappeared, but I still
> > > > get the same problem: around 5 reloads OK and then a bad reload
> > > > which takes all my websites offline for EXACTLY 1 MIN.
> > > >
> > > > Connecting to port 81 I see every backend up. Connecting on port 80 I
> > > > get a 503 error on every backend. Exactly when the stats page reaches
> > > > "uptime 1:01min" I get every backend up on port 80. I am sure the
> > > > 1min is not random. I have tried it 2 times with the same result.
> > >
> > > 1 minute might be the time needed to expire old sessions from your
> > > conntrack. You can try to reduce ip_conntrack_tcp_timeout_time_wait in
> > > /proc/sys/net/ipv4/netfilter to see if this has any effect.
> >
> > I have a value of 120 in ip_conntrack_tcp_timeout_time_wait. Would it be
> > safe to change it to 60?
>
> Yes. Don't go below 20-30 though, otherwise you'll get some erroneous DROP
> logs due to late retransmits.
>
> > Also, I have found a value of 60 in "ip_conntrack_tcp_timeout_syn_recv"
> >
> > [root#haproxy]$>
> > cat /proc/sys/net/ipv4/netfilter/ip_conntrack_tcp_timeout_syn_recv
> > 60
> >
> > Maybe the problem is related to this?
>
> I don't think so. It's related to incoming connections which are not ACKed.
> Typically needed in case of synflood.
>
> > > Also, it is possible that your conntrack is buggy and does not initiate
> > > a new session on a SYN reusing the same source port (once your source
> > > ports wrap around).
> > >
> > > You should *really* unload conntrack to see if it makes any difference.
> >
> > Sorry for my ignorance, but... if I unload this module, could it make
> > my haproxy machine stop working? This is the main machine in my web
> > cluster (doing reverse proxy for every webserver), so if it stops working
> > everything goes down. Is it safe to remove the conntrack module? Right now
> > I can't do it because these are the hours when I have most of the web
> > traffic, but if anyone can confirm that it is safe to unload the module
> > I will do some testing tonight.
>
> The conntrack module is needed ONLY if you:
> - have netfilter configured on the machine to do stateful firewalling
> - are using NAT or transparent proxying through the TPROXY patch.
>
> Regards,
> Willy
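
For reference, the two timeouts discussed above could be inspected and the TIME_WAIT one lowered along these lines. This is only a sketch: the function names and the NETFILTER_DIR variable are made up for illustration, the directory is the one Willy mentioned, and the 30 second floor is my reading of his "don't go below 20-30" advice, not a value taken from any documentation.

```shell
#!/bin/sh
# Directory named in the thread; overridable so the helpers can be
# tried against a scratch directory first (NETFILTER_DIR is hypothetical).
NETFILTER_DIR=${NETFILTER_DIR:-/proc/sys/net/ipv4/netfilter}

# Print "name value" for the two conntrack timeouts discussed above.
show_timeouts() {
    for name in ip_conntrack_tcp_timeout_time_wait \
                ip_conntrack_tcp_timeout_syn_recv; do
        if [ -r "$NETFILTER_DIR/$name" ]; then
            echo "$name $(cat "$NETFILTER_DIR/$name")"
        fi
    done
}

# Set the TIME_WAIT conntrack timeout (seconds), refusing anything
# below an assumed floor of 30 to avoid DROP logs on late retransmits.
set_time_wait_timeout() {
    requested=$1
    floor=30
    if [ "$requested" -lt "$floor" ]; then
        echo "refusing $requested: below floor of $floor seconds" >&2
        return 1
    fi
    echo "$requested" > "$NETFILTER_DIR/ip_conntrack_tcp_timeout_time_wait"
}

# Usage (as root):
#   show_timeouts
#   set_time_wait_timeout 60
```

A value written this way does not survive a reboot, so it is only suitable for testing whether the 1 minute outage changes with the timeout.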

-- 
Pablo Escobar Lopez
Head of Infrastructure & IT Support
Bioinformatics Department
Centro de Investigación Príncipe Felipe (CIPF)
Tfn: (34) 96 328 96 80 ext: 1004
http://bioinfo.cipf.es
Received on 2008/06/09 19:07