From: Willy Tarreau <w#1wt.eu>
Date: Thu, 3 Dec 2009 06:16:29 +0100

On Wed, Dec 02, 2009 at 07:44:40PM -0500, Lincoln wrote:
> Thanks Willy for offering to help us out with this.
>
> We are running on an Amazon EC2 m1small instance which is very common for a
> load balancer machine.
>
> I changed /proc/sys/net/ipv4/tcp_timestamps to 1 - unfortunately to no
> effect.

OK.

> Here are my iptables settings (nothing special here that I can see - I
> haven't modified anything):
> root#lb1:~$ iptables -L
> Chain INPUT (policy ACCEPT)
> target prot opt source destination
>
> Chain FORWARD (policy ACCEPT)
> target prot opt source destination
>
> Chain OUTPUT (policy ACCEPT)
> target prot opt source destination

OK so most likely it was not even loaded.

> I would like to try accepting INVALIDs as you suggest - just to see if that
> addresses the problem before digging deeper. Unfortunately I'm not very
> familiar with iptables - could you show me what I should run to try that?

You don't need to: you have no iptables rules, so those packets are implicitly accepted. The common case I was talking about is when people explicitly drop packets in the INVALID state.

> If not that, perhaps something else about the EC2 infrastructure is using
> sequence number randomization? Are there other things I can look for?

If you don't have iptables, then your machine should have sent either a SYN/ACK or an ACK. If you really took the trace from the machine itself, then I have no explanation for the problem :-(

You said that every trace showed the same pattern, i.e. the first packet to be accepted was the SYN without timestamps. Are you absolutely sure that is *always* the case and not just random? I'm asking because the system may refrain from sending a SYN/ACK when the TCP SYN backlog is full, which is completely independent of the SYN packet's shape. Your TCP parameter tuning was OK, but for the backlog you also need to set /proc/sys/net/core/somaxconn to a large value, otherwise it acts as a cap. By default it is very low (128). Try setting it to 10000 (you need to restart haproxy for the change to take effect).
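
To make the capping behaviour concrete, here is a minimal Python sketch; it is an illustration and not something taken from the traces. It assumes a Linux machine that exposes /proc/sys/net/core/somaxconn, and the helper names read_somaxconn and effective_backlog are made up for this example. It shows only that the backlog an application asks for is limited by somaxconn.

```python
from pathlib import Path

# Linux-only sysctl file; assumed to exist on the load balancer.
SOMAXCONN_PATH = Path("/proc/sys/net/core/somaxconn")


def read_somaxconn(path=SOMAXCONN_PATH):
    """Return the current somaxconn value, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def effective_backlog(requested, somaxconn):
    """Backlog actually granted: the requested value capped by somaxconn."""
    return min(requested, somaxconn)


if __name__ == "__main__":
    limit = read_somaxconn()
    if limit is None:
        print("somaxconn is not readable on this system")
    else:
        print("somaxconn =", limit)
        print("a requested backlog of 10000 becomes", effective_backlog(10000, limit))
```

With the default of 128, a requested backlog of 10000 would be cut down to 128, which is why the sysctl has to be raised before haproxy is restarted.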

The output of "uname -a", "netstat -i" and "netstat -s" would help too.
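
If the backlog is the suspect, the listen-queue counters are the part of that output worth watching. As a rough sketch, assuming a Linux kernel that publishes the TcpExt counters ListenOverflows and ListenDrops in /proc/net/netstat, the following Python extracts just those two values. The function names parse_tcpext and listen_counters are hypothetical helpers for this example.

```python
def parse_tcpext(text):
    """Parse the TcpExt header/value line pair from /proc/net/netstat text."""
    rows = [line.split() for line in text.splitlines() if line.startswith("TcpExt:")]
    if len(rows) < 2:
        return {}
    # First TcpExt line holds counter names, the second holds their values.
    names, values = rows[0][1:], rows[1][1:]
    return {name: int(value) for name, value in zip(names, values)}


def listen_counters(text):
    """Return the two listen-queue counters, or None for any that is missing."""
    stats = parse_tcpext(text)
    return {key: stats.get(key) for key in ("ListenOverflows", "ListenDrops")}


if __name__ == "__main__":
    try:
        with open("/proc/net/netstat") as fh:
            print(listen_counters(fh.read()))
    except OSError:
        print("/proc/net/netstat is not available on this system")
```

Counters that keep growing while connections fail would point at a full backlog rather than at the shape of the SYN packet.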

Regards,
Willy

Received on 2009/12/03 06:16

Re: weird tcp syn/ack problem