Re: 503 errors when services available and other connections succeed to same service during failure window

From: Jerry Champlin <jgc#absolute-performance.com>
Date: Mon, 2 Aug 2010 01:11:48 -0600


Willy:

iptables is enabled on the load balancer as it also serves as part of a firewall pair for the environment. I just noticed that conntrack is also enabled on the backend servers. Could these tables be filling up and causing the resets? There are none of the usual errors from conntrack in dmesg so probably not. Apache is actively processing other requests during this period and we have not seen it crash. We have seen some odd things with Rails and Merb apps which are attached to apache through passenger but those were all related to mid-level xml-library problems. What is the difference between the [R] and [R.] lines in this tcpdump output?

Thanks for your help!

-Jerry

The tcpdump output corresponding to the following 503s is below:

Aug 2 06:45:22 localhost haproxy[24462]: 172.17.48.32:58322[02/Aug/2010:06:45:17.308] ssbe-core internal_core/pn2 0/5030/-1/-1/5030 503
212 - - SC-- 57/5 3/51/26/+5 0/0 "GET /clients HTTP/1.1" Aug 2 06:45:45 localhost haproxy[24462]: 172.17.48.32:60347[02/Aug/2010:06:45:40.801] ssbe-core internal_core/pn1 0/5036/-1/-1/5037 503
212 - - SC-- 23/1 9/18/10/+5 0/0 "GET /accounts/24 HTTP/1.1"

---
06:45:17.308640 IP 172.17.48.31.58322 > 172.17.48.11.38527: Flags [R.], seq
0, ack 690067132, win 0, length 0
06:45:17.471816 IP 172.17.48.11.113 > 172.17.48.32.57746: Flags [R.], seq 0,
ack 238711199, win 0, length 0
06:45:18.083756 IP 172.17.48.11.113 > 172.17.48.31.45365: Flags [R.], seq 0,
ack 4008574618, win 0, length 0
06:45:18.315208 IP 172.17.48.31.58322 > 172.17.48.11.38541: Flags [R.], seq
0, ack 706773563, win 0, length 0
06:45:18.707227 IP 172.17.48.11.113 > 172.17.48.32.60827: Flags [R.], seq 0,
ack 245965809, win 0, length 0
06:45:19.318039 IP 172.17.48.31.58322 > 172.17.48.11.38546: Flags [R.], seq
0, ack 722309128, win 0, length 0
06:45:20.328024 IP 172.17.48.31.58322 > 172.17.48.11.38559: Flags [R.], seq
0, ack 742389004, win 0, length 0
06:45:21.338016 IP 172.17.48.31.58322 > 172.17.48.11.38570: Flags [R.], seq
0, ack 755972054, win 0, length 0
06:45:21.464936 IP 172.17.48.11.113 > 172.17.48.31.45161: Flags [R.], seq 0,
ack 4062096655, win 0, length 0
06:45:21.612122 IP 172.17.48.11.113 > 172.17.48.32.48943: Flags [R.], seq 0,
ack 288307143, win 0, length 0
06:45:22.040168 IP 172.17.48.11.113 > 172.17.48.31.49065: Flags [R.], seq 0,
ack 4065985435, win 0, length 0
06:45:22.338645 IP 172.17.48.32.58322 > 172.17.48.11.40572: Flags [R.], seq
0, ack 775368237, win 0, length 0
06:45:22.842400 IP 168.95.5.59.25 > 172.17.48.31.52179: Flags [R], seq
1204256066, win 49232, length 0
06:45:22.842659 IP 168.95.5.59.25 > 172.17.48.31.52179: Flags [R], seq
1204256066, win 0, length 0
06:45:24.373467 IP 172.17.48.11.113 > 172.17.48.31.60639: Flags [R.], seq 0,
ack 4106982704, win 0, length 0
06:45:25.477069 IP 168.95.5.59.25 > 172.17.48.31.56695: Flags [R], seq
4127387940, win 49232, length 0
06:45:25.477973 IP 168.95.5.59.25 > 172.17.48.31.56695: Flags [R], seq
4127387940, win 0, length 0
06:45:25.931536 IP 172.17.48.11.113 > 172.17.48.32.58445: Flags [R.], seq 0,
ack 366615288, win 0, length 0
06:45:27.702765 IP 172.17.48.11.113 > 172.17.48.31.42189: Flags [R.], seq 0,
ack 4166572480, win 0, length 0
06:45:33.897753 IP 168.95.5.59.25 > 172.17.48.31.57316: Flags [R], seq
1245004230, win 49232, length 0
06:45:33.897844 IP 168.95.5.59.25 > 172.17.48.31.57316: Flags [R], seq
1245004230, win 0, length 0
06:45:34.127586 IP 172.17.48.11.113 > 172.17.48.32.34442: Flags [R.], seq 0,
ack 489438008, win 0, length 0
06:45:34.732253 IP 172.17.48.11.113 > 172.17.48.31.49586: Flags [R.], seq 0,
ack 4273226635, win 0, length 0
06:45:35.192744 IP 172.17.48.11.113 > 172.17.48.32.58752: Flags [R.], seq 0,
ack 515053015, win 0, length 0
06:45:37.405094 IP 172.17.48.11.113 > 172.17.48.32.52436: Flags [R.], seq 0,
ack 540846567, win 0, length 0
06:45:37.430697 IP 172.17.48.11.113 > 172.17.48.31.45584: Flags [R.], seq 0,
ack 13446395, win 0, length 0
06:45:37.481270 IP 172.17.48.11.113 > 172.17.48.32.49972: Flags [R.], seq 0,
ack 539900156, win 0, length 0
06:45:37.842159 IP 172.17.48.11.113 > 172.17.48.31.38986: Flags [R.], seq 0,
ack 17042584, win 0, length 0
06:45:40.801923 IP 172.17.48.32.60347 > 172.17.48.11.39593: Flags [R.], seq
0, ack 1065559817, win 0, length 0
06:45:41.808666 IP 172.17.48.32.60347 > 172.17.48.11.39604: Flags [R.], seq
0, ack 1081974137, win 0, length 0
06:45:42.386284 IP 172.17.48.11.113 > 172.17.48.32.55885: Flags [R.], seq 0,
ack 621260860, win 0, length 0
06:45:42.818050 IP 172.17.48.32.60347 > 172.17.48.11.39613: Flags [R.], seq
0, ack 1094294243, win 0, length 0
06:45:43.828735 IP 172.17.48.32.60347 > 172.17.48.11.39626: Flags [R.], seq
0, ack 1105823413, win 0, length 0
06:45:44.441916 IP 172.17.48.11.113 > 172.17.48.31.37687: Flags [R.], seq 0,
ack 133083105, win 0, length 0
06:45:44.838640 IP 172.17.48.32.60347 > 172.17.48.11.39633: Flags [R.], seq
0, ack 1120797411, win 0, length 0
06:45:45.764524 IP 172.17.48.11.113 > 172.17.48.31.60251: Flags [R.], seq 0,
ack 151529189, win 0, length 0
06:45:45.838662 IP 172.17.48.31.60347 > 172.17.48.11.44688: Flags [R.], seq
0, ack 1146461444, win 0, length 0
06:45:46.863950 IP 172.17.48.11.113 > 172.17.48.32.36912: Flags [R.], seq 0,
ack 694695743, win 0, length 0
06:45:52.349545 IP 172.17.48.11.113 > 172.17.48.31.46588: Flags [R.], seq 0,
ack 253985688, win 0, length 0
06:45:56.680539 IP 172.17.48.11.113 > 172.17.48.32.47961: Flags [R.], seq 0,
ack 844215653, win 0, length 0
06:45:57.155242 IP 172.17.48.11.113 > 172.17.48.32.48796: Flags [R.], seq 0,
ack 853722426, win 0, length 0
06:45:57.159651 IP 172.17.48.11.113 > 172.17.48.31.58654: Flags [R.], seq 0,
ack 314204910, win 0, length 0
06:45:57.492846 IP 172.17.48.11.113 > 172.17.48.32.37829: Flags [R.], seq 0,
ack 861517579, win 0, length 0
06:45:57.755249 IP 172.17.48.11.113 > 172.17.48.31.40816: Flags [R.], seq 0,
ack 328578328, win 0, length 0
06:45:58.275751 IP 172.17.48.11.113 > 172.17.48.32.41681: Flags [R.], seq 0,
ack 861124033, win 0, length 0
06:46:00.099884 IP 172.17.48.11.113 > 172.17.48.31.35457: Flags [R.], seq 0,
ack 374943459, win 0, length 0
06:46:00.506873 IP 168.95.195.16.25 > 172.17.48.31.43319: Flags [R], seq
179881015, win 24616, length 0
06:46:00.664754 IP 168.95.195.16.25 > 172.17.48.31.43319: Flags [R], seq
179881015, win 0, length 0
06:46:01.011770 IP 172.17.48.11.113 > 172.17.48.32.36374: Flags [R.], seq 0,
ack 907888211, win 0, length 0
06:46:01.555063 IP 172.17.48.11.113 > 172.17.48.31.41323: Flags [R.], seq 0,
ack 383037637, win 0, length 0
06:46:03.042201 IP 172.17.48.11.113 > 172.17.48.32.41771: Flags [R.], seq 0,
ack 949710694, win 0, length 0
06:46:04.503503 IP 172.17.48.11.113 > 172.17.48.31.34415: Flags [R.], seq 0,
ack 439416783, win 0, length 0
06:46:04.656624 IP 172.17.48.11.113 > 172.17.48.31.59158: Flags [R.], seq 0,
ack 432188120, win 0, length 0
06:46:09.198697 IP 168.95.5.62.25 > 172.17.48.31.33785: Flags [R], seq
494672182, win 49232, length 0
06:46:09.199021 IP 168.95.5.62.25 > 172.17.48.31.33785: Flags [R], seq
494672182, win 0, length 0
06:46:09.467256 IP 76.12.34.30.80 > 172.17.48.32.52708: Flags [R], seq
3555484570, win 0, length 0
06:46:09.467301 IP 76.12.34.30.80 > 172.17.48.32.52655: Flags [R], seq
3563106719, win 0, length 0
06:46:09.467323 IP 76.12.34.30.80 > 172.17.48.32.52611: Flags [R], seq
3559151742, win 0, length 0
06:46:09.467338 IP 76.12.34.30.80 > 172.17.48.32.52486: Flags [R], seq
3461909063, win 0, length 0
06:46:11.708929 IP 172.17.48.11.113 > 172.17.48.32.41215: Flags [R.], seq 0,
ack 1074243864, win 0, length 0

Jerry Champlin
Absolute Performance Inc.
--
Enabling businesses to deliver critical applications at lower cost and
higher value to their customers.


On Mon, Aug 2, 2010 at 12:15 AM, Willy Tarreau <w#1wt.eu> wrote:


> Hello Jerry,
>
> you trace is very interesting.
>
> > 03:36:45 socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 35
> > 03:36:45 setsockopt(35, SOL_TCP, TCP_NODELAY, [1], 4) = 0
> > 03:36:45 connect(35, {sa_family=AF_INET, sin_port=htons(44757),
> > sin_addr=inet_addr("172.17.48.32")}, 16) = -1 EINPROGRESS (Operation now
> in
> > progress)
> > 03:36:45 sendto(35, "GET /hosts/45/metrics HTTP/1.1\r\n"..., 454,
> MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = -1 EAGAIN (Resource temporarily
> unavailable)
> > 03:36:45 sendto(35, "GET /hosts/45/metrics HTTP/1.1\r\n"..., 454,
> MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = -1 ECONNREFUSED(Connection refused)
>
> What this means is that something between haproxy and the server
> actively rejected the connection. If your local conntrack table
> was full, it would not do that, it would just time out.
>
> Still is it possible that the remote server sometimes rejects some
> connections ? One of the possibilities would be that your local
> source port range is too short and that it rejects early reuse of
> a same source port.
>
> In fact what could be done would be to track only SYN, FIN, and RST
> packets between the two machines :
>
> tcpdump -npi eth0 'tcp[13]&7!=0' -w tcp-flags.cap
>
> Then look for any RST there and check if it matches a SYN once the
> source port is found :
>
> tcpdump -Svvnr tcp-flags.cap 'tcp[13]&4!=0'
>
> Are you sure that the server never unbinds nor crashes ? And do you
> have any intermediary firewall between haproxy and the remote server ?
>
> Regards,
> Willy
>
>
Received on 2010/08/02 09:11

This archive was generated by hypermail 2.2.0 : 2010/08/02 09:15 CEST