Re: IIS times out on large post requests

From: Willy Tarreau <w#1wt.eu>
Date: Thu, 10 Jul 2008 00:11:12 +0200


Hi Rainer,

On Wed, Jul 09, 2008 at 04:10:02PM +0200, Rainer Sabelka wrote:
> Hi,
>
> I use haproxy 1.3.14.6 in a production environment together with apache
> configured as ssl proxy and 10 Windows IIS backends.
> All incoming traffic is ssl encrypted and is received via the apache server.
> Apache handles ssl en-/decryption, adds some request headers (X-Forwarded-For,
> X-Forwarded-Proto) and then uses mod_proxy to forward the requests to haproxy.
>
> Haproxy then balances the those requests among the applications servers, some
> of then run IIS-5 on Windows 2000, the rest IIS-6 on Windows 2003R2.
>
> Everything seems to run well, but from time to time users (using IE 6.0)
> complain that they cannot upload documents (large POST requests), however I
> was never able to reproduce the problem.
>
> In haproxy's logfile I can see 502 errors like the following:
>
> Jul 9 14:42:55 127.0.0.1 haproxy[9809]: 127.0.0.1:40984
> [09/Jul/2008:14:39:02.950] myapp_http myapp_default/web10
> 1921/0/0/-1/232731 502 204 - - SHVN 6/1/1/0 0/0 {212.XXX.YYY.ZZZ|1916297}
> "POST /AttachDocument.Asp?SID=150280&action=120&state=0&binaryattach=1
> HTTP/1.0"
>
> Between the {|} I log the original IP-address (X-Forwarded-For header) and the
> content-length of the request.
>
> At the same time (well, 2 hrs difference because GMT is used here instead of
> local time) I see the following line in IIS's logfile:
>
> 2008-07-09 12:42:54 10.4.0.37 41958 10.4.0.33 80 HTTP/1.0 POST
> /AttachDocument.Asp?SID=150280&action=120&state=0&binaryattach=1 - 1
> Timer_EntityBody DefaultAppPool
>
> Obviously IIS is waiting for some more data in the body of the POST request
> and times out after some time, but haproxy is already waiting for a response
> from the server, but instead gets a closed connection after the IIS timeout.
>
> I've observed this only with post requests which are larger then 20kByte.
> And it happened only when haproxy connected to the W2003/IIS-6 server, but
> never with W2000/IIS-5.
>
> Does anyone have an idea what could be wrong with my setup and/or what I can
> do to debug this problem?

I've just found the cause of the problem you're experiencing. I could even partially reproduce it. It is caused by a bug introduced in 1.3.8 which is in fact quite hard to trigger : if the client is streaming data into haproxy faster than the server reads, and manages to always maintain data in the request buffer, the server timeout is not updated. Then when the client stops sending data after it has been streaming for at least as long as the server timeout, the timeout is seen as expired. In my case, I reproduce a 504 by unplugging a cable between the client and haproxy in the middle of some streaming. But in your case, I suspect that only the last chunk of data does not reach IIS, which then closes the connection. Haproxy then notices this closing and returns a 502.

I believe you're not the first one to report this problem, but maybe this time your logs looked suspect enough to direct my nose to the right location.

I'm currently working on fixing this awful bug. I will release a fix for both 1.3.14 and 1.3.15 after some careful testing. In the mean time, you might increase your server timeout to a large value so that you've got a chance that a client does not manage to maintain the request buffer full for *that* long.

Thanks very much for your report, and I'm sorry for such a mistake. The fact that it remained unnoticed for 15 months clearly indicates it's hard to trigger, but apparently not in your case.

Best regards,
Willy Received on 2008/07/10 00:11

This archive was generated by hypermail 2.2.0 : 2008/07/10 00:15 CEST