Re: Q about http-parser

From: Willy Tarreau <w#1wt.eu>
Date: Wed, 14 Nov 2007 08:59:18 +0100


Hi Aleks,

On Tue, Nov 13, 2007 at 08:11:24PM +0100, Aleksandar Lazic wrote:
> Hi,
>
> today I took a look at your http-parser and could not find
> where you handle:
>
> ftp://ftp.rfc-editor.org/in-notes/rfc2616.txt
>
> 3.6.1 Chunked Transfer Coding
>
> in src/proto_http.c

The http-parser only parses the first line and the headers. For chunked encoding, you also have to read one line of data, which contains the chunk size (in hexadecimal), then skip over exactly that number of bytes, then read the next line of data, and so on, until you read a size of 0, which marks the last chunk, or until the peer closes (in which case I believe you can report an error).
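To make the loop above concrete, here is a rough sketch in C. It is not haproxy code: the function name dechunk, its signature, and the choice to work on a body that is already complete in memory are assumptions made for illustration. It ignores trailers and treats truncated input as an error.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of haproxy: decode a chunked body that is
 * fully held in memory. Returns the decoded length, or -1 if the input is
 * malformed, truncated, or does not fit in 'out'. Trailers are ignored. */
long dechunk(const char *in, size_t in_len, char *out, size_t out_cap)
{
    size_t pos = 0, out_len = 0;

    for (;;) {
        size_t size = 0;
        int digits = 0;

        /* 1. Read the chunk size: hex digits at the start of the line. */
        while (pos < in_len) {
            char c = in[pos];
            int v;

            if (c >= '0' && c <= '9')      v = c - '0';
            else if (c >= 'a' && c <= 'f') v = c - 'a' + 10;
            else if (c >= 'A' && c <= 'F') v = c - 'A' + 10;
            else break;

            if (size > (SIZE_MAX >> 4))
                return -1;              /* size would overflow */
            size = (size << 4) | (size_t)v;
            digits++;
            pos++;
        }
        if (!digits)
            return -1;                  /* no size on this line */

        /* Skip any chunk-extension and the CR, up to the LF. */
        while (pos < in_len && in[pos] != '\n')
            pos++;
        if (pos >= in_len)
            return -1;                  /* truncated: peer closed early */
        pos++;                          /* consume the LF */

        /* 2. A size of 0 marks the last chunk. */
        if (size == 0)
            return (long)out_len;

        /* 3. Copy exactly 'size' bytes of chunk data. */
        if (size > in_len - pos || size > out_cap - out_len)
            return -1;
        memcpy(out + out_len, in + pos, size);
        out_len += size;
        pos += size;

        /* 4. The data is followed by CRLF (a bare LF is tolerated here). */
        if (pos < in_len && in[pos] == '\r')
            pos++;
        if (pos >= in_len || in[pos] != '\n')
            return -1;
        pos++;
    }
}
```

A real proxy would do this incrementally on a stream rather than on a complete buffer, so take this only as an illustration of the size-line / data / size-line sequence.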

> --- from rfc
> 19.4.6 Introduction of Transfer-Encoding
>
> HTTP/1.1 introduces the Transfer-Encoding header field (section
> 14.41). Proxies/gateways MUST remove any transfer-coding prior to
> forwarding a message via a MIME-compliant protocol.
>
> A process for decoding the "chunked" transfer-coding (section 3.6)
> can be represented in pseudo-code as:
>
>     length := 0
>     read chunk-size, chunk-extension (if any) and CRLF
>     while (chunk-size > 0) {
>        read chunk-data and CRLF
>        append chunk-data to entity-body
>        length := length + chunk-size
>        read chunk-size and CRLF
>     }
>     read entity-header
>     while (entity-header not empty) {
>        append entity-header to existing header fields
>        read entity-header
>     }
>     Content-Length := length
>     Remove "chunked" from Transfer-Encoding
> ---
>
> I found a nice compressed info about MUST/SHOULD/... here
> http://www.and.org/texts/server-http.

Oh, that's a very interesting analysis. I did not know that LWS was allowed in the middle of numbers (and I still doubt it, it looks strange). Otherwise his analysis is fine and he seems to have covered many of the crappy aspects of HTTP. However, I don't agree with his last statement:

  "All major clients send the correct CRLF encoding, and while it's possible some minor clients may be sending just LF I have no sympathy for accomodating them."

TuX did this in its early versions, and it quickly changed once people got tired of not being able to telnet into it and send test requests by hand! And-http may be a young project, and will certainly change this counter-RFC limitation when people start to complain ;-)
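For what it's worth, accepting both forms costs very little. RFC 2616 section 19.3 ("Tolerant Applications") recommends recognizing a single LF as a line terminator and ignoring the leading CR. Below is a minimal sketch of such a line reader in C; the name read_line and its interface are made up for illustration and are not taken from haproxy or And-http.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: find the first line in buf[0..len).
 * Returns the number of bytes consumed, terminator included, or 0 if no
 * complete line is available yet. On success, *content_len receives the
 * line length without the terminator. Both CRLF and bare LF are accepted. */
size_t read_line(const char *buf, size_t len, size_t *content_len)
{
    const char *lf = memchr(buf, '\n', len);
    size_t n;

    if (!lf)
        return 0;                       /* need more data */

    n = (size_t)(lf - buf);             /* bytes before the LF */

    /* Drop the CR if the sender used CRLF; keep n as-is for bare LF. */
    *content_len = (n > 0 && buf[n - 1] == '\r') ? n - 1 : n;
    return n + 1;
}
```

With this, a request typed by hand in telnet and one sent by a strict client yield the same line content; only the number of consumed bytes differs.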

> As we talked many times I think there are still some issues before we
> can start to look into some 1.1 features but, thanks to your change to
> modules and filters we are not so far any more ;-))

No, I really think we're close to being able to support basic keep-alive, at least for adjacent content-less requests such as HEAD and GET. Once the HTTP processing moves into the process_http function, it should get even easier because the underlying states will only serve data transfer.

Cheers,
Willy
