Re: haproxy and orbited (COMET server)

From: Christoph Dorn <christoph#christophdorn.com>
Date: Fri, 01 Feb 2008 00:04:36 -0700


This exchange has been most beneficial. Thank you. I just want to touch on two things you mentioned to wrap things up for now.

  1. "LB or FW silently timing out"
     - Under which circumstances do you think this would occur, and after how long? Could it happen regularly within the 1 to 2 minute mark? (I know there are lots of variables and different implementations, but I am asking from a protocol/conceptual and standard-implementation point of view.)
     - What is the underlying difference between a "silent" timeout and a "clean" one where haproxy and the client would correctly close the connection?
     - Might the intention of a silent timeout from a FW or proxy be to trap your client because it matches the criteria of typical malicious traffic?
     - I guess what I am trying to get at is: what timeout would you use that falls within the parameters of normal operation for most standard setups? The objective is to stay under the radar of any firewall/proxy/security appliance. Would 50 seconds be reasonable or too long?
  2. Port 443. That's interesting. I never thought of that.
     - What do you think is the likelihood of intermediary agents detecting non-encrypted or non-standard traffic patterns and subsequently blocking your access?
     - So you believe that if the client is unsuccessful in connecting to port 443, it will return immediately rather than after a lengthy timeout?
     - So in theory COMET should work as intended over port 443, without the limitations that HTTP has in terms of long-standing connections? So it could be used for streaming?

Please excuse my network terminology. My background is primarily in non-network related development.

Christoph

Willy Tarreau wrote:
> On Thu, Jan 31, 2008 at 03:36:58PM -0700, Christoph Dorn wrote:

>>> Well, what an awful concept! Had it appeared at the beginning of the web,
>>> maybe it could have been the subject of an HTTP evolution to 1.2 for instance.
>>> But now... With all those proxies, anti-viruses, firewalls, load balancers...
>>> does it have any chance to ever work for at least a small bunch of users?
>> => I guess some people hope so. As a web developer I see it as a
>> technology to aid in making your web-application more real-time instead
>> of polling the server in regular intervals. Most implementations have
>> fallbacks to your standard polling model if the client connection will
>> not support the streaming of data.
>>
>> => I believe COMET should serve as an optional communication model with
>> your client, which when supported provides a great user-experience,
>> however the application must not rely on it being available.

>
> I agree with you on this. It's just like heavy use of javascript on the
> browser. It should be for comfort only, not to access the site.
>
> What is difficult however is to detect that it will not properly work. One
> of the examples I gave (LB or FW silently timing out) is hard to detect and
> costs a lot. The other one (anti-virus proxies) is complicated too because
> the client will stay connected waiting for data which will never come until
> the end of the transfer. Also, such a proxy will generally be installed in
> a big company and will not resist a heavy load. After 2-3 service outages,
> there are chances that your site will get blacklisted :-)
>
>> => If streaming does not work, there is still some value to have
>> connections held on the server for a minute or two before they time out
>> as it allows the application to trigger an update event which the client
>> can receive even if the connection is ended after each event. Basically
>> a polling model with a long server timeout.
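
A minimal sketch of the "polling with a long server timeout" idea quoted above, in Python. It assumes events reach the handler through an in-process queue. The name hold_for_event and the queue-based delivery are hypothetical, not part of orbited or haproxy.

```python
import queue


def hold_for_event(events, hold_seconds):
    """Hold a request until an event arrives or the hold time expires.

    Returns the event, or None on timeout so that the server can end the
    response cleanly and the client can reconnect.
    """
    try:
        # Blocks for at most hold_seconds waiting for the next event.
        return events.get(timeout=hold_seconds)
    except queue.Empty:
        return None
```

A None result would map to an empty response, after which the client simply issues the next request.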

>
> If you know the average delay between two data pushes, it may be wise to
> automatically cut the connection slightly after, let's say, the time by
> which 90% of the events would be processed. That way your server will be
> friendly to intermediate components, still saving responsiveness for 90%
> of the cases.
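
The 90% rule above could be estimated from measured delays between pushes. A rough Python sketch, assuming the delays are already collected in seconds and using a simple nearest-rank percentile. The function name and the sample numbers are hypothetical.

```python
def hold_time_for(delays, percent=90):
    """Return the observed delay by which `percent` of the events had arrived."""
    if not delays:
        raise ValueError("need at least one observed delay")
    ordered = sorted(delays)
    # Nearest-rank percentile, computed with integer arithmetic.
    rank = (percent * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


# Hypothetical delays (seconds) between two data pushes.
observed = [5, 8, 12, 20, 25, 30, 35, 40, 45, 300]
print(hold_time_for(observed))  # 45
```

With these sample values the single 300 second outlier is ignored, and the connection would be cut slightly after 45 seconds.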
>
>>> To be honest, I would really wish such a concept to succeed, because it would
>>> help getting rid of all the crappy heavy technologies people are using to build
>>> slow web sites. You know, the ones which work on their PC when they are alone,
>>> which enter production without any modification, and which saturate a server
>>> when the second user connects...
>> => Lol. Sounds like you have had some frustrations with this in the past
>> :) I totally agree. It's amazing what passes as "professionally
>> developed" software, especially in the PHP language.

>
> Can you imagine a trivial professional app consuming 18 GB of RAM for 20
> concurrent users? I was contacted to help them scale by setting up
> load-balancing. I was amazed, really!
>
>>> The problem you'll have with haproxy is not to scale to tens of thousands
>>> of concurrent connections, but to perform content switching on keep-alive
>>> connections. Haproxy does not support keep-alive (yet, I'm working on trying
>>> to get basic things working). So anything after the first request is considered
>>> data and will not be analyzed. That means that even if you push the timeouts
>>> very far, a client connecting to a server would always remain on that server
>>> until the connection closes. This will change when keep-alive gets supported,
>>> but it will not be before a few months from now it seems.
>> => What challenges do you foresee with employing a polling model where
>> the server holds the connection until an event occurs or a reasonable
>> timeout (1 or 2 minutes). The client would re-connect after each event
>> thus requiring no keep-alive.

>
> That method should be the right one. Experimentation will give you the
> appropriate timeout. Hint: you should stay slightly below round minutes
> because most intermediate equipment will be configured with whole-minute
> timeouts. So if you use 50s or 1mn50, you may get the most reliable compromise.
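
The hint about staying slightly below round minutes can be written as a small helper. This is only a sketch: the 10 second margin is an assumption taken from the 50s / 1mn50 examples, and snap_below_minute is a hypothetical name.

```python
def snap_below_minute(desired_seconds, margin=10):
    """Largest value of the form k*60 - margin not exceeding desired_seconds.

    Values that already sit below the first boundary (60 - margin) are
    returned unchanged.
    """
    minutes = (desired_seconds + margin) // 60
    if minutes < 1:
        return desired_seconds
    return minutes * 60 - margin


print(snap_below_minute(60))   # 50
print(snap_below_minute(120))  # 110
```

So a requested 1 or 2 minute hold becomes 50 or 110 seconds, which should expire before an intermediate device configured with a 1 or 2 minute idle timeout.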
>
>> Will haproxy drop the connection to the COMET server if it loses the
>> client connection?

>
> Yes, of course! At the very beginning, haproxy was just a TCP proxy, so it
> respects the usual square model for the connection closures:
>
>   client closes -------------------> close(server)
>         ^                                  |
>         |                                  |
>         |                                  v
>   close(client) <------------------- server then closes
>
>
> Timeouts also apply on this chain. They break horizontal arrows and the
> information propagates all around the square. I could not even imagine
> a different model. But of course, you're right to ask, since you too might
> have encountered a lot of crappy things :-)
>
>> This would be important
>> to ensure that the COMET server always has an accurate list of connected
>> clients and can notify the application if a triggering event cannot be
>> delivered because a client is not connected.

>
> If the timeout is short enough, this will work. It will also propagate server
> closure to the client, so that you can manage timeouts on the server side.
>
>>> Maybe nginx would be able to do that (I don't know all of its features). But
>>> it's known to scale at least. On the contrary, pound uses threads and will
>>> exhibit the well-described problems of this model after a few thousand
>>> connections.
>>> Depending on the site you're doing that for, maybe you'd want to turn to
>>> commercial solutions such as Zeus ZXTM which should scale and support
>>> keep-alive?
>> => At this point I am looking into the practicality of COMET and the
>> possible benefits and challenges. I think I would opt to run the COMET
>> server on a different port for now if I am unable to proxy it. The
>> client could always try and connect on the dedicated port and if it
>> fails it can fall-back to the long-timeout polling or worst case
>> standard polling.

>
> The problem in trying another port is that the client may hang trying
> to connect for 1 or 2 minutes because a firewall does not let their proxy
> pass through. What you might use however is the HTTPS port for your permanent
> connection. It will be tunnelled through proxies and escape anti-viruses. It
> should then remain pretty fast. Also, if the client cannot connect to it, it
> will be for policy reasons and it will immediately be notified about the
> problem.
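
To avoid the 1 or 2 minute hang described above, the probe of the dedicated port can carry its own short connect timeout before falling back. A rough Python sketch, for illustration only, since a real COMET client runs in the browser. It assumes a plain TCP connect is a good enough test. The names can_connect and choose_transport and the 3 second default are hypothetical.

```python
import socket


def can_connect(host, port, timeout_seconds=3.0):
    """Try a TCP connection, giving up after timeout_seconds instead of hanging."""
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


def choose_transport(host, port, timeout_seconds=3.0):
    """Use the permanent connection if the port is reachable, else fall back to polling."""
    return "comet" if can_connect(host, port, timeout_seconds) else "polling"
```

A refused connection returns right away, while a silently dropped one costs at most the chosen timeout before the fallback to polling.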
>
>> => I don't necessarily require COMET for streaming data, but rather for
>> near-realtime event notification from server to client.

>
> OK.
>
>>> I hope this helps at least a little bit, but I'm sorry I'm not very positive
>>> about the future of such a technology :-/
>> => I appreciate the constructive feedback. If I was wanting to be
>> convinced I would talk to a sales rep of a commercial solution :)

>
> :-)
>
> regards,
> willy
>
>
Received on 2008/02/01 08:04
