I have a theory about an issue I've been experiencing for the last 2 days, and I would like some clarification on it.
I have a load balancer with 30 backend servers, a roundrobin balance line, and a maxconn of 75 per server. My understanding is that round robin takes the servers in turn, one by one. Once the balancer starts to queue, i.e. once all servers have reached their maximum of 75 requests, will a queued request wait for the next available server in the stack, or will it wait for the specific server that was next in turn? If it is the latter, one slow request that takes 6 seconds would mean that every single request queued behind it takes at least 6 seconds, even if the server response time for those requests, once served, would be, say, 200 ms. That would cause a snowball effect of slow requests.
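To make the two possibilities concrete, here is a toy sketch that compares them. It is not HAProxy's actual scheduler: it assumes every request is already queued at time 0, that each server handles one request at a time (instead of 75), and the function names and the 2-server example are made up for illustration.

```python
import heapq


def finish_times_pinned(service_ms, n_servers):
    """Each queued request waits for the server that was next in round-robin turn."""
    free_at = [0] * n_servers          # time (ms) at which each server becomes free
    finished = []
    for i, cost in enumerate(service_ms):
        srv = i % n_servers            # strict round-robin assignment
        free_at[srv] += cost           # request runs as soon as that server is free
        finished.append(free_at[srv])
    return finished


def finish_times_shared(service_ms, n_servers):
    """Each queued request is taken by whichever server frees up first."""
    free_at = [0] * n_servers
    heapq.heapify(free_at)             # min-heap keyed on "free at" time
    finished = []
    for cost in service_ms:
        start = heapq.heappop(free_at) # earliest available server
        end = start + cost
        heapq.heappush(free_at, end)
        finished.append(end)
    return finished


if __name__ == "__main__":
    # One slow 6 s request followed by five 200 ms requests, on 2 servers.
    requests = [6000, 200, 200, 200, 200, 200]
    print("pinned:", finish_times_pinned(requests, 2))
    print("shared:", finish_times_shared(requests, 2))
```

In this toy model, the pinned variant makes every request that lands behind the slow one finish after the 6 second mark, while the shared variant only delays the slow request itself. Which of the two matches the balancer's real behaviour is exactly what I am asking.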
Am I seeing this right? And if so, would it be better to change my balance to leastconn, so that a request takes the next available server instead of waiting for the next one in the round robin?

The change you see at around 11:45 was me raising the maxconn per server from 75 to 300. This effectively cleared the queue: the average server response time increased from 1 second to about 4 seconds, but overall my total response time to the client dropped to about 4 seconds. Here is a sample graph.
This archive was generated by hypermail 2.2.0 : 2010/01/14 04:15 CET