Why Layer 7 load balancing sucks…
by Malcolm Turnbull
Not sure if I should put a caveat at the top, but Willy Tarreau has written an excellent argument on behalf of Layer 7 load balancers.
Layer 7 is the part of the OSI model called the application layer. A typical example would be the traffic handled by a web server or database server.
Load balancing hardware marketing execs get very excited about the fact that their product can magically scale your application by using amazing Layer 7 technology in the load balancer, such as cookie inserts and tracking/re-writing. What they fail to mention is that any application that requires the load balancer to keep track of session-related information within the communications stream can never ever be scalable or reliable.
But let’s step back a minute and think about what we are trying to achieve with our load balancing solution.
Are we just looking to increase the load capacity or performance of our application by adding more application servers?
Or are we trying to achieve true scalability and true horizontal scaling for our application?
I would hope that you are trying to achieve scalability. This will allow you to cope with inevitable changes in demand for the application as well as enable simple maintenance. True scalability of a system enables you to be comfortable in the knowledge that you have a plan when your traffic increases 100-fold. Google is spending a lot of man hours and dollars finding a way to scale their current (fairly large) cluster. They are somewhat in a league of their own, trying to scale a system by 100 times when they already have an estimated 4 million cores.
BTW just to digress again: I don’t know for certain, but I am pretty sure Google uses layer 4 load balancing mixed with a lot of BGP stuff for global distribution. Stack loads of read-only and partitioned MySQL replicas, cached by stack loads of partitioned index servers (memcached maybe), plus stack loads of clustered file systems for storage… probably worth several blogs in itself…
And how do I know Google is using Layer 4 load balancing and not Layer 7? Because they are not stupid, that’s why!
This should become more obvious as I continue rambling.
A lot of talks about load balancing start off by saying that originally people used multiple DNS ‘A’ records to allow round robin access to a bunch of web servers to increase scalability. They will then go on to explain that this was rubbish because it didn’t have health checking, server weighting, feedback agents or cookies. Which is kind of obvious, but it’s not too hard to add health checking to your DNS server. What do you think a GSLB (Global Server Load Balancer) is? A lot of large scale production systems with enormous traffic still use this method because:
A) It’s simple
B) It works
I just realised that technically DNS round robin IS a Layer 7 Load Balancer… oh well that’s marketing for you.
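To make the "round robin plus health checking" idea concrete, here is a minimal sketch in Python. It is not how any particular DNS server or GSLB is implemented; the class name, the `is_healthy` callback and the example addresses are all made up for illustration.

```python
class RoundRobinResolver:
    """Hand out server addresses in rotation, skipping unhealthy ones.

    Hypothetical sketch: a real DNS server or GSLB would run its health
    checks in the background and answer with actual 'A' records.
    """

    def __init__(self, addresses, is_healthy):
        self._addresses = list(addresses)
        self._is_healthy = is_healthy  # assumed callback: address -> bool
        self._next = 0

    def resolve(self):
        """Return the next healthy address, or None if every server is down."""
        for _ in range(len(self._addresses)):
            address = self._addresses[self._next]
            self._next = (self._next + 1) % len(self._addresses)
            if self._is_healthy(address):
                return address
        return None


# Example: the middle server is down, so it never appears in the answers.
down = {"192.0.2.11"}
resolver = RoundRobinResolver(
    ["192.0.2.10", "192.0.2.11", "192.0.2.12"],
    lambda address: address not in down,
)
answers = [resolver.resolve() for _ in range(4)]
# answers == ["192.0.2.10", "192.0.2.12", "192.0.2.10", "192.0.2.12"]
```

Nothing here looks at sessions or cookies, which is rather the point: the rotation stays simple because the servers behind it are interchangeable.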
Anyway, most people started using little black boxes from Cisco, Alteon etc. that were effectively simple routers. They called them layer 4 routers because as well as doing standard layer 3 router stuff they would also do application health checks such as ping or HTTP GET. The nice thing about these little 486-class boxes with 32MB RAM is that they could handle tens of thousands of connections without breaking a sweat.
NB. The Loadbalancer.org entry level appliance is a P4 3GHz with 512MB RAM but a VIA-EPIA would have similar performance characteristics apart from the limited IO.
However, not content with a technology that was so boring because ‘it just worked’, various marketing departments came up with the idea that rather than bothering to make your application scalable you should just slap some sticky tape on the load balancer at the front end that ensures a client’s connections always go to the same server. What they did was put a proxy application on the load balancer that allowed it to terminate communication streams and read or modify them, i.e. insert cookies so that it would know which server in the group to send the connection to. This introduced a whole heap of issues, because a horrendous architecture design could now be sticky plaster fixed by the load balancer vendor, who in the process got to charge the customer more money for the privilege :-).
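For anyone who hasn’t seen cookie insertion up close, the trick looks roughly like the Python sketch below. This is an illustration only, not any vendor’s implementation; the `SERVERID` cookie name, the `StickyProxy` class and the server names are assumptions of mine.

```python
from itertools import cycle

COOKIE_NAME = "SERVERID"  # hypothetical cookie name, chosen for this sketch


class StickyProxy:
    """Toy layer 7 proxy that pins each client to one server via a cookie."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._rotation = cycle(self._servers)

    def route(self, request_cookies):
        """Return (server, cookies_to_set) for one HTTP request.

        The proxy has to terminate the connection and parse the request
        just to find the cookie, which is the extra work layer 4 avoids.
        """
        server = request_cookies.get(COOKIE_NAME)
        if server in self._servers:
            return server, {}  # returning client: stick to the same server
        server = next(self._rotation)  # new client: pick one and tag it
        return server, {COOKIE_NAME: server}


proxy = StickyProxy(["web1", "web2"])
first_server, to_set = proxy.route({})       # ("web1", {"SERVERID": "web1"})
second_server, _ = proxy.route(to_set)       # "web1" again, thanks to the cookie
other_server, _ = proxy.route({})            # a different client lands on "web2"
```

Note that the cookie only says which server to use. The session itself still lives on that one server and nowhere else.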
I have nothing against cookies (especially chocolate chip) but they should be in either:
a) The Database
b) Memcached (or something similar)
NOT on the load balancer. If your cookies are on the load balancer, then they are totally useless to the application and therefore you can’t get a session to fail over to another server in the cluster.
Small caveat before I get flamed: Yes, I know some applications are legacy and badly written (i.e. without a persistent data store), but that doesn’t make cookies on load balancers any more elegant than a sticky plaster.
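Here is a minimal Python sketch of the alternative: session data kept in a store that every node can reach, so any server can pick up any client. The `SessionStore` class is an in-memory stand-in for a real database or memcached pool, and all names in it are hypothetical.

```python
class SessionStore:
    """In-memory stand-in for a shared database or memcached pool."""

    def __init__(self):
        self._data = {}

    def load(self, session_id):
        # Return a copy so callers cannot change the store by accident.
        return dict(self._data.get(session_id, {}))

    def save(self, session_id, session):
        self._data[session_id] = dict(session)


class AppServer:
    """One node in the cluster; it keeps no session state of its own."""

    def __init__(self, name, store):
        self.name = name
        self._store = store

    def add_to_basket(self, session_id, item):
        session = self._store.load(session_id)
        session["basket"] = session.get("basket", []) + [item]
        self._store.save(session_id, session)
        return session["basket"]


store = SessionStore()
web1 = AppServer("web1", store)
web2 = AppServer("web2", store)

web1.add_to_basket("abc123", "book")          # request handled by web1
basket = web2.add_to_basket("abc123", "pen")  # web1 "fails", web2 carries on
# basket == ["book", "pen"]
```

Because neither server holds the session locally, the load balancer in front can send each connection wherever it likes, and a dead server costs the client nothing.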
A quick summary:
Layer 4 Load Balancer:
- Scalable
- Reliable
- High Performance
- Load Balancing
Layer 7 Load Balancer
- Not Scalable
- Low performance (needs very powerful hardware)
- Far more complex code base
- Terminates Connections and therefore needs custom code for each protocol + security headache
- Becomes a single point of failure (requires bandwidth for syncing session state with backup device)
- Real Servers in cluster by definition can’t fail over and therefore:
- NOT conducive to High-Availability
- Load is distributed but definitely not balanced
Another caveat: OK, so I’m being a bit mean. Yes, you can minimize virtually all of the downsides to layer 7 load balancing with modern devices, good planning etc. You can apply enough sticky plaster bits of code to get the whole thing working as expected. You could use layer 4 load balancers to balance the load over multiple layer 7 units to get over the performance issues and….
Hang on a minute, if the application handled persistence correctly then surely we wouldn’t need all this crap?
KISS (Keep It Simple Stupid)
Surely any web application developer in the whole wide world by now understands that any persistent data should be available to all nodes in the cluster? One of the first things a web developer should decide is where the application state data is going to be held.
Where the developer should keep session data:
- In a database (clustered or replicated of course)
- In a memory cache pool (clustered or replicated of course)
What they do in practice and then regret later:
- In the standard local PHP or ASP/.Net container
NB. There are now well established routes to making the standard containers persistent.
I will come back to this later…


