Load balancing

Load balancing is used to avoid overloading servers and to deliver web services efficiently from a multi-server architecture. It helps ensure scalability and availability of services. This chapter describes the deployment options and the differences between the Beryllium and Boron LBaaS implementations.


Load balancers distribute the server requests or traffic in some proportion to the back-end servers. A load balancing implementation therefore takes into account flow parameters of offered requests and uses an algorithm for distributing flows between the back-end servers (server farm), which may or may not take the state of the servers into account when deciding the routes. By performing health checks of the back-end servers, load balancers are able to avoid sending traffic to those that are not able to fulfill a new request.

Load balancing should be regarded as a traffic management concept rather than a specific function. It can be implemented in many different ways. Load balancers can be implemented on dedicated hardware or as software running on existing resources. In an SDN environment, load balancers are typically software-based to maximize flexibility.


A Layer-3 load balancer uses only the source and destination IP addresses from the packet header for routing decisions. In ECMP, routing is controlled by flow-related functions in addition to the information included in the packet header.

Layer-4 load balancers operate at the transport level. Routing decisions are based on the TCP or UDP ports that packets use, along with their source and destination IP addresses. L4 load balancers perform network address translation and forward packets to and from the upstream server, generally without inspecting the packet contents.

They can, however, make limited routing decisions by inspecting the first few packets in the TCP stream.

A Layer-7 load balancer operates at the application layer, which means that the actual content of each message can affect the routing. This typically applies to HTTP traffic.

A Layer-7 load balancer terminates the network traffic and inspects the message carried in the packets. Based on the message content, such as the URL or a cookie, it then sets up a new TCP connection to the selected back-end server and forwards the request.
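The content-based decision described above can be sketched as follows. This is a minimal illustration only, not how any particular product implements it: the pool names, addresses, path prefixes and the SERVERID cookie name are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    """The parts of an HTTP request that this sketch inspects."""
    path: str
    cookies: dict = field(default_factory=dict)


# Hypothetical pools of back-end servers, keyed by pool name.
POOLS = {
    "static": ["10.0.0.11", "10.0.0.12"],
    "api": ["10.0.1.21", "10.0.1.22"],
    "default": ["10.0.2.31"],
}


def choose_pool(request):
    """Pick a pool from the URL path (application-layer content)."""
    if request.path.startswith("/static/"):
        return "static"
    if request.path.startswith("/api/"):
        return "api"
    return "default"


def choose_backend(request):
    """Honour a server pinned by cookie, else take the pool's first server."""
    pool = POOLS[choose_pool(request)]
    pinned = request.cookies.get("SERVERID")  # hypothetical cookie name
    if pinned in pool:
        return pinned
    return pool[0]
```

A real Layer-7 load balancer would combine this step with one of the distribution algorithms instead of always taking the first server in the pool.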

Common routing algorithms

A load balancer needs an algorithm to determine how requests are distributed across the server farm. There are many different ways to do this - from very simple ones to sophisticated algorithms taking server performance indicators and other information into account.

Common load balancing algorithms include:

  • Round robin - A simple technique for making sure that a virtual server forwards each client request to a different server based on a rotating list. It is easy to implement, but does not take into account the load a server is currently experiencing. There is a danger that a server may receive a lot of processor-intensive requests and become overloaded.

  • Least connection method - Virtual servers using the least connection method send any new request to the server with the fewest active connections.

  • Least response time method - Relies on the time taken by a server to respond to a health monitoring request. The speed of the response is an indicator of how loaded the server is and the overall expected user experience. Some load balancers take into account the number of active connections on each server as well.

  • Least bandwidth method - Sends a new request to the server currently serving the least amount of traffic, measured in megabits per second (Mbps).

  • Hashing methods - Make routing decisions based on a hash function of data from the incoming packet. This includes connection or header information, such as source/destination IP address, port number, URL or domain name.
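Three of the algorithms listed above can be sketched in a few lines each. The class and function names below are illustrative assumptions, and the use of CRC32 as the hash function is an arbitrary choice for the example rather than what any specific load balancer uses.

```python
import itertools
import zlib


class RoundRobin:
    """Hand out servers from a rotating list, ignoring their load."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class LeastConnections:
    """Send each new request to the server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        # Ties are resolved in favour of the server listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        """Call when a connection to the server is closed."""
        self.active[server] -= 1


def hash_pick(servers, src_ip, src_port, dst_ip, dst_port, protocol):
    """Map a flow to a server using a hash of its connection information."""
    key = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}-{protocol}".encode()
    return servers[zlib.crc32(key) % len(servers)]
```

Round robin and least connections keep state in the load balancer, whereas the hashing method is stateless: the same flow always maps to the same server as long as the server list does not change.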

Equal Cost Multi-Path

A router-based load balancing function tries to spread the traffic evenly over multiple paths. ECMP can be used to achieve this. However, plain ECMP lacks two properties: symmetry and stickiness.

For every forward flow, the reverse flow must follow the same path. This is necessary because many services are stateful and need to capture both directions of the flow. However, the normal ECMP hash function is not symmetric. Symmetry is achieved by using a special flow table for this purpose.
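As a rough sketch of how a flow table can provide symmetry, the example below installs an entry for the reverse direction whenever a new forward flow is assigned a path. The names are hypothetical and CRC32 stands in for whatever hash function the router actually uses.

```python
import zlib


def flow_key(src_ip, src_port, dst_ip, dst_port, protocol):
    """Build the 5-tuple key that identifies one direction of a flow."""
    return (src_ip, src_port, dst_ip, dst_port, protocol)


def reverse_key(key):
    """Return the key of the opposite direction of the same flow."""
    src_ip, src_port, dst_ip, dst_port, protocol = key
    return (dst_ip, dst_port, src_ip, src_port, protocol)


def hash_path(key, paths):
    """Plain hash-based selection; forward and reverse keys may differ."""
    data = "|".join(map(str, key)).encode()
    return paths[zlib.crc32(data) % len(paths)]


class FlowTable:
    """Records the chosen path for both directions of every flow."""

    def __init__(self, paths):
        self.paths = list(paths)
        self.entries = {}

    def lookup(self, key):
        if key not in self.entries:
            path = hash_path(key, self.paths)
            self.entries[key] = path
            # Install the reverse entry so return traffic takes the same path.
            self.entries[reverse_key(key)] = path
        return self.entries[key]
```

Because the hash is only consulted for the first packet of a flow, the asymmetry of the hash function no longer matters once the table entry exists.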

Once a flow has been assigned to a particular path, it should follow that path for the entire duration of the flow. This is known as stickiness. If the flow is moved to a different path in the middle of the flow, the service will typically fail. When a new resource is added, hashing typically leads to most flows being moved to another path. This problem can also be completely eliminated by using the flow table.
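The effect of adding a resource can be illustrated with a small experiment. The sketch below assumes a simple hash-modulo-N path selection (with SHA-256 as an arbitrary hash and made-up flow identifiers), which is one common way to implement hashing but not necessarily the one used by a given router. With a uniform hash, a flow keeps its path when going from four to five paths only if the hash gives the same remainder modulo 4 and modulo 5, which holds for 4 out of every 20 hash values, so about four in five flows are expected to move.

```python
import hashlib


def hash_path(flow_id, paths):
    """Select a path as hash(flow) modulo the number of paths."""
    digest = hashlib.sha256(flow_id.encode()).digest()
    return paths[int.from_bytes(digest[:8], "big") % len(paths)]


flows = [f"flow-{i}" for i in range(1000)]  # hypothetical flow identifiers
old_paths = ["p1", "p2", "p3", "p4"]
new_paths = old_paths + ["p5"]

# Without a flow table: recompute the hash after the fifth path is added.
moved = sum(hash_path(f, old_paths) != hash_path(f, new_paths) for f in flows)

# With a flow table: existing flows keep the path recorded when they started.
flow_table = {f: hash_path(f, old_paths) for f in flows}


def sticky_path(flow_id, paths):
    """Use the recorded path if present; hash only for new flows."""
    if flow_id not in flow_table:
        flow_table[flow_id] = hash_path(flow_id, paths)
    return flow_table[flow_id]


moved_sticky = sum(sticky_path(f, new_paths) != hash_path(f, old_paths) for f in flows)
```

Here `moved` counts the flows that change path under plain hashing, while `moved_sticky` is zero because every established flow is served from the flow table; only flows that start after the change are hashed over the new set of paths.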

LBaaS in Beryllium

Contrail implementation

In a Contrail vRouter, the hypervisor uses Equal Cost Multi-Path (ECMP), which assigns flows to paths based on a hash function that takes the source and destination IP addresses, ports and protocol as arguments, supported by an interface-route table.

LBaaS is a Contrail feature offered by Pan-Net as an optional service. It has the following characteristics:

  • Can be used for connectivity within a data center or between data centers

  • Flow hashing based on 5-tuple (source IP, destination IP, protocol, source port, destination port), that is, layer 4

  • Establishes direct connection between components and supports flow stickiness

  • Health checks can be configured on a port basis. The port will be set to down status if the health check fails and will therefore be removed from the list of next-hops. Health checks can be configured for ping and HTTP.
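The interaction between health checks and the next-hop list described above can be sketched as follows. This is a simplified model with hypothetical names, not Contrail code; CRC32 stands in for the actual 5-tuple hash, and the sketch ignores stickiness for flows that are already established.

```python
import zlib


class NextHopList:
    """Tracks port status and hashes flows over the healthy ports only."""

    def __init__(self, ports):
        # Every port starts in the "up" state.
        self.status = {port: True for port in ports}

    def record_health_check(self, port, passed):
        """A failed check sets the port down; a passed check sets it up."""
        self.status[port] = passed

    def active(self):
        """Ports currently eligible as next-hops."""
        return [port for port, up in self.status.items() if up]

    def pick(self, five_tuple):
        """Hash the 5-tuple over the ports that passed their health check."""
        candidates = self.active()
        if not candidates:
            raise RuntimeError("no healthy next-hop available")
        data = "|".join(map(str, five_tuple)).encode()
        return candidates[zlib.crc32(data) % len(candidates)]
```

For example, after `record_health_check("port-b", False)`, the port `port-b` no longer appears in `active()` and receives no new flows until a later check passes.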

To enable LBaaS in Beryllium, please contact Technical Support.

LBaaS in Boron

L4/L7 LBaaS

The LBaaS implementation in Boron is based on Octavia. It consists of a number of resource types in the architecture shown in Figure 1.

Briefly, the main components and terminology are the following:

  • API controller and controller worker - the API is accessed by OSC commands to deploy, configure or remove load balancer (amphora) instances.

  • Health monitor - monitors the status of amphora instances and handles failover events in case an instance fails. Note that it checks that a back-end server is reachable by responding to ping, but not the status of the application.

  • Housekeeping manager - manages pools, database records and certificate rotation.

  • Loadbalancer - the topmost object of the load balancer. When the load balancer is created, a VIP address is assigned to it and an amphora instance is launched on a compute node.

  • Amphora - the instance performing the load balancing. These are typically running on compute instances, configured automatically and managed through the OSC. Each amphora instance sends a heartbeat signal to the health monitor.

  • Listener - this defines the listening endpoint of a service, for example HTTP. A listener may refer to several pools. A pool is associated with one listener only.

  • Pool - group of members behind a load balancer, associated with a listener.

  • Member - these are the back-end servers (compute instances), organized in a pool, which the load balancer distributes requests to.
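The relationships between these objects can be summarized in a small data model. The sketch below only mirrors the terminology above; the class and field names are illustrative assumptions and do not correspond to the Octavia API or its database schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Member:
    """A back-end server that receives requests."""
    address: str
    port: int


@dataclass
class Pool:
    """A group of members, associated with at most one listener."""
    name: str
    members: List[Member] = field(default_factory=list)
    listener: Optional["Listener"] = field(default=None, repr=False, compare=False)


@dataclass
class Listener:
    """A listening endpoint, for example HTTP on port 80."""
    protocol: str
    port: int
    pools: List[Pool] = field(default_factory=list)

    def add_pool(self, pool):
        # A listener may refer to several pools, but a pool has one listener.
        if pool.listener is self:
            return
        if pool.listener is not None:
            raise ValueError(f"pool {pool.name!r} already has a listener")
        pool.listener = self
        self.pools.append(pool)


@dataclass
class LoadBalancer:
    """Topmost object; owns the VIP address and the listeners."""
    vip_address: str
    listeners: List[Listener] = field(default_factory=list)
```

In this model a load balancer with one HTTP listener and one pool would be built by creating the `Listener`, appending it to `LoadBalancer.listeners`, and attaching the pool with `add_pool`.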

This LBaaS is available as self-service. For deployment details, please see Configure LBaaS.

L3 LBaaS with ECMP

In Boron, an LBaaS product supporting ECMP is also available. It is based on BGPaaS for route management. For deployment of this load balancer, please contact Technical Support.