Custom Load Balancing Endpoints in an Azure Web/Worker/VM Role

Windows Azure Web, Worker and Virtual Machine roles provide an easy built-in way to customise health monitoring for a load balanced endpoint, allowing you to disable a single endpoint for a role without causing the entire role to recycle. This can be achieved through use of the LoadBalancerProbes schema element, which is available in Azure SDK 1.7+.

Background

The Windows Azure Load Balancer running on the Azure Fabric Service acts as the default controller for determining how to route incoming network traffic to endpoints on your role instances. A default load balancer probe is provided that covers all endpoints for each role instance - this probe is high-level and simply returns HTTP 200 OK if the role is in the Ready state (not Busy, Recycling, Stopping, etc.). If the response is not 200 OK, the load balancer stops routing traffic to that instance.

Once the role instance starts returning HTTP 200 again, the load balancer resumes traffic flow. When running a standard web role, your code usually runs in the w3wp.exe process, which isn't actually monitored by the load balancer (so failures like your web application returning HTTP 500 Internal Server Error won't take the instance out of rotation).

Overriding the default probe

If you override the default probe for an endpoint, you can provide more complex, lower-level logic for each individual endpoint in your service. Your probe is checked regularly (every 15 seconds by default) - if your probe responds with an HTTP 200 or TCP ACK within the timeout period (31 seconds by default) then the associated endpoint will have traffic routed to it as normal. If the probe starts returning any other HTTP code or TCP message, or fails to respond within the timeout, the endpoint will be removed from load balancing.
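
To make the rule above concrete, here is a minimal Python sketch of the decision as described in this post for an HTTP probe. It is a model for illustration only, not the load balancer's actual implementation: the names ProbeResult and endpoint_in_rotation are hypothetical, and only the two default values come from the text above.

```python
from dataclasses import dataclass
from typing import Optional

# Defaults quoted in the post; the csdef attributes override them.
DEFAULT_INTERVAL_SECONDS = 15
DEFAULT_TIMEOUT_SECONDS = 31


@dataclass
class ProbeResult:
    """Outcome of one probe attempt (hypothetical type for illustration)."""
    status_code: Optional[int]  # None means the probe got no response at all
    elapsed_seconds: float      # time taken for the response to arrive


def endpoint_in_rotation(result: ProbeResult,
                         timeout_seconds: float = DEFAULT_TIMEOUT_SECONDS) -> bool:
    """Return True if this probe result keeps the endpoint in rotation.

    Simplified model: HTTP 200 within the timeout keeps the endpoint in,
    anything else takes it out.
    """
    if result.status_code is None:
        return False
    if result.elapsed_seconds > timeout_seconds:
        return False
    return result.status_code == 200
```

For example, endpoint_in_rotation(ProbeResult(200, 0.2)) evaluates to True, while a 500 response or a reply that takes 40 seconds evaluates to False.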

Usages

You can use this in multiple ways, for example:

  1. Ensuring only one instance of your role provides a selected endpoint at a time.
  2. Disabling an instance if one of your websites starts returning an unusually large number of HTTP errors for a specified URI.
  3. Removing a single endpoint from load balancer rotation if it becomes overloaded - for example, temporarily disabling new requests to port 80 on a web role if that instance becomes overloaded by a small number of unusually heavy requests (this would normally cause problems given the default load balancing is round robin).
  4. Disabling an endpoint when a custom service becomes unavailable, for example stopping requests to a virtual machine role database if the database is encountering issues (while still allowing requests to all other services).
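
As a rough illustration of usage 2, the sketch below shows what a custom probe target could look like: a small HTTP handler that answers 200 while the recent error ratio is acceptable and 503 otherwise. It is written in Python purely for brevity (a real role would more likely do this in .NET), and everything in it is an assumption for illustration: the /probe path, port 8080, the window size, the threshold, and the ErrorRateTracker class, which your application would need to feed with its own response codes.

```python
from collections import deque
from http.server import BaseHTTPRequestHandler, HTTPServer


class ErrorRateTracker:
    """Tracks the most recent responses and reports whether too many failed."""

    def __init__(self, window_size: int = 100, max_error_ratio: float = 0.5):
        # Each entry is True for a server error (5xx), False otherwise.
        self._outcomes = deque(maxlen=window_size)
        self._max_error_ratio = max_error_ratio

    def record(self, status_code: int) -> None:
        """Record the status code of a response the application just sent."""
        self._outcomes.append(status_code >= 500)

    def healthy(self) -> bool:
        """Healthy when there is no data yet or the error ratio is within limits."""
        if not self._outcomes:
            return True
        return sum(self._outcomes) / len(self._outcomes) <= self._max_error_ratio


# Shared tracker; the application is assumed to call tracker.record(...)
tracker = ErrorRateTracker()


class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/probe":  # hypothetical probe path
            # 200 keeps the endpoint in rotation; 503 asks to be taken out.
            self.send_response(200 if tracker.healthy() else 503)
        else:
            self.send_response(404)
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep frequent probe requests out of the log


if __name__ == "__main__":
    # Port 8080 is an assumption; it must match the port given to the probe.
    HTTPServer(("0.0.0.0", 8080), ProbeHandler).serve_forever()
```

The same shape works for the other usages: swap the healthy() check for whatever condition should take the endpoint out of rotation, such as a load measurement or a database connectivity test.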

Gotchas

Example .csdef schema

<ServiceDefinition>
  <LoadBalancerProbes>
    <LoadBalancerProbe name="TestProbe" protocol="{http|tcp}" path="{uri-for-checking-health-status-of-vm}" port="{port-number}" intervalInSeconds="{interval-in-seconds}" timeoutInSeconds="{timeout-in-seconds}" />
  </LoadBalancerProbes>
  <WorkerRole>
  ...
    <Endpoints>
      <InputEndpoint name="HttpIn" protocol="http" port="80" localPort="80" loadBalancerProbe="TestProbe" />
    </Endpoints>
  ...
  </WorkerRole>
</ServiceDefinition>

For a real world example of when a LoadBalancerProbe might be useful, see this post.

LoadBalancerProbe element attributes
