Thursday, March 13, 2014

Performance of N+1 Redundancy

How can you determine the performance impact on SLAs after an N+1 redundant hosting configuration fails over? This question came up in the Guerrilla Capacity Planning class this week. It can be addressed by referring to a multi-server queueing model.

N+1 = 4 Redundancy

We begin by considering a small-N configuration of four hosts where the load is distributed equally to each of the hosts. For simplicity, the load distribution is assumed to be performed by some kind of load balancer with a buffer. The idea of N+1 redundancy is that the load balancer ensures all four hosts are equally utilized prior to any failover.

The idea is that no host should use more than 75% of its available capacity: the blue areas on the left side of Fig. 1. The total consumed capacity is then assumed to be $4 \times 3/4 = 3$ hosts' worth, or 300% of the configuration, rather than the full 4 hosts or 400%. When any single host fails, its lost capacity is supposedly compensated by redistributing that same load across the remaining three hosts, each of which runs 100% busy after failover. As we shall show in the next section, this is a misconception.
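The capacity bookkeeping behind this argument can be reproduced in a few lines of Python. This is only a sketch of the arithmetic as stated above; the function names are hypothetical, and the calculation says nothing about response times.

```python
from fractions import Fraction

def nominal_utilization_cap(hosts):
    """Per-host utilization at which the total load would just fit on hosts - 1 machines."""
    return Fraction(hosts - 1, hosts)

def per_host_utilization(hosts, total_load):
    """Utilization of each host when total_load is spread equally over all hosts."""
    return Fraction(total_load) / hosts

hosts = 4                                         # the N+1 = 4 configuration
before = nominal_utilization_cap(hosts)           # 3/4 per host before failover
total = before * hosts                            # 3 hosts' worth of load in total
after = per_host_utilization(hosts - 1, total)    # 1, i.e. 100% busy on each survivor

print(f"before failover: {float(before):.0%} per host, total load = {total} hosts")
print(f"after failover:  {float(after):.0%} per host on {hosts - 1} hosts")
```

Exact fractions are used so that the 3/4 and 3 in the text come out without floating-point rounding.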

The circles in Fig. 1 represent hosts and rectangles represent incoming requests buffered at the load balancer. The blue area in the circles signifies the available capacity of a host, whereas white signifies unavailable capacity. When one of the hosts fails, its load must be redistributed across the remaining three hosts. What Fig. 1 doesn't show is the performance impact of this capacity redistribution.
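One way to put a number on that impact is the multi-server queueing model mentioned at the top of this post. The following Python sketch treats the hosts as an M/M/m queue fed from a single buffer and uses the Erlang C formula to compute the mean residence time. The M/M/m choice, the function names, and the unit service time are assumptions made for illustration, not details taken from the post.

```python
import math

def erlang_c(m, a):
    """Probability that an arriving request has to wait in an M/M/m queue.

    m: number of hosts; a: offered load in hosts' worth of work
    (arrival rate times mean service time). Requires a < m.
    """
    rho = a / m                                    # per-host utilization
    tail = a**m / math.factorial(m) / (1.0 - rho)
    head = sum(a**k / math.factorial(k) for k in range(m))
    return tail / (head + tail)

def residence_time(m, a, service_time=1.0):
    """Mean time in system (waiting plus service) for an M/M/m queue.

    Returns math.inf when the hosts are 100% busy or more, because the
    model has no steady state there and the buffer grows without bound.
    """
    rho = a / m
    if rho >= 1.0:
        return math.inf
    wait = erlang_c(m, a) * service_time / (m * (1.0 - rho))
    return service_time + wait

load = 3.0  # assumed: 4 hosts at 75% busy is 3 hosts' worth of work
print("4 hosts:", residence_time(4, load))
print("3 hosts:", residence_time(3, load))
```

Under these assumptions, four hosts at 75% busy give a residence time of about 1.51 service periods. With the same load on three hosts, each host is 100% busy and the function returns infinity, which is the model's way of saying the load-balancer buffer never drains.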

Wednesday, March 12, 2014

Modeling the Mythical Man-Month


This article was originally posted in 2007. When I updated the image today (in 2014), it reappeared with the more recent date and I don't know how to override that (wrong) timestamp. This seems to be a bug in Blogger.

In the aftermath of a discussion about software management, I looked up the Mythical Man-Month concept on Wikipedia. The main thesis of Fred Brooks, often referred to as "Brooks's law [sic]," can simply be stated as:

Adding manpower to a late software project makes it later.

In other words, some number of cooks are necessary to prepare a dinner, but adding too many cooks in the kitchen can inflate the delivery schedule.