Friday, August 4, 2017

Guerrilla Training September 2017

This year offers a new 3-day class on applying PDQ in your workplace. The classic Guerrilla classes, GCAP and GDAT, are also being presented.

Who should attend?

  • architects
  • application developers
  • performance engineers
  • sys admins
  • test engineers
  • mainframe sysops
  • database admins
  • webops
  • anyone interested in getting beyond ad hoc performance analysis and capacity planning skills

Some course highlights:

  • There are only 3 performance metrics you need to know
  • How to quantify scalability with the Universal Scalability Law
  • Hadoop performance and capacity management
  • Virtualization Spectrum from hyper-threads to cloud services
  • How to detect bad data
  • Statistical forecasting techniques
  • Machine learning algorithms applied to performance data

Register online. Early-bird discounts run through the end of August.

As usual, Sheraton Four Points has bedrooms available at the Performance Dynamics discounted rate. Link is on the registration page.

Also see what graduates are saying about these classes.

Tell a colleague and see you in September!

Wednesday, March 15, 2017

Morphing M/M/m: A New View of an Old Queue

The following abstract has been accepted for presentation at the 21st Conference of the International Federation of Operational Research Societies — IFORS 2017, Quebec City, Canada.

  • Update July 31, 2017: Here are my IFORS slides
  • Update June 08, 2018: In response to an audience question at my IFORS 2017 session, I have now demonstrated that there is an upper bound for the error in the morphing approximation. See footnotes below.
  • Update August 18, 2020: This new paper has the complete exact solution method that I had been seeking all along.

This year is the centenary of A. K. Erlang's paper [1] on the determination of waiting times in an M/D/m queue with $m$ telephone lines.* Today, M/M/m queues are used to model such systems as, call centers [3], multicore computers [4,5] and the Internet [6,7]. Unfortunately, those who should be using M/M/m models often do not have sufficient background in applied probability theory. Our remedy defines a morphing approximation to the exact M/M/m queue [3] that is accurate to within 10% for typical applications. The morphing formula for the residence-time, $R(m,\rho)$, is both simpler and more intuitive than the exact solution involving the Erlang-C function. We have also developed an animation of this morphing process. An outstanding challenge, however, has been to elucidate the nature of the corrections that transform the approximate morphing solutions into the exact Erlang solutions. In this presentation, we show:
  • The morphing solutions correspond to the $m$-roots of unity in the complex $z$-plane.
  • The exact solutions can be expressed as a rational function, $R(m,z)$.
  • The poles of $R(m,z)$ lie inside the unit disk, $|z| < 1$, and converge around the Szego; curve [8] as $m$ is increased.
  • The correction factor for the morphing model is defined by the deflated polynomial belonging to $R(m,z)$.
  • The pattern of poles in the $z$-plane provides a convenient visualization of how the morphing solutions differ from the exact solutions.

* Originally, Erlang assumed the call holding time, or mean service time $S$, was deterministic with unit period, $S=1$ [1,2]. The generalization to exponentially distributed service periods came later. Ironically, the exponential case is easier to solve than the apparently simpler deterministic case. That's why the M/D/1 queue is never the first example discussed in queueing theory textbooks.
† The derivation of the morphing model is presented in Section 2.6.6 of the 2005 edition of [4], although the word "morphing" is not used there. The reason is, I didn't know how to produce the exact result from it, and emphasizing it would likely have drawn unwarranted attention from Springer-Verlag editors. By the time I was writing the 2011 edition of [4], I was certain the approximate formula did reflect the morphing concept in its own right, even though I still didn't know how to connect it to the exact result. Hence, the verb "morphs" timidly makes its first and only appearance in the boxed text following equation 4.61.
‡ The relative error peaks at 10% for $m \sim 20$ and $\rho \sim 90\%$, then peaks again at 20% for $m \sim 2000$ and the servers running 99.9% busy. However, the rate of increase in peak error attenuates such that the maximum error is less than 25%, even as $m \rightarrow \infty$ and $\rho \rightarrow 100\%$. A plot of the corresponding curves gives a clearer picture. This behavior is not at all obvious. Prior to this result, it could have been that the relative error climbed to 100% with increasing $m$ and $\rho$.

References

  1. A. K. Erlang, "Solution of Some Problems in the Theory of Probabilities of Significance in Automatic Telephone Exchanges," Electroteknikeren, v. 13, p. 5, 1917.
  2. A. K. Erlang, "The Theory of Probabilities and Telephone Conversations," Nyt Tidsskrift for Matematik B, vol 20, 1909.
  3. E. Chromy, T. Misuth, M. Kavacky, "Erlang C Formula and Its Use In The Call Centers," Advances in Electrical and Electronic Engineering, Vol. 9, No. 1, March 2011.
  4. N. J. Gunther, Analyzing Computer System Performance with Perl::PDQ, Springer-Verlag, 2005 and 2011.
  5. N. J. Gunther, S. Subramanyam, and S. Parvu, "A Methodology for Optimizing Multithreaded System Performance on Multicore Platforms," in Programming Multicore and Many-core Computing Systems, eds. S. Pllana and F. Xhafa, Wiley Series on Parallel and Distributed Computing, February 2017.
  6. N. J. Gunther, "Numerical Investigations of Physical Power-law Models of Internet Traffic Using the Renormalization Group," IFORS 2005, Honolulu, Hawaii, July 11—15.
  7. T. Bonald, J. W. Roberts, "Internet and the Erlang formula," ACM SIGCOMM Computer Communication Review, Volume 42, Number 1, January 2012.
  8. C. Diaz Mendoza and R. Orive, "The Szegő curve and Laguerre polynomials with large negative parameters," Journal of Mathematical Analysis and Applications, Volume 379, Issue 1, Pages 305—315, 1 July 2011.

Tuesday, January 17, 2017

GitHub Growth Appears Scale Free

Update of Thursday, August 17, 2017: It's looks like we can chalk up another one for the scale-free model (described below) as Github apparently surpasses 20 million users. Outgoing CEO Wanstrath mentioned this number in an emailed statement to Business Insider.
"As GitHub approaches 700 employees, with more than $200M in ARR, accelerating growth, and more than 20 million registered users, I'm confident that this is the moment to find a new CEO to lead us into the next stage of growth. ....."

The Original Analysis

In 2013, a Redmonk blogger claimed that the growth of GitHub (GH) users follows a certain type of diffusion model called Bass diffusion. Here, growth refers to the number of unique user IDs as a function of time, not the number project repositories, which can have a high degree of multiplicity.

In a response, I tweeted a plot that suggested GH growth might be following a power law, aka scale free growth. The tell-tale sign is the asymptotic linearity of the growth data on double-log axes, which the original blog post did not discuss. The periods on the x-axis correspond to years, with the first period representing calendar year 2008 and the fifth period being the year 2012.