Tuesday, September 29, 2015

Remember the Alamo at CMG 2015

The Alamo is a reference to an episode in Texan history about defeat and revenge. But there's nothing defeatist or mythical about the sessions I'll be giving at CMG in San Antonio this year.

Workshop: How to Do Performance Analytics with R, Mon Nov 2, 8am-12pm

You've collected cubic light-years of performance monitoring data, now whaddya gonna do? Raw performance data is not the same thing as information, and the typical time-series representation is almost the worst way to glean information. Neither your brain nor that of your audience is built for that (blame it on Darwin). To extract pertinent information, you need to transform your data and that's what the R statistical computing environment can help you do, including automatically.

Topics covered will include:

  • Introduction to R using RStudio
  • Descriptive statistics
  • Performance visualization
  • Data reduction techniques
  • Multivariate analysis
  • Machine learning techniques
  • Forecasting with R
  • Scalability analysis

Invited talk: Hadoop Super Scaling, Wed Nov 4, 5-6pm

The Hadoop framework is designed to facilitate the parallel processing of massive amounts of unstructured data. Originally intended to be the basis of Yahoo's search engine, it is now open sourced at Apache. Since Hadoop has a broad range of corporate users, a number of companies offer commercial implementations or support for Hadoop.

However, certain aspects of Hadoop performance---especially scalability---are not well understood. One such anomaly is the claimed flat scalability benefit for developing Hadoop applications. Another is that it's possible to achieve faster than parallel processing. In this talk I will explain the source of these anomalies by presenting a consistent method for analyzing Hadoop application scalability.

CMG-T: Capacity and Performance for Newbs and Nerds, Thur Nov 5, 9-11am

In this tutorial I will bust some entrenched myths and develop basic capacity and performance concepts from the ground up. In fact, any performance metric can be boiled down to one of just three metrics. Even if you already know metrics like throughput and utilization, that's not the most important thing: it's the relationship *between* those metrics that's vital! For example, there are at least three different definitions of utilization. Can you state them? This level of understanding can make a big difference when it comes to solving performance problems or presenting capacity planning results.

Other myths that will get busted along the way include:

  • There is no response-time knee.
  • Throughput is not the same as execution rate.
  • Throughput and latency are not independent metrics.
  • There is no parallel computing.
  • All performance measurements are wrong by definition.
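
As one illustration of how throughput and latency are tied together (not necessarily the example used in the tutorial), here is a minimal Python sketch. It assumes a single open M/M/1 queue, which is my choice for illustration; the function names are hypothetical. It uses two standard relations: the utilization law U = X * S and Little's law Q = X * R.

```python
def utilization(throughput, service_time):
    """Utilization law: U = X * S."""
    return throughput * service_time


def residence_time(throughput, service_time):
    """Residence time of an M/M/1 queue: R = S / (1 - U).

    Assumption: a single open queue with Poisson arrivals and
    exponentially distributed service times (M/M/1).
    """
    rho = utilization(throughput, service_time)
    if not 0.0 <= rho < 1.0:
        raise ValueError("queue is unstable unless 0 <= X * S < 1")
    return service_time / (1.0 - rho)


def queue_length(throughput, service_time):
    """Little's law: Q = X * R (requests waiting plus in service)."""
    return throughput * residence_time(throughput, service_time)


# Hypothetical numbers: 1-second service time, increasing throughput.
S = 1.0
for X in (0.1, 0.5, 0.9):
    print(X, residence_time(X, S), queue_length(X, S))
```

Under these assumptions, going from 50% to 90% utilization adds far more latency than going from 10% to 50%, which is the sense in which the two metrics are nonlinearly related.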

No particular knowledge about capacity and performance management is assumed.

See you in San Antonio!

Monday, August 24, 2015

PDQ Version 6.2.0 Released

PDQ (Pretty Damn Quick) is a FOSS performance analysis tool based on the paradigm of queueing models that can be programmed natively in

This minor release is now available for download.

Wednesday, July 29, 2015

Hockey Elbow and Other Response Time Injuries

You've heard of tennis elbow. Well, there's a non-sports, performance injury that I like to call hockey elbow. An example of such an "injury" is shown in Figure 1, which appeared in a recent computer performance analysis presentation. It's a reminder of how easy it is to become complacent when doing performance analysis and possibly end up reaching the wrong conclusion.


Figure 1. Injured response time performance

Figure 1 is seriously flawed for two reasons:

  1. It incorrectly shows the response time curve with a vertical asymptote.
  2. It compounds the first error by employing a logarithmic x-axis.
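
One standard setting in which a response time curve has no vertical asymptote is a closed queueing system with a finite number of users; whether that is the system behind Figure 1 is not stated in this excerpt, so treat the following as a sketch under that assumption. It applies exact mean value analysis (MVA) to a single queueing center with think time, and the parameter values are hypothetical. For large N the response time approaches the straight line N * S - Z, the handle of the "hockey stick", rather than shooting off to infinity at some finite load.

```python
def closed_response_times(max_users, service_time, think_time):
    """Exact MVA for one queueing center plus a think-time delay.

    Returns a list whose entry n-1 is the response time with n users.
    """
    queue_len = 0.0
    results = []
    for n in range(1, max_users + 1):
        # Arrival theorem: an arriving request sees the queue as it
        # was with n-1 users in the system.
        resp = service_time * (1.0 + queue_len)
        # Response time law for a closed system: X = N / (R + Z).
        thru = n / (resp + think_time)
        # Little's law gives the queue length for the next iteration.
        queue_len = thru * resp
        results.append(resp)
    return results


# Hypothetical parameters: S = 0.5 s service time, Z = 10 s think time.
S, Z, N_MAX = 0.5, 10.0, 200
R = closed_response_times(N_MAX, S, Z)

# Linear asymptote R = N * S - Z, meaningful above saturation.
asymptote = [n * S - Z for n in range(1, N_MAX + 1)]

for n in (1, 10, 21, 50, 200):
    print(n, round(R[n - 1], 4), asymptote[n - 1])
```

Plotting R against N on linear axes shows the elbow near N = (S + Z) / S, which is 21 users for these numbers. A logarithmic x-axis compresses the linear region and distorts that shape.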

Sunday, July 26, 2015

Next GCaP Class: September 21, 2015

The next Guerrilla Capacity Planning class will be held during the week of September 21, 2015 at our new Sheraton Four Points location in Pleasanton, California. Early bird rate ends August 21st.

During the class, I will bust some entrenched CaP management myths (in no particular order):

  • All performance measurements are wrong by definition.
  • There is no response-time knee.
  • Throughput is not the same as execution rate.
  • Throughput and latency metrics are related — nonlinearly.
  • There is no parallel computing.

No particular knowledge about capacity and performance management is assumed.

Attendees should bring their laptops as course materials are provided on CD or flash drive. The Sheraton provides free wi-fi to the internet.

We look forward to seeing you there!

Monday, March 23, 2015

Hadoop Scalability Challenges

Hadoop is hot, not because it necessarily represents cutting edge technology, but because it's being rapidly adopted by more and more companies as a solution for engaging in the big data trend. It may be coming to your company sooner than you think.

The Hadoop framework is designed to facilitate the parallel processing of massive amounts of unstructured data. Originally intended to be the basis of Yahoo's search engine, it is now open sourced at Apache. Since Hadoop now has a broad range of corporate users, a number of companies offer commercial implementations of Hadoop.

However, certain aspects of Hadoop performance, especially scalability, are not well understood. These include:

  1. So-called flat development scalability
  2. Super scaling performance
  3. New TPC big data benchmark

See "Hadoop Superlinear Scalability: The Perpetual Motion of Parallel Performance" for a more detailed discussion.

Friday, March 20, 2015

Performance Analysis vs. Capacity Planning

This question came up in a (members only) LinkedIn discussion group:
Often found a misconception about these terms. I'm sure this must be written in a book, but for informal discussions is always preferable to cite sources from standardization institutes or IT industry referents.

Thanks in advance


Gian Piero
Here's how I answered it.

Monday, March 9, 2015

Guerrilla Training: New Location

Finally! We have a new location for our Guerrilla training classes in Pleasanton, California: Sheraton Four Points.

We had some complaints last year about noise from the car parks of surrounding restaurants during the night at the previous location. Four Points is much more secluded. It also has its own restaurant, which some of you will recognize if you've attended previous Guerrilla classes (more than likely, we did lunch and/or dinner there).

The current 2015 schedule and registration page is now posted. The classroom is intimate and only holds about 10-12 people, so book early, book often.