Friday, January 18, 2013

Linux Per-Entity Load Tracking: Plus ça change

Canadian capacity planner David Collier-Brown pointed me at this post about some more proposed changes to how load is measured in the Linux kernel. He's not sure they're on the right track. David has written about such things as cgroups in Linux, and I'm sure he understands these things better than I do, so he might be right. I never understood the so-called CFS: the Completely Fair Scheduler. Is it a fair-share scheduler or something else? Not only was there a certain amount of political fallout over CFS, but do we even care about such things anymore? That was back in 2007. These days we are just as likely to run Linux in a VM under VMware or XenServer, or in the cloud. Others have proposed that the Linux load average metric be made "more accurate" by including IO load. Would that be local IO, remote IO, or both? Disk IO, network IO, or something else entirely?

I can't say I completely follow the latest per-entity load proposal, but I would suggest that change for the sake of change is not usually a good thing. And when it comes to metrics like the load average, there are plenty of automated tools, as well as sysadmins and cap planners, that rely on it as an historical measure of server run-queue length for trend analysis.

Without having thought about it too deeply, and if I were actually involved with Linux kernel development (which I'm not), my response would probably go something like this:

Dear Linux kernel devs (or someone named Linus), if you are going to make changes to things like the load average metric, knock yourself out. But please make it a new metric collector that is separately accessible from the existing metrics. That way, historical performance data and CaP models will not get screwed by your latest (possibly evanescent) brainwave.

David's observation should also serve as a warning to all performance analysts and cap planners. It means that performance metrics can change underneath you. That's a very disturbing thought, which most of us remain blissfully unaware of. And even if you are aware of it, what's a body to do?

As I describe in Chap. 6 of my Perl::PDQ book, the load average is historically the first instantiation of performance instrumentation. It appeared circa 1965 and persisted through the various releases of AT&T Unix and beyond; it must have been an extremely novel idea when it was introduced. But, because it's a metric defined in software, it's not guaranteed to be immutable. There are documented cases where the load average and other performance metrics have been broken by dint of some kernel dev's bright idea. I discuss some of them in the GCaP class.
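The damping calculation itself is small enough to show. What follows is a user-space sketch of the fixed-point update the Linux kernel has used historically; the FSHIFT and EXP_* constants come from the old include/linux/sched.h, but the simulation loop and its task count are my own illustrative choices.

/* Sketch of the classic Linux calc_load() fixed-point update.
 * Every 5 seconds (LOAD_FREQ) the kernel folds the instantaneous
 * count of active tasks, n, into three exponentially damped
 * averages. Build: cc calcload.c
 */
#include <stdio.h>

#define FSHIFT   11                 /* bits of fractional precision */
#define FIXED_1  (1 << FSHIFT)      /* 1.0 in fixed-point */
#define EXP_1    1884               /* FIXED_1 * exp(-5/60)  */
#define EXP_5    2014               /* FIXED_1 * exp(-5/300) */
#define EXP_15   2037               /* FIXED_1 * exp(-5/900) */

/* One damping step: load = load*exp + n*(1 - exp), in fixed-point. */
static unsigned long calc_load(unsigned long load,
                               unsigned long exp,
                               unsigned long n)
{
    load *= exp;
    load += n * (FIXED_1 - exp);
    return load >> FSHIFT;
}

int main(void)
{
    unsigned long avg1 = 0, avg5 = 0, avg15 = 0;

    /* Pretend 2 tasks are runnable at every 5-second sample for
     * 5 simulated minutes (60 samples). */
    for (int i = 0; i < 60; i++) {
        unsigned long n = 2 * FIXED_1;  /* task count, fixed-point */
        avg1  = calc_load(avg1,  EXP_1,  n);
        avg5  = calc_load(avg5,  EXP_5,  n);
        avg15 = calc_load(avg15, EXP_15, n);
    }
    printf("loadavg: %.2f %.2f %.2f\n",
           avg1  / (double)FIXED_1,
           avg5  / (double)FIXED_1,
           avg15 / (double)FIXED_1);
    return 0;
}

Run it and the 1-minute column converges toward 2.00 while the 15-minute column lags behind, which is exactly the exponential smoothing behavior that historical trending tools depend on.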

Therefore, we performance analysts and cap planners need to independently verify that the metrics we collect (such as the load average) have valid definitions and that they remain consistent over time. The most efficient way to do that is by regression testing the performance tools that we rely on, along the lines of the sketch below. How to do that is a topic that I present in the GDAT class.
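As a concrete example of what I mean, here is a minimal sketch of such a test: it drives a known CPU load and checks that the reported 1-minute load average trends toward it. The thread count, warm-up period, and tolerance are arbitrary choices for illustration, not a vetted test suite, and it assumes an otherwise idle host.

/* Regression-test sketch for the load average metric.
 * Build: cc -pthread loadcheck.c
 */
#define _DEFAULT_SOURCE             /* for getloadavg() on glibc */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define NTHREADS 2                  /* known offered CPU load */
#define WARMUP   120                /* seconds for the 1-min EWMA to settle */

static void *spin(void *arg)        /* pure CPU burner */
{
    (void)arg;
    for (;;)
        ;
    return NULL;
}

int main(void)
{
    pthread_t t[NTHREADS];
    double avg[3];

    for (int i = 0; i < NTHREADS; i++)
        pthread_create(&t[i], NULL, spin, NULL);

    sleep(WARMUP);                  /* let the damped average converge */

    if (getloadavg(avg, 3) != 3) {
        fprintf(stderr, "getloadavg failed\n");
        return EXIT_FAILURE;
    }
    printf("1-min load = %.2f (expected about %d on an idle host)\n",
           avg[0], NTHREADS);

    /* Fail if the metric has drifted from its documented meaning. */
    return (avg[0] > NTHREADS - 1 && avg[0] < NTHREADS + 1)
               ? EXIT_SUCCESS : EXIT_FAILURE;
}

If a kernel change redefined what the load average counts, a test like this would fail in the build pipeline instead of silently corrupting years of trend data.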

1 comment:

Anonymous said...

The Linux load average already includes "IO", since processes blocked on disk are counted. This is unlike the original definition and other Unix implementations, which only include processes that are running on, or waiting for, the CPU. It means that disk-intensive Linux systems show a much higher load average, even when their CPU is idle. It's just a broken definition; I found this and traced it all the way into the kernel code back in 2007.

It would be an improvement to have per entity load tracking, as long as CPU and disk are separated.
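Anyone who wants to check that claim on a running system can do so from /proc: field 3 of /proc/<pid>/stat is a state letter that distinguishes runnable tasks (R) from tasks in uninterruptible sleep (D, typically disk wait), and Linux counts both toward its load average. Here is a minimal sketch that tallies the two states; the parsing is deliberately simple and assumes the usual Linux /proc layout.

/* Count R-state and D-state tasks, the two populations that feed
 * the Linux load average. The state letter sits just after the
 * ')' that closes the comm field in /proc/<pid>/stat.
 * Build: cc dstate.c
 */
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    DIR *proc = opendir("/proc");
    struct dirent *de;
    int running = 0, dstate = 0;
    char path[64], buf[512];

    if (!proc) {
        perror("/proc");
        return 1;
    }
    while ((de = readdir(proc)) != NULL) {
        if (!isdigit((unsigned char)de->d_name[0]))
            continue;                       /* not a PID directory */
        snprintf(path, sizeof path, "/proc/%s/stat", de->d_name);
        FILE *fp = fopen(path, "r");
        if (!fp)
            continue;                       /* task exited; skip it */
        if (fgets(buf, sizeof buf, fp)) {
            char *p = strrchr(buf, ')');    /* end of the comm field */
            if (p && p[1] == ' ') {
                if (p[2] == 'R') running++;
                if (p[2] == 'D') dstate++;
            }
        }
        fclose(fp);
    }
    closedir(proc);

    printf("runnable (R): %d  uninterruptible (D): %d\n", running, dstate);
    return 0;
}

Run it during a heavy dd or fsck and the D count climbs while the R count stays flat, even though the load average rises: exactly the behavior the commenter describes.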