What is load average? I’ve heard all kinds of vague explanations over the years, and it bothers me to keep hearing absolutely wrong descriptions of the term and of what counts as a “high” value for this number. I’ve heard things like “anything higher than 3X your number of CPUs is bad”, or “as long as it’s under 10 everything should be fine.” Not so.
Some of the misconceptions come from other UNIX and Linux operating systems, which measure the value differently. So an incorrect definition doesn’t necessarily demonstrate a lack of knowledge, only some ignorance of the way Solaris does it. Linux, for example, also includes in its calculation the threads waiting for I/O, not just threads waiting for CPU.
In previous versions of Solaris (2.3-2.9), load average was a simple calculation. It was the average number of runnable and running threads. In other words, it was the number of threads running on the CPUs, plus the number of threads in the run queue, waiting for CPUs, averaged over time.
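To make the older definition concrete, here is a minimal Python sketch. It is not how the kernel does it; it assumes we already have equally spaced samples of how many threads were running and how many were sitting in the run queue. The function name and the `(running, queued)` sample format are made up for illustration.

```python
def average_load(samples):
    """Average of (running + queued) thread counts over equally spaced samples.

    `samples` is a list of (running, queued) tuples. Both the name and the
    sample format are hypothetical; they only illustrate the older definition.
    """
    if not samples:
        raise ValueError("need at least one sample")
    return sum(running + queued for running, queued in samples) / len(samples)


# One CPU that is always busy, with one extra thread waiting half of the time.
samples = [(1, 0), (1, 1), (1, 0), (1, 1)]
load = average_load(samples)
```

With one thread always on the CPU and a second thread queued in half of the samples, the average comes out at 1.5.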
In Solaris 10, load average is calculated slightly differently than in previous versions.
The calculation is made by summing high-resolution user time, system time, and thread wait time, then processing this total to generate averages with exponential decay.
This calculation is slightly more comprehensive (and complex), because it takes into account CPU latency – the time taken to move a thread from the run queue onto a CPU. However, the older way of calculating this will yield almost identical results, so either definition I’d call “correct”. I still use the older definition because it is just easier to understand.
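The “averages with exponential decay” part can be sketched with the classic exponentially damped moving average. This is an illustration of the general technique, not the Solaris kernel code: the 5-second sampling interval is an assumption, the function names are invented, and the input here is a plain thread count rather than the high-resolution times Solaris 10 sums. The 60, 300 and 900 second periods correspond to the familiar 1, 5 and 15 minute figures.

```python
import math

# Averaging periods in seconds for the 1, 5 and 15 minute load averages.
PERIODS = {"1min": 60.0, "5min": 300.0, "15min": 900.0}
INTERVAL = 5.0  # assumed sampling interval in seconds (hypothetical)


def decayed_average(previous, current, interval, period):
    """Blend a new sample into the running average with exponential decay."""
    decay = math.exp(-interval / period)
    return previous * decay + current * (1.0 - decay)


def simulate(thread_counts):
    """Feed a sequence of per-interval thread counts through all three averages."""
    averages = {name: 0.0 for name in PERIODS}
    for count in thread_counts:
        for name, period in PERIODS.items():
            averages[name] = decayed_average(averages[name], count, INTERVAL, period)
    return averages


# Ten minutes of a constant two runnable threads, starting from an idle system.
result = simulate([2] * 120)
```

Starting from idle, the 1-minute figure closes in on 2 quickly, while the 5- and 15-minute figures lag behind it. That lag is why the three numbers together tell you whether load is rising or falling.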
So what is a “high” number for load average? Well, first it depends on how many CPUs you have on your system, since the calculations do not take that into account. If you have one CPU, then a load average of 1.0 would mean you are, on average, consuming exactly 100% of that one CPU over the measurement period. If your number climbs above 1.0, then you have threads in the run queue at some point, waiting for CPU time. Solaris actually handles CPU saturation very well, so this may not mean your performance will degrade; it just means your CPU is well-used.
On the other hand, if you have 8 CPUs and a load average of 32, you may be seeing a performance degradation, as your system is somewhat CPU-bound. Each CPU is, on average, 100% utilized by running threads, and there are, on average, 24 more threads in the run queue. Depending on the application, this may be acceptable – it just depends on the expected response-time or expected processing time for your application.
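The arithmetic in these two examples can be written down as a small helper. It is only a rule-of-thumb sketch based on the older definition (running plus queued threads); the function name is made up, and it says nothing about whether the resulting numbers are acceptable for your application.

```python
def interpret_load(load_average, cpu_count):
    """Return (load per CPU, average number of threads waiting in the run queue).

    Assumes the older definition: load = running threads + queued threads.
    """
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    load_average = float(load_average)
    per_cpu = load_average / cpu_count
    # Anything beyond one running thread per CPU must be waiting in the queue.
    waiting = max(0.0, load_average - cpu_count)
    return per_cpu, waiting


per_cpu, waiting = interpret_load(32, 8)
```

For 8 CPUs and a load average of 32 this gives 4.0 per CPU and 24 threads waiting on average; for one CPU at 1.0 it gives 1.0 per CPU and nothing waiting.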
For the confusion they cause lusers, I sometimes wonder if load averages should only be visible to root.
And, sorry luser, your 100% CPU “problem” is not a problem, thanks for playing.
To complicate things… how would you apply that to a T2 – multi-threaded multi-core cpu?