What is Load Average in Solaris?

Solaris, UNIX

What is load average? I’ve heard all kinds of vague explanations over the years, and it bothers me to continue hearing all the absolutely wrong descriptions of the term and what are “high” values for this number. I’ve heard things like “anything higher than 3X your number of CPUs is bad”, or “as long as it’s under 10 everything should be fine.” Not so.

Some of the misconceptions come from other UNIX and Linux OSes, which measure the value differently. So an incorrect definition doesn’t necessarily demonstrate a lack of knowledge, just some ignorance of the way Solaris does it. Linux, for example, also includes threads waiting for I/O in its calculation, not just threads waiting for CPU.

In previous versions of Solaris (2.3-2.9), load average was a simple calculation. It was the average number of runnable and running threads. In other words, it was the number of threads running on the CPUs, plus the number of threads in the run queue, waiting for CPUs, averaged over time.

In Solaris 10, load average is calculated slightly differently than in previous versions.

The calculation is made by summing high-resolution user time, system time, and thread wait time, then processing this total to generate averages with exponential decay.

This calculation is slightly more comprehensive (and complex), because it takes into account CPU latency – the time taken to move a thread from the run queue onto a CPU. However, the older way of calculating this will yield almost identical results, so I’d call either definition “correct”. I still use the older definition because it is simply easier to understand.
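
To make “averages with exponential decay” a bit more concrete, here is a minimal sketch of an exponentially damped average over a stream of load samples. It is not the Solaris kernel code: the function name decay_avg, the 5-second sample interval, and the 60-second window are all assumptions chosen for illustration. On Solaris you may need nawk or /usr/xpg4/bin/awk in place of awk.

```shell
# decay_avg: read one load sample per line on stdin and print the
# exponentially damped average after the last sample.
# $1 = sample interval in seconds, $2 = averaging window in seconds.
# (Hypothetical helper for illustration, not how the kernel does it.)
decay_avg() {
    awk -v interval="$1" -v window="$2" '
        BEGIN { decay = exp(-interval / window); avg = 0 }
        { avg = avg * decay + $1 * (1 - decay) }
        END { printf "%.2f\n", avg }
    '
}

# Twelve samples of 2 runnable threads, 5 s apart, over a 60 s window.
yes 2 | head -n 12 | decay_avg 5 60    # prints 1.26
```

After one full window of a constant load of 2, the average has only climbed to about 1.26, which is why a short burst of activity shows up in the 1-minute figure long before it moves the 15-minute one.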

So what is a “high” number for load average? Well, first it depends on how many CPUs you have on your system, since the calculations do not take that into account. If you have one CPU, then a load average of 1.0 would mean you are, on average, consuming exactly 100% of that one CPU over the measurement period. If your number climbs above 1.0, then you have threads in the run queue at some point, waiting for CPU time. Solaris actually handles CPU saturation very well, so this may not mean your performance will degrade; it just means your CPU is well-used.

On the other hand, if you have 8 CPUs and a load average of 32, you may be seeing a performance degradation, as your system is somewhat CPU-bound. Each CPU is, on average, 100% utilized by running threads, and there are, on average, 24 more threads in the run queue. Depending on the application, this may be acceptable – it just depends on the expected response-time or expected processing time for your application.
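
The arithmetic in these two examples can be sketched as a small helper, using the older “running plus runnable” definition. The function name waiting_threads is hypothetical, and the inputs are hard-coded here; on a live system they would come from something like uptime and psrinfo.

```shell
# waiting_threads: estimate the average number of threads sitting in the
# run queue, given a load average and a CPU count.
# $1 = load average, $2 = number of CPUs.
# (Hypothetical helper; assumes the older running + runnable definition.)
waiting_threads() {
    awk -v load="$1" -v ncpu="$2" 'BEGIN {
        waiting = load - ncpu
        if (waiting < 0) waiting = 0   # spare capacity, nothing queued on average
        printf "%.2f\n", waiting
    }'
}

waiting_threads 32 8     # the 8-CPU example: prints 24.00
waiting_threads 1.0 1    # one fully used CPU: prints 0.00
```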

Restoring File Permissions in Solaris

Solaris, UNIX

Have you ever done something like this accidentally?

# chmod -R 777 /usr
^C^C^C

Oops – You just changed some files in /usr to 777 before you were able to cancel. You don’t know how many or which files were affected. pkgchk can save you here.

In Solaris, there is a software “registry” file, /var/sadm/install/contents, which gives us information on every file installed on the system, or at least every file associated with a Solaris package. This file includes information about file permissions, owner and group information, file size and checksum. Here’s an excerpt from the contents file:

/etc/opt/SUNWexplo/t3files.txt f none 0444 root bin 123 11760 1208943439 SUNWexplu
/etc/opt/SUNWexplo/t3input.txt e build 0400 root bin 590 45160 1208943439 SUNWexplu
/etc/opt/SUNWexplo/tapeinput.txt e build 0400 root bin 885 6403 1208943439 SUNWexplu
/etc/opt/SUNWexplo/xscfinput.txt e build 0400 root bin 758 60717 1208943439 SUNWexplu
/etc/pam.conf e pamconf 0644 root sys 3103 17166 1219679093 SUNWcsr
/etc/passwd e passwd 0644 root sys 672 56039 1219679093 SUNWcsr
/etc/patch d none 0755 root sys SUNWppror
/etc/patch/patch.conf v preserve 0644 root sys 365 31670 1186005379 SUNWppror
/etc/patch/secret.conf v preserve 0600 root sys 207 17050 1186005379 SUNWppror
/etc/path_to_inst v preserve 0444 root root 26 2566 1106347450 SUNWcsd
/etc/power.conf e powerconf 0644 root sys 488 40965 1106350205 SUNWpmr
/etc/printers.conf e preserve 0644 root sys 162 13902 1106350198 SUNWpcr
/etc/profile e etcprofile 0644 root sys 712 51625 1219679093 SUNWcsr
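
As a rough sketch of how to read those lines, the helper below pulls the recorded mode, owner, and group for a single path. The function name expected_attrs is hypothetical, and it assumes the field order seen in the excerpt (path, file type, class, mode, owner, group) for file and directory entries only.

```shell
# expected_attrs: print the mode, owner, and group recorded for one path.
# $1 = absolute path, $2 = contents file (defaults to the Solaris registry).
# Returns non-zero if the path has no matching entry.
# (Hypothetical helper; only handles f, e, v, and d entries.)
expected_attrs() {
    awk -v path="$1" '
        # Fields 1-6: path, file type, class, mode, owner, group.
        $1 == path && $2 ~ /^[fevd]$/ { print $4, $5, $6; found = 1; exit }
        END { exit !found }
    ' "${2:-/var/sadm/install/contents}"
}

# Example, on a Solaris host:
#   expected_attrs /etc/passwd
```

Against the excerpt above, /etc/passwd would come back as “0644 root sys”, which is what you would compare with the output of ls -l before deciding whether anything needs fixing.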

The pkgchk command can be used to fix the file attributes (permissions, owner, and group) of any or all package-installed files.

# pkgchk -f

This command would check every file listed in /var/sadm/install/contents, and if needed, change the owner and permissions of the files on the system to match the registry. Yes, this is sort of a shotgun approach, and you may not want to invoke changes this widely across the system.

In my example, only /usr was affected, so you can narrow down the criteria with a find command:

# find /usr -perm 777 -exec pkgchk -f -p {} \;

Or, if you only changed one file and now want to change it back to whatever it was, just tell pkgchk to work only on the file you specify:

# pkgchk -f -p /etc/crypto/kcf.conf

Silencing/Automating Solaris Package Installs

Solaris, UNIX

If you’ve ever been faced with the chore of installing many packages across many hosts, you’ve either 1) spent all day hitting the ‘Y’ key on your keyboard to pkgadd’s questions, 2) gotten someone else to hit the ‘Y’ key all day, or 3) you’ve given pkgadd the proper information so it can proceed without your input.

pkgadd takes a -n argument that tells it to operate in non-interactive mode. However, this alone will not let you install much of anything, because if pkgadd needs any input from the user, it will simply exit and your package will not be installed. To give pkgadd the information it needs to act on its own and install your package, you have to provide the -a option and specify an “installation administration file”.

This “admin” file contains all the parameters pkgadd will need to operate. The default file exists in /var/sadm/install/admin/default. Copy it to your home directory and take a look at it.

mail=
instance=unique
partial=ask
runlevel=ask
idepend=ask
rdepend=ask
space=ask
setuid=ask
conflict=ask
action=ask
networktimeout=60
networkretries=3
authentication=quit
keystore=/var/sadm/security
proxy=
basedir=default

You can get information on all of the parameters in the file with:

# man -s 4 admin

What I usually do, to forcefully install the packages without asking anything, is replace all the occurrences of “ask” with “nocheck”. The following command takes the default file and creates a new one with that substitution:

# sed 's/ask/nocheck/' < /var/sadm/install/admin/default > /home/user/admin.file
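
A slightly more cautious variation, offered as a sketch, anchors the pattern so that only parameter values of exactly “ask” are rewritten and any other line containing the word (a comment, for instance) is left alone. The function name noask_admin is hypothetical.

```shell
# noask_admin: rewrite an admin file read on stdin so that every
# parameter set to "ask" becomes "nocheck"; other lines pass through.
# (Hypothetical helper for illustration.)
noask_admin() {
    sed 's/=ask$/=nocheck/'
}

# A few lines in the style of the default file shown above.
noask_admin <<'EOF'
mail=
instance=unique
partial=ask
basedir=default
EOF
```

Only the partial line changes in this sample. Running diff between the default file and the generated one is a quick way to confirm exactly which parameters were switched before using the file for real.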

Now you can run pkgadd without any questions being asked. Give the full path to the admin file, or run the command from the directory that contains it:

# pkgadd -n -a /home/user/admin.file SUNWblah

Another handy parameter in the admin file, especially when you are installing packages across multiple hosts, is the “mail” parameter. When you set this to your email address, you will be notified when the package installs on each system.

Solaris Security Tip: inetd Connection Logging

Solaris, UNIX

It’s maybe not the first thing I’d do to lock down a server, but this is a worthwhile bit to change if you use any inetd services (ftp, telnet, remsh, finger, talk, etc). In addition to the OS-related inetd services mentioned, many applications will add their own, broadening your exposure to vulnerabilities across different vendor products.

When any type of network connection is made to your servers, it’s important to know the source of the connection – where that connection originated. Yes, many hackers will use a proxy or bounce host or hosts to hide their true IP, but at least this information can give you a place to start if you needed to track them. This becomes even more useful in company-internal incidents where users are less able to hide.

TCP Wrappers has been around for ages. It’s a mechanism to allow or deny access to any inetd service, based upon the connecting IP address or host name. It used to be more difficult to use – one had to download source, compile, install, configure, etc. But these days it’s built into many inetd variants, including Solaris.

Just for connection logging, we don’t necessarily need to set up TCP Wrappers to deny/allow hosts to connect based on IP or host name, so we’ll skip that part. If you want to go the extra mile and set this up, you configure the hosts.allow and hosts.deny files in /etc. Google around, it’s easy to find a howto.

With the SMF-based inetd in Solaris 10, it’s easy to turn on TCP wrappers for just one service or all services at once. If you just wanted to enable wrappers/logging for the FTP service, you’d change the properties of the FTP inetd service with inetadm:

# inetadm -m ftp tcp_wrappers=true

Or, to change the default value for ALL inetd services, you’d use the -M option:

# inetadm -M tcp_wrappers=true

Once this change is made, each incoming connection generates a log entry, usually in /var/log/syslog unless you’ve changed your syslog configuration:

Jan 13 08:56:52 waters vnetd[26111]: [ID 927837 daemon.info] connect from rocky
Jan 13 09:00:26 waters in.rshd[28426]: [ID 927837 daemon.info] connect from hungryhippo
Jan 13 09:17:25 waters in.rshd[8174]: [ID 927837 daemon.info] connect from penta
Jan 13 09:24:19 waters in.telnetd[12414]: [ID 927837 daemon.info] connect from 192.168.151.95
Jan 13 09:35:17 waters in.ftpd[23954]: [ID 927837 daemon.info] connect from mercury
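
Once these entries start accumulating, a quick summary of who is connecting to what can be useful. The following is a minimal sketch that assumes the log format shown above (daemon name in the fifth field, source host or IP at the end of the line); the function name connect_summary is hypothetical. On Solaris you may need nawk in place of awk.

```shell
# connect_summary: read syslog lines on stdin and print
# "count daemon source" for each daemon/source pair, busiest first.
# (Hypothetical helper; assumes the "connect from" format shown above.)
connect_summary() {
    awk '
        / connect from / {
            daemon = $5
            sub(/\[.*/, "", daemon)      # strip the "[pid]:" suffix
            count[daemon " " $NF]++
        }
        END { for (key in count) print count[key], key }
    ' | sort -k1,1nr -k2
}

# Example, on a host that logs to the default location:
#   connect_summary < /var/log/syslog
```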

In Solaris 8, you’d accomplish the same goal by altering your inetd start script, /etc/init.d/inetsvc. Just add the -t option to the last line:

/usr/sbin/inetd -s -t &

Solaris 10 Update/Release Date Chart

Solaris, UNIX

About once a week I have this issue – figuring out when a particular feature of Solaris 10 was made available. And if someone says “update 4”, is that 08/07 or 11/06? Here’s a chart that may help:

Release Date   Update #   Release Notes
01/06          Update 1   Solaris 10 01/06 Release Notes
06/06          Update 2   Solaris 10 06/06 Release Notes
11/06          Update 3   Solaris 10 11/06 Release Notes
08/07          Update 4   Solaris 10 08/07 Release Notes
05/08          Update 5   Solaris 10 05/08 Release Notes
10/08          Update 6   Solaris 10 10/08 Release Notes
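
If you would rather keep this in a shell profile than look it up, the chart can be sketched as a small lookup function. The name s10_release is hypothetical, and it covers only the six updates listed here.

```shell
# s10_release: map a Solaris 10 update number to its release date,
# using only the rows from the chart above.
# (Hypothetical helper; returns non-zero for updates not in the chart.)
s10_release() {
    case "$1" in
        1) echo "01/06" ;;
        2) echo "06/06" ;;
        3) echo "11/06" ;;
        4) echo "08/07" ;;
        5) echo "05/08" ;;
        6) echo "10/08" ;;
        *) echo "unknown"; return 1 ;;
    esac
}

s10_release 4    # prints 08/07
```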


Or, here’s a PDF from Sun that shows all the releases and the notes for each:
Solaris 10 What’s New

VCS Heartbeats must be on separate VLANs

Solaris, UNIX, vcs

It’s not entirely clear from the documentation, but Veritas Cluster heartbeat links need to be on separate VLANs. The documentation mentions the requirement for different switches, but says nothing about VLANs. Do not use one big VLAN for all your private heartbeat links – you need two. Your different clusters can share these two VLANs, but if you have two heartbeat connections for your cluster, they need to be isolated from each other in hardware or in VLANs. If you do put them on the same VLAN, or cross your links so they can see each other, you’ll get something like:

Dec 11 16:39:20 server llt: [ID 525299 kern.notice] LLT WARNING V-14-1-10497 crossed links? link 0 and link 1 of node 0 on the same network
