Browsing the archives for the Solaris category

Backing out Patches from an Unbootable Server

patching, Solaris, UNIX

Sometimes patching gets interrupted (by a user, a power outage, a hardware failure, etc.) and you end up with incomplete or mis-installed patches. If these are important patches, such as kernel patches, your system may not even boot from disk. Often this causes an endless reboot cycle of kernel panics.

Of course you could have prevented this by breaking your root mirror before installing the patches, or by using Live Upgrade. But I know sometimes we just don’t do these things, for various reasons.

One solution is to boot from the network into single-user mode, mount your root disk, and back out the patch or patches on disk, hopefully repairing the damage and returning you to a bootable state. This assumes you have a JumpStart server on your network. (NOTE: I tested this with a Solaris 10 06/06 boot image on the JumpStart server. I haven’t tested earlier versions.)

Where I am, all the root disks are mirrored with SVM. A procedure I’d used in the past was to boot from the network, run patchrm on the first disk in the mirror to back out the patch, and then disable the mirror so the second disk would not be used on reboot, re-mirroring later. This was rather tedious and error-prone, especially with multiple metadevices, soft partitions, etc.

I found a new way to accomplish this task: keep the mirror intact and back out the patches while booted from the network. Our systems also use zones, and their zonepaths are on soft partitions. This procedure will also back out the patches from the zones. Here is the exact procedure:

1. Boot from network into single-user mode
ok> boot net -s
2. Mount root file system READ ONLY from the first disk in the mirror:
# mount -o ro /dev/dsk/c1t0d0s0 /mnt
3. Copy the SVM configuration to the running OS:
# cp /mnt/kernel/drv/md.conf /kernel/drv/md.conf
4. Unmount the root disk
# umount /mnt
5. Update the SVM driver to load the new configuration (ignore error messages)
# update_drv -f md
6. Set up metadevices in configuration
# metainit -r
7. Run metasync on root mirror metadevice
# metasync d10
8. Mount root metadevice on /mnt
# mount /dev/md/dsk/d10 /mnt
9. If the system has zones, run metasync on the metadevice containing the soft partitions, and mount all zone root file systems
# metasync d40
# mount /dev/md/dsk/d53 /mnt/zones/zonepath1
# mount /dev/md/dsk/d56 /mnt/zones/zonepath2
10. Roll back the failed patch.
# patchrm -R /mnt $patch 2>&1 | tee -a /mnt/backout.log
11. Unmount everything under /mnt (any zone root file systems first, then /mnt itself) and reboot the server
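
If several patches have to come out, the patchrm step can be wrapped in a small loop. This is only a sketch: the backout_patches name is made up for illustration, and it assumes you pass the patch IDs in the order they were installed, so that the newest one is removed first. The patch IDs in the usage comment are placeholders, not real patches.

```shell
# backout_patches ROOT PATCH...
# Hypothetical helper: back out the listed patches from the file system
# mounted at ROOT, newest first, appending all output to ROOT/backout.log.
# Assumes the patch IDs are given in the order they were installed.
backout_patches() {
    root=$1
    shift
    reversed=
    for patch in "$@"; do
        reversed="$patch $reversed"   # prepend, so the list ends up reversed
    done
    for patch in $reversed; do
        patchrm -R "$root" "$patch" 2>&1 | tee -a "$root/backout.log"
    done
}

# Usage (placeholder patch IDs):
#   backout_patches /mnt 111111-01 222222-02
```

The loop keeps going if one patchrm fails, so read backout.log before rebooting.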

Oracle Dynamic SGA in Solaris Zones

performance, Solaris, UNIX

Since 9i, Oracle has included a feature to dynamically resize the SGA of the database when needed, without the need to restart the database. It utilizes Solaris “Dynamic Intimate Shared Memory” (DISM) to accomplish this.

DISM provides dynamically resizable shared memory. Any process that uses a DISM segment can lock and unlock parts of a memory segment, and by doing so, the application can dynamically adjust to the addition (or removal) of physical memory from a server.

In the initial releases of Solaris 10, DISM was unavailable within Solaris Zones, because the ability for processes to lock memory segments was not available. If you do try to run Oracle with DISM in Zones before 11/06, you’ll see completely awful database performance (I’ve seen it). The fix for this was to disable DISM in Oracle, by setting the Oracle parameters sga_max_size and sga_target to the same value, so the SGA would not resize at all.

Solaris 10 update 3 (11/06) introduced a new zone privilege: proc_lock_memory, which gives processes within the zone the ability to lock memory segments. So DISM will now work if this privilege is enabled. To enable it, just turn it on in the zone config and reboot the zone:

# zonecfg -z oraclezone
zonecfg:oraclezone> set limitpriv=default,proc_lock_memory
zonecfg:oraclezone> commit
zonecfg:oraclezone> exit
# zoneadm -z oraclezone reboot

If you see an error after the “set limitpriv” line when you try, make sure you have Solaris 10 11/06 or later (or the patched equivalent).

Cleaning out /var in Solaris

Solaris, UNIX

Ever since you installed Solaris 10, the data in the /var file system has grown each time you applied a patch. Depending on your patching strategy, a dedicated /var partition can eventually run out of space. Mail spools and logging from all kinds of applications make the problem worse.

I’d say the best strategy is to increase the size of /var. If you’re using the standard UFS file system with no volume management, this means backing up, re-creating the partition, and restoring the data. If you do have some sort of volume management, sometimes the answer is a simple metattach/growfs or vxresize command.

If you want another option, just to get you by until you have the time to increase /var, there is another easy method. When patchadd adds a patch to the system, the files being replaced are saved so they can be restored if you need to remove the patch later. These files are compressed and stored in /var/sadm/pkg/ /save/ and in /var/sadm/pkg/ /save/pspool/ /save/. The files are called undo.Z.

Note: It is completely safe to delete these .Z files, as long as you are sure you will never need to back out their associated patches! Doing this can free up significant space.

I’ve even done things like this in a pinch: (the shotgun approach)

# find /var -name undo.Z -exec rm {} \;
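
Before deleting anything, it can help to see how much space the backout files actually hold. The function below is a minimal sketch (undo_bytes is a made-up name) that totals the size of every undo.Z under a directory, defaulting to /var/sadm/pkg. It only reads file sizes from ls -l and removes nothing.

```shell
# undo_bytes [DIR]
# Hypothetical helper: print the total size in bytes of all undo.Z files
# found under DIR (default: /var/sadm/pkg). Nothing is deleted.
undo_bytes() {
    find "${1:-/var/sadm/pkg}" -type f -name undo.Z -exec ls -l {} + 2>/dev/null |
        awk '{ total += $5 } END { printf "%d\n", total + 0 }'   # $5 = size column of ls -l
}

# Usage:
#   undo_bytes          # total under /var/sadm/pkg
#   undo_bytes /var     # same scope as the shotgun find above
```

If the total is small, deleting the files will not buy you much, and growing /var is the better use of your time.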

Link-based IPMP setup with VCS

network, Solaris, vcs

With Solaris 10 came a nice feature: link-based IP Multipathing (IPMP). It determines NIC availability based solely on the physical link status (UP or DOWN) reported by the NIC driver. Previous versions used “probe-based” IPMP, where connectivity is tested by pinging something on the network from each interface. While probe-based is actually a more thorough test (it tests network layer 3 as well as layer 2), it is much more cumbersome to configure, and you need an extra IP address on each interface to serve as its “test” address. IMO link-based IPMP is sufficient for most applications.

For some reason, configuring link-based IPMP in VCS is somewhat tricky, and the documentation doesn’t seem to help much. It seems all the default values for VCS are for probe-based IPMP only.

To achieve link-based IPMP, here’s how I’ve configured my MultiNICB resource:

Link-based IPMP MultiNICB properties

These are the values you must change from the defaults:

UseMpathd: 1
Tells VCS to use mpathd for network link status
MpathdCommand: /usr/lib/inet/in.mpathd -a
The default, /usr/sbin/in.mpathd is just incorrect – it doesn’t live there.
ConfigCheck: 0
If you leave this at 1, it will overwrite your /etc/hostname.xxx files with probe-based IPMP configuration
Device: (your IPMP interfaces here)
The “interface alias” for each device is not needed, leave them blank.
IgnoreLinkStatus: 0
You want VCS to NOT ignore link status, since this is how link-based IPMP works.
GroupName:
Do not use your IPMP group name here, it’s not needed. VCS is not monitoring the group, mpathd is.

Here’s how it looks in main.cf:

MultiNICB csgmultinic (
    UseMpathd = 1
    MpathdCommand = "/usr/lib/inet/in.mpathd -a"
    ConfigCheck = 0
    Device = { ce0 = "", ce4 = "" }
    IgnoreLinkStatus = 0
    )

Changing timeouts on SMF services

Solaris, UNIX

I’ve run into an issue where the default timeout value (120 seconds) was not long enough for the start methods of some system services to finish, in particular the psncollector service.

The psncollector service runs a ‘prtfru -x’ command, which can take several minutes to complete on a large server like an E2900. With the 120 second timeout, the start method fails:

# svcs -x
svc:/application/psncollector:default (Product Serial Number Collector)
State: maintenance since Sun 25 Jan 2009 10:01:34 AM PST
Reason: Start method failed repeatedly, last died on Killed (9).
See: http://sun.com/msg/SMF-8000-KS
See: /var/svc/log/application-psncollector:default.log
Impact: This service is not running.

# tail /var/svc/log/application-psncollector:default.log
[ Jan 25 08:59:51 Executing start method ("/lib/svc/method/svc-psncollector") ]
Using /var/run
[ Jan 25 09:02:01 Method or service exit timed out. Killing contract 48 ]
[ Jan 25 09:02:05 Method or service exit timed out. Killing contract 48 ]
[ Jan 25 09:02:18 Method "start" failed due to signal KILL ]

The easy fix was to increase the service start timeout value:

# svccfg -s psncollector setprop start/timeout_seconds=480
# svccfg -s psncollector setprop restart/timeout_seconds=480
# svcadm refresh psncollector
# svcadm clear psncollector

Once cleared, the service started up, taking its usual 3+ minutes.
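
If you need to do this for more than one service, the same commands can be wrapped in a function. This is a sketch only: set_method_timeout is a made-up name, and it assumes the service defines both a start and a restart method, as in the commands above. It finishes by reading the value back with svcprop so you can confirm the change took effect.

```shell
# set_method_timeout FMRI SECONDS
# Hypothetical helper: set the start and restart method timeouts of an SMF
# service, refresh it, and print the start timeout now in effect.
# Assumes the service has both a start and a restart method.
set_method_timeout() {
    fmri=$1
    secs=$2
    for method in start restart; do
        svccfg -s "$fmri" setprop "$method/timeout_seconds=$secs" || return 1
    done
    svcadm refresh "$fmri" || return 1
    svcprop -p start/timeout_seconds "$fmri"
}

# Usage:
#   set_method_timeout psncollector 480
#   svcadm clear psncollector    # only if the service is in maintenance
```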

What is Load Average in Solaris?

Solaris, UNIX

What is load average? I’ve heard all kinds of vague explanations over the years, and it bothers me to continue hearing all the absolutely wrong descriptions of the term and what are “high” values for this number. I’ve heard things like “anything higher than 3X your number of CPUs is bad”, or “as long as it’s under 10 everything should be fine.” Not so.

Some of the misconceptions come from other UNIX and Linux OS’s, which measure the value differently. So an incorrect definition doesn’t necessarily demonstrate a lack of knowledge, just some ignorance of the way Solaris does it. Linux, for example, also includes threads waiting for I/O in its calculation, not just threads waiting for CPU.

In previous versions of Solaris (2.3-2.9), load average was a simple calculation. It was the average number of runnable and running threads. In other words, it was the number of threads running on the CPUs, plus the number of threads in the run queue, waiting for CPUs, averaged over time.

In Solaris 10, load average is calculated slightly differently than in previous versions.

The calculation is made by summing high-resolution user time, system time, and thread wait time, then processing this total to generate averages with exponential decay.

This calculation is slightly more comprehensive (and complex), because it takes into account CPU latency – the time taken to move a thread from the run queue onto a CPU. However, the older way of calculating this will yield almost identical results, so either definition I’d call “correct”. I still use the older definition because it is just easier to understand.
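
To make “averages with exponential decay” concrete, here is a generic sketch of an exponentially decayed average. It is not the Solaris kernel’s code: the sampling interval, the decay formula, and the decayed_average name are all assumptions chosen for illustration.

```shell
# decayed_average PERIOD INTERVAL
# Reads one sample per line on stdin (for example, the number of running plus
# runnable threads observed every INTERVAL seconds) and prints an average in
# which older samples fade out exponentially over roughly PERIOD seconds.
# Generic illustration only; not the kernel's actual implementation.
# On Solaris, use nawk or /usr/xpg4/bin/awk; the old /usr/bin/awk lacks -v.
decayed_average() {
    awk -v period="$1" -v interval="$2" '
        BEGIN { decay = exp(-interval / period) }
        { avg = avg * decay + $1 * (1 - decay) }
        END { printf "%.2f\n", avg }
    '
}

# Usage: samples taken every 5 seconds, averaged over about 60 seconds
#   printf '1\n1\n4\n4\n' | decayed_average 60 5
```

A steady stream of identical samples converges on that value, while a sudden burst shows up gradually and fades gradually. That is why the 1, 5 and 15 minute figures react to the same event at different speeds.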

So what is a “high” number for load average? Well, first it depends on how many CPUs you have on your system, since the calculations do not take that into account. If you have one CPU, then a load average of 1.0 would mean you are, on average, consuming exactly 100% of that one CPU over the measurement period. If your number climbs above 1.0, then you have threads in the run queue at some point, waiting for CPU time. Solaris actually handles CPU saturation very well, so this may not mean your performance will degrade; it just means your CPU is well-used.

On the other hand, if you have 8 CPUs and a load average of 32, you may be seeing a performance degradation, as your system is somewhat CPU-bound. Each CPU is, on average, 100% utilized by running threads, and there are, on average, 24 more threads in the run queue. Depending on the application, this may be acceptable – it just depends on the expected response-time or expected processing time for your application.
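
The arithmetic in that example can be written as a small function. This is a sketch using the older definition (running plus runnable threads), so the “queued” figure is an estimate and not something Solaris reports directly. The load_summary name and its output format are made up for illustration.

```shell
# load_summary LOAD NCPU
# Hypothetical helper: given a load average and a CPU count, print the load
# per CPU and the estimated average number of threads waiting in the run
# queue (load minus CPUs, never below zero).
# On Solaris, use nawk or /usr/xpg4/bin/awk; the old /usr/bin/awk lacks -v.
load_summary() {
    awk -v load="$1" -v ncpu="$2" 'BEGIN {
        queued = load - ncpu
        if (queued < 0) queued = 0
        printf "per_cpu=%.2f queued=%.2f\n", load / ncpu, queued
    }'
}

# Usage, with the numbers from the example above:
#   load_summary 32 8     # prints: per_cpu=4.00 queued=24.00
```

On a live system you could feed it the first figure from uptime and the processor count from psrinfo, which prints one line per processor.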
