Problems with Patching
So you think that patching a Linux server is pretty straightforward? You’re probably 90% right in that assumption, but there are several issues that need to be planned for or addressed when you are considering running updates on your servers.
Networks: The Bigger They Are…
If you’re running a small shop, patching your 5 to 10 servers will probably not be a big problem. Given a particular span of time, you should be able to have them all updated before the next big patch cycle. Still, it will take some coordination among your users, developers and management to minimize the effects of scheduled downtime. You’ll have to help specify how long the machines will be unavailable, have an idea of the number of reboots involved, and make sure that someone knowledgeable enough with the machine is ready to test whatever application the machine is running once the patching is done.
Now, take that same process and multiply the number of those machines by 10, or even 20. Big shops will not only have a large number of machines to patch, but the number of people affected by downtime will surely rise, since these enterprise-level services have users numbering in the hundreds. You will also have to deal with larger-scale development schedules, and the fun politics that can come from management and the community.
There is no perfect solution for this: strategies and non-strategies will dictate the tools you’ll need to get all these servers patched. Concepts from clustering servers to social engineering will help get you around the various roadblocks you’ll encounter as you try to coordinate and compromise on reboot times ranging across all hours of the day and night. You’ll need to provide assurances about when the systems will come back up, since the word “patching” can be synonymous with “unscheduled downtime”, caused either by incompatibilities between vendor software and OS updates or by plain human error. You’ll also need to prepare for such occasions by having reliable backups available, prepped fail-over systems that are ready to go if the primary system doesn’t come back up, and an experienced sys admin on hand who can troubleshoot problem updates after they are installed.
These are just some of the issues I run into when trying to manage a large network of Linux servers in an enterprise environment. Without a staff of the right size, a lot of this would be difficult to get done, making the management of a small IT shop look like a piece of cake by comparison.
Another problem with big shops is that they tend to have a variety of configurations when it comes to their OS installs. Because of this, it’s hard to decide which patches should be applied to which machines, and you don’t really have the time to sit there and work out which ones need to be installed first. In response to this dilemma, you may opt to just install all of the available updates. The world would be perfect if we could have servers running a single upgradable app, but it just isn’t so. Instead, you have machines running Tomcat or Apache HTTP Server as the primary app, yet you also rely on the services provided by ntp, openssh or xinetd. These services, as well as many others, are updated over time, and should be upgraded when given the chance. So, don’t just update that one important app; update them all. It may save you potential headaches in the future.
The only exception to this “patch everything” rule is kernel patches. They usually require a restart of the system, whereas other patches require a daemon restart or no restart at all. If updating the machine means not having to reboot the server, then that’s just one less thing to worry about, so save the kernel patching for a better time.
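As a rough illustration of the “patch everything except the kernel” idea, here is a minimal sketch that builds a yum command line for a yum-based system. The function name and its parameters are made up for this example; the only real pieces are yum’s `-y` option, its `update` command and its `--exclude` option. The sketch only builds and prints the command, it does not run it.

```python
def build_update_command(skip_kernel=True, assume_yes=True):
    """Return the yum argument list for a full update.

    Hypothetical helper for illustration only. The list form is meant to be
    handed to subprocess.run(), so the shell never expands the kernel* glob.
    """
    cmd = ["yum"]
    if assume_yes:
        cmd.append("-y")  # answer "yes" to yum's confirmation prompt
    cmd.append("update")
    if skip_kernel:
        # Leave kernel packages alone so this run does not force a reboot.
        cmd.append("--exclude=kernel*")
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch is safe to try.
    print(" ".join(build_update_command()))
```

When a reboot window finally opens up, calling the same helper with `skip_kernel=False` gives the full update, kernel included.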
One of the problems I’ve run into when it comes to patching a Linux OS is disk space. Hard drive capacity to cost ratios are getting better all the time: the amount of money spent on a 20 GB hard drive a few years ago will now buy you a 500 GB drive, so it never hurts to have more disk space than you think you need. Application logs, home directories, dump files and third-party software will always be there competing for that space, though, so you need to keep track of how much is available on each machine and make sure there is enough before you download your patches. Thanks to the large disk capacities of today, running out of disk space is probably the last thing on your mind, but on virtualized machines or older hard drives, capacity can be an issue. If a machine is downloading a large number of patches, it needs the space to store those packages so that the installation process can access the files and execute the install.
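A quick pre-flight check along these lines can be sketched with Python’s standard `shutil.disk_usage`. The function name, the `safety_factor` parameter and its default of 2.0 are assumptions for this example (the idea being that you need room for the downloaded packages and for the files they install); tune them to your own environment.

```python
import shutil


def has_room_for_patches(path, download_bytes, safety_factor=2.0):
    """Return True if the filesystem holding `path` has enough free space.

    Hypothetical helper: `safety_factor` is an assumed margin that covers
    both the downloaded packages and the files they unpack.
    """
    free = shutil.disk_usage(path).free  # free bytes on that filesystem
    return free >= download_bytes * safety_factor


if __name__ == "__main__":
    # Example: is there room under /var for a 500 MB batch of patches?
    wanted = 500 * 1024 * 1024
    print(has_room_for_patches("/var", wanted))
```

Running a check like this before the download starts is cheaper than finding out halfway through an update that the disk is full.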
The contents of the /var/cache/yum sub-directories can show lots of valuable space wasted away holding old RPMs you don’t need anymore.
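To see how much space those cached packages are actually taking, a small sketch like the following walks the cache directory and adds up the size of every `.rpm` file it finds. The function name is made up for this example, and the default path is simply the directory mentioned above; other distributions and newer package managers keep their caches elsewhere.

```python
import os


def cached_rpm_bytes(cache_root="/var/cache/yum"):
    """Return the total size in bytes of all .rpm files under cache_root.

    Hypothetical helper for illustration. A missing directory yields 0,
    because os.walk() simply produces nothing for it.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(cache_root):
        for name in filenames:
            if name.endswith(".rpm"):
                try:
                    total += os.path.getsize(os.path.join(dirpath, name))
                except OSError:
                    # The file vanished or is unreadable; skip it.
                    pass
    return total


if __name__ == "__main__":
    megabytes = cached_rpm_bytes() / (1024 * 1024)
    print(f"Cached RPMs: {megabytes:.1f} MB")
```

If the number is large, `yum clean packages` removes the cached package files once you no longer need them.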
In SuSE’s Enterprise Server, for example, the downloaded RPM files are stored in organized local directories on the updated server itself. Once YaST Online Update has run and the RPMs have been installed, they can be removed manually if needed. In older versions of YaST’s Online Update, you could check the “Remove installation files” box at the end of the update, which would remove the downloaded RPM files once the machine was done installing them.