
Questions and Answers

Bjorn Satdeva

I'm interested in obtaining a copy of Elizabeth Zwicky's paper, "Torture-Testing Backup and Archive Programs: Things You Ought to Know But Probably Would Rather Not", as described in your column in the May 1999 issue of Sys Admin. I've tried the URL you listed, but it's not there.

The paper disappeared from the server shortly after that issue of the magazine was published. As far as I know, it is not available anywhere on the Internet right now. Drop me an email directly (questions@sysadmin.com), and I will send it to you. Please be patient, though, as I only check this email address a few times a month.

I am installing a second SBus network card, and I don't believe that the system sees the new interface card on startup. I have the card plugged in. When I do an ifconfig -a, it shows one interface, hme0. The bootup messages in /var/adm/messages reference le0, but when I do an ifconfig le0 plumb and then an ifconfig -a, the MAC address is the same for both interfaces. I am confused; please help!

Using the same MAC address on all interfaces is a special Sun feature. You can change this behavior by setting the local-mac-address? variable to true, either from the OK boot prompt (setenv local-mac-address? true) or from the command line: eeprom local-mac-address?=true (remember to quote or escape the ? so that the shell does not treat it as a wildcard). The next time you reboot the machine, the MAC addresses will be different on the various interfaces.

Also, it is not clear from your question whether the kernel was reconfigured after you added the SBus card. To do so, reboot the machine with boot -r. The system will then probe all the hardware devices and create device files as needed.

I am a current subscriber to Sys Admin, and I read your column regularly. I have asked this question before of the IBM AIX support team and gotten the politically correct answer that in order to be POSIX compliant, the limitations of the kill command are the way they are. As I understand the kill command, it simply "signals" the designated processes (by PID) and conveys the signal level specified. This means that the kill command must talk to a process that is willing to participate in its own demise. The vanilla UNIX kill command may simply be likened to Dr. Kevorkian, in that it performs assisted suicide. It is even less virulent than Dr. K., in that the targeted process must be 1) awake (able to receive and understand signals), and 2) not waiting for any other event to occur, so the intended action can be performed by the intended victim himself. As a sometime-frustrated UNIX systems admin, I would like a real kill command. I simply specify the targeted PID and voilà...regardless of its state, it is blown away without knowing what hit it. The designated victim need not participate in its own demise. kill should be smart enough to interrupt it and insert an exit request in its instruction stream. It could also run the system's process table and disconnect it. It really doesn't matter to me how this gets done, just that it gets done. Thanks in advance.

A kill -9 is almost always a sure kill, because the kernel does not give the process a chance to intercept the signal. The one exception is when the process is, for some reason, hanging in a system call. That only occurs when there is something wrong with the system, and the process cannot return from the system call it is executing. It is admittedly many years since I have done any serious kernel hacking, but I do not believe you can change that without major rewrites of the kernel and major side effects on how the system operates. Another case where you cannot kill a process is when it is defunct. When that occurs, the process has really gone away, but its parent has not yet noticed (possibly due to poor programming) and has therefore not reclaimed it. A defunct process is really just a ghost: only its entry in the kernel process table continues to exist, while the rest of the process is long gone.
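The difference between a signal that a process can refuse and one it cannot is easy to see from the shell. The following is only a minimal sketch under my own assumptions, not a recipe from any vendor: it starts a throwaway child that ignores SIGTERM, sends it the default kill signal, and then sends signal 9 (SIGKILL). The variable names are made up for the illustration.

```shell
#!/bin/sh
# Start a throwaway child that ignores SIGTERM (the default kill signal).
sh -c 'trap "" TERM; while :; do sleep 1; done' &
pid=$!
sleep 1                      # give the child time to install its trap

# A plain kill sends SIGTERM, which this child has chosen to ignore.
kill -TERM "$pid"
sleep 1
if kill -0 "$pid" 2>/dev/null; then term_result=survived; else term_result=died; fi

# SIGKILL (kill -9) cannot be caught or ignored by the process.
kill -KILL "$pid"
wait "$pid" 2>/dev/null      # reap the child so it does not linger as defunct
if kill -0 "$pid" 2>/dev/null; then kill_result=survived; else kill_result=died; fi

echo "after TERM: $term_result; after KILL: $kill_result"
```

The wait at the end is what a well-behaved parent does to reclaim its child; a parent that never does this is what leaves defunct entries behind.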

What is the difference between grep and egrep?

Both commands understand regular expressions, but egrep understands extended regular expressions, which add operators such as alternation (|), + and ?, and so allow you to grep for more complex patterns. However, the normal grep is more efficient when you do not need the capacity of extended regular expressions. Both commands were written in the early days of UNIX, when the hardware was much, much slower than it is today and the differences between the two were more important.
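A small experiment shows the practical difference. This is just a sketch with a made-up three-line sample file; it uses grep -E, which POSIX specifies as the equivalent of egrep, because some current systems print a deprecation warning for the egrep name.

```shell
#!/bin/sh
# Throwaway sample file with three lines.
sample=$(mktemp)
printf 'cat\ndog\nbird\n' > "$sample"

# Basic regular expressions: "|" is an ordinary character here,
# so the pattern looks for the literal string "cat|dog" and finds nothing.
basic_count=$(grep -c 'cat|dog' "$sample")

# Extended regular expressions: "|" means "either side may match".
extended_count=$(grep -E -c 'cat|dog' "$sample")

echo "basic: $basic_count, extended: $extended_count"
rm -f "$sample"
```

With this sample, the basic search counts no matching lines and the extended search counts two.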

We're running Solaris 2.5 on a SPARCstation 20. My impression is that something is wrong with cron - it simply ignores my crontab file. What are possible reasons for such behavior?

My guess is that you have edited the crontab file directly. Instead, use the crontab -e command. When this command is used to edit the crontab file, the crontab command will notify cron that one of its configuration files has been updated and should be re-read. As a side note, you will need to set the EDITOR environment variable before you call crontab -e; otherwise, it will default to the old "ed" line editor, which I am almost sure you will not like very much.

We have a PC drive NFS-mounted on our UNIX system. A transaction completed instantly when it was mounted on /usr1, but, now that the PC drive is mounted on /home the transaction takes 20-30 seconds. /home is 69% full, but the system is only used by developers. The problem still occurred even when uptime was low (.95) or high (4.38). Does moving the mount point matter in terms of performance?

The location of the mount point should not have any impact on your system performance. This is easy to test, as you can simply move the mount point back to its previous location. The problem could be in your network, but my guess is that your PC has some problems. I do not understand why the drive is located on a PC when you would get better reliability and most likely better performance if you move it to the UNIX system.

I would also like to clear up a misunderstanding: "uptime" refers to the amount of time the server has been online since the last reboot. The uptime command reports this information together with a completely different type of information, referred to as the "load factor" or "load average". The numbers you quote (.95 and 4.38) are load averages, not uptime. The load average is a relative value that tells you how loaded the system is: roughly, the average number of processes running or waiting to run over the last 1, 5, and 15 minutes.
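To make the two kinds of information concrete, here is a minimal sketch that pulls them apart with sed. The sample line is invented for the illustration (the exact layout of uptime output differs between systems), so treat the patterns as assumptions to adapt rather than something that will work everywhere.

```shell
#!/bin/sh
# Illustrative line only; real uptime output varies from system to system.
uptime_line=' 10:15am up 12 day(s), 3:42, 4 users, load average: 0.95, 1.10, 4.38'

# The time since the last reboot sits between "up" and the user count.
since_boot=$(printf '%s\n' "$uptime_line" |
    sed -e 's/.* up *//' -e 's/, *[0-9]* users*.*//')

# The load averages follow the "load average:" label.
load_averages=$(printf '%s\n' "$uptime_line" |
    sed -e 's/.*load averages*: *//')

echo "time since boot: $since_boot"
echo "load averages:   $load_averages"
```

On a live system you would replace the sample assignment with uptime_line=$(uptime) and check that the patterns still fit what your system prints.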

How would I go about setting up a chrooted environment, where a user on my server would have his own filesystem, his own system daemons, and his own namespace?

The answer to this is very simple: you can't! The chroot command changes the root of the file system, limiting the user to the parts of the file system that lie under the new root. The chroot does not create a new virtual machine for you. Any resources that are used by running processes will not magically become available in the new chrooted environment. You can have some processes, such as Sendmail or certain Web servers, supporting virtual environments, but that happens entirely within that application and has no relation whatsoever to the chroot command.

One more note on the chroot command. If you are setting up a restricted environment, every command that is needed by the users in that restricted environment must be present within that file system hierarchy. For example, an ftp server might do a chroot to /restricted before starting up. In that case, you will need an etc and a bin directory in /restricted, containing all the executables and configuration files needed by the ftp server process.
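As a rough sketch of what such a hierarchy looks like, the following builds a skeleton in a temporary directory (standing in for /restricted) and copies in a single command. The choice of /bin/sh and the variable name are assumptions for the illustration; a real ftp area needs whatever commands and configuration files your server's documentation lists.

```shell
#!/bin/sh
# Temporary directory standing in for /restricted (hypothetical location).
jail=$(mktemp -d)

# The restricted tree needs its own bin and etc directories.
mkdir -p "$jail/bin" "$jail/etc"

# Every command the users need must be copied inside the tree.
cp /bin/sh "$jail/bin/sh"

# Dynamically linked binaries also need their shared libraries inside the
# tree; on many systems, "ldd /bin/sh" lists them.

echo "skeleton created in $jail"
# As root, you could then try it out with: chroot "$jail" /bin/sh
```

If the copied command fails to start inside the chroot, missing shared libraries are the usual suspect.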

I lost my root password. How do I change it?

You should be able to boot from CD-ROM or become single user without needing the root password. Some UNIX systems require you to provide the root password before going into single-user mode - Solaris is one example of this - but it allows you to boot from CD-ROM with the command boot cdrom -s. When you have the system in single-user mode, you can mount any necessary file systems and remove the password from the root entry in the password (or shadow) file. You can then reboot the system (remember to sync the file systems), log in as root without a password, and set a new root password.
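Removing the password means emptying the second colon-separated field of the root entry. The sketch below does this with sed on a throwaway sample file, not on a live system; the sample entries and the placeholder hash are invented. On the real machine, you would make the same change, with an editor or with sed, to the shadow file on the root file system you mounted after booting from CD-ROM.

```shell
#!/bin/sh
# Throwaway sample in shadow-file layout; the entries are made up.
sample=$(mktemp)
cat > "$sample" <<'EOF'
root:HYPOTHETICALHASH:10850::::::
daemon:NP:6445::::::
EOF

# Empty the password field (the second field) of the root entry only.
sed -e 's/^root:[^:]*:/root::/' "$sample" > "$sample.new"

root_entry=$(grep '^root:' "$sample.new")
other_entry=$(grep '^daemon:' "$sample.new")
echo "$root_entry"
rm -f "$sample" "$sample.new"
```

Writing to a new file first, rather than editing in place, leaves the original untouched until you have checked the result.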

I am looking for a script to do log rotations for an application. I want the script to read a text file and extract the hostname off of the first field and then redirect it to another file with the ext of .hostname. The tricky part is that the file has numerous hostnames, and that's where I am stuck on resolving this. Could you point me to an existing script that has these features built in? I could then customize it for my purposes.

I don't know of any script that does this. How difficult this problem is to resolve will depend on the size of the logfile and the number of different hosts you will need to support. If you just need hostnames extracted, you can do this easily with sed and sort. Something like the following may do what you want:

sed -e 's/ .*//' < logfile | sort -u

This will give you a sorted list of all the hostnames.
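If you also want the lines themselves split into one file per host, awk can do it in a single pass. This is only a sketch, assuming the hostname is the first whitespace-separated field and contains no characters that are awkward in file names; the sample log and the host names alpha and beta are made up.

```shell
#!/bin/sh
# Throwaway working directory and sample log; the contents are made up.
workdir=$(mktemp -d)
log="$workdir/logfile"
cat > "$log" <<'EOF'
alpha service started
beta disk full
alpha service stopped
EOF

# Append each line to logfile.<hostname>, where <hostname> is field one.
# Closing the file after every write keeps the number of open files low
# when the log mentions many different hosts.
awk -v base="$log" '{
    out = base "." $1
    print >> out
    close(out)
}' "$log"

ls "$workdir"
```

Because the lines are appended, running the script twice on the same log duplicates the output, so remove or rotate the old per-host files first.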

I would like to make a Web-based system providing the capability to add and remove users. Is there a safe and secure way of doing this? Since you need to be root to execute this command, should I just use a simple shell script to execute from a php3 script or is there something better?

I'll answer the simplest part of the question first: Is there a safe and secure way to do this? Unless you are willing to spend the time and money to implement the infrastructure to do this, probably not. It will certainly not be simple, and unless you have a good understanding of UNIX and network security, it is probably a very bad idea to implement this using Web technology. Most systems now have some facility to add and delete users. You are probably better off using shell scripts that are wrapped around those commands, and relying on the password authentication to ensure that only authorized users will be allowed to do this.

We have two Sun Solaris servers at our site. One is A and the other is B. A is the mail server, the DNS primary server, and the main file server. I want to maintain B as a mirror server to A. B will have filesystems on it that are exact replicas of A. Any changes in filesystem on A should be reflected on B, so that if A goes down B should take over in a short time. My questions are: 1) What is the best way to achieve this? 2) Is there any public domain software for this?

This is a very difficult problem to solve, and if you truly want to do this, you will need to use a commercial product such as Veritas FirstWatch. If you can live with some time delays (for example one hour), and if your environment is fairly small, you might be able to use something like rdist to update B. It will, however, not be a solution that is anywhere near being a mirror.

About the Author

Bjorn Satdeva is the president of /sys/admin, inc., a consulting firm which specializes in large installation system administration. Bjorn is also co-founder and former president of Bay-LISA, a San Francisco Bay Area user's group for system administrators of large sites. Bjorn can be reached at questions@sysadmin.com.