Cover V11, I03

Questions and Answers

Amy Rich

Q Our company develops a number of software products, and they've asked me to write up man pages and HTML documentation. The HTML part is easy, but I'm not certain where to start with the man pages. Is there any good guide about how to write man pages for UNIX systems?

A If you're writing documentation in more than one format, you may want to write it once and use a program to translate it into the end formats that you need. It's much easier to keep track of one set of sources than two or three, and with one centralized source, you can easily expand later into additional forms of documentation (e.g., info, LaTeX, or PostScript). Your best bet is to start with something like XML or SGML and go from there. The Linux Documentation Project, for example, uses DocBook (XML) to write up its information.

If you're just looking for a primer on how to write roff man pages, look at the Linux Man-Page HOW-TO:

http://www.ibiblio.org/pub/Linux/docs/HOWTO/mini/other-formats/html_single/Man-Page.html
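For orientation, a minimal page written with the traditional man macros might look like the sketch below. The command name frobnicate, its option, the date, and the organization name are all made up for illustration; only the macros themselves (.TH, .SH, .B, .I, .TP, .BR) are standard.

```shell
# Write a skeleton man page for a hypothetical command "frobnicate".
# Everything about the command itself is invented for illustration.
cat > frobnicate.1 <<'EOF'
.TH FROBNICATE 1 "March 2002" "Example Corp" "User Commands"
.SH NAME
frobnicate \- adjust the settings of a widget
.SH SYNOPSIS
.B frobnicate
[\fB\-v\fR]
.I file
.SH DESCRIPTION
.B frobnicate
reads
.I file
and adjusts the widget settings it describes.
.SH OPTIONS
.TP
.B \-v
Print each setting as it is changed.
.SH SEE ALSO
.BR man (1)
EOF
```

You can preview the result with nroff -man frobnicate.1 | more before installing it anywhere.
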
Q I'm having an issue getting my E250 to recognize some new 256-MB DIMMs. I just upgraded the PROM to release 3.22, but I'm still not having any luck. The machine has 4 64-MB DIMMs in the first bank and 4 256-MB DIMMs in the second bank. I've tried moving them around to put the 256-MB DIMMs first, and I've tried taking out the 64-MB DIMMs. Still no luck. The 256-MB DIMMs are actual Sun memory (part number 501-6056-7821-3370), and I've tried swapping them out for other DIMMs of the same part number. The machine only ever recognizes these as 128 MB, though. What am I doing wrong?

A The E250 only accepts up to 128-MB DIMMs, with a maximum of 2 GB. If you want more than 2 GB of RAM, you need to upgrade to a different model machine.

Q My HP-UX 10.20 machine has suddenly developed a weird problem with uname that is breaking a number of scripts. Now when I type in uname, I get the following output:

ZZ
I'm not sure what this is or where it came from, and none of the other administrators seem to have a clue, either. Is this indicative of some sort of operating system corruption? We'd do a restore from tape, but we're not sure what is broken.

A It sounds like /etc/rc.config.d/netconf has been modified, possibly by someone typing in vi commands (since ZZ is save and exit). Take a look at the modification date on /etc/rc.config.d/netconf and see if that matches up with the time period that you started to notice things breaking. There should be a variable called OPERATING_SYSTEM, and the value should be HP-UX:

OPERATING_SYSTEM=HP-UX
If you see OPERATING_SYSTEM=ZZ, you've found your culprit. If none of you have edited this file recently, but the changes are there, I'd start looking at audit trails to see if you've been broken into.
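Both checks can be done in one step, as in the sketch below. The function name check_netconf is made up (it is not an HP-UX command); the file path and variable name are the ones mentioned above.

```shell
# check_netconf is a hypothetical helper, not an HP-UX command.
# It shows the file's modification details and the OPERATING_SYSTEM line,
# and returns nonzero if the value does not start with HP-UX.
check_netconf() {
    file=${1:-/etc/rc.config.d/netconf}
    ls -l "$file"
    grep '^OPERATING_SYSTEM=' "$file"
    grep '^OPERATING_SYSTEM=HP-UX' "$file" > /dev/null
}
```

Run check_netconf with no arguments on the affected machine. The exit status is only a convenience for scripting; the listing and the printed line are what you want to read.
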

Q I'm using iptables on my Linux box. It's acting as a firewall/router/NAT gateway. I thought that I had all UDP blocked unless it was originating from the machine itself or from the machine's internal interface, but when I run nmap against the external interface, it claims that UDP stealth scans work and that all UDP ports are open. What iptables rules would I use to close this hole? I don't want people to be able to initiate UDP sessions from the external interface at all.

A If you've already disallowed initiating UDP sessions from the external interface, you should be safe, even though nmap reports otherwise. Because of the way UDP works, nmap cannot differentiate between open UDP ports and UDP ports whose packets were silently dropped by a packet filter. UDP is connectionless, so an open port usually sends nothing back to an empty probe. nmap relies on receiving an ICMP port unreachable reply to learn that a port is closed, and it reports any port that stays silent as open. Your filter drops the probes without replying, which is why nmap shows all of your UDP ports as open. It would be less confusing if nmap reported that it could not determine the port's state in these instances, rather than giving a false positive.
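If you would rather have scans report the ports as closed, you can reject the packets instead of dropping them, which sends back the ICMP port unreachable message that nmap is waiting for. The following is only a sketch: it assumes eth0 is the external interface and that the state match module is available, and it wraps the rules in a made-up function (udp_rules) with an IPT variable so that you can print them with IPT=echo before applying them.

```shell
# udp_rules is a hypothetical wrapper; eth0 as the external interface is an
# assumption. Set IPT=echo to print the rules instead of applying them.
# Only the INPUT chain is covered; forwarded traffic goes through FORWARD.
udp_rules() {
    ext_if=${1:-eth0}
    ipt=${IPT:-iptables}
    # Allow replies to UDP traffic that was initiated from this side.
    $ipt -A INPUT -i "$ext_if" -p udp -m state --state ESTABLISHED,RELATED -j ACCEPT
    # Refuse everything else. REJECT answers with ICMP port unreachable, so a
    # scanner sees the port as closed. Use -j DROP to discard silently instead.
    $ipt -A INPUT -i "$ext_if" -p udp -j REJECT --reject-with icmp-port-unreachable
}
```

Running IPT=echo udp_rules eth0 prints the two rules for review; running udp_rules eth0 as root appends them to the INPUT chain. Whether to reject or drop is a judgment call: dropping gives a scanner less information, while rejecting makes the scan results match reality.
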

Q With the current downturn in the market, many sysadmins seem to be looking for jobs. I've been in a situation where I've been off the market for quite some time now, and I'm not entirely certain how to classify myself in skill level or what sort of salary I should be negotiating for. Is there any sort of guide that would help?

A The question of your skill level often depends on what sort of responsibilities you expect your new job to have and what sort of qualifications (e.g., certifications, college degree, or work experience) the HR department at that company values. As a general rule of thumb, though, I tend to go by the descriptions given on the System Administrators Guild (SAGE) job descriptions page:

http://www.usenix.org/sage/jobs/jobs-descriptions.html
As for salaries, however, that's a tougher question. In previous years, the market was booming and sysadmins could ask for the sky and some companies would actually give it to them. As the economy tightens its belt, though, those outrageous salaries are returning to some semblance of normalcy, and companies are mostly looking to hire senior sysadmins and network admins. The most common places that are still hiring and offer some stability appear to be institutions of higher learning. If you're interested in looking at statistics on previous years' salaries, take a look at the SAGE salary surveys:

http://www.usenix.org/sage/jobs/salary_survey/salary_survey.html
If you're interested in participating in the salary survey, SAGE opens the polls at the end of each year.

Q I work for a small ISP, and, now that budget cuts are hitting everyone, we're having a hard time keeping up with customer requests. Sometimes people are too busy and things fall through the cracks. Sometimes the person who took the call or got the email has left the company. I know there are large expensive packages out there to do ticket tracking, but we can't afford that kind of solution. Is there anything out there that's inexpensive or free that would help us keep track of user requests and who's handling what?

A There are bug-tracking software packages like GNATS, but you'd probably be better off with a full-blown request tracking system. There's an excellent free, GPL-licensed package called RT (Request Tracker):

http://www.fsck.com/projects/rt/
Some of the features of RT include:

  • Multiple queues
  • Fine-grained access control (including public and private queues)
  • Both a Web and a CLI interface
  • A backend SQL database (MySQL and PostgreSQL are popular choices) for data storage
  • Tools for auditing and reporting ticket statistics
  • An object-oriented design, which makes it possible to drop in customized scripts
  • Third-party contributed scripts/tools
  • Support through the mailing list and the ability to engage the author in contract work
Q A contractor was here doing some work for us, and I noticed him type cd -. I have never seen this syntax before and wondered exactly what it does. Is it an abbreviation for cd .. ? Whatever it is, the Bourne shell doesn't appear to have it.

A cd - is shell-specific, as you discovered. In ksh and tcsh (and possibly others), it switches the working directory to the previous working directory ($owd in tcsh). If you start off in /usr/local/src, for example, and then cd to /tmp, doing a cd - will take you back to /usr/local/src. A subsequent cd - will go back to /tmp. You can alternate like this indefinitely. This is somewhat akin to tcsh's pushd and popd, but it remembers only the most recent directory instead of keeping an arbitrarily growable stack. You can also use pushd - to indicate the previous working directory.
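The toggling is easy to see in a short session. This sketch assumes a shell that supports cd - and sh-style variables (ksh, bash, or a POSIX sh), and it uses /usr and /tmp only because they exist nearly everywhere.

```shell
cd /usr
cd /tmp
cd - > /dev/null    # back to /usr (cd - also prints the directory it lands in)
first=$PWD
cd - > /dev/null    # forward to /tmp again
second=$PWD
echo "$first then $second"
```

In ksh and bash, the directory that cd - will switch to is also available as $OLDPWD.
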

Q In the process of changing jobs, I've gone from an AIX environment to a Solaris and HP-UX environment. I'm an experienced AIX sysadmin, but things seem to be done quite differently in Solaris and HP-UX. Is there a reference that will help me apply my AIX knowledge to my new environment -- some sort of translation guide?

A The UNIX Rosetta Stone does a fairly decent job of mapping commands from one UNIX-like OS to another on a basic grid:

http://bhami.com/rosetta.html
The UNIX Rosetta Stone includes tasks for AIX, Darwin, DG-UX, FreeBSD, HP-UX, Irix, Linux, NetBSD, OpenBSD, SCO OpenServer, Solaris, SunOS, and Ultrix. The basic categories of tasks include:

  • Hardware, firmware, and devices
  • Disks
  • Kernel, booting, and swap
  • Files and volumes
  • Networking
  • Security and backup
  • Software, patching, tracing, and logging
  • Other references

Submissions for tasks and additional OSs are accepted by the maintainer if you have information to add.

Q We process a large number of logs per day, and syslog just isn't cutting it anymore. We're looking for a replacement that will allow us to have better control of where things go instead of just having the facilities and priority levels of syslog. Is there anything better, or are we stuck with writing something homegrown? I'd hate to waste time reinventing the wheel.

A Depending on the OS you're using, take a look at syslog-ng:

http://www.balabit.hu/en/downloads/syslog-ng
It supports Solaris, Linux, and BSD-like OSs. This is a syslog replacement that adds regular expression matching on message content to the usual priority/facility pairing. As an added benefit, syslog-ng makes using a centralized log host easier if you're going between firewalled segments, and the config file is extremely customizable. There's also a support mailing list at:

http://lists.balabit.hu/mailman/listinfo/syslog-ng
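To give a flavor of the configuration, here is a sketch of a fragment that routes sshd login failures to their own file by matching on message content, which plain syslog cannot do. The source, filter, and destination names, the match string, and the log file path are all made up, and the unix-stream source assumes Linux. Check the documentation for your version before relying on any of it.

```shell
# Write a sample fragment to a scratch file for review; nothing is installed.
cat > syslog-ng.conf.sample <<'EOF'
# Local messages (Linux-style socket; other OSs use different source drivers)
source s_local { unix-stream("/dev/log"); internal(); };

# Match on program name and a regular expression, not just facility/priority
filter f_ssh_fail { program("sshd") and match("Failed password"); };

destination d_ssh_fail { file("/var/log/ssh-failures.log"); };

log { source(s_local); filter(f_ssh_fail); destination(d_ssh_fail); };
EOF
```

Each log statement ties a source, any number of filters, and a destination together, so adding another routing rule means adding another filter/destination/log triple rather than overloading a facility.
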
Q I'm trying to back up the mail spool (/dev/md1) on our Red Hat 6.2 machine to tape every night at 3:00 AM via cron. Some nights this works fine, but sometimes I get the following error in the mail after the dumps have finished:

DUMP: bread: lseek fails
<several more of these>
DUMP: short read error from /dev/md1: [block -1457973192]: count=4096, got=0
DUMP: bread: lseek2 fails!
<many more short read errors with varying negative block numbers>
Things go rapidly downhill from there, and then the whole dump is aborted. Is this some sort of disk corruption? I've done some read and write exercises on the disk, and there don't appear to be any problems except occasionally when dumps happen. Any insight as to what's going on?

A You mentioned that you're backing up your mail spool. My guess is that your mail spool sees a lot of activity when you're performing the dump. Dump makes several passes on the disk, first making a list of inodes (files, directories, devices, etc.) to back up. Only then does dump actually back up the real data. If you've got a very busy filesystem, things change on disk between the time when dump makes up its list and the time when the data is actually read from disk. For example, suppose a user reading mail in a monolithic mailbox receives new mail and deletes some old messages. When dump made the initial passes, it recorded the inode as being 25631 bytes. When it came time for the data to be copied off to tape, the file was only 23715 bytes. Dump then attempts to seek past the end of the file as it now exists.

To remedy this situation, do your dumps on a quiet system, as the man page suggests. If your system is never really quiet, you can use mirrored disks: break the mirror at 3:00 AM, dump the detached half instead of the live disk, and then re-attach the mirror and let it resync after the dump has finished. Using individual files for each message (maildir format) instead of monolithic mailboxes (mbox format) might also help. The problem is less likely to appear because you'll be adding and removing entire files instead of changing the sizes of larger monolithic files.

Q I'm running AIX 4.3.3.0, and I'm trying to use mksysb to create a bootable disaster recovery tape. I'm doing this via smit, and I get the following output:

Creating information file (/image.data) for rootvg....

Creating tape boot image..
0301-154 bosboot: missing proto file: /usr/lib/drivers/bbldd

0512-016 mksysb: Attempt to create a bootable tape failed:
bosboot -d /dev/rmt0.1 -a failed with return code 41.
I went looking for the bbldd prototype file it can't find and, indeed, it's not there. Obviously I'm missing a package or something is misconfigured. Is there an easy fix for this?

A You can start by checking for fileset corruption with lppchk. It sounds like you're at least missing the bbldd device driver. You might want to try installing or reinstalling that fileset:

devices.buc.00004001.com
You can also try running the config manager to see if you're missing other drivers. If you do wind up putting more device drivers on the machine (use cfgmgr -i /dev/cd0 after putting in the first AIX install CD), you may have to reboot for them to take effect. You probably want to apply the latest maintenance level after loading new filesets from the CD-ROM, as well.

Q I have a user who is running FreeBSD 4.4 at home and is attached to the Internet via a cable modem. His service provider requires the MAC address of the machine to hand out DHCP leases, but has the MAC address of the user's Windows machine instead. The user wants to switch between the two, but the provider will only register one MAC address. Is there a way that the user can convince FreeBSD to lie about its MAC address so that switching between the machines is painless?

A The best solution to your user's problem would be not to try and steal the MAC address at all, but to network the two boxes and have one of them act as a router and do IP masquerading (NAT). FreeBSD and Windows are both quite capable of this, and it's a cleaner solution.

If you must have the FreeBSD machine take on the MAC address of the Windows machine, there is a caveat. MAC addresses should be unique worldwide, but absolutely need to be unique on the same network. If your user networked the FreeBSD machine and the Windows machine and has them both on at the same time, he's going to have issues. If only one machine is going to be up at a time, you can create a file called /etc/start_if.<interface name>. In this file, you can specify the argument to ifconfig such that it will change the interface's MAC address while it's being brought up for the first time:

ifconfig <interface> lladdr <new MAC address in hex>
If the Windows machine's MAC address were ff:20:d3:f4:6e:f8 and the FreeBSD machine's interface were fxp0, the contents of /etc/start_if.fxp0 would look like:

ifconfig fxp0 lladdr ff:20:d3:f4:6e:f8
Amy Rich, president of the Boston-based Oceanwave Consulting, Inc. (http://www.oceanwave.com), has been a UNIX systems administrator for more than five years. She received a BSCS at Worcester Polytechnic Institute, and can be reached at: arr@oceanwave.com.