Cover V08, I04
Article

apr99.tar


Questions and Answers

Bjorn Satdeva

I first want to thank the many readers who reminded me of the existence of lsof. This was in response to my statement that I did not know of a program that would associate open ports with processes. Thank you to everybody who responded with this information.

One reader (whose name I unfortunately lost) wrote:

Someone asked if it was possible to change the system variable NGROUPS_MAX in Solaris. The answer given was that it is not possible to change this value unless you have source code and plenty of debugging time for the kernel and utilities, because the value for NGROUPS_MAX is compiled into the kernel.

In Solaris, it is not only possible, but also easy to customize kernel operations without major kernel and software problems. These changes are made in the file /etc/system. For instance, adding the following line to /etc/system will change the default number of groups to 48:

set ngroups_max=48

For this setting to take effect, from the ok prompt, type boot -r.

I was not aware of this, but I have more than once been very frustrated with the briefness of the man page for this file. Does anyone know a better source of configuration information for this file?

 Q I created a new account, but when I attempt to change the password and prompt, I get this:

passwd couldn't change entry (rcp call failed)

What does "rcp call failed" mean?

 A I would guess that rcp above should be "RPC" instead. RPC stands for Remote Procedure Call, which is a set of library routines that allow C language programs to make procedure calls on other machines across a network. RPC calls are used by the passwd command to connect to the NIS server to change the password on that machine. The message indicates some kind of communication problem between your client machine and the NIS server.

 Q Where can I find or learn email basics for UNIX system administrators? Thanks.

 A The standard UNIX mail delivery agent is a program called Sendmail. It's a very large program with a very complex configuration file. The size and complexity of Sendmail results from the many different kind of "jobs" it has to do. You can learn about Sendmail by reading Bryan Costales & Eric Allman's book called Sendmail, published by O'Reilly and Associates. Be forewarned that the book has about 1000 pages, however, you will probably not need to understand the rewrite rules in details. Instead, focus on the M4 configuration files. Still, there will be plenty to learn. If you can avoid getting discouraged, you will be able to learn it all; it will just take time. Hang in there, and good luck.

 Q In the December '98 issue of Sys Admin, you state: "By definition, any host that receives mail should have at least one MX record."

Could you please site a source for this claim or give an explanation of this statement? This statement essentially contradicts DNS and BIND's (2nd Ed. by Paul Albitz and Cricket Liu, O'Reilly and Associates) introductory paragraph of the "DNS and Electronic Mail" chapter. It also contradicts real world experience: MX records are nice for re-routing SMTP mail, providing redundant SMTP email delivery and the like, but they are certainly not necessary for email delivery.

Given the DNS bible's attitude and the reality of email transport, I'm not inclined to add an extra 2200-some DNS records to the 400-some DNS zones we host. My experience indicates they are simply not necessary.

 A Please read my answer again: "By definition, any host that receives mail should have at least one MX record." RCC974, which defines mail routing, does not mention any fallback to usage of A records. In addition, I have heard Eric "Sendmail" Allman on several occasions state that not using MX records is bad practice. To me, not including the MX records is simply sloppy maintenance of the DNS server. However, you are right that most SMTP mail transfer agents will manage to deliver mail in spite of the missing MX records. If the destination host does not have an MX record, then Sendmail will attempt to deliver to the respective A record address. It, however, will require DNS lookup to get that information. As for your 2200 DNS entries, I seriously doubt that they all are capable of receiving mail, as I would think that some of these entries must be routers, printers, or PCs that cannot receive mail though SMTP. If you have anywhere near that number of UNIX hosts, which all are capable of receiving mail (nullclients do not count), then you have a major maintenance problem at hand.

 Q I am wondering if anyone knows of articles addressing the pros and cons of purchasing a maintenance contract for a UNIX server. I'd like to find some objective help as to whether a maintenance contract is necessary, and if so, what kind of maintenance contract should be purchased.

 A The answer depends on how important the server is for your organization. If you do not have a hardware service contract, then it will normally take more time to get a response to a hardware service request. First of all, you will need to look at how long you can tolerate the server being down. If it is no big deal to have the server down for two or three days, service using time and materials billing may be sufficient. However, for most organizations, such downtimes are unacceptable. Also, if the server does break, it can cost a lot more to get it fixed compared to the price of a multi-year service contract. Without knowing all the involved facts, I still think that I would not want to be without a hardware service contract for such a system.

For workstations and smaller servers, the answer can be very different. In this case, it comes down to a numbers game. If you figure out the cost of placing all of your workstations on a service contract, you might learn that you can purchase a number of brand new systems each month (or year), and use some of those systems as cold spares, while others could be used to replace older equipment.

 Q How do I configure a dual operating system (Solaris 2.5.1) on an Ultra 2 machine with two internal disks in it? Your help is greatly appreciated.

 A You are asking about a dual operating system, which usually means two different operating systems on the same hardware platform, allowing you to boot one or the other (for example UNIX and NT). However, as the question is for a Ultra 2 machine, I think that you might mean how to install a single operating system utilizing both disks.

In this case, you install the OS on the first disk, using the usual installation procedure. As far as I remember, the Solaris installation program will ask you whether you want to install (partition and newfs) both disks. If it does not, then complete the installation on the root dist and use the format program to partition the second disk drive (see the man page for format). Make sure that you are partitioning the correct disk. (The software should protect you from partitioning the root disk, but I always like to be really, really sure that I am not working on the wrong drive.). After the disk has been partitioned, you can build a file system on each partition using the newfs command. When all the partitions have a file system, you can add each file system to the vfstab file, so they will be mounted automatically on reboot.

If you really meant that you wanted a dual OS system (e.g., SunOS and Solaris), then you install one OS on each disk. However, there is (to my knowledge) no software that, like in the Intel world, will prompt you for which OS you want to boot. You will therefore need to specify at boot which OS you want to boot from.

 Q You had a question about dial-up PPP in the January '99 Sys Admin from a user who could not access other systems on the LAN after dialing in with PPP. You suggested it was because of problems with the remote system default router.

I would like to offer another suggestion. With version 2.6, Solaris changed the default value of the IP parameter ip_forwarding to off. This caused me to be able to dial into the server but not to use the remote client to access anything beyond the server with that PPP connection. My fix was to run the command:

/usr/sbin/ndd -set /dev/ip ip_forwarding 1

I put it in one of the high number Sxx files in /etc/rc2.d.

 A You are right, I did not mention IP forwarding, which was a mistake on my part. If IP forwarding is not turned on, then the machine will not forward packages between the two networks. I alluded to this in making a reference to the routing being working on the server, but I should have been more explicit.

 Q We are starting to have soft errors (data retries) on our /usr disk. Since the disk is obsolete, is there a tool that would let me plug-in the address of the soft block and have it tell me what file owns it? That way I can move the file somewhere else. Thanks.

 A If you know what inode is affected (fsck, among others should tell you that), then you can use ncheck to generate a cross reference between inode numbers and path names for the disk partition in question. However, you are better off finding a new disk that will work in your system, because soft errors often are the first sign of a failing disk. Do not be surprised if the soft errors over time turn into hard errors and then a complete disk failure.

 Q How can I become a superuser without root password other than uid=0& setuid.

 A On some versions of UNIX, you can become root by booting the system into single-user mode (while other systems will prompt you for the root password when you try to do this). If your system does not allow you to boot it into single-user mode without the proper root password, you are probably out of luck and will need to reinstall the system. Your only other option is to break into the system using one of the many "black hat" tools floating around on the Internet; however, you will need to get information on those elsewhere.

 Q How do you gracefully exit and shut down Solaris 2.6 in the CDE environment? Is it possible to just exit CDE and return to the command line?

 A If your system is configured to login from the CDE or Openwin login screen, then you have no choice other than to shut down the system from that environment unless you can locate the program that is used for this purpose (dtlogin for Openwin). If you kill that program, you should get a normal UNIX login after exiting your Windows environment.

You can disable this default behavior by locating the startup file in /etc/rc2.d (for Openwin, it is S99dtlogin).

 Q Is there a shell/awk/sed or Perl script out there that can remove the non-printable characters from the output of the man command? I need to be able to save the outputs of the man command into an ascii text format.

 A The man command uses nroff to format the man pages. They do not really contain any non-printable characters. nroff is a very old UNIX command; it was written back in the days when printers were much more like typewriters. Those printers did not have the ability to print bold or italic typefaces. Instead nroff used the backspace character to cause the printer to print multiple characters in the same position (e.g., by underlining characters that were supposed to be italic). You can easily remove this from your output by some simple editing, however UNIX already provides the tool to do this. If you pipe the output from man to col -b, then the output will be in pure ascii format.

About the Author

Bjorn Satdeva is the president of /sys/admin, inc., a consulting firm which specializes in large installation system administration. Bjorn is also co-founder and former president of Bay-LISA, a San Francisco Bay Area user's group for system administrators of large sites. Bjorn can be reached at questions@sysadmin.com.