Cover V11, I12

Questions and Answers

Amy Rich

Q I occasionally need to delete specific DNS cache entries from our corporate BIND servers. I’ve read through the documentation, but can’t seem to find a way to do this. Can you point me to any supplemental documentation that might help, or is this just not possible?

A You don’t mention which version of BIND you’re running, so I’ll take a stab and guess that you’re running either BIND 8 or BIND 9. You can’t flush individual cache entries with BIND 8, but you can flush the entire cache using:

ndc restart
With BIND 9 up to version 9.1.3, you need to kill and restart named.

The rndc command replaced the ndc command from BIND 8, so in versions 9.2.0 and 9.2.1, you can run:

rndc flush
In 9.3.0, you’ll finally be able to flush individual entries with rndc by using the syntax:

rndc flushname <name>
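If you look after servers running a mix of versions, the choices above can be wrapped in a small helper. This is only a sketch: bind_flush_cmd is a made-up name, it assumes a version string of the form major.minor.patch, and it prints the suggested command rather than running it.

```shell
#!/bin/sh
# Print (not run) the cache-flush approach described above for a given
# BIND version. bind_flush_cmd is an illustrative name, not a BIND tool.
bind_flush_cmd() {
    version=$1          # assumed form: major.minor.patch, e.g. 9.2.1
    name=$2             # optional: a single name to flush
    major=${version%%.*}
    rest=${version#*.}
    minor=${rest%%.*}

    case $major in
        8)
            # BIND 8: whole cache only, via a restart
            echo "ndc restart"
            ;;
        9)
            if [ "$minor" -lt 2 ]; then
                # 9.0.x and 9.1.x: no rndc flush yet
                echo "kill and restart named"
            elif [ "$minor" -lt 3 ] || [ -z "$name" ]; then
                # 9.2.x, or no specific name requested
                echo "rndc flush"
            else
                # 9.3.0 and later: single-name flush
                echo "rndc flushname $name"
            fi
            ;;
        *)
            echo "unhandled BIND version: $version" >&2
            return 1
            ;;
    esac
}
```

For example, bind_flush_cmd 9.3.0 www.example.com prints the flushname form, while bind_flush_cmd 9.2.1 www.example.com falls back to flushing the whole cache.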
Q According to RFC 2821, non-resolvable/non-routable addresses must not appear in any SMTP transactions:

    4.1.1.1 Extended HELLO (EHLO) or HELLO (HELO)

    These commands are used to identify the SMTP client to the SMTP server. The argument field contains the fully qualified domain name of the SMTP client if one is available. In situations in which the SMTP client system does not have a meaningful domain name (e.g., when its address is dynamically allocated and no reverse mapping record is available), the client SHOULD send an address literal (see section 4.1.3), optionally followed by information that will help to identify the client system.

    ...

    4.1.3 Address Literals

    Sometimes a host is not known to the domain name system and communication (and, in particular, communication to report and repair the error) is blocked. To bypass this barrier a special literal form of the address is allowed as an alternative to a domain name. For IPv4 addresses, this form uses four small decimal integers separated by dots and enclosed by brackets such as [123.255.37.2], which indicates an (IPv4) Internet Address in sequence-of-octets form. For IPv6 and other forms of addressing that might eventually be standardized, the form consists of a standardized “tag” that identifies the address syntax, a colon, and the address itself, in a format specified as part of the IPv6 standards.

We have our mail hub (running sendmail) behind our NAT gateway, and it connects directly to machines on the Internet to deliver mail. The mail hub is clearly violating RFC 2821 by initiating SMTP transactions with its internal name that only resolves within our RFC 1918 address space. How do I modify our setup so that it’s not breaking spec?

A Presumably your NAT gateway has valid forward and reverse addresses and its own FQDN that allows it to talk to other machines. If you’re just doing a port 25 redirection to your internal mail hub, then you can modify the sendmail configuration on the mail hub to talk to outside hosts using the FQDN of the NAT box (this assumes that your NAT box never sends mail to your mail hub). Add the following line to your .mc file and rebuild your sendmail.cf on the mail hub:

define(`confDOMAIN_NAME', `your.nat.box')dnl
If your NAT box is also a machine that will be sending mail, then you’ll cause a loop if both the NAT box and the mail hub use the same name in the SMTP transaction with each other. In this case, you can create another A record that resolves to the same IP as your NAT box. Use this new FQDN on your mail hub:

define(`confDOMAIN_NAME', `not.the.nat.box')dnl
This way both names will be resolvable and routable, but will not conflict when the two machines are talking to each other.
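To double-check whichever name you choose, resolve it from a host outside your network and make sure the answer is not an RFC 1918 address. Here is a rough shell sketch of the address test only. The helper name is_rfc1918 is made up for illustration, and it assumes a well-formed dotted-quad IPv4 address as its argument.

```shell
#!/bin/sh
# Succeed (exit status 0) if the given IPv4 address lies in RFC 1918
# private space: 10.0.0.0/8, 172.16.0.0/12, or 192.168.0.0/16.
# is_rfc1918 is an illustrative name; input is assumed to be well formed.
is_rfc1918() {
    first=${1%%.*}      # first octet
    rest=${1#*.}
    second=${rest%%.*}  # second octet

    case $first in
        10)
            return 0
            ;;
        172)
            [ "$second" -ge 16 ] && [ "$second" -le 31 ]
            ;;
        192)
            [ "$second" -eq 168 ]
            ;;
        *)
            return 1
            ;;
    esac
}
```

You would feed it the address returned by your usual lookup tool; if is_rfc1918 succeeds for the name your mail hub announces, outside hosts will not be able to make sense of it.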

Q I’m writing a Perl program which, among other things, needs to sort software revision numbers. I have a list of revision numbers like:

2.5.1 2.10.3 3.7.11 2.1.1

When sorted, they should appear in the following order:

2.1.1 2.5.1 2.10.3 3.7.11

I’m having difficulty coming up with an algorithm that does this correctly and reasonably quickly. Any help you can offer would be appreciated.

A To efficiently sort this type of data, use the packed default sort, a modification of the Schwartzian Transform. Uri Guttman and Larry Rosler do a fantastic job of covering the Schwartzian Transform and the packed default sort (as well as others) in the following paper:

http://www.sysarch.com/perl/sort_paper.html
Here are the relevant excerpts:

    The significant invention in the ST is the use of anonymous arrays to store the records and their sortkeys. The sortkeys are extracted once, during a preprocessing pass over all the data in the list to be sorted (just as we did before in computing the cache of sortkeys).

    @out =
      map  $_->[0] =>
           sort { $a->[1] cmp $b->[1] }
      map  [ $_, KEY($_) ] => @in;
    
    The ST doesn’t sort the actual input data. It sorts the references to anonymous arrays that contain the original records and the sortkeys. So we have to post-process to retrieve the sorted records from the anonymous arrays.

    Using the ST for a multi-subkey sort is straightforward. Just store each successive extracted subkey in the next entry in the anonymous array. In the sortsub, do an “or” between comparisons of successive subkeys, as with the OM and the naive sorts.

    @out =
      map  $_->[0] =>
      sort { $a->[1] cmp $b->[1] ||
             $b->[2] <=> $a->[2] }
      map  [ $_, KEY1($_), KEY2($_) ]
        => @in;
    
    The packed-default sort

    Each of the advanced sorting techniques described above saves the operands to be sorted together with their sortkeys. (In the cached sorts, the operands are the keys of a hash and the sortkeys are the values of the hash; in the Schwartzian Transform, the operands are the first elements of anonymous arrays, the sortkeys are the other elements of the arrays.) We now extend that idea to saving the operands to be sorted together with packed-string sortkeys, using concatenation.

    This little-known optimization improves on the ST by eliminating the sortsub itself, relying on the default lexicographic sort, which as we showed earlier is efficient. This is the method used in the new Sort::Records module.

    To accomplish this goal, we modify the ST by replacing its anonymous arrays by packed strings.

    First we pack the subkeys into a single string. Multiple subkeys are simply concatenated, suitably delimited if necessary. Then we append the operand to be sorted.

    Several methods can be used, singly or in combination, to build the packed strings, including concatenation, pack, or sprintf. Techniques for computing subkeys of various types are presented in Appendix B.

    Then we sort lexicographically on those strings, and finally we retrieve the operands from the end of the strings.

    Several methods can be used to retrieve the operands, including substr (shown here), which is likely to be the fastest, split, unpack or a regex.

    @out =
      map  substr($_, 4) =>
      sort
      map  pack('C4' =>
        /(\d+)\.(\d+)\.(\d+)\.(\d+)/)
          . $_ => @in;
    
    Benchmark [Table 1] compares the two most advanced general-purpose sorting techniques, ST and packed-default. These multi-stage sorts are measured both as individual stages with saved intermediate data and as single statements.

    The packed-default sort is about twice as fast as the ST, which is the fastest familiar Perl sorting algorithm.

    Earlier, we showed trivial sorts using the substr or lc function. Even for those cases, the packed-default sort performs better when more than a few data items are being sorted. See Benchmark A5, which shows quasi-O(N) behavior for the packed-default sort over a wide range of input sizes, because the sorting is much faster than the sortkey extraction.

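So, to sort based on the three numeric fields in your software revision numbers, you’d want the following. Note that pack’s C format stores each field in a single unsigned byte, so this assumes that no field is larger than 255:

@sortedrev = map substr($_, 3) =>
   sort
   map pack('C3' =>
     /(\d+)\.(\d+)\.(\d+)/)
            . $_ => @revisions;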
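The same decorate, sort, strip pattern can be tried at a shell prompt as a quick cross-check of the expected ordering. This sketch swaps pack for awk's printf (the paper lists sprintf as one way to build the packed keys), so it is an analogue rather than the Perl code itself. The name sort_revisions is made up, and the zero-padded key assumes no field is larger than 999.

```shell
#!/bin/sh
# Decorate-sort-undecorate for three-part revision numbers.
# sort_revisions is an illustrative name; it reads one revision per line.
sort_revisions() {
    # 1. Prefix each line with a fixed-width, zero-padded key,
    #    e.g. "2.10.3" becomes "002010003 2.10.3".
    # 2. Sort the decorated lines bytewise, so the key decides the order.
    # 3. Strip the key and the separating space again.
    awk -F. '{ printf "%03d%03d%03d %s\n", $1, $2, $3, $0 }' |
        LC_ALL=C sort |
        cut -d' ' -f2-
}

# The sample list from the question, one revision per line.
printf '%s\n' 2.5.1 2.10.3 3.7.11 2.1.1 | sort_revisions
```

Run against the list in the question, this prints 2.1.1, 2.5.1, 2.10.3, and 3.7.11, one per line, which matches the order requested.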
			
Q I use procmail to catch a lot of spam that makes its way towards my account. Every once in a great while, I’ll get a false positive, but I have so many rules and regular expression matches that it’s sometimes hard to track down which rule was triggered. Of course, I can always turn on verbose logging, but that shows much more than I’m interested in, and it isn’t useful when the mail that triggered the mystery rule has already been sent to /dev/null. Is there any way to have procmail selectively log which rule was matched when it triggers one of my spam catchers?

A You can selectively modify the standard log message that gets written out. First, make sure that you’ve specified a log file in your .procmailrc:

LOGFILE=$HOME/mail.log
Second, create a newline variable (again in your .procmailrc) for ease of use later in your recipes:

NL="
"
For each of your spam-catching rules, use the MATCH variable to store the regular expression match. Here’s how the procmailrc man page describes it:

MATCH This variable is assigned to by procmail whenever it is told to extract text from a matching regular expression. It will contain all text matching the regular expression past the ‘\/’ token.

Let’s take an example. You want to throw all mail from the user nobody@host.name into /dev/null. The rule would look like:

:0
* ^\/From:.*nobody@host.name
{
  LOG="Match = ${MATCH}${NL}"
  :0
  /dev/null
}
Instead of the standard log message:

From nobody@host.name  Thu November 14 20:42:27 2002
 Subject: spam
  Folder: /dev/null                              444
The log message would appear with the additional “Match” line at the top:

Match = From: nobody@host.name
From nobody@host.name  Thu November 14 20:42:27 2002
 Subject: spam
  Folder: /dev/null                              444
The \/ token can be used with any sort of header or body match. To get the most information, you’ll want to put it as close to the beginning of the regular expression as you can. For headers, this means putting it after the beginning of line token, ^. For matches on message bodies, I suggest writing the beginning of the regular expression line as:

* .*\/
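Once the Match lines are accumulating in your log, a quick tally shows which patterns fire most often, which helps when you are hunting for the rule behind a false positive. This is a minimal sketch; match_summary is a made-up name, and it assumes your recipes write lines beginning with "Match = " as in the example above.

```shell
#!/bin/sh
# Count how often each logged "Match = ..." line occurs, most frequent first.
# match_summary is an illustrative name; pass it the path you set in LOGFILE.
match_summary() {
    grep '^Match = ' "$1" | sort | uniq -c | sort -rn
}
```

You would call it as match_summary "$HOME/mail.log". Each output line is a count followed by the logged match text.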
Q I recently cvsuped to FreeBSD 4.6-STABLE and built and installed the new OS. After I rebooted the machine, I started getting (and keep getting) messages like the following:

Nov 11 10:25:00 kosmos /usr/sbin/cron[167]: getting vmemoryuse resource limit: Invalid argument
Nov 11 10:30:00 kosmos /usr/sbin/cron[170]: getting vmemoryuse resource limit: Invalid argument
Nov 11 10:35:00 kosmos /usr/sbin/cron[178]: getting vmemoryuse resource limit: Invalid argument
I’ve tried changing /etc/login.conf to set vmemoryuse to various different values, including unlimited. I rebooted the machine and still get errors like the above. Did I miss something when running mergemaster?

A Rather than missing something in mergemaster, it sounds like you did not install and boot off a new kernel. If you build a new kernel with the same sources that you used to build your user environment and you’re still having issues, try searching through the FreeBSD mailing list archives or posting to the stable mailing list. Information on reading and subscribing to the freebsd-stable mailing list can be found at:

http://www.freebsd.org/support.html#mailing-list
Q I have a fresh installation of Solaris 9, which I just configured. I created an /etc/resolv.conf file and listed our domain and our three DNS servers. /etc/nsswitch.conf has the line:

hosts: files dns
I am able to use nslookup and dig without any problems, but client-oriented programs, like ping, ssh, and a small C program I wrote to test gethostbyname(), do not successfully resolve an IP when given a hostname. I am able to use the IP address instead of the hostname for each of these services. As a last-ditch effort, I changed /etc/nsswitch.conf to use just dns and everything started working fine. I then added files back in, so it exactly matched the contents of the file when it was failing, and it still continued to work. Why would modifying /etc/nsswitch.conf (and then changing it back) cause name resolution to suddenly start working?

A Are you running the name service cache daemon (nscd) on this machine? My best guess is that you are, and changing the modification time on /etc/nsswitch.conf triggered nscd and made it reread your /etc/resolv.conf file. I choose not to run nscd on the machines I administer because it often causes more hassles than it’s worth. If you don’t want to run nscd at all, remove the following files:

/etc/rc0.d/K40nscd
/etc/rc1.d/K40nscd
/etc/rc2.d/S76nscd
/etc/rcS.d/K40nscd
If you want to run the caching daemon but not use it for hosts, or if you want to tweak some of the parameters that nscd uses, take a look at the nscd and nscd.conf man pages. You can experiment with the values in /etc/nscd.conf until you are happy with the TTL parameters.
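As one example of the "keep nscd but skip hosts" approach, the nscd.conf man page documents an enable-cache directive that takes a cache name and yes or no. The sketch below rewrites a configuration file so that hosts caching is off. The function name disable_hosts_cache is made up, and I would try it on a copy of /etc/nscd.conf first and restart nscd afterwards.

```shell
#!/bin/sh
# Turn off the hosts cache in an nscd.conf-style file.
# disable_hosts_cache is an illustrative name; pass the file to edit.
# Assumption: a POSIX-style awk (on Solaris, nawk may be the safer choice).
disable_hosts_cache() {
    conf=$1
    # Keep every line except an existing "enable-cache hosts ..." setting.
    awk '$1 != "enable-cache" || $2 != "hosts"' "$conf" > "$conf.new" || return 1
    # Add the setting we want.
    echo "enable-cache hosts no" >> "$conf.new"
    # Overwrite in place so the original file's ownership and mode are kept.
    cat "$conf.new" > "$conf" && rm -f "$conf.new"
}
```

For instance, copy /etc/nscd.conf to /tmp/nscd.conf, run disable_hosts_cache /tmp/nscd.conf, and compare the two files before installing the result.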

Amy Rich, president of the Boston-based Oceanwave Consulting, Inc. (http://www.oceanwave.com), has been a UNIX systems administrator for more than five years. She received a BSCS at Worcester Polytechnic Institute, and can be reached at: qna@oceanwave.com.