Cover V08, I10

File Transfer and Verification Between Non-Connected Networks

Robert Blader

Our users routinely need to transfer applications from a development (source) network to a production (target) network that has a higher security classification. Although the applications are the same, the data processed on the target network is received from another agency and is sensitive in nature. Due to the differences in the security levels of the networks, any physical connectivity is prohibited. The requirement was to write a utility to transfer files from one network to another. All files were to be copied to the same absolute path on the target network as on the source network, with the same access modes, ownership, and group as on the source side. Dates were also to be preserved on regular files but not necessarily for the directories.

Complications

There was no wire connecting the networks, which ruled out directory replication schemes such as the one described by Jim McKinstry in “File Replication” (Sys Admin, Feb. 1998, Vol. 7, No. 2) or adaptations of it. Initially, I tried a simple script that tar'ed files to a tape on a workstation connected to the source network. I then restored them on the target network from a tape drive attached to that network. Unfortunately, ownership and permissions were not consistent. I verified that I had used the -p flag in the tar commands, and that the files on tape showed the correct owner and access information. When I transferred a file whose path was something like /dir1/dir2/dir3/dir4/file.out on the source network, and the target network only had /dir1/dir2, directories dir3 and dir4 were created with root as the owner and with an access mode determined by root's umask (026). The -p flag seemed to apply only to file.out and not to the directories in its path. cpio had even worse side effects, so I went back to the drawing board using tar.

Solution

The Korn shell script, net_xfer (Listing 1), was written to have a simple user interface. The user types:

net_xfer  [-f | -l | -help] filename

where -f indicates that “filename” is the name of the file to be transferred, and -l indicates that “filename” is a listing of the files (absolute paths) that are to be transferred. The listing can be generated with a find command. The only constraint on the value for “filename” is that it must be an absolute path, which I can verify by checking that the first character is a slash. -help displays a usage message.

The requests are queued in a file called requests.xfr. Each user request is appended to this file, and the file is removed once the requests are satisfied. We track who requested files in xfer_verify.$USER, where the $USER suffix is the user's login name. This is used later to send a verification notice on the target system. When the user makes a request, a function called dir_parse() is invoked. It takes the absolute path of each requested file, breaks it into its component directories, and writes a /usr/bin/ls -ld command for each one to xfer_verify.$USER. We execute this script and save the output in xfer_verify.$USER.out. Both the xfer_verify.$USER script and its output are added to the list of files to transfer, requests.xfr. (All listings for this article are available from the Sys Admin Web site at: www.sysadminmag.com or from ftp.mfi.com in /pub/sysadmin.)

dir_parse()
{
    FILENAME=$1         # file whose path is to be broken down
    VERIFY_SCRIPT=$2    # file that will contain ls -ld commands

    DIR=`dirname "$FILENAME"`

    # Run dirname on the path until we are down to "/"
    while [ "$DIR" != "/" ]
    do
        echo "ls -ld $DIR" >> "$VERIFY_SCRIPT"
        DIR=`dirname "$DIR"`
    done

    # Sort and remove duplicate entries
    sort -u -o "$VERIFY_SCRIPT" "$VERIFY_SCRIPT"
}

So, for example, if a user enters net_xfer -f /a/b/c, then /a/b/c is written to requests.xfr, and a script is created that contains:

#!/bin/ksh
ls -ld /a
ls -ld /a/b
ls -ld /a/b/c

This script is run and its output is saved in xfer_verify.$USER.out. Then, the script and the output are added to the requests file, requests.xfr. We also log the requests to a file that is group-writeable. net_xfer is setgid, so the log file does not have to be world-writeable, and we do not need a setuid script.
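The queueing step described above can be sketched roughly as follows. This is only an illustration, not the code from Listing 1: the names QUEUE_DIR, is_absolute, and queue_request are assumptions made for the example, and the real script also records the requester and writes the log entry.

```shell
#!/bin/sh
# Sketch: validate and queue one transfer request.
# QUEUE_DIR, is_absolute and queue_request are hypothetical names;
# the article's net_xfer may organize this differently.
QUEUE_DIR=${QUEUE_DIR:-/tmp/net_xfer.$$}
REQUESTS=$QUEUE_DIR/requests.xfr

# Succeeds only when the argument begins with a slash.
is_absolute() {
    case "$1" in
        /*) return 0 ;;
        *)  return 1 ;;
    esac
}

# Append one absolute path to the request queue; reject anything else.
queue_request() {
    is_absolute "$1" || {
        echo "net_xfer: $1 is not an absolute path" >&2
        return 1
    }
    mkdir -p "$QUEUE_DIR" || return 1
    echo "$1" >> "$REQUESTS"
}
```

Calling queue_request /a/b/c appends the path to the queue, while queue_request a/b/c prints an error and returns a non-zero status.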

Once a day in a cron job, net_xfer is run as root with a -now flag. This flag is only intended for use as root, which is why it was not discussed earlier. If there is a requests file, the files in it are tar'ed to an 8-mm tape using tar's -I flag (which tells tar to archive the files named in the requests file rather than the requests file itself) and the tape is ejected. Operators know to check for an ejected tape at the scheduled time. If the tape has not ejected, there were no requests for that day. If it did eject, the tape is write-locked and carried over to the target network. A cron job on that system runs as root and reads the tape. Because relative pathnames are not allowed, files are written where they belong. We then run the same verification script that was run on the source network and compare the results from both systems. We extract the access mode, owner, group, and filename from each line of both files and compare the fields.

#
# Use tr to remove multiple blanks so that we can use cut on 
# fields delimited by blanks.
#
  TARGET_SHORT=`echo "$TARGET" | tr -s ' '`
  SOURCE_SHORT=`echo "$SOURCE_LS_LINE" | tr -s ' '`

  target_mode=`echo $TARGET_SHORT | cut -f1 -d" "`
  target_owner=`echo $TARGET_SHORT | cut -f3 -d" "`
  target_group=`echo $TARGET_SHORT | cut -f4 -d" "`
  target_file=`echo $TARGET_SHORT | cut -f9 -d" "`

  source_mode=`echo $SOURCE_SHORT | cut -f1 -d" "`
  source_owner=`echo $SOURCE_SHORT | cut -f3 -d" "`
  source_group=`echo $SOURCE_SHORT | cut -f4 -d" "`
  source_file=`echo $SOURCE_SHORT | cut -f9 -d" "`
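As an aside, the same fields can be pulled out without repeated calls to cut by letting the shell split the line. The sketch below is an alternative illustration, not the code from the listings. The function name parse_ls_line and the ls_* variables are hypothetical, and it assumes the usual nine-column ls -ld output and filenames without embedded blanks.

```shell
#!/bin/sh
# Sketch: split one "ls -ld" line into the fields that get compared.
# parse_ls_line and the ls_* variable names are hypothetical.
# Assumes a nine-column long listing and a filename without blanks.
parse_ls_line() {
    set -f        # keep wildcard characters in the line from expanding
    # Unquoted on purpose: the shell collapses runs of blanks while
    # splitting the line into positional parameters.
    set -- $1
    set +f
    ls_mode=$1
    ls_owner=$3
    ls_group=$4
    ls_file=$9
}
```

After parse_ls_line "$SOURCE_LS_LINE", the values in ls_mode, ls_owner, ls_group, and ls_file can be saved and compared with those from the target line.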

If the filenames don't match, we exit because something is out of sync, and we provide a message to the operator. When any of the other fields do not match, we use the values from the source network and run the appropriate chmod, chown or chgrp command. This way any differences are resolved as shown below:

  #
  # Compare the modes
  #
  if [[ $source_mode != $target_mode ]]; then
      chmod 000 "$target_file"

      #
      # Just in case the "lock bit" is set, make sure to unset it.
      # If a regular file is setgid and not executable, mandatory locking
      # will be in effect for the file.  See chmod(2) for more details.
      #
      if [ -d "$target_file" ]; then
          chmod g-l "$target_file"
      fi

      # Skip the first character, which indicates whether the file is a
      # link, directory, or plain file.
      let index=2
      while [ $index -le 10 ]
      do
          MODE_BIT=`echo $source_mode | cut -c $index`
          if [ "$MODE_BIT" != '-' ]; then
              case $index in
              2|3)
                  chmod u+$MODE_BIT "$target_file"
                  ;;
              4)
                  case $MODE_BIT in
                  s)  chmod u+x "$target_file"
                      chmod u+s "$target_file"
                      ;;
                  x)  chmod u+x "$target_file"
                      ;;
                  esac
                  ;;
              5|6)
                  chmod g+$MODE_BIT "$target_file"
                  ;;
              #
              # Some code to handle GID bit
              #
              7)
                  case $MODE_BIT in
                  #
                  # On directories, an l means setgid without group execution.
                  #
                  s|l)
                      chmod g+x "$target_file"
                      chmod g+s "$target_file"
                      ;;
                  x)  chmod g+x "$target_file"
                      ;;
                  esac
                  ;;
              8|9)
                  chmod o+$MODE_BIT "$target_file"
                  ;;
              10)
                  #
                  # Case statement to accommodate the sticky bit.
                  #
                  case $MODE_BIT in
                  x)  chmod o+$MODE_BIT "$target_file"
                      ;;
                  t)  chmod o+x "$target_file"
                      chmod u+t "$target_file"
                      ;;
                  T)  chmod u+t "$target_file"
                      ;;
                  esac
                  ;;
              esac
          fi
          let index=$index+1
      done
  fi # end mode comparison

  #
  # Compare ownership
  #
  if [[ $target_owner != $source_owner ]]; then
      chown $source_owner "$target_file"
  fi

  #
  # Compare group ownership
  #
  if [[ $target_group != $source_group ]]; then
      chgrp $source_group "$target_file"
  fi
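
The character-by-character walk above maps each position of the ls mode string to a chmod call. For comparison, the same mapping can be expressed as a conversion to an octal mode. The following is a sketch only: mode_to_octal is a hypothetical helper that is not part of the listings, and it assumes a shell with POSIX arithmetic expansion. Note that some chmod implementations (the Solaris chmod(1) page describes this) do not change a directory's setgid bit through an absolute mode, so the symbolic form may still be required for directories.

```shell
#!/bin/sh
# Sketch: convert the mode column of "ls -l" (e.g. drwxr-sr-x) to octal.
# mode_to_octal is a hypothetical helper, not part of the article's listings.
mode_to_octal() {
    perms=${1#?}                 # drop the leading file-type character
    special=0
    digits=
    for weight in 4 2 1; do      # 4 = setuid, 2 = setgid, 1 = sticky
        triplet=${perms%"${perms#???}"}    # first three characters
        perms=${perms#???}                 # keep the rest for the next pass
        value=0
        case $triplet in r??) value=$((value + 4)) ;; esac
        case $triplet in ?w?) value=$((value + 2)) ;; esac
        case $triplet in
            ??x)     value=$((value + 1)) ;;
            ??[st])  value=$((value + 1)); special=$((special + weight)) ;;
            ??[SlT]) special=$((special + weight)) ;;
        esac
        digits=$digits$value
    done
    echo "$special$digits"
}
```

With this helper, a mismatch could be resolved with a single call such as chmod "$(mode_to_octal "$source_mode")" "$target_file", subject to the directory caveat noted above.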
     

To summarize the process, users run net_xfer [-f | -l] filename on the source network to queue up requests. Root runs net_xfer -now to tar the requests to tape and empty out the request queue. An operator write-protects the tape and loads it on the target system where a cron job runs read_xfer (Listing 2), which untars the tape and invokes file_verify to reconcile any differences between permissions on the files that were copied and the directories that are in their paths.

Verification of the transfer is mailed to the user and the process is complete. This utility can also be adapted by sites that need to distribute software across LANs to remote sites, or transfer published Web pages from a Web development site to a Web server. It could also be a component of a configuration management tool. file_verify (Listing 3) can be the basis of a tool to help recover from certain types of accidents, such as copying a large directory structure with tar and forgetting the -p, or issuing a command like chown username .* *, which would traverse the .. link. A listing from an old backup tar tape could be used as the xfer_verify.$USER.out file, and security modes could be reset without losing any data.

Enhancements to it might include a comparison of the file size, and perhaps a checksum to ensure that the files were copied correctly. A rule to handle the case in which a user that owns a file on the source network does not have an account on the target network might also be worthwhile. The code in its entirety may be downloaded from Sys Admin.
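
The size and checksum comparison suggested above might be sketched as follows. This is an illustration under assumptions, not part of the downloadable code: make_manifest, verify_manifest, and the manifest layout are hypothetical, and the sketch relies on the POSIX cksum utility, which prints a checksum and a byte count when it reads standard input.

```shell
#!/bin/sh
# Sketch: record checksum and size for each queued regular file, then
# compare against the same record rebuilt on the other side.
# make_manifest, verify_manifest and the manifest layout are hypothetical.

# $1 = file listing absolute paths, one per line.
# Prints one line per regular file: checksum size path
make_manifest() {
    list=$1
    while IFS= read -r path; do
        [ -f "$path" ] || continue          # checksum regular files only
        sum_and_size=$(cksum < "$path")     # cksum prints: checksum size
        echo "$sum_and_size $path"
    done < "$list"
}

# $1 = manifest carried over from the source, $2 = the same path list.
# Succeeds only if every checksum and size still matches.
verify_manifest() {
    make_manifest "$2" | cmp -s - "$1"
}
```

On the source side, the output of make_manifest would be saved and added to the files written to tape. On the target side, verify_manifest rebuilds it after extraction and returns a non-zero status if any file differs.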

About the Author

Robert Blader has been a systems administrator since 1988 at the Naval Surface Warfare Center, Dahlgren Division. He recently began focusing in the areas of network security and risk management. He can be reached at bladerrg@nswc.navy.mil.