A Framework for Automated File Transfer
George Callaway
Most systems administrators regularly have to transfer files from
machine to machine. Writing scripts to perform this activity is part of the
nature of a good admin. As a consultant, I have had to make my scripts
portable because every client has a different environment with new
twists. One way to deal with this is to keep as much reusable functionality
as possible in ksh function libraries.
This article explores some functions that I have written to help
automate file transfers. I do not promise that my scripts are bug
free, or that they are the best approach in every circumstance.
My objective here is to provide some examples of methods that have
worked for me, and that might help.
Really, all that these file transfer functions do is to use ftp
or rcp to move files. I have wrapped them with ksh
code to make them a little more reliable, or a little more flexible,
for unattended operation. I also provide other useful functions
that appear in most of my production scripts. The complete syntax
for each function is provided as comments in the function library
admin_lib. (All listings for this article are available from:
www.sysadminmag.com.)
A ksh function library is really just a ksh script
that contains ksh functions. They could be contained in the
script itself, but keeping them in a separate file allows them to
be used by multiple scripts as well as allowing you to fix your
bugs in one place instead of many. A ksh function is a bit
of code that can be called to perform a specific task. For example,
one function that I wrote simply checks the exit value from the
last command, and exits if it is non-zero.
function OnErrorExit
{
typeset ErrorCode="$?"
if [[ "$ErrorCode" -ne 0 ]]
then
DebugMsg 1 Exiting with code of $ErrorCode
exit $ErrorCode
fi
}
To use this in your script, call the function just as you would
any other ksh command:
/usr/sbin/ping myhost
OnErrorExit
If the ping returns non-zero, the program will exit with ping's
exit value. This approach can help keep your code more modular
and easier to read.
Once you have created your function library (or ksh script
containing functions), you simply need to "source" it
by typing "dot" followed by the ksh script name.
You can either provide a full pathname to the script, or it can
be in your PATH. Here is how it is done in the example provided:
. admin_lib
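As a minimal sketch of the sourcing mechanism, the following builds a one-function library in a scratch directory and sources it by full pathname. The names demo_lib and ExitIfFailed are hypothetical stand-ins for admin_lib and OnErrorExit, and the code uses the portable name() form so that it also runs under other Bourne-style shells.

```shell
# Hypothetical demo: build a tiny library in a scratch directory and source it.
LibDir=$(mktemp -d)

cat > "$LibDir/demo_lib" <<'EOF'
# demo_lib -- a stand-in for admin_lib, holding a single function
ExitIfFailed()
{
    LastCode=$?     # capture first, before any other command resets $?
    if [ "$LastCode" -ne 0 ]
    then
        echo "Exiting with code of $LastCode" >&2
        exit "$LastCode"
    fi
}
EOF

# Source the library by full pathname
. "$LibDir/demo_lib"

# With no slash in the name, the dot command searches PATH instead:
# PATH=$LibDir:$PATH; . demo_lib

# Run the check in a subshell so that "exit" ends only the demo
DemoStatus=0
( false; ExitIfFailed; echo "not reached" ) 2> /dev/null || DemoStatus=$?
echo "subshell exited with $DemoStatus"

rm -rf "$LibDir"
```

Because the function reads $? on its first line, it has to be called immediately after the command being checked. Any command in between would replace the exit value.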
This article will introduce you to some functions that I have written,
along with an example script that uses the provided function library.
ReliableTransfer -- The first function of interest
is ReliableTransfer. This function was originally written
as part of a method for backing up a database to a remote machine.
I was using rcp, but found that I would do some silly things
like overwrite files and fill up filesystems. ReliableTransfer
accepts the basic syntax of rcp, and does some additional
checks. For example:
ReliableTransfer MyFileName yourmachine:/home/YourFileName
transfers the file MyFileName to a remote machine called yourmachine,
to a file /home/YourFileName using rcp. What I added
was a list of things that I found myself doing in various scripts,
with the option of printing a lot of debug information.
First, it determines whether the copy is local or remote. If the
copy is a local copy, the cp command is used instead of rcp.
For a remote copy, various network checks are performed to make
sure that the host is available and ready for an rcp. It
then checks the size of the file to be transferred against the available
space on the remote filesystem, and will exit with an error if there
is not enough space. It then ensures that it will not overwrite
a file on the target, and commences copying the file to a temporary
name. Once the copy is complete, it compares a checksum against the
original file and, if they match, renames the remote file to its correct
name. The checks along the way alert you if the script has failed, and why.
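The sequence above can be sketched for the local-copy case only. This is not the ReliableTransfer code from admin_lib: SafeCopy is a hypothetical name, cp stands in for rcp, and the return codes are arbitrary choices made for the illustration.

```shell
# SafeCopy SOURCE TARGET -- hypothetical, local-only sketch of the checks
SafeCopy()
{
    Source=$1
    Target=$2
    TargetDir=$(dirname "$Target")
    TempName="$Target.tmp.$$"

    # Refuse to overwrite an existing file
    if [ -e "$Target" ]
    then
        echo "SafeCopy: $Target already exists" >&2
        return 2
    fi

    # Compare the file size with the free space on the target filesystem
    SizeKB=$(( ( $(wc -c < "$Source") + 1023 ) / 1024 ))
    FreeKB=$(df -Pk "$TargetDir" | awk 'NR == 2 { print $4 }')
    if [ "$SizeKB" -gt "$FreeKB" ]
    then
        echo "SafeCopy: not enough space in $TargetDir" >&2
        return 3
    fi

    # Copy to a temporary name, verify the checksum, then rename
    cp "$Source" "$TempName" || return 4
    SourceSum=$(cksum < "$Source")
    TempSum=$(cksum < "$TempName")
    if [ "$SourceSum" != "$TempSum" ]
    then
        rm -f "$TempName"
        echo "SafeCopy: checksum mismatch" >&2
        return 5
    fi
    mv "$TempName" "$Target"
}

# Demo in a scratch directory: the second copy must be refused
WorkDir=$(mktemp -d)
echo "sample data" > "$WorkDir/MyFileName"

FirstStatus=0
SafeCopy "$WorkDir/MyFileName" "$WorkDir/YourFileName" || FirstStatus=$?
SecondStatus=0
SafeCopy "$WorkDir/MyFileName" "$WorkDir/YourFileName" 2> /dev/null || SecondStatus=$?

CopyMatches=no
cmp -s "$WorkDir/MyFileName" "$WorkDir/YourFileName" && CopyMatches=yes
rm -rf "$WorkDir"
echo "first: $FirstStatus, second: $SecondStatus, identical: $CopyMatches"
```

Copying to a temporary name means that a half-written file never appears under the final name. A remote version would have to run the size, space, and checksum commands on the other host.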
The ftp Functions
The next set of functions deals with ftp. They are not
as thorough as ReliableTransfer, but they could be extended
to use the same approach. The ftp command is preferred over
rcp when security is a concern (no .rhosts files or
service), or when you are transferring to a system where rcp
is not supported.
What these functions do is to move the existing .netrc
file (if there is one) to a backup name, then write a replacement
that has commands specific to what the function is designed to accomplish.
For this reason, these commands should be used with care. Another
important note here is that ftp behaves differently when
run in the foreground. Although these scripts work in the foreground,
they were designed to work in the background, such as being run
from cron. The worst case is that when you run them from the shell,
an ftp failure may leave you at an ftp prompt. Just
type "quit" and the script will continue.
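The .netrc swap described above might look roughly like the following sketch. It assumes the replacement file uses a "macdef init" macro, which the classic BSD-derived ftp clients run automatically after auto-login; the article does not say which mechanism admin_lib actually uses. WriteNetrc and RestoreNetrc are hypothetical names, and the demo works in a scratch directory rather than the real home directory.

```shell
# WriteNetrc HOMEDIR HOST USER PASSWORD -- hypothetical sketch
# Backs up any existing .netrc, then writes one whose "init" macro runs on login.
WriteNetrc()
{
    NetrcFile="$1/.netrc"

    # Preserve the user's own file so it can be put back afterwards
    if [ -f "$NetrcFile" ]
    then
        mv "$NetrcFile" "$NetrcFile.backup.$$"
    fi

    # The file holds a password, so create it readable by the owner only
    (
        umask 077
        {
            echo "machine $2 login $3 password $4"
            echo "macdef init"
            echo "pwd"
            echo "quit"
            echo ""     # a blank line ends the macro definition
        } > "$NetrcFile"
    )
}

# RestoreNetrc HOMEDIR -- remove the generated file and restore the backup
RestoreNetrc()
{
    NetrcFile="$1/.netrc"
    rm -f "$NetrcFile"
    if [ -f "$NetrcFile.backup.$$" ]
    then
        mv "$NetrcFile.backup.$$" "$NetrcFile"
    fi
}

# Demo against a scratch directory standing in for $HOME
ScratchHome=$(mktemp -d)
echo "machine oldhost login me password secret" > "$ScratchHome/.netrc"

WriteNetrc "$ScratchHome" yourmachine someuser somepassword
GeneratedLine=$(head -n 1 "$ScratchHome/.netrc")

RestoreNetrc "$ScratchHome"
RestoredLine=$(cat "$ScratchHome/.netrc")
rm -rf "$ScratchHome"

echo "generated: $GeneratedLine"
echo "restored:  $RestoredLine"
```

The restore step matters as much as the write. If the script dies between the two, the user's own .netrc stays under the backup name, which is one reason to use these functions with care.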
- CheckRemoteLogin -- This function tries an ftp
login, and exits with an error value if the login does not work.
I used this to check for a machine that was only occasionally
there. This helped convince everyone that the other machine was
the problem, and not my code!
- GetRemoteList -- This function returns a list of
files with the extension passed to it. I use it in the script
to assign the returned list into a ksh variable. This list
can be used to control the remaining ftp sessions.
- GetRemoteFiles -- This function gets the files
in the file list, and returns them to the local location given.
It then renames the remote files with a .done extension
to prevent them from being transferred again.
- PutRemoteFiles -- This function transfers a list
of files from a local directory to a remote directory via ftp.
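The get-then-rename behavior of GetRemoteFiles can be pictured as a batch of ftp client commands generated from the file list. The helper below is a hypothetical illustration, not the admin_lib code. It only prints the commands (cd, lcd, get, rename, and quit are standard ftp client commands), and the directory names are made up.

```shell
# BuildGetCommands REMOTEDIR LOCALDIR FILE... -- hypothetical helper
# Prints the ftp commands that fetch each file and then mark it as done.
BuildGetCommands()
{
    echo "cd $1"
    echo "lcd $2"
    shift 2
    for FileName in "$@"
    do
        echo "get $FileName"
        echo "rename $FileName $FileName.done"
    done
    echo "quit"
}

Commands=$(BuildGetCommands /pub/outgoing /tmp/incoming a.html b.html)
printf '%s\n' "$Commands"
```

This sketch renames each file whether or not the get succeeded. A production version would want to confirm that the local copy arrived before marking the remote file as done.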
Other Useful Functions
The remaining functions are for supporting general script creation:
- DebugMsg -- This function prints a message to the
console. Messages print based on the setting of an environment
variable called "DebugLevel". Level 0 is supposed to
be almost silent, 1 is errors only, 5 is general info, and 10
is full debug. In the example script, DebugLevel defaults to
5 if it is not already set in the environment.
- CreateApplicationLock, RemoveApplicationLock
-- These functions are used to ensure that only a single copy
of a script is being run.
- OnErrorExit -- This function simply checks
the exit value of the previous command ($?) and, if it is non-zero, exits
the program with that value. The benefit of this function is that it
reads much better than putting an entire "if" statement
block after every command that you want to check.
- GetFileSize -- This function returns the size in
bytes of a file either locally or on a remote machine.
- GetFileSum -- This function returns the checksum
of a file either locally or on a remote machine.
- RemoteSpace -- This function returns the space
left on a remote filesystem.
- LogEvent -- This function writes a log entry to
a specified log file either locally or on a remote machine.
- Help -- This function prints out descriptions of
functions in the library. This was written to promote self-documentation
of the library by using comments.
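A level-filtered message function along the lines of DebugMsg could be sketched as follows. This is an assumption about how such a function might work, not the admin_lib implementation: a message prints when its level is at or below DebugLevel, and the output goes to standard output.

```shell
# Hypothetical sketch of a level-filtered message function
DebugLevel=${DebugLevel:-5}     # default to general info if not set

DebugMsg()
{
    MsgLevel=$1
    shift
    # Print only when the message level is within the current DebugLevel
    if [ "$MsgLevel" -le "$DebugLevel" ]
    then
        echo "$*"
    fi
}

# Demo: at level 5, errors (1) are shown and full debug (10) is suppressed
DebugLevel=5
ShownText=$(DebugMsg 1 Exiting with code of 3)
HiddenText=$(DebugMsg 10 full debug detail)
echo "shown: [$ShownText] hidden: [$HiddenText]"
```

Running a script with DebugLevel=10 set in the environment would then turn on every message without editing the script.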
Putting It All Together -- The Example Script
The scripts provided with this article are admin_lib and
frame. admin_lib is the collection of ksh functions, and
frame is an example script that uses many of the functions
described here.
Let's walk through frame and outline how admin_lib
functions are used. Of course, somewhere near the top of the script,
you will need to source the function library, using the dot command
shown earlier (. admin_lib). I usually set up a function
in my script that will run when the script exits for any reason.
I then reference that function in a "trap". This function
is where I put notification email, logs, cleanup, etc. That way,
no matter how the script exits, it cleans up after itself.
function ExitGracefully # Exit the program gracefully
{
...
}
trap ExitGracefully EXIT
I like to make sure that there is not another instance of this program
running. To do this, I attempt to place a "lock" by creating
a special directory in /tmp. The function I use to do this
is CreateApplicationLock. If it fails, then the program is
already running and should not start up again. I clean up this lock
by running RemoveApplicationLock when I am done.
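A directory makes a good lock because mkdir either creates it or fails, in a single step. The sketch below illustrates that idea with hypothetical CreateLock and RemoveLock functions; the real CreateApplicationLock and RemoveApplicationLock may differ in naming, location, and cleanup.

```shell
# CreateLock DIR -- hypothetical sketch; succeeds only for the first caller
CreateLock()
{
    # mkdir either creates the directory or fails, in one atomic step
    if mkdir "$1" 2> /dev/null
    then
        echo $$ > "$1/pid"      # record the owner to help with debugging
        return 0
    fi
    return 1
}

# RemoveLock DIR -- release the lock when the work is done
RemoveLock()
{
    rm -f "$1/pid"
    rmdir "$1"
}

# Demo only: a real lock name would be fixed per application, with no $$ in it
LockDir="${TMPDIR:-/tmp}/frame_demo.lock.$$"

CreateLock "$LockDir" && FirstTry=acquired || FirstTry=refused
CreateLock "$LockDir" && SecondTry=acquired || SecondTry=refused
RemoveLock "$LockDir"

echo "$FirstTry, then $SecondTry"
```

A lock left behind by a crashed run will block later runs until it is removed by hand, so the removal belongs in whatever function runs at exit.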
The next section of code checks the ftp connection. This
is done using:
CheckRemoteLogin $RemoteMachineName $RemoteUserName \
$RemoteUserPassword 2> /tmp/check$$
The function CheckRemoteLogin attempts a login based on the
passed parameters. The file /tmp/check$$ is used to verify
that a "good" login was achieved. If there is any output
to this file, it is assumed that the login failed.
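The "any output means failure" test can be done with the -s file test, which is true when a file exists and is not empty. In this sketch, FakeLogin is a hypothetical stand-in for CheckRemoteLogin that writes to standard error only for a host named badhost.

```shell
# Hypothetical stand-in for CheckRemoteLogin: complains on stderr for "badhost"
FakeLogin()
{
    if [ "$1" = "badhost" ]
    then
        echo "Login failed." >&2
    fi
}

CheckFile="${TMPDIR:-/tmp}/check$$"

# A quiet login leaves the file empty
FakeLogin goodhost 2> "$CheckFile"
if [ -s "$CheckFile" ]
then
    GoodResult=failed
else
    GoodResult=ok
fi

# Any error text makes the file non-empty
FakeLogin badhost 2> "$CheckFile"
if [ -s "$CheckFile" ]
then
    BadResult=failed
else
    BadResult=ok
fi

rm -f "$CheckFile"
echo "goodhost: $GoodResult, badhost: $BadResult"
```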
Next, a list of files from the remote machine is created:
FileList=`GetRemoteList $RemoteMachineName $RemoteUserName \
$RemoteUserPassword $LocalLocation $RemoteLocation html`
(Note that there is no line wrap in the script.) This returns a list
of files with an html extension into the variable FileList.
This will be used to get the files:
GetRemoteFiles $RemoteMachineName $RemoteUserName \
$RemoteUserPassword $LocalLocation $RemoteLocation $FileList
The remote files are transferred to the local machine, then have
.done appended to their names. After this, a list of local files
is built for transfer to the remote machine:
FileList=`ls -1 *.html 2> /dev/null`
This list is then used to push files over to the remote system:
PutRemoteFiles $RemoteMachineName $RemoteUserName \
$RemoteUserPassword . $RemoteLocation $FileList
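Passing $FileList without quotes is what turns the captured list into one argument per file. The following sketch shows the difference, using a hypothetical CountArgs function and a scratch directory.

```shell
# CountArgs ARG... -- hypothetical helper that reports how many arguments it got
CountArgs()
{
    echo $#
}

DemoDir=$(mktemp -d)
touch "$DemoDir/a.html" "$DemoDir/b.html" "$DemoDir/notes.txt"

# Capture the list of html files, one name per line
FileList=$(cd "$DemoDir" && ls -1 *.html 2> /dev/null)

Unquoted=$(CountArgs $FileList)       # split on whitespace: one argument per file
Quoted=$(CountArgs "$FileList")       # kept together as a single argument

rm -rf "$DemoDir"
echo "unquoted: $Unquoted, quoted: $Quoted"
```

The same splitting means that a filename containing a space would arrive as two arguments, so this approach suits directories whose filenames are known to be simple.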
After PutRemoteFiles finishes copying the files over to the
remote system, it places them into a local directory called "archive".
At this point, the script has done all that it was designed to do,
and should now exit with a successful status:
# If program makes it here, then it was successful
ErrorFlag=0
echo >> $MailStatusFile
# Then go to ExitGracefully via trap
Setting the ErrorFlag at the end ensures that we actually made
it to the bottom of the script. After this normal exit, as with any
exit, control is passed to the ExitGracefully function defined
earlier.
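The ErrorFlag pattern can be sketched as follows, assuming the flag is set to a non-zero value near the top of the script (the article shows only the final assignment). RunJob is a hypothetical wrapper that runs the pattern in a subshell so that exit ends only the demo.

```shell
# RunJob MODE -- hypothetical demo of the ErrorFlag and trap pattern
RunJob()
{
    (
        ErrorFlag=1                 # assume failure until the last line is reached

        ExitGracefully()
        {
            if [ "$ErrorFlag" -eq 0 ]
            then
                echo "completed normally"
            else
                echo "stopped early"
            fi
        }
        trap ExitGracefully EXIT

        if [ "$1" = "fail" ]
        then
            exit 1                  # simulated failure partway through
        fi

        # If program makes it here, then it was successful
        ErrorFlag=0
    )
}

GoodRun=$(RunJob ok)
BadRun=$(RunJob fail) || true
echo "$GoodRun / $BadRun"
```

Because the flag only changes on the last line, the exit function can tell a normal finish from any earlier exit without each failure point having to record anything.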
Conclusion
I hope you find this set of functions and the example script helpful.
Please feel free to email questions, suggestions, and improvements
to me at the address provided.
George Callaway is a Senior Technical Architect with Emerald
Solutions, a professional services company based in Portland, OR.
His primary interests are systems and software architecture, UNIX
administration, Oracle database administration, and Java. He can
be reached at: george@georgecallaway.com.