Proxy FTP without the Browser
Robert Chuba and Anthony Caruso
Consider a small, unregarded network sitting in protected
RFC 1918 IP space behind a firewall. Direct access to the
Internet is not allowed; instead, Web access is provided by an HTTP
proxy server. While this meets most of the office's needs, downloading
files from an ftp site is only possible through the browser.
Writing an ftp script to periodically download files, such
as virus .dat files, is impossible. Instead of opening a
port on the firewall for ftp and breaking the rules just
for us, we decided to try something else.
Knowing we could access the ftp site via a browser, we
considered how the browser manages requests for ftp://ftp.somesite.com
when typed into the location text box. Further, we wondered if we
could use the same mechanism via a script to make HTTP GET
requests through the proxy. Listing 1, pftp.pl, or proxy
ftp, is the result of this exercise. (Listings are available
for download from the Sys Admin Web site: http://www.sysadminmag.com.)
Our script relies on the UserAgent module from LWP (libwww-perl,
the library for Web access in Perl). LWP is a collection of modules
that provides an API to Web functions; it is included with most Linux
distributions and can also be downloaded from: http://www.linpro.no/lwp/.
The UserAgent class implements simple WWW requests. Our script uses
its functions to make ftp requests instead of http
requests. Other than that, and some user interface options, the
script is essentially the same as the example in the UserAgent documentation.
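As a point of reference, here is a minimal sketch of that
documentation-style approach adapted to fetch one file from an ftp://
URL through an HTTP proxy. The proxy address, port, and URL below are
placeholders, not values taken from pftp.pl:

#!/usr/bin/perl -w
use strict;
use LWP::UserAgent;
use HTTP::Request;

my $ua = LWP::UserAgent->new;
$ua->env_proxy;                                       # honor http_proxy/ftp_proxy if set
$ua->proxy(['http', 'ftp'], 'http://10.1.1.1:8080/'); # send ftp requests to the HTTP proxy

my $req = HTTP::Request->new(GET => 'ftp://ftp.somesite.com/pub/docs/README');
my $res = $ua->request($req);

if ($res->is_success) {
    print $res->content;
} else {
    print STDERR "Request failed: ", $res->status_line, "\n";
}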
Usage
The script has two modes of operation. It can be used from the
command line to ftp a file:
pftp.pl -u ftp.somesite.com/pub/docs/README
or to get a couple of files:
pftp.pl -u ftp.somesite.com/pub/docs -r README,freecode.tar.gz,morecode.tar.Z
pftp.pl -help lists the command-line options.
The script can also be used from a cron job. Though running a
script from cron is nothing new, we decided to place default settings
in the data section so the defaults stay with the script. That is, we
wanted the ability to rename the script to correspond with its
function. For example, we created a copy of the script named pftp-av.pl
and changed the data section to read:
__DATA__
proxyport = 8080
ftpuser = anonymous
ftppwd=us@oursite.com
proxy = 10.1.1.1
url=ftp.antivirus.com/pub/newdatfiles
remotefiles = new.dat,update.txt,ver.txt
localpath=\\avserver\distribution\
This downloads the three files listed in the remotefiles parameter
from our fictitious antivirus distributor and places them on
our antivirus server in the distribution share.
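To run pftp-av.pl nightly from cron, an entry along the following
lines would work (the path and schedule are only examples):

# crontab entry: pull new .dat files every night at 2:00 a.m.
0 2 * * * /usr/local/bin/pftp-av.pl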
The Script
The first few lines of the script tell the interpreter which modules
we will be using. The only notable module is Cwd,
which we use for platform independence. The next block of
code simply processes the command line; the usage message
is issued if the arguments to the script are incorrect.
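In rough form, that block looks something like the sketch below. The
option names are inferred from the parameters discussed in this
article, and usage() stands in for whatever routine prints the usage
message:

use Getopt::Long;

my %opt;
GetOptions(\%opt,
    'url|u=s',          # -u ftp.somesite.com/pub/docs/README
    'remotefiles|r=s',  # -r README,freecode.tar.gz
    'proxy=s', 'proxyport=s',
    'ftpuser=s', 'ftppwd=s',
    'localpath=s',
    'help')
    or usage();
usage() if $opt{help};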
The while() block that follows processes the __DATA__
section to determine default settings. The data section can
contain any of the named parameters listed in the GetOptions()
call in the command-line processing block. For example, -proxy
is an argument to the script, so the data section can hold a default
proxy with the entry:
proxy = 10.1.1.1 # default proxy
Lines beginning with a #, blank lines, and white space are ignored.
Comments can also follow an entry and are removed by the line:
my($right,$comment) = split /#/,$r,2;
where $r holds everything to the right of the = character
for that entry. All parameter names (the left-hand sides) are converted
to lowercase so they can be processed later.
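Put together, the defaults loop amounts to something like the following
sketch (a %default hash is assumed here; the actual listing may differ
in detail):

my %default;
while (my $line = <DATA>) {
    chomp $line;
    next if $line =~ /^\s*(#|$)/;               # skip comments and blank lines
    my ($left, $r) = split /=/, $line, 2;
    next unless defined $r;                     # not a name = value entry
    my ($right, $comment) = split /#/, $r, 2;   # drop a trailing comment
    for ($left, $right) { s/^\s+//; s/\s+$//; } # trim white space
    $default{lc $left} = $right;                # parameter names are lowercased
}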
The subsequent block sets the script's options. Anything
set on the command line overrides the defaults. Finally, a bit of
processing standardizes the URL entry. Remote files to be downloaded
are put into the list @rfiles. If -remotefiles is
not specified, we assume the file name is the last component of
the URL; it is stripped off the URL and loaded into
@rfiles.
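In sketch form, with variable names that are our assumptions rather
than the listing's, that step looks like:

my $url = $opt{url};                            # e.g., ftp.somesite.com/pub/docs/README
$url =~ s{^ftp://}{};                           # standardize: drop any ftp:// prefix
my @rfiles;
if ($opt{remotefiles}) {
    @rfiles = split /,/, $opt{remotefiles};     # -r file1,file2,...
} else {
    # No -remotefiles: treat the last component of the URL as the file name.
    my ($base, $file) = $url =~ m{^(.*)/([^/]+)$};
    $url    = $base;                            # ftp.somesite.com/pub/docs
    @rfiles = ($file);                          # README
}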
Now that everything is prepared, we instantiate the UserAgent
object, call env_proxy(), and set the proxy
for both http and ftp. Looping through the list @rfiles, the
script appends each file name to the URL and creates a new HTTP::Request.
If all goes well, the files are downloaded into the specified directory.
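Continuing the sketches above (again, the variable names are our
assumptions), the core of the script reduces to roughly:

use LWP::UserAgent;
use HTTP::Request;

my $ua = LWP::UserAgent->new;
$ua->env_proxy;
$ua->proxy(['http', 'ftp'], "http://$opt{proxy}:$opt{proxyport}/");

foreach my $file (@rfiles) {
    my $req = HTTP::Request->new(GET => "ftp://$url/$file");
    # Passing a file name as the second argument to request() saves
    # the response body into that file.
    my $res = $ua->request($req, "$opt{localpath}$file");
    warn "Failed to get $file: ", $res->status_line, "\n"
        unless $res->is_success;
}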
Future Work
Although this script meets our current production needs, we still
want to improve it. Features to be added when our copious free time
comes to fruition include improved error-checking, the ability
to download entire directories, and a mechanism for authenticating
to a proxy. Actually, the last item isn't difficult; we didn't
implement it because we couldn't test it -- our
proxy doesn't require logins. If you'd care to improve
our script, please drop us a copy -- we'd like to see what
you do.
Resources
Perl for Win32 -- http://www.activestate.com
Tony Caruso has been a Network/Systems consultant with MRE Consulting,
Inc. in Houston, TX for the last three years and has been hacking
code since he was 12 (on the Apple II). Tony has been working with both
NT and UNIX for 8 years. He holds an MS in Computer Science from
the University of Oklahoma and a BS in Electrical Engineering from
Louisiana Tech University.
Robert Chuba is a consultant for MRE. He is a systems analyst
with 2 years of experience in Novell and 3 years in NT. Robert is working
on his BS at the University of Houston.