Linux Format 24 Perl tutorial [[ TYPOGRAPHY/LAYOUT -- text surrounded by _underscore_ characters like so should be italicized or emphasized. Text indented from the margin by two or more characters is a program listing: needs monospaced typeface, with indentation and word wrap preserved. Contact me if it needs changing to fit the page. ]] HEADING: Writing an FTP client Last month we looked at CPAN, the Combined Perl Archive Network, and some of the categories of modules stored on CPAN. This month, we're going to look at how you can use some of these modules to do something complicated and annoying: namely, write a special-purpose FTP client that can 'push' a tree of files and directories up to a server. You might find such a tool useful for updating web pages held on your ISP's web server; or you might want to back up some of your personal files to another machine this way. There are, of course, alternatives. The mirror.pl program developed by the SunSite UK repository at Imperial College and freely available was designed to do this mirroring process in reverse -- but more recent versions allow uploading of directory trees. And there's the old UNIX standby, rsync, which is probably better designed and implemented -- but uses the rsh and rcp tools to do the back-end copying, which isn't much use when you're talking to an FTP server. SUBHEADING: What does it need to do? Firstly, it needs to be able to connect to a strange FTP server (using Net::FTP) in a particular directory. It needs to list the files stored there, including their last modification times. Secondly, it needs to look at a local directory tree and list the files in it, including their modification times. Thirdly, it needs to compare these two lists, and build two more lists: one of files on the local host that are newer or not present at all on the FTP server (and which therefore want to be uploaded), and a list of files on the FTP server that are newer than their local copy, or not present on the local host at all (and which therefore want to be downloaded or deleted). (Note that the clocks on your local system and the FTP server may be badly adjusted -- so it is necessary to poll the FTP server's time service and add or subtract a correcting skew factor to local file modification times. Otherwise, if the server's time is a couple of hours fast you could end up downloading an old copy of a file and overwriting your local copy, instead of uploading the really-newer local copy.) Finally, using the existing FTP connection, LookingGlass needs to create any necessary new subdirectories on the server and send new files, then delete old files from the server. This is the core of what LookingGlass needs to do -- but it's not the entire picture by a long way. If we're going to make this a genuinely useful program to other people, we need to make it configurable. Wiring in hard-coded values for things like the destination FTP server's name is no good; it might be used by someone who doesn't know Perl. So we need to enable LookingGlass to accept command-line configuration parameters, like a normal UNIX command, so we can write shell scripts containing bits like this: lookingglass --source=/home/charlie/website \ --host=www.server.org \ --root=/home/httpd/htdocs/www.thingy.com/charlie \ --uid=charlie-web \ --pass=letmein \ --nodel \ --logfile=website.log (Note: the --nodel option is the negated version of the --del option, which means "permit the deletion of files on the server".) We probably don't want to go around including passwords and usernames in world-readable and executable scripts; a configuration file mechanism makes more sense, to LookingGlass wants to first check for a file called .lookingglass (owned by the user, in their home directory), then, if it doesn't exist, for /etc/lookingglass.conf. However, the command line arguments should override directives in the user and system configuration files. No program is complete without brief online help (lookingglass --help) and a man page; this is provided in POD format, after the __END__ directive marking the end of perl code in the file. (You can tell it to print the man page using the --man option.) SUBHEADING: A matter of style Now we've worked out roughly what we want to do, how do we start writing such a program? I'm a believer in writing code that other people can maintain. (Certainly if you're run over by a bus tomorrow, your boss won't thank you for writing all those elegant but incomprehensible Obfuscated Perl Contest entries.) I also try to work on the principle that not all programmers are familiar with all aspects of idiomatic Perl -- the Perl way is distinctive and eccentric by programming standards, so avoiding really obscure features and using copious comments are good habits to cultivate. This sort of command-line tool can be structured fairly cleanly around a standard skeleton that looks like this: 1. use() statements that load lots of CPAN modules 2. Compiler pragmas such as use "strict" (turn on strict scope checking). 3. Use my() or our() to predeclare every top-level variable you're going to use. Add a comment explaining what they're for. Give them meaningful names while you're about it. 4. Set up and read any configurable parameters from configuration files. 5. Override configurable parameters using command-line options. If necessary, print help messages and exit at this point. 6. Begin logging activities/printing debugging output (if necessary). 7. Do whatever needs doing! 8. Close file handles, delete temporary working files (if any), tidy up, and exit. 9. POD documentation. Optionally, you can add a step 3a: set up a trap for signals and some basic interrupt handling. This isn't much of a consideration in this program so we'll look at it another time, when we write a TCP/IP server. But if this was going to be a widely-distributed command-line tool, it would probably be a good idea. You'll note that step (7), the actual process of doing the FTP uploads/downloads, isn't that big a deal. This isn't because it's unimportant -- quite the contrary. But because we're trying to do this properly, there are a lot of other details to take care of! If you want to write your own command-line tool -- say, a replacement for mv or cp -- you can rip the FTP code out of LookingGlass, change the command-line arguments and configuration file handling around, and find you've actually got about 70-80% of the code you need. Or you can throw all the extras away and live dangerously. It's up to you. SUBHEADING: Program configuration You'll note that a surprisingly large chunk of this program -- lines 55 to 155 -- are taken up by configuration directives. (If we include the documentation, from line 555 to 690, it amounts to close to half the program!) There's a reason for this: we want to use both global and local configuration files, and command line options, to provide a useful user interface and online help. To get at configuration files we use a not-entirely-standard CPAN module called ConfigReader::DirectiveStyle. The ConfigReader parent class is designed to read in configuration files and create an object that returns named values, so that calling $obj->value('Version'), for example, would return some value associated with the tag 'Version' in a configuration file previously loaded by calling $obj->load('filename'). ConfigReader uses child classes to specify different configuration file types; the DirectiveStyle subclass defines a style of configuration file consisting of simple "namevalue" lines. You can tailor the DirectiveStyle file format (as in lines 68-76) by specifying permissable directives that are legal in the file, along with an indicator as to whether they're optional or mandatory, or ignored (i.e. your configuration file's format for comment lines). For parsing command line arguments we use the standard Perl module Getopt::Long; this parses the command line in @ARGV, extracting GNU style long arguments (those beginning with a double-dash and running on for more than one character). The way it works is shown in lines 110 to 125; you call GetOptions() with a parameter list consisting of a hash of command-line argument specifiers, each of which points to a variable. The command-line argument specifiers are directives like "log=s" (--log, takes a string parameter) or "del!" (--del, can be negated by specifying --nodel) -- a bit like a simple variable type mechanism that specifies the name of the variable and it's type. Whatever parameter is given for an argument ends up in the variable, so an entry like ' "log=s" => \$log ' means "if you see a command line containing --log=, treat as a string and stash it in $log". --- BEGIN LISTING --- CAPTION: Here's how we set up the hostname, sourcedir, and targetroot variables. First, we introduce them to ConfigReader::DirectiveStyle, then we load a config file (either from the file $user_config or $default_config), which we parse. Finally, we use GetOptions() to override parameters loaded from the config files, or to fill in values that weren't specified. 067: my ($c) = new ConfigReader::DirectiveStyle; 068: $c->directive('HostName', undef, 'localhost'); 069: $c->directive('SourceDir', undef, `pwd`); 070: $c->directive('TargetRoot', undef, '/incoming'); 080: if (-r $user_config) { 081: $c->load($user_config); 082: } elsif(-r $default_config) { 083: $c->load($default_config); 084: } else { 085: warn "No configuration files found\n"; 086: } 090: my ($sourcedir) = $c->value("SourceDir"); 091: chomp ($sourcedir); # default directory to upload 092: my ($hostname) = $c->value("HostName"); 093: # host to upload to -- default, local 094: my ($targetroot)= $c->value("TargetRoot"); 095: # default target directory to upload into 110: GetOptions("source=s" => \$sourcedir, 111: "so=s" => \$sourcedir, 112: "host=s" => \$hostname, 113: "ho=s" => \$hostname, 114: "root=s" => \$targetroot, 115: "ro=s" => \$targetroot, --- END LISTING (See how lines 128 to 148 print out a brief help message; if the argument --help was seen by GetOptions(), the value of $help is set to something non-zero, so the message is printed.) The meat of the program starts around line 176, where we first create a Net::FTP object. This type of object is a wrapper around a simple FTP client session. Note that Net::FTP objects default to an active FTP connection; if you are behind a firewall or a masquerading system you may need to change line 176 from: my($ftp) = Net::FTP->new($hostname); to my($ftp) = Net::FTP->new($hostname, "Passive"); (Hint: you can add this to the DirectiveStyle configuration file specification and the command line as well without much difficulty. Just create a new configuration variable called something like '$passive_only' and use that to control creation of the Net::FTP object.) Lines 178 to 184 are a useful UNIXy idiom: 'stty noecho' causes stty to put your terminal into non-echoing mode, so any text you type into it will no appear on screen. This is handy if you want to capture passwords interactively. (A 100% pure-perl alternative would be to use IO::Stty or one of the CURSES modules, but there's such a thing as over-egging the pudding.) --- BEGIN LISTING --- CAPTION: Here's how we prompt the user to type in a password, and capture the password without echoing it to their terminal. (Note that if the program is killed in the middle of this process it'll leave their terminal in noecho mode!) 178: if ($pass eq "") { 179: print "Enter password for [$uid\@$hostname]:"; 180: system("stty -echo"); 181: $pass = <>; 182: chomp($pass); 183: print "\n"; 184: system("stty echo"); 185: } --- END LISTING Lines 194 to 214 check what type of server we're talking to. FTP servers can run on a variety of platforms, including MacOS, Windows, UNIX, and more obscure things such as VM/CMS, VMS, VME, and TOPS-10. The one common unifying feature of all these operating systems is that they have a different idea of how to cough up a directory listing. LookingGlass needs to parse the remote directory listing in order to work out what files live where, and when they were updated. As some of these systems have positively weird filesystem semantics (no such thing as subdirectories, or mandatory two-level-deep directory specifiers, or different ideas of how to handle relative paths, that sort of thing) LookingGlass cries "uncle!" and quits if it thinks you're trying to, make it talk to something eldritch, rather than trying to proceed and risking damage to your files. Note the frequent calls to abend(), defined in 539-553. This is a fairly crude exception handler that logs an error in the right place, closes file handles, in particular attempts to close the FTP connection, and then exits. It might be a good idea to tie $SIG{KILL} to abend(), so that typing "killall -15 LookingGlass" would do something sensible rather than leaving a dangling socket. --- BEGIN LISTING --- CAPTION: Error handling is important and usually die() is a bit too simple-minded; you'll want a subroutine that prints messages, logs whatever is going on (and closes the logfile), and terminates active network connections before it finally quits. For added fun, hook this up to a signal handler so that it deals with signals as well as runtime errors. 539: sub abend { 540: # exit, giving some kind of appropriate warning 541: # 542: my ($session, $message) = @_; 543: print $message, " at ", ctime(time), "\n"; 544: print LOG $message, " at ", ctime(time), "\n"; 545: close LOG; 546: if (ref ($session) eq "Net::FTP") { 547: $session->quit || die "abend(): Session failed to exit properly!\n"; 548: } else { 549: print "abend(): I was expecting a Net::FTP object,\n", 550: "but you passed me a [", (ref($session)|| "scalar"), "]\n\n"; 551: } 552: exit 1; 553: } --- END LISTING Lines 216 to 230 are an attempt to work out the time difference between the FTP server and the client, to prevent the sort of synchronization errors described earlier. It will fail if the FTP server isn't willing to answer time queries. In this case, an alternative -- crude -- option would be to send a zero-length file to the server and immediately do an 'ls -l' on it to see what the server thought the creation time was. This has numerous drawbacks -- not least, the delay between sending the file and reading the results of the listing, over a congested network or to a slow server -- which is why it wasn't done here. (But you may find it useful to bear in mind if you want to deploy LookingGlass in a situation where the FTP server is firewalled.) Lines 238 to 262 show a (terse and unclear) example of how you can recursively build a list of all the files in a directory tree, either locally or via a Net::FTP connection. An array of directories to scan is created; while the array exists, you pop the top element off the array, look at it (using dir_files_remote() or dir_files_local(), about which more below), and add any returned directories to the array, while adding files you found to a separate array. dir_files_remote() and dir_files_local() are convenient wrappers; each of them returns two arrayrefs, one to a list of directories, and the other to a list of files, both found in the designated directory to scan. (Actually they return a good bit more -- the equivalent of an 'ls -l' listing, along with the full file pathname appended.) dir_files_local() uses diropen(), readdir() and closedir() on the local filesystem, then calls stat() on each file in turn to acquire the details; dir_files_remote() tells the Net::FTP object to call the directory listing program on the remote server. Lines 264 to 301 compare modification times between local and remote files. It's ugly, but at the end we're left with a hash of filenames and their modification times for each server; this allows us to easily compare files between machines and work out which is more up-to-date (in Lines 304-341). (Note: if this looks big, it's because there are a lot of debugging statements embedded here. Set the value of $DBG to some number greater than 0 to see debugging output as the program runs; set it to 4 or 5 to see everything, with great verbosity. Or chop out all the lines that begin "($DBG > n) && print ... " to see how small this code really is.) It's important, before uploading files, to ensure there's somewhere to upload them to (and the same in the opposite direction). The code in lines 355-395 accomplishes this by creating hashes of valid directory names and then creating any local or remote directories that are needed and don't exist. The process of uploading local files to the server is so small you can almost miss it (lines 395-402) -- it's not a complex task to accomplish with Net::FTP. LookingGlass then creates any missing local directories (408-433) and pulls files into them if necessary (436-442). Finally, if we're deleting old remote files, lines 445-475 delete them. Note that LookingGlass removes directories from the FTP server if files have been deleted from them and an ls -l check (on line 461-463) shows that they're now empty. So what's wrong with this program? Nothing -- and lots. As noted above the clock skew handling is crude, it can't talk to non-UNIX-family FTP servers, it may need modifications to deal with firewalls, the POD documentation could be clearer, it ought to handle password hiding better, and it doesn't use CPAN.pm to automagically install any missing modules when it first loads. On the other hand, the core functionality is all there: it's an FTP mirroring script, and it works more or less as advertised. Happy hacking! END (BODY COPY) BEGIN BOXOUT: The FTP Protocol and Net::FTP FTP, the File Transfer Protocol, is one of the oldest protocols used on the internet. It defines a means for client computers to request files from a server, and how the server should behave. An FTP session starts when an FTP client opens a TCP/IP connection to the server, which it uses to send commands. In "Active" mode, the FTP client is running a mini-server; when it requests a file, the server opens a connection to the client and sends the file, leaving the control connection unaffected. In "passive" mode (which you will need to use if you're behind a firewall or a Network Address Translation -- IP masquerading -- gateway), the file is sent over the control socket. (NAT gateways prevent the server opening a secondary connection back to the client on the other side of the gateway.) The Perl Net::FTP module wraps a client-side connection up in a Perl object, of class Net::FTP. You create a new Net::FTP object, specify the server to talk to, then issue commands that are surprisingly similar to those you'll use in a command-line FTP client such as ftp or ncftp. A major difference between FTP and HTTP (used by the web) is that FTP sessions are stateful. You first open a connection, then give the server a username and password (even for an anonymous connection) to log in. Then you can issue various commands -- either UNIX-style ones such as "ls", "cd", "get" or "put", or the cross-platform command set defined within the FTP protocol (such as NLST, RETR, STOR, or PASV). The Net::FTP object usually returns an array reference containing the output of executing a command on the FTP server, or a scalar indicating success or failure. You can find the appropriate methods and their results in the Net::FTP POD documentation -- type "perldoc Net::FTP" or "man Net::FTP" to read this. A simple Net::FTP session, stripped of the extras required for LookingGlass, looks like this: my $ftp = Net::FTP->new("ftp.demon.co.uk", Passive => 1); if (ref($ftp) ne "Net::FTP") { exit "Error: $@\n"; } # login() is anonymous unless you set username and password if (! $ftp->login()) { exit "Error: Couldn't log in\n"; }; if (! $ftp->cwd("/pub/perl/CPAN/modules/by-module/Net/") ) { exit "Could not cwd(/pub/perl/CPAN/modules/by-module/Net/)\n"; }; # list files in the current directory that have 'libnet' in their name my @libnet_files = grep(/libnet/, $ftp->ls()); print "files found:\n", join("\n", @libnet_files) , "\n\n"; # fetch libnet-1.09.tar.gz, if possible if (! $ftp->get("libnet-1.09.tar.gz")) { exit "Unable to fetch libnet-1.09.tar.gz\n"; }; print "Look in the current directory for libnet-1.09.tar.gz!\n"; $ftp->close(); END BOXOUT (The FTP Protocol and Net::FTP) BEGIN PROGRAM LISTING 001: #!/usr/bin/perl 002: # 003: # LookingGlass 004: # 005: # ftp batch upload client 006: # 007: # See __END__ for pod documentation 008: # 009: #---------------------------------- 010: # Version Changes 011: # --------------------------------- 012: # 0.1 First alpha release 013: # 014: # -------------------------------- compiler pragmas and modules 015: 016: use strict; 017: 018: use ConfigReader::DirectiveStyle; 019: use File::Basename; 020: use File::Copy; 021: use Getopt::Long; 022: use Net::FTP; 023: use Net::Time; 024: use Time::CTime; 025: use Time::ParseDate; 026: use Pod::Text; 027: 028: # ------------------------------ main program variables 029: 030: my (0.000000ound) = {}; # hash of files found on server 031: my (@files) = (); # list of files found on server 032: my ($files) = ""; # pointer to temp. array of files 033: my ($dirs) = ""; # pointer to temp. array of directories 034: my($arg) = ""; # scratch filename container for loops 035: my (@dirstack) = (); # stack of directories 036: my ($target) = ""; # current directory to traverse 037: my (@remote_files) = (); # array of refs to arrays containing stat() info 038: my (@local_files) = (); # array of refs to arrays containing stat() info 039: my (@local_dirs) = (); # array of (non-unique) local directories 040: my (@remote_dirs) = (); # array of (non-unique) remote directories 041: my (%remote_dirs) = {}; # hash of remote dirnames 042: my ($local_dirs) = {}; # hash of local dirnames 043: my (%remfiles) = {}; # hash of remote filenames; val = mtime 044: my (0calfiles) = {}; # hash of local filenames; val = mtime 045: my (@uploads) = (); # list of files to upload 046: my (@downloads) = (); # list of files to download 047: my (@deletions) = (); # list of files to delete from remote server 048: my (@gets) = (); # list of files to fetch from remote server 049: my ($remtime) = 0; # remote server time 050: my ($loctime) = 0; # local host time 051: my ($skew) = 0; # clock skew correction coefficient 052: my ($val) = ""; # hash iteration scratch val 053: my ($noremote) = 0; # flag: if 1, there are no files on remote 054: my ($man) = ""; # flag: used to print man page 055: 056: # ---------------------------- set up configuration variables - default 057: 058: my ($default_config) = 059: "/etc/lookingglass.conf"; # default config file 060: my ($user_config) = (getpwuid($<))[7] . ".lookingglass"; 061: # user's config file 062: 063: fileparse_set_fstype(); # initialize the local filesystem type 064: 065: # ------------------------------ setup config file options 066: 067: my ($c) = new ConfigReader::DirectiveStyle; 068: $c->directive('HostName', undef, 'localhost'); 069: $c->directive('SourceDir', undef, `pwd`); 070: $c->directive('TargetRoot', undef, '/incoming'); 071: $c->directive('UID', undef, 'ftp'); 072: $c->directive('Password', undef, 'user@host'); 073: $c->directive('DelRemote', undef, 0); 074: $c->directive('GetRemote', undef, 0); 075: $c->directive('Debug', undef, 0); 076: $c->directive('Logfile', undef, '/tmp/lookingglass.log'); 077: 078: # ------------------------------ read and parse config file 079: 080: if (-r $user_config) { 081: $c->load($user_config); 082: } elsif(-r $default_config) { 083: $c->load($default_config); 084: } else { 085: warn "No configuration files found\n"; 086: } 087: 088: # ------------------------------ assign config items to variables 089: 090: my ($sourcedir) = $c->value("SourceDir"); 091: chomp ($sourcedir); # default directory to upload 092: my ($hostname) = $c->value("HostName"); 093: # host to upload to -- default, local 094: my ($targetroot)= $c->value("TargetRoot"); 095: # default target directory to upload into 096: my ($uid) = $c->value("UID"); 097: # default to anonymous ftp 098: my ($pass) = $c->value("Password"); 099: my ($help) = ""; 100: my ($delremotefiles) = $c->value("DelRemote"); 101: # remote file deletion flag 102: my ($getremotefiles) = $c->value("GetRemote"); 103: # remote file get flag 104: my ($logfile) = $c->value("Logfile"); 105: # remote file to log to 106: my($DBG) = $c->value("Debug"); # debug level 107: 108: # ------------------------------ override config from command line 109: 110: GetOptions("source=s" => \$sourcedir, 111: "so=s" => \$sourcedir, 112: "host=s" => \$hostname, 113: "ho=s" => \$hostname, 114: "root=s" => \$targetroot, 115: "ro=s" => \$targetroot, 116: "uid=s" => \$uid, 117: "pw=s" => \$pass, 118: "lo=s" => \$logfile, 119: "logfile=s" => \$logfile, 120: "del!" => \$delremotefiles, 121: "get!" => \$getremotefiles, 122: "debug=i" => \$DBG, 123: "help!" => \$help, 124: "man" => \$man); 125: 126: # ----------------------------- issue help message, if needed 127: 128: if ($help > 0) { 129: print "\nLookingGlass 0.1\n", 130: "Recursively send a tree of files up to a server via ftp\n", 131: "overwriting and deleting the old tree along theway.\n", 132: "\nOptions:\n", 133: "--source=, --so= Source directory for files to send\n", 134: "--host=, --ho= Hostname to upload files to\n", 135: "--root=, --ro= Directory to root files in on server\n", 136: "--logfile=, --lo= Filename to log session to\n", 137: "--uid= User ID to log into ftp server with\n", 138: "--pw= Password to use on server\n", 139: "--del Delete remote files if local has gone\n", 140: "--get Get remote file if newer than local\n", 141: "--debug= Set debugging output (n = 1...4)\n", 142: "--help This message\n", 143: "--man Print the man page\n\n", 144: "WARNING: do not use the --pw option if you are working on a\n", 145: "machine with other people logged in! They can use the ps(1)\n", 146: "command to capture your password! If you don't specify a password\n", 147: "on the command line, LookingGlass will prompt for one.\n\n"; 148: exit 0; 149: } 150: if ($man > 0) { 151: my $parser = Pod::Text->new(); 152: $parser->parse_from_file($0, "-"); 153: exit; 154: } 155: 156: # ----------------------------- move to base dir and start logging 157: 158: my ($name, $basedir, $suffix) = fileparse($sourcedir); 159: chdir($basedir) || die "Could not enter $basedir!\n"; 160: 161: if ($logfile ne "") { 162: open (LOG, ">>$logfile") || warn "Could not append log info to $logfile\n"; 163: } else { 164: open (LOG, ">-") || warn "Could not attach LOG handle to STDOUT\n"; 165: } 166: 167: print LOG "\n", "new LookingGlass session started at ", ctime(time), "\n", 168: "source => $sourcedir\n", 169: "host => $hostname\n", 170: "root => $targetroot\n", 171: "uid => $uid\n", 172: "pw => $pass\n", "\n"; 173: 174: # ----------------------------- create new FTP session object; get password 175: 176: my($ftp) = Net::FTP->new($hostname); 177: 178: if ($pass eq "") { 179: print "Enter password for [$uid\@$hostname]:"; 180: system("stty -echo"); 181: $pass = <>; 182: chomp($pass); 183: print "\n"; 184: system("stty echo"); 185: } 186: 187: $ftp->login($uid, $pass) || abend ($ftp, 188: "ftp->login($uid) failed"); 189: 190: $ftp->binary() && print LOG "mode set to binary\n"; 191: 192: # ----------------------------- check that we're not talking to an alien 193: 194: my ($syst_type) = $ftp->quot("SYST"); 195: my ($syst_response) = ""; 196: ($DBG > 3) && print "ftp hash contains [", join('][', keys %{*$ftp}), "]\n"; 197: my ($key, $value); 198: while (($key, $value) = each %{*$ftp}) { 199: if ($key eq "net_cmd_resp") { 200: my (@tmp_arr) = @$value; 201: chomp(@tmp_arr); 202: $syst_response = join(" ", @tmp_arr); 203: } 204: } 205: 206: if ($syst_type == 2) { 207: if ($syst_response !~ /unix/i) { 208: if ($syst_response !~ /not understood/i) { 209: abend($ftp, "ftp->quot(SYST) returned $syst_type: [$syst_response]"); 210: } 211: } 212: } else { 213: abend($ftp, "ftp->quot(SYST) returned $syst_type"); 214: } 215: 216: # ----------------------------- avoid clock skew 217: 218: # note: this only needs to be accurate to within 60 seconds -- granularity of 219: # time stamps returned by ftp->ls(). 220: $loctime = Net::Time::inet_time('localhost', 'tcp'); 221: $remtime = Net::Time::inet_time($hostname, 'tcp') || 222: Net::Time::inet_time($hostname, 'udp') || undef; 223: 224: ($DBG > 1) && print "inet_time($hostname) is [$remtime]\n"; 225: ($DBG > 1) && print "inet_time(localhost) is [$loctime]\n"; 226: 227: if (defined ($remtime)) { 228: $skew = $loctime - $remtime; 229: ($DBG > 1 ) && print "clock skew adjustment coefficient: [$skew]\n"; 230: } 231: 232: # ----------------------------- run transfer session 233: 234: # 235: # First, work out what files we have locally and on the remote server 236: # 237: 238: ($DBG > 1) && print "scanning remote files using dir_files_remote()\n"; 239: push (@dirstack, $targetroot); 240: while ($target = pop(@dirstack)) { 241: $ftp->ls($target) || next; 242: ($dirs, $files) = dir_files_remote($ftp, $target); 243: push(@dirstack, @$dirs); 244: push(@remote_dirs, @$dirs); 245: foreach $target (@$files) { 246: # print LOG "deleting ftp://$hostname$target\n"; 247: push(@remote_files, $target); 248: } 249: } 250: 251: ($DBG > 1) && print "scanning local files using dir_files_local()\n"; 252: @dirstack = (); 253: @dirstack = ($sourcedir); 254: while ($target = pop(@dirstack)) { 255: ($dirs, $files) = dir_files_local($target); 256: push(@dirstack, @$dirs); 257: push(@local_dirs, @$dirs); 258: foreach $target (@$files) { 259: # print LOG "deleting ftp://$hostname$target\n"; 260: push(@local_files, $target); 261: } 262: } 263: 264: # status: @remote_files contains remote files on server; @local_files 265: # contains file on local system. Now we need to do the comparison by 266: # mtime field (field #9 (index 0) of the stat() structure in $$local_files 267: # and date encoded in fields 5-6-7 of the $$remote_file items. 268: 269: # We add $skew to mtime on the local host to correct it relative to the 270: # remote values. 271: 272: # Now let's build a table of remote files; key = file pathname, val = mod time 273: 274: if (scalar(@remote_files) != 0 ) { 275: my ($maxsub) = scalar(@{$remote_files[1]}) -1; 276: foreach $arg (@remote_files) { 277: next if (! defined $arg->[$maxsub]); 278: my ($key) = $arg->[$maxsub]; 279: $key =~ s/$targetroot//; 280: my ($val) = $arg->[7] . " " . $arg->[6] . " " . $arg->[5]; 281: $val = parsedate($val); 282: $val += $skew; 283: $remfiles{$key} = $val; 284: ($DBG > 3) && print "$key => $val\n"; 285: } 286: } else { 287: # remote dir is empty 288: $noremote = 1; 289: } 290: 291: # now do the same for local files 292: 293: ($DBG > 3) && print "\n\nsourcedir is $sourcedir\n\n"; 294: my ($maxsub) = scalar(@{$local_files[1]}) -1; 295: foreach $arg (@local_files) { 296: my ($key) = $arg->[$maxsub]; 297: $key =~ s/$sourcedir//; 298: my ($val) = $arg->[9]; 299: $localfiles{$key} = $val; 300: ($DBG > 3) && print "$key => $val\n"; 301: } 302: 303: # build list of files to upload to server, list of dirs to create on server 304: 305: while (($key, $val) = each (0calfiles)) { 306: if (! defined ($remfiles{$key}) ) { 307: # local file doesn't exist in remote tree, so send it 308: push(@uploads, $key); 309: ($DBG > 2) && print "Upload new file: $key\n"; 310: } elsif ($localfiles{$key} > $remfiles{$key}) { 311: # local date more recent than remote date, so send it 312: push(@uploads, $key); 313: ($DBG > 1) && print "Update new file: $key\n"; 314: ($DBG > 2) && print "\tlocal time: $localfiles{$key}\n", 315: "\tremote time: $remfiles{$key}\n"; 316: } 317: } 318: 319: # build lists of files to delete or download from server 320: 321: if (! $noremote) { 322: while (($key, $val) = each (%remfiles)) { 323: if (!defined ($localfiles{$key})) { 324: if ($delremotefiles > 0) { 325: # file no longer exists in local tree, so delete it 326: push(@deletions, $key); 327: ($DBG > 2) && print "Delete defunct file: $key\n"; 328: } 329: } 330: if (( -f "$sourcedir$key") && 331: ($remfiles{$key} > $localfiles{$key})) { 332: if ($getremotefiles > 0) { 333: # remote file is more recent than local file, so get it 334: push(@gets, $key); 335: ($DBG > 2) && print "Get remote file: $key\n"; 336: ($DBG > 3) && print "local time: $localfiles{$key} ", 337: "remote time: $remfiles{$key}\n"; 338: } 339: } 340: } 341: } 342: 343: # warn if nothing to do 344: 345: if ((scalar(@uploads) == 0) && 346: (scalar(@gets) == 0) && 347: (scalar(@deletions) == 0)) { 348: ($DBG > 0) && print 'scalar(@uploads)= ', scalar(@uploads), "\n", 349: 'scalar(@gets) = ', scalar(@gets), "\n", 350: 'scalar(@deletions) = ', scalar(@deletions), "\n"; 351: abend ($ftp, "Absolutely no files tagged for upload, download, or deletion!"); 352: } 353: 354: 355: # use @remote_dirs and @local_dirs to build unique hash of valid dirnames 356: # on both sites, so we can use defined($remote_dirs{foo}) to see if a 357: # directory called "foo" exists on the remote host. 358: 359: %remote_dirs = {}; # hash of directory names; val = irrelevant 360: $local_dirs = {}; # hash of directory names; val = irrelevant 361: 362: @remote_dirs = map {$remote_dirs{$_}++ } @remote_dirs; 363: @local_dirs = map {$local_dirs{$_}++ } @local_dirs; 364: 365: if ($DBG > 2) { 366: print "Remote dirs:\n"; 367: foreach (sort keys %remote_dirs) { 368: print "$_\n"; 369: } 370: print "Local dirs:\n"; 371: foreach (sort keys $local_dirs) { 372: print "$_\n"; 373: } 374: } 375: 376: print LOG "updating files:\n"; 377: 378: # create non-existent remote directories using $local_dirs 379: 380: foreach (sort keys $local_dirs) { 381: $_ =~ s/$sourcedir/$targetroot/; 382: if (! defined $remote_dirs{$_}) { 383: ($DBG > 2) && print "Directory $_ not present on remote server\n"; 384: # $_ =~ s/$targetroot\///; 385: ($DBG > 1) && print "mkdir($_, RECURSE) ..."; 386: if ($ftp->mkdir($_, 1) ) { 387: ($DBG > 1 ) && print " FAILED!\n"; 388: } else { 389: ($DBG > 1 ) && print " OK\n"; 390: } 391: } 392: } 393: 394: # upload files to remote server 395: 396: while ($target = pop(@uploads)) { 397: ($DBG > 1 ) && print "processing upload($target)\n"; 398: my ($src) = "$sourcedir$target"; 399: my ($dst) = "$targetroot$target"; 400: $ftp->put($src, $dst) && print LOG "$src -> $dst\n"; 401: print "$src -> $dst\n"; 402: } 403: 404: if ($getremotefiles > 0) { 405: # if get is set, 406: # create non-existent local dirs using %remote_dirs 407: 408: foreach (sort keys %remote_dirs) { 409: $_ =~ s/$targetroot/$sourcedir/; 410: if (! defined $local_dirs{$_}) { 411: ($DBG > 2) && print "Directory $_ not present on local host\n"; 412: $_ =~ s/$sourcedir\///; 413: my (@dstack) = split("/", $_); 414: my ($thisdir) = ""; 415: my ($buff) = $sourcedir; 416: ($DBG > 2) && print "dstack: [", join("][", @dstack), "]\n"; 417: foreach $thisdir (@dstack) { 418: # next if ($thisdir eq ""); 419: $buff .= "/" . $thisdir ; 420: print "buff is $buff\n"; 421: if (! -d $buff) { 422: ($DBG > 2) && print "mkdir($buff) ... "; 423: if (mkdir($buff, 0755) == 0) { 424: print "ERROR!\n"; 425: } else { 426: print "OK\n"; 427: } 428: } else { 429: print "something called $buff exists\n"; 430: } 431: } 432: } 433: } 434: # get remote files 435: 436: while ($target = pop(@gets)) { 437: ($DBG > 1 ) && print "processing download($target)\n"; 438: my ($src) = "$targetroot$target"; 439: my ($dst) = "$sourcedir$target"; 440: $ftp->get($src, $dst) && print LOG "$src -> $dst\n"; 441: print "$src -> $dst\n"; 442: } 443: } 444: 445: if ($delremotefiles > 0) { 446: my (%touched_dirs) = {}; # table of dirs we deleted files from 447: # if del is set, 448: # delete remote files 449: while ($target = pop(@deletions)) { 450: ($DBG > 1 ) && print "processing deletion($target)\n"; 451: my ($src) = "$targetroot$target"; 452: $ftp->delete($src) && print LOG "deleted($src)\n"; 453: print "deleted($src)\n"; 454: my ($f, $b, $ex) = fileparse($target); 455: $touched_dirs{$b}++; 456: ($DBG > 2) && print "touched_dirs{$b}++\n"; 457: } 458: # if directories are emptied by this, remove them 459: my ($deldir) = ""; 460: my (@ls) = (); 461: foreach $deldir (keys %touched_dirs) { 462: $deldir =~ s/\/$//; 463: @ls = $ftp->ls("$targetroot$deldir"); 464: ($DBG > 3) && print "dir $targetroot$deldir; contents:\n [", 465: join("][", @ls), "]\n"; 466: if (scalar(@ls) == 0) { 467: ($DBG > 1) && print "rmdir($targetroot$deldir) ... "; 468: if ($ftp->rmdir("$targetroot$deldir")) { 469: ($DBG > 1) && print "OK\n"; 470: } else { 471: ($DBG > 1) && print "Failed!\n"; 472: } 473: } 474: } 475: } 476: 477: 478: print LOG "Session finished at ", ctime(time), "\n"; 479: $ftp->quit || die "ftp session failed to exit properly!\n"; 480: exit 0; 481: 482: #--------------------------------- end of main program -- subroutines follow 483: 484: sub dir_files_remote { 485: # build an array containing a LONG listing of all files on the remote host 486: # format is an array of arrays; each file array consists of ls -l output 487: # with file's pathname appended as final field 488: my ($f) = shift @_; 489: my ($d) = shift @_; 490: my (@files) = (); 491: my (@dirs) = (); 492: my ($listing) = ""; 493: 494: $listing = $f->dir($d) || abend ($f, "f->dir($d) failed"); 495: if (ref ($listing) ) { 496: foreach $_ (@{$listing}) { 497: if (/^drwx/) { 498: my ($fn) = $_; 499: my (@fvec) = split(" ", $_); 500: if ($fvec[$#fvec] !~ /\.+/) { 501: push(@dirs, "$d/$fvec[$#fvec]"); 502: } 503: # push (@files, "$d/$fn); 504: } else { 505: my (@st) = split(" ", $_); 506: push(@st, "$d/$st[$#st]"); 507: push(@files, \@st); 508: } 509: } 510: } 511: return (\@dirs, \@files); 512: } 513: 514: sub dir_files_local { 515: # build an array containing a LONG listing of all files on the local host 516: my ($d) = shift @_; 517: my (@files) = (); 518: my (@dirs) = (); 519: my (@contents) = (); 520: if ( -d $d && -r _ && -x _ ) { 521: opendir (DIR, $d) || warn "could not open $d\n"; 522: @contents = readdir DIR; 523: closedir(DIR); 524: foreach (@contents) { 525: if (-d "$d/$_") { 526: if ($_ !~ /^\.+/) { 527: push (@dirs, "$d/$_"); 528: } 529: } elsif ( -f "$d/$_" ) { 530: my (@st) = stat("$d/$_"); 531: push(@st, "$d/$_"); 532: push (@files, \@st); 533: } 534: } 535: } 536: return \@dirs, \@files; 537: } 538: 539: sub abend { 540: # exit, giving some kind of appropriate warning 541: # 542: my ($session, $message) = @_; 543: print $message, " at ", ctime(time), "\n"; 544: print LOG $message, " at ", ctime(time), "\n"; 545: close LOG; 546: if (ref ($session) eq "Net::FTP") { 547: $session->quit || die "abend(): Session failed to exit properly!\n"; 548: } else { 549: print "abend(): I was expecting a Net::FTP object,\n", 550: "but you passed me a [", (ref($session)|| "scalar"), "]\n\n"; 551: } 552: exit 1; 553: } 554: 555: #----------------------------------- POD documentation follows 556: 557: __END__ 558: 559: =head1 NAME 560: 561: LookingGlass - batch ftp pusher client 562: 563: The opposite of mirror.pl; push a tree of directories up to an ftp server. 564: 565: =head1 Configuration 566: 567: Supports configuration via a configuration file and command line switches. 568: 569: =head2 Configuration file 570: 571: The following options are available: 572: 573: =over 4 574: 575: =item HostName 576: 577: (Default: localhost) Specifies the hostname of the ftp server to connect to 578: 579: =item SourceDir 580: 581: (Default: `pwd`) Specifies the root of the directory tree to upload 582: 583: =item TargetRoot 584: 585: (Default: '/incoming') Specifies the directory to upload SourceDir into 586: 587: =item UID 588: 589: (default: 'ftp') Specifies the User ID to log into the ftp server under 590: 591: =item Password 592: 593: (default: 'user@host') Specifies the password for User ID to log in with 594: 595: =item LogFile 596: 597: (default: '/tmp/lookingglass.log') Name of file to log session to 598: 599: =item DelRemote 600: 601: (default: 0) If non-zero, indicates that LookingGlass should delete 602: files on the server that no longer exist under SourceDir 603: 604: =item GetRemote 605: 606: (default: 0) If non-zero, indicates that LookingGlass should download 607: files from the server that are newer than the local copy, or that do not 608: exist locally 609: 610: =item Debug 611: 612: (default: 0) Enables debugging output of increasing verbosity when set 613: in the range 1 to 4 614: 615: =back 616: 617: =head2 Command Line Options 618: 619: Run LookingGlass with the --help option; it will print out all currently 620: recognized options then exit. (These correspond to the configuration 621: file options described above.) 622: 623: Command line parameters I options read from the configuration file. 624: 625: If no password is specified in the configuration file or on the command 626: line, LookingGlass will stop and prompt for one. 627: 628: =head1 DESCRIPTION 629: 630: LookingGlass is designed to facilitate the easy uploading of an entire 631: directory tree to an ftp server. In this way, it is the opposite of 632: mirror, which is designed to facilitate the easy downloading of an 633: entire directory tree from a server. 634: 635: (It was written for batch uploading of HTML and related files to a web 636: server with an ftp server running on top of the public_html 637: directories.) 638: 639: =head1 Known defects 640: 641: =over 4 642: 643: =item 1 644: 645: LookingGlass uploads a file if the local file creation time is later 646: than the remote file creation time. It makes no allowances for clock 647: skew or time zone differences (in this version). (Note: this is top of 648: the to-do list.) 649: 650: =item 2 651: 652: LookingGlass assumes that the ftp account is correctly set up; in 653: particular, that B, B, and a few other commands are working. 654: If they aren't, you're out of luck (in this release, at least). 655: 656: =item 3 657: 658: LookingGlass assumes that the remote server understands UNIX pathnames. 659: It also assumes that it is running on a UNIX-like system. (Developed on 660: Linux, to be precise.) It does a cursory check on the remote server 661: using SYST, to ensure it's not an alien, but if the server I an 662: alien with eldritch pathnames and doesn't implement SYST you're 663: in the Twilight Zone. 664: 665: =item 4 666: 667: Using LookingGlass you can very easily B by 668: accident! Be extremely cautious using this tool, and practice first with 669: a copy of your source directory tree on a safe area of your ftp server! 670: 671: =back 672: 673: LookingGlass relies on Perl 5.002, the standard module library, and the 674: following additional modules, without which it will not run: 675: 676: =over 8 677: 678: =item ConfigReader 679: 680: =item Net::FTP (from libnet 1.01) 681: 682: =item Time::CTime and Time::ParseDate (from Time-modules and TimeDate) 683: 684: =back 685: 686: These modules are all available from CPAN. 687: 688: 689: =cut END PROGRAM LISTING