Linux Format 36 Perl Tutorial:
///TITLE: Going further with Perl
///STRAP: Charlie Stross looks at further sources of Perl information
[[ TYPOGRAPHICAL NOTE: text enclosed in ///BEGIN CODE and ///END CODE is a code listing. Text in the body copy surrounded by _underscores_ (thus) is italicised or emphasized -- but _not_ in the code listings ]]
///BEGIN BODY COPY
///SUBHEADING: Where to go to start learning
Perl is a big, gnarly language; if you include the core modules it's probably comparable in size and complexity to Java or C++, and has a similar learning curve if you intend to become proficient. A one-week training course won't even scratch the surface -- getting to the stage where you can develop complex applications effectively in Perl probably requires six to twelve months of full-time work. Luckily, Perl starts off easy; a one-line "Hello, World" program in Perl really _is_ one line (as opposed to the huge lumps of class declarations and wrappers you need to cough up in Java or C++).
Because much of Perl's syntax is shared with other UNIX-family languages, it is possible to slip into Perl sideways. When I first started trying to use Perl seriously in 1993, I had mainly programmed in the Bourne shell and Awk, with some C on the side. From that background, Perl 4 (basically Perl 5 without lexical scoping and objects) came naturally -- it was possible to become effective, certainly to the level of re-writing basic shell/awk scripts in Perl, within a day or so.
In fact, if you're intimidated by the size of Perl, you may find this the best way to get a handle on it: learn the UNIX shells (in particular Bash), read "The AWK Programming Language" by Aho, Kernighan and Weinberger, write a couple of small awk scripts ... and Perl will seem eerily familiar. (Perl is such a comprehensive superset of awk that there's an awk-to-perl translator, a2p, in the core Perl distribution that translates awk programs into idiomatic Perl 4 almost perfectly.)
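To give a flavour of that family resemblance, here's a minimal sketch of a typical awk job (totalling the second column for each value in the first) written in Perl. The sample data and the sub name sum_by_key are made up for illustration, and this is hand-written code rather than the output of the translator.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Rough awk equivalent:
#   awk '{ total[$1] += $2 } END { for (k in total) print k, total[k] }'
sub sum_by_key {
    my @lines = @_;
    my %total;
    for my $line (@lines) {
        # split ' ' mimics awk's whitespace field splitting
        my ($key, $value) = split ' ', $line;
        next unless defined $value;    # skip blank or one-field lines
        $total{$key} += $value;
    }
    return %total;
}

# Hypothetical input; a real script would read lines from a file or STDIN.
my @sample = ("apples 3", "pears 5", "apples 4");
my %total  = sum_by_key(@sample);
print "$_ $total{$_}\n" for sort keys %total;
```

The hash plays the role of awk's associative array, and the loop body is close to a line-for-line transcription of the awk rule.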
The other approach to learning Perl is to attack it as a first language, without reference to its roots. This is probably faster if you don't have the UNIX background, but won't give you a feel for Perl idiom, much of which is UNIX-specific and more or less intuitive if you've worked with other UNIX shells and languages. There are a lot of books about learning Perl as a first language, and there's no substitute for browsing a bookshelf in your local computer bookshop to find one that you can get along with. A couple of general words of warning apply: there's a tendency among some publishers to try to sell books by the kilogram, and weight or page count does not equate to quality. If you see a tutorial or introductory book with more than three authors listed, or more than five hundred pages, then it's probably an omnibus assembled in a hurry, and the editorial or quality control oversight may be suspect.
For many years the canonical book on learning Perl was "Learning Perl" by Randal Schwartz (pub. O'Reilly and Associates, 3rd edition, ISBN 0-596-00132-0). It's not particularly huge, and this is a good thing: it reduces Perl to an approachable scale, and the single author is one of the Perl ancients and knows his subject well.
When learning a language, there comes a point at which a tutorial isn't enough. In addition to an introduction to a language construct or concept, you need an exhaustive reference. For Perl, the only canonical reference is the Camel book, "Programming Perl" (pub. O'Reilly and Associates, authors Larry Wall, Tom Christiansen, and Jon Orwant, 3rd edition, ISBN 0-596-00027-8). This meaty doorstop is probably too terse to learn from unless you come from a UNIX programming background, but it provides an exhaustive reference to the core language and the standard modules, along with chapters on important topics such as object-oriented programming and inter-process communications.
///SUBHEADING: What online resources exist for Perl?
One point to note before you go out and spend hundreds of pounds on dead trees is that Perl itself comes with a huge pile of online documentation. Look in your Perl installation directory for a subdirectory called "pod" and you'll find close to two-thirds of a million words of documentation in pod format. This is accessible via the perldoc command; in a terminal window type "perldoc perldoc" for the manual page of the perldoc tool. There's an intro text ("perldoc perlintro") that gives a brief tutorial and flavour of the language, but most of the documentation is more technical; "perldoc perl" will give a list of the various pod files, their subjects, and the command to read them.
The pod documentation set evolved out of the Perl man page. Back in Perl 3, the man page ran to over a hundred pages; in Perl 4 it split into different files (perlrun covers command-line options to the perl binary, for example, while perldebug covers the debugger), and in recent Perl 5 releases (such as 5.8) a lot of supplementary material has been added. The Usenet newsgroups for Perl -- comp.lang.perl.misc, comp.lang.perl.moderated, and so on -- sprouted FAQ lists, and these are included (and run to about 50,000 words in their own right -- the length of a short book).
In fact, if you're happy reading on a screen, you can do without a printed copy of "Programming Perl". The information in "Programming Perl" is all somewhere in the pod documentation (although Tom Christiansen's writing style is more accessible and illuminating than some of the pods, which suffer to some extent from UNIX man page disease). There are also hypertextifying utilities that allow you to convert the documentation to HTML or LaTeX formats (for online hypertext or hardcopy output); when installing, use installhtml to build a copy of the Perl documentation for your web server. (It's described in the INSTALL file in the Perl source distribution.)
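If you haven't met pod before, the following is a minimal sketch of what the format looks like when it's embedded in a script of your own. The script name greet.pl and the greeting sub are hypothetical; the point is only that perl skips everything from a line beginning with a pod directive such as =head1 down to =cut, while perldoc reads exactly that part.

```perl
#!/usr/bin/perl
use strict;
use warnings;

=head1 NAME

greet.pl - print a greeting (hypothetical example script)

=head1 SYNOPSIS

    perl greet.pl [name]

=head1 DESCRIPTION

Prints a greeting for the name given on the command line,
or for the world if no name is given.

=cut

# Build the greeting text; falls back to "World" for a missing or empty name.
sub greeting {
    my ($name) = @_;
    $name = 'World' unless defined $name && length $name;
    return "Hello, $name";
}

print greeting($ARGV[0]), "\n";
```

Assuming the file is saved as greet.pl, "perldoc ./greet.pl" should display the embedded documentation in the same style as the core pods.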
In addition to the core documentation, there's a lot of more specialised information on the public web. If your primary interest is using Perl as a CGI programming environment, then perl.apache.org is the website for all mod_perl related issues. For generic Perl modules, CPAN can be browsed on the web via www.cpan.org. And there are some useful community websites: Perl Mongers (the Perl advocacy group) at www.perl.org, the Use Perl; portal website at use.perl.org (which provides news updates and discussion areas), The Perl Journal at www.tpj.com (Perl's community magazine, now available online as a paid-subscription magazine), and Perl Monks (http://perlmonks.org/), a Slashdot-like community for Perl programmers. Finally, www.perl.com is maintained and run by O'Reilly and Associates -- publishers of most of the books recommended in this column, employer of Larry Wall (as a research fellow, to work on Perl), and owners of the most impressive commercial Perl web portal on the planet.
///SUBHEADING: An indispensable bookshelf
If there's one Perl book that you really need to keep handy -- if you use Perl for a living -- it's the Perl CD Bookshelf (version 3, pub. O'Reilly and Associates, ISBN 0-596-00389-7). O'Reilly have made a speciality out of Perl documentation (Larry Wall is a research fellow there), and this is a compendium of seven of the key texts on the language. In addition to a paperback copy of "Perl in a Nutshell" (the second, improved edition -- itself an indispensable desk reference), the CD-ROM that comes in the binder includes "Perl in a Nutshell" and the third edition versions of "Programming Perl" and "Learning Perl", as you'd expect. There's also a copy of Tom Christiansen's "Perl Cookbook", a handy tome full of useful procedures and algorithms for accomplishing day-to-day tasks in Perl, including many file maintenance operations.
In addition to these core books, this version of the CD bookshelf drops the Windows content to focus on some more general Perl development books. "Perl and LWP" covers programming the web using LibWWW-Perl or LWP, a vital toolkit that lets you download information from the web, parse HTML to extract information, and even build small web servers into your own applications. Then there's "Perl and XML", extending the utility of Perl as an internet programming language to the next level with coverage of processing XML data in Perl, including PerlSAX, XSLT, and the Document Object Model. Finally, "Mastering Perl/Tk" shows up; the only decent book about the only decent cross-platform GUI programming kit for Perl is a welcome addition to the collection.
About the only criticism that can be made is that O'Reilly are only putting seven books on the CD -- "Programming the Perl DBI", "Advanced Perl Programming", and "Perl for System Administration" would all be useful additions, even at the cost of a higher cover price. But it's hard to overstate the value of this book; "Perl in a Nutshell" is itself vital to a jobbing Perl programmer, and the combination with six other core books provides a level of coverage that can't easily be equalled.
///SUBHEADING: Polishing your Perl skills
O'Reilly and Associates, as employers of Larry Wall and publisher of the canonical texts, have an unfair advantage; but they're not the only source of useful information in the Perl world. If you're already programming in Perl, and use it for day-to-day data mangling tasks, you may want to take a look at "Data Munging with Perl" by David Cross (pub. Manning, ISBN 1-930110-00-6). Data munging focusses on the one task for which Perl is unequivocally the tool of choice -- taking data stored in one format and filtering, sorting, reducing, and translating it into other formats.
Essentially a book about data structures, parsing, and data analysis, this is a really useful text because munching on data is about the one job all Perl programs do (be it reading and writing databases via DBI, generating HTML via CGI, parsing HTML received via LWP, or just reading text files and doing creative things to them).
On a more abstract level, "Effective Perl Programming" by Joseph Hall (with Randal Schwartz) (pub. Addison-Wesley, ISBN 0-201-41975-0) is a terse collection of fifty lessons that will take you from basic competence to a deep understanding of some of the highly elegant tricks implicit in Perl's repertoire of data structures. It's an advanced tutorial, covering lessons that you won't find in "Learning Perl" and which are only implicit in the coverage of "Programming Perl". This book, or something like it, is a great way to learn how to fully exploit the power of idiomatic Perl. (Sriram Srinivasan's "Advanced Perl Programming" and Orwant, Hietaniemi and Macdonald's "Mastering Algorithms with Perl" cover the same territory and a bit more, albeit in ten times as many pages and less elegantly.) It's not as rigid as a Design Patterns based methodology, but using standard Perl constructs intelligently can bring order-of-magnitude performance improvements over a brute-force attack such as might be written by transliterating raw C into Perl.
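As one illustration of what "using standard Perl constructs intelligently" can mean, here's a minimal sketch of the idiom known as the Schwartzian transform, set against a brute-force sort. The colon-separated records and the sub names are made up for the example, and it isn't taken from any of the books above. How much faster the cached version runs depends on how expensive the key extraction is and how many records there are, so treat it as a sketch rather than a benchmark.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical "name:size" records, as might be read from a text file.
my @records = ("gamma:30", "alpha:200", "beta:4");

# Brute force: re-parses both records on every single comparison.
sub sort_naive {
    return sort { (split /:/, $a)[1] <=> (split /:/, $b)[1] } @_;
}

# Schwartzian transform: parse each record once, sort on the cached key,
# then throw the key away again. Read the chain from the bottom up.
sub sort_cached {
    return map  { $_->[0] }                       # 3. keep only the record
           sort { $a->[1] <=> $b->[1] }           # 2. compare cached sizes
           map  { [ $_, (split /:/, $_)[1] ] }    # 1. pair record with its size
           @_;
}

print join(", ", sort_cached(@records)), "\n";
```

Both subs return the records in the same order; the difference is that the second does the parsing once per record instead of twice per comparison.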