September 2000 Column




Who needs Norton Utilities?

Yesterday, I wrote an introductory article for a Linux magazine. I write these articles as plain text files, with accompanying jpegs, and email them to my editors as attachments -- usually in an old DOS-era ZIP archive (as my editors mostly labour under the burden of Windows or something similar).

Having written the deathless prose and grabbed the screen shots, I plonked them all in a subdirectory and, in the directory above it, typed:

  zip -r article.zip article/*.txt article/*.jpeg
(Creating a zipfile called article.zip, containing the .txt and .jpeg files in the subdirectory "article". Of course.)

Then I decided for some reason I didn't like the subdirectory name "article", which everything would be unpacked into by my editor -- maybe I should mv the directory to something more identifiable-sounding?

Disaster struck: I hit the up arrow (command recall is wonderful) and moved to the beginning of the line, deleted "zip", and typed "rm". Then someone shouted a question at me and, brain preoccupied with answering them, I hit the return key instead of editing the line.

What I'd meant to type was:

  rm article.zip
What actually leaked out was:
  rm -r article.zip article/*.txt article/*.jpeg
Ouch. (Would you believe me if I said this was the first time in several years I've made this particular mistake?)

At this point, common wisdom is that these files were dead and gone, never to return. UNIX filesystems are notorious for not being undelete-friendly. However, I managed to recover the files in time -- without a backup, and without re-typing everything from memory. I was able to do this because, although most people don't know it, almost every Linux distribution comes with a bunch of tools at least as powerful as Norton Utilities -- but far less well-known.

What's in a filesystem?

Chances are, your Linux system stores files using an ext2 filesystem. At its simplest, a filesystem is a system for mapping filenames to blocks of data squirreled away on the tracks and sectors of a disk. MSDOS uses a filesystem called FAT-16, and Windows 95/98 uses the derived VFAT system. Macs use a system called HFS. And UNIX has used about a dozen different types over the years.
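
If you're not sure what your system is using, either of these commands will list your mounted filesystems along with their types:

  mount
  df -T
(Look for "type ext2" in mount's output, or check the Type column in df's.)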

Linux uses ext2 as its standard filesystem because it's a stable, robust descendant of the original UNIX filesystem. ext2 was developed from ext1, itself designed as a replacement for the simple, low-performance MINIX filesystem (that the early, pre-1.0 Linux kernels ran on).

UNIX filesystems can be understood best if you bear in mind that they store two different things: data (in collections of disk blocks that constitute files), and meta-data (information about those collections of disk blocks -- such as their names). Some of this meta-data consists of things like lists of free blocks (which aren't part of any file); but the important bit, from our point of view, is the inode table.

All UNIX filesystems (with a few exotic exceptions) have an inode table -- a chunk of the filesystem that is used for storing inode records instead of data. An inode ("information node") is a special record that contains information about a file -- when it was created, when it was last modified, and a list of blocks that are part of the file. Blocks are just chunks of data, all of the same size (typically 1-4Kb), scattered across the disk. inode records are of fixed length, and only contain slots for pointers to fifteen blocks; as each block is 1-4Kb in size, this doesn't make for large files! To get around this, only the first twelve pointers go directly to data blocks. The thirteenth pointer goes to an indirect block -- a block containing nothing but pointers to data blocks on the disk. As each pointer is a 32 bit number, this gives you another few Mb in that file.

Most files aren't many megabytes in size, but UNIX filesystems can cope with really huge files, too. What happens is, the fourteenth pointer in the inode points to a double-indirect block. This is a block of pointers, each of which points to an indirect block. And for gigantic files, the fifteenth pointer goes all the way to a triple-indirect block.
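
To put some rough numbers on that (this is just arithmetic, assuming 4Kb blocks and 4-byte pointers, so each indirect block holds 1024 pointers):

  12 direct pointers:        12 x 4Kb            =  48Kb
  1 indirect block:          1024 x 4Kb          =   4Mb
  1 double-indirect block:   1024 x 1024 x 4Kb   =   4Gb
(With 1Kb blocks each indirect block only holds 256 pointers, so the corresponding figures shrink to 12Kb, 256Kb and 64Mb.)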

There's one piece of information that an inode doesn't store -- the file's name. That's because names are stored in directories. A directory is simply a special data file that matches a list of names to inode numbers. (These include the special names "." -- listing the directory file's own inode -- and ".." -- listing the parent directory file's inode.) On some UNIXes, such as Solaris, you can actually read the directory file using a hex dump utility or text editor (but editing them by hand is a bad idea.) inodes themselves are anonymous, being identified only by number.

Each inode keeps count of the number of hard links to it (i.e. the number of directories that contain a name associated with this inode's number.) Every time a link is broken, the inode's link count is cut; when it hits zero, the filesystem realises that no files containing this data exist any more, so the inode's data blocks are marked as "free" and may be recycled whenever the kernel feels like it.
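
You can watch this machinery from the shell. A quick experiment (the filenames here are just for illustration) shows two names sharing one inode, and the link count rising and falling:

  echo hello > one
  ln one two          # a second hard link to the same inode
  ls -li one two      # same inode number for both, link count of 2
  rm one              # link count drops back to 1; the data is still there
  rm two              # link count hits 0; the blocks go back on the free list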

That last bit is important. If you delete a file accidentally and leave things for later, while some program is scribbling in the same filesystem, the odds are good that it will be unrecoverable -- some or all of its data blocks will have been recycled to store other data. The key to safely recovering a deleted file is speed.

You can recover a deleted file reliably if, as soon as possible after you deleted it, you unmount the filesystem it's on (or remount it as read-only) to stop other programs writing to it. You then use a tool called debugfs (the filesystem debugger) to identify recently deleted inodes and recover them by copying their data to a file on a different filesystem.

Note that this is a good reason for always having two or more filesystems on your machine, and for keeping data (such as the contents of /home or /usr/local) on a separate filesystem from /usr (where your programs reside). You can recover files on a single filesystem, but it's tricky; the way to do it is to enter single-user mode, create a RAMdisk, format it, mount it somewhere like /mnt/ram, remount your root filesystem read-only, use debugfs to recover the lost file onto the ramdisk, remount the root filesystem in read-write mode, and copy the recovered file from the ramdisk to wherever you need it. (Were you paying attention? There'll be a practical exam later: if you pass, you're a qualified UNIX sysadmin.)
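
If you ever do have to pull off the single-filesystem trick, the sequence looks roughly like this -- a sketch only, with illustrative device names (your root filesystem probably isn't /dev/hda1, and the ramdisk size is arbitrary):

  telinit 1                                         # drop to single-user mode
  dd if=/dev/zero of=/dev/ram1 bs=1024 count=4096   # zero out 4Mb of ramdisk
  mke2fs /dev/ram1                                  # put a filesystem on it
  mkdir -p /mnt/ram
  mount -t ext2 /dev/ram1 /mnt/ram                  # ...and mount it
  mount -o remount,ro /                             # stop anything scribbling on the root fs
  debugfs /dev/hda1                                 # lsdel, then dump the inode to /mnt/ram
  mount -o remount,rw /                             # back to read-write
  cp /mnt/ram/recovered.file /where/it/belongs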

Undelete in action

Undeleting files used to be a black art, but the 2.2 kernel and newer versions of debugfs make it easy enough for a relative novice (but not an idiot). In general, if you delete a file by accident on a non-critical server, your first port of call is the backup tape. If there's no backup, you should then go to single-user mode immediately.

Let's suppose you deleted a file called /home/idiot/lost.zip. If /home is a filesystem mounted from /dev/hda3 (the third partition on hard disk "A", the first IDE disk on your system), your first priority is to stop processes from recycling those disk blocks that have just been marked as free for reuse. So you remount /home read-only:

  mount -o remount,ro /home
Your next stop is the Ext2fs-Undeletion mini-HOWTO. This will tell you everything I've told you, and more. The one shortcoming of this HOWTO is that it refers to an older version of debugfs: more on this later.

To identify the inodes of recently deleted files, you run debugfs on the filesystem:

  debugfs /dev/hda3
(This is in read-only mode. For read-write, run debugfs -w /dev/hda3.)

You now see a lovely "debugfs:" prompt. debugfs is like a shell; you type commands and it shows you stuff. Type "help" and it'll give you a list of commands; the one you want is "lsdel". This prints a list of deleted inodes, in date order of when they were zapped, along with information about them -- like the number of blocks, and when they were created. As inodes have no names, you have to guess which one is your target -- but if you unmounted the filesystem immediately after your mistake, odds are that it's the most recent (or next-to-last) in the list.

Once you've got the inode number (left-hand column in the listing), you can get some more information about it with the "stat" command. Say it's inode 120533; type:

  stat <120533>
to see everything the ext2 filesystem knows about this file. (Note the angle brackets -- they indicate that what you're stat'ing is an inode, not a named file.)

If it looks roughly correct, you can then use "dump" to dump the contents of the inode to a file on a different filesystem:

  dump <120533> /tmp/recovered.1
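Putting the pieces together, the whole session looks something like this (120533 being the made-up inode number from above):

  debugfs /dev/hda3
  debugfs:  lsdel
  debugfs:  stat <120533>
  debugfs:  dump <120533> /tmp/recovered.1
  debugfs:  quit
(lsdel to find the victim, stat to check it looks right, dump to copy its contents somewhere safe, quit to get out.)
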
At this point, it's worth noting that the 2.2 kernel series, and its associated debugfs, is better than the version documented in the HOWTO. Old versions of debugfs weren't able to recover indirect or double-indirect blocks, because the kernel trashed them when an inode's link count hit zero. If you wanted to undelete files more than about 12Kb in size, you had to manually enter the indirect blocks and dump every block they pointed to. But if you're running a 2.2 kernel, you don't have this problem; debugfs correctly identifies all the blocks associated with a file and dumps them, as you'd expect.

It worked for me. One thirty-second cold sweat after I hit the return key by mistake -- then I unmounted the filesystem, gave myself a crash-course in debugfs, and recovered the lost files. (Then I emailed them to my editor and ran a backup. Once burned, twice shy!)

A final word about ram disks -- which you might need to create, for a recovery. Most stock kernels support ramdisks; look in /dev/ for a number of block special files called /dev/ram0, /dev/ram1, and so on. These are empty ramdisks. To create a 4Mb ramdisk, you first splat 4Mb of zeroes into one of these devices, using dd:

  dd if=/dev/zero of=/dev/ram1 bs=1024 count=4096
(Copy 4096 blocks of 1024 bytes each from /dev/zero to /dev/ram1.)

Then you format it:

  mke2fs /dev/ram1
Then you mount it (after creating a directory, /mnt/ram1):
  mount -t ext2 /dev/ram1 /mnt/ram1
Note that its contents are temporary and will go away next time you reboot.

Disaster recovery tools

Linux doesn't have an equivalent of Norton Utilities -- a glossy, shrink-wrapped set of disaster recovery tools -- because the components are present in most distributions, albeit in a not very polished form.

Linux has tons of low-level backup options, from the venerable tar and cpio archivers to dump (for dumping filesystems to tape), dd (raw block-level disk copying), and so on. These low-level systems are all very well, but organising them is the hard part; which is why various networked backup tools have been written (to transfer files to a backup machine and stream them onto tape), and entire books about UNIX backup policies exist. My take on it is that for the home user, hobbyist, or self-employed person, the best answer is to hang on to the original distribution media. Instead of backing up everything, back up a snapshot of the contents of /etc, a list of whatever you installed, and the contents of /home (if that's where your home directory lives). This is typically an order of magnitude smaller than the miscellaneous utilities that came with your distribution, hence easier to manage; and if it's easy, you're more likely to do it regularly. (Besides, you've still got that copy of Red Hat. Right?)
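
As a sketch of that minimal approach (the paths are mine, and the rpm query assumes an RPM-based distribution such as Red Hat -- substitute your own packaging tool and destinations):

  rpm -qa > /root/installed-packages.txt   # a list of what's installed
  tar czf /backup/etc.tar.gz /etc          # system configuration
  tar czf /backup/home.tar.gz /home        # your own files
(Copy the results onto whatever removable medium you trust, and keep them next to the distribution CD.)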

Things get harder if you have a multi-user server to back up, or a network of machines -- but that's outside the scope of this article.

Disk defragmenters are an endangered species on Linux. They exist because the MSDOS and HFS filesystems were badly designed -- relics of the floppy disk era, they weren't intended to support multi-user machines with millions of files and large hard disks. ext2, in contrast, is descended from a lineage of server filesystems. It is largely self-defragmenting; the ext2 drivers try to allocate sequential runs of blocks to each inode and spread use across the disk, avoiding the problem. If you really do need to defragment an ext2 filesystem, there's something very weird about your setup! e2defrag is available, but you probably won't need it.

I've wibbled on about undelete utilities for most of this article, so I won't repeat myself here. However, in addition to debugfs there are other tools. MC, the Midnight Commander (a clone of Norton Commander, a desktop shell) supports an "undelete" virtual filesystem that lets you browse deleted inodes and give them new names, bringing them back to life. (Note that this is an option that needs to be compiled into mc.)

There's also a graphical Gtk (GNOME-compatible) undelete tool available.

Virus checking, a staple of the Windows and Mac worlds, isn't so much irrelevant on Linux as simply not a factor in our lives at present. There are no known Linux viruses in the wild -- just a couple of rather anaemic proof-of-concept demos that can only thrive under laboratory conditions. (Some would say that the GNU General Public License is a virus, but that's another matter.)

File system damage is another matter entirely. Ext2 is a lot more robust than traditional UNIX filesystems, but still doesn't take kindly to the power cable being yanked out without warning. To this end, fsck (file system check) is your friend. fsck is so ubiquitous that you probably never notice it as it runs whenever you boot up your system. It has two problems: sometimes it can't figure out a problem on its own -- at that point, you have to run it interactively and tell it what to do -- and it can be slow on large filesystems.
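
If you do end up running it by hand, do so on an unmounted (or read-only) filesystem -- something like this, with the device name being an example:

  e2fsck /dev/hda3
(Run this way it's interactive: it describes each problem it finds and asks what to do. The -p flag attempts automatic repair of the safe cases, and -y answers "yes" to every question -- convenient, but read what it's agreeing to if the data matters.)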

If fsck finds an orphaned inode with a non-zero link count but no filenames in the directory tree, it will stick it in the directory lost+found at the root of the current filesystem. So now you know what lost+found is for -- don't delete it to save space, or the next time you boot with a dirty filesystem you'll be sorry!

Ext2 is good, but has some drawbacks. The inode table is of fixed size; you can fill it up by creating lots of tiny files (on the order of millions). And it isn't failure-proof -- a power outage or kernel panic can lose you a lot of data that is currently stuck in the kernel buffer cache (where data blocks being written are queued up before they are sent to the disk). To that end, Stephen Tweedie is developing a next-generation filesystem, ext3. Ext3 is very similar to ext2; in fact, if you get a disk with an ext3 filesystem you can mount it as ext2 on a machine that doesn't understand ext3 extensions. The difference is that ext3 supports journaling.

There's a special file of contiguous blocks on every ext3 filesystem, called a journal file. The ext3 driver ensures that, except when they're writing something else to disk, the hard disk heads stay over the journal file: and any data that's written to the disk goes straight into the journal, along with an index entry. When no files are being written, the driver copies journal entries out to files elsewhere on the disk, then marks them as committed.

If a computer with an ext3 filesystem crashes, when it reboots and tries to remount the ext3 system the first thing that happens is that the ext3 driver reads the journal file. If it sees any uncopied entries in the journal, it writes them into the correct places on the disk. The result is a consistent filesystem, because all successful writes have completed; partial journal entries are ignored. It's also a lot faster to check a journaling filesystem than a non-journaling one -- instead of scanning the entire inode table and directory structure to ensure there are no inconsistencies, fsck simply has to check one file.

ext3 is due to debut in the 2.4 kernel, late this year; meanwhile, it's already powering some web servers that support heavy throughput (and Stephen's own laptop).

And now for something completely different

Astute readers may remember a few months ago I was in search of the ultimate personal digital assistant for use with Linux. Psion didn't cut it -- beautiful hardware, shame about the weird file formats. A plethora of Windows CE machines were, fundamentally, boring. And Palm Pilots were terra incognita to me.

I say "were", because I'm now a convert. If you have a need for some computing power that will fit in a pocket, this is the way to do it. Because of Palm's welcoming policy towards developers, all the guts of PalmOS are open to programmers who want to write glue to let palm pilots connect to non-Wintel systems. The results turn out to be spectacular: go to Freshmeat, type "palm pilot" in the search box, and you'll get more than sixty hits on applications as diverse as desktop connectivity tools (such as JPilot and the KDE application KPilot) to tools for synchronising the Pilot and Netscape address books -- and even a Palm Pilot emulator. There's a full Gcc- based development toolchain for the Palm Pilot that runs on Linux; indeed, with a bit of poking around for the necessary bits, your Palm Pilot can integrate as tightly with a Linux box as with a Windows desktop.

This isn't the open source world, unfortunately. Some necessary extras are, well, extras. I keep my keyboard jones happy by toting around a Palm portable keyboard -- an insanely neat gadget that concertinas a laptop-sized keyboard into a parcel the size of a deck of playing cards -- but had to pay extra for a text editor. File formats are different, too. The Palm Pilot doesn't have a file system so much as a database: you can load text files or HTML files into it, but you'll need to convert them to a native format (and vice versa on the way back). Luckily tools for doing this are readily available on Linux. Space is too short for me to go into detail on this, but you could do worse than look at O'Reilly's website for a detailed discussion.

