September 2001 Column


User Mode Linux

Boxes are a nuisance. PCs come in boxes -- big lumps of plastic and metal that lurk on (or under) your desk and make loud humming noises. The boxes are strung together with fat bundles of wire for you to trip over, they require an inordinate number of mains electricity sockets, and they breed in dark corners. On a recent audit, your humble columnist discovered roughly nine of the beasts hiding in his study -- only three of them in regular use (or indeed, in a sufficient state of repair to do duty). Admittedly, Linux consultants are more prone to these sudden infestations of mechanical life than most people, but anything that reduces the box count has got to be good -- especially if you're as badly out of control as a friend of mine whose herd numbers in excess of twenty-seven machines.
Yes, boxes are bad -- unless they're virtual.
The Linux kernel is just another computer program, like any other -- the only really unusual feature of it is that it is one of that select bunch that is designed to run on the bare metal of a computer, rather than in a padded cell provided by an operating system kernel. The padded cell comes equipped with seats and bunks and so forth that give a user process a view of the host machine's shared resources -- disks, displays, and so forth. The reason we have operating systems in the first place is to handle resource contention: to ensure that all the other programs have access to the underlying facilities without hogging them. These processes run in user space, while the kernel's own code runs in privileged kernel space, with direct access to the hardware.
Which brings me to the topic of User Mode Linux. UML is just a Linux kernel that has been tweaked so that instead of talking to the bare metal, it talks to services provided by a lower-level kernel. You can run a UML kernel as a process under a parent Linux session; it uses a separate partition (or a loopback-mounted filesystem stored in a file) as its root filesystem, and it doesn't share any processes, memory or files with the parent Linux session; in fact, it's a Linux system running in a virtual box, instead of a separate physical one. If the UML kernel panics, your main Linux session will carry on as normal -- just as it does when any other user process dies.
To run UML, you need a recent 2.2 series kernel (version 2.2.16 or later), a 2.4 series kernel, or an older 2.2 series kernel that's been patched and recompiled to support UML. Then you need the UML kit, available from user-mode-linux.sourceforge.net. It helps to have plenty of memory and disk space, too -- if you're going to build a root filesystem for a second Linux system on your main computer, bet on needing at least a hundred megabytes (and preferably a whole lot more).
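The version rule above is easy to check mechanically. Here is a minimal Python sketch of it; the function names are hypothetical, and treating 2.3 development kernels as "needs patching" and anything newer than 2.4 as acceptable are my assumptions, not something the UML documentation promises.

```python
import re


def parse_kernel_version(release):
    """Return (major, minor, patch) from a release string such as '2.4.9-13'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    return tuple(int(part) for part in match.groups())


def host_supports_uml(release):
    """True if the host kernel should run UML without extra patching.

    Rule taken from the text: 2.2.16 or later in the 2.2 series, or any
    2.4 series kernel. Treating later series as fine is an assumption.
    """
    version = parse_kernel_version(release)
    if version[:2] == (2, 2):
        return version >= (2, 2, 16)
    return version >= (2, 4, 0)
```

On a running Linux system you could feed it the string returned by platform.release() (the same text that "uname -r" prints).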
A UML kernel runs as a user process under your existing Linux session. It's totally insulated from all other user processes; however, it can talk to your main Linux session over TCP/IP via a virtual SLIP interface (assuming you've got SLIP support compiled into your main kernel). If your parent kernel supports IP forwarding, you can set things up so that your UML kernel can see the outside network; and you can use port forwarders on the parent kernel to ensure that connections from an outside internet link are forwarded to an appropriate port on your UML system, which can be running a server.
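Whether the parent kernel is currently forwarding IP packets can be read from /proc/sys/net/ipv4/ip_forward, which holds 1 when forwarding is on and 0 when it is off. The Python sketch below only reads that flag (the function name is hypothetical); it doesn't touch the SLIP interface or set up any port forwarding.

```python
def ip_forwarding_enabled(path="/proc/sys/net/ipv4/ip_forward"):
    """Return True if the flag file at `path` holds a non-zero integer."""
    with open(path) as handle:
        return int(handle.read().strip()) != 0
```

The path is a parameter so that the function can be pointed at a scratch file when you want to try it without a Linux /proc filesystem to hand.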
This is, of course, where UML begins to get interesting. Right now, UML is a bit of a curiosity -- the HOWTO mostly discusses using it to debug kernel code (without needing a second beige humming box on your desk), or to install and test different Linux distributions (say, running SuSE as a UML-hosted guest under Red Hat). However, you can do much more useful things with UML. For example, why not replace your hardware-based firewall with a UML session and a second ethernet card in your desktop PC? Plug your cable modem or router into the second ethernet card. Tell the UML kernel about this card, and configure it to do IP filtering and masquerading. Then tell the main Linux kernel to use the SLIP interface to the UML session as a gateway: all network connections from the main Linux session will be forwarded to the UML session, which will act as a firewall. You can go further; configure the UML session to talk to two ethernet cards and act as a firewall, and hang your main Linux session off it as a client system. Hey presto, one less box under your desk.
You can also use UML to allow your system to provide network services to the outside world without compromising security. For example, if you're an ISP you can give all your customers a virtual Linux box of their own to run their web server on -- complete with gcc, perl, their own apache server, and all the access they want -- without compromising your own system's security. (To be fair, FreeBSD has supported this capability for a long time now.) Traditionally this was done using chroot (change-root), a program that spawns a sub-shell that thinks the root of your filesystem is some directory other than "/"; you can find an example of this in most FTP server setups (for example, under /home/ftp on this Red Hat system). However, setting up a secure chroot'd environment is tricky. You need to include programs and shared libraries in the chroot'd area, and if any of them are compromised an intruder may be able to gain access to other areas of the filesystem. In contrast, a user mode kernel has no access to the parent system's root filesystem; it can be cracked wide open by an intruder but you, the system administrator, can contain the security breach by simply shutting down the UML session.
There are ways to specifically grant a user-mode kernel access to underlying system resources. To start with, you can supply some command line flags to the linux kernel (which you run like any other program): mem=24M, for example, tells it to grab 24MB of memory. Then there are externally-configurable devices. All devices seen by the user mode kernel are virtual (from the point of view of the parent kernel). For example, block devices (which the parent kernel would use as interfaces to hardware such as hard disk drives) are virtual devices that point to files within the parent kernel's namespace.
If you really want, you can make a ubd (user-mode block device) point at a real hardware device; the command-line parameter "ubd1=/dev/hdc3" maps the hardware device /dev/hdc3 (third partition on the third IDE device) to /dev/ubd1 inside the user-mode kernel's environment, and the command-line option "root=/dev/ubd1" makes that the root filesystem.
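To see how the options discussed above fit together, here is a small Python sketch that assembles a UML command line from the three parameters mentioned in the text (mem=, ubdN= and root=). The helper name and the "./linux" path to the kernel binary are assumptions for illustration; adjust them to wherever your UML kernel actually lives.

```python
def uml_command(kernel="./linux", mem=None, ubd=None, root=None):
    """Build the argument list for starting a user-mode kernel.

    kernel -- path to the UML kernel binary (assumed location)
    mem    -- memory size string such as "24M"
    ubd    -- dict mapping ubd device numbers to host files or devices
    root   -- device the UML kernel should mount as its root filesystem
    """
    args = [kernel]
    if mem is not None:
        args.append("mem=%s" % mem)
    for index in sorted(ubd or {}):
        args.append("ubd%d=%s" % (index, ubd[index]))
    if root is not None:
        args.append("root=%s" % root)
    return args
```

Calling uml_command(mem="24M", ubd={1: "/dev/hdc3"}, root="/dev/ubd1") gives a list you could hand to subprocess.call(); it only builds the arguments and doesn't start anything itself.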
Network devices communicate via a SLIP interface or a virtual ethernet frame interface (and can talk directly to a real ethernet card using the ethertap device) -- with an appropriate packet forwarding policy in the daemon, the virtual Ethernet can be transparently merged with the physical Ethernet, totally isolated from it, or anything in between.
What all this boils down to is that with a modern, reasonably well specified PC -- any machine able to run Windows 2000 should do -- you can simulate a cluster of Linux systems, each configured for a separate stand-alone task. This lets you test new development kernels, simulate the behaviour of a Beowulf cluster, provide secure execution environments for web or ftp servers, give multiple ISP customers root access to a fully-configurable Linux environment on a colocated server, and -- maybe -- run Linux in a box under other operating systems. Of course, operating systems have been running themselves as client applications since about 1972 -- IBM mainframe systems like VM are specifically designed for this job -- but it's fun to have it on the desktop.

Mainframes to go

Speaking of IBM mainframes, this is pretty much how you run Linux on a zServer. The zServer is a large bunch of blue boxes that contains one or more processor units -- each with ten power architecture RISC processors inside it -- connected via very high speed i/o buses. If you can afford this kind of kit, you can gang together up to 640 processors in a cluster, with up to 32GB of memory on each ten-processor node. The bandwidth available to a mainframe is colossal; they can easily saturate gigabit ethernet links, and use multiple buses for shuffling data to and from their disk arrays.
The software a zServer runs is a 64-bit derivative of IBM's old VM system, called zVM. VM is a multi-tasking operating system that lets you spawn execution environments (called partitions) each of which "sees" exactly the same hardware resources as are visible to the underlying VM system. Within a VM partition, you can run another operating system; in the old days it would be something like the single-user single-tasking CMS system, but these days it's as likely to be Linux. A Linux account on a zServer mainframe is a bit like having your own UML process on a Linux box; when you log in, you're in your own virtual machine and can't directly access processes run by users in other VMs. The big difference between a zServer and a PC is that the zVM system is able to juggle hundreds to thousands of Linux sessions, and has the i/o bandwidth to page them in and out of working memory on demand. In addition, zServers provide a mechanism called HiperSockets(TM) -- basically, mapping TCP/IP network connections onto a fast memory bus, so that processes on the same mainframe can exchange data extremely fast. And they've got dedicated cryptographic co-processors to accelerate SSL and similar number-crunching tasks.
In case it wasn't obvious, there are advantages to running multiple Linux sessions on a zServer. For one thing, it makes it a lot harder to crack the security of a mainframe if each network service (such as the web server, ftp server, and so on) is running on an entirely different virtual machine with no direct access to other services. For another thing, each user can have total control over their own Linux workspace, along with the ability to install software of their own or perform kernel upgrades if necessary. And with a Java runtime in each Linux partition, it's possible for mainframe users to use a common binary executable format that runs on PCs and minicomputers as well: the mainframe is finally compatible -- directly -- with desktop systems and everything in between.
If you're a Linux developer you can get a free (90 day limited term) account on IBM's Linux Community Development System -- a ten processor zServer mainframe that runs Linux -- to ensure your software runs alright on their mainframes. It's an inspired marketing move, and a canny way of helping ensure that mainframes don't end up shoved to the periphery of Linux development; you can find out more at the LCDS website. (Note that at time of writing they were a bit backlogged with applications and have temporarily frozen the sign-up process.)

Linux and DVDs

Five years ago, if you asked who the strongest opposition to Linux would be, most people -- myself included -- would have said "Microsoft", followed (after some thought) by the traditional UNIX industry. The film business wasn't even on the map -- but as things have turned out, two organisations above all seem inimical to open source software -- the RIAA (Recording Industry Association of America) and MPAA (Motion Picture Association of America). Both these organisations make their money by licensing production and performance of media -- music and movies respectively -- and they hate the idea of reproduction tools that are outside their control.
In the case of the MPAA, the battleground has been drawn up over DVD, the Digital Versatile Disk system pitched as a high-quality replacement for VHS video cassettes. The MPAA -- to be extremely cynical -- have several motives for pushing DVD. One is the ease with which video cassettes can be copied; they want the new medium to be resistant to duplication. To this end, the DVD standard comes with a built-in encryption system (CSS, the Content Scrambling System) which doesn't really stop anyone creating a bitwise copy of the original tracks -- but is highly useful for enforcing region encoding. In DVD speak, the world is divided into six regions -- region one is the USA and Canada, region two includes Europe, with the rest of the world split among the other four. A region one disk can only be decoded by a region one aware player, and so on. This scheme adds up to extra profits when you consider that movies are frequently released at staggered intervals in different regions, and that disks cost more in region two than in region one (a fact that the EU trade commission is currently investigating).
Anyway, all this is old news. It's now nearly two years since CSS was cracked by a sixteen-year-old from Norway. The DVD-CCA (Copy Control Association) promptly sued everyone under the sun who even whispered the details in public (or put links to the software on their websites), and a case is currently working its way through the US court system. (Don't bother hunting: you can find DeCSS here.)
(Note for any lawyers working for the DVD-CCA who happen to read this web page: this is a reprint. That URL has ALREADY BEEN PUBLISHED in a print magazine with an audited circulation of nearly 200,000 copies. You are TOO LATE. US trade secrecy law does not apply in Scotland. Moreover, as a journalist I have both legal insurance and a public interest defense. So piss off.)
So where does this leave those of us who want to take advantage of our DVD drives and laptops but run Linux?
If you have a 2.4 series kernel, you can read data tracks off DVD-ROMs without difficulty; it behaves just like a big CDROM. The filesystem format, UDF, is now supported; this isn't subject to the same legal problems as movies, and indeed SuSE provide their Linux distribution in a single-DVD format.
Playing encrypted movies is a bit harder. There are a couple of different software projects working on actual DVD players; some rely on hardware MPEG decoders, while others work entirely in software. In general, the players don't decrypt the CSS-encoded stream themselves; that would make their authors vulnerable to legal action. Rather, they take plugin codecs which handle the decryption side of things for them.
Of the DVD players out there, the most mature right now is probably Xine. To use Xine with CSS-encoded files, you will need to download both Xine itself (from the sourceforge site), and a DeCSS plugin. A compiled package is also available, and there's a XINE how-to.
When you run Xine, starting it from the command line (under X11, of course!), it provides you with a window and a floating toolbar. In addition to DVDs, Xine can also play the older VCD (Video CD) format disks; hit the DVD or VCD button and it will scan the disk in the default drive for a recognized table of contents. (Note that it's a good idea to create a symbolic link from your DVD-ROM's device node -- typically /dev/hdc -- to /dev/dvd.) Alternatively, you can specify a file to play by giving Xine a media resource locator, or MRL (a variant on the web's URL syntax), on the command line. In addition to playing DVDs, Xine can play Windows AVI media files if given appropriate codecs -- and a bunch of codecs (which Xine expects to find in /usr/lib/win32) are available from http://bpinaud.free.fr/video. Full details are in the HOWTO.
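The symbolic link mentioned above is a one-liner with "ln -s", but here is the same step as a Python sketch for anyone scripting their setup. The function name is hypothetical, /dev/hdc is just the typical device node quoted in the text (yours may differ), and writing into /dev needs root privileges.

```python
import os


def link_dvd_device(device="/dev/hdc", link="/dev/dvd"):
    """Point `link` at `device`, replacing an existing symlink if present.

    If `link` already exists as something other than a symlink,
    os.symlink() raises FileExistsError rather than overwriting it.
    """
    if os.path.islink(link):
        os.remove(link)
    os.symlink(device, link)
    return os.readlink(link)
```

Both paths are parameters, so you can rehearse the call against a scratch directory before letting it loose on /dev.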
Playing DVDs is memory and CPU intensive. A number of accelerated video cards that support the XVideo extension will work better with DVDs; the HOWTO contains extensive notes on tweaking your system to optimise it for DVD playback. In practice, don't bother unless you've got at least a 450MHz Pentium-II class machine with bags of RAM and a fast graphics card -- or a 600MHz or faster Pentium-III box. Any current machine with at least 128MB of RAM should be fine, but older models will suffer.
Xine is well-established, but a particularly good contender is Vlc (from the VideoLAN project). This is a project run by students from the Ecole Centrale Paris, to produce a free cross-platform DVD and MPEG playing suite and streaming video server. The Vlc client has a rather nice Gnome interface, and is easier to use than Xine; alas, when attempting to play a subtitled (French) DVD I experienced some segmentation faults. (I suspect this may have something to do with Vlc not yet supporting raw i/o devices, as Xine does, and with me trying to run it on borderline hardware.) In any event, Vlc is only at release 0.2.8, and is undergoing rapid development. It's also available for BeOS, MacOS X, and Windows as well as Linux, although Linux is the primary development platform.
Xine and Vlc aren't the only DVD players for Linux; you can find links to a whole lot more information on this topic at linuxtv.org. Another major player site is based off www.linuxvideo.org -- this is the home of the LiViD project, who are working on OMS, the open media system (which includes DVD playback as a goal). (Note that LiViD has been making slower progress than Xine of late.)
If you want to play DVDs on your Linux system, you should be aware that it will take a fair bit of fiddling at this stage; if all you want to do is watch the odd movie in front of the TV, you're best off buying a DVD player. On the other hand, if you've got a laptop and a need to travel regularly, it may help you fend off boredom or survive the long evenings in hotel rooms far from home. In any event, it's fun! Now to get it working on an iPaq running Linux ...
