|  | Devfs (Device File System) FAQ | 
|  |  | 
|  |  | 
|  | Linux Devfs (Device File System) FAQ | 
|  | Richard Gooch | 
|  | 20-AUG-2002 | 
|  |  | 
|  |  | 
|  | ----------------------------------------------------------------------------- | 
|  |  | 
|  | NOTE: the master copy of this document is available online at: | 
|  |  | 
|  | http://www.atnf.csiro.au/~rgooch/linux/docs/devfs.html | 
|  | and looks much better than the text version distributed with the | 
|  | kernel sources. A mirror site is available at: | 
|  |  | 
|  | http://www.ras.ucalgary.ca/~rgooch/linux/docs/devfs.html | 
|  |  | 
|  | There is also an optional daemon that may be used with devfs. You can | 
|  | find out more about it at: | 
|  |  | 
|  | http://www.atnf.csiro.au/~rgooch/linux/ | 
|  |  | 
|  | A mailing list is available which you may subscribe to. Send email | 
|  | to [email protected] with the following line in the body of the | 
|  | message: | 
|  | subscribe devfs | 
|  | To unsubscribe, send the message body: | 
|  | unsubscribe devfs | 
|  | instead. The list is archived at | 
|  |  | 
|  | http://oss.sgi.com/projects/devfs/archive/. | 
|  |  | 
|  | ----------------------------------------------------------------------------- | 
|  |  | 
|  | Contents | 
|  |  | 
|  |  | 
|  | What is it? | 
|  |  | 
|  | Why do it? | 
|  |  | 
|  | Who else does it? | 
|  |  | 
|  | How it works | 
|  |  | 
|  | Operational issues (essential reading) | 
|  |  | 
|  | Instructions for the impatient | 
|  | Permissions persistence across reboots | 
|  | Dealing with drivers without devfs support | 
|  | All the way with Devfs | 
|  | Other Issues | 
|  | Kernel Naming Scheme | 
|  | Devfsd Naming Scheme | 
|  | Old Compatibility Names | 
|  | SCSI Host Probing Issues | 
|  |  | 
|  |  | 
|  |  | 
|  | Device drivers currently ported | 
|  |  | 
|  | Allocation of Device Numbers | 
|  |  | 
|  | Questions and Answers | 
|  |  | 
|  | Making things work | 
|  | Alternatives to devfs | 
|  | What I don't like about devfs | 
|  | How to report bugs | 
|  | Strange kernel messages | 
|  | Compilation problems with devfsd | 
|  |  | 
|  |  | 
|  | Other resources | 
|  |  | 
|  | Translations of this document | 
|  |  | 
|  |  | 
|  | ----------------------------------------------------------------------------- | 
|  |  | 
|  |  | 
|  | What is it? | 
|  |  | 
|  | Devfs is an alternative to "real" character and block special devices | 
|  | on your root filesystem. Kernel device drivers can register devices by | 
|  | name rather than major and minor numbers. These devices will appear in | 
|  | devfs automatically, with whatever default ownership and | 
|  | protection the driver specified. A daemon (devfsd) can be used to | 
|  | override these defaults. Devfs has been in the kernel since 2.3.46. | 
|  |  | 
|  | NOTE that devfs is entirely optional. If you prefer the old | 
|  | disc-based device nodes, then simply leave CONFIG_DEVFS_FS=n (the | 
|  | default). In this case, nothing will change. ALSO NOTE that if you do | 
|  | enable devfs, the defaults are such that full compatibility is | 
|  | maintained with the old device names. | 
|  |  | 
|  | There are two aspects to devfs: one is the underlying device | 
|  | namespace, which is a namespace just like any mounted filesystem. The | 
|  | other aspect is the filesystem code which provides a view of the | 
|  | device namespace. The reason I make a distinction is because devfs | 
|  | can be mounted many times, with each mount showing the same device | 
|  | namespace. Changes made are global to all mounted devfs filesystems. | 
|  | Also, because the devfs namespace exists without any devfs mounts, you | 
|  | can easily mount the root filesystem by referring to an entry in the | 
|  | devfs namespace. | 
|  |  | 
|  |  | 
|  | The cost of devfs is a small increase in kernel code size and memory | 
|  | usage: about 7 pages of code (some of that in __init sections) and 72 | 
|  | bytes for each entry in the namespace. A modest system has only a | 
|  | couple of hundred device entries, so this costs a few more | 
|  | pages. Compare this with the suggestion to put /dev on a ramdisc. | 
|  |  | 
|  | On a typical machine, the cost is under 0.2 percent. On a modest | 
|  | system with 64 MBytes of RAM, the cost is under 0.1 percent.  The | 
|  | accusations of "bloatware" levelled at devfs are not justified. | 
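The arithmetic behind these figures can be checked with a rough sketch. The page and entry counts come from the text; the 4 kByte page size is an assumption (it holds on x86 but is not stated here), and the function names are purely illustrative.

```python
# Figures taken from the text; PAGE_SIZE is an assumption (4 kByte pages, as on x86).
PAGE_SIZE = 4096
CODE_PAGES = 7
BYTES_PER_ENTRY = 72


def devfs_cost_bytes(entries: int) -> int:
    """Approximate memory used by the devfs code plus `entries` namespace entries."""
    return CODE_PAGES * PAGE_SIZE + entries * BYTES_PER_ENTRY


def devfs_cost_percent(entries: int, ram_bytes: int) -> float:
    """The same cost expressed as a percentage of total RAM."""
    return 100.0 * devfs_cost_bytes(entries) / ram_bytes


if __name__ == "__main__":
    ram = 64 * 1024 * 1024  # the 64 MByte system mentioned above
    print(devfs_cost_bytes(200), "bytes")
    print(round(devfs_cost_percent(200, ram), 3), "percent of RAM")
```

With 200 entries this comes to 43072 bytes, which is comfortably under 0.1 percent of 64 MBytes.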
|  |  | 
|  | ----------------------------------------------------------------------------- | 
|  |  | 
|  |  | 
|  | Why do it? | 
|  |  | 
|  | There are several problems that devfs addresses. Some of these | 
|  | problems are more serious than others (depending on your point of | 
|  | view), and some can be solved without devfs. However, the totality of | 
|  | these problems really calls out for devfs. | 
|  |  | 
|  | The choice is between a patchwork of inefficient user space solutions, | 
|  | which are complex and likely to be fragile, and a simple and efficient | 
|  | devfs which is robust. | 
|  |  | 
|  | There have been many counter-proposals to devfs, all seeking to | 
|  | provide some of the benefits without actually implementing devfs. So | 
|  | far there has been an absence of code and no proposed alternative has | 
|  | been able to provide all the features that devfs does. Further, | 
|  | alternative proposals require far more complexity in user-space (and | 
|  | still deliver less functionality than devfs). Some people have the | 
|  | mantra of reducing "kernel bloat", but don't consider the effects on | 
|  | user-space. | 
|  |  | 
|  | A good solution limits the total complexity of kernel-space and | 
|  | user-space. | 
|  |  | 
|  |  | 
|  | Major&minor allocation | 
|  |  | 
|  | The existing scheme requires the allocation of major and minor device | 
|  | numbers for each and every device. This means that a central | 
|  | co-ordinating authority is required to issue these device numbers | 
|  | (unless you're developing a "private" device driver), in order to | 
|  | preserve uniqueness. Devfs shifts the burden to a namespace. This may | 
|  | not seem like a huge benefit, but actually it is. Since driver authors | 
|  | will naturally choose a device name which reflects the functionality | 
|  | of the device, there is far less potential for namespace conflict. | 
|  | Solving this requires a kernel change. | 
|  |  | 
|  | /dev management | 
|  |  | 
|  | Because you currently access devices through device nodes, these must | 
|  | be created by the system administrator. For standard devices you can | 
|  | usually find a MAKEDEV programme which creates all of these nodes | 
|  | (hundreds of them!). This means that changes in the kernel must be | 
|  | reflected by | 
|  | changes in the MAKEDEV programme, or else the system administrator | 
|  | creates device nodes by hand. | 
|  |  | 
|  | The basic problem is that there are two separate databases of | 
|  | major and minor numbers. One is in the kernel and one is in /dev (or | 
|  | in a MAKEDEV programme, if you want to look at it that way). This is | 
|  | duplication of information, which is not good practice. | 
|  | Solving this requires a kernel change. | 
|  |  | 
|  | /dev growth | 
|  |  | 
|  | A typical /dev has over 1200 nodes! Most of these devices simply don't | 
|  | exist because the hardware is not available. A huge /dev increases the | 
|  | time to access devices (I'm just referring to the dentry lookup times | 
|  | and the time taken to read inodes off disc: the next subsection shows | 
|  | some more horrors). | 
|  |  | 
|  | An example of how big /dev can grow is if we consider SCSI devices: | 
|  |  | 
|  | host           6  bits  (say up to 64 hosts on a really big machine) | 
|  | channel        4  bits  (say up to 16 SCSI buses per host) | 
|  | id             4  bits | 
|  | lun            3  bits | 
|  | partition      6  bits | 
|  | TOTAL          23 bits | 
|  |  | 
|  |  | 
|  | This requires 8 Mega (1024*1024) inodes if we want to store all | 
|  | possible device nodes. Even if we scrap everything but id,partition | 
|  | and assume a single host adapter with a single SCSI bus and only one | 
|  | logical unit per SCSI target (id), that's still 10 bits or 1024 | 
|  | inodes. Each VFS inode takes around 256 bytes (kernel 2.1.78), so | 
|  | that's 256 kBytes of inode storage on disc (assuming real inodes take | 
|  | a similar amount of space as VFS inodes). This is actually not so bad, | 
|  | because disc is cheap these days. Embedded systems would care about | 
|  | 256 kBytes of /dev inodes, but you could argue that embedded systems | 
|  | would have hand-tuned /dev directories. I've had to do just that on my | 
|  | embedded systems, but I would rather just leave it to devfs. | 
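The node counts above follow directly from the bit widths in the table. A small sketch of the calculation (the names are illustrative only; the 256 byte inode size is the approximate figure quoted in the text):

```python
# Bit widths from the table above.
SCSI_FIELD_BITS = {"host": 6, "channel": 4, "id": 4, "lun": 3, "partition": 6}
INODE_BYTES = 256  # approximate VFS inode size quoted in the text


def node_count(fields) -> int:
    """Number of distinct device nodes addressable by the given fields."""
    return 2 ** sum(SCSI_FIELD_BITS[name] for name in fields)


if __name__ == "__main__":
    everything = node_count(SCSI_FIELD_BITS)      # all 23 bits
    minimal = node_count(["id", "partition"])     # 10 bits
    print(everything, "nodes in the full scheme")
    print(minimal, "nodes,", minimal * INODE_BYTES // 1024, "kBytes of inodes")
```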
|  |  | 
|  | Another issue is the time taken to look up an inode when first | 
|  | referenced. This costs not only the time to scan through a list in | 
|  | memory, but also the seek time to read the inodes off disc. | 
|  | This could be solved in user-space using a clever programme which | 
|  | scanned the kernel logs and deleted /dev entries which are not | 
|  | available and created them when they were available. This programme | 
|  | would need to be run every time a new module was loaded, which would | 
|  | slow things down a lot. | 
|  |  | 
|  | There is an existing programme called scsidev which will automatically | 
|  | create device nodes for SCSI devices. It can do this by scanning files | 
|  | in /proc/scsi. Unfortunately, to extend this idea to other device | 
|  | nodes would require significant modifications to existing drivers (so | 
|  | they too would provide information in /proc). This is a non-trivial | 
|  | change (I should know: devfs has had to do something similar). Once | 
|  | you go to this much effort, you may as well use devfs itself (which | 
|  | also provides this information).  Furthermore, such a system would | 
|  | likely be implemented in an ad-hoc fashion, as different drivers will | 
|  | provide their information in different ways. | 
|  |  | 
|  | Devfs is much cleaner, because it (naturally) has a uniform mechanism | 
|  | to provide this information: the device nodes themselves! | 
|  |  | 
|  |  | 
|  | Node to driver file_operations translation | 
|  |  | 
|  | There is an important difference between the way disc-based character | 
|  | and block nodes and devfs entries make the connection between an entry | 
|  | in /dev and the actual device driver. | 
|  |  | 
|  | With the current 8 bit major and minor numbers the connection between | 
|  | disc-based c&b nodes and per-major drivers is done through a | 
|  | fixed-length table of 128 entries. The various filesystem types set | 
|  | the inode operations for c&b nodes to {chr,blk}dev_inode_operations, | 
|  | so when a device is opened a few quick levels of indirection bring us | 
|  | to the driver file_operations. | 
|  |  | 
|  | For miscellaneous character devices a second step is required: there | 
|  | is a scan for the driver entry with the same minor number as the file | 
|  | that was opened, and the appropriate minor open method is called. This | 
|  | scanning is done *every time* you open a device node. Potentially, you | 
|  | may be searching through dozens of misc. entries before you find your | 
|  | open method. While not an enormous performance overhead, this does | 
|  | seem pointless. | 
|  |  | 
|  | Linux *must* move beyond the 8 bit major and minor barrier, | 
|  | somehow. If we simply increase each to 16 bits, then the indexing | 
|  | scheme used for major driver lookup becomes untenable, because the | 
|  | major tables (one each for character and block devices) would need to | 
|  | be 64 k entries long (512 kBytes on x86, 1 MByte for 64 bit | 
|  | systems). So we would have to use a scheme like that used for | 
|  | miscellaneous character devices, which means the search time goes up | 
|  | linearly with the average number of major device drivers on your | 
|  | system. Not all "devices" are hardware, some are higher-level drivers | 
|  | like KGI, so you can get more "devices" without adding hardware. | 
|  | You can improve this by creating an ordered (balanced:-) | 
|  | binary tree, in which case your search time becomes log(N). | 
|  | Alternatively, you can use hashing to speed up the search. | 
|  | But why do that search at all if you don't have to? Once again, it | 
|  | seems pointless. | 
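The table sizes quoted above can be reproduced with a short calculation. The text does not say how a table entry is laid out; two pointers per entry is an assumption made here only because it matches the quoted 512 kByte and 1 MByte figures.

```python
def major_table_bytes(major_bits: int, pointer_bytes: int,
                      pointers_per_entry: int = 2) -> int:
    """Size of a directly indexed major table.

    pointers_per_entry=2 is an assumption (for instance a name pointer and an
    operations pointer); the text does not specify the entry layout.
    """
    return (2 ** major_bits) * pointers_per_entry * pointer_bytes


if __name__ == "__main__":
    print(major_table_bytes(16, 4) // 1024, "kBytes with 4 byte pointers")
    print(major_table_bytes(16, 8) // 1024, "kBytes with 8 byte pointers")
```

Remember that there would be two such tables, one for character devices and one for block devices.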
|  |  | 
|  | Note that devfs doesn't use the major&minor system. For devfs | 
|  | entries, the connection is done when you look up the /dev entry. When | 
|  | devfs_register() is called, an entry holding the name and the | 
|  | file_operations is appended to an internal table. If the dentry cache | 
|  | doesn't | 
|  | have the /dev entry already, this internal table is scanned to get the | 
|  | file_operations, and an inode is created. If the dentry cache already | 
|  | has the entry, there is *no lookup time* (other than the dentry scan | 
|  | itself, but we can't avoid that anyway, and besides Linux dentries | 
|  | cream other OS's which don't have them:-). Furthermore, the number of | 
|  | node entries in a devfs is only the number of available device | 
|  | entries, not the number of *conceivable* entries. Even if you remove | 
|  | unnecessary entries in a disc-based /dev, the number of conceivable | 
|  | entries remains the same: you just limit yourself in order to save | 
|  | space. | 
|  |  | 
|  | Devfs provides a fast connection between a VFS node and the device | 
|  | driver, in a scalable way. | 
|  |  | 
|  | /dev as a system administration tool | 
|  |  | 
|  | Right now /dev contains a list of conceivable devices, most of which I | 
|  | don't have. Devfs only shows those devices available on my | 
|  | system. This means that listing /dev is a handy way of checking what | 
|  | devices are available. | 
|  |  | 
|  | Major&minor size | 
|  |  | 
|  | Existing major and minor numbers are limited to 8 bits each. This is | 
|  | now a limiting factor for some drivers, particularly the SCSI disc | 
|  | driver, which consumes a single major number. Only 16 discs are | 
|  | supported, and each disc may have only 15 partitions. Maybe this isn't | 
|  | a problem for you, but some of us are building huge Linux systems with | 
|  | disc arrays. With devfs an arbitrary pointer can be associated with | 
|  | each device entry, which can be used to give an effective 32 bit | 
|  | device identifier (i.e. that's like having a 32 bit minor | 
|  | number). Since this is private to the kernel, there are no C library | 
|  | compatibility issues which you would have with increasing major and | 
|  | minor number sizes. See the section on "Allocation of Device Numbers" | 
|  | for details on maintaining compatibility with userspace. | 
|  |  | 
|  | Solving this requires a kernel change. | 
|  |  | 
|  | Since writing this, the kernel has been modified so that the SCSI disc | 
|  | driver has more major numbers allocated to it and now supports up to | 
|  | 128 discs. Since these major numbers are non-contiguous (a result of | 
|  | unplanned expansion), the implementation is a little more cumbersome | 
|  | than originally. | 
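The disc limits follow from the 8 bit minor number. In the sketch below, 16 minors per disc (the whole disc plus 15 partitions) is taken from the limits quoted above, and the count of eight majors is simply what the 128 disc figure implies.

```python
MINOR_BITS = 8
MINORS_PER_DISC = 16  # whole disc + 15 partitions, as implied by the text


def discs_supported(majors: int) -> int:
    """How many SCSI discs fit into `majors` major numbers."""
    return majors * (2 ** MINOR_BITS) // MINORS_PER_DISC


if __name__ == "__main__":
    print(discs_supported(1), "discs with a single major")
    print(discs_supported(8), "discs with eight majors")
```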
|  |  | 
|  | Just like the changes to IPv4 to fix impending limitations in the | 
|  | address space, people find ways around the limitations. In the long | 
|  | run, however, solutions like IPv6 or devfs can't be put off forever. | 
|  |  | 
|  | Read-only root filesystem | 
|  |  | 
|  | Having your device nodes on the root filesystem means that you can't | 
|  | operate properly with a read-only root filesystem. This is because you | 
|  | want to change ownerships and protections of tty devices. Existing | 
|  | practice prevents you using a CD-ROM as your root filesystem for a | 
|  | *real* system. Sure, you can boot off a CD-ROM, but you can't change | 
|  | tty ownerships, so it's only good for installing. | 
|  |  | 
|  | Also, you can't use a shared NFS root filesystem for a cluster of | 
|  | discless Linux machines (having tty ownerships changed on a common | 
|  | /dev is not good). Nor can you embed your root filesystem in a | 
|  | ROM-FS. | 
|  |  | 
|  | You can get around this by creating a RAMDISC at boot time, making | 
|  | an ext2 filesystem in it, mounting it somewhere and copying the | 
|  | contents of /dev into it, then unmounting it and mounting it over | 
|  | /dev. | 
|  |  | 
|  | A devfs is a cleaner way of solving this. | 
|  |  | 
|  | Non-Unix root filesystem | 
|  |  | 
|  | Non-Unix filesystems (such as NTFS) can't be used for a root | 
|  | filesystem because they variously don't support character and block | 
|  | special files or symbolic links. You can't have a separate disc-based | 
|  | or RAMDISC-based filesystem mounted on /dev because you need device | 
|  | nodes before you can mount these. Devfs can be mounted without any | 
|  | device nodes. Devlinks won't work because symlinks aren't supported. | 
|  | An alternative solution is to use initrd to mount a RAMDISC initial | 
|  | root filesystem (which is populated with a minimal set of device | 
|  | nodes), and then construct a new /dev in another RAMDISC, and finally | 
|  | switch to your non-Unix root filesystem. This requires clever boot | 
|  | scripts and a fragile and conceptually complex boot procedure. | 
|  |  | 
|  | Devfs solves this in a robust and conceptually simple way. | 
|  |  | 
|  | PTY security | 
|  |  | 
|  | Current pseudo-tty (pty) devices are owned by root and read-writable | 
|  | by everyone. The user of a pty-pair cannot change | 
|  | ownership/protections without being suid-root. | 
|  |  | 
|  | This could be solved with a secure user-space daemon which runs as | 
|  | root and does the actual creation of pty-pairs. Such a daemon would | 
|  | require modification to *every* programme that wants to use this new | 
|  | mechanism. It also slows down creation of pty-pairs. | 
|  |  | 
|  | An alternative is to create a new open_pty() syscall which does much | 
|  | the same thing as the user-space daemon. Once again, this requires | 
|  | modifications to pty-handling programmes. | 
|  |  | 
|  | The devfs solution allows a device driver to "tag" certain device | 
|  | files so that when an unopened device is opened, the ownerships are | 
|  | changed to the current euid and egid of the opening process, and the | 
|  | protections are changed to the default registered by the driver. When | 
|  | the device is closed ownership is set back to root and protections are | 
|  | set back to read-write for everybody. No programme need be changed. | 
|  | The devpts filesystem provides this auto-ownership feature for Unix98 | 
|  | ptys. It doesn't support old-style pty devices, nor does it have all | 
|  | the other features of devfs. | 
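The auto-ownership behaviour described above can be modelled in a few lines. This is a toy model, not kernel code: the class and field names are invented for illustration, and resetting ownership only when the last opener closes is an assumption.

```python
from dataclasses import dataclass

ROOT = 0
IDLE_MODE = 0o666  # read-write for everybody, as described above


@dataclass
class TaggedNode:
    """Toy model of a device entry that a driver has tagged for auto-ownership."""
    default_mode: int          # protection registered by the driver
    uid: int = ROOT
    gid: int = ROOT
    mode: int = IDLE_MODE
    open_count: int = 0

    def open(self, euid: int, egid: int) -> None:
        # Only the first open of an unopened device changes ownership.
        if self.open_count == 0:
            self.uid, self.gid, self.mode = euid, egid, self.default_mode
        self.open_count += 1

    def close(self) -> None:
        if self.open_count == 0:
            raise ValueError("close() without a matching open()")
        self.open_count -= 1
        # Assumption: ownership reverts once nobody holds the device open.
        if self.open_count == 0:
            self.uid, self.gid, self.mode = ROOT, ROOT, IDLE_MODE
```

The point of the model is that the opening programme never calls chown() or chmod() itself, which is why no programme needs to be changed.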
|  |  | 
|  | Intelligent device management | 
|  |  | 
|  | Devfs implements a simple yet powerful protocol for communication with | 
|  | a device management daemon (devfsd) which runs in user space. It is | 
|  | possible to send a message (either synchronously or asynchronously) to | 
|  | devfsd on any event, such as registration/unregistration of device | 
|  | entries, opening and closing devices, looking up inodes, scanning | 
|  | directories and more. This has many possibilities. Some of these are | 
|  | already implemented. See: | 
|  |  | 
|  |  | 
|  | http://www.atnf.csiro.au/~rgooch/linux/ | 
|  |  | 
|  | Device entry registration events can be used by devfsd to change | 
|  | permissions of newly-created device nodes. This is one mechanism to | 
|  | control device permissions. | 
|  |  | 
|  | Device entry registration/unregistration events can be used to run | 
|  | programmes or scripts. This can be used to provide automatic mounting | 
|  | of filesystems when new block device media is inserted into the | 
|  | drive. | 
|  |  | 
|  | Asynchronous device open and close events can be used to implement | 
|  | clever permissions management. For example, the default permissions on | 
|  | /dev/dsp do not allow everybody to read from the device. This is | 
|  | sensible, as you don't want some remote user recording what you say at | 
|  | your console. However, the console user is also prevented from | 
|  | recording. This behaviour is not desirable. With asynchronous device | 
|  | open and close events, you can have devfsd run a programme or script | 
|  | when console devices are opened to change the ownerships for *other* | 
|  | device nodes (such as /dev/dsp). On closure, you can run a different | 
|  | script to restore permissions. An advantage of this scheme over | 
|  | modifying the C library tty handling is that this works even if your | 
|  | programme crashes (how many times have you seen the utmp database with | 
|  | lingering entries for non-existent logins?). | 
|  |  | 
|  | Synchronous device open events can be used to perform intelligent | 
|  | device access protections. Before the device driver open() method is | 
|  | called, the daemon must first validate the open attempt, by running an | 
|  | external programme or script. This is far more flexible than access | 
|  | control lists, as access can be determined on the basis of other | 
|  | system conditions instead of just the UID and GID. | 
|  |  | 
|  | Inode lookup events can be used to authenticate module autoload | 
|  | requests. Instead of using kmod directly, the event is sent to | 
|  | devfsd which can implement an arbitrary authentication before loading | 
|  | the module itself. | 
|  |  | 
|  | Inode lookup events can also be used to construct arbitrary | 
|  | namespaces, without having to resort to populating devfs with symlinks | 
|  | to devices that don't exist. | 
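The difference between asynchronous and synchronous events can be illustrated with a toy dispatcher. This is not the devfsd protocol or its configuration syntax; the class, the event names and the handler signature are all hypothetical. An asynchronous event merely informs the handlers, while a synchronous event lets them veto the operation.

```python
from typing import Callable, Dict, List

Handler = Callable[[str], bool]


class EventDispatcher:
    """Toy stand-in for a device management daemon (hypothetical API)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = {}

    def on(self, event: str, handler: Handler) -> None:
        """Attach a handler to an event type such as "register" or "open"."""
        self._handlers.setdefault(event, []).append(handler)

    def notify(self, event: str, path: str) -> None:
        """Asynchronous style: handlers run, their results are ignored."""
        for handler in self._handlers.get(event, []):
            handler(path)

    def validate(self, event: str, path: str) -> bool:
        """Synchronous style: proceed only if every handler agrees."""
        return all(handler(path) for handler in self._handlers.get(event, []))
```

A handler passed to validate() can consult any system condition it likes, which is the flexibility over plain UID and GID checks that the text refers to.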
|  |  | 
|  | Speculative Device Scanning | 
|  |  | 
|  | Consider an application (like cdparanoia) that wants to find all | 
|  | CD-ROM devices on the system (SCSI, IDE and other types), whether or | 
|  | not their respective modules are loaded. The application must | 
|  | speculatively open certain device nodes (such as /dev/sr0 for the SCSI | 
|  | CD-ROMs) in order to make sure the module is loaded. This requires | 
|  | that all Linux distributions follow the standard device naming scheme | 
|  | (last time I looked RedHat did things differently). Devfs solves the | 
|  | naming problem. | 
|  |  | 
|  | The same application also wants to see which devices are actually | 
|  | available on the system. With the existing system it needs to read the | 
|  | /dev directory and speculatively open each /dev/sr* device to | 
|  | determine if the device exists or not. With a large /dev this is an | 
|  | inefficient operation, especially if there are many /dev/sr* nodes. A | 
|  | solution like scsidev could reduce the number of /dev/sr* entries (but | 
|  | of course that also requires all that inefficient directory scanning). | 
|  |  | 
|  | With devfs, the application can open the /dev/sr directory | 
|  | (which triggers the module autoloading if required), and proceed to | 
|  | read /dev/sr. Since only the available devices will have | 
|  | entries, there are no inefficiencies in directory scanning or device | 
|  | openings. | 
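From the application's side, the devfs approach amounts to nothing more than reading a directory. A minimal sketch, assuming the /dev/sr directory named above; the function name is illustrative:

```python
import os


def available_devices(directory: str = "/dev/sr") -> list:
    """List the entries of a device directory; empty if it does not exist."""
    try:
        return sorted(os.listdir(directory))
    except (FileNotFoundError, NotADirectoryError):
        return []


if __name__ == "__main__":
    for entry in available_devices():
        print(entry)
```

No speculative open() calls are needed, because every entry returned corresponds to a device that is actually present.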
|  |  | 
|  | ----------------------------------------------------------------------------- | 
|  |  | 
|  | Who else does it? | 
|  |  | 
|  | FreeBSD has a devfs implementation. Solaris and AIX each have a | 
|  | pseudo-devfs (something akin to scsidev but for all devices, with some | 
|  | unspecified kernel support). BeOS, Plan9 and QNX also have it. SGI's | 
|  | IRIX 6.4 and above also have a device filesystem. | 
|  |  | 
|  | While we shouldn't just automatically do something because others do | 
|  | it, we should not ignore the work of others either. FreeBSD has a lot | 
|  | of competent people working on it, so their opinion should not be | 
|  | blithely ignored. | 
|  |  | 
|  | ----------------------------------------------------------------------------- | 
|  |  | 
|  |  | 
|  | How it works | 
|  |  | 
|  | Registering device entries | 
|  |  | 
|  | For every entry (device node) in a devfs-based /dev a driver must call | 
|  | devfs_register(). This adds the name of the device entry, the | 
|  | file_operations structure pointer and a few other things to an | 
|  | internal table. Device entries may be added and removed at any | 
|  | time. When a device entry is registered, it automagically appears in | 
|  | every mounted devfs. | 
|  |  | 
|  | Inode lookup | 
|  |  | 
|  | When a lookup operation is performed on an entry for which there is no | 
|  | driver information, devfs will attempt to call devfsd. If driver | 
|  | information still cannot be found, a negative dentry is yielded and the | 
|  | next stage operation will be called by the VFS (such as the create() or | 
|  | mknod() inode methods). If driver information can be found, an inode is | 
|  | created (if one does not exist already) and all is well. | 
|  |  | 
|  | Manually creating device nodes | 
|  |  | 
|  | The mknod() method allows you to create an ordinary named pipe in the | 
|  | devfs, or you can create a character or block special inode if one | 
|  | does not already exist. You may wish to create a character or block | 
|  | special inode so that you can set permissions and ownership. Later, if | 
|  | a device driver registers an entry with the same name, the | 
|  | permissions, ownership and times are retained. This is how you can set | 
|  | the protections on a device even before the driver is loaded. Once you | 
|  | create an inode it appears in the directory listing. | 
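The interplay between registration, lookup and mknod() can be summarised in a toy model. This is not the kernel implementation: the class and its methods are invented for illustration, file_operations is represented by an arbitrary object, and inode records are created at registration time as a simplification.

```python
class ToyDevfs:
    """Toy model of the devfs namespace (illustrative only)."""

    def __init__(self, daemon=None):
        self._drivers = {}    # name -> stand-in for file_operations
        self._inodes = {}     # name -> {"mode", "uid", "gid"}
        self.daemon = daemon  # optional callable(name), stand-in for devfsd

    def register(self, name, ops, mode=0o600, uid=0, gid=0):
        """A driver announces a device entry by name."""
        self._drivers[name] = ops
        # An inode made earlier with mknod() keeps its permissions and ownership.
        self._inodes.setdefault(name, {"mode": mode, "uid": uid, "gid": gid})

    def mknod(self, name, mode, uid=0, gid=0):
        """Create an inode by hand, possibly before the driver is loaded."""
        if name in self._inodes:
            raise FileExistsError(name)
        self._inodes[name] = {"mode": mode, "uid": uid, "gid": gid}

    def lookup(self, name):
        """Return the driver operations, or None for a negative dentry."""
        if name not in self._drivers and self.daemon is not None:
            self.daemon(name)  # the daemon may cause the entry to be registered
        return self._drivers.get(name)

    def stat(self, name):
        return dict(self._inodes[name])

    def listdir(self):
        return sorted(self._inodes)
```

Calling mknod() first and register() afterwards leaves the hand-set permissions in place, which is the behaviour the paragraph above relies on.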
|  |  | 
|  | Unregistering device entries | 
|  |  | 
|  | A device driver calls devfs_unregister() to unregister an entry. | 
|  |  | 
|  | Chroot() gaols | 
|  |  | 
|  | 2.2.x kernels | 
|  |  | 
|  | The semantics of inode creation are different when devfs is mounted | 
|  | with the "explicit" option. Now, when a device entry is registered, it | 
|  | will not appear until you use mknod() to create the device. It doesn't | 
|  | matter if you mknod() before or after the device is registered with | 
|  | devfs_register(). The purpose of this behaviour is to support | 
|  | chroot(2) gaols, where you want to mount a minimal devfs inside the | 
|  | gaol. Only the devices you specifically want to be available (through | 
|  | your mknod() setup) will be accessible. | 
|  |  | 
|  | 2.4.x kernels | 
|  |  | 
|  | As of kernel 2.3.99, the VFS has had the ability to rebind parts of | 
|  | the global filesystem namespace into another part of the namespace. | 
|  | This now works even at the leaf-node level, which means that | 
|  | individual files and device nodes may be bound into other parts of the | 
|  | namespace. This is like making links, but better, because it works | 
|  | across filesystems (unlike hard links) and works through chroot() | 
|  | gaols (unlike symbolic links). | 
|  |  | 
|  | Because of these improvements to the VFS, the multi-mount capability | 
|  | in devfs is no longer needed. The administrator may create a minimal | 
|  | device tree inside a chroot(2) gaol by using VFS bindings. As this | 
|  | provides most of the features of the devfs multi-mount capability, I | 
|  | removed the multi-mount support code (after issuing an RFC). This | 
|  | yielded code size reductions and simplifications. | 
|  |  | 
|  | If you want to construct a minimal chroot() gaol, the following | 
|  | command should suffice: | 
|  |  | 
|  | mount --bind /dev/null /gaol/dev/null | 
|  |  | 
|  |  | 
|  | Repeat for other device nodes you want to expose. Simple! | 
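Repeating the command by hand gets tedious for more than a few nodes. The sketch below only builds the command lines and does not run them; the gaol path and the device list are examples, and each target file is expected to exist already before it is bound.

```python
import posixpath
import shlex


def bind_mount_commands(gaol_root: str, devices) -> list:
    """Build one `mount --bind` argument vector per device node."""
    commands = []
    for device in devices:
        target = posixpath.join(gaol_root, device.lstrip("/"))
        commands.append(["mount", "--bind", device, target])
    return commands


if __name__ == "__main__":
    for argv in bind_mount_commands("/gaol", ["/dev/null", "/dev/zero", "/dev/tty"]):
        print(shlex.join(argv))
```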
|  |  | 
|  | ----------------------------------------------------------------------------- | 
|  |  | 
|  |  | 
|  | Operational issues | 
|  |  | 
|  |  | 
|  | Instructions for the impatient | 
|  |  | 
|  | Nobody likes reading documentation. People just want to get in there | 
|  | and play. So this section tells you quickly the steps you need to take | 
|  | to run with devfs mounted over /dev. Skip these steps and you will end | 
|  | up with a nearly unbootable system. Subsequent sections describe the | 
|  | issues in more detail, and discuss non-essential configuration | 
|  | options. | 
|  |  | 
|  | Devfsd | 
|  | OK, if you're reading this, I assume you want to play with | 
|  | devfs. First you should ensure that /usr/src/linux contains a | 
|  | recent kernel source tree. Then you need to compile devfsd, the device | 
|  | management daemon, available at | 
|  |  | 
|  | http://www.atnf.csiro.au/~rgooch/linux/. | 
|  | Because the kernel has a naming scheme | 
|  | which is quite different from the old naming scheme, you need to | 
|  | install devfsd so that software and configuration files that use the | 
|  | old naming scheme will not break. | 
|  |  | 
|  | Compile and install devfsd. You will be provided with a default | 
|  | configuration file /etc/devfsd.conf which will provide | 
|  | compatibility symlinks for the old naming scheme. Don't change this | 
|  | config file unless you know what you're doing. Even if you think you | 
|  | do know what you're doing, don't change it until you've followed all | 
|  | the steps below and booted a devfs-enabled system and verified that it | 
|  | works. | 
|  |  | 
|  | Now edit your main system boot script so that devfsd is started at the | 
|  | very beginning (before any filesystem checks). /etc/rc.d/rc.sysinit is | 
|  | often the main boot script on systems with SysV-style boot scripts. On | 
|  | systems with BSD-style boot scripts it is often /etc/rc. Also check | 
|  | /sbin/rc. | 
|  |  | 
|  | NOTE that the line you put into the boot | 
|  | script should be exactly: | 
|  |  | 
|  | /sbin/devfsd /dev | 
|  |  | 
|  | DO NOT use some special daemon-launching | 
|  | programme, otherwise the boot script may not wait for devfsd to finish | 
|  | initialising. | 
|  |  | 
|  | System Libraries | 
|  | There may still be some problems because of broken software making | 
|  | assumptions about device names. In particular, some software does not | 
|  | handle devices which are symbolic links. If you are running a libc 5 | 
|  | based system, install libc 5.4.44 (if you have libc 5.4.46, go back to | 
|  | libc 5.4.44, which is actually correct). If you are running a glibc | 
|  | based system, make sure you have glibc 2.1.3 or later. | 
|  |  | 
|  | /etc/securetty | 
|  | PAM (Pluggable Authentication Modules) is supposed to be a flexible | 
|  | mechanism for providing better user authentication and access to | 
|  | services. Unfortunately, it's also fragile, complex and undocumented | 
|  | (check out RedHat 6.1, and probably other distributions as well). PAM | 
|  | has problems with symbolic links. Append the following lines to your | 
|  | /etc/securetty file: | 
|  |  | 
|  | vc/1 | 
|  | vc/2 | 
|  | vc/3 | 
|  | vc/4 | 
|  | vc/5 | 
|  | vc/6 | 
|  | vc/7 | 
|  | vc/8 | 
|  |  | 
|  | This will not weaken security. If you have a version of util-linux | 
|  | earlier than 2.10.h, please upgrade to 2.10.h or later. If you | 
|  | absolutely cannot upgrade, then also append the following lines to | 
|  | your /etc/securetty file: | 
|  |  | 
|  | 1 | 
|  | 2 | 
|  | 3 | 
|  | 4 | 
|  | 5 | 
|  | 6 | 
|  | 7 | 
|  | 8 | 
|  |  | 
|  | This may potentially weaken security by allowing root logins over the | 
|  | network (a password is still required, though). However, since there | 
|  | are problems with dealing with symlinks, I'm suspicious of the level | 
|  | of security offered in any case. | 
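If you would rather not type the entries by hand, the lines to append can be computed. This sketch only works out which entries are missing from the existing file contents; it does not write to /etc/securetty, and the function name is illustrative.

```python
def securetty_additions(existing_text: str, consoles: int = 8,
                        include_bare_numbers: bool = False) -> list:
    """Return the entries listed above that are not already present."""
    present = {line.strip() for line in existing_text.splitlines()}
    wanted = [f"vc/{n}" for n in range(1, consoles + 1)]
    if include_bare_numbers:
        # Only needed for util-linux older than 2.10.h; see the caveat above.
        wanted += [str(n) for n in range(1, consoles + 1)]
    return [entry for entry in wanted if entry not in present]


if __name__ == "__main__":
    print("\n".join(securetty_additions("tty1\ntty2\n")))
```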
|  |  | 
|  | XFree86 | 
|  | While not essential, it's probably a good idea to upgrade to XFree86 | 
|  | 4.0, as patches went in to make it more devfs-friendly. If you don't, | 
|  | you'll probably need to apply the following patch to | 
|  | /etc/security/console.perms so that ordinary users can run | 
|  | startx. Note that not all distributions have this file (e.g. Debian), | 
|  | so if it's not present, don't worry about it. | 
|  |  | 
|  | --- /etc/security/console.perms.orig    Sat Apr 17 16:26:47 1999 | 
|  | +++ /etc/security/console.perms Fri Feb 25 23:53:55 2000 | 
|  | @@ -14,7 +14,7 @@ | 
|  | # man 5 console.perms | 
|  |  | 
|  | # file classes -- these are regular expressions | 
|  | -<console>=tty[0-9][0-9]* :[0-9]\.[0-9] :[0-9] | 
|  | +<console>=tty[0-9][0-9]* vc/[0-9][0-9]* :[0-9]\.[0-9] :[0-9] | 
|  |  | 
|  | # device classes -- these are shell-style globs | 
|  | <floppy>=/dev/fd[0-1]* | 
|  |  | 
|  | If the patch does not apply, then replace the line: | 
|  |  | 
|  | <console>=tty[0-9][0-9]* :[0-9]\.[0-9] :[0-9] | 
|  |  | 
|  | with: | 
|  |  | 
|  | <console>=tty[0-9][0-9]* vc/[0-9][0-9]* :[0-9]\.[0-9] :[0-9] | 
|  |  | 
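
If you prefer not to edit the file by hand, the same one-line change to the
console class can be sketched with sed. The function below is a filter
(stdin to stdout) so that it can be tried safely on a copy. The name
add_vc_to_console_class is a hypothetical helper, and the pattern assumes
the console line looks exactly like the one shown above.

```shell
# add_vc_to_console_class: read console.perms text on stdin and write it to
# stdout with "vc/[0-9][0-9]*" added to the <console> class. Lines that do
# not match the expected original form are passed through unchanged.
add_vc_to_console_class() {
    sed 's|^\(<console>=tty\[0-9]\[0-9]\*\) :|\1 vc/[0-9][0-9]* :|'
}
```

A possible use would be to run
add_vc_to_console_class < console.perms.orig > console.perms.new
and compare the two files before installing the result.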
|  |  | 
|  | Disable devpts | 
|  | I've had a report of devpts mounted on /dev/pts not working | 
|  | correctly. Since devfs will also manage /dev/pts, there is no | 
|  | need to mount devpts as well. You should either edit your | 
|  | /etc/fstab so devpts is not mounted, or disable devpts in | 
|  | your kernel configuration. | 
|  |  | 
|  | Unsupported drivers | 
|  | Not all drivers have devfs support. If you depend on one of these | 
|  | drivers, you will need to create a script or tarfile that you can use | 
|  | at boot time to create device nodes as appropriate. The section | 
|  | "Dealing with drivers without devfs support" describes this. The | 
|  | section "Device drivers currently ported" lists the drivers which have | 
|  | devfs support. | 
|  |  | 
|  | /dev/mouse | 
|  |  | 
|  | Many distributions configure /dev/mouse to be the mouse device | 
|  | for XFree86 and GPM. I actually think this is a bad idea, because it | 
|  | adds another level of indirection. When looking at a config file, if | 
|  | you see /dev/mouse you're left wondering which mouse | 
|  | is being referred to. Hence I recommend putting the actual mouse | 
|  | device (for example /dev/psaux) into your | 
|  | /etc/X11/XF86Config file (and similarly for the GPM | 
|  | configuration file). | 
|  |  | 
|  | Alternatively, use the same technique used for unsupported drivers | 
|  | described above. | 
|  |  | 
|  | The Kernel | 
|  | Finally, you need to make sure devfs is compiled into your kernel. Set | 
|  | CONFIG_EXPERIMENTAL=y, CONFIG_DEVFS_FS=y and CONFIG_DEVFS_MOUNT=y using | 
|  | your favourite configuration tool (e.g. make config or make xconfig), | 
|  | then run make clean and recompile your kernel and modules. At boot, | 
|  | devfs will be mounted onto /dev. | 
|  |  | 
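Before rebuilding, it can help to confirm that the three options named in
this section really ended up in the configuration file. The following is a
minimal sketch, assuming the configuration tool wrote its result to a file
of OPTION=y lines (normally .config in the kernel source tree). The
function name check_devfs_config is an illustrative assumption.

```shell
# check_devfs_config FILE: print each required option that is not set to
# "y" in FILE and return non-zero if any are missing. Illustrative only.
check_devfs_config() {
    config=$1
    missing=0
    for opt in CONFIG_EXPERIMENTAL CONFIG_DEVFS_FS CONFIG_DEVFS_MOUNT; do
        # -x matches the whole line, -F treats the text as a fixed string
        if ! grep -qxF "$opt=y" "$config"; then
            echo "missing: $opt=y"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, check_devfs_config .config prints nothing and returns zero
when all three options are enabled.
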
|  | If you encounter problems booting (for example if you forgot a | 
|  | configuration step), you can pass devfs=nomount at the kernel | 
|  | boot command line. This will prevent the kernel from mounting devfs at | 
|  | boot time onto /dev. | 
|  |  | 
|  | In general, a kernel built with CONFIG_DEVFS_FS=y but without mounting | 
|  | devfs onto /dev is completely safe, and requires no | 
|  | configuration changes. One exception to take note of is when | 
|  | LABEL= directives are used in /etc/fstab. In this | 
|  | case you will be unable to boot properly. This is because the | 
|  | mount(8) programme uses /proc/partitions as part of | 
|  | the volume label search process, and the device names it finds are not | 
|  | available, because setting CONFIG_DEVFS_FS=y changes the names in | 
|  | /proc/partitions, irrespective of whether devfs is mounted. | 
|  |  | 
|  | Now you've finished all the steps required. You're now ready to boot | 
|  | your shiny new kernel. Enjoy. | 
|  |  | 
|  | Changing the configuration | 
|  |  | 
|  | OK, you've now booted a devfs-enabled system, and everything works. | 
|  | Now you may feel like changing the configuration (common targets are | 
|  | /etc/fstab and /etc/devfsd.conf). Since you have a | 
|  | system that works, if you make any changes and it doesn't work, you | 
|  | now know that you only have to restore your configuration files to the | 
|  | default and it will work again. | 
|  |  | 
|  |  | 
|  | Permissions persistence across reboots | 
|  |  | 
|  | If you don't use mknod(2) to create a device file, nor use chmod(2) or | 
|  | chown(2) to change the ownerships/permissions, the inode ctime will | 
|  | remain at 0 (the epoch, 12 am, 1-JAN-1970, GMT). Anything with a ctime | 
|  | later than this has had its ownership/permissions changed. Hence, a | 
|  | simple script or programme may be used to tar up all changed inodes, | 
|  | prior to shutdown. Although effective, many consider this approach a | 
|  | kludge. | 
|  |  | 
|  | A much better approach is to use devfsd to save and restore | 
|  | permissions. It may be configured to record changes in permissions and | 
|  | will save them in a database (in fact a directory tree), and restore | 
|  | these upon boot. This is an efficient method and results in immediate | 
|  | saving of current permissions (unlike the tar approach, which saves | 
|  | permissions at some unspecified future time). | 
|  |  | 
|  | The default configuration file supplied with devfsd has config entries | 
|  | which you may uncomment to enable persistence management. | 
|  |  | 
|  | If you decide to use the tar approach anyway, be aware that tar will | 
|  | first unlink(2) an inode before creating a new device node. The | 
|  | unlink(2) has the effect of breaking the connection between a devfs | 
|  | entry and the device driver. If you use the "devfs=only" boot option, | 
|  | you lose access to the device driver, requiring you to reload the | 
|  | module. I consider this a bug in tar (there is no real need to | 
|  | unlink(2) the inode first). | 
|  |  | 
|  | Alternatively, you can use devfsd to provide more sophisticated | 
|  | management of device permissions. You can use devfsd to store | 
|  | permissions for whole groups of devices with a single configuration | 
|  | entry, rather than the conventional single entry per device. | 
|  |  | 
|  | Permissions database stored in mounted-over /dev | 
|  |  | 
|  | If you wish to save and restore your device permissions into the | 
|  | disc-based /dev while still mounting devfs onto /dev | 
|  | you may do so. This requires a 2.4.x kernel (in fact, 2.3.99 or | 
|  | later), which has the VFS binding facility. You need to do the | 
|  | following to set this up: | 
|  |  | 
|  |  | 
|  |  | 
|  | make sure the kernel does not mount devfs at boot time | 
|  |  | 
|  |  | 
|  | make sure you have a correct /dev/console entry in your | 
|  | root file-system (where your disc-based /dev lives) | 
|  |  | 
|  | create the /dev-state directory | 
|  |  | 
|  |  | 
|  | add the following lines near the very beginning of your boot | 
|  | scripts: | 
|  |  | 
|  | mount --bind /dev /dev-state | 
|  | mount -t devfs none /dev | 
|  | devfsd /dev | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  | add the following lines to your /etc/devfsd.conf file: | 
|  |  | 
|  | REGISTER	^pt[sy]		IGNORE | 
|  | CREATE		^pt[sy]		IGNORE | 
|  | CHANGE		^pt[sy]		IGNORE | 
|  | DELETE		^pt[sy]		IGNORE | 
|  | REGISTER	.*		COPY	/dev-state/$devname $devpath | 
|  | CREATE		.*		COPY	$devpath /dev-state/$devname | 
|  | CHANGE		.*		COPY	$devpath /dev-state/$devname | 
|  | DELETE		.*		CFUNCTION GLOBAL unlink /dev-state/$devname | 
|  | RESTORE		/dev-state | 
|  |  | 
|  | Note that the sample devfsd.conf file contains these lines, | 
|  | as well as other sample configurations you may find useful. See the | 
|  | devfsd distribution. | 
|  |  | 
|  |  | 
|  | reboot. | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  | Permissions database stored in normal directory | 
|  |  | 
|  | If you are using an older kernel which doesn't support VFS binding, | 
|  | then you won't be able to have the permissions database in a | 
|  | mounted-over /dev. However, you can still use a regular | 
|  | directory to store the database. The sample /etc/devfsd.conf | 
|  | file above may still be used. You will need to create the | 
|  | /dev-state directory prior to installing devfsd. If you have | 
|  | old permissions in /dev, then just copy (or move) the device | 
|  | nodes over to the new directory. | 
|  |  | 
|  | Which method is better? | 
|  |  | 
|  | The best method is to have the permissions database stored in the | 
|  | mounted-over /dev. This is because you will not need to copy | 
|  | device nodes over to /dev-state, and because it allows you to | 
|  | switch between devfs and non-devfs kernels, without requiring you to | 
|  | copy permissions between /dev-state (for devfs) and | 
|  | /dev (for non-devfs). | 
|  |  | 
|  |  | 
|  | Dealing with drivers without devfs support | 
|  |  | 
|  | Currently, not all device drivers in the kernel have been modified to | 
|  | use devfs. Device drivers which do not yet have devfs support will not | 
|  | automagically appear in devfs. The simplest way to create device nodes | 
|  | for these drivers is to unpack a tarfile containing the required | 
|  | device nodes. You can do this in your boot scripts. All your drivers | 
|  | will now work as before. | 
|  |  | 
|  | Hopefully for most people devfs will have enough support so that they | 
|  | can mount devfs directly over /dev without losing most functionality | 
|  | (i.e. losing access to various devices). As of 22-JAN-1998 (devfs | 
|  | patch version 10) I am now running this way. All the devices I have | 
|  | are available in devfs, so I don't lose anything. | 
|  |  | 
|  | WARNING: if your configuration requires the old-style device names | 
|  | (e.g. /dev/hda1 or /dev/sda1), you must install devfsd and configure | 
|  | it to maintain compatibility entries. It is almost certain that you | 
|  | will require this. Note that the kernel creates a compatibility entry | 
|  | for the root device, so you don't need initrd. | 
|  |  | 
|  | Note that you no longer need to mount devpts if you use Unix98 PTYs, | 
|  | as devfs can manage /dev/pts itself. This saves you some RAM, as you | 
|  | don't need to compile and install devpts. Note that some versions of | 
|  | glibc have a bug with Unix98 pty handling on devfs systems. Contact | 
|  | the glibc maintainers for a fix. Glibc 2.1.3 has the fix. | 
|  |  | 
|  | Note also that apart from editing /etc/fstab, other things will need | 
|  | to be changed if you *don't* install devfsd. Some software (like the X | 
|  | server) hard-wires device names in its source. It really is much | 
|  | easier to install devfsd so that compatibility entries are created. | 
|  | You can then slowly migrate your system to using the new device names | 
|  | (for example, by starting with /etc/fstab), and then limiting the | 
|  | compatibility entries that devfsd creates. | 
|  |  | 
|  | IF YOU CONFIGURE TO MOUNT DEVFS AT BOOT, MAKE SURE YOU INSTALL DEVFSD | 
|  | BEFORE YOU BOOT A DEVFS-ENABLED KERNEL! | 
|  |  | 
|  | Now that devfs has gone into the 2.3.46 kernel, I'm getting a lot of | 
|  | reports back. Many of these are because people are trying to run | 
|  | without devfsd, and hence some things break. Please just run devfsd if | 
|  | things break. I want to concentrate on real bugs rather than | 
|  | misconfiguration problems at the moment. If people are willing to fix | 
|  | bugs/false assumptions in other code (e.g. glibc, X server) and submit | 
|  | that to the respective maintainers, that would be great. | 
|  |  | 
|  |  | 
|  | All the way with Devfs | 
|  |  | 
|  | The devfs kernel patch creates a rationalised device tree. As stated | 
|  | above, if you want to keep using the old /dev naming scheme, | 
|  | you just need to configure devfsd appropriately (see the man | 
|  | page). People who prefer the old names can ignore this section. For | 
|  | those of us who like the rationalised names and an uncluttered | 
|  | /dev, read on. | 
|  |  | 
|  | If you don't run devfsd, or don't enable compatibility entry | 
|  | management, then you will have to configure your system to use the new | 
|  | names. For example, you will then need to edit your | 
|  | /etc/fstab to use the new disc naming scheme. If you want to | 
|  | be able to boot non-devfs kernels, you will need compatibility | 
|  | symlinks in the underlying disc-based /dev pointing back to | 
|  | the old-style names for when you boot a kernel without devfs. | 
|  |  | 
|  | You can selectively decide which devices you want compatibility | 
|  | entries for. For example, you may only want compatibility entries for | 
|  | BSD pseudo-terminal devices (otherwise you'll have to patch your C | 
|  | library or use Unix98 ptys instead). It's just a matter of putting | 
|  | the correct regular expression into /etc/devfsd.conf. | 
|  |  | 
|  | There are other choices of naming schemes that you may prefer. For | 
|  | example, I don't use the kernel-supplied | 
|  | names, because they are too verbose. A common misconception is | 
|  | that the kernel-supplied names are meant to be used directly in | 
|  | configuration files. This is not the case. They are designed to | 
|  | reflect the layout of the devices attached and to provide easy | 
|  | classification. | 
|  |  | 
|  | If you like the kernel-supplied names, that's fine. If you don't then | 
|  | you should be using devfsd to construct a namespace more to your | 
|  | liking. Devfsd has built-in code to construct a | 
|  | namespace that is both logical and easy to | 
|  | manage. In essence, it creates a convenient abbreviation of the | 
|  | kernel-supplied namespace. | 
|  |  | 
|  | You are of course free to build your own namespace. Devfsd has all the | 
|  | infrastructure required to make this easy for you. All you need do is | 
|  | write a script. You can even write some C code and devfsd can load the | 
|  | shared object as a callable extension. | 
|  |  | 
|  |  | 
|  | Other Issues | 
|  |  | 
|  | The init programme | 
|  | Another thing to take note of is whether your init programme | 
|  | creates a Unix socket /dev/telinit. Some versions of init | 
|  | create /dev/telinit so that the telinit programme can | 
|  | communicate with the init process. If you have such a system you need | 
|  | to make sure that devfs is mounted over /dev *before* init | 
|  | starts. In other words, you can't leave the mounting of devfs to | 
|  | /etc/rc, since this is executed after init. Other | 
|  | versions of init require a named pipe /dev/initctl | 
|  | which must exist *before* init starts. Once again, you need to | 
|  | mount devfs and then create the named pipe *before* init | 
|  | starts. | 
|  |  | 
|  | The default behaviour now is not to mount devfs onto /dev at | 
|  | boot time for 2.3.x and later kernels. You can correct this with the | 
|  | "devfs=mount" boot option. This solves any problems with init, | 
|  | and also prevents the dreaded: | 
|  |  | 
|  | Cannot open initial console | 
|  |  | 
|  | message. For 2.2.x kernels where you need to apply the devfs patch, | 
|  | the default is to mount. | 
|  |  | 
|  | If you have automatic mounting of devfs onto /dev then you | 
|  | may need to create /dev/initctl in your boot scripts. The | 
|  | following lines should suffice: | 
|  |  | 
|  | mknod /dev/initctl p | 
|  | kill -SIGUSR1 1       # tell init that /dev/initctl now exists | 
|  |  | 
|  | Alternatively, if you don't want the kernel to mount devfs onto | 
|  | /dev then the following procedure is a guideline for how to get | 
|  | around /dev/initctl problems: | 
|  |  | 
|  | # cd /sbin | 
|  | # mv init init.real | 
|  | # cat > init | 
|  | #! /bin/sh | 
|  | mount -n -t devfs none /dev | 
|  | mknod /dev/initctl p | 
|  | exec /sbin/init.real "$@" | 
|  | [control-D] | 
|  | # chmod a+x init | 
|  |  | 
|  | Note that newer versions of init create /dev/initctl | 
|  | automatically, so you don't have to worry about this. | 
|  |  | 
|  | Module autoloading | 
|  | You will need to configure devfsd to enable module | 
|  | autoloading. The following lines should be placed in your | 
|  | /etc/devfsd.conf file: | 
|  |  | 
|  | LOOKUP	.*		MODLOAD | 
|  |  | 
|  |  | 
|  | As of devfsd-v1.3.10, a generic /etc/modules.devfs | 
|  | configuration file is installed, which is used by the MODLOAD | 
|  | action. This should be sufficient for most configurations. If you | 
|  | require further configuration, edit your /etc/modules.conf | 
|  | file. The way module autoloading works with devfs is: | 
|  |  | 
|  |  | 
|  | a process attempts to lookup a device node (e.g. /dev/fred) | 
|  |  | 
|  |  | 
|  | if that device node does not exist, the full pathname is passed to | 
|  | devfsd as a string | 
|  |  | 
|  |  | 
|  | devfsd will pass the string to the modprobe programme (provided the | 
|  | configuration line shown above is present), and specifies that | 
|  | /etc/modules.devfs is the configuration file | 
|  |  | 
|  |  | 
|  | /etc/modules.devfs includes /etc/modules.conf to | 
|  | access local configurations | 
|  |  | 
|  | modprobe will search its configuration files, looking for an alias | 
|  | that translates the pathname into a module name | 
|  |  | 
|  |  | 
|  | the translated pathname is then used to load the module. | 
|  |  | 
|  |  | 
|  | If you wanted a lookup of /dev/fred to load the | 
|  | mymod module, you would require the following configuration | 
|  | line in /etc/modules.conf: | 
|  |  | 
|  | alias    /dev/fred    mymod | 
|  |  | 
|  | The /etc/modules.devfs configuration file provides many such | 
|  | aliases for standard device names. If you look closely at this file, | 
|  | you will note that some modules require multiple alias configuration | 
|  | lines. This is required to support module autoloading for old and new | 
|  | device names. | 
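
The alias translation step described above can be illustrated with a small
shell function that scans a modules.conf-style file for an exact-match
alias line. This is only a sketch of the lookup idea: the name lookup_alias
is hypothetical, and the real modprobe does considerably more than this.

```shell
# lookup_alias FILE PATHNAME: print the module name from the first
# "alias PATHNAME MODULE" line in FILE. Returns non-zero if there is no
# such alias. Hypothetical helper; exact matches only.
lookup_alias() {
    awk -v want="$2" '
        $1 == "alias" && $2 == want { print $3; found = 1; exit }
        END { exit !found }
    ' "$1"
}
```

With the example line shown earlier in the file, lookup_alias FILE /dev/fred
would print mymod.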
|  |  | 
|  | Mounting root off a devfs device | 
|  | If you wish to mount root off a devfs device when you pass the | 
|  | "devfs=only" boot option, then you need to pass in the | 
|  | "root=<device>" option to the kernel when booting. If you use | 
|  | LILO, then you must have this in lilo.conf: | 
|  |  | 
|  | append = "root=<device>" | 
|  |  | 
|  | Surprised? Yep, so was I. It turns out if you have (as most people | 
|  | do): | 
|  |  | 
|  | root = <device> | 
|  |  | 
|  |  | 
|  | then LILO will determine the device number of <device> and will | 
|  | write that device number into a special place in the kernel image | 
|  | before starting the kernel, and the kernel will use that device number | 
|  | to mount the root filesystem. So, using the "append" variety ensures | 
|  | that LILO passes the root filesystem device as a string, which devfs | 
|  | can then use. | 
|  |  | 
|  | Note that this isn't an issue if you don't pass "devfs=only". | 
|  |  | 
|  | TTY issues | 
|  | The ttyname(3) function in some versions of the C library makes | 
|  | false assumptions about device entries which are symbolic links.  The | 
|  | tty(1) programme is one that depends on this function.  I've | 
|  | written a patch to libc 5.4.43 which fixes this. This has been | 
|  | included in libc 5.4.44 and a similar fix is in glibc 2.1.3. | 
|  |  | 
|  |  | 
|  | Kernel Naming Scheme | 
|  |  | 
|  | The kernel provides a default naming scheme. This scheme is designed | 
|  | to make it easy to search for specific devices or device types, and to | 
|  | view the available devices. Some device types (such as hard discs), | 
|  | have a directory of entries, making it easy to see what devices of | 
|  | that class are available. Often, the entries are symbolic links into a | 
|  | directory tree that reflects the topology of available devices. The | 
|  | topological tree is useful for finding how your devices are arranged. | 
|  |  | 
|  | Below is a list of the naming schemes for the most common drivers. A | 
|  | list of reserved device names is | 
|  | available for reference. Please send email to | 
|  | [email protected] to obtain an allocation. Please be | 
|  | patient (the maintainer is busy). An alternative name may be allocated | 
|  | instead of the requested name, at the discretion of the maintainer. | 
|  |  | 
|  | Disc Devices | 
|  |  | 
|  | All discs, whether SCSI, IDE or whatever, are placed under the | 
|  | /dev/discs hierarchy: | 
|  |  | 
|  | /dev/discs/disc0	first disc | 
|  | /dev/discs/disc1	second disc | 
|  |  | 
|  |  | 
|  | Each of these entries is a symbolic link to the directory for that | 
|  | device. The device directory contains: | 
|  |  | 
|  | disc	for the whole disc | 
|  | part*	for individual partitions | 
|  |  | 
|  |  | 
|  | CD-ROM Devices | 
|  |  | 
|  | All CD-ROMs, whether SCSI, IDE or whatever, are placed under the | 
|  | /dev/cdroms hierarchy: | 
|  |  | 
|  | /dev/cdroms/cdrom0	first CD-ROM | 
|  | /dev/cdroms/cdrom1	second CD-ROM | 
|  |  | 
|  |  | 
|  | Each of these entries is a symbolic link to the real device entry for | 
|  | that device. | 
|  |  | 
|  | Tape Devices | 
|  |  | 
|  | All tapes, whether SCSI, IDE or whatever, are placed under the | 
|  | /dev/tapes hierarchy: | 
|  |  | 
|  | /dev/tapes/tape0	first tape | 
|  | /dev/tapes/tape1	second tape | 
|  |  | 
|  |  | 
|  | Each of these entries is a symbolic link to the directory for that | 
|  | device. The device directory contains: | 
|  |  | 
|  | mt			for mode 0 | 
|  | mtl			for mode 1 | 
|  | mtm			for mode 2 | 
|  | mta			for mode 3 | 
|  | mtn			for mode 0, no rewind | 
|  | mtln			for mode 1, no rewind | 
|  | mtmn			for mode 2, no rewind | 
|  | mtan			for mode 3, no rewind | 
|  |  | 
|  |  | 
|  | SCSI Devices | 
|  |  | 
|  | To uniquely identify any SCSI device requires the following | 
|  | information: | 
|  |  | 
|  | controller	(host adapter) | 
|  | bus		(SCSI channel) | 
|  | target	(SCSI ID) | 
|  | unit		(Logical Unit Number) | 
|  |  | 
|  |  | 
|  | All SCSI devices are placed under /dev/scsi (assuming devfs | 
|  | is mounted on /dev). Hence, a SCSI device with the following | 
|  | parameters: c=1,b=2,t=3,u=4 would appear as: | 
|  |  | 
|  | /dev/scsi/host1/bus2/target3/lun4	device directory | 
|  |  | 
|  |  | 
|  | Inside this directory, a number of device entries may be created, | 
|  | depending on which SCSI device-type drivers were installed. | 
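
The mapping from the four SCSI parameters to the device directory can be
written down as a one-line shell function. This is a sketch that assumes
devfs is mounted on /dev. The name scsi_devfs_dir is made up for
illustration.

```shell
# scsi_devfs_dir CONTROLLER BUS TARGET UNIT: print the devfs directory for
# the SCSI device with those parameters (devfs assumed mounted on /dev).
scsi_devfs_dir() {
    printf '/dev/scsi/host%s/bus%s/target%s/lun%s\n' "$1" "$2" "$3" "$4"
}
```

For the example parameters c=1,b=2,t=3,u=4, scsi_devfs_dir 1 2 3 4 gives
the directory shown above.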
|  |  | 
|  | See the section on the disc naming scheme to see what entries the SCSI | 
|  | disc driver creates. | 
|  |  | 
|  | See the section on the tape naming scheme to see what entries the SCSI | 
|  | tape driver creates. | 
|  |  | 
|  | The SCSI CD-ROM driver creates: | 
|  |  | 
|  | cd | 
|  |  | 
|  |  | 
|  | The SCSI generic driver creates: | 
|  |  | 
|  | generic | 
|  |  | 
|  |  | 
|  | IDE Devices | 
|  |  | 
|  | To uniquely identify any IDE device requires the following | 
|  | information: | 
|  |  | 
|  | controller | 
|  | bus		(aka. primary/secondary) | 
|  | target	(aka. master/slave) | 
|  | unit | 
|  |  | 
|  |  | 
|  | All IDE devices are placed under /dev/ide, and use a similar | 
|  | naming scheme to the SCSI subsystem. | 
|  |  | 
|  | XT Hard Discs | 
|  |  | 
|  | All XT discs are placed under /dev/xd. The first XT disc has | 
|  | the directory /dev/xd/disc0. | 
|  |  | 
|  | TTY devices | 
|  |  | 
|  | The tty devices now appear as: | 
|  |  | 
|  | New name                   Old-name                   Device Type | 
|  | --------                   --------                   ----------- | 
|  | /dev/tts/{0,1,...}         /dev/ttyS{0,1,...}         Serial ports | 
|  | /dev/cua/{0,1,...}         /dev/cua{0,1,...}          Call out devices | 
|  | /dev/vc/0                  /dev/tty0                  Current virtual console | 
|  | /dev/vc/{1,2,...}          /dev/tty{1...63}           Virtual consoles | 
|  | /dev/vcc/{0,1,...}         /dev/vcs{1...63}           Virtual console capture devices | 
|  | /dev/pty/m{0,1,...}        /dev/ptyp??                PTY masters | 
|  | /dev/pty/s{0,1,...}        /dev/ttyp??                PTY slaves | 
|  |  | 
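For the rows of this table where the number carries over unchanged (serial
ports, call out devices and numbered virtual consoles), the translation
from a new name back to the old name is mechanical. The sketch below covers
only those rows. The function name tty_old_name is an assumption, and the
PTY and current-console entries are deliberately left out.

```shell
# tty_old_name NEWNAME: print the old-style name for a devfs tty name.
# Handles tts/N, cua/N and vc/N (N >= 1) only; returns non-zero otherwise.
tty_old_name() {
    new=${1#/dev/}
    case $new in
        tts/[0-9]*) echo "/dev/ttyS${new#tts/}" ;;
        cua/[0-9]*) echo "/dev/cua${new#cua/}" ;;
        vc/[1-9]*)  echo "/dev/tty${new#vc/}" ;;
        *) return 1 ;;
    esac
}
```

For instance, tty_old_name /dev/tts/0 prints /dev/ttyS0.
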
|  |  | 
|  | RAMDISCS | 
|  |  | 
|  | The RAMDISCS are placed in their own directory, and are named thus: | 
|  |  | 
|  | /dev/rd/{0,1,2,...} | 
|  |  | 
|  |  | 
|  | Meta Devices | 
|  |  | 
|  | The meta devices are placed in their own directory, and are named | 
|  | thus: | 
|  |  | 
|  | /dev/md/{0,1,2,...} | 
|  |  | 
|  |  | 
|  | Floppy discs | 
|  |  | 
|  | Floppy discs are placed in the /dev/floppy directory. | 
|  |  | 
|  | Loop devices | 
|  |  | 
|  | Loop devices are placed in the /dev/loop directory. | 
|  |  | 
|  | Sound devices | 
|  |  | 
|  | Sound devices are placed in the /dev/sound directory | 
|  | (audio, sequencer, ...). | 
|  |  | 
|  |  | 
|  | Devfsd Naming Scheme | 
|  |  | 
|  | Devfsd provides a naming scheme which is a convenient abbreviation of | 
|  | the kernel-supplied namespace. In some | 
|  | cases, the kernel-supplied naming scheme is quite convenient, so | 
|  | devfsd does not provide another naming scheme. The convenience names | 
|  | that devfsd creates are in fact the same names as the original devfs | 
|  | kernel patch created (before Linus mandated the Big Name | 
|  | Change). These are referred to as "new compatibility entries". | 
|  |  | 
|  | In order to configure devfsd to create these convenience names, the | 
|  | following lines should be placed in your /etc/devfsd.conf: | 
|  |  | 
|  | REGISTER	.*		MKNEWCOMPAT | 
|  | UNREGISTER	.*		RMNEWCOMPAT | 
|  |  | 
|  | This will cause devfsd to create (and destroy) symbolic links which | 
|  | point to the kernel-supplied names. | 
|  |  | 
|  | SCSI Hard Discs | 
|  |  | 
|  | All SCSI discs are placed under /dev/sd (assuming devfs is | 
|  | mounted on /dev). Hence, a SCSI disc with the following | 
|  | parameters: c=1,b=2,t=3,u=4 would appear as: | 
|  |  | 
|  | /dev/sd/c1b2t3u4	for the whole disc | 
|  | /dev/sd/c1b2t3u4p5	for the 5th partition | 
|  | /dev/sd/c1b2t3u4p5s6	for the 6th slice in the 5th partition | 
|  |  | 
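The abbreviated SCSI disc names listed above follow a fixed pattern, which
the following sketch reproduces. The function name sd_name and its argument
order are assumptions made for illustration. The partition and slice
arguments are optional.

```shell
# sd_name CONTROLLER BUS TARGET UNIT [PARTITION [SLICE]]: print the
# devfsd-style name for a SCSI disc, partition or slice.
sd_name() {
    name="/dev/sd/c${1}b${2}t${3}u${4}"
    if [ -n "${5:-}" ]; then name="${name}p${5}"; fi
    if [ -n "${6:-}" ]; then name="${name}s${6}"; fi
    printf '%s\n' "$name"
}
```

So sd_name 1 2 3 4 5 would print /dev/sd/c1b2t3u4p5.
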
|  |  | 
|  | SCSI Tapes | 
|  |  | 
|  | All SCSI tapes are placed under /dev/st. A similar naming | 
|  | scheme is used as for SCSI discs. A SCSI tape with the | 
|  | parameters: c=1,b=2,t=3,u=4 would appear as: | 
|  |  | 
|  | /dev/st/c1b2t3u4m0	for mode 0 | 
|  | /dev/st/c1b2t3u4m1	for mode 1 | 
|  | /dev/st/c1b2t3u4m2	for mode 2 | 
|  | /dev/st/c1b2t3u4m3	for mode 3 | 
|  | /dev/st/c1b2t3u4m0n	for mode 0, no rewind | 
|  | /dev/st/c1b2t3u4m1n	for mode 1, no rewind | 
|  | /dev/st/c1b2t3u4m2n	for mode 2, no rewind | 
|  | /dev/st/c1b2t3u4m3n	for mode 3, no rewind | 
|  |  | 
|  |  | 
|  | SCSI CD-ROMs | 
|  |  | 
|  | All SCSI CD-ROMs are placed under /dev/sr. A similar naming | 
|  | scheme is used as for SCSI discs. A SCSI CD-ROM with the | 
|  | parameters: c=1,b=2,t=3,u=4 would appear as: | 
|  |  | 
|  | /dev/sr/c1b2t3u4 | 
|  |  | 
|  |  | 
|  | SCSI Generic Devices | 
|  |  | 
|  | The generic (aka. raw) interfaces for all SCSI devices are placed under | 
|  | /dev/sg. A similar naming scheme is used as for SCSI discs. A | 
|  | SCSI generic device with the parameters: c=1,b=2,t=3,u=4 would appear | 
|  | as: | 
|  |  | 
|  | /dev/sg/c1b2t3u4 | 
|  |  | 
|  |  | 
|  | IDE Hard Discs | 
|  |  | 
|  | All IDE discs are placed under /dev/ide/hd, using a similar | 
|  | convention to SCSI discs. The following mappings exist between the old | 
|  | and the new names: | 
|  |  | 
|  | /dev/hda	/dev/ide/hd/c0b0t0u0 | 
|  | /dev/hdb	/dev/ide/hd/c0b0t1u0 | 
|  | /dev/hdc	/dev/ide/hd/c0b1t0u0 | 
|  | /dev/hdd	/dev/ide/hd/c0b1t1u0 | 
|  |  | 
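The table above can be turned into a small lookup, sketched here as a shell
function. Only the four discs listed in the table are handled. The optional
trailing partition number is rendered with a "p" suffix on the assumption
that IDE discs follow the same partition convention as the SCSI disc names.
The name ide_new_name is hypothetical.

```shell
# ide_new_name OLDNAME: print the devfsd-style name for /dev/hda../dev/hdd,
# optionally with a partition number (e.g. /dev/hdc2). Returns non-zero for
# anything not covered by the table.
ide_new_name() {
    old=${1#/dev/}
    part=${old#hd?}          # trailing partition number, may be empty
    case $part in
        *[!0-9]*) return 1 ;;    # not of the form hdX or hdXN
    esac
    disc=${old%"$part"}      # hda, hdb, ...
    case $disc in
        hda) new=c0b0t0u0 ;;
        hdb) new=c0b0t1u0 ;;
        hdc) new=c0b1t0u0 ;;
        hdd) new=c0b1t1u0 ;;
        *) return 1 ;;
    esac
    printf '/dev/ide/hd/%s%s\n' "$new" "${part:+p$part}"
}
```

For example, ide_new_name /dev/hdb prints /dev/ide/hd/c0b0t1u0.
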
|  |  | 
|  | IDE Tapes | 
|  |  | 
|  | A similar naming scheme is used as for IDE discs. The entries will | 
|  | appear in the /dev/ide/mt directory. | 
|  |  | 
|  | IDE CD-ROM | 
|  |  | 
|  | A similar naming scheme is used as for IDE discs. The entries will | 
|  | appear in the /dev/ide/cd directory. | 
|  |  | 
|  | IDE Floppies | 
|  |  | 
|  | A similar naming scheme is used as for IDE discs. The entries will | 
|  | appear in the /dev/ide/fd directory. | 
|  |  | 
|  | XT Hard Discs | 
|  |  | 
|  | All XT discs are placed under /dev/xd. The first XT disc | 
|  | would appear as /dev/xd/c0t0. | 
|  |  | 
|  |  | 
|  | Old Compatibility Names | 
|  |  | 
|  | The old compatibility names are the legacy device names, such as | 
|  | /dev/hda, /dev/sda, /dev/rtc and so on. | 
|  | Devfsd can be configured to create compatibility symlinks so that you | 
|  | may continue to use the old names in your configuration files and so | 
|  | that old applications will continue to function correctly. | 
|  |  | 
|  | In order to configure devfsd to create these legacy names, the | 
|  | following lines should be placed in your /etc/devfsd.conf: | 
|  |  | 
|  | REGISTER	.*		MKOLDCOMPAT | 
|  | UNREGISTER	.*		RMOLDCOMPAT | 
|  |  | 
|  | This will cause devfsd to create (and destroy) symbolic links which | 
|  | point to the kernel-supplied names. | 
|  |  | 
|  |  | 
|  | ----------------------------------------------------------------------------- | 
|  |  | 
|  |  | 
|  | Device drivers currently ported | 
|  |  | 
|  | - All miscellaneous character devices support devfs (this is done | 
|  | transparently through misc_register()) | 
|  |  | 
|  | - SCSI discs and generic hard discs | 
|  |  | 
|  | - Character memory devices (null, zero, full and so on) | 
|  | Thanks to C. Scott Ananian <[email protected]> | 
|  |  | 
|  | - Loop devices (/dev/loop?) | 
|  |  | 
|  | - TTY devices (console, serial ports, terminals and pseudo-terminals) | 
|  | Thanks to C. Scott Ananian <[email protected]> | 
|  |  | 
|  | - SCSI tapes (/dev/scsi and /dev/tapes) | 
|  |  | 
|  | - SCSI CD-ROMs (/dev/scsi and /dev/cdroms) | 
|  |  | 
|  | - SCSI generic devices (/dev/scsi) | 
|  |  | 
|  | - RAMDISCS (/dev/ram?) | 
|  |  | 
|  | - Meta Devices (/dev/md*) | 
|  |  | 
|  | - Floppy discs (/dev/floppy) | 
|  |  | 
|  | - Parallel port printers (/dev/printers) | 
|  |  | 
|  | - Sound devices (/dev/sound) | 
|  | Thanks to Eric Dumas <[email protected]> and | 
|  | C. Scott Ananian <[email protected]> | 
|  |  | 
|  | - Joysticks (/dev/joysticks) | 
|  |  | 
|  | - Sparc keyboard (/dev/kbd) | 
|  |  | 
|  | - DSP56001 digital signal processor (/dev/dsp56k) | 
|  |  | 
|  | - Apple Desktop Bus (/dev/adb) | 
|  |  | 
|  | - Coda network file system (/dev/cfs*) | 
|  |  | 
|  | - Virtual console capture devices (/dev/vcc) | 
|  | Thanks to Dennis Hou <[email protected]> | 
|  |  | 
|  | - Frame buffer devices (/dev/fb) | 
|  |  | 
|  | - Video capture devices (/dev/v4l) | 
|  |  | 
|  |  | 
|  | ----------------------------------------------------------------------------- | 
|  |  | 
|  |  | 
|  | Allocation of Device Numbers | 
|  |  | 
|  | Devfs allows you to write a driver which doesn't need to allocate a | 
|  | device number (major&minor numbers) for the internal operation of the | 
|  | kernel. However, there are a number of userspace programmes that use | 
|  | the device number as a unique handle for a device. An example is the | 
|  | find programme, which uses device numbers to determine whether | 
|  | an inode is on a different filesystem than another inode. The device | 
|  | number used is the one for the block device which a filesystem is | 
|  | using. To preserve compatibility with userspace programmes, block | 
|  | devices using devfs need to have unique device numbers allocated to | 
|  | them. Furthermore, POSIX specifies device numbers, so some kind of | 
|  | device number needs to be presented to userspace. | 
|  |  | 
|  | The simplest option (especially when porting drivers to devfs) is to | 
|  | keep using the old major and minor numbers. Devfs will take whatever | 
|  | values are given for major&minor and pass them onto userspace. | 
|  |  | 
|  | This device number is a 16 bit number, so this leaves plenty of space | 
|  | for large numbers of discs and partitions. This scheme can also be | 
|  | used for character devices, in particular the tty devices, which are | 
|  | currently limited to 256 pseudo-ttys (this limits the total number of | 
|  | simultaneous xterms and remote logins).  Note that the device number | 
|  | is limited to the range 36864-61439 (majors 144-239), in order to | 
|  | avoid any possible conflicts with existing official allocations. | 
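The arithmetic behind the quoted range can be checked with a short sketch. It assumes the traditional 16 bit layout described above (8 bits of major, 8 bits of minor). The function names are illustrative only and are not part of any kernel or devfs API.

```python
def make_dev(major, minor):
    """Pack an 8-bit major and an 8-bit minor into a 16-bit device number."""
    if not (0 <= major <= 255 and 0 <= minor <= 255):
        raise ValueError("major and minor must each fit in 8 bits")
    return (major << 8) | minor


def split_dev(dev):
    """Return (major, minor) for a 16-bit device number."""
    return dev >> 8, dev & 0xFF


# Bounds of the range quoted in the text (majors 144-239).
LOWEST = make_dev(144, 0)
HIGHEST = make_dev(239, 255)
```

With these definitions, make_dev(144, 0) is 36864 and make_dev(239, 255) is 61439, which matches the limits given above.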
|  |  | 
|  | Please note that using dynamically allocated block device numbers may | 
|  | break the NFS daemons (both user and kernel mode), which expect dev_t | 
|  | for a given device to be constant over the lifetime of remote mounts. | 
|  |  | 
|  | A final note on this scheme: since it doesn't increase the size of | 
|  | device numbers, there are no compatibility issues with userspace. | 
|  |  | 
|  | ----------------------------------------------------------------------------- | 
|  |  | 
|  |  | 
|  | Questions and Answers | 
|  |  | 
|  |  | 
|  | Making things work | 
|  | Alternatives to devfs | 
|  | What I don't like about devfs | 
|  | How to report bugs | 
|  | Strange kernel messages | 
|  | Compilation problems with devfsd | 
|  |  | 
|  |  | 
|  |  | 
|  | Making things work | 
|  |  | 
|  | Here are some common questions and answers. | 
|  |  | 
|  |  | 
|  |  | 
|  | Devfsd doesn't start | 
|  |  | 
|  | - Make sure you have compiled and installed devfsd | 
|  | - Make sure devfsd is being started from your boot scripts | 
|  | - Make sure you have configured your kernel to enable devfs (see | 
|  |   below) | 
|  | - Make sure devfs is mounted (see below) | 
|  |  | 
|  |  | 
|  | Devfsd is not managing all my permissions | 
|  |  | 
|  | Make sure you are capturing the appropriate events. For example, | 
|  | device entries created by the kernel generate REGISTER events, | 
|  | but those created by devfsd generate CREATE events. | 
|  |  | 
|  |  | 
|  | Devfsd is not capturing all REGISTER events | 
|  |  | 
|  | See the previous entry: you may need to capture CREATE events. | 
|  |  | 
|  |  | 
|  | X will not start | 
|  |  | 
|  | Make sure you followed the steps | 
|  | outlined above. | 
|  |  | 
|  |  | 
|  | Why don't my network devices appear in devfs? | 
|  |  | 
|  | This is not a bug. Network devices have their own, completely separate | 
|  | namespace. They are accessed via socket(2) and | 
|  | setsockopt(2) calls, and thus require no device nodes. I have | 
|  | raised the possibility of moving network devices into the device | 
|  | namespace, but have had no response. | 
|  |  | 
|  |  | 
|  | How can I test if I have devfs compiled into my kernel? | 
|  |  | 
|  | All filesystems built-in or currently loaded are listed in | 
|  | /proc/filesystems. If you see a devfs entry, then | 
|  | you know that devfs was compiled into your kernel. If you have | 
|  | correctly configured and rebuilt your kernel, then devfs will be | 
|  | built-in. If you think you've configured it in, but | 
|  | /proc/filesystems doesn't show it, you've made a mistake. | 
|  | Common mistakes include: | 
|  |  | 
|  | - Using a 2.2.x kernel without applying the devfs patch (if you | 
|  |   don't know how to patch your kernel, use 2.4.x instead, don't bother | 
|  |   asking me how to patch) | 
|  | - Forgetting to set CONFIG_EXPERIMENTAL=y | 
|  | - Forgetting to set CONFIG_DEVFS_FS=y | 
|  | - Forgetting to set CONFIG_DEVFS_MOUNT=y (if you want devfs | 
|  |   to be automatically mounted at boot) | 
|  | - Editing your .config manually, instead of using make | 
|  |   config or make xconfig | 
|  | - Forgetting to run make dep; make clean after changing the | 
|  |   configuration and before compiling | 
|  | - Forgetting to compile your kernel and modules | 
|  | - Forgetting to install your kernel | 
|  | - Forgetting to install your modules | 
|  |  | 
|  | Please check twice that you've done all these steps before sending in | 
|  | a bug report. | 
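The /proc/filesystems check described above can be scripted. This is a minimal sketch, assuming the usual layout of that file (one filesystem per line, optionally preceded by the word "nodev"). The function name and the path parameter are my own choices for illustration.

```python
def has_filesystem(name, path="/proc/filesystems"):
    """Return True if `name` appears in the kernel's filesystem list."""
    with open(path) as listing:
        for line in listing:
            fields = line.split()
            # Each line is either "<fstype>" or "nodev <fstype>", so the
            # filesystem name is always the last field.
            if fields and fields[-1] == name:
                return True
    return False
```

Calling has_filesystem("devfs") on a running system then answers the question without reading the listing by eye.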
|  |  | 
|  |  | 
|  |  | 
|  | How can I test if devfs is mounted on /dev? | 
|  |  | 
|  | The device filesystem will always create an entry called | 
|  | ".devfsd", which is used to communicate with the daemon. Even | 
|  | if the daemon is not running, this entry will exist. Testing for the | 
|  | existence of this entry is the approved method of determining if devfs | 
|  | is mounted or not. Note that the type of entry (i.e. regular file, | 
|  | character device, named pipe, etc.) may change without notice. Only | 
|  | the existence of the entry should be relied upon. | 
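A script can apply this test directly. The sketch below checks only that the name exists and deliberately ignores the entry type, as advised above. The function name and the dev_dir parameter are illustrative assumptions.

```python
import os


def devfs_mounted(dev_dir="/dev"):
    """Return True if the ".devfsd" control entry exists under dev_dir."""
    # lexists() tests only for the presence of the name; it does not care
    # whether the entry is a regular file, a device node or anything else.
    return os.path.lexists(os.path.join(dev_dir, ".devfsd"))
```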
|  |  | 
|  |  | 
|  | When I start devfsd, I see the error: | 
|  | Error opening file: ".devfsd"   No such file or directory? | 
|  |  | 
|  | This means that devfs is not mounted. Make sure you have devfs mounted. | 
|  |  | 
|  |  | 
|  | How do I mount devfs? | 
|  |  | 
|  | First make sure you have devfs compiled into your kernel (see | 
|  | above). Then you will either need to: | 
|  |  | 
|  | - set CONFIG_DEVFS_MOUNT=y in your kernel config | 
|  | - pass devfs=mount to your boot loader | 
|  | - mount devfs manually in your boot scripts with: | 
|  |     mount -t devfs none /dev | 
|  |  | 
|  |  | 
|  |  | 
|  | Mount by volume LABEL=<label> doesn't work with devfs | 
|  |  | 
|  | Most probably you are not mounting devfs onto /dev. What | 
|  | happens is that if your kernel config has CONFIG_DEVFS_FS=y | 
|  | then the contents of /proc/partitions will have the devfs | 
|  | names (such as scsi/host0/bus0/target0/lun0/part1). The | 
|  | contents of /proc/partitions are used by mount(8) when | 
|  | mounting by volume label. If devfs is not mounted on /dev, | 
|  | then mount(8) will fail to find devices. The solution is to | 
|  | make sure that devfs is mounted on /dev. See above for how to | 
|  | do that. | 
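To see whether this mismatch is what is affecting you, you can compare the names in /proc/partitions against what actually exists under /dev. The sketch below assumes the common layout of that file (a header line, then rows of major, minor, block count and name). The function name is hypothetical, and this is not how mount(8) itself is implemented.

```python
import os


def missing_partition_nodes(partitions_text, dev_dir="/dev"):
    """List names from /proc/partitions that have no entry under dev_dir."""
    missing = []
    for line in partitions_text.splitlines():
        fields = line.split()
        # Data rows look like "<major> <minor> <#blocks> <name>"; this
        # skips the header line and any blank lines.
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        name = fields[3]
        if not os.path.exists(os.path.join(dev_dir, name)):
            missing.append(name)
    return missing
```

Pass it the contents of /proc/partitions. If devfs names such as scsi/host0/bus0/target0/lun0/part1 come back as missing, devfs is most likely not mounted on /dev.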
|  |  | 
|  |  | 
|  | I have extra or incorrect entries in /dev | 
|  |  | 
|  | You may have stale entries in your dev-state area. Check for a | 
|  | RESTORE configuration line in your devfsd configuration | 
|  | (typically /etc/devfsd.conf). If you have this line, check | 
|  | the contents of the specified directory for stale entries. Remove | 
|  | any entries which are incorrect, then reboot. | 
|  |  | 
|  |  | 
|  | I get "Unable to open initial console" messages at boot | 
|  |  | 
|  | This usually happens when you don't have devfs automounted onto | 
|  | /dev at boot time, and there is no valid | 
|  | /dev/console entry on your root file-system. Create a valid | 
|  | /dev/console device node. | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  | Alternatives to devfs | 
|  |  | 
|  | I've attempted to collate all the anti-devfs proposals and explain | 
|  | their limitations. Under construction. | 
|  |  | 
|  |  | 
|  | Why not just pass device create/remove events to a daemon? | 
|  |  | 
|  | Here the suggestion is to develop an API in the kernel so that devices | 
|  | can register create and remove events, and a daemon listens for those | 
|  | events. The daemon would then populate/depopulate /dev (which | 
|  | resides on disc). | 
|  |  | 
|  | This has several limitations: | 
|  |  | 
|  |  | 
|  | it only works for modules loaded and unloaded (or devices inserted | 
|  | and removed) after the kernel has finished booting. Without a database | 
|  | of events, there is no way the daemon could fully populate | 
|  | /dev | 
|  |  | 
|  |  | 
|  | if you add a database to this scheme, the question is then how to | 
|  | present that database to user-space. If you make it a list of strings | 
|  | with embedded event codes which are passed through a pipe to the | 
|  | daemon, then this is only of use to the daemon. I would argue that the | 
|  | natural way to present this data is via a filesystem (since many of | 
|  | the events will be of a hierarchical nature), such as devfs. | 
|  | Presenting the data as a filesystem makes it easy for the user to see | 
|  | what is available and also makes it easy to write scripts to scan the | 
|  | "database" | 
|  |  | 
|  |  | 
|  | the tight binding between device nodes and drivers is no longer | 
|  | possible (requiring the otherwise perfectly avoidable | 
|  | table lookups) | 
|  |  | 
|  |  | 
|  | you cannot catch inode lookup events on /dev which means | 
|  | that module autoloading requires device nodes to be created. This is a | 
|  | problem, particularly for drivers where only a few inodes are created | 
|  | from a potentially large set | 
|  |  | 
|  |  | 
|  | this technique can't be used when the root FS is mounted | 
|  | read-only | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  | Just implement a better scsidev | 
|  |  | 
|  | This suggestion involves taking the scsidev programme and | 
|  | extending it to scan for all devices, not just SCSI devices. The | 
|  | scsidev programme works by scanning /proc/scsi. | 
|  |  | 
|  | Problems: | 
|  |  | 
|  |  | 
|  | the kernel does not currently provide a list of all devices | 
|  | available. Not all drivers register entries in /proc or | 
|  | generate kernel messages | 
|  |  | 
|  |  | 
|  | there is no uniform mechanism to register devices other than the | 
|  | devfs API | 
|  |  | 
|  |  | 
|  | implementing such an API is then the same as the | 
|  | proposal above | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  | Put /dev on a ramdisc | 
|  |  | 
|  | This suggestion involves creating a ramdisc and populating it with | 
|  | device nodes and then mounting it over /dev. | 
|  |  | 
|  | Problems: | 
|  |  | 
|  |  | 
|  |  | 
|  | this doesn't help when mounting the root filesystem, since you | 
|  | still need a device node to do that | 
|  |  | 
|  |  | 
|  | if you want to use this technique for the root device node as | 
|  | well, you need to use initrd. This complicates the booting sequence | 
|  | and makes it significantly harder to administer and configure. The | 
|  | initrd is essentially opaque, robbing the system administrator of easy | 
|  | configuration | 
|  |  | 
|  |  | 
|  | insufficient information is available to correctly populate the | 
|  | ramdisc. So we come back to the | 
|  | proposal above to "solve" this | 
|  |  | 
|  |  | 
|  | a ramdisc-based solution would take more kernel memory, since the | 
|  | backing store would be (at best) normal VFS inodes and dentries, which | 
|  | take 284 bytes and 112 bytes, respectively, for each entry. Compare | 
|  | that to 72 bytes for devfs | 
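The memory comparison in the last point is simple arithmetic, sketched here using only the per-entry sizes quoted above. The entry count passed in is up to you; any particular figure is a hypothetical example rather than a measurement.

```python
VFS_INODE_BYTES = 284    # per-entry VFS inode size quoted in the text
VFS_DENTRY_BYTES = 112   # per-entry VFS dentry size quoted in the text
DEVFS_ENTRY_BYTES = 72   # per-entry devfs size quoted in the text


def ramdisc_bytes(entries):
    """Best-case kernel memory for `entries` device nodes on a ramdisc."""
    return entries * (VFS_INODE_BYTES + VFS_DENTRY_BYTES)


def devfs_bytes(entries):
    """Kernel memory for the same number of entries held in devfs."""
    return entries * DEVFS_ENTRY_BYTES
```

Per entry this is 396 bytes against 72 bytes, so for a hypothetical 1000 entries the ramdisc needs 396000 bytes where devfs needs 72000, a factor of 5.5.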
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  | Do nothing: there's no problem | 
|  |  | 
|  | Sometimes people can be heard to claim that the existing scheme is | 
|  | fine. This is what they're ignoring: | 
|  |  | 
|  |  | 
|  | device number size (8 bits each for major and minor) is a real | 
|  | limitation, and must be fixed somehow. Systems with large numbers of | 
|  | SCSI devices, for example, will continue to consume the remaining | 
|  | unallocated major numbers. USB will also need to push beyond the 8 bit | 
|  | minor limitation | 
|  |  | 
|  |  | 
|  | simply increasing the device number size is insufficient. Apart | 
|  | from causing a lot of pain, it doesn't solve the management issues | 
|  | of a /dev with thousands or more device nodes | 
|  |  | 
|  |  | 
|  | ignoring the problem of a huge /dev will not make it go | 
|  | away, and dismisses the legitimacy of a large number of people who | 
|  | want a dynamic /dev | 
|  |  | 
|  |  | 
|  | the standard response then becomes: "write a device management | 
|  | daemon", which brings us back to the | 
|  | proposal above | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  | What I don't like about devfs | 
|  |  | 
|  | Here are some common complaints about devfs, and some suggestions and | 
|  | solutions that may make it more palatable for you. I can't please | 
|  | everybody, but I do try :-) | 
|  |  | 
|  | I hate the naming scheme | 
|  |  | 
|  | First, remember that no naming scheme will please everybody. You hate | 
|  | the scheme, others love it. Who's to say who's right and who's wrong? | 
|  | Ultimately, the person who writes the code gets to choose, and what | 
|  | exists now is a combination of the choices made by the | 
|  | devfs author and the | 
|  | kernel maintainer (Linus). | 
|  |  | 
|  | However, not all is lost. If you want to create your own naming | 
|  | scheme, it is a simple matter to write a standalone script, hack | 
|  | devfsd, or write a script called by devfsd. You can create whatever | 
|  | naming scheme you like. | 
|  |  | 
|  | Further, if you want to remove all traces of the devfs naming scheme | 
|  | from /dev, you can mount devfs elsewhere (say | 
|  | /devfs) and populate /dev with links into | 
|  | /devfs. This population can be automated using devfsd if you | 
|  | wish. | 
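As a rough illustration of the standalone-script route, the sketch below walks a devfs tree mounted elsewhere and creates symbolic links under another directory, passing each relative path through a renaming function of your choice. All names here (link_tree, the rename hook, the directory arguments) are assumptions made for the example. It is not how devfsd does it, and it ignores symbolic links to directories.

```python
import os


def link_tree(devfs_root, dev_root, rename=lambda rel: rel):
    """Create links under dev_root pointing at entries below devfs_root."""
    created = []
    for dirpath, _dirnames, filenames in os.walk(devfs_root):
        # os.walk reports every non-directory entry (device nodes
        # included) in `filenames`.
        for name in filenames:
            source = os.path.join(dirpath, name)
            rel = os.path.relpath(source, devfs_root)
            # `rename` maps the devfs name to whatever name you prefer.
            link = os.path.join(dev_root, rename(rel))
            os.makedirs(os.path.dirname(link), exist_ok=True)
            if not os.path.lexists(link):
                os.symlink(source, link)
                created.append(link)
    return created
```

For example, link_tree("/devfs", "/dev", rename=my_scheme) would populate /dev using whatever mapping the hypothetical my_scheme function implements.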
|  |  | 
|  | You can even use the VFS binding facility to make the links, rather | 
|  | than using symbolic links. This way, you don't even have to see the | 
|  | "destination" of these symbolic links. | 
|  |  | 
|  | Devfs puts policy into the kernel | 
|  |  | 
|  | There's already policy in the kernel. Device numbers are in fact | 
|  | policy (why should the kernel dictate what device numbers I use?). | 
|  | Face it, some policy has to be in the kernel. The real difference | 
|  | between device names as policy and device numbers as policy is that | 
|  | no one will use device numbers directly, because device | 
|  | numbers are devoid of meaning to humans and are ugly. At least with | 
|  | the devfs device names (even though you can add your own naming | 
|  | scheme), some people will use the devfs-supplied names directly. This | 
|  | offends some people :-) | 
|  |  | 
|  | Devfs is bloatware | 
|  |  | 
|  | This is not even remotely true. As shown above, | 
|  | both code and data size are quite modest. | 
|  |  | 
|  |  | 
|  | How to report bugs | 
|  |  | 
|  | If you have (or think you have) a bug with devfs, please follow the | 
|  | steps below: | 
|  |  | 
|  |  | 
|  |  | 
|  | make sure you have enabled debugging output when configuring your | 
|  | kernel. You will need to set (at least) the following config options: | 
|  |  | 
|  | CONFIG_DEVFS_DEBUG=y | 
|  | CONFIG_DEBUG_KERNEL=y | 
|  | CONFIG_DEBUG_SLAB=y | 
|  |  | 
|  |  | 
|  |  | 
|  | please make sure you have the latest devfs patches applied. The | 
|  | latest kernel version might not have the latest devfs patches applied | 
|  | yet (Linus is very busy) | 
|  |  | 
|  |  | 
|  | save a copy of your complete kernel logs (preferably by | 
|  | using the dmesg programme) for later inclusion in your bug | 
|  | report. You may need to use the -s switch to increase the | 
|  | internal buffer size so you can capture all the boot messages. | 
|  | Don't edit or trim the dmesg output | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  | try booting with devfs=dall passed to the kernel boot | 
|  | command line (read the documentation on your bootloader on how to do | 
|  | this), and save the result to a file. This may be quite verbose, and | 
|  | it may overflow the messages buffer, but try to get as much of it as | 
|  | you can | 
|  |  | 
|  |  | 
|  | if you get an Oops, run ksymoops to decode it so that the | 
|  | names of the offending functions are provided. A non-decoded Oops is | 
|  | pretty useless | 
|  |  | 
|  |  | 
|  | send a copy of your devfsd configuration file(s) | 
|  |  | 
|  | send the bug report to me first. | 
|  | Don't expect that I will see it if you post it to the linux-kernel | 
|  | mailing list. Include all the information listed above, plus | 
|  | anything else that you think might be relevant. Put the string | 
|  | devfs somewhere in the subject line, so my mail filters mark | 
|  | it as urgent | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  | Here is a general guide on how to ask questions in a way that greatly | 
|  | improves your chances of getting a reply: | 
|  |  | 
|  | http://www.tuxedo.org/~esr/faqs/smart-questions.html. If you have | 
|  | a bug to report, you should also read | 
|  |  | 
|  | http://www.chiark.greenend.org.uk/~sgtatham/bugs.html. | 
|  |  | 
|  |  | 
|  | Strange kernel messages | 
|  |  | 
|  | You may see devfs-related messages in your kernel logs. Below are some | 
|  | messages and what they mean (and what you should do about them, if | 
|  | anything). | 
|  |  | 
|  |  | 
|  |  | 
|  | devfs_register(fred): could not append to parent, err: -17 | 
|  |  | 
|  | You need to check what the error code means, but usually 17 means | 
|  | EEXIST. This means that a driver attempted to create an entry | 
|  | fred in a directory, but there already was an entry with that | 
|  | name. This is often caused by flawed boot scripts which untar a bunch | 
|  | of inodes into /dev, as a way to restore permissions. This | 
|  | message is harmless, as the device nodes will still | 
|  | provide access to the driver (unless you use the devfs=only | 
|  | boot option, which is only for dedicated souls:-). If you want to get | 
|  | rid of these annoying messages, upgrade to devfsd-v1.3.20 and use the | 
|  | recommended RESTORE directive to restore permissions. | 
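If you are unsure what a numeric error code in such a message means, a short script can look it up. This sketch assumes the log line contains the text "err: -N" as in the example above. The function name is hypothetical, and the lookup uses the errno table of the machine running the script, which may differ from that of the kernel which produced the message.

```python
import errno
import os
import re


def explain_devfs_error(message):
    """Return (code, symbolic name, description) for an "err: -N" log line."""
    match = re.search(r"err:\s*-(\d+)", message)
    if match is None:
        return None
    code = int(match.group(1))
    # errno.errorcode maps numbers to names such as "EEXIST".
    name = errno.errorcode.get(code, "unknown")
    return code, name, os.strerror(code)
```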
|  |  | 
|  |  | 
|  | devfs_mk_dir(bill): using old entry in dir: c1808724 "" | 
|  |  | 
|  | This is similar to the message above, except that a driver attempted | 
|  | to create a directory named bill, and the parent directory | 
|  | has an entry with the same name. In this case, to ensure that drivers | 
|  | continue to work properly, the old entry is re-used and given to the | 
|  | driver. In 2.5 kernels, the driver is given a NULL entry, and thus, | 
|  | under rare circumstances, may not create the required device nodes. | 
|  | The solution is the same as above. | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  | Compilation problems with devfsd | 
|  |  | 
|  | Usually, you can compile devfsd just by typing in | 
|  | make in the source directory, followed by a make | 
|  | install (as root). Sometimes, you may have problems, particularly | 
|  | on broken configurations. | 
|  |  | 
|  |  | 
|  |  | 
|  | error messages relating to DEVFSD_NOTIFY_DELETE | 
|  |  | 
|  | This happened because you have an ancient set of kernel headers | 
|  | installed in /usr/include/linux or /usr/src/linux. | 
|  | Install kernel 2.4.10 or later. You may need to pass the | 
|  | KERNEL_DIR variable to make (if you did not install | 
|  | the new kernel sources as /usr/src/linux), or you may copy | 
|  | the devfs_fs.h file in the kernel source tree into | 
|  | /usr/include/linux. | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  | ----------------------------------------------------------------------------- | 
|  |  | 
|  |  | 
|  | Other resources | 
|  |  | 
|  |  | 
|  |  | 
|  | Douglas Gilbert has written a useful document at | 
|  |  | 
|  | http://www.torque.net/sg/devfs_scsi.html which | 
|  | explores the SCSI subsystem and how it interacts with devfs | 
|  |  | 
|  |  | 
|  | Douglas Gilbert has written another useful document at | 
|  |  | 
|  | http://www.torque.net/scsi/SCSI-2.4-HOWTO/ which | 
|  | discusses the Linux SCSI subsystem in 2.4. | 
|  |  | 
|  |  | 
|  | Johannes Erdfelt has started a discussion paper on Linux and | 
|  | hot-swap devices, describing what the requirements are for a scalable | 
|  | solution and how and why he's used devfs+devfsd. Note that this is an | 
|  | early draft only, available in plain text form at: | 
|  |  | 
|  | http://johannes.erdfelt.com/hotswap.txt. | 
|  | Johannes has promised a HTML version will follow. | 
|  |  | 
|  |  | 
|  | I presented an invited paper at the 2nd Annual Storage Management | 
|  | Workshop held in Miami, Florida, U.S.A. in October 2000. | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  | ----------------------------------------------------------------------------- | 
|  |  | 
|  |  | 
|  | Translations of this document | 
|  |  | 
|  | This document has been translated into other languages. | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  | The document master (in English) by [email protected] is | 
|  | available at | 
|  |  | 
|  | http://www.atnf.csiro.au/~rgooch/linux/docs/devfs.html | 
|  |  | 
|  |  | 
|  |  | 
|  | A Korean translation by [email protected] is available at | 
|  |  | 
|  | http://your.destiny.pe.kr/devfs/devfs.html | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  | ----------------------------------------------------------------------------- | 
|  | Most flags courtesy of ITA's | 
|  | Flags of All Countries | 
|  | used with permission. |