Let's see a very simple directory implementation, coming from NetBSD's tmpfs (see tmpfs.h). This file system implements directories by using a linked list (struct tmpfs_dir) of directory entries (struct tmpfs_dirent). These structures look like:
struct tmpfs_dirent {Of special interest are the td_name field, which holds the entry name (file name), and the td_node pointer, which points to the file system node related to this directory entry (UFS could use i-node numbers instead).
TAILQ_ENTRY(tmpfs_dirent) td_entries;
uint16_t td_namelen;
char *td_name;
struct tmpfs_node *td_node;
};
TAILQ_HEAD(tmpfs_dir, tmpfs_dirent);
This implementation is really simple as it is completely backed by virtual memory; adding and removing entries is as easy as allocating a memory block and modifying the linked list accordingly. It could, of course, be more complex if it used a B+ tree or some other structure instead.
However, on-disk file systems do extra tasks to optimize directory accesses. For example, when an entry is removed, it is marked as deleted using a special flag but it is not really removed from disk, because shrinking the directory could be expensive. Similarly, new entries overwrite previously deleted entries, if possible.
In the next post, we will outline how some directory operations (such as lookup and readdir) work.
Post suggested by Pavel Cahyna.