hardlink

The main reason the Unix hard link is confusing is that it suggests a feature in Unix file systems that doesn't actually exist.

In Unix, a file is identified by an inode on a file system. Files have attributes, such as permissions, ownership and various timestamps. None of these attributes is a name for the file.

A directory is a file that contains a mapping of file names to inodes. An inode can appear multiple times, in different directories, or even in the same directory under different names. An occurrence of a file in a directory is called a hard link to the file.
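To make the mapping concrete, here is a minimal Python sketch, assuming a Unix-like system whose temporary directory supports hard links. The names a and b and the function link_demo are made up for illustration. It creates one file, adds a second directory entry for it, and then reads the directory back as a name-to-inode table.

```python
import os
import tempfile

def link_demo():
    """Create one file, give it a second name, and return the directory's mapping."""
    with tempfile.TemporaryDirectory() as d:
        first = os.path.join(d, "a")    # hypothetical name
        second = os.path.join(d, "b")   # hypothetical name
        with open(first, "w") as f:
            f.write("hello\n")
        # Add a second directory entry that refers to the same inode.
        os.link(first, second)
        # Read the directory as what it is: a mapping from names to inode numbers.
        return {entry.name: entry.inode() for entry in os.scandir(d)}

entries = link_demo()
print(entries)
```

Both names should come back with the same inode number, which is all the directory knows about the file.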

So a filename is not actually a property of a file: it is a property of a hard link to the file, which is an entry in a directory in which the file appears. Creating a hard link is not an operation on a file; it is an edit operation on a directory.
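One way to see that linking edits the directory rather than the file is to take away all write permission on the file and link it anyway. The sketch below is Python under some assumptions: a Unix-like system, a file owned by the user running it, and invented names (data, alias, link_readonly_file). It is an illustration, not a statement about every file system or security policy.

```python
import os
import stat
import tempfile

def link_readonly_file():
    """Add and remove names for a file that nobody is allowed to write to."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "data")     # hypothetical name
        alias = os.path.join(d, "alias")   # hypothetical name
        with open(path, "w") as f:
            f.write("x")
        os.chmod(path, stat.S_IRUSR)       # owner may read; nobody may write
        before = os.stat(path)
        os.link(path, alias)               # edits the directory: adds an entry
        names_after_link = sorted(os.listdir(d))
        os.unlink(path)                    # edits the directory: removes an entry
        names_after_unlink = sorted(os.listdir(d))
        after = os.stat(alias)
        # Same inode, and the content modification time was never touched.
        untouched = (before.st_ino == after.st_ino
                     and before.st_mtime_ns == after.st_mtime_ns)
        return names_after_link, names_after_unlink, untouched

after_link, after_unlink, untouched = link_readonly_file()
print(after_link, after_unlink, untouched)
```

What the operations need is write permission on the directory; the file's own contents and modification time stay as they were.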

The file system does maintain two constraints: every inode has at least one hard link, and every hard link (that is, every directory entry) points to a valid file. To enforce them, a file is deleted automatically once its last hard link has been removed and no process has the file open, and a file can only be created together with a hard link to it. (Exceptions are possible, once you know about file descriptors.)
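The link count and the open-file exception can both be observed from a program. The following Python sketch assumes Linux-like behaviour, where fstat on an open descriptor keeps working after the last name is gone; the names a and b and the function link_counts are illustrative only.

```python
import os
import tempfile

def link_counts():
    """Record the inode's link count as names are added and removed."""
    counts = []
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "a")   # hypothetical name
        b = os.path.join(d, "b")   # hypothetical name
        f = open(a, "w+")          # creating the file also creates its first link
        f.write("still here")
        f.flush()
        fd = f.fileno()
        counts.append(os.fstat(fd).st_nlink)   # after creation
        os.link(a, b)
        counts.append(os.fstat(fd).st_nlink)   # after adding a second name
        os.unlink(a)
        counts.append(os.fstat(fd).st_nlink)   # after removing the first name
        os.unlink(b)
        counts.append(os.fstat(fd).st_nlink)   # no names left, file still open
        f.seek(0)
        data = f.read()            # the contents remain readable via the descriptor
        f.close()                  # only now can the file system reclaim the inode
    return counts, data

counts, data = link_counts()
print(counts, data)
```

On a typical Linux file system I would expect the count to go up by one, back down, and then reach zero while the data is still readable through the open descriptor.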

A further constraint is that for every file there is a finite path of hard links to it from the root directory, /. This guarantees that every file has a full pathname, also called an absolute name or path to the file: it is formed by concatenating the names of the hard links along that path, each preceded by /.

A file has a unique full pathname only when all files on the path have no more than one hard link. For directories, this constraint is usually maintained within a single file system, but it isn't across file systems: the same file system can be simultaneously mounted on different directories, and automounters and loopback file systems even allow it to appear in infinitely many different places. All this without using a single symlink.

So now that we know what a hard link is, where does the confusion come from?

It originates from the fact that most operations on files are by filename or full pathname, and practically none of them will increase the number of hard links to a file. Therefore it is natural for users to think of a filename as being a property of the file, indicating its unique point of appearance in the directory tree. Users may work with Unix for years without encountering multiple hard links.

Users who do know about hard links tend to use the term for additional hard links to an already existing file, created, for instance, with the ln command. While this is a natural thought pattern, it leaves the incorrect impression that such additional hard links are somehow different from the first hard link to a file. They are not: once a file has multiple hard links, none of them can be distinguished as "the real filename" or "the original link" in any way. After doing ln a b, I do not have "the real file a" and "the hard link b"; I have one file with two hard links, a and b.
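The same experiment can be run from Python, where os.link plays the role of ln a b. This is a sketch under the assumption of a Unix-like system; the function name names_are_equal is invented. It checks that both names report the same file, then removes the name that was created first.

```python
import os
import tempfile

def names_are_equal():
    """Do the equivalent of `ln a b`, then delete a and read the file through b."""
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "a")
        b = os.path.join(d, "b")
        with open(a, "w") as f:
            f.write("payload")
        os.link(a, b)                          # equivalent of: ln a b
        sa, sb = os.stat(a), os.stat(b)
        # Device, inode and link count are identical whichever name we ask through.
        identical = ((sa.st_dev, sa.st_ino, sa.st_nlink)
                     == (sb.st_dev, sb.st_ino, sb.st_nlink))
        same = os.path.samefile(a, b)
        os.unlink(a)                           # remove the name that came first
        with open(b) as f:
            survived = f.read()                # the file is intact under the other name
        return identical, same, survived

identical, same, survived = names_are_equal()
print(identical, same, survived)
```

Nothing in the stat information records which name came first, and removing a leaves the file fully intact under b.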

Other systems, such as E2, use the term hard link in different ways.
