display | more...

A form of index or table of contents which was very popular for UN*X manuals in the days when these were printed on dead trees. A permuted index allows rapid access to a set of headings using traditional paper-based searching techniques. Given a set of headings, if we sorted them we could rapidly find any of the headings, if only we could remember it exactly. But if we only remember the beginning of a word or phrase from it, it's much harder.

A permuted index of the set of headings is produced as follows: Suppose you have these 3 headings, pointing to the page numbers in brackets:

The quick brown fox (1)
The slow brown fox (22)
The quick blue dog (123)
To produce a permuted index, first apply all cyclic permutations to each heading. In plain English, this means you rotate the words around to get several headings. For instance, from "The quick brown fox" we get:
|The quick brown fox
quick brown fox |The
brown fox |The quick
fox |The quick brown
(here the | shows the start of the heading). Now sort all of the generated headings:
blue dog |The quick (123)
brown fox |The quick (1)
brown fox |The slow (22)
dog |The quick blue (123)
fox |The quick brown (1)
fox |The slow brown (22)
quick blue dog |The (123)
quick brown fox |The (1)
slow brown fox |The (22)
|The quick blue dog (123)
|The quick brown fox (1)
|The slow brown fox (22)
Given this permuted index, it's very easy to find all headings containing the word "quick" -- just binary search for that word, like you'd find people in the phone directory. And even a phrase will stick together!

For automated information retrieval, better methods exist, of course. However, do note that a suffix tree is no more than a permuted index, magically compressed into linear space!

On UN*X, you can generate permuted indexes for troff and nroff with the program "ptx". A GNU version of ptx exists, too.

The name KWIC -- KeyWord In Context -- is another term for "permuted index".

Log in or register to write something here or to contact authors.