
The inverted file, sometimes also called the inverted index, is a very common data structure in information retrieval. It contains the words to be indexed (the vocabulary), and associated with each word are its occurrences: the location or locations where the word appears.

Here's an example:

1   2        3    4  5    6      7   8        9
The inverted file is also called the inverted index.

The inverted file would look like:

also     5
called   6
file     3
index    9
inverted 2, 8
is       4
the      1, 7
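
The table above can be reproduced with a few lines of Python. This is only a minimal sketch, assuming words are lowercased, punctuation is dropped, and positions are counted from 1 as in the example; the function name build_inverted_file is made up for illustration.

```python
import re
from collections import defaultdict


def build_inverted_file(text):
    """Map each lowercased word to the list of 1-based word positions where it occurs.

    Hypothetical helper for illustration; tokenization is a simple
    assumption (runs of letters and digits), not a standard.
    """
    index = defaultdict(list)
    words = re.findall(r"[a-z0-9]+", text.lower())
    for position, word in enumerate(words, start=1):
        index[word].append(position)
    return dict(index)


inverted = build_inverted_file(
    "The inverted file is also called the inverted index."
)

# Print the vocabulary in alphabetical order, one word per line.
for word in sorted(inverted):
    positions = ", ".join(str(p) for p in inverted[word])
    print(f"{word:<9}{positions}")
```

A real system would typically also store the vocabulary in a structure that supports fast lookup, such as a hash table or a sorted array; the dictionary here stands in for that.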

As you can imagine, the inverted file is used to rapidly locate the occurrences of a query term. The occurrences can be recorded at different granularities. Here we index by word position; many information retrieval systems record only the document in which the word is located, while others record the exact character position.
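
To illustrate the coarser, document-level granularity, here is a rough sketch in Python. The two documents, their numeric ids, and the names build_document_index and lookup are all assumptions invented for this example.

```python
from collections import defaultdict


def build_document_index(documents):
    """Map each word to the set of ids of the documents that contain it.

    `documents` is a dict of {doc_id: text}. Positions inside a document
    are deliberately not recorded (document-level granularity).
    """
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in text.lower().split():
            word = token.strip(".,;:!?")
            if word:  # skip tokens that were pure punctuation
                index[word].add(doc_id)
    return index


def lookup(index, term):
    """Return the sorted ids of documents containing `term` (empty list if none)."""
    return sorted(index.get(term.lower(), ()))


# Hypothetical two-document collection.
documents = {
    1: "The inverted file is also called the inverted index.",
    2: "An index maps words to locations.",
}
doc_index = build_document_index(documents)

print(lookup(doc_index, "index"))     # documents 1 and 2
print(lookup(doc_index, "inverted"))  # document 1 only
print(lookup(doc_index, "missing"))   # no documents
```

Answering a query is then a single dictionary lookup rather than a scan of every document, which is the point of the structure. The trade-off is that a document-level index cannot answer phrase or proximity queries without going back to the text.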

Reference: Ricardo Baeza-Yates and Berthier Ribeiro-Neto, Modern Information Retrieval, ACM Press, 1999.
