The empty string (often written as lowercase epsilon ε) is the string of length zero. It is analogous to the empty set (sometimes called the null set, and written ∅) in set theory. The empty string is slightly more difficult to conceptualize because of the differences between sets and strings. Still, the two have much in common.
- String length corresponds to set cardinality. The length of the empty string and the cardinality of the empty set are both zero. |ε| = |∅| = 0.
- Both the empty string and the empty set contain only themselves.
- Just as the empty set is a member of all sets, the empty string is contained in all strings.
String and sets are different. Most importantly, sets are unordered collections of unique elements, while strings have order and the potential for repetition. For example, though the notation used to describe them is different, in set theory all the following sets are considered to be the same.
Each of the following strings
however, is different.
, the strings abc and εabc are equivalent. This is the tricky part to understand about the empty string. Since sets are not ordered, there's no concept of where
in a set an element occurs. Nor is the number of occurrences an issue. With strings, these things make a difference. So, where is
the empty string? How many
times does it occur?
One way to think about it is that there are an infinite number of empty strings before and after every element in every string. Since the empty string has no length, you can always have one whenever you need it, but since writing it doesn't change the string at all, you never need to explicitly list it. Take the string ab. You could write it aεb, aεεb, or aεε…εb, and it is still the same string. This can be very useful when working with automata because the empty string can be used to represent a transition that takes effect without reading any input.