About the Pumping Lemma
Here I try to explain the Pumping Lemma in plain English
Still some knowledge of Computational Theory is assumed.
that a Language
, one can construct a DFA
, an NFA
a Regular Expression
for that Language
. If any of these can accept
or express the exact set of strings in the Language, then it can be said that the Language is Regular.
are called proof by construction
To prove that a language is not
Regular is a different matter. The fact that a DFA
or a Regular Expression
has yet to be found to accept or express all strings in the language is not sufficient evidence
. It is always possible that such a device still exists
but has yet to be found!
In order to prove that a language is not Regular, we will use a different mechanism
We will consider a property
that is true of all Regular Languages
is known as the Pumping Lemma
We will then assume
that a language which we suspect
is not Regular to be Regular, and then demonstrate that the properties defined by the Pumping Lemma do not hold. In
other words, we will utilize proof by contradiction
of the Pumping Lemma is as follows:
Consider a DFA that consists of three states
We know that when we run the Machine
on a string, a transition is made for each character in the string. Each transition leaves us in a predetermined
state. This means that for every character
in a string
, we have visited a (not necessarily unique
) state. If our DFA has three states, and we run a string of length five on it, (ignoring the start state
) five states have been reached. If our DFA consists of only three states, the pigeon-hole principle
states that at least on state must have been visited multiple times. If this is true, we know that there is at least one cycle
in our DFA. By cycle, we mean, a sequence of states that start and end with the same state. If we look at the DFA, we can find a sequence
of characters that will bring us through this cycle. Since we return
to the same state each time we go through the cycle, it can be seen that if a cycle can be traversed once in a DFA, it can be traversed infinitely
In other words; If a DFA has a cycle that, when traversed, keeps open the possibility of exiting and reaching an accept state, any string that has more characters than the number of states in the DFA will have a substring
which is infinitely repeatable
. When this substring is repeated (pumped
), the resulting string will also be in the language of the DFA.
Since this is true of all DFA's, it is true of Regular Languages.
To prove that a language is not Regular, we show that the Pumping Lemma
does not hold.
To do this, find a string which is known to be larger than the Pumping L
ength. Show that there is no substring that can be infinitely repeated
while keeping the resulting string in the language. If you can not show this, try another string. There may be some strings in non-regular languages that are perfectly pumpable!
Failure to find such a string proves nothing about the language being Regular or Non-Regular
. The Pumping Lemma can not be used to prove that a language is Regular.
The Formal definition
of the Pumping Lemma can be found on page 78 of Michael Sipser
's Introduction to the Thoery of Computation
It says that if a language is Regular
, a Pumping Length
can be found. Remember, intuitively
, we consider this number to be the number of states
in our Machine
If a string is longer than this length, it is possible to subdivide
it into three parts:
x, y and z
x consists of all characters parsed before the DFA cycle.
y consists of all characters parsed in the DFA cycle.
z consists of all characters parsed after the DFA cycle.
There are three properties that this string, under the subdivision will have. Here I explain their signifigance
- for all i >= 0, x(y^i)z will be a string that is in t
This makes since because the cycle can be traversed any number of times.
(By y^i, we mean y*)
- |y| > 0
If the length of y were 0, this would mean that there is nothing to pump
! Essentially, there is no part of the string that has gone through the cycle.
- |xy| < = p
If the length of xy were greater than p, we have made a bad choice for y.
xy is longer than the number of states, y is our cycle, bu
t before we have finished traversing it, we have already seen some state
more than once. There must be a cycle somewhere earlier in the string.
That earlier cycle should be our y!
When proving that the language is not regular, it is not sufficient to choose one arbitrary
subdivision of the string into x,y,z and show that y
can not be pumped.
It must be shown that for all possible subdivisions
of the string into x,y,z, pumping y
will always, at some point, result in a string which is not in the language.