**About the Pumping Lemma**
Here I try to explain the Pumping Lemma in

plain English.
Still some knowledge of Computational Theory is assumed.

To

prove that a

Language is

Regular, one can construct a

DFA, an

NFA or
a

Regular Expression for that

Language. If any of these can

accept or express the exact set of strings in the Language, then it can be said that the Language is Regular.

Such

proofs are called

proof by *construction*.

To prove that a language is

*not* Regular is a different matter. The fact that a

DFA,

NFA or a

Regular Expression has yet to be found to accept or express all strings in the language is not sufficient

evidence. It is always possible that such a device still

exists but has yet to be found!
In order to prove that a language is not Regular, we will use a different

mechanism.
We will consider a

property that is true of

*all* Regular Languages.
This

property is known as the

*Pumping Lemma*.
We will then

assume that a language which we

suspect is not Regular to be Regular, and then demonstrate that the properties defined by the Pumping Lemma do not hold. In
other words, we will utilize

proof by contradiction.

The

idea of the Pumping Lemma is as follows:

Consider a DFA that consists of three

states.

We know that when we run the

Machine on a string, a transition is made for each character in the string. Each transition leaves us in a

predetermined state. This means that for every

character in a

string, we have visited a (not necessarily

unique) state. If our DFA has three states, and we run a string of length five on it, (ignoring the

start state) five states have been reached. If our DFA consists of only three states, the

pigeon-hole principle states that at least on state must have been visited multiple times. If this is true, we know that there is at least one

cycle in our DFA. By cycle, we mean, a sequence of states that start and end with the same state. If we look at the DFA, we can find a

sequence of characters that will bring us through this cycle. Since we return
to the same state each time we go through the cycle, it can be seen that if a cycle can be traversed once in a DFA, it can be

traversed infinitely many times.
In other words; If a DFA has a cycle that, when traversed, keeps open the possibility of exiting and reaching an accept state, any string that has more characters than the number of states in the DFA will have a

substring which is

infinitely repeatable. When this substring is repeated (

pumped), the resulting string will also be in the language of the DFA.
Since this is true of all DFA's, it is true of Regular Languages.

To prove that a language is not Regular, we show that the Pumping Lemma
does not hold.
To do this, find a string which is known to be larger than the Pumping L
ength. Show that there is no substring that can be infinitely repeated
while keeping the resulting string in the language. If you can not show this, try another string. There may be some strings in non-regular languages that are perfectly pumpable!
Failure to find such a string proves nothing about the language being Regular or

Non-Regular. The Pumping Lemma can not be used to prove that a language is Regular.

The

Formal definition of the Pumping Lemma can be found on page 78 of

Michael Sipser's Introduction to the

Thoery of Computation.

It says that if a language is

Regular, a

Pumping Length can be found. Remember,

intuitively, we consider this number to be the number of states
in our

Machine.

If a string is longer than this length, it is possible to

subdivide it into three parts:

x, y and z

x consists of all characters parsed before the DFA cycle.

y consists of all characters parsed in the DFA cycle.

z consists of all characters parsed after the DFA cycle.

There are three properties that this string, under the subdivision will have. Here I explain their

signifigance.

**for all ***i* >= 0, x(y^i)z will be a string that is in t
he language.

This makes since because the cycle can be traversed any number of times.

(By y^i, we mean y*)
**|y| > 0**

If the length of y were 0, this would mean that there is nothing to pump
! Essentially, there is no part of the string that has gone through the cycle.
** |xy| < = p **

If the length of xy were greater than p, we have made a bad choice for *y*.

xy is longer than the number of states, y is our cycle, bu
t before we have finished traversing it, we have already seen some state
more than once. There must be a cycle somewhere earlier in the string.
That earlier cycle should be our y!

When proving that the language is not regular, it is not sufficient to choose one

arbitrary subdivision of the string into x,y,z and show that

*y* can not be pumped.
It must be shown that for

*all possible subdivisions* of the string into x,y,z, pumping

*y* will always, at some point, result in a string which is not in the language.