display | more...

Regular Expressions:

This node serves as an introduction to regular expressions.
It covers concatenation, the union operator '|' and star '*'
These are the only necessary operators. Other Operators simply make the expression more simple, but do not add to the expressive power of the expressions themselves.

Consider a language to be a set of elements.
An expression is used to describe the elements of the set without explicitly listing them.

Consider a regular language as a set of strings
A regular expression is used to describe all strings in the set without explicitly listing every possible one.


Examples:
{"Do"} is a language, or a set containing one element. That element is the string, "Do".
A suitable expression for this language would be:
Do

{"Do","Die"} is another language. It consists of two strings, "Do" and "Die".
A suitable expression for this language would be:
Do|Die
This says that if a string is made up of either the sequence of chars "D o" or the sequence of chars "Die", then it is in the language .

{"Do","Die","Dig"} similarly can be expressed by:
Do|Die|Dig

Now consider the language, {"Dos","Dies","Digs"}
In the same manner as above, we can describe this language with the expression:
Dos|Dies|Digs
Notice that there is a common char 's' at the end of each string. By concatenation, this can be distributed over the expression:
(Dos|Dies|Digs) is equivalent to (Do|Die|Dig)s

Also, notice that there is a common char 'D' at the beginning of each string. This can be similarly distributed over the expression:
(Do|Die|Dig)s is equivalent to D(o|ie|ig)s

This latest expression says that every string in the language begins with a D, which is followed by one of ("o","ie","ig") and ends with an s. If a string does not follow this form, it is not in the language described by the expression.

Notice that the expression: D(o|ie|ig)s is also equivalent to D(o|(i(e|g)))s

All of the languages mentioned so far have a finite number of elements.
Regular expressions can also describe languages that are made up of an infinite number of elements.

Consider the language,
{"Dos","Doso",Dosoo","Dosoo",...} U {"Dies","Dieso","Diesoo","Diesooo",...}U {"Digs","Digso","Digsoo","Digsooo",...}
This contains all elements described by the regular expression D(o|ie|g)s as well as all strings that start with one of these members, followed by any number of the letter "o".

A regular expression that describes all strings consisting of zero or more "o"'s would be:
o*
This describes the set {e,"o","oo","ooo",...}
We can express the concatenation of this set with the last set with the regular expression:
D(o|ie|g)so*

More regular expressions:
One digit, 0 Through 9:
(0|1|2|3|4|5 ... |9) or (0-9)

At least one digit:
(0-9)(0-9)*

All decimals:
(0-9)(0-9)*("."|e)(0-9)*

A regular expresion over {0,1} where "0" is the third digit from the last:
(0|1)*0(0|1)(0|1)

A regular expresion over {0,1} where "0" is the fourth digit from the last:
(0|1)*0(0|1)(0|1)(0|1)

A string consisting of an even number of ones and zeros
(e|01|10|00|11)* or ((0|1)(0|1))*

Log in or register to write something here or to contact authors.