# An In-Depth Look at Regular Expressions Part 2: Regular Languages

Jun 21, 2016

So we’ve discussed what a language is from a computer science theory viewpoint in part 1 of this series, and we have 2 ways of representing languages:
Set notation: `{λ, a, aa, aaa, ...}` or `{a^n | 0 <= n}` for short,
and grammar production rules: `{S -> aS | λ}`

As I stated in part 1, a regular expression is a representation of not just any language, but a regular language specifically.

Regular languages are just languages with certain restrictions. To avoid getting too confusing and mathy, we’re going to skip over the formal definition of a regular language and jump straight to regular grammars; recall that a grammar is simply a way of representing a language, so a regular (or linear) grammar is a representation of a regular language (just like a regular expression!). So one example of a regular grammar is one in which every production rule is of the form: A -> aB, A -> a, or A -> λ, where A and B are non-terminals, and a is a terminal (this is called a left linear or left regular grammar).

For example:

```P = { S -> aB B -> bC C -> c }```

Each production rule in this grammar fits one of the forms above, so it is a regular grammar, and therefore the language represented by this grammar is a regular language.
Note: There are other grammars that represent this same language, such as:

```P1 = { S -> abc }```

or

```P2 = { S -> Bc B -> Ab A -> a }```

(For the record, P2 is an example of a right linear grammar)

Depending on the language or library you’re using, regular expressions have a lot of extra features and shortcuts, but a standard regex has only 3 basic operations:
1) Concatination `(ab)`
2) Alternation `(a|b)`
3) Kleene Star `(a*)`

Any other operation can be made from these operations.
The + operator, for example, can be defined as: a+ is equivalent to `a(a*)`
Another useful operator is the ? operator, which can be defined as: a? is equivalent to `(a|)`