Intro to Theory of Computation

Introduction Finite Automata & Regular Languages Regular Operations Deterministic & Nondeterministic Machine Regular Expressions

Introduction

Automata theory is a foundational area of theoretical computer science and discrete mathematics that studies abstract machines and computational problems. It provides a rigorous framework for understanding formal languages, state-based computation, and algorithmic processes. For instance, automata theory is essential in compiler design, natural language processing, software verification, and the study of complexity classes in the theory of computation. By exploring automata theory, we will develop a deeper understanding of computation, logic, and the mathematical principles governing algorithmic processes.

In this page, we introduce the simplest computational model — the finite automaton - and the class of languages it recognizes: regular languages. We then study the regular operations (union, concatenation, and Kleene star), the equivalence between deterministic and nondeterministic machines, and the connection to regular expressions.

Finite Automata & Regular Languages

Real computers are enormously complex, so to study the fundamental principles of computation, we introduce simplified computational models that isolate specific features. The simplest such model is the finite automaton (also called a finite state machine).

Definition: Finite Automaton (DFA)

A finite automaton is a 5-tuple \((Q, \Sigma, \delta, q_0, F)\) where

  • \(Q\) is a finite set called the states,
  • \(\Sigma\) is a finite set called the alphabet,
  • \(\delta: Q \times \Sigma \to Q\) is the transition function,
  • \(q_0 \in Q\) is the start state,
  • \(F \subseteq Q\) is the set of accept states.

Let \(A\) be the set of all strings that machine \(M\) accepts. We say that \(A\) is the language of machine \(M\) and write: \[ L(M) = A \] We also say that \(M\) recognizes \(A\). Moreover, a language is called a regular language if some finite automaton recognizes it.

Consider the following finite automaton.

Finite automata

First, this directed multigraph is called the state diagram of the finite automaton. It has three states, \(Q = \{q_0, q_1, q_2\}\), where \(q_0\) is the start state, and \(F = \{q_1\}\) is the accept state which is denoted by a double circle. The arrows are transitions \(\delta\) from one state to another state. This automaton processes an input string over the alphabet (in this case, \(\Sigma = \{0, 1\}\)) and returns either "accept" or "reject" as output.

For example, if input string is \(1101101\), this automaton processes the string as follows:

   Start reading
   Read 1, move from \(q_0\) to \(q_1\)
   Read 1, stay in \(q_1\)
   Read 0, move to \(q_2\)
   Read 1, move to \(q_1\)
   Read 1, stay in \(q_1\)
   Read 0, move to \(q_2\)
   Read 1, move to \(q_1\)
   End reading
   Return "accept."

This automaton recognizes strings that end with a 1 followed by an even number of 0s. For example, "10100" is accepted because it ends with 1 followed by two 0s, but "1011000" is rejected because it ends with three 0s after the last 1. Note that a string ending in 1 (with zero trailing 0s) is also accepted, since zero is even.

Regular Operations

We now define three fundamental operations on languages. A key result is that regular languages are closed under all three - applying any of these operations to regular languages always yields another regular language.

Definition: Regular Operations

Let \(A\) and \(B\) be languages over an alphabet \(\Sigma\).

  • Union:
    \(A \cup B = \{x \mid x \in A \text{ or } x \in B\}\)
  • Concatenation:
    \(A \circ B = \{xy \mid x \in A \text{ and } y \in B\}\)
    Note: \(\emptyset \circ A = \emptyset\), and \(\{\epsilon\} \circ A = A\). Also, \(A \circ A = A^2\), and we define \(A^0 = \{\epsilon\}\).
  • Kleene Star:
    \(A^* = \{x_1 x_2 \cdots x_k \mid k \geq 0 \text{ and each } x_i \in A\}\)
    Equivalently, \(A^* = A^0 \cup A^1 \cup A^2 \cup \cdots\).
    Note: The empty string \(\epsilon\) is always in \(A^*\) (the case \(k = 0\)). In particular, \(\emptyset^* = \{\epsilon\}\) since \(\emptyset^* = \emptyset^0 \cup \emptyset^1 \cup \emptyset^2 \cup \cdots\).

For example, Let \(\Sigma = \{a, \, b, \, c, \, \cdots, \, x, \, y, \, z\}\). If \(A = \{black, \, white\}\) and \(B = \{dog, \, cat\}\), then \[ \begin{align*} &A \cup B = \{black, \, white, \, dog, \, cat \} \\\\ &A \circ B = \{blackdog, \, blackcat, \, whitedog, \, whitecat\} \\\\ &A^* = \{\epsilon, \, black, \, white, \, blackblack, \, blackwhite, \, whiteblack, \, whitewhite, \, blackblackblack, \, blackblackwhite, \,\ \cdots\} \end{align*} \]

Theorem 1: Closure of Regular Languages

The class of regular languages is closed under the regular operations. That is, if \(A\) and \(B\) are regular languages, then \(A \cup B\), \(A \circ B\), \(A^*\), and \(B^*\) are also regular.

Proof: Closure under union

Suppose we have two finite automata \[ M_1 = (Q_1, \Sigma, \delta_1, q_{0_1}, F_1) \] and \[ M_2 = (Q_2, \Sigma, \delta_2, q_{0_2}, F_2). \] These can recognize regular languages \(A_1\) and \(A_2\) respectively.

Here, we can construct \(M_3\) to recognize \(A_1 \cup A_2\) as follows: \[ M_3 = (Q_3, \Sigma, \delta_3, q_{0_3}, F_3) \] where \[ Q_3 = Q_1 \times Q_2 = \{(q_1, q_2)\, | \, q_1 \in Q_1 \text{ and } q_2 \in Q_2\} \]

\[ \delta_3((q_1, q_2), a) = (\delta_1(q_1, a), \delta_2(q_2, a)) \quad \forall a \in \Sigma \]

\[ q_{0_3} = (q_{0_1}, q_{0_2}) \]

\[ F_3 = \{(q_1, q_2) \,| \, q_1 \in F_1 \text{ or } q_2 \in F_2 \} \] which is the same as \((F_1 \times Q_2) \cup (Q_1 \times F_2)\).

Note: If we say \(F_3 = F_1 \times F_2\), the resulting language is the intersection. Actually, the class of regular languages is closed under intersection too.

Deterministic & Nondeterministic Machine

So far, we have considered deterministic computation: at each step, the current state and input symbol uniquely determine the next state. However, what if a computation is allowed multiple choices for the next state? Such a computation is called nondeterministic. Every deterministic finite automaton (DFA) is trivially a nondeterministic finite automaton (NFA), since determinism is a special case of nondeterminism.

Definition: Nondeterministic Finite Automaton (NFA)

A nondeterministic finite automaton is a 5-tuple \((Q, \Sigma, \delta, q_0, F)\) where

  • \(Q\) is a finite set of states,
  • \(\Sigma\) is a finite alphabet,
  • \(\delta: Q \times \Sigma_{\epsilon} \to \mathcal{P}(Q)\) is the transition function,
  • \(q_0 \in Q\) is the start state,
  • \(F \subseteq Q\) is the set of accept states,

where \(\Sigma_{\epsilon} = \Sigma \cup \{\epsilon\}\) and \(\mathcal{P}(Q)\) is the power set of \(Q\).

Note: The power set of a set is the set of all possible subsets of that set, including the empty set and the set itself. For example, if \(Q = \{a, b, c\}\), then \(\mathcal{P}(Q) = \{\emptyset, \{a\}, \{b\}, \{c\}, \{a, b\}, \{a, c\}, \{b, c\}, \{a, b, c\}\} \). So, if a set has \(n\) elements, its power set has \(2^n\) elements.

At each state, the NFA can split itself into multiple branches and keeps following all the possible branches in parallel. In other words, the NFA can have multiple active states at once, while the DFA always has a single active state. If at least one of these branches reaches an accept state, then the whole computation accepts.

Note: The NFA sounds like parallel computation, but in "actual" parallel computation, multiple processors or threads execute tasks concurrently, often with considerations for synchronization, communication, and resource sharing. NFAs don't model these practical concerns.

Theorem 2: NFA-DFA Equivalence

Every NFA has an equivalent DFA. That is, for every NFA \(N\), there exists a DFA \(D\) such that \(L(N) = L(D)\).

Importantly, this theorem says that NFAs and DFAs recognize the same class of languages - the regular languages. However, an NFA is often much smaller and easier to design than its equivalent DFA. The conversion (known as the power set construction or subset construction) can produce a DFA with up to \(2^{|Q|}\) states, so the two models are not equivalent in terms of succinctness.

We now introduce a third way to describe regular languages - through algebraic expressions rather than machines.

Regular Expressions

Definition: Regular Expression

\(R\) is a regular expression over the alphabet \(\Sigma\) if \(R\) is one of the following:

  1. \(a\) for some \(a \in \Sigma\)  (the language \(\{a\}\)),
  2. \(\epsilon\)  (the language \(\{\epsilon\}\) containing only the empty string),
  3. \(\emptyset\)  (the empty language, containing no strings),
  4. \((R_1 \cup R_2)\), where \(R_1\) and \(R_2\) are regular expressions,
  5. \((R_1 \circ R_2)\), where \(R_1\) and \(R_2\) are regular expressions,
  6. \((R_1^*)\), where \(R_1\) is a regular expression.

This is an inductive definition: cases 4-6 build larger expressions from smaller ones.

A language that can be described by a regular expression is precisely a language recognizable by a finite automaton — this is the reason such languages are called regular. To distinguish a regular expression \(R\) from the language it describes, we write \(L(R)\) for the language of \(R\).

Theorem 3: Equivalence of Regular Expressions and Finite Automata

A language is regular if and only if some regular expression describes it. That is:

  1. If a language is described by a regular expression, then it is regular.
  2. If a language is regular, then it is described by a regular expression.
Proof: Part 1

For each case in the definition, we want to convert regular expression \(R\) into NFA \(M\) recognizing some language because we know that if an NFA recognizes the language, then it is regular.

  1. \(R = a\) for some \(a \in \Sigma\):
    Then \[ L(R) = \{a\} \] and the following NFA recognizes \(L(R)\): \[ M = (\{q_0 , q_1\}, \Sigma, \delta, q_0, \{q_1\}) \] where \(\delta(q_0, a) = \{q_1\}\) and \(\delta(r, b) = \emptyset\) for \(r \neq q_0\) or \(b \neq a\).
  2. \(R =\epsilon\):
    Then \[ L(R) = \{\epsilon\} \] and the following NFA recognizes \(L(R)\): \[ M = (\{q_0\}, \Sigma, \delta, q_0, \{q_0\}) \] where \(\delta(r, b) = \emptyset\) for any \(r\) and \(b\).
  3. \(R = \emptyset\):
    Then \[ L(R) = \emptyset \] and the following NFA recognizes \(L(R)\): \[ M = (\{q\}, \Sigma, \delta, q, \emptyset) \] where \(\delta(r, b) = \emptyset\) for any \(r\) and \(b\).
  4. Case 1, 2, and 3

    For cases 4, 5, and 6, let \(M_1 = (Q_1, \Sigma, \delta_1, q_{0_1}, F_1)\) be an NFA that recognizes \(L(R_1)\) and \(M_2 = (Q_2, \Sigma, \delta_2, q_{0_2}, F_2)\) be an NFA that recognizes \(L(R_2)\). We assume that the state sets \(Q_1\) and \(Q_2\) are disjoint.

  5. \(R = (R_1 \cup R_2)\):
    Then \[ L(R) = (R_1 \cup R_2) \] and the following NFA recognizes \(L(R)\): \[ M = ( Q_1 \cup Q_2 \cup \{q_0\}, \Sigma, \delta, q_0, F_1 \cup F_2) \] where for any \(q \in Q\) and any \(a \in \Sigma_{\epsilon}\), \[ \delta(q, a) = \begin{cases} \delta_1(q, a) &\text{if \(q \in Q_1\)}, \\ \delta_2(q, a) &\text{if \(q \in Q_2\)}, \\ \{q_{0_1}, q_{0_2}\} &\text{if \(q = q_0\) and \(a = \epsilon\)}, \\ \emptyset &\text{if \(q = q_0\) and \(a \neq \epsilon\)}. \\ \end{cases} \]
  6. Case 4
  7. \(R = (R_1 \circ R_2)\):
    Then \[ L(R) = (R_1 \circ R_2) \] and the following NFA recognizes \(L(R)\): \[ M = ( Q_1 \cup Q_2 , \Sigma, \delta, q_{0_1}, F_2) \] where for any \(q \in Q\) and any \(a \in \Sigma_{\epsilon}\), \[ \delta(q, a) = \begin{cases} \delta_1(q, a) &\text{if \(q \in Q_1\) and \(q \not\in F_1\) }, \\ \delta_1(q, a) &\text{if \(q \in F_1\) and \(a \neq \epsilon\) }, \\ \delta_1(q, a) \cup \{q_{0_2}\} &\text{if \(q \in F_1\) and \(a = \epsilon\) } \\ \delta_2(q, a) &\text{if \(q \in Q_2\) }. \\ \end{cases} \]
  8. Case 5
  9. \(R = (R_1^*) \):
    Then \[ L(R) = (R_1^*) \] and the following NFA recognizes \(L(R)\): \[ M = ( Q_1 \cup \{q_0\}, \Sigma, \delta, q_0, \{q_0\} \cup F_1) \] where for any \(q \in Q\) and any \(a \in \Sigma_{\epsilon}\), \[ \delta(q, a) = \begin{cases} \delta_1(q, a) &\text{if \(q \in Q_1\) and \(q \not\in F_1\) } \\ \delta_1(q, a) &\text{if \(q \in F_1\) and \(a \neq \epsilon\) }, \\ \delta_1(q, a) \cup \{q_{0_1}\} &\text{if \(q \in F_1\) and \(a = \epsilon\) }, \\ \{q_{0_1}\} &\text{if \(q = q_0\) and \(a = \epsilon\) }, \\ \emptyset &\text{if \(q=q_0\) and \(a \neq \epsilon\)}. \\ \end{cases} \]
  10. Case 6

As an example, we convert the regular expression \((01 \cup 1)^*\) into an NFA through a sequence of stages. This expression generates strings such as \[ \{\epsilon, \, 01, \, 1, \, 0101, \, 11, \, 1101, \, 01011101, \, 11111, \, 01101011110111, \, \cdots\} \] and the following NFA recognizes these strings. When the NFA follows an \(\epsilon\)-transition, it moves to the next state without reading the next symbol from the input string.

NFA from Regular Expression