Patterns
Atomic
-
matched by the empty string -
matched by nothing -
matches any symbol in
Compound
-
matched by string if matches or -
: matches both -
: if string , matches and matches -
: doesn’t match -
: matches or if is an empty string -
: zero or more characters -
: one or more characters
Language Definition
-
-
-
-
-
-
- RHS: iterated language operation (concatenating language with itself infinitely many times)
-
-
-
Order of operations:
, concatenation, union
Proving Every Language Generated by a Regular Expression is an NFA
Where
∅:
---> ◯ ⭘
a: ε:
a ε
---> ◯ ---> ⭘ ---> ◯ ---> ⭘
p|q:
ε ε
---> ◯ p ◯ ---
ε / \
---> ◯ ---> ◯ ⭘
\ ε ε /
---> ◯ q ◯ ---
pq:
ε ε ε
---> ◯ ---> ◯ p ◯ -----> ◯ q ◯ ---> ⭘
p*:
ε ε
---> ◯ ---> ◯ p ◯ ---> ⭘
| ^ | ^
| -------- |
----------------------
Additionally:
Finite Automata to RegExp
- Labelled with RegExps as well as symbols from the alphabet (or
) - One accept state
- At most one transition between states
- No transitions into the start state or out of the end state
Every language accepted by an NFA is generated by a RegExp. We can successively remove states and replace them with longer and longer RegExps as transitions. To remove
To remove q_j, for every qi -> qj -> qk, where qj not qi or qj:
rij rjk
qi ---------> qj ---------> qk
| ^ | ^
| |__| |
| rjj |
|___________________________|
rik
rik | (rij (rjj)* rjk)
qi ----------------------> qk
For two-state loops:
rij rjk
qi ---------> qj ---------> qk
^ rji |
|------------|
Remove q_j: i -> j -> k, and i -> j -> i
r_ij r_jk
qi ---------> qk
^ |
|__|
rij rji
(rij rji)* r_ij r_jk
qi -----------------------> qk
Minimization of DFA
- Eliminate states that can’t be reached from the start state
- Find equivalent states
- Collapse equivalent states
Equivalence of states
i.e. states are equivalent if, for all strings, if you start at
If states are not equivalent, they are distinguishable:
- If one state is accepting and another is not accepting,
can be passed and they are clearly distinguishable - If two states
and are distinguishable by the string , if and , then this means and produce different outputs for the string so they are also distinguishable
Algorithm to find distinguishable states
- Create a table (staircase) of pairs where one is accepting and one is not (i.e. distinguishable)
- For remaining pairs, see what happens under all transitions. If the pair of states you end up in are known to be distinguishable for any transition, then the pair you started with are also distinguishable. If not, no new information is obtained
- Repeat until no new information is obtained
- Reflexive:
- Symmetric:
implies - Transitive:
and implies
Thus it is an equivalence relation
-
is the equivalence class of -
is an representative of - e.g. if
,
- e.g. if
-
is the quotient of by - i.e. the set of all equivalence classes
Minimization Algorithm
- Let DFA
- The quotient automation
- A state of
is an equivalence class of -
-
-
- The transition function is well-defined: it doesn’t matter which representative of
is chosen
-
Notes:
- Two minimal DFAs for the same language are identical, except for the names of the states
- Minimization does not work for NFAs
Decision Problems
Languages
- Membership:
? - Emptiness:
? - Finiteness:
? - Universality:
? - Intersection Emptiness:
? - Inclusion:
- Equivalence:
Membership
If the string
Run
Emptiness
If no string is accepted by the language.
Reachability to an accepting state. Use BFS or DFS.
Running time is linear in the size of the input
Finiteness
If the number of strings accepted by the language is finite.
- Check if there is a cycle on path from start state to an accept state
- Remove all states from which accept state is not reachable
- Check if the cycle is still reachable from start state
- Infinite if this is true
Running time is linear in the size of
Universality
If the language accepts all possible strings.
True if and only if
Running time is linear.
Intersection Emptiness
True if
Reduce the problem to the emptiness problem by constructing the product automaton.
- Constructing product DFA is proportional to
times : it is quadratic - Running time is linear in the size of the input: the product automaton, not
- So the running time is quadratic as well
Inclusion
If
Hence, take the complement of the
Equivalence
Use inclusion property going both ways.
Running time is quadratic.
Non-Regular Languages
Not all languages are regular. Counterexample:
- In every state, there is exactly one transition for each symbol
- There are finitely many states
- A DFA can accept infinitely long strings
- So there must be a loop
- Which means its behavior will repeat
Assume
- Pass in a string longer than
: it must go through states. Assume it is - Therefore it must repeat through one state
- Let
be the number of ’s required to get to the repeated states the first time - And
the number of ’s to get to the repeated state the second time -
-
- So
must be in the language, but can’t be in the language - So by contradiction,
can’t be a DFA
- And
Pumping Lemma
If
-
-
-
for every
Three parts: getting to cycle, looping through the cycle
To show
Set
Therefore
Hence,
Modelling Independent Processes
Shuffling:
-
: insert next character from or into output. Returns set of all possible shuffles -
Shuffling languages: every possible shuffle of every combination of two strings (one from each language)
Shuffle of the same language: NFA with
Product Automaton
Product of two automaton accepts the intersection of the two languages - strings that satisfy conditions for both languages. The languages must have the same alphabet.
Create by drawing a matrix of states, each axis representing states from one automaton (useful to have different naming schemes for each automaton such as 1, 2, 3 etc. for one and a, b, c etc. for the other).
To be in a state, it is in both states simultaneously. To determine transitions, determine the state transitions individually, and find the ‘coordinates’ for the product by concatenating the two. e.g. if you are in state a3, to determine the transition for symbol 0, let transition(dfa1, a, 0) = a and transition(dfa2, 3, 0) = 4, then draw a transition to the resultant state a4. Accept states are states with are accepting in both automata.