5. Computability and Complexity

Decision Problems

LL is a language over the alphabet Σ\Sigma, and the string wΣw \in \Sigma^*. ωL\omega \in L is a decision problem.

LL is decidable, reversible, computable, iff there is an algorithm that outputs if wLw \in L or not.

The algorithm must terminate for all inputs. If there is no such algorithm, L is undecidable.

Optimization problems can be encoded into decision problems: for example, to find the length of shortest path, simply check if there a path with length less than kk, incrementing kk until you get a yes.

There are uncountably many languages over the alphabet: P(Σ)=2N=R|\mathbb{P}(\Sigma^*)| = 2^{|\mathbb{N}|} =|\mathbb{R}|. However, there are only a countable number of algorithms: Σ=N|\Sigma^{'*}| = |\mathbb{N}|. Hence for most languages, there is no decision algorithm.

LL is semi-decidable, recursively enumerable, or partially computable, iff the algorithm outputs if wLw \in L, and doesn’t output anything if not - i.e. It does not terminate if wLw \notin L. Hence, you do not know if it is decidable or not unless it outputs the result.

Every decidable language is also semi-decidable: if no is outputted, just pass it through to an endless loop.

Some undecidable languages are semi-decidable.

If LL and L\overline{L} are both semi-decidable, then LL is decidable: run both algorithms simultaneously for each string; one of them is guaranteed to terminate.

Algorithms correspond to a function ff: ΣΣ\Sigma^* \rightarrow \Sigma^*; map an input to an output, both of which are encoding using the alphabet Σ\Sigma.

NB: Functions on numbers can be encoded using binary (otherwise, the alphabet would be infinite).

Semi-decidable:

Turing Machines

M=(Q,Σ,Γ,δ,q0,_,qa,qr) M = (Q, \Sigma, \Gamma, \delta, q_0,\__, q_a, q_r)

Where:

The TM operates as follows:

Notes:

Configuration

Storing the current state of the Turing machine: (x,q,y)Γ×Q×Γ(x, q, y) \in \Gamma^* \times Q \times \Gamma^* where:

Since only one cell on the tape can be modified at a time, a configuration relation can be used to describe transitions. e.g. where a,b,cΓa, b, c \in \Gamma and p,qQp, q \in Q:

(xa,p,by){(x,q,acy)δ(p,b)=(q,c,L)(xa,q,cy)δ(p,b)=(q,c,N) (xa, p, by) \rightarrow \begin{cases} (x, q, acy) &\delta (p, b) = (q, c, L) \\ (xa, q, cy) &\delta (p,b) = (q, c, N) \end{cases}

Notes:

Computation Models

There are many computation models e.g. TMs with multiple tapes, pushdown automata with two stacks etc. Many of these are equivalent: they accept the same languages and compute the same functions.

Church-Turing thesis: the intuitively computable functions (that is, that humans can compute - hard to formalize) are those partially computable by a Turing machine.

Chomsky Hierarchy

G=(N,Σ,P,S) G = (N, \Sigma, P, S)

Where:

Different grammars are obtained by restricting the form of each production xyx \rightarrow y:

Notes:

Grammar Language Automata
Type-0 General/Recursively Enumerable Turing Machine (or NTM)
Type-1 Context-sensitive Linearly space-bounded non-deterministic automata
Type-2 Context-free Push-down automata (not deterministic PDA)
Type-3 Regular Deterministic-Finite Automata (or NFA)

Undecidability

Programs can be the input of other programs e.g. compiler takes a program as input, Python interpreter.

Special Halting Problem

The following, called the special halting problem, is undecidable:

SHP={MM halts on input M} SHP = \{ \langle M\rangle \mid M \text{ halts on input } \langle M \rangle \}

M\langle M\rangle is the representation of the machine in tape.

Thus, the SHP is the set of all programs which, when given itself as input, halts.

Proof
def M_SHP(program_str):
  return True if exec(program_str)(program_str) halts else False

def M_dash(program_str):
  if M_SHP(program_str):
    while True:
      continue
  return False

M_dash(as_string(M_dash))
# If false is returned, then it does not halt so it cannot have returned false
# If it does not terminate, then it must halt, but it cannot have halted
# Hence, M_SHP cannot exist

Halting Problem

{M#wM halts on input w}\{ \langle M \rangle \#w \mid M \text{ halts on input } w \}, where #\# is a symbol that does not appear in M\langle M\rangle, allowing it to act as a separator.

Assume there is some total TM which determines if some program given some arbitrary string terminates. If this is possible, solving the special halting problem is trivial - just pass the specific input of M\langle M \rangle. Since its not possible, neither is the halting problem

Some other undecidable languages:

Complexity Classes

Running times of programs:

t(M,x)t(M, x) returns the number of steps the (total?) TM M takes for a given input xx until it halts.

Function f:NNf:N \rightarrow N gives upper bound on how long a function can take for a given input size: T(f) is the set of languages, with the language generated by a total TM MM being in T(f)T(f) if t(M,x)<f(x)t(M, x) < f(|x|) for all values of xx.

The program can be solved efficiently if it can be solved in polynomial time. These languages are in the complexity class PP. Examples include:

Non-Deterministic Turing Machine (NTM)

Some problems known to be in NPNP (and may or may not be PP):

Problem Complexity

Compare problem complexities by reducing one to another:

NP-Complete Problems

If these two conditions are met, then every other NPNP problem is polynomial-time reducible to it. Thus, if one NPNP-complete problem is solved, then P=NPP = NP

Showing that a problem is NPNP-hard is difficult - there are a lot of NPNP problems.

The satisfiability problem (SAT) was the first problem shown to be NPNP complete:

Hence, there is a tree of reductions.

CLIQUE

Finding the largest subset of the graph that is complete - all pairs of nodes are connected.

SAT

Given logic formula:

=Variable  !Formula  Formula AND Formula  Formula OR Formula \begin{aligned} = &\text{Variable} \\ | \; &\text{!Formula} \\ | \; &\text{Formula AND Formula} \\ | \; &\text{Formula OR Formula} \end{aligned}

Output: can the variables be set such that the formula evaluates to true?

On a NTM, guess and verify:

3-SAT

Limit the structure of logic formulas using Conjunctive Normal Form (CNF):

CNF=Clause (AND Clause)Clause=Literal (OR Literal)Literal=Variable OR !Variable \begin{aligned} \text{CNF} &= \text{Clause } \text{(AND Clause)}^* \\ \text{Clause} &= \text{Literal } \text{(OR Literal)}^* \\ \text{Literal} &= \text{Variable OR !Variable} \end{aligned}

Every formula can be converted into CNF.

If the distributivity rule is used in highly nested formulas there is exponential blow-up as each use of it causes duplication: thus, it cannot be used.

S-SAT

Given formula in CNF with exactly 3 literals per clause: Clause = Literal OR Literal OR Literal\text{Clause = Literal OR Literal OR Literal}

Result is in 3-CNF, obtained in polynomial time, is satisfiable if and only if the original formula is satisfiable.