Exam Notes

Multiple-path pruning: prune a path on insertion into or removal from the frontier if its end node has already been expanded.

A*: $f(path) = cost(path) + h(path[-1])$.
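
A minimal A* sketch over an explicit graph (the graph encoding, names, and example values are illustrative, not from the notes); it also applies the multiple-path pruning above, skipping paths whose end node was already expanded:

```python
import heapq

def a_star(graph, h, start, goal):
    """A*: expand the frontier path with the lowest f(path) = cost(path) + h(path[-1])."""
    frontier = [(h(start), 0, [start])]   # (f, cost, path); smallest f popped first
    expanded = set()
    while frontier:
        f, cost, path = heapq.heappop(frontier)
        node = path[-1]
        if node in expanded:              # multiple-path pruning on removal
            continue
        expanded.add(node)
        if node == goal:
            return path, cost
        for neighbour, step in graph.get(node, []):
            heapq.heappush(frontier, (cost + step + h(neighbour),
                                      cost + step, path + [neighbour]))
    return None

# graph maps each node to (neighbour, edge cost) pairs; h = 0 reduces A* to lowest-cost-first.
graph = {'a': [('b', 1), ('c', 4)], 'b': [('c', 1), ('g', 5)], 'c': [('g', 2)]}
print(a_star(graph, lambda n: 0, 'a', 'g'))  # (['a', 'b', 'c', 'g'], 4)
```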

Bidirectional search (BFS, LCFS, A*): $2b^{d/2} \ll b^d$; saves time and space. E.g. with $b = 10$ and $d = 8$, that is $2 \cdot 10^4 = 20{,}000$ nodes versus $10^8$.

Iterative deepening: max depth incremented until a solution is found. DFS is $\geq b^k$; iterative deepening is $\leq b^k \cdot \left(\frac{b}{b-1}\right)^2$.
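
A sketch of iterative deepening (the function names and example graph are made up for illustration): depth-bounded DFS is re-run with the bound increased until it finds a solution.

```python
def bounded_dfs(graph, path, goal, bound):
    """DFS that gives up once the depth bound reaches zero."""
    if path[-1] == goal:
        return path
    if bound == 0:
        return None
    for neighbour in graph.get(path[-1], []):
        result = bounded_dfs(graph, path + [neighbour], goal, bound - 1)
        if result is not None:
            return result
    return None

def iterative_deepening(graph, start, goal, max_depth=50):
    """Increment the max depth until a solution is found."""
    for bound in range(max_depth + 1):
        result = bounded_dfs(graph, [start], goal, bound)
        if result is not None:
            return result
    return None

graph = {'a': ['b', 'c'], 'b': ['d'], 'c': ['g'], 'd': []}
print(iterative_deepening(graph, 'a', 'g'))  # ['a', 'c', 'g']
```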

Propositions and Inference

Soundness and Completeness

Sound: everything derived ($KB \vdash g$) is a logical consequence ($KB \models g$). Complete: every logical consequence of the KB can be derived.

Bottom-up/forward-chaining

Fixed point: the set of consequences generated once no rule can add a new atom.

If $I$ is the interpretation where every element of the fixed point is true and every other atom is false, $I$ is the minimal model of the KB.
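
A sketch of the bottom-up procedure for definite clauses (the `(head, body)` encoding is an assumption, not from the notes): keep firing rules whose bodies are already consequences until nothing new is added; the resulting set is the fixed point.

```python
def forward_chain(clauses):
    """Bottom-up proof procedure: returns the fixed point (all derivable atoms).

    `clauses` is a list of (head, body) pairs; body is a list of atoms.
    """
    consequences = set()
    changed = True
    while changed:
        changed = False
        for head, body in clauses:
            if head not in consequences and all(a in consequences for a in body):
                consequences.add(head)  # the body is proved, so derive the head
                changed = True
    return consequences

# KB: a <- b & c.  b <- d.  c.  d.
kb = [('a', ['b', 'c']), ('b', ['d']), ('c', []), ('d', [])]
print(forward_chain(kb))  # {'a', 'b', 'c', 'd'}
```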

Top-down procedure

Answer clause: $yes \leftarrow a_1 \land \dots \land a_m$.

Until the answer clause is an answer ($yes \leftarrow$ with an empty body), repeatedly run SLD resolution.
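
A sketch of the top-down procedure for ground definite clauses (same illustrative KB encoding as above; no loop detection): resolve the leftmost atom of the answer clause against a clause with a matching head, succeeding when the body empties.

```python
def sld_prove(kb, query):
    """SLD resolution on the answer clause yes <- a1 & ... & am."""
    if not query:                 # answer clause is `yes <-`: an answer
        return True
    first, rest = query[0], query[1:]
    for head, body in kb:
        # Resolve `first` against this clause: replace it with the clause body.
        if head == first and sld_prove(kb, body + rest):
            return True
    return False

kb = [('a', ['b', 'c']), ('b', ['d']), ('c', []), ('d', [])]
print(sld_prove(kb, ['a']))  # True
```

Note that a cyclic KB (e.g. $a \leftarrow a$) makes this depth-first version loop forever.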

Prolog

Prolog implements the top-down procedure depth-first: clauses are tried in order, the leftmost atom is selected, and it backtracks on failure.

Constraint Satisfaction Problems

A set of variables, a domain for each variable, and a set of constraints; a solution is an assignment to the variables that satisfies all the constraints.

Constraint Network: a graph with a node for each variable and each constraint, and an arc between a variable and every constraint whose scope includes it.

Arc Consistency

For a given constraint $c$ and a variable $X$ in the scope of $c$, the arc $\langle X, c \rangle$ is arc consistent if, for each value in the domain of $X$, there is an assignment to the other variables in $c$'s scope that satisfies $c$.

Elements may need to be removed from the domain of $X$ to make it arc consistent.

Algorithm: keep a set of arcs to check (initially all of them); repeatedly pick an arc $\langle X, c \rangle$ and prune the values of $X$ that have no support; whenever $X$'s domain shrinks, re-add every arc $\langle Z, c' \rangle$ where $c'$ involves $X$ and $Z \neq X$.

After pruning: an empty domain means no solution; if every domain is a singleton, that is the solution; domains with multiple values may or may not contain a solution.
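
A generalized arc-consistency (AC-3 style) sketch; the `domains`/`constraints` encoding and names are assumptions for illustration:

```python
from itertools import product

def gac(domains, constraints):
    """Prune domains until every arc <X, c> is arc consistent.

    `domains` maps variable -> set of values; `constraints` is a list of
    (scope, predicate) pairs. Mutates and returns `domains`; an emptied
    domain means the CSP has no solution.
    """
    to_check = [(x, c) for c in constraints for x in c[0]]
    while to_check:
        x, (scope, pred) = to_check.pop()
        others = [v for v in scope if v != x]
        supported = set()
        for val in domains[x]:
            # keep val if some assignment to the other variables satisfies pred
            for combo in product(*(domains[o] for o in others)):
                env = dict(zip(others, combo))
                env[x] = val
                if pred(*(env[v] for v in scope)):
                    supported.add(val)
                    break
        if supported != domains[x]:
            domains[x] = supported
            # a shrunken domain can break other arcs that involve x
            to_check.extend((z, c) for c in constraints
                            if x in c[0] for z in c[0] if z != x)
    return domains

doms = {'A': {1, 2, 3}, 'B': {1, 2, 3}}
print(gac(doms, [(('A', 'B'), lambda a, b: a < b)]))  # {'A': {1, 2}, 'B': {2, 3}}
```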

Domain Splitting

Halve the domain of some variable, create two instances of the problem with the new domains, and solve both, revisiting only the constraints involving the split variable.
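
A domain-splitting sketch that reuses the `gac()` function from the arc-consistency sketch above (for simplicity it re-runs full arc consistency after each split rather than revisiting only the split variable's constraints):

```python
def solve(domains, constraints):
    """Prune with gac(); fail on an empty domain, answer on all singletons,
    otherwise halve one domain and recursively solve both instances."""
    domains = gac({v: set(d) for v, d in domains.items()}, constraints)
    if any(not d for d in domains.values()):
        return None                                   # dead end: empty domain
    if all(len(d) == 1 for d in domains.values()):
        return {v: next(iter(d)) for v, d in domains.items()}
    var = next(v for v, d in domains.items() if len(d) > 1)
    values = sorted(domains[var])
    for half in (values[:len(values) // 2], values[len(values) // 2:]):
        result = solve({**domains, var: set(half)}, constraints)
        if result is not None:
            return result
    return None

doms = {'A': {1, 2, 3}, 'B': {1, 2, 3}}
print(solve(doms, [(('A', 'B'), lambda a, b: a < b)]))  # {'A': 1, 'B': 2}
```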

Variable Elimination

Eliminate a variable by joining all the constraints that involve it, projecting the join onto the remaining variables, and passing the resulting constraint on to its neighbors.
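
A variable-elimination sketch for CSPs (the explicit-relation encoding and names are assumptions; this version only reports satisfiability, since recovering a solution would require remembering the eliminated relations):

```python
from itertools import product

def ve_satisfiable(domains, constraints):
    """Eliminate each variable by joining the constraints that mention it
    and projecting the join onto the remaining variables.

    Constraints are (scope, relation) pairs, where relation is the set of
    value tuples allowed for the scope, e.g. (('A', 'B'), {(1, 2), (1, 3)}).
    """
    for var in list(domains):
        touching = [c for c in constraints if var in c[0]]
        constraints = [c for c in constraints if var not in c[0]]
        scope = sorted({v for s, _ in touching for v in s} | {var})
        # Join: assignments to the union of scopes consistent with every relation.
        joined = set()
        for combo in product(*(sorted(domains[v]) for v in scope)):
            env = dict(zip(scope, combo))
            if all(tuple(env[v] for v in s) in rel for s, rel in touching):
                joined.add(combo)
        # Project var out and pass the new constraint on to var's neighbours.
        new_scope = tuple(v for v in scope if v != var)
        constraints.append((new_scope,
                            {tuple(t[i] for i, v in enumerate(scope) if v != var)
                             for t in joined}))
    # With every variable eliminated, non-empty relations mean satisfiable.
    return all(rel for _, rel in constraints)

lt = {(x, y) for x in (1, 2, 3) for y in (1, 2, 3) if x < y}
doms = {'A': {1, 2, 3}, 'B': {1, 2, 3}, 'C': {1, 2, 3}}
print(ve_satisfiable(doms, [(('A', 'B'), lt), (('B', 'C'), lt)]))  # True: 1 < 2 < 3
```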

Roulette wheel selection: return the first individual whose fitness, added to the running total, makes that total exceed a random number drawn between 0 and the sum of the fitnesses.
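
A sketch of roulette-wheel selection (the population and fitness values are made up):

```python
import random

def roulette_select(population, fitness):
    """Return the first individual whose running fitness total exceeds a
    random number drawn between 0 and the sum of the fitnesses."""
    total = sum(fitness(ind) for ind in population)
    r = random.uniform(0, total)
    running = 0.0
    for ind in population:
        running += fitness(ind)
        if running > r:
            return ind
    return population[-1]  # floating-point guard; should rarely be reached

pop = ['a', 'b', 'c']
fitness = {'a': 1.0, 'b': 3.0, 'c': 6.0}.__getitem__
print(roulette_select(pop, fitness))  # 'c' on about 60% of calls
```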

Probabilistic Inference and Belief Networks

$$P(x|y,z) = \frac{P(x,y|z)}{P(y|z)}$$

For a full assignment:

$$P(x_1, \dots, x_n) = \prod_{i=1}^{n}{P(x_i | parents(X_i))}$$

The distribution over all the variables is called the joint probability distribution.
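
A sketch computing the product above for a full assignment in a belief network (the two-node network and its probabilities are invented for illustration):

```python
def joint_probability(assignment, parents, cpts):
    """P(x1, ..., xn) = product of P(xi | parents(Xi)) for a full assignment.

    `parents` maps variable -> tuple of parents; `cpts` maps variable ->
    function from (value, parent values) to a conditional probability.
    """
    p = 1.0
    for var, value in assignment.items():
        parent_values = tuple(assignment[q] for q in parents[var])
        p *= cpts[var](value, parent_values)
    return p

# Network A -> B with made-up numbers for P(A) and P(B | A).
parents = {'A': (), 'B': ('A',)}
cpts = {
    'A': lambda v, _: 0.3 if v else 0.7,
    'B': lambda v, pv: (0.9 if v else 0.1) if pv[0] else (0.2 if v else 0.8),
}
print(joint_probability({'A': True, 'B': True}, parents, cpts))  # 0.3 * 0.9 ≈ 0.27
```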

Basic Machine Learning

Artificial Neural Networks

Prediction: $a = \sum_{i=0}^{n}{w_i x_i}$, where $x_0 = 1$.

Activation function: $g(a) = 1$ if $a \geq 0$, else $0$.

Learning: $w_i \leftarrow w_i + \eta \cdot x_i \cdot (actual - prediction)$, where $\eta$ is the learning rate. Repeat for each training example and loop until there are no misclassifications or a limit is reached.
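
A runnable perceptron sketch combining the three formulas above (the dataset and parameter values are illustrative):

```python
def train_perceptron(examples, n_features, eta=0.1, max_epochs=100):
    """Perceptron learning with the threshold activation above."""
    weights = [0.0] * (n_features + 1)            # weights[0] pairs with x_0 = 1
    for _ in range(max_epochs):
        errors = 0
        for features, actual in examples:
            x = [1.0] + list(features)            # prepend the bias input x_0 = 1
            a = sum(w * xi for w, xi in zip(weights, x))
            prediction = 1 if a >= 0 else 0       # g(a)
            if prediction != actual:
                errors += 1
                weights = [w + eta * xi * (actual - prediction)
                           for w, xi in zip(weights, x)]
        if errors == 0:                           # no misclassifications: done
            break
    return weights

# Learn logical AND, which is linearly separable.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
print(train_perceptron(data, 2))
```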