Adversarial Search

initial state $s_0$
states
players: $PLAYER(s)$ defines which player has the move in state $s$
actions: $ACTIONS(s)$ returns set of legal moves in state $s$
transition model: $RESULT(s, a)$ returns state that results from the move $a$ in state $s$
terminal test $TERMINAL(s)=true$ iff game end
utility function $UTILITY(s, p)$: final numeric value for a game that ends in terminal state $s$ for a player $p$

A strategy $s$ for player $i$: what will player $i$ do at every node of the tree that they make a move in

A strategy $s_1^*$ for player 1 is called winning if for any strategy $s_2$ by player 2, the game ends with player 1 as the winner

A strategy $t_1^*$ for player 1 is called non-losing if for any strategy $s_2$ by player 2, the game ends in either a tie or a win for player 1.

$Minimax(s) = max_{a\in Actions(s)}Minmax(Result(s,a))$ if $Player(s)=MAX$

$Minimax(s) = min_{a\in Actions(s)}Minmax(Result(s,a))$ if $Player(s)=MIN$

Runs in time polynomial in tree size

Returns a sub-perfect Nash equilibrium: the best action at every choice node

Game trees are huge

Impossible to expand the entire tree

Maintain a lower bound $\alpha$ and upper bound $\beta$ of the values of, respectively, MAX’s and MIN’s nodes seen thus far

Prune subtrees that will never affect minimax decision

MAX node $n$: $\alpha(n)$ is highest observed value found on path from $n$; initially $\alpha(n) = -\infin$

MIN node $n$: $\beta(n)$ is the lowest observed value found on path from $n$; initially $\beta(n)=+\infin$

given a MIN node $n$, stop searching below $n$ if there is some MAX ancestor $i$ of $n$ with $\alpha(i) ≥ \beta(n)$
given a MAX node $n$, stop searching below $n$ if there is some MIN ancestor $i$ of $n$ with $\beta(i)≤\alpha(n)$

Run minimax until depth d; then start using the evaluation function to choose nodes

An evaluation function is a mapping from game states to real values: $f: S \rightarrow \R$

$f(s)=UTILITY(s)$ if $TERMINAL(s)$

$f(s)=0$ otherwise, i.e. no information on quality of non-terminal nodes

For non-terminal states, must be strongly correlated with actual chances of winning

Modify minimax or $\alpha-\beta$ pruning algorithms by replacing

Can also be combined with iterative deepening