Adversarial Search

 

Game

Components

  • initial state $s_0$
  • states
  • players: $PLAYER(s)$ defines which player has the move in state $s$

  • actions: $ACTIONS(s)$ returns set of legal moves in state $s$
  • transition model: $RESULT(s, a)$ returns state that results from the move $a$ in state $s$
  • terminal test $TERMINAL(s)=true$ iff game end
  • utility function $UTILITY(s, p)$: final numeric value for a game that ends in terminal state $s$ for a player $p$

Strategies

Player Strategy

A strategy $s$ for player $i$: what will player $i$ do at every node of the tree that they make a move in

Winning Strategy

A strategy $s_1^*$ for player 1 is called winning if for any strategy $s_2$ by player 2, the game ends with player 1 as the winner

A strategy $t_1^*$ for player 1 is called non-losing if for any strategy $s_2$ by player 2, the game ends in either a tie or a win for player 1.

Minimax

$Minimax(s) = max_{a\in Actions(s)}Minmax(Result(s,a))$ if $Player(s)=MAX$

$Minimax(s) = min_{a\in Actions(s)}Minmax(Result(s,a))$ if $Player(s)=MIN$

  • MAX chooses move to maximize the minimum payoff
  • MIN chooses move to minimize the maximum payoff
Complete Yes (if game tree if finite)
Optimal Yes (optimal gameplay)
Time $O(b^m)$
Space Like DFS: $O(bm)$

Runs in time polynomial in tree size

Returns a sub-perfect Nash equilibrium: the best action at every choice node

Drawbacks

Game trees are huge

  • $\alpha-\beta$ pruning

Impossible to expand the entire tree

  • evaluation functon = estimated expected utility of state
  • cutoff test: e.g., depth limit

$\alpha-\beta$ Pruning

Maintain a lower bound $\alpha$ and upper bound $\beta$ of the values of, respectively, MAX’s and MIN’s nodes seen thus far

Prune subtrees that will never affect minimax decision

MAX node $n$: $\alpha(n)$ is highest observed value found on path from $n$; initially $\alpha(n) = -\infin$

MIN node $n$: $\beta(n)$ is the lowest observed value found on path from $n$; initially $\beta(n)=+\infin$

Pruning
  • given a MIN node $n$, stop searching below $n$ if there is some MAX ancestor $i$ of $n$ with $\alpha(i) ≥ \beta(n)$

  • given a MAX node $n$, stop searching below $n$ if there is some MIN ancestor $i$ of $n$ with $\beta(i)≤\alpha(n)$

Heuristic Minimax Value

Run minimax until depth d; then start using the evaluation function to choose nodes

Evaluation Function

An evaluation function is a mapping from game states to real values: $f: S \rightarrow \R$

$f(s)=UTILITY(s)$ if $TERMINAL(s)$

$f(s)=0$ otherwise, i.e. no information on quality of non-terminal nodes

For non-terminal states, must be strongly correlated with actual chances of winning

Modify minimax or $\alpha-\beta$ pruning algorithms by replacing

  • TERMINAL(s) with CUTOFF(s, d)
  • UTILITY(s) is replaced by EVAL(s)

Can also be combined with iterative deepening