Motion Estimation

 
  Feature Matching Object Detection
scoring Feature matching across pairs of images and not feature detection (e.g. cornerness score).
A lower score is a better match, since we use distance measure for comparison.
Higher -> more likely to be an object
threshold Keep matches if below some threshold Keep detections if above some threshold
Precision
$TP \over TP+FP$
How accurate are the feature pairs declared as matches?
Lower threshold -> lower FP (also lower TP, but doesn’t matter) -> higher precision
How accurate are the detections?
Higher threshold -> lower FP -> high
precision
Recall
$TP \over TP+FN$
Was the algorithm able to find all the actual pairs of features?
Higher threshold -> more TP (again balanced in numerator and denominator) but also lower FN -> higher recall
Could we find all the objects?
Lower threshold -> less FN -> higher recall
Specificity
$TN \over TN+FP$
Can the algorithm correctly disregard the features which are not part of any pair?
Lower threshold -> lower FP, no impact on the TNs -> higher specificity
 

If threshold too high:

high precision: few false positives

low recall: many false negatives

Motion Estimation

Key Assumptions

  • Color Constancy

    Brightness constancy for intensity images

    Implication: allows for pixel to pixel comparison (not image features)

    $I(x(t), y(t), t) = C$

  • Small Motion

    Pixels only move a little bit

    Implication: linearization of the brightness constancy constraint

    $I(x + u\delta t, y+v\delta t, t + \delta t) = I(x, y, t)$

Approach

Look for nearby pixels with the same color

$I(x + u\delta t, y+v\delta t, t + \delta t) = I(x, y, t)$

$I(x, y, t) + {\partial I \over \partial x}\delta x + {\partial I \over \partial y}\delta y + {\partial I \over \partial t}\delta t = I(x, y, t)$

${\partial I \over \partial x}\delta x + {\partial I \over \partial y}\delta y + {\partial I \over \partial t}\delta t = 0$

$I_xu + I_yv + I_t = 0$

Horn-Schunck Optical Flow Lucas-Kanade Optical Flow
brightness constancy, small motion method of differences
smooth flow (flow can vary from pixel to pixel) constant flow (flow is constant for all pixels)
global method
(dense)
local method
(sparse)
Direct, dense methods
- Directly recover image motion at each pixel from spatio-temporal image brightness variations
- Dense motion fields, but sensitive to appearance variations
- Suitable for vedio and when image motion is small
Feature-based methods
- Extract visual features (corners, textured area) and track them over multiple frames
- Sparse motion fields, but more robust tracking
- Suitable when image motion is large (10s of pixels)

Lucas-Kanade Optical Flow

Assumption

$I_xu + I_yv + I_t = 0$

Assume that the surrounding patch has constant flow

$\begin{bmatrix} I_x(p_1) & I_y(p_1) \\ I_x(p_2) & I_y(p_2) \\ \vdots &\vdots \\ I_x(p_{25}) & I_y(p_{25})\end{bmatrix} \begin{bmatrix}u \\ v\end{bmatrix} = -\begin{bmatrix} I_t(p_1) \\ I_t(p_2) \\ \vdots \\ I_t(p_25) \end{bmatrix}$

$Ax = b$

Least Squares Approximation: $A^TA\hat x = A^Tb$

$x = (A^TA)^{-1}A^Tb$

  • $A^TA$ should be invertible
  • $A^TA$ shouldn’t be too small ($\lambda_1$ and $\lambda_2$ shouldn’t be too small)
  • $A^TA$ should be well conditioned ($\lambda_1 / \lambda_2$ shouldn’t be too large)

$A^TA$ was introduced in Harris Corner Detector

  • Corners are when λ1, λ2 are big; this is also when Lucas-Kanade optical flow works best
  • Corners are regions with two different directions of gradient (at least)
  • Corners are good places to compute flow!

Aperture Problem

Small visible image patch of line cannot tell the direction of movement

Want patches with different gradients to the avoid aperture problem

Aliasing

Temporal aliasing causes ambiguities since images can have many pixels with the same intensity and lead to wrong ‘correspondences’

Coarse-to-Fine Optical Flow Estimation

run iterative L-K -> wrap & upsample -> run iterative L-K -> …

Horn-Schunck Optical Flow

For every pixel,

Enforce brightness constancy: $min_{u,v}[I_xu_{ij} + I_yv_{ij} + I_t]^2$

Enforce smooth flow field: $min_u(u_{i,j}-u_{i+1,j})^2$

$min_{u,v}\sum_{i,j}{E_s(i,j)+\lambda E_d(i,j)}$

$E_s(i,j) $: smoothness

$E_d(i,j)$: brightness constancy

$\lambda$: weight

Compute partial derivative, derive update equations

Applications for Optical Flow

  • Segmentation of objects in space or time
  • Estimating 3D structure
  • Learning dynamical models – how things move
  • Recognizing events and activities
  • Improving video quality

Errors in assumptions

  • A point does not move like (all) its neighbors, e.g. at object boundaries
  • Brightness constancy does not (always) hold
  • The motion is large (larger than a pixel)
    • Not-linear: Iterative refinement
    • Local minima: coarse-to-fine estimation