The task of motion detection that was presented in the previous section is a relatively easy process, especially for simple scenes. The real challenge appears when we actually have to estimate the motion between two images; that is, come up with a motion vector that gives us a way to transform the first frame to the second and vice versa.
The motion vector usually comprises two numbers (or coordinates); one showing the length of the motion in pixels, r, and one showing the direction of the motion in degrees, θ. This pair of coordinates is called polar. An equivalent way to portray the motion of a pixel is by defining the length of the motion in pixels, in the vertical and horizontal direction. These coordinates are called cartesian. In the example of the following figure, you can see all the coordinates needed to describe the motion of a pixel moving from point (x1,y1) = (0,0) to point (x2,y2) = (4,3).