2D Motion Estimation in the Extraction of 3D Motion
and Structure of Rigid Objects
Gabriel Tsechpenakis, Yiannis Xirouhakis and Anastasios Delopoulos
Computer Science Div., Dept. of Electrical and Computer Eng.,
National Technical University of Athens,
9 Iroon Polytechniou str., Athens GR-15773, GREECE
tel:+301-7722521, fax:+301-7722492, e-mail:adelo@image.ntua.gr
Abstract
Two-dimensional motion estimation is a task that
emerges in several fields of research such as cod-
ing and compression, content-based video re-
trieval and computer vision. The problem has
been tackled by several authors and various al-
gorithms can be found in the literature focusing
either on accuracy or low computational cost.
The present work investigates the efficient
application of existing motion estimation algorithms in
3D reconstruction of rigid objects from 2D pro-
jections. The latter has been given considerable
attention lately due to the increasing interest in
relevant applications, including model-based cod-
ing, robot vision and 3D modeling among others.
Three well-known motion estimation tech-
niques are employed, namely block matching,
generalized block matching and optical flow
methods, and the resulting motion fields are given
as input to one of the existing 3D reconstruction
algorithms. At the same time, local confidence
criteria are defined for the 2D motion vectors in
order to improve 3D motion and structure
estimates. For each of the employed 2D motion
estimation methods, the advantages and
disadvantages in 3D reconstruction are discussed
w.r.t. the object at hand and the type of its motion.
1 Introduction
The extraction of 3D motion and structure of
rigid objects from monocular or stereo image
sequences has long been a problem of interest to
computer vision researchers [1]. This task emerges
in several fields of research such as computer
vision, biomedical engineering, human-computer
interaction and video coding, especially after the
guidelines of the Moving Picture Experts Group
regarding the MPEG-4 and MPEG-7 standards.
Recent literature comprises a variety of schemes
tackling this problem, which differ in the descrip-
tion of structure employed, the projection model,
the input measurements and the employed data
processing techniques. Perhaps the most popular
types of input measurements employed in 3D
reconstruction are the corresponding 2D motion
fields or 2D feature correspondences.
Naturally, the 3D motion and structure of an
object determines the respective 2D motion
(projected motion). Conversely, given sufficient
information on the projected motion, 3D motion and
structure can be estimated. Exact theoretical so-
lutions, as well as solutions in the presence of
noise have been reported in the literature, includ-
ing among others [5] for the perspective and [6]
for the orthographic case.
Mainly due to the ill-posed nature of the prob-
lem, 3D reconstruction results are considerably
affected by the accuracy of the obtained 2D mo-
tion estimates. Thus, apart from developing ro-
bust algorithms for 3D reconstruction, it is crit-
ical that input 2D motion vectors are as close
as possible to the true ones. Deterministic and
stochastic, parametric and non-parametric mo-
tion estimation methods based on displacement,
velocity and optical flow have been reported in
the literature. In this work, three popular motion
estimation methods are employed to obtain 2D
motion information, which is then given as input to
a 3D motion and structure extraction algorithm.
The employed methods include block matching,
generalized block matching and a method based
on optical flow.
It is known that motion estimation perfor-
mance is limited by several factors such as the
presence of noise, the lack of sufficient spatial
image gradient, the changes in external illumina-
tion, the occlusion and the aperture problems [4].
These factors affect motion estimation schemes
in a way that the resulting motion fields may
considerably differ from the expected ones (real
motion). However, it can be seen that not all
schemes are equally affected by the aforemen-
tioned factors. In this work, it is discussed
whether the existing methods can offer a satis-
factory approximation of the real motion, par-
ticularly w.r.t. the accuracy of the resulting 3D
reconstruction.
At the same time, appropriate local confidence
criteria are introduced to decide on the subset of
motion estimates to be employed in the extraction
of 3D motion and structure, or alternatively to
employ all motion estimates with appropriate weights.
For the reconstruction scheme, several algorithms
have been employed; the results presented in this
paper were obtained through the algorithm of [6].
Experimental results for all three motion esti-
mation schemes, the application of confidence cri-
teria and the reconstructed objects are included.
2 Overview
2.1 Three-dimensional motion
Any movement of a rigid object in 3D space is
a superposition of a 3D rotation and a 3D
translation. Consequently, the movement of a point
p = (x, y, z) on the object to p' = (x', y', z') can
be represented by

\[
\begin{bmatrix} x' \\ y' \\ z' \end{bmatrix}
= R \begin{bmatrix} x \\ y \\ z \end{bmatrix} + T, \tag{1}
\]

where R, T denote the rotation matrix and
translation vector respectively. Then, the
corresponding 3D velocity vector u is given by

\[
u = \Omega \times p + V, \tag{2}
\]

where Ω, V denote the angular and translational
velocity vectors respectively. Let point p project
to P = (X, Y) on the image plane. Then
X, Y are determined by the projection model (e.g.
X = x, Y = y for orthographic projections). In
this way, the projected velocity vector (or motion
vector) U at P is obtained.
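
As a small illustration of Eq. (2) and of the orthographic case mentioned above, the following sketch computes the 3D velocity of a point and its projection. The function names and the numeric values are hypothetical and chosen only for illustration; orthographic projection is assumed throughout.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def velocity_3d(omega, v, p):
    """3D velocity u = omega x p + v of a point p on a rigid object."""
    w = cross(omega, p)
    return (w[0] + v[0], w[1] + v[1], w[2] + v[2])


def project_orthographic(q):
    """Orthographic projection: keep the (x, y) components, drop z."""
    return (q[0], q[1])


# Illustrative values (not from the paper): rotation about the z axis
# combined with a translation along x.
omega = (0.0, 0.0, 1.0)
v = (0.5, 0.0, 0.0)
p = (1.0, 2.0, 3.0)

u = velocity_3d(omega, v, p)   # 3D velocity of p
U = project_orthographic(u)    # projected (2D) motion vector
```

Under orthographic projection the depth component of u simply drops out, which is why depth can only be recovered up to the ambiguities inherent in this model.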
2.2 2D motion estimation techniques
In the present work, as mentioned in Section 1,
three popular motion estimation techniques are
employed and the efficacy of the obtained results
in 3D reconstruction is tested.
Block matching: Block matching is considered
to be the most popular motion estimation
method, since it is relatively fast and yields
satisfactory results for coding and compression
applications. The method is extensively described in
the literature, and many researchers have proposed
algorithms aiming at faster and more accurate
realizations [3]. In this work, we utilize the
three-step search algorithm and the mean absolute
difference (MAD) matching criterion for overlapping
blocks. The main advantage of employing motion
estimates obtained through block-matching
techniques in 3D reconstruction is that such estimates
are already available in MPEG-coded sequences.
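
The three-step search with the MAD criterion can be sketched as follows for a single block. This is a minimal illustration, not the implementation used in the experiments: the block size, the initial step of 4 pixels, the frame representation (lists of rows) and all function names are assumptions, and the handling of overlapping blocks is left out.

```python
def mad(prev, curr, bx, by, dx, dy, n):
    """Mean absolute difference between the n x n block of `prev` at
    (bx, by) and the block of `curr` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(prev[by + y][bx + x]
                         - curr[by + dy + y][bx + dx + x])
    return total / (n * n)


def three_step_search(prev, curr, bx, by, n=8, step=4):
    """Return ((dx, dy), cost) for the block of `prev` at (bx, by).

    At each stage the 3 x 3 grid of candidates around the current best
    displacement is tested, then the grid spacing is halved (4, 2, 1).
    """
    h, w = len(curr), len(curr[0])
    best = (0, 0)
    best_cost = mad(prev, curr, bx, by, 0, 0, n)
    while step >= 1:
        cx, cy = best
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                cand = (cx + dx, cy + dy)
                x0, y0 = bx + cand[0], by + cand[1]
                # Skip candidates whose block falls outside the frame.
                if x0 < 0 or y0 < 0 or x0 + n > w or y0 + n > h:
                    continue
                cost = mad(prev, curr, bx, by, cand[0], cand[1], n)
                if cost < best_cost:
                    best, best_cost = cand, cost
        step //= 2
    return best, best_cost


# Synthetic check (illustrative data, not from the paper): the second
# frame is the first one shifted by (+4, -4) pixels.
def tex(x, y):
    return (3 * x * x + 5 * y * y + 7 * x * y) % 251


prev = [[tex(x, y) for x in range(32)] for y in range(32)]
curr = [[tex(x - 4, y + 4) for x in range(32)] for y in range(32)]
mv, cost = three_step_search(prev, curr, bx=12, by=12)
```

In this synthetic case the true shift lies on the coarse search grid, so the search finds it exactly. In general the three-step search can settle on a local minimum of the MAD surface, which is one reason the resulting vectors need a confidence measure.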
Generalized block matching: The efficiency
of block-matching techniques is reported
to be considerably improved when extended to
generalized block matching. In this case, the
rectangular/square blocks of the I_{k-1} frame are first
transformed to quadrangular ones before being
matched with rectangular blocks of the I_k frame.
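
The text does not state which spatial transformation maps a square block to a quadrangle. As one possible choice, the sketch below assumes a bilinear mapping defined by the four corner positions; the function names and the corner ordering are hypothetical.

```python
def bilinear_map(corners, s, t):
    """Map normalized block coordinates (s, t) in [0, 1]^2 to a point
    of the quadrangle with the given corners.

    corners = (top_left, top_right, bottom_right, bottom_left),
    each given as an (x, y) pair.
    """
    (x0, y0), (x1, y1), (x2, y2), (x3, y3) = corners
    x = ((1 - s) * (1 - t) * x0 + s * (1 - t) * x1
         + s * t * x2 + (1 - s) * t * x3)
    y = ((1 - s) * (1 - t) * y0 + s * (1 - t) * y1
         + s * t * y2 + (1 - s) * t * y3)
    return (x, y)


def warp_block_coords(corners, n):
    """Return an n x n grid (rows of (x, y) pairs) of sample positions
    covering the quadrangle; requires n >= 2."""
    return [[bilinear_map(corners, i / (n - 1), j / (n - 1))
             for i in range(n)]
            for j in range(n)]


# A square block maps onto itself; a displaced corner bends the grid.
square = ((0, 0), (4, 0), (4, 4), (0, 4))
quad = ((0, 0), (5, 1), (6, 6), (-1, 4))
grid = warp_block_coords(quad, 8)
```

Intensities sampled at these positions (with interpolation) would then be compared against a rectangular block of the other frame using the same MAD criterion, with the search running over corner displacements rather than a single block displacement.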
Optical flow: Several approaches for optical
flow estimation use partial differential equations
to model changes in image intensity throughout
time. In this work, the velocity components are
obtained by combining the block motion model
described in [2] and the Horn-Schunck iterative
method [4].
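
For reference, the standard forms of the optical flow constraint and of the Horn-Schunck update are given below. The notation is an assumption made here: I(X, Y, t) is the image intensity, (U_X, U_Y) are the components of the motion vector U, bars denote local averages and \alpha is the smoothness weight. How these are combined with the block motion model of [2] is not detailed in the text and is not shown.

```latex
% Optical flow (brightness constancy) constraint
\[
I_X U_X + I_Y U_Y + I_t = 0
\]

% Horn-Schunck iteration
\[
U_X^{(n+1)} = \bar{U}_X^{(n)}
  - \frac{I_X \left( I_X \bar{U}_X^{(n)} + I_Y \bar{U}_Y^{(n)} + I_t \right)}
         {\alpha^2 + I_X^2 + I_Y^2},
\qquad
U_Y^{(n+1)} = \bar{U}_Y^{(n)}
  - \frac{I_Y \left( I_X \bar{U}_X^{(n)} + I_Y \bar{U}_Y^{(n)} + I_t \right)}
         {\alpha^2 + I_X^2 + I_Y^2}
\]
```

The constraint alone gives one equation for two unknowns per pixel (the aperture problem); the smoothness term weighted by \alpha is what makes the iteration well defined.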
2.3 Motion confidence and reconstruction
Since, in general, most of the motion vector
estimates are far from the true ones, a number of
criteria are defined to decide on their relative
confidence. These criteria are formed w.r.t. the
expected reliability of each motion vector when
utilized in the 3D reconstruction algorithm.
Local intensity smoothness: For pixels be-
longing to a smooth region of the image, the mo-
tion estimation scheme is likely to fail. Thus,
we expect that 'better' motion estimates can be
found in non-smooth regions or even near edges.
Such regions can be located by an edge detection
scheme (e.g. Sobel filters), which is applied using
a low threshold.
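
A minimal sketch of this criterion, assuming 3 x 3 Sobel filters and a binary confidence, is given below. The threshold value and the function names are hypothetical; the image is a list of rows of intensities.

```python
def sobel_magnitude(img):
    """Gradient magnitude via 3 x 3 Sobel filters.
    Border pixels are left at 0."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            mag[y][x] = (gx * gx + gy * gy) ** 0.5
    return mag


def smoothness_confidence(img, threshold=20.0):
    """1.0 where the gradient magnitude exceeds a (low) threshold,
    0.0 in smooth regions."""
    return [[1.0 if m > threshold else 0.0 for m in row]
            for row in sobel_magnitude(img)]


# Vertical step edge between columns 1 and 2 (illustrative data).
step_image = [[0, 0, 100, 100, 100] for _ in range(5)]
conf_map = smoothness_confidence(step_image)
```

A graded confidence (for example, the gradient magnitude normalized to [0, 1]) could be used instead of the binary map when the reconstruction algorithm accepts weights.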
Movement direction: In problems involving
rigid objects, we expect that at least locally, there
is a clear direction of their movement. For an
image block, a measure of its direction may well
be the mean vector of the block. In this sense,
we expect that every motion vector's direction
in a block does not remarkably differ from the
direction of the block's mean motion vector.
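
One way to quantify this criterion, assumed here for illustration, is the cosine of the angle between each motion vector and the block's mean vector, clipped to [0, 1]. The function name is hypothetical.

```python
import math


def direction_confidence(vectors):
    """For each motion vector (dx, dy) of a block, return the cosine of
    its angle to the block's mean vector, clipped to [0, 1]."""
    n = len(vectors)
    mx = sum(v[0] for v in vectors) / n
    my = sum(v[1] for v in vectors) / n
    mean_norm = math.hypot(mx, my)
    conf = []
    for dx, dy in vectors:
        norm = math.hypot(dx, dy)
        if norm == 0.0 or mean_norm == 0.0:
            conf.append(0.0)  # direction undefined for a zero vector
            continue
        cos = (dx * mx + dy * my) / (norm * mean_norm)
        conf.append(max(0.0, cos))
    return conf


# Three vectors agree, one points the opposite way (illustrative data).
block_vectors = [(1, 0), (1, 0), (1, 0), (-1, 0)]
dir_conf = direction_confidence(block_vectors)
```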
Motion vector local variation: The local
variation of motion estimates is a good criterion
to decide whether motion estimation did well in a
small area (image block). For the case of rigid ob-
jects, we expect that, at least locally, movement
is relatively smooth (or else the object would be
deformed). Optical flow schemes usually impose
such constraints themselves.
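
A possible measure for this criterion is sketched below: each vector's squared deviation from the block mean is mapped to a value in (0, 1]. The mapping 1 / (1 + d^2 / scale^2), the `scale` parameter and the function name are assumptions made for illustration.

```python
def variation_confidence(vectors, scale=1.0):
    """Confidence in (0, 1] that decreases with the squared deviation
    of each motion vector from the block's mean vector."""
    n = len(vectors)
    mx = sum(v[0] for v in vectors) / n
    my = sum(v[1] for v in vectors) / n
    conf = []
    for dx, dy in vectors:
        d2 = (dx - mx) ** 2 + (dy - my) ** 2
        conf.append(1.0 / (1.0 + d2 / (scale * scale)))
    return conf


# One outlier among otherwise identical vectors (illustrative data).
var_conf = variation_confidence([(2, 0), (2, 0), (2, 0), (2, 4)])
```

Note that a single outlier also shifts the mean, so the remaining vectors are penalized slightly; a median-based reference would be more robust at a higher cost.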
Total confidence for each motion vector is cal-
culated as a weighted combination of the confi-
dence values corresponding to each of the criteria.
On the basis of the obtained confidence results,
one decides on the subset of motion estimates
to be employed in the extraction of 3D motion.
Moreover, it is possible to employ all motion es-
timates with the appropriate confidence weights,
if this is permitted by the particular 3D recon-
struction algorithm. In this work, we employ an
extension of the algorithm presented in [6].
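
The weighted combination and the subsequent selection can be sketched as follows. The weights, the threshold and the function names are hypothetical; the text does not specify the values actually used.

```python
def total_confidence(criteria, weights):
    """Weighted combination of per-criterion confidences.

    criteria: one list of confidences per criterion, all aligned by
              motion vector index.
    weights:  one non-negative weight per criterion (normalized here
              so that they sum to 1).
    """
    wsum = sum(weights)
    n = len(criteria[0])
    return [sum(w * c[i] for w, c in zip(weights, criteria)) / wsum
            for i in range(n)]


def select_reliable(vectors, confidence, threshold=0.5):
    """Keep only the motion vectors whose confidence reaches the threshold."""
    return [v for v, c in zip(vectors, confidence) if c >= threshold]


# Three vectors scored by three criteria (illustrative data).
criteria = [[1.0, 0.0, 1.0],   # local intensity smoothness
            [1.0, 0.0, 0.0],   # movement direction
            [0.5, 0.5, 0.0]]   # motion vector local variation
weights = [2, 1, 1]
vectors = [(1, 0), (0, 1), (2, 2)]

total = total_confidence(criteria, weights)
kept = select_reliable(vectors, total)
```

When the reconstruction algorithm accepts weights, `total` can be passed along with all vectors instead of applying the threshold.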
2.4 Simulations
Experimental results for all three motion
estimation schemes will be included. At the same time,
the efficacy of the suggested confidence criteria,
as well as the outputs of the 3D reconstruction
algorithm, will be illustrated. As will be seen,
optical flow methods result in relatively smooth
3D objects; however, the estimated 3D motion is
not generally accurate. Block matching performs well
in 3D motion estimation after imposing the
proposed confidence criteria, but yields noisy
3D surfaces. Generalized block matching, at the
expense of computational efficiency, yields results
similar to, but better than, those of simple block
matching. Some preliminary results are illustrated in
Figures 1 and 2 for a natural sequence, using
generalized block matching.
Figure 1: First frame of natural sequence
Figure 2: Reconstructed 3-D object
References
[1] T. S. Huang and A. N. Netravali, "Motion
and structure from feature correspondences:
A review," Proc. IEEE, vol. 82, pp. 252-269,
Feb. 1994.
[2] B. D. Lucas and T. Kanade, "An iterative
image registration technique with an
application to stereo vision," Proc. DARPA
Image Underst. Workshop, pp. 121-130, 1981.
[3] H. Gharavi and M. Mills, "Block-matching
motion estimation algorithms: New results,"
IEEE Trans. Circuits and Systems, vol. 37,
pp. 649-651, 1990.
[4] A. M. Tekalp, "Digital Video Processing,"
Prentice Hall, 1995.
[5] J. Weng, N. Ahuja and T. S. Huang,
"Optimal Motion and Structure Estimation,"
IEEE Trans. PAMI, vol. 15, no. 9,
pp. 864-884, Sept. 1993.
[6] A. Delopoulos and Y. Xirouhakis, "Robust
Estimation of Motion and Shape based on
Orthographic Projections of Rigid Objects,"
IMDSP98 (IEEE), Alpbach, Austria, July
1998.