Computer Vision-Based Motion Capture of Body Language
Applying Spatially-Based Pruning of the State-Space

Thomas B. Moeslund

Computer Vision and Media Technology Laboratory
Aalborg University, Denmark
Email: tbm@cvmt.dk

Abstract

Capturing the motion of a human body using computer vision is the focus of this thesis. Normally the capturing process is model-based, i.e., it applies a priori knowledge in the form of a geometrical model. Different configurations of the model are synthesised and compared with the image data. The configuration most similar to the current image data defines the current state of the model, i.e., its pose. To limit the number of configurations that must be compared, the state-space is typically pruned using the temporal context. Once initialised, this provides very powerful pruning as long as the assumption of "smooth motion" is fulfilled. However, under practical circumstances the temporally-based pruning often breaks down. Hence, alternative or supplementary methods of pruning that are independent of the temporal context are of interest. The purpose of this thesis is to investigate possibilities for exploiting spatial information to achieve a similar pruning effect. The context of the investigation into spatially-based pruning is capturing the 3D pose of a human arm from one static camera.
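The model-based matching described above can be illustrated with a minimal sketch. It is not the method of the thesis: the planar two-link arm, the segment lengths, the 5-degree grid, and the names synthesise, dissimilarity and estimate_pose are all assumptions made for illustration, and the "image data" is simply a pair of 2D points.

```python
import math
from itertools import product

def synthesise(state, upper=1.0, lower=1.0):
    """Toy planar two-link arm (an assumption): return elbow and hand positions."""
    shoulder_angle, elbow_angle = state
    elbow = (upper * math.cos(shoulder_angle), upper * math.sin(shoulder_angle))
    hand = (elbow[0] + lower * math.cos(shoulder_angle + elbow_angle),
            elbow[1] + lower * math.sin(shoulder_angle + elbow_angle))
    return elbow, hand

def dissimilarity(synthesised, observed):
    """Sum of point-to-point distances; smaller means more similar."""
    return sum(math.dist(p, q) for p, q in zip(synthesised, observed))

def estimate_pose(observed, step_deg=5):
    """Exhaustively compare every configuration on the grid with the observation."""
    angles = [math.radians(d) for d in range(-180, 180, step_deg)]
    return min(product(angles, angles),
               key=lambda state: dissimilarity(synthesise(state), observed))

# Hypothetical observation, generated from a known configuration.
true_state = (math.radians(30), math.radians(45))
observed = synthesise(true_state)
best = estimate_pose(observed)
```

Even this toy grid contains 72 x 72 = 5184 configurations for two angles, which hints at why pruning becomes necessary as the number of DoF grows.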
The thesis is divided into three parts. In the first part, motion capture in general is described and a comprehensive survey of the relevant literature is presented. In the second part, spatially-based pruning is applied to derive a more compact state-space representation of the arm by including low-level image features. Concretely, it is shown how the primary degrees-of-freedom (DoF) in the shoulder and arm can be efficiently modelled. This part also describes how to reduce the size of the state-space by introducing six spatially-based constraints. In the third part, the spatially-based pruning is implemented in different systems in order to demonstrate its effect.
The primary findings are twofold. The first is a method which allows the 12 primary DoF in the shoulder and arm to be modelled by just two DoF. The second is the set of six spatially-based constraints, which prune 97.3% of the state-space on average. Both findings suggest that the proposed approach for spatially-based pruning is a realistic alternative for coping with the problems inherent in temporally-based pruning.
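To make the notion of a pruning percentage concrete, the following sketch counts how many states of a small discretised state-space survive a set of constraints. The two constraints and their ranges are hypothetical placeholders, not the six constraints of the thesis, and the resulting percentage is unrelated to the 97.3% reported above.

```python
from itertools import product

def prune(states, constraints):
    """Keep only the states that satisfy every constraint."""
    return [s for s in states if all(c(s) for c in constraints)]

# Toy state-space: two joint angles in 10-degree steps (36 * 36 = 1296 states).
states = list(product(range(-180, 180, 10), repeat=2))

# Hypothetical constraints, for illustration only.
constraints = [
    lambda s: 0 <= s[1] <= 150,   # assumed elbow flexion range in degrees
    lambda s: -90 <= s[0] <= 90,  # assumed shoulder range in degrees
]

kept = prune(states, constraints)
pruned_percent = 100.0 * (1 - len(kept) / len(states))
print(f"{len(kept)} of {len(states)} states kept, {pruned_percent:.1f}% pruned")
```

Because the constraints are evaluated per state and do not refer to earlier frames, such pruning does not depend on the temporal context.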