Research
MoPrim has two major research areas:



One of the main goals of MoPrim is to find a way to build upper-body motion primitives. Our extended goal is to find a way to automate this process.

In the research done so far, the modeling of the upper body has been simplified to the modeling of a single arm, in the belief that the methods developed can be scaled to a full upper-body model at a later stage.

The current gesture database was made by attaching four magnetic trackers to the right arm and torso of seven test subjects, each performing ten different command gestures no fewer than 20 times.


Position of the magnetic trackers (sensors).



Test person performing a gesture.

The data collection was done using the Polhemus FastTrac, which allows a sample rate of 25 Hz.

The FastTrac system delivers raw data with six degrees of freedom per sensor (position and angles in 3D).

This results in raw input data in 24 dimensions (four sensors with six degrees of freedom each, not including time).

To reduce the dimensionality and normalize the data with respect to body size, all data are transformed into only four Euler angles. Tracked over time, these generate a curve in a 4D space spanning from 0 to 360 degrees in all directions.





Since any point in the 4D space will represent a certain configuration of the arm, similar gestures should result in similar curves.

Modeling the curves and finding primitive building blocks can be approached in a number of ways. In MoPrim we have chosen a key-frame approach, meaning that we aim to model each curve with only a few selected samples.

It is, however, our aim to attach several extra attributes to each key-frame, making both recognition and reconstruction possible with additional high-level information and controllable parameters.


2D curve. (Dots = samples in 2D)



The Density Attribute




Six curves of similar gesture.





Six curves of similar gesture, with some key-frames.





Six curves of similar gesture, with some density-key-frames.

Besides the first four attributes of each key-frame (the four Euler angles), we have developed a new attribute: the density measure.

When the curves generated from the many recorded similar gestures are placed in the same 4D space, the result looks a lot like the figures to the left. In some areas the curves are all very close together, while in other areas they are far apart.

When a new, unknown curve is presented, it will almost certainly differ slightly from the ones obtained during training. If we wish to decide which gesture the new curve was generated from, we have to compare it to the training data of all known gestures.

The hypothesis behind the density attribute is: comparing curves at the key-frames where the density is highest will increase the difference between the recognition scores of similar and different gestures.

We have chosen to use the Mahalanobis distance to create a hyper-ellipsoid:
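One possible way to score an unknown curve against the training data of each gesture is sketched below. It sums the Mahalanobis distance (the distance the section turns to next) at a few key-frames. The function names `keyframe_model` and `recognition_score`, the synthetic gestures "A" and "B", and the assumption that all curves are time-aligned and equally long are illustrative choices, not details taken from the MoPrim implementation.

```python
import numpy as np

def keyframe_model(training_curves, keys):
    """Mean and covariance at each key-frame across the training curves of one gesture."""
    data = np.asarray(training_curves, dtype=float)[:, keys, :]  # (curves, keys, angles)
    means = data.mean(axis=0)
    covs = [np.cov(data[:, i, :], rowvar=False) for i in range(len(keys))]
    return means, covs

def recognition_score(curve, keys, model):
    """Sum of Mahalanobis distances at the key-frames (lower means more similar)."""
    means, covs = model
    total = 0.0
    for i, k in enumerate(keys):
        diff = np.asarray(curve[k], dtype=float) - means[i]
        total += float(np.sqrt(diff @ np.linalg.solve(covs[i], diff)))
    return total

# Synthetic stand-in data: two gestures, 20 noisy training curves each (assumption).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 30)
base_a = np.column_stack([
    90 + 40 * np.sin(2 * np.pi * t),
    45 + 20 * t,
    np.full_like(t, 180.0),
    30 + 10 * np.cos(2 * np.pi * t),
])
base_b = base_a + np.array([25.0, -25.0, 25.0, -25.0])
train = {name: base + rng.normal(scale=2.0, size=(20, 30, 4))
         for name, base in (("A", base_a), ("B", base_b))}

keys = [0, 10, 20, 29]  # hypothetical key-frame indices
models = {name: keyframe_model(curves, keys) for name, curves in train.items()}

unknown = base_a + rng.normal(scale=2.0, size=(30, 4))  # a new recording of gesture A
scores = {name: recognition_score(unknown, keys, m) for name, m in models.items()}
best = min(scores, key=scores.get)
print(best, scores)
```

The gesture with the lowest score is taken as the match. Restricting `keys` to the densest key-frames is what the hypothesis above suggests should widen the gap between the scores.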



r = Mahalanobis distance, x = sample, µ = mean, C = covariance matrix.




V = volume of the hyper-ellipsoid, |C| = det(C).
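For reference, the standard textbook forms of these two quantities, written with the symbols defined above, are given here. The dimension d (4 for the four Euler angles) and the unit-ball volume V_d are notation introduced for this sketch and are assumptions, not the project's own notation.

```latex
\[
  r^{2} = (x - \mu)^{T} C^{-1} (x - \mu)
\]
\[
  V = V_{d}\,|C|^{1/2}\,r^{d},
  \qquad
  V_{d} = \frac{\pi^{d/2}}{\Gamma\!\left(\tfrac{d}{2} + 1\right)}
\]
```

For d = 4 the constant is V_4 = π²/2, so it cancels whenever two volumes in the same space are compared.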


Since we are only interested in the ratio, the equation can be reduced to:



We use the density measure not only in recognition, but also in the primitive selection process. This is done in order to select the key-frames that give the best combination for both recognition and reconstruction. The optimal weighting used in the selection process is, however, yet to be found.
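A minimal sketch of a density measure based on the hyper-ellipsoid volume is given below. It keeps only the terms that matter when volumes are compared (the square root of det(C) times r to the power d) and treats density as the inverse of that relative volume. The names `relative_volume` and `density`, the inverse-volume definition, and the synthetic samples are assumptions made for illustration.

```python
import numpy as np

def relative_volume(samples, r=1.0):
    """Hyper-ellipsoid volume up to a constant factor: sqrt(det(C)) * r**d."""
    samples = np.asarray(samples, dtype=float)
    d = samples.shape[1]
    cov = np.cov(samples, rowvar=False)  # rows are observations, columns are angles
    return float(np.sqrt(np.linalg.det(cov)) * r ** d)

def density(samples, r=1.0):
    """Assumed definition: the smaller the ellipsoid, the higher the density."""
    return 1.0 / relative_volume(samples, r)

# Samples of 20 similar curves at two candidate key-frames (synthetic stand-in data).
rng = np.random.default_rng(0)
center = np.array([90.0, 45.0, 180.0, 30.0])
tight = center + rng.normal(scale=1.0, size=(20, 4))   # curves pass close together
loose = center + rng.normal(scale=10.0, size=(20, 4))  # curves are spread out

print(density(tight) > density(loose))
```

Ranking candidate key-frames by such a value is one way to favour the places where the training curves agree most.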



Optimization of B-spline



The first tests were conducted with the aim of investigating how the new density measure would function in a reconstruction case. We decided to build the reconstructed curve as a cubic B-spline of the selected primitives (key-frames).

The optimal position of the primitives can be calculated by brute force. However, since this method is extremely time-consuming, another approach was tested. In the first step, each new primitive was placed at the position with the highest error score. The error score was calculated as a combination of curve misfit and the density measure. This resulted in a steadily improving reconstruction as the number of primitives went up, but was far from the optimal solution.

In an attempt to improve the positions of the primitives, an optimization step was implemented. It basically allows each primitive to change position by moving one step back or forth along the curve, as long as this makes the total error sum go down. This is done after each new primitive is added, and repeated until all primitives have been moved to the best local minimum.

This method is fairly fast and will find a local minimum. But given the large number of variables and the fact that this is a temporary setup, the method was not fully investigated.
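The greedy placement described above could look roughly like the following sketch. Two simplifications are assumed: the reconstruction uses piecewise-linear interpolation through the key-frames instead of a cubic B-spline, and the error score is the per-sample misfit multiplied by a density weight, since the actual weighting is not specified. All function names are hypothetical.

```python
import numpy as np

def reconstruct(curve, keys):
    """Piecewise-linear curve through the key-frames (stand-in for the cubic B-spline)."""
    keys = sorted(keys)
    t = np.arange(len(curve))
    return np.column_stack(
        [np.interp(t, keys, curve[keys, k]) for k in range(curve.shape[1])]
    )

def error_scores(curve, keys, density):
    """Per-sample misfit weighted by density (the weighting is an assumption)."""
    misfit = np.linalg.norm(curve - reconstruct(curve, keys), axis=1)
    return misfit * density

def greedy_keyframes(curve, density, n_keys):
    """Start with the end points, then keep adding the sample with the highest score."""
    keys = [0, len(curve) - 1]
    while len(keys) < n_keys:
        scores = error_scores(curve, keys, density)
        scores[keys] = -1.0  # never pick an existing key-frame again
        keys.append(int(np.argmax(scores)))
    return sorted(keys)

# Synthetic 4D angle curve with 50 samples and a uniform density (assumption).
t = np.linspace(0.0, 2.0 * np.pi, 50)
curve = np.column_stack([np.sin(t), np.cos(t), np.sin(2 * t), np.cos(2 * t)]) * 90.0 + 180.0
density = np.ones(len(curve))

keys = greedy_keyframes(curve, density, n_keys=6)
print(keys, error_scores(curve, keys, density).sum())
```

Each added key-frame removes the currently worst error, which is cheap but gives no guarantee that the final set is the best one of its size.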
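The local optimization step can be sketched as a simple hill climb over the key-frame indices. As an assumption, the total error here is the plain misfit of a piecewise-linear reconstruction (not the cubic B-spline, and without the density term), and the names `total_error` and `refine` are hypothetical.

```python
import numpy as np

def total_error(curve, keys):
    """Summed misfit between the curve and a piecewise-linear fit through sorted keys."""
    t = np.arange(len(curve))
    recon = np.column_stack(
        [np.interp(t, keys, curve[keys, k]) for k in range(curve.shape[1])]
    )
    return float(np.linalg.norm(curve - recon, axis=1).sum())

def refine(curve, keys):
    """Move interior key-frames one sample back or forth while the total error drops."""
    keys = sorted(keys)
    best = total_error(curve, keys)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(keys) - 1):  # the two end points stay fixed
            for step in (-1, 1):
                cand = keys[i] + step
                if cand <= keys[i - 1] or cand >= keys[i + 1]:
                    continue  # key-frames must keep their order
                trial = keys[:i] + [cand] + keys[i + 1:]
                err = total_error(curve, trial)
                if err < best:
                    keys, best, improved = trial, err, True
    return keys, best

# Synthetic 4D angle curve and a deliberately poor starting placement (assumption).
t = np.linspace(0.0, 2.0 * np.pi, 50)
curve = np.column_stack([np.sin(t), np.cos(t), np.sin(2 * t), np.cos(2 * t)]) * 90.0 + 180.0
start = [0, 5, 10, 49]

keys, best = refine(curve, start)
print(start, total_error(curve, start))
print(keys, best)
```

Because every accepted move strictly lowers the error, the loop always terminates, but only in a local minimum that depends on the starting placement.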




Density Attribute Tests



Preliminary tests show promising results both for the concept and for the error measurements calculated on our current training data. Two graphs are shown below: (left): Automatic primitive selection without the optimization. (right): The same results with the optimization.

The first four graphs show the four Euler angles. (solid line): The original curve. (dashed line): The new curve, built as a cubic B-spline based on the selected primitives. (dotted line): Illustrates the variance.

The last graph shows the error between the original curve and the B-spline.




Selection without density measure.


Selection with density measure.


An additional test was conducted in order to find the most suitable number of primitives for representing a gesture. The graph below shows the results of this test: the error as a function of the number of primitives used to construct the B-spline, for four different setups. (solid): Reconstruction without density and optimization. (dashed): Reconstruction with density and without optimization. (dash-dot): Reconstruction without density and with optimization. (dotted): Reconstruction with density and with optimization.