COMPUTER VISION and Media Technology (CVMT)
COMPUTER VISION
Automatic Acquisition of Visual Landmarks / Automatisk indlæring af visuelle
landemærker
A mobile robot can autonomously navigate through for example a production
facility by using a video camera to monitor known visual landmarks in the
environment. Landmarks can be light switches, posters, door signs, etc. An
especially flexible system is achieved if the robot can automatically acquire
(learn) such landmarks for future navigation use. This project develops techniques
for automatic learning of visual landmarks by acquiring images of the environment
and extracting positions and appearances of landmark candidates. These candidates
are then stored in the robot's memory for subsequent use in controlling robot
movements. CVMT has special focus on analysis of how accurately this navigation
method can perform, in order to optimise the use of the available landmarks.
In 2003 this effort resulted in a Ph.D. degree being awarded to Salvatore
Livatino.
Project under VIRGO (EU, TMR).
(Claus B. Madsen; Salvatore Livatino, Italy)
Segmentation of skin colour under different illuminations and estimation
of the colour of illuminations / Segmentering af hudfarve under forskellige
lyskilder og estimering af lyskilders farve
Automatically detecting and tracking faces and hands of humans in motion
is important in applications such as interfaces for human computer interaction
(HCI), in‑ and outdoor surveillance, and automatic camera men. An often‑used
feature in face and human motion tracking is skin colour because it is fast
to compute and invariant to size and orientation. However, the colour of
skin changes as the illumination changes. This project aims to develop adaptive
methods for segmenting human skin under changing illumination conditions.
A physics-based reflection model of human skin has been developed and successfully
applied, e.g., to estimate the illumination colour and to adapt statistical
models for skin colour. Furthermore, colour information has been combined
with near-infrared information in order to make skin detection more robust.
Partly funded by ARTHUR.
(Moritz Störring, Erik Granum, Hans J. Andersen)
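As a rough illustration of why skin-colour models have to follow the illumination, the sketch below classifies pixels in intensity-normalised rg chromaticity space with a simple Gaussian model. It is not the project's physics-based reflection model; the function names and the model parameters (`SKIN_MEAN`, `SKIN_STD`) are made up for illustration and would have to be re-estimated whenever the illumination colour changes.

```python
def rg_chromaticity(r, g, b):
    """Return the intensity-normalised (r, g) chromaticities of an RGB pixel."""
    total = r + g + b
    if total == 0:
        return (1 / 3, 1 / 3)  # black pixel: fall back to the neutral point
    return (r / total, g / total)


def is_skin(pixel, mean, std, max_dist=2.5):
    """Classify a pixel with a diagonal-covariance Gaussian in rg space.

    mean and std are hypothetical model parameters; they are only valid for
    the illuminant under which they were estimated.
    """
    r, g = rg_chromaticity(*pixel)
    d2 = ((r - mean[0]) / std[0]) ** 2 + ((g - mean[1]) / std[1]) ** 2
    return d2 <= max_dist ** 2


# Illustrative (made-up) skin model for one fixed illuminant.
SKIN_MEAN = (0.45, 0.31)
SKIN_STD = (0.04, 0.02)
```

Chromaticities remove the dependence on brightness, but not on the colour of the light source, which is why an adaptive scheme still needs an estimate of the illumination colour.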
Assessment of crops by outdoor computer vision / Afgrødevurdering og udendørs
computer vision
For centuries mankind has used the visual appearance of crops to assess their
growth condition. The spatial distribution of reflection from the canopy
reveals specific patterns related to lack of nutrients or infestation by
diseases. For example, lack of manganese shows up as light brown spots on
the leaves, whereas lack of calcium shows up as discolouration of the leaves
from the leaf edge and inwards.
So far, development of sensors for detection of plant nutrient deficiency
has been oriented towards nitrogen and with sensor types that do not take
the spatial pattern of the reflection into account.
The objective of this project is to develop computer vision methods that may
enable a more detailed analysis of the reflection from canopies by 1) identifying
areas without specular reflection and 2) correcting images for uneven
illumination conditions. In this way a reliable quantitative analysis of the
canopy may be obtained for optimal growth control. The project was concluded
with a report in early 2003, providing the following major contributions:
- A pioneering investigation of computer vision analysis of outdoor scenes
with high dynamic range of intensities
- 3D reconstruction and description of plants for detection of reflection
patterns
- Modelling of the "Red Edge Inflection Point" using the Weibull function for
accurate estimation of the chlorophyll content and its distribution across
the leaf segments
The first two studies have given promising results and international publications.
All results will be followed up in the ACROSS project.
Funded by the Danish Agricultural and Veterinary Research Council, February
2002-March 2003.
(Hans J. Andersen)
ACROSS: Autonomous spatial‑temporal crop and soil surveying / Autonom spatio
temporal afgrøde‑ og jordmonitorering
The general vision is effective precision farming, which in harmony with
the environment utilises resources optimally. This requires continuous selective
and adaptive control of growth, weeds, diseases and pests. In turn, such control
is conditioned on corresponding continuous monitoring in the field using appropriate
methods of measuring the current conditions of and for the plant growth.
The objective of this project is to develop methods of measuring and managing
such information to support the above vision in a way that also invites new
innovative approaches to precision farming by providing the necessary information
on demand and in time for planning and decision making.
More concretely, the project comprises: 1) Computer vision and laser range
methods for on‑site, real‑time monitoring of crop growth (nutrients,
diseases, etc.). The methods will allow diagnostics
of crop condition based on reflection patterns (e.g. miscoloured areas) down
to single leaf scale. 2) Implementation and integration of the above methods
on an autonomous platform with a suite of existing crop and soil measuring
facilities for on‑site operation. 3) Repeated test and evaluation for development
and proof of concept: "Autonomous Crop and Soil Surveillance".
Hence, through new research and practical development, the project will contribute
to a new scope for precision agriculture, which until now has not been seen in
its full perspective due to the lack of precise and timely information.
The project is carried out by a consortium comprising CVMT and three departments
of the Danish Institute of Agricultural Science: Agricultural Engineering
(Bygholm), Agricultural Systems / Crop Physiology and Soil Science (Foulum),
and Plant Biology (Flakkebjerg).
Funded by: The Danish Technical Research and Agricultural and Veterinary
Research Councils and the Danish Ministry of Food, Agriculture and Fisheries.
(Erik Granum, Hans J. Andersen, Michael Nielsen, Kristian Kirk)
Outdoor computer vision for analysis of vegetation / Udendørs computer vision til analyse af vegetation
This project deals with problems such as segmentation of vegetation from
a single image, and 3-D reconstruction from two or more images (stereopsis).
A major challenge in outdoor colour-based segmentation is to make it robust
towards changing illumination conditions. It will be examined how existing
methods for illumination invariant analysis perform when the assumptions
of these methods are violated in real-world situations (e.g., mixed illuminants,
non-diffuse objects). At the end of 2003, an experiment was started in collaboration
with Anton Thomsen, Foulum, with the purpose of comparing the use of laser
range measurements with computer vision for gap fraction (and leaf area index)
estimation.
(Kristian Kirk, Erik Granum, Hans Jørgen Andersen)
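The link between gap fraction and leaf area index (LAI) mentioned above can be sketched with the common Beer-Lambert canopy model, P0 = exp(-k * LAI). This is a textbook relation and not necessarily the one used in the experiment; the function names are illustrative, and the extinction coefficient k = 0.5 assumes a spherical leaf angle distribution viewed from directly above.

```python
import math


def gap_fraction(mask):
    """Fraction of pixels that are NOT vegetation in a binary mask (1 = vegetation)."""
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("mask must contain at least one pixel")
    gaps = sum(1 for row in mask for value in row if value == 0)
    return gaps / total


def leaf_area_index(p0, k=0.5):
    """Invert P0 = exp(-k * LAI); k = 0.5 is an assumed extinction coefficient."""
    if not 0.0 < p0 <= 1.0:
        raise ValueError("gap fraction must be in (0, 1]")
    return -math.log(p0) / k


# Toy segmentation result: half of the pixels are gaps.
mask = [[1, 1, 0, 0],
        [1, 0, 0, 1]]
lai = leaf_area_index(gap_fraction(mask))
```

The same inversion applies whether the gap fraction comes from a segmented image or from the share of laser range readings that pass through the canopy, which is what makes the two sensors comparable.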
Computer vision analysis of biological semi transparent colour objects /
Computer vision analyse af biologiske semi transparente farveobjekter
This Ph.D. project is mainly concerned with the reconstruction of the 3D structure of plants, i.e. of all elements of the canopy with stems and leaves and their interconnections. Identification and analysis of the individual 3D elements of a canopy will be the basis for development of descriptive methods for the overall structure of the canopy. Such descriptions will provide a means for the analysis of the position and function of these elements in the plant structure as well as the characteristics (and health) of the overall structure itself. Currently, the co-operation also extends outside the ACROSS consortium. Together with KVL, experiments are conducted to detect lead leaf area index and lead leaf tips and bases on barley, in order to automate the sampling of hyper-spectral data.
Funded through the ACROSS project.
(Michael Nielsen, Hans Jørgen Andersen, Erik Granum)
Computer vision‑based human motion capture using kinematics constraints /
Computer vision registrering af humane bevægelser vha. kinematiske begrænsninger
In man‑machine‑interfaces there is a great need for interaction methods more
natural to humans, e.g. via speech and body language. The latter is based
on a computer capturing the movements of the individual body parts and recognizing
their meaning. In this project computer vision is utilized to investigate
the capturing problem. The key approach is to have a geometric model of the
articulated body parts. The model is used to predict possible “next” configurations
given the past configurations. The predicted configurations are compared
with the image measurements and the true configuration of the human body
is captured. To optimize this process, detailed kinematic constraints related
to the articulated body parts are introduced to limit the search space of
possible configurations. The limited search space is, however, still too
large for a brute force search, and therefore a probabilistic approach, in
the form of a sequential Monte Carlo method, is adopted to identify the most
likely configuration. In 2003 the project was concluded with a Ph.D.
degree being awarded to Thomas B. Moeslund.
(Thomas B. Moeslund, Erik Granum)
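The predict, compare and select loop described above can be sketched as a minimal sequential Monte Carlo (particle) filter. The example tracks a single elbow angle instead of a full articulated body model; the joint limits, noise levels and function names are hypothetical, and the kinematic constraint is enforced by simple clamping for brevity.

```python
import math
import random

ELBOW_LIMITS = (0.0, 150.0)  # hypothetical joint limits in degrees


def predict(particles, rng, motion_std=5.0, limits=ELBOW_LIMITS):
    """Diffuse each particle, then clamp it to the kinematically feasible range."""
    lo, hi = limits
    return [min(max(p + rng.gauss(0.0, motion_std), lo), hi) for p in particles]


def weight(particles, measurement, meas_std=10.0):
    """Normalised Gaussian likelihood of each particle given the measured angle."""
    w = [math.exp(-0.5 * ((p - measurement) / meas_std) ** 2) for p in particles]
    total = sum(w)
    if total == 0.0:
        return [1.0 / len(particles)] * len(particles)
    return [x / total for x in w]


def resample(particles, weights, rng):
    """Draw a new particle set in proportion to the weights."""
    return rng.choices(particles, weights=weights, k=len(particles))


def track(measurements, n_particles=500, seed=0):
    """Return the weighted-mean angle estimate after each measurement."""
    rng = random.Random(seed)
    particles = [rng.uniform(*ELBOW_LIMITS) for _ in range(n_particles)]
    estimates = []
    for z in measurements:
        particles = predict(particles, rng)
        weights = weight(particles, z)
        estimates.append(sum(p * w for p, w in zip(particles, weights)))
        particles = resample(particles, weights, rng)
    return estimates
```

In a full body model each particle would be a vector of joint angles and the likelihood would come from comparing the projected model with the image, but the structure of the loop stays the same.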
Computer vision based interface to virtual reality / Computer vision baseret
interface til virtual reality
The current interfaces in the VR‑Media Lab's CAVE are based on physical devices
with long wires. These wires often disturb the user and thus reduce the
sense of immersion. This effort has investigated how computer vision can be
applied to create interfaces providing the same performance as the wired
devices. The approach is based on four infrared cameras tracking markers
attached to the user's VR-glasses and interface device, respectively, in
order to estimate the user's viewing direction and interface position. This
way of interaction is wireless and therefore less disruptive to immersion.
A tracking method has been developed and implemented that performs robustly
with low latencies and update rates up to 200 Hz.
Partly funded by ARTHUR and VR‑Media Lab.
(Niels Tjørnly Rasmussen, Moritz Störring, Thomas D. Nielsen, Thomas B. Moeslund, Erik Granum)
FG‑NET ‑ Face and Gesture Recognition Working Group / Ansigts‑ og gestusgenkendelsesarbejdsgruppe
FG‑NET is a 6 partner EU‑IST Concerted Action/Thematic Network (IST‑2000‑26434) that started in 2001 and will run until 2004. The aim of this project is to encourage technology development in the area of face and gesture recognition. The precise goals are: (1) to act as a focus for the workers developing face and gesture recognition technology; (2) to create a set of foresight reports defining development roadmaps and future use scenarios for the technology in the medium (5‑7 years) and long (10‑20 years) term; (3) to specify, develop, and supply resources (e.g. image data sets) supporting these scenarios, and (4) to use these resources to encourage technology development. The use of shared resources and data sets to encourage the development of complex processes and recognition systems has been very successful in the speech analysis and recognition field, and also in the image analysis field in the few specific cases where it has been applied. The basis of this project is that, when properly defined and collected, such resources would also be of benefit in the development of solutions to wider problems in face and gesture recognition. Currently a large data set containing pointing gestures is being recorded and annotated.
(Thomas B. Moeslund, Moritz Störring, Lars Reng, Erik Granum)
Reconstruction of 3D surface models using video cameras / Rekonstruktion
af 3D overflademodeller vha. almindelige videokameraer
The aim of this project is to develop an image processing based system for
construction of metrically correct 3D models from monochrome video camera
images. The method is based on calibrated 2D images captured under controlled
illumination conditions, as changes in the illumination play an active role
in the processing of the images. From the 2D images, a set of corresponding
2.5D surface patches can be calculated, and a surface based alignment method
is being developed for joining these surface patches into a common 3D model
of the surface. The generated 3D models are expected to have an accuracy
comparable to that of other methods, like MRI‑scanning, but with a
significantly lower use of resources.
(Jørgen Bjørnstrup, Erik Granum)
VIRTUAL REALITY
VR MediaLab, Virtual Reality MediaLab
VR MediaLab is the Aalborg University Centre for Virtual Reality and Interactive
Media. It was inaugurated in August 1999 with unique computing and visualisation
facilities in a dedicated building complex of NOVI, the Science Park of Aalborg
University. The VR‑facilities comprise an SGI supercomputer (16 CPUs, 6
graphics pipes) and three visualisation arenas: a 6‑sided CAVE; a Panorama
screen of 160 degrees, 7.1 m diameter; and a 3D Power Wall, 8 m wide. The Centre hosts
research groups from various departments of the university wanting to operate
in the well‑supported interdisciplinary environment with both research and
teaching activities. CVMT played a major role in the establishment of the
VR Centre, and is now located within it, contributing to, and benefiting
from the interdisciplinary environment and the facilities. An upgrade of the
computer facilities to a PC cluster has been initiated.
Supported by Det Obelske Familiefond, Spar Nord Fonden, and EU Funds for
Regional Development.
(Erik Kjems)
Virtual Reality Platform for Interactive, Inhabited Virtual Worlds / VR platform
til interaktive, beboede verdener
To construct interactive virtual worlds a software platform is needed to
1) simulate and maintain a dynamic 3D model of a scenario, 2) present this
simulated world to users using computer graphics and audio, and 3) provide
one or more users with interaction facilities. In conjunction with past projects
on interactive, inhabited virtual worlds (the STAGING and PUPPET projects)
CVMT has designed and developed such a Virtual Reality platform. The platform
supports arbitrary scenarios where computer controlled characters and humans
interact. Such computer‑controlled characters are called Autonomous Agents,
or simply agents. The user can freely navigate within this virtual world,
which the software platform visualises in real‑time on a computer screen
(or in either of the VR MediaLab's VR arenas). The platform also enables the
autonomous agents to move around (supported by path planning), play animations,
utter recorded sounds, and change facial expressions. In this way agents
can express themselves to, and interact with, other agents or the user(s).
The platform also enables agents to continuously sense the virtual world,
i.e., the agents have simulated vision, audio and tactile senses. A special
aspect of CVMT's VR platform is that it supports the user in being represented
by an avatar (an agent controlled by the user) in the virtual world. This
allows the user to interact with the virtual world and its inhabitants on
equal terms with the agents. The platform features unique sound-related interaction
possibilities for the user. For example, the user can communicate with the
autonomous agents using sound (the agents can 'hear' the user through a microphone),
and the user can record the sounds he/she wants the agents to use in particular
situations.
(Claus B. Madsen, Erik Granum)
Interaction with virtual worlds and their inhabitants / Interaktion med virtuelle
verdener og deres beboere
This project focuses on interaction in virtual reality by exploring and further
developing the facilities at the VR‑Centre. The effort is carried out in collaboration
with the VR Centre. In spite of the experience from a substantial history
of interaction with computers, the interaction with VR‑applications is often
rather primitive. A range of basic hardware and software problems regarding
interaction in the VR installation was uncovered and solved. Interaction
processes were analysed in the light of application contexts and the relevant
combinations of input devices and display types. A series of interaction and
navigation techniques for VR was developed, implemented, and tested.
(Henrik R. Nagel, Søren Bovbjerg, Erik Granum)
3D Visual Data Mining (3DVDM)
Both private companies and public institutions regularly collect large databases,
but much of the information content in those databases is difficult to extract.
With VR‑technology it is possible to create virtual visual worlds based on
the characteristics of the data, so that visual data explorers can be immersed
in these worlds, navigate around, and observe the data from within. The project
develops and investigates the applicability of temporal visualisation methods
for the purpose of detection of previously unknown structures and relationships
in data. A new and flexible VR visualisation system has been developed and
implemented, which allows visualisation of arbitrary temporal developments
of data, and furthermore makes it possible to study new forms of interaction.
The VR visualisation system is being used by the members of the 3D Visual
Data Mining project, as well as by other research groups at Aalborg University.
It has allowed the participants of the 3D Visual Data Mining project to develop
new methods for exploring data in Virtual Reality based on arbitrary temporal
data visualisation. The system has also allowed for addition and
integration of software facilities for 3D surround sound
generation in VR. Different methods for using sound in the 3D Visual Data
Mining system were investigated. It was found to be possible to use dynamic
interactive 3D soundscapes to support the visual representation of data,
and a series of methods were developed and tested. The result was two major
tools with which a soundscape was created based on either user navigation/data
windowing or direct querying of the data.
The project is interdisciplinary with participation of computer scientists,
statisticians, and psychologists, all from Aalborg University.
Supported by the Danish Research Councils, 1999‑2004.
(Henrik R. Nagel, Erik Granum; M. Böhlen, Department of Computer Science;
Peer Mylov, Department of Communication)
Development of 3D surround sound software facilities for Virtual Reality / Udvikling af
3D surround softwarefaciliteter til Virtual Reality
This project was started as a part of the 3D Visual Data Mining project
with the purpose of investigating
possibilities for using sound as an additional tool for Data Mining. Generated
sound can relate to additional statistical properties of the visualised statistical
observations (objects) or to densities of objects with specific properties
and generally support navigation in the virtual world. For this purpose
a 3D sound engine was developed, which was able to run on the different hardware
configurations that exist at the VR-Centre, as well as on desktop PCs. The
sound engine was designed to simulate different important psychoacoustic properties,
such as position (using panning algorithms), motion (using the Doppler effect)
and environment (reverb), making it possible to create immersive soundscapes
and investigate user response in such environments. The sound engine was
also equipped with an advanced musical synthesizer, primarily for data mining
purposes, and the design allowed it to function as a stand-alone application
as well as an integrated part of the VR++ software system. It has been used
extensively in the data mining project and as soundscape generator in the
Benogo project.
Supported by the Danish Research Councils (2002-2004) as part of the 3DVDM
project.
(Søren Bovbjerg, Henrik R. Nagel, Erik Granum)
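Two of the psychoacoustic cues named above, position and motion, can be illustrated with a short sketch. It uses constant-power stereo panning and the classical Doppler formula for a stationary listener; both are standard techniques chosen for illustration, not a description of the project's sound engine, and all names are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 degrees Celsius


def constant_power_pan(pan):
    """Return (left_gain, right_gain) for pan in [-1, 1] (-1 = left, +1 = right).

    The gains satisfy left**2 + right**2 == 1, so perceived loudness stays
    roughly constant as the source moves across the stereo field.
    """
    if not -1.0 <= pan <= 1.0:
        raise ValueError("pan must be in [-1, 1]")
    angle = (pan + 1.0) * math.pi / 4.0
    return math.cos(angle), math.sin(angle)


def doppler_frequency(freq_hz, approach_speed, c=SPEED_OF_SOUND):
    """Frequency heard by a stationary listener.

    approach_speed is the source speed towards the listener in m/s
    (negative when the source moves away).
    """
    if approach_speed >= c:
        raise ValueError("source speed must be below the speed of sound")
    return freq_hz * c / (c - approach_speed)
```

For example, `constant_power_pan(0.0)` gives equal gains for both channels, and a 440 Hz source approaching at 34.3 m/s is heard at a higher pitch than the same source at rest.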
ARTHUR: Augmented Round Table for Architecture and Urban Planning / Augmenteret
"rundbords‑designværktøj" for arkitekter og byplanlæggere
ARTHUR is a 6 partner EU‑IST‑RTD project (IST‑2000‑28559), Key Action 4, Mixed Realities, that started in 2001 and will continue until 2004. ARTHUR bridges the gap between real and virtual worlds by enhancing the users' current working environment with virtual 3D objects. The project focuses on providing an intuitive environment supporting natural interaction with virtual objects while sustaining existing communication and interaction mechanisms. Real world objects are used as tangible interfaces to make 3D environments attractive even to non‑experts. ARTHUR developed new types of user‑friendly head mounted see‑through displays (HMD), non‑intrusive object tracking mechanisms, and intuitive user interface mechanisms within a location independent multi‑user real‑time augmented reality environment. CVMT develops object and head tracking mechanisms based on computer vision using cameras mounted on the HMD. A colour-based tracking method has been developed and implemented that is robust to illumination changes and runs in real time. Multi-view position and orientation estimation has been developed and will be combined with other tracking methods. Furthermore, a computer vision based gesture interface has been developed that recognizes gestures without the cumbersome, wired hand-tracking devices that would disturb the user's natural behaviour.
Funded by EU FP-5, IST-2000-28559, 2001-2004.
(Thomas B. Moeslund, Moritz Störring, Claus B. Madsen, Yong Liu, Erik Granum)
BENOGO: Being There – Without Going / At få oplevelsen uden at tage turen
BENOGO is a 6 partner EU‑FET project (IST‑2001‑39184), which started in 2002
and will continue until 2005. The project is coordinated by the Computer Vision
and Media Technology Group. BENOGO develops and investigates novel computer
graphics rendering techniques (the so‑called Image Based Rendering approach)
with the purpose of optimizing users’ sense of being present at a location
without actually being there. The BENOGO system visualizes existing, physical
locations in stereo on a Head Mounted Display, or in any of VRMediaLab’s
3D arenas. The project’s rendering technique is based on forming new images
in real‑time, and in response to
user movements, using only data from previously acquired real images of an
existing place. With this technology the project can circumvent the 3D modelling
problems traditionally associated with standard Virtual Reality, and at the
same time achieve a very high level of visual realism (photo‑realism). The
project consortium includes experts in the field of psycho‑physics, human
perception, and presence research, and these research domains continuously
evaluate the project’s rendering technology to optimize the performance,
and to develop a theoretical understanding of how the sense of presence is
best provided. In addition to visualizing a world to the user, the project
also investigates the use of 3D sound, and the visualization is also augmented
with virtual objects.
Funded by EU FP5, IST/FET-2001-39184, 2002-2005
(Erik Granum, Claus B. Madsen, Mads Sørensen, Michael Vittrup, Moritz Störring,
Henrik Nagel)
Real and Virtual Shadows in Augmented Reality / Virkelige og kunstige skygger
i Augmented Reality
In Augmented Reality virtual objects are visually combined with real objects
to create the illusion that the virtual objects are in fact just a part of
the real scenario. While this presents many interesting challenges, one area
has so far received very little attention, namely the issue of shadows, or
more precisely the issue of ensuring that the virtual objects are lit and
cast shadows in the same way as the real scene. CVMT is developing techniques
for estimating the positions of real scene light sources, and for estimating
the spectral properties of real shadows, in order to apply this information
in the rendering of the virtual objects. The result is that the virtual objects
mix with the real scenario in a much more realistic manner.
In the reporting period activities have primarily focused on estimating
the parameters of virtual lighting conditions so that these conditions match
the real scene lighting conditions in an optimal manner. This allows us to
realistically recreate very complicated lighting conditions in real-time.
Project under BENOGO (IST/FET), and ARTHUR (EU IST), 2002-2005
(Claus B. Madsen, Mads Sørensen, Rune Laursen)
Software Platform for Real‑Time Image Based Rendering / Software system til
real‑tids billedbaseret visualisering
Image Based Rendering (IBR) is a visualization technique offering some clear
advantages over traditional 3D model‑based computer graphics. Primarily IBR
provides a much higher level of visual realism. The disadvantage of IBR is
that it cannot (yet) be supported by fast purpose‑designed graphics hardware.
The Computer Vision and Media Technology group has developed a software platform
specifically for the IBR requirements of the BENOGO project. This platform
integrates IBR software, traditional model‑based computer graphics rendering,
and 3D sound rendering with user tracker technology. The platform enables
a user to visually explore (move around in) a world being presented through
visualization and sound. The primary challenge is to develop sufficient support
for the massive data exchange and transformation required to do Image Based
Rendering in real‑time. For this purpose the developed platform supports
processing and input data to be distributed on an arbitrary number of standard
computers, in order to achieve sufficient computing power and available memory.
Project under BENOGO (IST/FET), 2002-2005
(Claus B. Madsen, Michael Vittrup, Henrik Nagel, Erik Granum)
Realistic Real-Time Visualization for Augmented Reality
In support of real-time interactive Augmented Reality (AR) applications,
a flexible software system for handling the visualization processes
associated with AR is being
developed. The system is aimed at supporting photo-realistic AR, and therefore
handles occlusions between virtual and real geometry, as well as consistent
lighting between virtual and real scene elements. For example, the system
employs the techniques described above under “Real and Virtual Shadows in
Augmented Reality”.
A particular feature of the system is that it operates with separate representations
of the real and virtual scene elements so as to ensure correct occlusions
and to be able to generate the virtual shadows of the virtual objects (virtual
shadows cast also on real objects). The system can operate in video-see-through
mode (where the real world is recorded with a camera and the image displayed
to the user), or in optical-see-through mode, where the user sees the real
world in see-through Head Mounted Displays.
The effort aims at developing a stand-alone general-purpose AR system
based on a standard computer, a flat panel screen and a video camera. The
system will be used to demonstrate state-of-the-art realistic AR, e.g.,
for edutainment applications.
(Claus B. Madsen, Rune Laursen, Erik Granum)
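One common way to obtain correct occlusions from separate real and virtual scene representations is a per-pixel depth comparison. The sketch below illustrates that idea for video-see-through mode; it is an assumption about how such a compositing step could look, not the system's actual implementation, and the function name and data layout are hypothetical.

```python
def composite(real_color, real_depth, virt_color, virt_depth):
    """Merge a camera image with a rendered virtual layer using a depth test.

    All arguments are equally sized 2D lists. real_depth holds the distance
    to the real geometry per pixel; virt_depth holds the distance to the
    virtual object, or None where no virtual object was rendered.
    """
    out = []
    for rc_row, rd_row, vc_row, vd_row in zip(real_color, real_depth,
                                              virt_color, virt_depth):
        row = []
        for rc, rd, vc, vd in zip(rc_row, rd_row, vc_row, vd_row):
            # The virtual pixel wins only if it exists and is nearer the camera.
            visible = vd is not None and vd < rd
            row.append(vc if visible else rc)
        out.append(row)
    return out


# One-row toy image: virtual object in front, behind, and absent.
result = composite([["R", "R", "R"]], [[2.0, 2.0, 2.0]],
                   [["V", "V", None]], [[1.0, 3.0, None]])
```

Here `result` keeps the virtual pixel only in the first column, where the virtual object is nearer to the camera than the real surface.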