Publications




Feel free to browse our previous publications to learn more about our advancing computer vision technology!

Pix2Face: Direct 3D Face Model Estimation

An efficient, fully automatic method for 3D face shape and pose estimation in unconstrained 2D imagery is presented. The proposed method jointly estimates a dense set of 3D landmarks and facial geometry using a single pass of a modified version of the popular “U-Net” neural network architecture. Additionally, we propose a method for directly estimating a set of 3D Morphable Model (3DMM) parameters, using the estimated 3D landmarks and geometry as constraints in a simple linear system. Qualitative modeling results are presented, as well as quantitative evaluation of predicted 3D face landmarks in unconstrained video sequences.

Authors
Daniel Crispell and Maxim Bazik
Date
30 March 2018
Source
ICCV 2017: 300 3D Facial-Videos In-The-Wild Challenge Workshop
Global-Local Airborne Mapping (GLAM): Reconstructing a City from Aerial Videos

We present a feature-based visual SLAM system for aerial video whose simple design permits near real-time operation, and whose scalability permits large-area mapping using tens of thousands of frames, all on a single conventional computer. Our approach consists of two parallel threads: the first incrementally creates small locally consistent submaps and estimates camera poses at video rate; the second aligns these submaps with one another to produce a single globally consistent map via factor graph optimization over both poses and landmarks. Scale drift is minimized through the use of 7-degree-of-freedom similarity transformations during submap alignment. We quantify our system’s performance on both simulated and real data sets, and demonstrate city-scale map reconstruction accurate to within 2 meters using nearly 90,000 aerial video frames, which is, to our knowledge, the largest and fastest such reconstruction to date.

Authors
Hasnain Vohra, Maxim Bazik, Matthew Antone, Joseph Mundy, William Stephenson
Date
30 May 2017
Source
Tech Report
Dataset Augmentation for Pose and Lighting Invariant Face Recognition

The performance of modern face recognition systems is a function of the dataset on which they are trained. Most datasets are largely biased toward “near-frontal” views with benign lighting conditions, negatively affecting recognition performance on images that do not meet these criteria. The proposed approach demonstrates how a baseline training set can be augmented to increase pose and lighting variability using semi-synthetic images with simulated pose and lighting conditions. The semi-synthetic images are generated using a fast and robust 3D shape estimation and rendering pipeline which includes the full head and background. Various methods of incorporating the semi-synthetic renderings into the training procedure of a state-of-the-art deep neural network-based recognition system without modifying the structure of the network itself are investigated. Quantitative results are presented on the challenging IJB-A identification dataset using a state-of-the-art recognition pipeline as a baseline.

Authors
Daniel Crispell, Octavian Biris, Nate Crosswhite, Jeffrey Byrne, Joseph L. Mundy
Date
30 May 2017
Source
2016 IEEE Applied Imagery Pattern Recognition Workshop (AIPR)
Geo-localization using Volumetric Representations of Overhead Imagery

This paper addresses the problem of determining the location of a ground-level image by using geo-referenced overhead imagery. The input query image is assumed to be given with no meta-data, and the content of the image is to be matched to reference representations constructed a priori. The proposed 3D geo-localization framework performs better than the 2D approach for 75% of the query images.

Authors
Ozge C. Ozcanli, Yi Dong, Joseph L. Mundy
Date
04 February 2016
Source
International Journal of Computer Vision (IJCV), Volume 116, Issue 3, pp 226-246
A Comparison of Stereo and Multiview 3D Reconstruction Using Cross-sensor Satellite Imagery

In this paper, an automatic geo-location correction framework that corrects multiple satellite images simultaneously is presented. As a result of the proposed correction process, all the images are effectively registered to the same absolute geodetic coordinate frame. The usability and the quality of the correction framework are shown through probabilistic 3D surface model reconstruction. The models given by the original satellite geo-positioning meta-data and the corrected meta-data are compared, and the quality difference is measured through an entropy-based metric applied to the high-resolution height maps given by the 3D models.

Authors
Ozge C. Ozcanli, Yi Dong, Joseph L. Mundy
Date
04 February 2016
Source
International Journal of Computer Vision (IJCV), Volume 116, Issue 3, pp 226-246
Automatic Geo-location Correction of Satellite Imagery

Modern satellites tag their images with geolocation information using GPS and star tracking systems. Depending on the quality of the geopositioning equipment, errors may range from a few meters to tens of meters on the ground. In this paper, an automatic geolocation correction framework that corrects images from multiple satellites simultaneously is presented. As a result of the proposed correction process, all the images are effectively registered to the same absolute geodetic coordinate frame.

Authors
Ozge C. Ozcanli, Yi Dong, Joseph L. Mundy, Helen Webb, Riad Hammoud, Victor Tom
Date
23 June 2014
Source
International Journal of Computer Vision (IJCV), Volume 116, Issue 3, pp 263-277
3D Modeling Using Miniscule Volume Elements

A new technique to optimize volumetric representation and advances in graphics processing have enabled efficient construction of 3D models from 2D imagery, while fully capturing the uncertainty in the data.

Authors
Ozge Ozcanli, Daniel Crispell, Joseph Mundy, Vishal Jain, and Tom Pollard
Date
01 August 2012
Source
SPIE Newsroom
Three-Dimensional Volume Representation for Geospatial Data in Voxel Models

Extracting useful geospatial data from imagery is a fundamental challenge that has seen significant growth over the years as technology advances have been brought to bear on the problem. An important component of this problem addresses how the data should be represented to ensure the information content is accurately captured, preserved, and conveyed to consumers. Much of the information contained in the imagery is redundant and should be transformed so that only the essential information is retained and stored, allowing the redundant data to be discarded. An efficient mechanism for achieving this goal is the 3D Voxel model.

Authors
F. Tanner, D. Crispell, and R. Isbell
Date
13 March 2012
Source
ASPRS 2012 Annual Conference
A Variable-Resolution Probabilistic Three-Dimensional Model for Change Detection

Given a set of high-resolution images of a scene, it is often desirable to predict the scene’s appearance from viewpoints not present in the original data for purposes of change detection. When significant 3D relief is present, a model of the scene geometry is necessary for accurate prediction to determine surface visibility relationships. In the absence of an a priori high-resolution model (such as those provided by LIDAR), scene geometry can be estimated from the imagery itself.

Authors
D. Crispell, J. L. Mundy, and G. Taubin
Date
19 January 2012
Source
IEEE Transactions on Geoscience and Remote Sensing
Real-Time Rendering and Dynamic Updating of 3D Volumetric Data

Authors
Andrew Miller, Vishal Jain, & Joseph Mundy
Date
05 March 2011
Source
Fourth Workshop on General Purpose Processing on Graphics Processing Units