The dataset includes 300 triangulated, high-resolution non-watertight meshes of 10 different subjects, each scanned in 30 different poses.
Our acquisition system is a full-body 3D stereo capture system (3dMD, Atlanta, GA). It is composed by 22 modular scanning units; each unit contains a pair of stereo cameras, one or two speckle projectors, and a single RGB camera.
Sample scans with corresponding camera images.
We defined ground-truth correspondences between scans by bringing each scan into alignment with a common template mesh. Our template is a triangulated, watertight mesh with a resolution of 6890 vertices.
For definining dense correspondences, we painted the subjects with a high-frequency pattern and placed textured markers on key anatomical locations. The texture pattern is heavily exploited by our alignment technique.
Some of the correspondences defined by our alignments are less reliable than others, since scans are noisy and incomplete and our alignments are the results of an optimization process.
To ensure we have ground truth, we evaluated the quality of our alignments in terms of both geometry and color, discarding vertices that are not aligned within an accuracy of 2mm.
Our evaluation process ensures that 80% of the vertices of all the scans in the dataset are reliably aligned.
The main cause of discarded vertices seems missing data (especially for hands and feet).
Capturing skin stretching turned out to be problematic, in particular in areas like armpits, thighs, forearms and for old subjects.
Between different subjects, we defined only a set of sparse correspondences. We drew a set of 17 easily identifiable landmarks on specific body points where bones are palpable. We detected the position of each landmark in camera images, and back projected the identified 2D points to scan surface points. For evaluation purposes, we do not provide the exact landmark locations.
Note that our ground-truth correspondences are the result of an automated process. We reserve the right to modify them, whenever needed. Any change will be adequately highlighted on the main page of the website
The training set includes 100 scans (10 per subject) with their corresponding alignments.
For each scan vertex in the training set, we specify whether it is reliably aligned or not.
Overview of the 10 poses in the training set.
The training set includes 200 scans (20 per subject).
Alignments and information about "ground-truth" vertices for these scans are withheld for evaluation purposes.
Overview of the 20 poses in the test set.