Understanding human behavior requires understanding 3D human contact with the world. To study this, we introduce a dataset (RICH) and a method (BSTRO) that infers 3D contact on the body from a single image. #CVPR2022 rich.is.tue.mpg.de (1/8)
RICH uses multi-camera markerless mocap to obtain high-quality 3D human bodies in SMPL-X format. The SMPL-X body shape is fit to minimally clothed scans of 22 subjects. Captures include both single and multiple subjects. (2/8)
RICH also contains highly accurate 3D scans of the scenes, many of which are outdoors (roughly 60 m^2 in area). (3/8)
The precise ground-truth scenes and bodies enable us to locate contact regions on the body in detail. Top: SMPL-X projected into the videos. Middle: SMPL-X in the 3D scene. Bottom: Green vertices on the mesh show contact. (4/8)
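One way to picture how a registered body and scene could yield per-vertex contact labels is a simple proximity test. This is a minimal sketch only, not the labeling procedure used for RICH: the function name `contact_vertices`, the 2 cm threshold, and the toy floor and body points are all assumptions made for illustration.

```python
import math

def contact_vertices(body_vertices, scene_points, threshold=0.02):
    """Return indices of body vertices within `threshold` metres of any scene point.

    Hypothetical helper: the threshold value is an assumption, not a RICH setting.
    """
    in_contact = []
    for i, v in enumerate(body_vertices):
        # Brute-force nearest-neighbour distance from this vertex to the scene.
        nearest = min(math.dist(v, p) for p in scene_points)
        if nearest <= threshold:
            in_contact.append(i)
    return in_contact

# Toy scene: a flat floor sampled on a 10 cm grid at z = 0.
floor = [(x * 0.1, y * 0.1, 0.0) for x in range(-5, 6) for y in range(-5, 6)]

# Toy "body": three vertices at different heights above the floor.
body = [
    (0.0, 0.0, 0.01),  # sole of a foot, 1 cm above the floor
    (0.0, 0.0, 0.50),  # knee, well above the floor
    (0.1, 0.2, 0.00),  # heel resting on the floor
]

contact = contact_vertices(body, floor)
print(contact)
```

On real meshes the brute-force search would be replaced by a spatial index, and the scene would be treated as a surface, not a point set.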
In total there are 577K images at 4K resolution. In addition to the fixed cameras, RICH contains moving cameras. The quality and complexity of RICH go beyond existing datasets like PROX and BEHAVE, enabling research on human-scene contact. (5/8)
Using RICH, we train BSTRO to infer contact on the 3D body from a single image. The key observation is that contact surfaces are always occluded. To reason about them, we leverage a transformer that can attend to relevant parts of the image. (6/8)
Even though BSTRO is trained on RICH, it generalizes to very different images in the wild. The red regions on the body mesh show inferred contacts. (7/8)
By: , , , M. Safroshkin, , S. Polikovsky, D. Scharstein, me.
#CVPR2022 Poster 257a, Session 3.1, 23-Jun, 10:00AM-12:30PM
Project: rich.is.tue.mpg.de
Video: youtu.be/IbFc12L5Kc4
PDF: tinyurl.com/3hfw9euw (8/8)
I'm happy to see there's a natural amount of flab on the bodies. It moves in a really interesting way.
This looks awesome! I'm curious what the motivation is for using SMPL-X over STAR?