Weakly-Supervised Mesh-Convolutional Hand Reconstruction in the Wild
April 2020, CVPR 2020 (Oral), Best Paper Award Nomination (Top 0.4%)
Dominik Kulon, Riza Alp Güler, Iasonas Kokkinos, Michael Bronstein, Stefanos Zafeiriou
We introduce a simple and effective network architecture for monocular 3D hand pose estimation, consisting of an image encoder followed by a mesh convolutional decoder that is trained through a direct 3D hand mesh reconstruction loss. We train our network on a large-scale dataset of hand actions gathered from YouTube videos, which serves as a source of weak supervision. Our system outperforms state-of-the-art methods by a large margin, even halving the error on the in-the-wild benchmark.
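To make the decoder-plus-loss idea concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes a neighbor-averaging graph convolution as a stand-in for the mesh convolution and a mean L1 vertex error as the reconstruction loss; the abstract specifies neither. The function names (`normalized_adjacency`, `graph_conv`, `mesh_l1_loss`), the tetrahedron toy mesh, and the random "encoder" features are all hypothetical.

```python
import numpy as np

def normalized_adjacency(faces, num_vertices):
    """Row-normalized vertex adjacency (with self-loops) built from triangle faces."""
    adj = np.eye(num_vertices)
    for a, b, c in faces:
        for i, j in ((a, b), (b, c), (c, a)):
            adj[i, j] = adj[j, i] = 1.0
    return adj / adj.sum(axis=1, keepdims=True)

def graph_conv(features, adj, weight):
    """Stand-in mesh convolution (assumption): average neighbor features, then a linear map."""
    return adj @ features @ weight

def mesh_l1_loss(pred_vertices, target_vertices):
    """Mean absolute per-coordinate error between predicted and target vertices (assumed loss)."""
    return float(np.mean(np.abs(pred_vertices - target_vertices)))

# Toy example: a single tetrahedron stands in for a hand mesh.
faces = np.array([[0, 1, 2], [0, 1, 3], [0, 2, 3], [1, 2, 3]])
target = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])

rng = np.random.default_rng(0)
latent = rng.normal(size=(4, 8))  # hypothetical per-vertex features from an image encoder
weight = rng.normal(size=(8, 3))  # hypothetical decoder weights mapping features to xyz

adj = normalized_adjacency(faces, num_vertices=4)
pred = graph_conv(latent, adj, weight)  # predicted vertex positions, shape (4, 3)
loss = mesh_l1_loss(pred, target)       # scalar that training would minimize
```

In a real system the weights would be learned by backpropagating this loss through the decoder and the image encoder; the sketch only shows how a mesh-structured output can be compared directly against target vertices.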