Abstract
As humans, we learn a lot about how to interact with the world by observing others
interacting with their hands. To help AI systems better understand hand
interactions, we introduce a new model that produces a rich description of hand
interaction. Our system produces richer output than past systems at a
larger scale. Its outputs include boxes and segments for hands, in-contact objects,
and second objects touched by tools, as well as contact and grasp type. Supporting
this method are annotations of 257K images, 401K hands, 288K objects, and 19K
second objects spanning four datasets. We show that our method provides rich
information and performs and generalizes well.