Collision Replay: What Does Bumping Into Things Tell You About Scene Geometry?

University of Michigan


What does bumping into things in a scene tell you about scene geometry? In this paper, we investigate the idea of learning from collisions. At the heart of our approach is collision replay, where we use examples of a collision to provide supervision for observations at past frames. We use collision replay to train convolutional neural networks to predict a distribution over collision time from new images. This distribution conveys information about navigational affordances (e.g., corridors vs. open spaces) and, as we show, can be converted into a distance function for the scene geometry. We analyze this approach with an agent that has noisy actuation in a photorealistic simulator.
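The core supervision idea can be sketched in a few lines. The snippet below is illustrative only (the `Frame` class, function names, and the clamping threshold are our assumptions, not the authors' code): when an agent collides at step T, every earlier frame t of the rollout can be labeled with its time-to-collision T - t, turning a single collision into many training targets for a per-image collision-time predictor.

```python
from dataclasses import dataclass

@dataclass
class Frame:
    image_id: int   # stand-in for the observation at this step
    step: int       # time index within the rollout

def collision_replay_labels(rollout, collision_step, max_time=10):
    """Label each past frame with its (clamped) time-to-collision.

    rollout: list of Frame objects up to and including the collision.
    Returns (frame, label) pairs usable as classification targets.
    """
    pairs = []
    for frame in rollout:
        ttc = collision_step - frame.step
        if ttc < 0:
            continue  # frames after the collision carry no label
        pairs.append((frame, min(ttc, max_time)))  # clamp distant frames
    return pairs

# toy rollout: the agent collides at step 5
rollout = [Frame(image_id=i, step=i) for i in range(6)]
labels = [t for _, t in collision_replay_labels(rollout, collision_step=5)]
print(labels)  # → [5, 4, 3, 2, 1, 0]
```

Clamping with `max_time` reflects that, far from any obstacle, the exact collision time is uninformative; the model only needs to distinguish "near" horizons.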

Remote Prediction Results

Below, we show a video of predicted navigability maps while traversing four held-out test set environments from the Gibson dataset. We compare two versions of our model, denoted 'Noisy Training' and 'Noiseless Training'. In the noisy case (our primary setting), we show results from a model trained using dead reckoning with noisy estimates of the agent's egomotion. We show the second, noiseless, case as a visualization of what the method could achieve with access to more accurate egomotion.

Interactive Results

Explore our results! Our model can estimate the 2D distance function of new scenes, that is, the distance of each point to the closest obstacle. Examples of this distance prediction are shown on the left, and can be viewed for many example images. Clicking on one of the points reveals a plot of the underlying 'hitting time distribution' for that point, which indicates how many steps the model estimates an agent could travel in any given direction before colliding with the environment.
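To make the connection between hitting times and distance concrete, here is a minimal sketch (under our own assumptions about step size and discretization, not the authors' implementation): summarizing each direction's predicted hitting-time distribution by its expectation, the distance to the nearest obstacle is bounded by the most constrained direction times the length of one step.

```python
import numpy as np

def expected_hitting_time(probs):
    """Expectation of a discrete distribution over steps-to-collision."""
    steps = np.arange(len(probs))
    return float(np.dot(steps, probs))

def distance_estimate(dists_per_direction, step_size=0.15):
    """Min over directions of expected steps times an assumed step length."""
    return step_size * min(expected_hitting_time(p) for p in dists_per_direction)

# toy distributions over {0..4} steps for three headings
dirs = [
    np.array([0.0, 0.1, 0.2, 0.3, 0.4]),  # open direction
    np.array([0.5, 0.3, 0.1, 0.1, 0.0]),  # wall nearby
    np.array([0.1, 0.2, 0.3, 0.2, 0.2]),
]
print(round(distance_estimate(dirs), 3))  # → 0.12
```

Taking the minimum over directions is what turns per-direction hitting times into a distance-function value: the nearest obstacle determines the distance regardless of how open the other headings are.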



Citation

@article{raistrick2021collision,
  author    = {Raistrick, Alexander and Kulkarni, Nilesh and Fouhey, David F.},
  title     = {Collision Replay: What Does Bumping Into Things Tell You About Scene Geometry?},
  journal   = {BMVC},
  year      = {2021},
}


This work was supported by the DARPA Machine Common Sense Program. Nilesh Kulkarni was supported by TRI. Toyota Research Institute ("TRI") provided funds to assist the authors with their research, but this article solely reflects the opinions and conclusions of its authors and not TRI or any other Toyota entity. This website was adapted from a template provided by Nerfies.