There has been a lot of innovation in 2D-to-BEV "lifting" methods (MLPs, depth-based splatting, transformers, etc.), but we find that a parameter-free method is still very competitive: for each 3D coordinate, we simply bilinearly sample the image features at the location it projects to.
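To make the idea concrete, here is a minimal NumPy sketch of that kind of parameter-free lifting for a single camera. It is not the authors' code. The function names (`bilinear_sample`, `lift_points`), the pinhole intrinsics matrix `K`, the assumption that the 3D points are already in the camera frame (z pointing forward), and the choice to return zeros for points outside the view are all illustrative assumptions.

```python
import numpy as np


def bilinear_sample(feat, u, v):
    """Bilinearly sample feat (C, H, W) at pixel coords u (along W), v (along H).

    Returns an array of shape (C, N). Samples falling outside the image are
    set to zero (an assumption made for this sketch).
    """
    _, H, W = feat.shape
    valid = (u >= 0) & (u <= W - 1) & (v >= 0) & (v <= H - 1)
    # Clip so the index arithmetic below stays in bounds; invalid
    # samples are zeroed out at the end.
    uc = np.clip(u, 0, W - 1)
    vc = np.clip(v, 0, H - 1)
    u0 = np.floor(uc).astype(int)
    v0 = np.floor(vc).astype(int)
    u1 = np.minimum(u0 + 1, W - 1)
    v1 = np.minimum(v0 + 1, H - 1)
    du = uc - u0
    dv = vc - v0
    out = (feat[:, v0, u0] * (1 - du) * (1 - dv)
           + feat[:, v0, u1] * du * (1 - dv)
           + feat[:, v1, u0] * (1 - du) * dv
           + feat[:, v1, u1] * du * dv)
    return out * valid


def lift_points(feat, K, points_cam):
    """Project 3D points (N, 3, camera frame) with intrinsics K (3, 3) and
    sample the image features at the projected pixels. No learned parameters.
    """
    z = points_cam[:, 2]
    in_front = z > 1e-6
    z_safe = np.where(in_front, z, 1.0)  # avoid dividing by zero or negative depth
    proj = points_cam @ K.T              # homogeneous pixel coordinates, (N, 3)
    u = proj[:, 0] / z_safe
    v = proj[:, 1] / z_safe
    return bilinear_sample(feat, u, v) * in_front
```

In a full pipeline the points would be the cell centers of a 3D voxel grid, and the samples from several cameras would have to be combined in some way (for example, averaged over the cameras that see each cell). That step is left out of this sketch.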
We found prior work reporting that radar (esp. in nuScenes) is too sparse to be useful. We were skeptical -- radar gives a metric map of the scene, which is very difficult to get from non-overlapping cameras. We show camera+radar fusion is indeed better than using cameras alone.