Speaker: Mikhail Belkin, Dept. of CSE
Time: Wed 11/20, 4pm-5pm
Location: Dreese Lab 480
Title: It’s time to think again
Abstract: Deep learning has drastically changed the practice in many applied areas of AI. But an equally profound and not yet fully recognized impact of deep learning is in forcing us to rethink many deeply held beliefs and theoretical assumptions. I will discuss some of these ideas, such as over-fitting and capacity control, and show why they fail to adequately describe modern machine learning. I will point to the types of analyses we need to understand and develop modern practice.
Bio: Mikhail Belkin is a Professor in the departments of Computer Science and Engineering and Statistics at the Ohio State University. He received a PhD in mathematics from the University of Chicago in 2003. His research focuses on understanding structure in data, the principles of recovering such structures, and their computational, mathematical and statistical properties. His notable work includes algorithms such as Laplacian Eigenmaps and Manifold Regularization, which use ideas of classical differential geometry for analyzing non-linear high-dimensional data. He is the recipient of an NSF CAREER Award, and has served on the editorial boards of the Journal of Machine Learning Research and IEEE PAMI.
Speaker: Wei-Lun (Harry) Chao
Time: Wed 11/6, 4pm-5pm
Location: Dreese Lab 480
Title: Pseudo-LiDAR for Image-based 3D Object Detection in Autonomous Driving
Abstract: Detecting objects such as cars and pedestrians in 3D plays an indispensable role in autonomous driving. Recent techniques excel with highly accurate detection rates, provided that the 3D input data is obtained from precise but expensive LiDAR technology. Approaches based on cheaper monocular or stereo imagery data have, until now, resulted in drastically lower accuracies — a gap that is commonly attributed to poor image-based depth estimation.
In this talk, I will show that it is not the quality of the data but its representation that accounts for the majority of the performance gap. Taking the inner workings of ConvNets into consideration, we propose to convert image-based depth maps to pseudo-LiDAR representations — essentially mimicking the LiDAR signal. With this representation, we can apply different existing LiDAR-based detection algorithms. On the popular KITTI benchmark, our approach significantly outperforms the existing state of the art in image-based performance — leading to a 300% relative improvement and halving the gap between image-based and LiDAR-based systems.
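The conversion the abstract describes can be illustrated with standard pinhole-camera back-projection: each pixel of an estimated depth map is lifted into a 3D point in camera coordinates, yielding a point cloud that mimics a LiDAR sweep. The sketch below is an assumption-laden illustration, not the talk's actual code; the intrinsics `fx, fy, cx, cy` are hypothetical placeholders, not KITTI's calibration values.

```python
import numpy as np

def depth_to_pseudo_lidar(depth, fx, fy, cx, cy):
    """Back-project a depth map (H x W, in metres) into a 3D point cloud.

    Illustrative sketch of the pseudo-LiDAR idea: every pixel (u, v)
    with depth z is lifted to camera coordinates via the pinhole model
    x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    The intrinsics here are placeholders, not real calibration values.
    """
    h, w = depth.shape
    # pixel coordinate grids: u indexes columns, v indexes rows
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    # N x 3 array of (x, y, z) points: the "pseudo-LiDAR" signal that
    # existing LiDAR-based detectors can consume
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)
```

The resulting point cloud can then be fed to an off-the-shelf LiDAR-based 3D detector, which is the representational switch the talk argues accounts for most of the image-to-LiDAR performance gap.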
I will then present methods to advance the pseudo-LiDAR framework through improvements in image-based depth estimation. Concretely, we adapt the network architecture and loss function to be better aligned with accurate depth estimation. Further, we explore the idea of leveraging cheaper but extremely sparse LiDAR sensors, which alone provide insufficient information for 3D detection, to de-bias our depth estimation. On the KITTI benchmark, our combined approach yields substantial improvements — leading to a 40% relative improvement for far-away objects and achieving performance comparable to LiDAR-based systems for nearby objects. I will conclude the talk with research challenges and opportunities in robust perception for autonomous driving.
Bio: Wei-Lun (Harry) Chao is an Assistant Professor in Computer Science and Engineering at the Ohio State University. His research interests are in machine learning and its applications to computer vision, natural language processing, artificial intelligence, and healthcare. His recent work has focused on robust robotic perception and large-scale visual understanding in the wild. Prior to joining OSU, he was a Postdoctoral Associate in Computer Science at Cornell University working with Prof. Kilian Q. Weinberger and Prof. Mark Campbell. He received a Ph.D. degree in Computer Science from the University of Southern California, supervised by Prof. Fei Sha.