Invited Talk: Scene Labeling and more – Deep Neural Nets for Autonomous Vehicles (Uwe Franke, Daimler AG)
in
Workshop: Machine Learning for Intelligent Transportation Systems
Abstract: For about 80 years, people have been dreaming of cars that are able to drive by themselves. These days, this vision is starting to become reality. For the first time, cars found their way over a long distance in the DARPA Grand Challenge in 2005. Two years later, the famous DARPA Urban Challenge took place. In both events, all finalists based their systems on active sensors, and Google also started their impressive work with a high-end laser scanner accompanied by radars.
In 2013, we let a new S-class vehicle (a.k.a. Bertha) drive itself from Mannheim to Pforzheim, following the route that Bertha Benz had taken 125 years earlier. Bertha’s environment perception was based on close-to-production radars and (stereo) cameras. For visual object recognition, classical box-based classifiers based on HOG and SVM or shallow neural nets were used. The experiment showed that, although the stereo system used allows for fully autonomous emergency braking in today’s Mercedes-Benz production cars, the state of the art in computer vision around 2013 was not sufficient to deliver the deep understanding of the scene that we need for cars to drive themselves safely in complex urban traffic. The advent of Deep Neural Networks, and the fact that GPUs make it possible to run powerful nets such as GoogLeNet in real time, totally changed the situation. In our current vision system, about 80% of all tasks are solved by DNNs or use information delivered by them. The talk sketches the most important building blocks of this system.
Since we do not believe in a purely box-based recognition system, we use a Fully Convolutional Network (FCN) as the core of our vision system. For training and benchmarking we have introduced the Cityscapes Dataset and benchmark suite, publicly available since early 2016. In September, we registered the 1000th download. Within only one year, pixel-level semantic segmentation performance rose from 65% IoU to more than 77% (October 2016). The results of the semantic labeling stage are subsequently fused with the stereo-based Stixel-World, a super-pixel representation of the depth image using small rectangular regions. The result is a very compact representation of the traffic scene including geometry, motion and semantics. In addition, safety demands that we watch out for unexpected small objects (down to a height of 5 cm) on the street. We fuse the results of a specially trained FCN with a boosted stereo analysis to detect more than 80% of all targets at distances up to 100 m at a false positive rate of only one per minute. If depth is not available from stereo or Lidar, it has to be derived from monocular images. We solve the depth-from-mono problem jointly with scene labeling and instance segmentation. It turns out that these sub-tasks support each other well, yielding results close to ground truth. All schemes run in real time on a standard GPU. Given that many suppliers have efficient hardware components for CNNs on their roadmap, this raises hope that we can use these powerful techniques in our cars in the near future, both for driver assistance and autonomous driving.
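For readers unfamiliar with the IoU figures quoted above, the following is a minimal sketch of how a per-class intersection-over-union score, TP / (TP + FP + FN), and its mean over classes can be computed from pixel labels. It is an illustration only, not the benchmark's evaluation code: the function names, the flat label lists, and the ignore label value 255 are assumptions made for this example.

```python
from collections import Counter


def per_class_iou(ground_truth, prediction, num_classes, ignore_label=255):
    """Return {class_id: IoU} for classes that occur in either labeling.

    ground_truth and prediction are flat sequences of integer class ids,
    one entry per pixel. Pixels whose ground truth equals ignore_label
    (an assumed convention for this sketch) are skipped.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for gt, pred in zip(ground_truth, prediction):
        if gt == ignore_label:
            continue
        if gt == pred:
            tp[gt] += 1      # pixel correctly assigned to its class
        else:
            fn[gt] += 1      # true class was missed
            fp[pred] += 1    # predicted class was wrongly claimed

    ious = {}
    for c in range(num_classes):
        denom = tp[c] + fp[c] + fn[c]
        if denom > 0:        # skip classes absent from both labelings
            ious[c] = tp[c] / denom
    return ious


def mean_iou(ious):
    """Average the per-class IoU values; raise if no class was scored."""
    if not ious:
        raise ValueError("no classes to average")
    return sum(ious.values()) / len(ious)


if __name__ == "__main__":
    gt = [0, 0, 1, 1, 2, 255]
    pred = [0, 1, 1, 1, 2, 2]
    scores = per_class_iou(gt, pred, num_classes=3)
    print(scores, mean_iou(scores))
```

In this toy input, class 0 scores 1/2, class 1 scores 2/3, and class 2 scores 1, so the mean is 13/18, or roughly 72%. Because the score penalizes both missed and wrongly claimed pixels, small or thin classes pull the mean down more than plain pixel accuracy would suggest.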
Bio: Uwe Franke received the Ph.D. degree in electrical engineering from the Technical University of Aachen, Germany, in 1988 for his work on content-based image coding.
Since 1989 he has been with Daimler Research and Development, where he has worked continuously on the development of vision-based driver assistance systems. He developed Daimler’s lane departure warning system, introduced in 2000. Since 2000 he has been head of Daimler’s Image Understanding Group. The stereo technology developed by his group is the basis for the Mercedes-Benz stereo camera system introduced in 2013. His recent work is on image understanding for autonomous driving, in particular Deep Neural Networks.
He was nominated for the “Deutscher Zukunftspreis”, Germany’s most prestigious award for Technology and Innovation, presented by the German President, and was awarded the Karl-Heinz Beckurts Prize in 2012.