
2017 NIPS Workshop on Machine Learning for Intelligent Transportation Systems
Li Erran Li · Anca Dragan · Juan Carlos Niebles · Silvio Savarese

Sat Dec 09 08:00 AM -- 06:30 PM (PST) @ 201 A
Event URL: https://sites.google.com/site/nips2017mlits/

Our transportation systems are poised for a transformation as we make progress on autonomous vehicles, vehicle-to-vehicle (V2V) and vehicle-to-everything (V2X) communication infrastructures, and smart road infrastructures such as smart traffic lights.
There are many challenges in transforming our current transportation systems into this future vision. For example, how do we make perception accurate and robust enough for safe autonomous driving? How do we learn long-term driving strategies (known as driving policies) so that autonomous vehicles are equipped with adaptive human negotiation skills when merging, overtaking, giving way, and so on? How do we achieve near-zero fatalities? How do we optimize efficiency through intelligent traffic management and control of fleets? How do we optimize for traffic capacity during rush hours? To meet these requirements in safety, efficiency, control, and capacity, the systems must be automated with intelligent decision making.

Machine learning will be essential to enable intelligent transportation systems. Machine learning has made rapid progress in self-driving, e.g., real-time perception and prediction of traffic scenes, and has started to be applied to ride-sharing platforms such as Uber (e.g., demand forecasting) and crowd-sourced video scene analysis companies such as Nexar (understanding and avoiding accidents). To address the challenges arising in our future transportation systems, such as traffic management and safety, we need to consider the transportation system as a whole rather than solving problems in isolation. New machine learning solutions are needed, as transportation imposes specific requirements such as extremely low tolerance for uncertainty and the need to intelligently coordinate self-driving cars through V2V and V2X.

The goal of this workshop is to bring together researchers and practitioners from all areas of intelligent transportation systems to address core challenges with machine learning. These challenges include, but are not limited to:
accurate and efficient pedestrian detection and pedestrian intent detection,
machine learning for object tracking,
unsupervised representation learning for autonomous driving,
deep reinforcement learning for learning driving policies,
cross-modal and simulator-to-real-world transfer learning,
scene classification and real-time perception and prediction of traffic scenes,
uncertainty propagation in deep neural networks,
efficient inference with deep neural networks,
predictive modeling of risk and accidents through telematics,
modeling, simulation, and forecasting of demand and mobility patterns in large-scale urban transportation systems,
and machine learning approaches for control and coordination of traffic leveraging V2V and V2X infrastructures.

The workshop will include invited speakers, panels, and presentations of accepted papers and posters. We invite short, long, and position papers addressing the core challenges mentioned above. We encourage researchers and practitioners working on self-driving cars, transportation systems, and ride-sharing platforms to participate. Since this is a topic of broad and current interest, we expect at least 150 participants, including leading university researchers as well as engineers from auto companies and ride-sharing companies.

Sat 8:45 a.m. - 9:00 a.m.
Opening Remarks (Opening)
Li Erran Li
Sat 9:00 a.m. - 9:30 a.m.

Bio: Raquel Urtasun is the Head of Uber ATG Toronto. She is also an Associate Professor in the Department of Computer Science at the University of Toronto, a Canada Research Chair in Machine Learning and Computer Vision and a co-founder of the Vector Institute for AI. Prior to this, she was an Assistant Professor at the Toyota Technological Institute at Chicago (TTIC), an academic computer science institute affiliated with the University of Chicago. She was also a visiting professor at ETH Zurich during the spring semester of 2010. She received her Bachelor's degree from Universidad Publica de Navarra in 2000, her Ph.D. degree from the Computer Science department at Ecole Polytechnique Federale de Lausanne (EPFL) in 2006, and did her postdoc at MIT and UC Berkeley. She is a world-leading expert in machine perception for self-driving cars. Her research interests include machine learning, computer vision, robotics, and remote sensing. Her lab was selected as an NVIDIA NVAIL lab. She is a recipient of an NSERC EWR Steacie Award, an NVIDIA Pioneers of AI Award, a Ministry of Education and Innovation Early Researcher Award, three Google Faculty Research Awards, an Amazon Faculty Research Award, a Connaught New Researcher Award, and two Best Paper Runner-Up Prizes awarded at the Conference on Computer Vision and Pattern Recognition (CVPR) in 2013 and 2017 respectively. She is also an Editor of the International Journal of Computer Vision (IJCV) and has served as Area Chair of multiple machine learning and vision conferences (e.g., NIPS, UAI, ICML, ICLR, CVPR, ECCV).

Raquel Urtasun
Sat 9:30 a.m. - 10:00 a.m.

Abstract: Reinforcement learning and imitation learning have seen success in many domains, including autonomous helicopter flight, Atari, simulated locomotion, Go, and robotic manipulation. However, the sample complexity of these methods remains very high. In this talk I will present two ideas towards effective data collection, with initial findings indicating promise for both: (i) Domain Randomization, which relies on extensive variation (none of it necessarily realistic) in simulation, aiming at generalization to the real world by making the real world (hopefully) look like just another random sample. (ii) Self-supervised Deep RL, which considers the problem of autonomous data collection. We evaluate our approach on a real-world RC car and show it can learn to navigate through a complex indoor environment with a few hours of fully autonomous, self-supervised training.
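The domain randomization idea described above can be caricatured in a few lines: resample the simulator's physical and visual parameters for every training episode, so that a policy trained across the ensemble (hopefully) treats the real world as one more sample. This is only an illustrative sketch, not code from the talk; all parameter names and ranges below are invented for the example.

```python
import random

def randomized_sim_params(rng):
    """Sample one simulator configuration.

    The parameters and their ranges are illustrative assumptions,
    not the ones used in the actual work.
    """
    return {
        "friction": rng.uniform(0.5, 1.5),       # surface friction multiplier
        "mass_scale": rng.uniform(0.8, 1.2),     # vehicle mass perturbation
        "light_intensity": rng.uniform(0.2, 2.0),# scene lighting
        "texture_id": rng.randrange(1000),       # random surface texture
    }

def collect_randomized_episodes(n_episodes, run_episode, seed=0):
    """Run each training episode in a freshly randomized simulator.

    run_episode is any callable that takes a parameter dict and returns
    that episode's collected data (trajectories, rewards, etc.).
    """
    rng = random.Random(seed)
    return [run_episode(randomized_sim_params(rng)) for _ in range(n_episodes)]
```

A policy-gradient or imitation learner would then consume the pooled episodes; the key point is only that no single simulator configuration is ever trained on twice.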

BIO: Pieter Abbeel (Professor at UC Berkeley [2008- ], Co-Founder Embodied Intelligence [2017- ], Co-Founder Gradescope [2014- ], Research Scientist at OpenAI [2016-2017]) works in machine learning and robotics. In particular, his research focuses on making robots learn from people (apprenticeship learning), making robots learn through their own trial and error (reinforcement learning), and speeding up skill acquisition through learning-to-learn. His robots have learned advanced helicopter aerobatics, knot-tying, basic assembly, and organizing laundry. His group has pioneered deep reinforcement learning for robotics, including learning visuomotor skills and simulated locomotion. He has won various awards, including best paper awards at ICML, NIPS and ICRA, the Sloan Fellowship, the Air Force Office of Scientific Research Young Investigator Program (AFOSR-YIP) award, the Office of Naval Research Young Investigator Program (ONR-YIP) award, the DARPA Young Faculty Award (DARPA-YFA), the National Science Foundation Faculty Early Career Development Program Award (NSF-CAREER), the Presidential Early Career Award for Scientists and Engineers (PECASE), the CRA-E Undergraduate Research Faculty Mentoring Award, the MIT TR35, the IEEE Robotics and Automation Society (RAS) Early Career Award, and the Dick Volz Best U.S. Ph.D. Thesis in Robotics and Automation Award. He is an IEEE Fellow.

Pieter Abbeel, Greg Kahn
Sat 10:00 a.m. - 10:30 a.m.

Abstract: The talk will include a brief overview of methods for planning and perception developed at Zoox, and focus on some recent results for learning-based system identification and decision making.

Marin Kobilarov is principal engineer for planning and control at Zoox and assistant professor in Mechanical Engineering at the Johns Hopkins University where he leads the Laboratory for Autonomous Systems, Control, and Optimization. His research focuses on planning and control of robotic systems, on approximation methods for optimization and statistical learning, and applications to autonomous vehicles. Until 2012, he was a postdoctoral fellow in Control and Dynamical Systems at the California Institute of Technology. He obtained a Ph.D. from the University of Southern California in Computer Science (2008) and a B.S. in Computer Science and Applied Mathematics from Trinity College, Hartford, CT (2003).

Marin Kobilarov
Sat 10:30 a.m. - 11:00 a.m.
Poster and Coffee (Coffee)
Sat 11:00 a.m. - 11:10 a.m.
Hesham M. Eraqi, Mohamed N. Moustafa, Jens Honer, End-to-End Deep Learning for Steering Autonomous Vehicles Considering Temporal Dependencies (Contributed Talk)
Sat 11:10 a.m. - 11:20 a.m.
Abhinav Jauhri (CMU), Carlee Joe-Wong, John Paul Shen, On the Real-time Vehicle Placement Problem (Contributed Talk)
Abhinav Jauhri, John Shen
Sat 11:20 a.m. - 11:30 a.m.
Andrew Best (UNC), Sahil Narang, Lucas Pasqualin, Daniel Barber, Dinesh Manocha, AutonoVi-Sim: Autonomous Vehicle Simulation Platform with Weather, Sensing, and Traffic control (Contributed Talk)
Andrew Best
Sat 11:30 a.m. - 11:40 a.m.
Nikita Jaipuria (MIT), Golnaz Habibi, Jonathan P. How, CASNSC: A context-based approach for accurate pedestrian motion prediction at intersections (Contributed Talk)
Nikita Jaipuria
Sat 11:40 a.m. - 12:00 p.m.
  1. Mennatullah Siam, Heba Mahgoub, Mohamed Zahran, Senthil Yogamani, Martin Jagersand, Ahmad El-Sallab, Motion and appearance based Multi-Task Learning network for autonomous driving

  2. Dogancan Temel, Gukyeong Kwon, Mohit Prabhushankar, Ghassan AlRegib, CURE-TSR: Challenging Unreal and Real Environments for Traffic Sign Recognition

  3. Priyam Parashar, Akansel Cosgun, Alireza Nakhaei and Kikuo Fujimura, Modeling Preemptive Behaviors for Uncommon Hazardous Situations From Demonstrations

  4. Mustafa Mukadam, Akansel Cosgun, Alireza Nakhaei, Kikuo Fujimura, Tactical Decision Making for Lane Changing with Deep Reinforcement Learning

  5. Hengshuai Yao, Masoud S. Nosrati, Kasra Rezaee, Monte-Carlo Tree Search vs. Model-Predictive Controller: A Track-Following Example

  6. Ransalu Senanayake, Thushan Ganegedara, Fabio Ramos, Deep occupancy maps: a continuous mapping technique for dynamic environments

Mennatullah Siam, Mohit Prabhushankar, Priyam Parashar, Mustafa Mukadam, Hengshuai Yao, Ransalu Senanayake
Sat 12:00 p.m. - 1:30 p.m.
Sat 1:30 p.m. - 2:00 p.m.

Learning of layered or "deep" representations has provided significant advances in computer vision in recent years, but has traditionally been limited to fully supervised settings with very large amounts of training data, where the model lacked interpretability. New results in adversarial adaptive representation learning show how such methods can also excel when learning across modalities and domains, and further can be trained or constrained to provide natural language explanations or multimodal visualizations to their users. I'll present recent long-term recurrent network models that learn cross-modal description and explanation, using implicit and explicit approaches, which can be applied to domains including fine-grained recognition and visuomotor policies.

Trevor Darrell
Sat 2:00 p.m. - 2:30 p.m.

Abstract: I'll cover some of the discrepancies between machine learning problems in academia and those in industry, especially in the context of autonomous vehicles. I will also cover Tesla's approach to massive fleet learning and some of the associated open research problems.

Bio: Andrej is a Director of AI at Tesla, where he focuses on computer vision for the Autopilot. Previously he was a research scientist at OpenAI working on Reinforcement Learning and a PhD student at Stanford working on end-to-end learning of Convolutional/Recurrent neural network architectures for images and text.

Sat 2:30 p.m. - 3:00 p.m.

Abstract: In this talk, we will focus on the broader angle of applying machine learning to different aspects of transportation, ranging from traffic congestion, real-time speed estimation, and image-based localization to active map making. In particular, as we grow the portfolio of models, we see a unique opportunity in building a unified framework with a number of micro-perception services for intelligent transport, which allows for portability and optimization across multiple transport use cases. We also discuss implications for existing ride-sharing transport as well as the potential impact on autonomous driving.

Bio: Dr. Ramesh Sarukkai currently heads up the Geo teams (Mapping, Localization & Perception) at Lyft. Prior to that he was a Director of Engineering at Facebook and Google/YouTube, where he led a number of platform and product initiatives including applied machine learning teams, consumer/advertising video products, and core payments/risk/developer platforms. He has given a number of talks and keynotes and served as a panelist at major conferences and workshops such as the W3C WWW Conferences and ACM Multimedia, and has published and presented papers at leading journals and conferences on internet technologies, speech/audio, computer vision, and machine learning, in addition to authoring the book "Foundations of Web Technology" (Kluwer/Springer). He also holds a large number of patents in the aforementioned areas and graduated with a PhD in computer science from the University of Rochester.

Ramesh Sarukkai
Sat 3:00 p.m. - 3:30 p.m.
Posters and Coffee (Coffee)
Sat 3:30 p.m. - 4:00 p.m.

Abstract: Fully-autonomous driving has long been sought after. DARPA's efforts dating back a decade ignited the first spark, showcasing the possibilities. Then, the AI revolution pushed the boundaries. These developments led to the creation of a rapidly-growing ecosystem around developing self-driving capabilities. In this talk, we briefly summarize our experience in the DARPA Urban Challenge as Team MIT, one of the six finishers of the race. We then highlight a few of our recent research results at MIT, including end-to-end deep learning for parallel autonomy and sparse-to-dense depth estimation for autonomous driving. We conclude with a few questions that may be relevant in the near future.

Bio: Sertac Karaman is an Associate Professor of Aeronautics and Astronautics at the Massachusetts Institute of Technology (since Fall 2012). He has obtained B.S. degrees in mechanical engineering and in computer engineering from the Istanbul Technical University, Turkey, in 2007; an S.M. degree in mechanical engineering from MIT in 2009; and a Ph.D. degree in electrical engineering and computer science also from MIT in 2012. His research interests lie in the broad areas of robotics and control theory. In particular, he studies the applications of probability theory, stochastic processes, stochastic geometry, formal methods, and optimization for the design and analysis of high-performance cyber-physical systems. The application areas of his research include driverless cars, unmanned aerial vehicles, distributed aerial surveillance systems, air traffic control, certification and verification of control systems software, and many others. He is the recipient of an IEEE Robotics and Automation Society Early Career Award in 2017, an Office of Naval Research (ONR) Young Investigator Award in 2017, Army Research Office (ARO) Young Investigator Award in 2015, National Science Foundation Faculty Career Development (CAREER) Award in 2014, AIAA Wright Brothers Graduate Award in 2012, and an NVIDIA Fellowship in 2011.

Sertac Karaman
Sat 4:00 p.m. - 4:30 p.m.

Abstract: Consider learning a policy from example expert behavior, without interaction with the expert or access to a reward or cost signal. One approach is to recover the expert’s cost function with inverse reinforcement learning, then compute an optimal policy for that cost function. This approach is indirect and can be slow. In this talk, I will discuss a new generative modeling framework for directly extracting a policy from data, drawing an analogy between imitation learning and generative adversarial networks. I will derive a model-free imitation learning algorithm that obtains significant performance gains over existing methods in imitating complex behaviors in large, high-dimensional environments. Our approach can also be used to infer the latent structure of human demonstrations in an unsupervised way. As an example, I will show a driving application where a model learned from demonstrations is able to both produce different driving styles and accurately anticipate human actions using raw visual inputs.
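The adversarial imitation framework described above can be sketched as a loop in which a discriminator learns to tell expert data from policy data, and its score serves as a surrogate reward for the policy. The toy below is a 1-D caricature under heavy simplifying assumptions, not the actual algorithm from the talk: states are scalars, the discriminator is a logistic regressor trained by SGD, and the policy update (moving a Gaussian's mean toward the highest-scoring samples) is an illustrative stand-in for the trust-region policy-gradient step used in practice.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_discriminator(expert, policy_samples, steps=50, lr=0.1):
    """Fit D(s) = sigmoid(w*s + b) by SGD; labels: expert = 1, policy = 0."""
    w, b = 0.0, 0.0
    data = [(s, 1.0) for s in expert] + [(s, 0.0) for s in policy_samples]
    for _ in range(steps):
        for s, y in data:
            g = y - sigmoid(w * s + b)  # gradient of the log-likelihood
            w += lr * g * s
            b += lr * g
    return lambda s: sigmoid(w * s + b)

def adversarial_imitation(expert, n_iters=20, pop=200, seed=0):
    """Alternate discriminator training with a crude policy update:
    sample states from a Gaussian 'policy', score them with D, and move
    the policy mean toward the samples D mistakes for expert behavior."""
    rng = random.Random(seed)
    mu, sigma = 5.0, 1.0  # policy starts far from the expert distribution
    for _ in range(n_iters):
        samples = [rng.gauss(mu, sigma) for _ in range(pop)]
        D = train_discriminator(expert, samples)
        elite = sorted(samples, key=D, reverse=True)[: pop // 10]
        mu = sum(elite) / len(elite)  # stand-in for a TRPO-style update
    return mu
```

Running this with expert states drawn near zero pulls the policy mean from 5.0 toward the expert's mean, illustrating how the discriminator's judgment alone, without any hand-designed cost, steers the policy toward expert-like behavior.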


Stefano Ermon is currently an Assistant Professor in the Department of Computer Science at Stanford University, where he is affiliated with the Artificial Intelligence Laboratory. He completed his PhD in computer science at Cornell in 2015. His research interests include techniques for scalable and accurate inference in graphical models, large-scale combinatorial optimization, and robust decision making under uncertainty, and are motivated by a range of applications, in particular ones in the emerging field of computational sustainability. Stefano's research has won several awards, including three Best Paper Awards and a World Bank Big Data Innovation Challenge, and was selected by Scientific American as one of the 10 World Changing Ideas in 2016. He is a recipient of the Sony Faculty Innovation Award and the NSF CAREER Award.

Stefano Ermon
Sat 4:30 p.m. - 5:00 p.m.

Photo-realistic simulation is rapidly gaining momentum for visual training and test data generation in autonomous driving and general robotic contexts. This is particularly the case for video analysis, where manual labeling of data is extremely difficult or even impossible. This scarcity of adequate labeled training data is widely accepted as a major bottleneck of deep learning algorithms for important video understanding tasks like segmentation, tracking, and action recognition. In this talk, I will describe our use of modern game engines to generate large scale, densely labeled, high-quality synthetic video data with little to no manual intervention. In contrast to approaches using existing video games to record limited data from human game sessions, we build upon the more powerful approach of “virtual world generation”. Pioneering this approach, the recent Virtual KITTI [1] and SYNTHIA [2] datasets are among the largest fully-labelled datasets designed to boost perceptual tasks in the context of autonomous driving and video understanding (including semantic and instance segmentation, 2D and 3D object detection and tracking, optical flow estimation, depth estimation, and structure from motion). With our recent PHAV dataset [3], we push the limits of this approach further by providing stochastic simulations of human actions, camera paths, and environmental conditions. I will describe our work on these synthetic 4D environments to automatically generate potentially infinite amounts of varied and realistic data. I will also describe how to measure and mitigate the domain gap when learning deep neural networks for different perceptual tasks needed for self-driving. I will finally show some recent results on more interactive simulation for autonomous driving and adversarial learning to automatically improve the output of simulators.

Adrien Gaidon
Sat 5:00 p.m. - 6:00 p.m.

Panelists: Dmitry Chichkov (NVIDIA), Adrien Gaidon (TRI), Gregory Kahn (Berkeley), Sertac Karaman (MIT), Ramesh Sarukkai (Lyft)

Greg Kahn, Ramesh Sarukkai, Adrien Gaidon, Sertac Karaman

Author Information

Li Erran Li (Pony.ai)

Li Erran Li is the head of machine learning at Scale and an adjunct professor at Columbia University. Previously, he was chief scientist at Pony.ai. Before that, he was with the perception team at Uber ATG and machine learning platform team at Uber where he worked on deep learning for autonomous driving, led the machine learning platform team technically, and drove strategy for company-wide artificial intelligence initiatives. He started his career at Bell Labs. Li’s current research interests are machine learning, computer vision, learning-based robotics, and their application to autonomous driving. He has a PhD from the computer science department at Cornell University. He’s an ACM Fellow and IEEE Fellow.

Anca Dragan (UC Berkeley)
Juan Carlos Niebles (Stanford University)
Silvio Savarese (Stanford University)
