Workshop
Towards an Artificial Intelligence for Data Science
Charles Sutton · James Geddes · Zoubin Ghahramani · Padhraic Smyth · Chris Williams

Fri Dec 11:00 PM -- 09:30 AM PST @ Room 114
Event URL: http://workshops.inf.ed.ac.uk/nips2016-ai4datasci/

Machine learning methods have been applied beyond their origins in artificial intelligence to a wide variety of data analysis problems in fields such as science, health care, technology, and commerce. Previous research in machine learning, perhaps motivated by its roots in AI, has primarily aimed at fully-automated approaches for prediction problems. But predictive analytics is only one step in the larger pipeline of data science, which includes data wrangling, data cleaning, exploratory visualization, data integration, model criticism and revision, and presentation of results to domain experts.


An emerging strand of work aims to address all of these challenges in one stroke by automating a greater portion of the full data science pipeline. This workshop will bring together experts in machine learning, data mining, databases and statistics to discuss the challenges that arise in the full end-to-end process of collecting data, analysing data, and making decisions, and to discuss building new methods that support, whether in an automated or semi-automated way, more of the full process of analysing real data.


Considering the full process of data science raises interesting questions for discussion, such as: What aspects of data analysis might potentially be automated and what aspects seem more difficult? Statistical model building often emphasizes interpretability and human understanding, while machine learning often emphasizes predictive modeling. Are ML methods truly suitable for supporting the full data analysis pipeline? Do recent advances in ML offer help here? Finally, is there low-hanging fruit, i.e., how much time is wasted on routine tasks in scientific data analysis that could be automated?

Specific topics of interest include: data cleaning, exploratory data analysis, semi-supervised learning, active learning, interactive machine learning, model criticism, automated and semi-automated model construction, usable machine learning, interpretable prediction methods, and automatic methods to explain predictions. We are especially interested in contributions that take a broader perspective, i.e., that aim toward supporting the process of data science more holistically.

12:10 AM | Automated Data Cleaning via Multi-View Anomaly Detection (Talk) | Tom Dietterich
12:50 AM | Automatic Discovery of the Statistical Types of Variables in a Dataset (Talk) | Isabel Valera, Zoubin Ghahramani
01:10 AM | Poster spotlights (Talk)
02:00 AM | Invited talk, Christian Steinruecken (Talk) | Christian Steinruecken
02:40 AM | Probabilistic structure discovery in time series data (Talk) | David Janz, Brooks Paige, Tom Rainforth, Jan-Willem van de Meent
03:00 AM | Poster session
05:00 AM | Invited talk, Carlos Guestrin (Talk) | Carlos Guestrin
05:40 AM | An Overview of the DARPA Data Driven Discovery of Models (D3M) Program (Talk) | Richard Lippmann, William Campbell
06:30 AM | Invited talk, Frank Hutter (Talk) | Frank Hutter
07:10 AM | Data Analytics as Data: A Semantic Workflow Approach (Talk) | Kristin P Bennett
07:30 AM | General-Purpose Inductive Programming for Data Wrangling Automation (Talk)

Author Information

Charles Sutton (Google)
James Geddes (The Alan Turing Institute)
Zoubin Ghahramani (Uber and University of Cambridge)

Zoubin Ghahramani is Professor of Information Engineering at the University of Cambridge, where he leads the Machine Learning Group. He studied computer science and cognitive science at the University of Pennsylvania, obtained his PhD from MIT in 1995, and was a postdoctoral fellow at the University of Toronto. His academic career includes concurrent appointments as one of the founding members of the Gatsby Computational Neuroscience Unit in London, and as a faculty member of CMU's Machine Learning Department for over 10 years. His current research interests include statistical machine learning, Bayesian nonparametrics, scalable inference, probabilistic programming, and building an automatic statistician. He has held a number of leadership roles as programme and general chair of the leading international conferences in machine learning, including AISTATS (2005), ICML (2007, 2011), and NIPS (2013, 2014). In 2015 he was elected a Fellow of the Royal Society.

Padhraic Smyth (University of California, Irvine)
Chris Williams (University of Edinburgh)
