While the machine learning community has primarily focused on analysing the output of a single data source, there have been relatively few attempts to develop a general framework, or heuristics, for analysing several data sources in terms of a shared dependency structure. Learning from multiple data sources (alternatively, the data fusion problem) is a timely research area. Due to the increasing availability and sophistication of data recording techniques and advances in data analysis algorithms, there exist many scenarios in which it is necessary to model multiple, related data sources, for example in bioinformatics, multi-modal signal processing, information retrieval and sensor networks. The open question is how to analyse data that consists of more than one set of observations (or view) of the same phenomenon. In general, existing methods use a discriminative approach, where a set of features for each data set is found in order to explicitly optimise some dependency criterion. However, a discriminative approach may result in an ad hoc algorithm, may require regularisation to ensure that erroneous shared features are not discovered, and makes it difficult to incorporate prior knowledge about the shared information. A possible solution to these problems is a generative probabilistic approach, which models each data stream as a sum of a shared component and a private component that models the within-set variation. In practice, related data sources may exhibit complex co-variation (for instance, audio and visual streams related to the same video), and it is therefore necessary to develop models that impose structured variation within and between data sources, rather than assuming a so-called 'flat' data structure. Additional methodological challenges include determining what the 'useful' information to extract from the multiple data sources is, and building models for predicting one data source given the others.
Finally, as well as learning from multiple data sources in an unsupervised manner, there is the closely related problem of multitask learning, or transfer learning, where a task is learned from other related tasks.
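The generative view mentioned in the abstract, where each data stream is a sum of a shared component and a private component, can be illustrated with a small simulation. The sketch below is only one possible reading: it assumes linear mappings, Gaussian latent variables and isotropic noise, none of which the abstract specifies, and the function name `sample_two_views` and all dimensions are hypothetical choices made for illustration.

```python
import numpy as np


def sample_two_views(n, d_shared=2, d_private=3, dim_x=5, dim_y=4,
                     noise=0.1, seed=0):
    """Draw n paired observations from an assumed linear-Gaussian two-view model.

    Each view = shared latent part + view-specific (private) part + noise.
    All sizes and the linear form are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)

    # Latent variables: one shared across views, one private to each view.
    z = rng.standard_normal((n, d_shared))
    zx = rng.standard_normal((n, d_private))
    zy = rng.standard_normal((n, d_private))

    # Linear maps from latent space to each observed view.
    Wx = rng.standard_normal((d_shared, dim_x))   # shared -> view x
    Wy = rng.standard_normal((d_shared, dim_y))   # shared -> view y
    Bx = rng.standard_normal((d_private, dim_x))  # private -> view x
    By = rng.standard_normal((d_private, dim_y))  # private -> view y

    x = z @ Wx + zx @ Bx + noise * rng.standard_normal((n, dim_x))
    y = z @ Wy + zy @ By + noise * rng.standard_normal((n, dim_y))
    return x, y, Wx, Wy


x, y, Wx, Wy = sample_two_views(n=50000)

# Empirical cross-covariance between the two views.
xc = x - x.mean(axis=0)
yc = y - y.mean(axis=0)
cross_cov = xc.T @ yc / (len(x) - 1)

# Under this model only the shared latent variable links the views,
# so the population cross-covariance is Wx^T Wy (rank <= d_shared).
expected = Wx.T @ Wy
max_abs_error = np.abs(cross_cov - expected).max()
```

Under these assumptions the private components and the noise affect only the within-view covariances, so the dependency between the views is carried entirely by the low-rank term `Wx.T @ Wy`; with a finite sample, `cross_cov` matches it only approximately.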
Author Information
David R Hardoon (SAS)
Gayle Leen (Helsinki University of Technology)
Samuel Kaski (Aalto University and University of Helsinki)
John Shawe-Taylor (UCL)
John Shawe-Taylor has contributed to fields ranging from graph theory through cryptography to statistical learning theory and its applications. However, his main contributions have been in the development of the analysis and subsequent algorithmic definition of principled machine learning algorithms founded in statistical learning theory. This work has helped to drive a fundamental rebirth in the field of machine learning with the introduction of kernel methods and support vector machines, driving the mapping of these approaches onto novel domains including work in computer vision, document classification, and applications in biology and medicine focussed on brain scan, immunity and proteome analysis. He has published over 300 papers and two books that have together attracted over 60000 citations. He has also been instrumental in assembling a series of influential European Networks of Excellence. The scientific coordination of these projects has influenced a generation of researchers and promoted the widespread uptake of machine learning in both science and industry that we are currently witnessing.
More from the Same Authors
- 2021: Progress in Self-Certified Neural Networks
  Maria Perez-Ortiz · Omar Rivasplata · Emilio Parrado-Hernández · Benjamin Guedj · John Shawe-Taylor
- 2023: Can Reinforcement Learning support policy makers? A preliminary study with Integrated Assessment Models
  Theodore Wolf · Nantas Nardelli · John Shawe-Taylor · Maria Perez-Ortiz
- 2020 Poster: PAC-Bayes Analysis Beyond the Usual Bounds
  Omar Rivasplata · Ilja Kuzborskij · Csaba Szepesvari · John Shawe-Taylor
- 2018 Poster: PAC-Bayes bounds for stable algorithms with instance-dependent priors
  Omar Rivasplata · Emilio Parrado-Hernandez · John Shawe-Taylor · Shiliang Sun · Csaba Szepesvari
- 2018 Poster: Empirical Risk Minimization Under Fairness Constraints
  Michele Donini · Luca Oneto · Shai Ben-David · John Shawe-Taylor · Massimiliano Pontil
- 2018 Tutorial: Statistical Learning Theory: a Hitchhiker's Guide
  John Shawe-Taylor · Omar Rivasplata
- 2017: John Shawe-Taylor - Distribution Dependent Priors for Stable Learning
  John Shawe-Taylor
- 2017: An Efficient Method to Impose Fairness in Linear Models
  Massimiliano Pontil · John Shawe-Taylor
- 2017 Workshop: Workshop on Prioritising Online Content
  John Shawe-Taylor · Massimiliano Pontil · Nicolò Cesa-Bianchi · Emine Yilmaz · Chris Watkins · Sebastian Riedel · Marko Grobelnik
- 2017 Workshop: From 'What If?' To 'What Next?' : Causal Inference and Machine Learning for Intelligent Decision Making
  Ricardo Silva · Panagiotis Toulis · John Shawe-Taylor · Alexander Volfovsky · Thorsten Joachims · Lihong Li · Nathan Kallus · Adith Swaminathan
- 2016 Workshop: "What If?" Inference and Learning of Hypothetical and Counterfactual Interventions in Complex Systems
  Ricardo Silva · John Shawe-Taylor · Adith Swaminathan · Thorsten Joachims
- 2014 Poster: Multilabel Structured Output Learning with Random Spanning Trees of Max-Margin Markov Networks
  Mario Marchand · Hongyu Su · Emilie Morvant · Juho Rousu · John Shawe-Taylor
- 2012 Workshop: Multi-Trade-offs in Machine Learning
  Yevgeny Seldin · Guy Lever · John Shawe-Taylor · Nicolò Cesa-Bianchi · Yacov Crammer · Francois Laviolette · Gabor Lugosi · Peter Bartlett
- 2011 Workshop: New Frontiers in Model Order Selection
  Yevgeny Seldin · Yacov Crammer · Nicolò Cesa-Bianchi · Francois Laviolette · John Shawe-Taylor
- 2011 Poster: PAC-Bayesian Analysis of Contextual Bandits
  Yevgeny Seldin · Peter Auer · Francois Laviolette · John Shawe-Taylor · Ronald Ortner
- 2010 Talk: Opening Remarks and Awards
  Richard Zemel · Terrence Sejnowski · John Shawe-Taylor
- 2009 Workshop: Learning from Multiple Sources with Applications to Robotics
  Barbara Caputo · Nicolò Cesa-Bianchi · David R Hardoon · Gayle Leen · Francesco Orabona · Jaakko Peltonen · Simon Rogers
- 2009 Workshop: Grammar Induction, Representation of Language and Language Learning
  Alex Clark · Dorota Glowacka · John Shawe-Taylor · Yee Whye Teh · Chris J Watkins
- 2009 Mini Symposium: Assistive Machine Learning for People with Disabilities
  Fernando Perez-Cruz · Emilio Parrado-Hernandez · David R Hardoon · Jaisiel Madrid-Sanchez
- 2008 Workshop: New Challenges in Theoretical Machine Learning: Data Dependent Concept Spaces
  Maria-Florina F Balcan · Shai Ben-David · Avrim Blum · Kristiaan Pelckmans · John Shawe-Taylor
- 2008 Poster: Theory of matching pursuit
  Zakria Hussain · John Shawe-Taylor
- 2007 Workshop: Music, Brain and Cognition. Part 1: Learning the Structure of Music and Its Effects On the Brain
  David R Hardoon · Eduardo Reck-Miranda · John Shawe-Taylor
- 2007 Poster: Variational Inference for Diffusion Processes
  Cedric Archambeau · Manfred Opper · Yuan Shen · Dan Cornford · John Shawe-Taylor
- 2006 Workshop: Dynamical Systems, Stochastic Processes and Bayesian Inference
  Manfred Opper · Cedric Archambeau · John Shawe-Taylor
- 2006 Poster: Tighter PAC-Bayes Bounds
  Amiran Ambroladze · Emilio Parrado-Hernandez · John Shawe-Taylor