While the machine learning community has primarily focused on analysing the output of a single data source, there have been relatively few attempts to develop a general framework, or heuristics, for analysing several data sources in terms of a shared dependency structure. Learning from multiple data sources (or alternatively, the data fusion problem) is a timely research area. Due to the increasing availability and sophistication of data recording techniques and advances in data analysis algorithms, there exist many scenarios in which it is necessary to model multiple, related data sources, for example in bioinformatics, multimodal signal processing, information retrieval and sensor networks. The open question is how to analyse data that consist of more than one set of observations (or view) of the same phenomenon. In general, existing methods use a discriminative approach, where a set of features for each data set is found in order to explicitly optimise some dependency criterion. However, a discriminative approach may result in an ad hoc algorithm, may require regularisation to ensure that erroneous shared features are not discovered, and makes it difficult to incorporate prior knowledge about the shared information. A possible way to overcome these problems is a generative probabilistic approach, which models each data stream as a sum of a shared component and a private component that models the within-set variation. In practice, related data sources may exhibit complex covariation (for instance, audio and visual streams related to the same video), and it is therefore necessary to develop models that impose structured variation within and between data sources, rather than assuming a so-called 'flat' data structure. Additional methodological challenges include determining what 'useful' information to extract from the multiple data sources, and building models for predicting one data source given the others.
Finally, alongside learning from multiple data sources in an unsupervised manner, there is the closely related problem of multi-task learning, or transfer learning, where a task is learned from other related tasks.
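The generative view described in the abstract, where each data stream is a sum of a shared component and a private component, can be illustrated with a minimal sketch. It assumes a linear Gaussian model with two views; the function name `sample_two_views`, its parameters, and the dimensions are hypothetical choices for illustration and are not taken from the workshop material.

```python
import numpy as np

def sample_two_views(n, d_shared=2, d_private=3, dims=(5, 4), noise=0.1, seed=0):
    """Sample two views that share one latent source (illustrative assumption).

    Each view is: shared part + private part + isotropic Gaussian noise.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n, d_shared))            # latent source shared by all views
    views = []
    for d in dims:
        W = rng.standard_normal((d_shared, d))        # loading of the shared source
        B = rng.standard_normal((d_private, d))       # loading of the view-specific source
        z_private = rng.standard_normal((n, d_private))
        shared = z @ W                                # variation common to both views
        private = z_private @ B                       # within-set variation of this view
        views.append(shared + private + noise * rng.standard_normal((n, d)))
    return z, views

def cross_covariance(x1, x2):
    """Sample cross-covariance matrix between two views (rows are observations)."""
    x1c = x1 - x1.mean(axis=0)
    x2c = x2 - x2.mean(axis=0)
    return x1c.T @ x2c / (len(x1) - 1)

if __name__ == "__main__":
    z, (x1, x2) = sample_two_views(n=5000)
    # Singular values of the cross-covariance: under this model only the
    # shared source contributes to dependency between the two views.
    print(np.linalg.svd(cross_covariance(x1, x2), compute_uv=False))
```

In this sketch the private sources and the noise are drawn independently for each view, so any dependency between the views comes from the shared source alone. With the private part and the noise switched off, the cross-covariance has rank equal to the number of shared dimensions.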
Author Information
David R Hardoon (SAS)
Gayle Leen (Helsinki University of Technology)
Samuel Kaski (Aalto University and University of Helsinki)
John Shawe-Taylor (UCL)
John Shawe-Taylor has contributed to fields ranging from graph theory through cryptography to statistical learning theory and its applications. However, his main contributions have been in the development of the analysis and subsequent algorithmic definition of principled machine learning algorithms founded in statistical learning theory. This work has helped to drive a fundamental rebirth in the field of machine learning with the introduction of kernel methods and support vector machines, driving the mapping of these approaches onto novel domains including work in computer vision, document classification, and applications in biology and medicine focussed on brain scan, immunity and proteome analysis. He has published over 300 papers and two books that have together attracted over 60000 citations. He has also been instrumental in assembling a series of influential European Networks of Excellence. The scientific coordination of these projects has influenced a generation of researchers and promoted the widespread uptake of machine learning in both science and industry that we are currently witnessing.
More from the Same Authors

2018 Poster: PAC-Bayes bounds for stable algorithms with instance-dependent priors
Omar Rivasplata · Emilio Parrado-Hernandez · John Shawe-Taylor · Shiliang Sun · Csaba Szepesvari
2018 Poster: Empirical Risk Minimization Under Fairness Constraints
Michele Donini · Luca Oneto · Shai Ben-David · John Shawe-Taylor · Massimiliano Pontil
2018 Tutorial: Statistical Learning Theory: a Hitchhiker's Guide
John Shawe-Taylor · Omar Rivasplata
2017 Workshop: Workshop on Prioritising Online Content
John Shawe-Taylor · Massimiliano Pontil · Nicolò Cesa-Bianchi · Emine Yilmaz · Chris Watkins · Sebastian Riedel · Marko Grobelnik
2017 Workshop: From 'What If?' To 'What Next?' : Causal Inference and Machine Learning for Intelligent Decision Making
Ricardo Silva · Panagiotis Toulis · John Shawe-Taylor · Alexander Volfovsky · Thorsten Joachims · Lihong Li · Nathan Kallus · Adith Swaminathan
2016 Workshop: "What If?" Inference and Learning of Hypothetical and Counterfactual Interventions in Complex Systems
Ricardo Silva · John Shawe-Taylor · Adith Swaminathan · Thorsten Joachims
2014 Poster: Multilabel Structured Output Learning with Random Spanning Trees of Max-Margin Markov Networks
Mario Marchand · Hongyu Su · Emilie Morvant · Juho Rousu · John Shawe-Taylor
2012 Workshop: Multi-Trade-offs in Machine Learning
Yevgeny Seldin · Guy Lever · John Shawe-Taylor · Nicolò Cesa-Bianchi · Yacov Crammer · Francois Laviolette · Gabor Lugosi · Peter Bartlett
2011 Workshop: New Frontiers in Model Order Selection
Yevgeny Seldin · Yacov Crammer · Nicolò Cesa-Bianchi · Francois Laviolette · John Shawe-Taylor
2011 Poster: PAC-Bayesian Analysis of Contextual Bandits
Yevgeny Seldin · Peter Auer · Francois Laviolette · John Shawe-Taylor · Ronald Ortner
2010 Talk: Opening Remarks and Awards
Richard Zemel · Terrence J Sejnowski · John Shawe-Taylor
2009 Workshop: Learning from Multiple Sources with Applications to Robotics
Barbara Caputo · Nicolò Cesa-Bianchi · David R Hardoon · Gayle Leen · Francesco Orabona · Jaakko Peltonen · Simon Rogers
2009 Workshop: Grammar Induction, Representation of Language and Language Learning
Alex Clark · Dorota Glowacka · John Shawe-Taylor · Yee Whye Teh · Chris J Watkins
2009 Mini Symposium: Assistive Machine Learning for People with Disabilities
Fernando Perez-Cruz · Emilio Parrado-Hernandez · David R Hardoon · Jaisiel Madrid-Sanchez
2008 Workshop: New Challenges in Theoretical Machine Learning: Data Dependent Concept Spaces
Maria-Florina F Balcan · Shai Ben-David · Avrim Blum · Kristiaan Pelckmans · John Shawe-Taylor
2008 Poster: Theory of matching pursuit
Zakria Hussain · John Shawe-Taylor
2007 Workshop: Music, Brain and Cognition. Part 1: Learning the Structure of Music and Its Effects On the Brain
David R Hardoon · Eduardo Reck-Miranda · John Shawe-Taylor
2007 Poster: Variational Inference for Diffusion Processes
Cedric Archambeau · Manfred Opper · Yuan Shen · Dan Cornford · John Shawe-Taylor
2006 Workshop: Dynamical Systems, Stochastic Processes and Bayesian Inference
Manfred Opper · Cedric Archambeau · John Shawe-Taylor
2006 Poster: Tighter PAC-Bayes Bounds
Amiran Ambroladze · Emilio Parrado-Hernandez · John Shawe-Taylor