Flaws in machine learning (ML) algorithm validation are an underestimated global problem. Particularly in automatic biomedical image analysis, chosen performance metrics often do not reflect the domain interest, thus failing to adequately measure scientific progress and hindering translation of ML techniques into practice. A large international expert consortium has now created Metrics Reloaded, a comprehensive framework guiding researchers towards problem-aware metric selection. The framework is based on the novel concept of a problem fingerprint - a structured representation of the given problem that captures all aspects relevant for metric selection, from the domain interest to properties of the target structure(s), data set and algorithm output. It supports image-level classification, object detection, semantic segmentation and instance segmentation tasks. Users are guided through the process of selecting and applying appropriate validation metrics while being made aware of pitfalls. To improve the user experience, we implemented the framework in an online tool, which also provides a common point of access to explore metric weaknesses and strengths. An instantiation of the framework for various biomedical image analysis use cases demonstrates its broad applicability across domains.
Author Information
Annika Reinke (German Cancer Research Center)
Lena Maier-Hein (German Cancer Research Center (DKFZ))
Patrick Scholz (German Cancer Research Center)
Minu D. Tizabi (German Cancer Research Center (DKFZ))
Evangelia Christodoulou (German Cancer Research Center)
Ben Glocker (Imperial College London)
Fabian Isensee (German Cancer Research Center (DKFZ))
Jens Kleesiek (Institute for AI in Medicine (IKIM), University Medicine Essen)
Michal Kozubek (Masaryk University)
Mauricio Reyes (University of Bern)
Michael A. Riegler (SimulaMet)
Manuel Wiesenfarth (German Cancer Research Center)
Michael Baumgartner (German Cancer Research Center (DKFZ))
Matthias Eisenmann (German Cancer Research Center (DKFZ))
Doreen Heckmann-Nötzel (German Cancer Research Center)
A. Kavur
Tim Rädsch (German Cancer Research Center (DKFZ))
Laura Acion (University of Buenos Aires - CONICET)
Michela Antonelli (King's College London)
Tal Arbel (McGill University)
Spyridon Bakas (University of Pennsylvania)
Pete Bankhead (University of Edinburgh)
Arriel Benis (Holon Institute of Technology)
Florian Buettner (Goethe University Frankfurt/German Cancer Research Center (DKFZ))
M. Jorge Cardoso (King's College London)
Veronika Cheplygina (IT University of Copenhagen)
Beth Cimini (Broad Institute of MIT and Harvard)
Gary Collins (University of Oxford)
Keyvan Farahani (Division of Cancer Treatment and Diagnosis, National Cancer Institute)
Luciana Ferrer (CONICET - University of Buenos Aires)
Luciana Ferrer is a researcher at the Computer Science Institute, from the National Scientific and Technical Research Council (CONICET) and the University of Buenos Aires (UBA), Argentina. Prior to her current position, Luciana worked at the Speech Technology and Research Laboratory, SRI International, USA. Her current research interests include speaker and language identification, mental state detection, and pronunciation scoring for second language learning. Luciana received the B.S. degree from the University of Buenos Aires, Argentina, in 2001, and her Ph.D. degree from Stanford University, USA, in 2009.
Adrian Galdran (Universitat Pompeu Fabra)
Bram van Ginneken (Radboud University)
Robert Haase (DFG Cluster of Excellence "Physics of Life" and Center for Systems Biology)
Daniel Hashimoto (University of Pennsylvania)
Michael Hoffman (University Health Network/University of Toronto)
Merel Huisman (Meander Medisch Centrum)
Pierre Jannin (Université de Rennes 1, France)
Charles Kahn (University of Pennsylvania)
Dagmar Kainmueller (BIH/MDC)
Alexandros Karargyris (IHU Strasbourg)
Bernhard Kainz (Imperial College London)

I am Professor at Friedrich-Alexander-University Erlangen-Nuremberg where I head the Image Data Exploration and Analysis Lab (IDEA Lab) and I am Reader (= US/EU Associate Professor++) in the Department of Computing at Imperial College London where I lead the human-in-the-loop computing group and co-lead the biomedical image analysis research group (BioMedIA). We are a post-pandemic, borderless research group, across nations and institutions. Our research is about intelligent algorithms in healthcare, especially Medical Imaging. We are working on self-driving medical image acquisition that can guide human operators in real-time during diagnostics. Artificial Intelligence is currently used as a blanket term to describe research in these areas.
Alan Karthikesalingam (Google)
Hannes Kenngott (University of Heidelberg)
Florian Kofler (Helmholtz AI/TU Munich)
Annette Kopp-Schneider
Anna Kreshuk (EMBL)
Tahsin Kurc (Stony Brook University)
Bennett Landman (Vanderbilt University)
Geert Litjens (Radboud University Nijmegen Medical Center)
Geert Litjens studied Biomedical Engineering at Eindhoven University of Technology. Subsequently, he completed his PhD in the Diagnostic Image Analysis Group, where he worked with Henkjan Huisman on computer-aided detection of prostate cancer. He spent 2015 as a postdoctoral researcher at the National Center for Tumor Diseases in Heidelberg, Germany on an Alexander von Humboldt Society Postdoctoral Fellowship. He is currently an Assistant Professor in Computational Pathology at the Department of Pathology. His research focuses on applying machine learning to solve important questions in oncology: - How to improve efficiency and accuracy through automation of diagnostics? - How to quantify (un)known biomarkers for cancer progression and treatment success using machine learning? For more details on his research group: https://www.computationalpathologygroup.eu/
Amin Madani (University Health Network)
Klaus H. Maier-Hein (German Cancer Research Center (DKFZ))
Anne Martel (Sunnybrook Research Institute, Toronto)
Peter Mattson (Google)
Leads ML Performance Metrics team at Google Brain. General Chair of MLPerf. Ph.D. Stanford University.
Erik Meijering (University of New South Wales)
Bjoern Menze (TU Munich)
David Moher (Ottawa Hospital Research Institute and University of Ottawa)
Karel G.M. Moons (UMC Utrecht, University Utrecht)
Henning Mueller (HES-SO)
Brennan Nichyporuk (Mila)
Felix Nickel (University Hospital of Heidelberg)
Jens Petersen (German Cancer Research Center (DKFZ))
Nasir Rajpoot (University of Warwick)
Nicola Rieke (Nvidia)
Julio Saez-Rodriguez (Heidelberg University)
Clarisa Sanchez (University of Amsterdam)
Shravya Shetty (Google, LLC)
Maarten van Smeden (University Medical Center Utrecht)
Carole Sudre (King's College London)
Ronald Summers (NIH)
Abdel Aziz Taha (Data Science Studio, Research Studios Austria FG, Vienna, Austria)
Sotirios Tsaftaris (University of Edinburgh)
Ben Van Calster (Katholieke Universiteit (KU) Leuven)
Gaël Varoquaux (INRIA)
Paul Jäger (DKFZ)
More from the Same Authors
-
2020 : Quantification of task similarity for efficient knowledge transfer in biomedical image analysis »
Patrick Scholz -
2021 : Multilingual Spoken Words Corpus »
Mark Mazumder · Sharad Chitlangia · Colby Banbury · Yiping Kang · Juan Ciro · Keith Achorn · Daniel Galvez · Mark Sabini · Peter Mattson · David Kanter · Greg Diamos · Pete Warden · Josh Meyer · Vijay Janapa Reddi -
2021 : Whole Brain Vessel Graphs: A Dataset and Benchmark for Graph Learning and Neuroscience »
Johannes C. Paetzold · Julian McGinnis · Suprosanna Shit · Ivan Ezhov · Paul Büschl · Chinmay Prabhakar · Anjany Sekuboyina · Mihail Todorov · Georgios Kaissis · Ali Ertürk · Stephan Günnemann · Bjoern Menze -
2021 : Attention Shift: Interpretability Study of Texture-based Data Augmentation in Training U-Net Models for Brain Image Segmentation »
Suhang You · Mauricio Reyes -
2021 : Streaming Convolutional Attention Models »
Stephan Dooper · Geert Litjens · Johannes Pinckaers -
2021 : Ranking Loss based Weakly Supervised Model for Prediction of HPV Infection Status from Multi-Gigapixel Histology Images »
Ruoyu Wang · Amina Asif · Raja Muhammad Saad Bashir · Ali Khurram · Nasir Rajpoot -
2021 : An Optimal Architecture for Semantic Segmentation in Multi-Gigapixel Images of Oral Dysplasia »
Neda Azarmehr · Adam Shephard · Nasir Rajpoot · Ali Khurram -
2021 : Synthesis of Colon Cancer Tissue Images from Glandular Structure Layout »
Srijay Deshpande · Fayyaz Minhas · Nasir Rajpoot -
2022 : Shortcuts in Public Medical Image Datasets »
Amelia Jiménez-Sánchez · Andreas Skovdal · Frederik Bechmann Faarup · Kasper Thorhauge Grønbek · Veronika Cheplygina -
2022 : A Framework for Generating 3D Shape Counterfactuals »
Rajat Rasal · Daniel C. Castro · Nick Pawlowski · Ben Glocker -
2022 : Labeling instructions matter in biomedical image analysis »
Tim Rädsch · Annika Reinke · Vivienn Weru · Minu D. Tizabi · Nicholas Schreck · A. Kavur · Bünyamin Pekdemir · Tobias Roß · Annette Kopp-Schneider · Lena Maier-Hein -
2022 : Structured Priors for Disentangling Pathology and Anatomy in Patient Brain MRI »
Anjun Hu · Jean-Pierre Falet · Changjian Shui · Brennan Nichyporuk · Sotirios Tsaftaris · Tal Arbel -
2022 : How do 3D image segmentation networks behave across the context versus foreground ratio trade-off? »
Amith Kamath · Yannick Suter · Suhang You · Michael Mueller · Jonas Willmann · Nicolaus Andratschke · Mauricio Reyes -
2022 : Segmentation of Ascites on Abdominal CT Scans for the Assessment of Ovarian Cancer »
Benjamin Hou · Manas Nag · Jung-Min Lee · Christopher Koh · Ronald Summers -
2022 : Semi-Supervised Cross-Consistency Contrastive Learning for Nuclei Segmentation in Histology Images »
Raja Muhammad Saad Bashir · Talha Qaiser · Shan Raza · Nasir Rajpoot -
2022 : Transformer Utilization in Medical Image Segmentation Networks »
Saikat Roy · Gregor Köhler · Michael Baumgartner · Constantin Ulrich · Jens Petersen · Fabian Isensee · Klaus H. Maier-Hein -
2022 : Transformer-based normative modelling for anomaly detection of early schizophrenia »
Pedro Ferreira da Costa · Jessica Dafflon · Sergio Mendes · João Sato · M. Jorge Cardoso · Robert Leech · Emily Jones · Walter Lopez Pinaya -
2022 Spotlight: Lightning Talks 2B-4 »
Feiyi Xiao · Amrutha Saseendran · Kwangho Kim · Keyu Yan · Changjian Shui · Guangxi Li · Shikun Li · Edward Kennedy · Man Zhou · Gezheng Xu · Ruilin Ye · Xiaobo Xia · Junjie Tang · Kathrin Skubch · Stefan Falkner · Hansong Zhang · Jose Zubizarreta · Huaying Fang · Xuanqiang Zhao · Jie Huang · Qi CHEN · Yibing Zhan · Jiaqi Li · Xin Wang · Ruibin Xi · Feng Zhao · Margret Keuper · Charles Ling · Shiming Ge · Chengjun Xie · Tongliang Liu · Tal Arbel · Chongyi Li · Danfeng Hong · Boyu Wang · Christian Gagné -
2022 Spotlight: On Learning Fairness and Accuracy on Multiple Subgroups »
Changjian Shui · Gezheng Xu · Qi CHEN · Jiaqi Li · Charles Ling · Tal Arbel · Boyu Wang · Christian Gagné -
2022 Poster: Diagnosing failures of fairness transfer across distribution shift in real-world medical settings »
Jessica Schrouff · Natalie Harris · Sanmi Koyejo · Ibrahim Alabdulmohsin · Eva Schnider · Krista Opsahl-Ong · Alexander Brown · Subhrajit Roy · Diana Mincu · Christina Chen · Awa Dieng · Yuan Liu · Vivek Natarajan · Alan Karthikesalingam · Katherine Heller · Silvia Chiappa · Alexander D'Amour -
2022 Poster: On Learning Fairness and Accuracy on Multiple Subgroups »
Changjian Shui · Gezheng Xu · Qi CHEN · Jiaqi Li · Charles Ling · Tal Arbel · Boyu Wang · Christian Gagné -
2021 : DataPerf - Peter Mattson and Praveen Paritosh »
Peter Mattson -
2021 : Session 1 Keynote 1 »
Bram van Ginneken -
2021 Workshop: Medical Imaging meets NeurIPS »
Qi Dou · Marleen de Bruijne · Ben Glocker · Aasa Feragen · Hervé Lombaert · Ipek Oguz · Jonas Teuwen · Islem Rekik · Darko Stern · Xiaoxiao Li -
2020 Workshop: Medical Imaging Meets NeurIPS »
Jonas Teuwen · Qi Dou · Ben Glocker · Ipek Oguz · Aasa Feragen · Hervé Lombaert · Ender Konukoglu · Marleen de Bruijne -
2020 : Introduction by Ben Glocker »
Ben Glocker -
2020 Poster: Deep Structural Causal Models for Tractable Counterfactual Inference »
Nick Pawlowski · Daniel Coelho de Castro · Ben Glocker -
2020 Poster: Stochastic Segmentation Networks: Modelling Spatially Correlated Aleatoric Uncertainty »
Miguel Monteiro · Loic Le Folgoc · Daniel Coelho de Castro · Nick Pawlowski · Bernardo Marques · Konstantinos Kamnitsas · Mark van der Wilk · Ben Glocker -
2019 : Poster Session »
Pravish Sainath · Mohamed Akrout · Charles Delahunt · Nathan Kutz · Guangyu Robert Yang · Joseph Marino · L F Abbott · Nicolas Vecoven · Damien Ernst · andrew warrington · Michael Kagan · Kyunghyun Cho · Kameron Harris · Leopold Grinberg · John J. Hopfield · Dmitry Krotov · Taliah Muhammad · Erick Cobos · Edgar Walker · Jacob Reimer · Andreas Tolias · Alexander Ecker · Janaki Sheth · Yu Zhang · Maciej Wołczyk · Jacek Tabor · Szymon Maszke · Roman Pogodin · Dane Corneil · Wulfram Gerstner · Baihan Lin · Guillermo Cecchi · Jenna M Reinen · Irina Rish · Guillaume Bellec · Darjan Salaj · Anand Subramoney · Wolfgang Maass · Yueqi Wang · Ari Pakman · Jin Hyung Lee · Liam Paninski · Bryan Tripp · Colin Graber · Alex Schwing · Luke Prince · Gabriel Ocker · Michael Buice · Benjamin Lansdell · Konrad Kording · Jack Lindsey · Terrence Sejnowski · Matthew Farrell · Eric Shea-Brown · Nicolas Farrugia · Victor Nepveu · Jiwoong Im · Kristin Branson · Brian Hu · Ramakrishnan Iyer · Stefan Mihalas · Sneha Aenugu · Hananel Hazan · Sihui Dai · Tan Nguyen · Doris Tsao · Richard Baraniuk · Anima Anandkumar · Hidenori Tanaka · Aran Nayebi · Stephen Baccus · Surya Ganguli · Dean Pospisil · Eilif Muller · Jeffrey S Cheng · Gaël Varoquaux · Kamalaker Dadi · Dimitrios C Gklezakos · Rajesh PN Rao · Anand Louis · Christos Papadimitriou · Santosh Vempala · Naganand Yadati · Daniel Zdeblick · Daniela M Witten · Nicholas Roberts · Vinay Prabhu · Pierre Bellec · Poornima Ramesh · Jakob H Macke · Santiago Cadena · Guillaume Bellec · Franz Scherr · Owen Marschall · Robert Kim · Hannes Rapp · Marcio Fonseca · Oliver Armitage · Jiwoong Im · Thomas Hardcastle · Abhishek Sharma · Wyeth Bair · Adrian Valente · Shane Shang · Merav Stern · Rutuja Patil · Peter Wang · Sruthi Gorantla · Peter Stratton · Tristan Edwards · Jialin Lu · Martin Ester · Yurii Vlasov · Siavash Golkar -
2019 : Coffee Break + Poster Session I »
Wei-Hung Weng · Simon Kohl · Aiham Taleb · Arijit Patra · Khashayar Namdar · Matthias Perkonigg · Shizhan Gong · Abdullah-Al-Zubaer Imran · Amir Abdi · Ilja Manakov · Johannes C. Paetzold · Ben Glocker · Dushyant Sahoo · Shreyas Fadnavis · Karsten Roth · Xueqing Liu · Yifan Zhang · Alexander Preuhs · Fabian Eitel · Anusua Trivedi · Tomer Weiss · Darko Stern · Liset Vazquez Romaguera · Johannes Hofmanninger · Aakash Kaku · Oloruntobiloba Olatunji · Anastasia Razdaibiedina · Tao Zhang -
2019 Workshop: Medical Imaging meets NeurIPS »
Hervé Lombaert · Ben Glocker · Ender Konukoglu · Marleen de Bruijne · Aasa Feragen · Ipek Oguz · Jonas Teuwen -
2019 : Opening Remarks »
Hervé Lombaert · Ben Glocker · Ender Konukoglu · Marleen de Bruijne · Aasa Feragen · Ipek Oguz · Jonas Teuwen -
2019 Poster: Domain Generalization via Model-Agnostic Learning of Semantic Features »
Qi Dou · Daniel Coelho de Castro · Konstantinos Kamnitsas · Ben Glocker -
2018 : Closing remarks »
Ender Konukoglu · Ben Glocker · Hervé Lombaert · Marleen de Bruijne -
2018 : Is your machine learning method solving a real clinical problem? »
Tal Arbel -
2018 : Welcome »
Ender Konukoglu · Ben Glocker · Hervé Lombaert · Marleen de Bruijne -
2018 Workshop: Medical Imaging meets NIPS »
Ender Konukoglu · Ben Glocker · Hervé Lombaert · Marleen de Bruijne -
2017 : Closing »
Ben Glocker · Ender Konukoglu · Hervé Lombaert · Kanwal Bhatia -
2017 : Poster session - Afternoon »
Yongchan Kwon · Young-geun Kim · Ender Konukoglu · Peter Li · John Guibas · Tejpal Virdi · Kuldeep Kumar · Morteza Mardani · Jelmer Wolterink · Enhao Gong · Natalia Antropova · Johannes Stelzer · Rene Bidart · Wei-Hung Weng · Martin Rajchl · Marc Górriz · Vineeta Singh · Christopher Sandino · Hiba Chougrad · Bob Hu · Isaac Godfried · Ke Xiao · Heliodoro Tejeda Lemus · Jordan Harrod · ILSANG WOO · Vincent Chen · Joseph Cheng · Vikash Gupta · Chuck-Hou Yee · Ben Glocker · Hervé Lombaert · Maximilian Ilse · Aneta Lisowska · Andrew Doyle · Milad Mckie -
2017 : Machine learning for cognitive mapping »
Gaël Varoquaux -
2017 : The Multimodal Brain Tumor Segmentation Challenge (TU Munich) »
Bjoern Menze -
2017 : Poster session - Morning »
Yongchan Kwon · Young-geun Kim · Ender Konukoglu · Peter Li · John Guibas · Tejpal Virdi · Kuldeep Kumar · Morteza Mardani · Jelmer Wolterink · Enhao Gong · Natalia Antropova · Johannes Stelzer · Rene Bidart · Wei-Hung Weng · Martin Rajchl · Marc Górriz · Vineeta Singh · Christopher Sandino · Hiba Chougrad · Bob Hu · Isaac Godfried · Ke Xiao · Heliodoro Tejeda Lemus · Jordan Harrod · ILSANG WOO · Vincent Chen · Joseph Cheng · Vikash Gupta · Chuck-Hou Yee · Ben Glocker · Hervé Lombaert · Maximilian Ilse · Aneta Lisowska · Andrew Doyle · Milad Mckie -
2017 : Opening »
Ben Glocker · Ender Konukoglu · Hervé Lombaert · Kanwal Bhatia -
2017 Workshop: Medical Imaging meets NIPS »
Ben Glocker · Ender Konukoglu · Hervé Lombaert · Kanwal Bhatia -
2016 : Evaluation-as-a-Service: a serious game »
Henning Mueller -
2016 Poster: Testing for Differences in Gaussian Graphical Models: Applications to Brain Connectivity »
Eugene Belilovsky · Gaël Varoquaux · Matthew Blaschko -
2016 Oral: Testing for Differences in Gaussian Graphical Models: Applications to Brain Connectivity »
Eugene Belilovsky · Gaël Varoquaux · Matthew Blaschko -
2010 Poster: Brain covariance selection: better individual functional connectivity models using population prior »
Gaël Varoquaux · Alexandre Gramfort · Jean-Baptiste Poline · Bertrand Thirion