Query by Example is a well-known information retrieval task in which the user chooses a document as the search query and the goal is to retrieve relevant documents from a large collection. However, a document often covers multiple aspects of a topic. To address this scenario, we introduce the task of faceted Query by Example, in which users can also specify a finer-grained aspect in addition to the input query document. We focus on the application of this task to scientific literature search. As one solution to this problem, we envision models that can retrieve scientific papers analogous to a query scientific paper along specifically chosen rhetorical structure elements. In this work, the rhetorical structure elements, which we refer to as facets, indicate the objectives, methods, or results of a scientific paper. We introduce and describe an expert-annotated test collection to evaluate models trained to perform this task. Our test collection consists of a diverse set of 50 query documents in English, drawn from computational linguistics and machine learning venues. We carefully follow the annotation guideline used by TREC for depth-k pooling (k = 100 or 250), and the resulting data collection consists of graded relevance scores with high annotation agreement. State-of-the-art models evaluated on our dataset show a significant gap to be closed in further work. Our dataset may be accessed here: https://github.com/iesl/CSFCube
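As a rough illustration of the depth-k pooling mentioned above, the sketch below builds a candidate pool for annotation by taking the union of the top-k documents from several ranked lists. It shows the general idea as it is commonly described for TREC-style evaluation, not the authors' actual pipeline; the function name `depth_k_pool`, the system names, and the document IDs are all hypothetical.

```python
from typing import Dict, List, Set


def depth_k_pool(runs: Dict[str, List[str]], k: int) -> Set[str]:
    """Return the union of the top-k documents from each system's ranking.

    `runs` maps a (hypothetical) system name to its ranked list of document
    IDs for one query, best first. The pooled documents are the ones that
    would be shown to annotators for relevance grading.
    """
    pool: Set[str] = set()
    for ranking in runs.values():
        pool.update(ranking[:k])  # keep only the k highest-ranked documents
    return pool


# Toy rankings for a single query; the IDs are made up for illustration.
runs = {
    "system_a": ["d1", "d2", "d3", "d4"],
    "system_b": ["d3", "d5", "d1", "d6"],
}

pool = depth_k_pool(runs, k=2)
print(sorted(pool))
```

With a larger k (the abstract reports k = 100 or 250), the pool grows toward the union of everything the contributing systems retrieved, which increases annotation cost but reduces the chance of missing relevant documents.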
Author Information
Sheshera Mysore (University of Massachusetts Amherst)
Tim O'Gorman (Thorn)
Andrew McCallum (UMass Amherst)
Hamed Zamani (University of Massachusetts, Amherst)
More from the Same Authors
-
2023 Poster: Learning List-Level Domain-Invariant Representations for Ranking »
Ruicheng Xian · Honglei Zhuang · Zhen Qin · Hamed Zamani · Jing Lu · Ji Ma · Kai Hui · Han Zhao · Xuanhui Wang · Michael Bendersky -
2022 Poster: Modeling Transitivity and Cyclicity in Directed Graphs via Binary Code Box Embeddings »
Dongxu Zhang · Michael Boratko · Cameron Musco · Andrew McCallum -
2022 Poster: Structured Energy Network As a Loss »
Jay Yoon Lee · Dhruvesh Patel · Purujit Goyal · Wenlong Zhao · Zhiyang Xu · Andrew McCallum -
2021 Poster: Capacity and Bias of Learned Geometric Embeddings for Directed Graphs »
Michael Boratko · Dongxu Zhang · Nicholas Monath · Luke Vilnis · Kenneth L Clarkson · Andrew McCallum -
2020 Poster: Improving Local Identifiability in Probabilistic Box Embeddings »
Shib Dasgupta · Michael Boratko · Dongxu Zhang · Luke Vilnis · Xiang Li · Andrew McCallum -
2019 : Coffee Break & Poster Session 2 »
Juho Lee · Yoonho Lee · Yee Whye Teh · Raymond A. Yeh · Yuan-Ting Hu · Alex Schwing · Sara Ahmadian · Alessandro Epasto · Marina Knittel · Ravi Kumar · Mohammad Mahdian · Christian Bueno · Aditya Sanghi · Pradeep Kumar Jayaraman · Ignacio Arroyo-Fernández · Andrew Hryniowski · Vinayak Mathur · Sanjay Singh · Shahrzad Haddadan · Vasco Portilheiro · Luna Zhang · Mert Yuksekgonul · Jhosimar Arias Figueroa · Deepak Maurya · Balaraman Ravindran · Frank NIELSEN · Philip Pham · Justin Payan · Andrew McCallum · Jinesh Mehta · Ke SUN -
2019 : Opening Remarks »
Manzil Zaheer · Nicholas Monath · Ari Kobren · Junier Oliva · Barnabas Poczos · Ruslan Salakhutdinov · Andrew McCallum -
2019 Workshop: Sets and Partitions »
Nicholas Monath · Manzil Zaheer · Andrew McCallum · Ari Kobren · Junier Oliva · Barnabas Poczos · Ruslan Salakhutdinov -
2019 : Andrew McCallum: Learning DAGs and Trees with Box Embeddings and Hyperbolic Embeddings »
Andrew McCallum -
2019 Poster: Search-Guided, Lightly-Supervised Training of Structured Prediction Energy Networks »
Amirmohammad Rooshenas · Dongxu Zhang · Gopal Sharma · Andrew McCallum -
2018 : Contributed Work »
Thaer Moustafa Dieb · Aditya Balu · Amir H. Khasahmadi · Viraj Shah · Boris Knyazev · Payel Das · Garrett Goh · Georgy Derevyanko · Gianni De Fabritiis · Reiko Hagawa · John Ingraham · David Belanger · Jialin Song · Kim Nicoli · Miha Skalic · Michelle Wu · Niklas Gebauer · Peter Bjørn Jørgensen · Ryan-Rhys Griffiths · Shengchao Liu · Sheshera Mysore · Hai Leong Chieu · Philippe Schwaller · Bart Olsthoorn · Bianca-Cristina Cristescu · Wei-Cheng Tseng · Seongok Ryu · Iddo Drori · Kevin Yang · Soumya Sanyal · Zois Boukouvalas · Rishi Bedi · Arindam Paul · Sambuddha Ghosal · Daniil Bash · Clyde Fare · Zekun Ren · Ali Oskooei · Minn Xuan Wong · Paul Sinz · Théophile Gaudin · Wengong Jin · Paul Leu -
2018 Poster: Compact Representation of Uncertainty in Clustering »
Craig Greenberg · Nicholas Monath · Ari Kobren · Patrick Flaherty · Andrew McGregor · Andrew McCallum -
2017 : Invited Talk: "Light Supervision of Structured Prediction Energy Networks" »
Andrew McCallum -
2017 : Poster session 1 »
Van-Doan Nguyen · Stephan Eismann · Haozhen Wu · Garrett Goh · Kristina Preuer · Thomas Unterthiner · Matthew Ragoza · Tien-Lam PHAM · Günter Klambauer · Andrea Rocchetto · Maxwell Hutchinson · Qian Yang · Rafael Gomez-Bombarelli · Sheshera Mysore · Brooke Husic · Ryan-Rhys Griffiths · Masashi Tsubaki · Emma Strubell · Philippe Schwaller · Théophile Gaudin · Michael Brenner · Li Li -
2017 Poster: Active Bias: Training More Accurate Neural Networks by Emphasizing High Variance Samples »
Haw-Shiuan Chang · Erik Learned-Miller · Andrew McCallum -
2014 Workshop: 4th Workshop on Automated Knowledge Base Construction (AKBC) »
Sameer Singh · Fabian M Suchanek · Sebastian Riedel · Partha Pratim Talukdar · Kevin Murphy · Christopher Ré · William Cohen · Tom Mitchell · Andrew McCallum · Jason E Weston · Ramanathan Guha · Boyan Onyshkevych · Hoifung Poon · Oren Etzioni · Ari Kobren · Arvind Neelakantan · Peter Clark -
2012 Poster: MAP Inference in Chains using Column Generation »
David Belanger · Alexandre T Passos · Sebastian Riedel · Andrew McCallum -
2011 Workshop: Big Learning: Algorithms, Systems, and Tools for Learning at Scale »
Joseph E Gonzalez · Sameer Singh · Graham Taylor · James Bergstra · Alice Zheng · Misha Bilenko · Yucheng Low · Yoshua Bengio · Michael Franklin · Carlos Guestrin · Andrew McCallum · Alexander Smola · Michael Jordan · Sugato Basu -
2011 Poster: Query-Aware MCMC »
Michael Wick · Andrew McCallum -
2009 Poster: FACTORIE: Probabilistic Programming via Imperatively Defined Factor Graphs »
Andrew McCallum · Karl Schultz · Sameer Singh -
2009 Poster: Training Factor Graphs with Reinforcement Learning for Efficient MAP Inference »
Michael Wick · Khashayar Rohanimanesh · Sameer Singh · Andrew McCallum -
2009 Spotlight: Training Factor Graphs with Reinforcement Learning for Efficient MAP Inference »
Michael Wick · Khashayar Rohanimanesh · Sameer Singh · Andrew McCallum -
2009 Poster: Rethinking LDA: Why Priors Matter »
Hanna Wallach · David Mimno · Andrew McCallum -
2009 Spotlight: Rethinking LDA: Why Priors Matter »
Hanna Wallach · David Mimno · Andrew McCallum