Table Representation Learning Workshop
Madelon Hulsebos · Bojan Karlaš · Haoyu Dong · Gael Varoquaux · Laurel Orr · Pengcheng Yin
Room 235 - 236
Tables are a promising modality for representation learning, with application potential too large to ignore. Yet tables have long been overlooked despite their dominant presence in the data landscape, e.g. in data management and analysis pipelines. The majority of datasets in Google Dataset Search, for example, resemble typical tabular file formats like CSVs. Similarly, the three most-used database management systems are all relational (RDBMSs). Representation learning over tables (TRL), possibly combined with other modalities such as text or SQL, has shown impressive performance for tasks like table-based question answering, table understanding, and data preparation. More recently, TRL has also proven effective for tabular ML, while researchers have started exploring the impressive capabilities of LLMs for table encoding and data manipulation. Follow our Twitter feed for updates: https://twitter.com/TrlWorkshop.
The first edition of the Table Representation Learning (TRL) workshop at NeurIPS 2022 gathered an enthusiastic community and stimulated new research and collaborations, which we aim to continue in 2023. The TRL workshop has three main goals:
(1) Motivate tables as a primary modality for representation and generative learning and advance the area further.
(2) Showcase impactful applications of pretrained table models and discuss future opportunities.
(3) Foster discussion and collaboration across the ML, NLP and DB communities.
Schedule
Fri 6:30 a.m. - 6:45 a.m.
|
Opening notes
(
Talk
)
>
SlidesLive Video |
Fri 6:45 a.m. - 7:15 a.m.
|
Invited talk: Co-Designing LLMs and LLM-Powered Data Management Tools
(
Talk
)
>
SlidesLive Video Large Language Models (LLMs) are now widely used for data management. We recently proposed Evaporate [ICLR Spotlight 2023, VLDB 2024], a system that uses LLMs to help users efficiently query semi-structured documents. We also showed how off-the-shelf LLMs perform data-wrangling tasks with state-of-the-art quality and no specialized training [VLDB 2023]. This talk discusses some of my lessons from working on these early LLM-for-data-management projects and subsequent research to improve the reach of these systems; in particular, there is still a way to go in extending LLMs to datatypes such as private, semi-structured, and long-sequence data. Towards extending our capabilities on these datatypes, I'll discuss MQAR and Monarch Mixer [NeurIPS Oral 2023], new LM architectures that can match the quality of attention-based LMs while remaining asymptotically more efficient at training and inference time. We'll finally discuss how these fundamental breakthroughs can power next-generation data management tools. |
Simran Arora 🔗 |
Fri 7:15 a.m. - 7:22 a.m.
|
High-Performance Transformers for Table Structure Recognition Need Early Convolutions
(
Spotlight
)
>
link
SlidesLive Video Table structure recognition (TSR) aims to convert tabular images into a machine-readable format, where a visual encoder extracts image features and a textual decoder generates table-representing tokens. Existing approaches use classic convolutional neural network (CNN) backbones for the visual encoder and transformers for the textual decoder. However, this hybrid CNN-Transformer architecture introduces a complex visual encoder that accounts for nearly half of the total model parameters, markedly reduces both training and inference speed, and hinders the potential for self-supervised learning in TSR. In this work, we design a lightweight visual encoder for TSR without sacrificing expressive power. We discover that a convolutional stem can match classic CNN backbone performance, with a much simpler model. The convolutional stem strikes an optimal balance between two crucial factors for high-performance TSR: a higher receptive field (RF) ratio and a longer sequence length. This allows it to "see" an appropriate portion of the table and "store" the complex table structure within sufficient context length for the subsequent transformer. We conducted reproducible ablation studies and open-sourced our code at https://anonymous.4open.science/r/NeurIPS23-TRL-2 to enhance transparency, inspire innovations, and facilitate fair comparisons in our domain as tables are a promising modality for representation learning. |
ShengYun Peng · Seongmin Lee · Xiaojing Wang · Rajarajeswari Balasubramaniyan · Duen Horng Chau 🔗 |
Fri 7:23 a.m. - 7:30 a.m.
|
Pool-Search-Demonstrate: Improving Data-wrangling LLMs via better in-context examples
(
Spotlight
)
>
link
SlidesLive Video Data-wrangling is a process that transforms raw data for further analysis and for use in downstream tasks. Recently, it has been shown that foundation models can be successfully used for data-wrangling tasks (Narayan et al., 2022). An important aspect of data wrangling with LMs is to properly construct prompts for the given task. Within these prompts, a crucial component is the choice of in-context examples. In the previous study of Narayan et al., demonstration examples are chosen manually by the authors, which may not scale to new datasets. In this work, we propose a simple demonstration strategy that individualizes demonstration examples for each input by selecting them from a pool based on their distance in the embedding space. Additionally, we propose a postprocessing method that exploits the embedding of labels under a closed-world assumption. Empirically, our embedding-based example retrieval and postprocessing improve foundation models' performance by up to 84% over randomly selected examples and 49% over manually selected examples in the demonstration. Ablation tests reveal the effect of class embeddings, and of various factors in demonstration such as quantity, quality, and diversity. |
Joon Suk Huh · Changho Shin · Elina Choi 🔗 |
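The abstract above selects in-context examples for each input by their distance in an embedding space. A minimal sketch of that selection step, using tiny hand-made vectors as stand-ins for a real text or row encoder (the pool contents and `k` are illustrative, not from the paper):

```python
import math

def cosine_distance(u, v):
    # 1 - cosine similarity; assumes non-zero vectors
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (nu * nv)

def select_demonstrations(query_emb, pool, k=2):
    """pool: list of (embedding, example) pairs; returns the k nearest examples."""
    ranked = sorted(pool, key=lambda item: cosine_distance(query_emb, item[0]))
    return [example for _, example in ranked[:k]]

pool = [
    ([1.0, 0.0], "example A"),
    ([0.9, 0.1], "example B"),
    ([0.0, 1.0], "example C"),
]
print(select_demonstrations([1.0, 0.05], pool, k=2))
```

In practice the embeddings would come from a pretrained encoder and the pool from a labeled demonstration set; the retrieval logic stays the same.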
Fri 7:31 a.m. - 7:38 a.m.
|
TabPFGen – Tabular Data Generation with TabPFN
(
Spotlight
)
>
link
SlidesLive Video Advances in deep generative modelling have not translated well to tabular data. We argue that this is caused by a mismatch in structure between popular generative models and discriminative models of tabular data. We thus devise a technique to turn TabPFN -- a highly performant transformer initially designed for in-context discriminative tabular tasks -- into an energy-based generative model, which we dub TabPFGen. This novel framework leverages the pre-trained TabPFN as part of the energy function and does not require any additional training or hyperparameter tuning, thus inheriting TabPFN's in-context learning capability. We can sample from TabPFGen analogously to other energy-based models. We demonstrate strong results on standard generative modelling tasks, including data augmentation, class-balancing, and imputation, unlocking a new frontier of tabular data generation. |
Jeremy (Junwei) Ma · Apoorv Dankar · George Stein · Guangwei Yu · Anthony Caterini 🔗 |
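To give a flavour of the energy-based view, the toy sketch below defines an energy from a stand-in classifier's logits and draws a sample with a simple Metropolis random walk. This is only an analogy: the quadratic "scorer" replaces the pretrained TabPFN, and the paper's actual sampling procedure is not reproduced here.

```python
import math
import random

def logits(x):
    # hypothetical 2-class scorer with modes near x = -1 and x = +1
    return [-(x + 1.0) ** 2, -(x - 1.0) ** 2]

def energy(x):
    # E(x) = -log sum_y exp(logit_y(x)), computed stably
    m = max(logits(x))
    return -(m + math.log(sum(math.exp(l - m) for l in logits(x))))

def metropolis_sample(steps=2000, step_size=0.5, seed=0):
    rng = random.Random(seed)
    x = 0.0
    for _ in range(steps):
        proposal = x + rng.gauss(0.0, step_size)
        # accept with probability exp(E(x) - E(proposal)), capped at 1
        if rng.random() < math.exp(min(0.0, energy(x) - energy(proposal))):
            x = proposal
    return x

sample = metropolis_sample()  # samples concentrate near the low-energy modes
```

Lower energy means higher density, so the chain spends most of its time near the scorer's modes at ±1.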
Fri 7:38 a.m. - 7:45 a.m.
|
Data Ambiguity Strikes Back: How Documentation Improves GPT's Text-to-SQL
(
Spotlight
)
>
link
SlidesLive Video Text-to-SQL allows experts to use databases without in-depth knowledge of them. However, real-world tasks involve both query and data ambiguities. Most work on Text-to-SQL has focused on query ambiguities, designing chat interfaces through which experts can provide clarifications. In contrast, the data management community has long studied data ambiguities, but mainly addresses error detection and correction, rather than documenting them for disambiguation in data tasks. This work delves into these data ambiguities in real-world datasets. We have identified prevalent data ambiguities of value consistency, data coverage, and data granularity that affect tasks. We examine how documentation, originally made to help humans disambiguate data, can help GPT-4 with Text-to-SQL tasks. By offering documentation on these ambiguities, we found GPT-4's performance improved by 28.9%. |
Zachary Huang · Pavan Kalyan Damalapati · Eugene Wu 🔗 |
Fri 7:46 a.m. - 7:53 a.m.
|
MultiTabQA: Generating Tabular Answers for Multi-Table Question Answering
(
Spotlight
)
>
link
SlidesLive Video Recent advances in tabular question answering (QA) with large language models are constrained in their coverage and only answer questions over a single table. However, real-world queries are complex in nature, often spanning multiple tables in a relational database or web page. Single-table questions do not involve common table operations such as set operations, Cartesian products (joins), or nested queries. Furthermore, multi-table operations often result in a tabular output, which necessitates table generation capabilities in tabular QA models. To fill this gap, we propose a new task of answering questions over multiple tables. Our model, MultiTabQA, not only answers questions over multiple tables, but also generalizes to generate tabular answers. To enable effective training, we build a pre-training dataset comprising 132,645 SQL queries and tabular answers. Further, we evaluate the generated tables by introducing table-specific metrics of varying strictness, assessing various levels of granularity of the table structure. MultiTabQA outperforms state-of-the-art single-table QA models adapted to a multi-table QA setting by finetuning on three datasets: Spider, Atis and GeoQuery. |
Vaishali Pal · Andrew Yates · Evangelos Kanoulas · Maarten Rijke 🔗 |
Fri 8:00 a.m. - 8:20 a.m.
|
Coffee break + poster setup
(
Break
)
>
|
Fri 8:20 a.m. - 9:00 a.m.
|
Poster Session 1
(
Poster session
)
>
|
Fri 9:00 a.m. - 9:30 a.m.
|
Invited talk: Advances in In-Context Learning for Tabular Datasets
(
Talk
)
>
A year ago, we introduced TabPFN, the first in-context learning method for tabular data. In this talk, I will discuss what has happened since. I will start by briefly discussing CAAFE, a system that uses LLMs for automated feature engineering on tabular data and makes effective use of TabPFN's speed. Then, I will situate prior-data fitted networks (PFNs) in the in-context learning literature, review various applications of PFNs, explain TabPFN in more detail, and discuss our ongoing work on removing TabPFN's remaining limitations. |
Frank Hutter 🔗 |
Fri 9:30 a.m. - 9:37 a.m.
|
Self-supervised Representation Learning from Random Data Projectors
(
Spotlight
)
>
link
SlidesLive Video Self-supervised representation learning (SSRL) has advanced considerably by exploiting the transformation-invariance assumption under artificially designed data augmentations. While augmentation-based SSRL algorithms push the boundaries of performance in computer vision and natural language processing, they are often not directly applicable to other data modalities such as tabular and time-series data. This paper presents an SSRL approach that can be applied to these data modalities because it does not rely on augmentations or masking. Specifically, we show that high-quality data representations can be learned by reconstructing random data projections. We evaluate the proposed approach on real-world applications with tabular and time-series data. We show that it outperforms multiple state-of-the-art SSRL baselines and is competitive with methods built on domain-specific knowledge. Due to its wide applicability and strong empirical results, we argue that learning from randomness is a fruitful research direction worthy of attention and further study. |
Yi Sui · Tongzi Wu · Jesse Cresswell · Ga Wu · George Stein · Xiao Shi Huang · Xiaochen Zhang · Maksims Volkovs 🔗 |
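The pretext task above trains a representation network to regress fixed random projections of the input. A minimal sketch of how such targets could be constructed (the projector count and dimensions are illustrative; the encoder and training loop are omitted):

```python
import random

def random_projector(n_features, n_outputs, seed):
    # a fixed Gaussian projection matrix, frozen for the whole run
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(n_outputs)] for _ in range(n_features)]

def project(batch, W):
    """Regression targets y = x @ W for each row x in the batch."""
    return [[sum(x[i] * W[i][j] for i in range(len(x)))
             for j in range(len(W[0]))] for x in batch]

batch = [[1.0, 2.0, 3.0], [0.5, -1.0, 2.0]]
projectors = [random_projector(3, 4, seed=s) for s in range(8)]  # 8 pretext tasks
targets = [project(batch, W) for W in projectors]
print(len(targets), len(targets[0]), len(targets[0][0]))  # 8 tasks x 2 rows x 4 dims
```

The representation is then whatever intermediate features a network learns while predicting these targets; no augmentation or masking of the raw rows is needed.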
Fri 9:38 a.m. - 9:45 a.m.
|
GCondNet: A Novel Method for Improving Neural Networks on Small High-Dimensional Tabular Data
(
Spotlight
)
>
link
SlidesLive Video Neural network models often struggle with high-dimensional but small sample-size tabular datasets. One reason is that current weight initialisation methods assume independence between weights, which can be problematic when there are insufficient samples to estimate the model's parameters accurately. In such small-data scenarios, leveraging additional structure can improve the model's performance and training stability. To address this, we propose GCondNet, a general approach to enhance neural networks by leveraging implicit structures present in tabular data. We create a graph between samples for each data dimension, and utilise Graph Neural Networks (GNNs) to extract this implicit structure and to condition the parameters of the first layer of an underlying predictor network. By creating many small graphs, GCondNet exploits the data's high dimensionality, and thus improves the performance of an underlying predictor network. We demonstrate the effectiveness of our method on 9 real-world datasets, where GCondNet outperforms 15 standard and state-of-the-art methods. The results show that GCondNet is a versatile framework for injecting graph-regularisation into various types of neural networks, including MLPs and tabular Transformers. |
Andrei Margeloiu · Nikola Simidjievski · Pietro Lió · Mateja Jamnik 🔗 |
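The abstract above builds one graph between samples for every data dimension. A toy sketch of one plausible construction, connecting consecutive samples along each feature's sorted order (the exact graph-building rule here is an assumption for illustration, not the paper's definition):

```python
def per_dimension_edges(X):
    """For each feature, connect samples that are adjacent when sorted by that feature."""
    graphs = []
    for j in range(len(X[0])):
        order = sorted(range(len(X)), key=lambda i: X[i][j])
        edges = [(order[i], order[i + 1]) for i in range(len(order) - 1)]
        graphs.append(edges)
    return graphs

X = [[0.1, 5.0], [0.2, 1.0], [0.9, 1.1]]
print(per_dimension_edges(X))  # one small edge list per feature
```

With d features this yields d small graphs over the n samples, which is where the "many small graphs" in the abstract come from; a GNN run over them can then condition the first-layer weights of the predictor.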
Fri 9:46 a.m. - 9:53 a.m.
|
HyperFast: Instant Classification for Tabular Data
(
Spotlight
)
>
link
SlidesLive Video Training deep learning models and performing hyperparameter tuning can be computationally demanding and time-consuming. Meanwhile, traditional machine learning methods like gradient-boosting algorithms remain the preferred choice for most tabular data applications, while neural network alternatives require extensive hyperparameter tuning or work only on toy datasets under limited settings. In this paper, we introduce HyperFast, a meta-trained hypernetwork designed for instant classification of tabular data in a single forward pass. HyperFast generates a task-specific neural network tailored to an unseen dataset that can be directly used for classification inference, removing the need to train a model. We report extensive experiments with OpenML and genomic data, comparing HyperFast to competing tabular data neural networks, traditional ML methods, AutoML systems, and boosting machines. HyperFast shows highly competitive results, while being significantly faster. Additionally, our approach demonstrates robust adaptability across a variety of classification tasks with little to no fine-tuning, positioning HyperFast as a strong solution for numerous applications and rapid model deployment. HyperFast introduces a promising paradigm for fast classification, with the potential to substantially decrease the computational burden of deep learning. Our code, which offers a scikit-learn-like interface, along with the trained HyperFast model, can be found at www.url-hidden-for-submission. |
David Bonet · Daniel Mas Montserrat · Xavier Giró-i-Nieto · Alexander Ioannidis 🔗 |
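To illustrate the hypernetwork idea in miniature: a function maps a dataset summary to the weights of a small classifier, so an unseen dataset gets a ready-to-use model without a training loop. Everything below is a toy stand-in for the meta-trained HyperFast model, not its architecture.

```python
def dataset_summary(X, y):
    # crude summary: per-class feature means
    classes = sorted(set(y))
    return [[sum(x[j] for x, t in zip(X, y) if t == c) / sum(1 for t in y if t == c)
             for j in range(len(X[0]))] for c in classes]

def hypernetwork(summary):
    # toy "weight generator": the generated classifier is nearest-class-mean
    return summary

def predict(weights, x):
    # score each class by negative squared distance to its generated weights
    scores = [sum(-(a - b) ** 2 for a, b in zip(w, x)) for w in weights]
    return max(range(len(scores)), key=lambda c: scores[c])

X = [[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]]
y = [0, 0, 1, 1]
weights = hypernetwork(dataset_summary(X, y))  # one forward pass, no training
print(predict(weights, [0.05, 0.1]), predict(weights, [0.95, 1.0]))
```

The real system replaces both the summary and the generator with learned networks, but the workflow (summarize dataset, emit weights, classify immediately) is the same shape.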
Fri 9:54 a.m. - 10:01 a.m.
|
Training-Free Generalization on Heterogeneous Tabular Data via Meta-Representation
(
Spotlight
)
>
link
SlidesLive Video Tabular data is prevalent across various machine learning domains. Yet, the inherent heterogeneity of attribute and class spaces across different tabular datasets hinders the effective sharing of knowledge, limiting a tabular model's ability to benefit from other datasets. In this paper, we propose Tabular data Pre-Training via Meta-representation (TabPTM), which allows one tabular model to be pre-trained on a set of heterogeneous datasets. This pre-trained model can then be directly applied to unseen datasets that have diverse attributes and classes, without additional training. Specifically, TabPTM represents an instance through its distance to a fixed number of prototypes, thereby standardizing heterogeneous tabular datasets. A deep neural network is then trained to associate these meta-representations with dataset-specific classification confidences, endowing TabPTM with the ability of training-free generalization. Experiments validate that TabPTM achieves promising performance on new datasets, even under few-shot scenarios. |
Han-Jia Ye · Qile Zhou · De-Chuan Zhan 🔗 |
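The key standardization step above, re-encoding an instance as its distances to a fixed number of prototypes, can be sketched in a few lines. The prototype values are illustrative placeholders; how prototypes are chosen is the paper's business, not shown here.

```python
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def meta_representation(x, prototypes):
    """Distances to a fixed set of prototypes: a same-size vector for any dataset."""
    return [euclidean(x, p) for p in prototypes]

prototypes = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
print(meta_representation([1.0, 0.0], prototypes))
```

Because the output length equals the number of prototypes regardless of how many raw attributes a dataset has, a single downstream network can consume instances from heterogeneous tables.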
Fri 10:00 a.m. - 11:30 a.m.
|
Lunch Break
(
Break
)
>
|
Fri 11:30 a.m. - 12:00 p.m.
|
Invited talk: Next-Generation Data Management with Large Language Models
(
Talk
)
>
SlidesLive Video The past years have been marked by several breakthrough results in the domain of generative AI, culminating in the rise of tools like ChatGPT, able to solve a variety of language-related tasks without specialized training. In this talk, I discuss several recent research projects at Cornell, exploiting large language models to enhance relational database management systems. These projects cover applications of language models in the database interface, enabling users to specify high-level analysis goals for fully automated end-to-end analysis, as well as applications in the backend, using language models to extract useful information for data profiling and database tuning from text documents. |
Immanuel Trummer 🔗 |
Fri 12:00 p.m. - 12:07 p.m.
|
Tabular Representation, Noisy Operators, and Impacts on Table Structure Understanding Tasks in LLMs
(
Spotlight
)
>
link
SlidesLive Video Large language models (LLMs) are increasingly applied to tabular tasks using in-context learning. The prompt representation of a table may play a role in the LLM's ability to process the table. Inspired by prior work, we generate a collection of self-supervised structural tasks (e.g. navigate to a cell and row; transpose the table) and evaluate the performance differences when using 8 formats. In contrast to past work, we introduce 8 noise operations inspired by real-world messy data and adversarial inputs, and show that such operations can impact LLM performance across formats for different structural understanding tasks. |
Ananya Singha · José Cambronero · Sumit Gulwani · Vu Le · Chris Parnin 🔗 |
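To make the "formats" concrete, here is a toy serializer producing two common prompt representations of the same table, a markdown grid and a CSV dump. These two are generic illustrations, not necessarily among the eight formats the paper evaluates.

```python
header = ["city", "population"]
rows = [["Berlin", "3.7M"], ["Paris", "2.1M"]]

def to_markdown(header, rows):
    # pipe-delimited grid with a separator row, as commonly used in prompts
    lines = ["| " + " | ".join(header) + " |",
             "| " + " | ".join("---" for _ in header) + " |"]
    lines += ["| " + " | ".join(r) + " |" for r in rows]
    return "\n".join(lines)

def to_csv(header, rows):
    # flat comma-separated dump (assumes no commas inside cells)
    return "\n".join(",".join(r) for r in [header] + rows)

print(to_markdown(header, rows))
print(to_csv(header, rows))
```

The paper's point is that the choice between such representations, and noise applied to them, measurably shifts an LLM's structural understanding of the same underlying table.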
Fri 12:08 p.m. - 12:15 p.m.
|
How to Prompt LLMs for Text-to-SQL: A Study in Zero-shot, Single-domain, and Cross-domain Settings
(
Spotlight
)
>
link
SlidesLive Video Large language models (LLMs) with in-context learning have demonstrated remarkable capability in the text-to-SQL task. Previous research has prompted LLMs with various demonstration-retrieval strategies and intermediate reasoning steps to enhance the performance of LLMs. However, those works often employ varied strategies when constructing the prompt text for text-to-SQL inputs, such as databases and demonstration examples. This leads to a lack of comparability in both the prompt constructions and their primary contributions. Furthermore, selecting an effective prompt construction has emerged as a persistent problem for future research. To address this limitation, we comprehensively investigate the impact of prompt constructions across various settings and provide insights into prompt constructions for future text-to-SQL studies. |
Shuaichen Chang · Eric Fosler-Lussier 🔗 |
Fri 12:16 p.m. - 12:23 p.m.
|
IngesTables: Scalable and Efficient Training of LLM-Enabled Tabular Foundation Models
(
Spotlight
)
>
link
SlidesLive Video There is a massive amount of tabular data that can be leveraged via 'foundation models' to improve prediction performance for downstream tabular prediction tasks. However, numerous challenges constitute bottlenecks in building tabular foundation models, including learning semantic relevance between tables and features, mismatched schemas, arbitrarily high cardinality for categorical values, and scalability to many tables, rows, and features. We propose IngesTables, a novel canonical tabular foundation model building framework designed to address the aforementioned challenges. IngesTables employs LLMs to encode representations of table/feature semantics and their relationships, which are then modeled via an attention-based tabular architecture. Unlike other LLM-based approaches, IngesTables is much cheaper to train and faster to run inference, because of how LLM-generated embeddings are defined and cached. We show that IngesTables demonstrates significant improvements over commonly-used models like XGBoost on clinical trial datasets in standard supervised learning settings, and is competitive with tabular prediction models that are specialized for clinical trial datasets, without incurring LLM-level cost and latency. |
Scott Yak · Yihe Dong · Javier Gonzalvo · Sercan Arik 🔗 |
Fri 12:30 p.m. - 1:00 p.m.
|
Invited talk: Advancing Natural Language Interfaces to Data with Language Models as Agents
(
Talk
)
>
SlidesLive Video Traditional Natural Language Interfaces (NLIs) to data often necessitate users to provide detailed, step-by-step instructions, reflecting an assumption of user familiarity with the underlying data and systems, which can limit accessibility. The emergence of Large Language Models (LLMs) has, however, revolutionized NLIs, enabling them to perform sophisticated reasoning, decision-making, and planning multi-step actions in diverse environments autonomously. In this talk, I will discuss how these language models as agents facilitate a paradigm shift towards moving beyond traditional code generation to more autonomous and user-friendly NLIs, capable of understanding high-level objectives without requiring intricate directives. I will also present our latest work in this direction, including instruction-finetuned retrievers for diverse environment adaptation, the enhancement of LLM capabilities with tool integration, and the development of open, state-of-the-art LLMs and platforms for constructing such language agents. The talk will conclude with an exploration of the current and future research prospects in this rapidly evolving domain. |
Tao Yu 🔗 |
Fri 1:00 p.m. - 1:20 p.m.
|
Coffee Break + poster setup
(
Break
)
>
|
Fri 1:20 p.m. - 2:00 p.m.
|
Poster Session 2
(
Poster Session
)
>
|
Fri 2:00 p.m. - 2:30 p.m.
|
Invited talk: Enabling Large Language Models to Reason with Tables
(
Talk
)
>
SlidesLive Video Large language models (LLMs) are becoming attractive as few-shot reasoners to solve Natural Language (NL)-related tasks. However, there is still much to learn about how well LLMs understand structured data, such as tables. While it is true that tables can be used as inputs to LLMs with serialization, there lack comprehensive studies examining whether LLMs can truly comprehend such data. In this talk, I will cover different ways to utilize LLMs to interface with tables. One approach is to feed the whole table as a sequence to LLMs for reasoning. In this direction, we will talk about the recent paper GPT4Table to summarize the lessons learned in different table linearization strategies, including table input format, content order, role prompting, and partition marks. The other approach is to use tools like SQL or other language to interface with a table for data access without feeding the entire table. LLMs will work as a reasoner to derive the answer based on the interfaced results from the table. |
Wenhu Chen 🔗 |
Fri 2:30 p.m. - 3:15 p.m.
|
Panel - TBA
(
Panel
)
>
SlidesLive Video |
Fri 3:15 p.m. - 3:30 p.m.
|
Closing notes
(
Talk
)
>
SlidesLive Video |
-
|
MultiTabQA: Generating Tabular Answers for Multi-Table Question Answering
(
Poster
)
>
link
Recent advances in tabular question answering (QA) with large language models are constrained in their coverage and only answer questions over a single table. However, real-world queries are complex in nature, often spanning multiple tables in a relational database or web page. Single-table questions do not involve common table operations such as set operations, Cartesian products (joins), or nested queries. Furthermore, multi-table operations often result in a tabular output, which necessitates table generation capabilities in tabular QA models. To fill this gap, we propose a new task of answering questions over multiple tables. Our model, MultiTabQA, not only answers questions over multiple tables, but also generalizes to generate tabular answers. To enable effective training, we build a pre-training dataset comprising 132,645 SQL queries and tabular answers. Further, we evaluate the generated tables by introducing table-specific metrics of varying strictness, assessing various levels of granularity of the table structure. MultiTabQA outperforms state-of-the-art single-table QA models adapted to a multi-table QA setting by finetuning on three datasets: Spider, Atis and GeoQuery. |
Vaishali Pal · Andrew Yates · Evangelos Kanoulas · Maarten Rijke 🔗 |
-
|
Generating Data Augmentation Queries Using Large Language Models
(
Poster
)
>
link
Users often want to augment entities in their datasets with relevant information. As many external sources are accessible only via keyword-search interfaces, a user usually has to manually formulate a keyword query that extracts relevant information for each entity. This is challenging, as many data sources contain numerous tuples, only a small fraction of which may be relevant. Moreover, different datasets may represent the same information in distinct forms and under different terms. In such cases, it is difficult to formulate a query that precisely retrieves information relevant to a specific entity. Current methods for information enrichment mainly rely on resource-intensive manual effort to formulate queries to discover relevant information. However, it is often important for users to get initial answers quickly and without substantial investment in resources (such as human attention). We propose a progressive approach to discovering entity-relevant information from external sources with minimal expert intervention. It leverages end users' feedback to progressively learn how to retrieve information relevant to each entity in a dataset from external data sources. To bootstrap performance, we use a pre-trained large language model (LLM) to produce rich representations of entities. We evaluate the use of parameter-efficient techniques for aligning the LLM's representations with our downstream task of online query policy learning. |
Christopher Buss · Jasmin Mousavi · Mikhail Tokarev · Arash Termehchy · David Maier · Stefan Lee 🔗 |
-
|
ReConTab: Regularized Contrastive Representation Learning for Tabular Data
(
Poster
)
>
link
Representation learning stands as one of the critical machine learning techniques across various domains. Through the acquisition of high-quality features, pre-trained embeddings significantly reduce input space redundancy, benefiting downstream pattern recognition tasks such as classification, regression, or detection. Nonetheless, in the domain of tabular data, feature engineering and selection still heavily rely on manual intervention, leading to time-consuming processes and necessitating domain expertise. In response to this challenge, we introduce ReConTab, a deep automatic representation learning framework with regularized contrastive learning. Agnostic to any type of modeling task, ReConTab constructs an asymmetric autoencoder based on the same raw features from model inputs, producing low-dimensional representative embeddings. Specifically, regularization techniques are applied for raw feature selection. Meanwhile, ReConTab leverages contrastive learning to distill the most pertinent information for downstream tasks. Experiments conducted on extensive real-world datasets substantiate the framework's capacity to yield substantial and robust performance improvements. Furthermore, we empirically demonstrate that pre-trained embeddings can seamlessly integrate as easily adaptable features, enhancing the performance of various traditional methods such as XGBoost and Random Forest. |
Suiyao Chen · Jing Wu · NAIRA HOVAKIMYAN · Handong Yao 🔗 |
-
|
Unlocking the Transferability of Tokens in Deep Models for Tabular Data
(
Poster
)
>
link
Fine-tuning a pre-trained deep neural network has become a successful paradigm in various machine learning tasks. However, such a paradigm becomes particularly challenging with tabular data when there are discrepancies between the feature sets of pre-trained models and the target tasks. In this paper, we propose TabToken, a method that aims at enhancing the quality of feature tokens (i.e., embeddings of tabular features). TabToken allows for the utilization of pre-trained models when the upstream and downstream tasks share overlapping features, facilitating model fine-tuning even with limited training examples. Specifically, we introduce a contrastive objective that regularizes the tokens, capturing the semantics within and across features. During the pre-training stage, the tokens are learned jointly with top-layer deep models such as transformers. In the downstream task, tokens of the shared features are kept fixed while TabToken efficiently fine-tunes the remaining parts of the model. TabToken not only enables knowledge transfer from a pre-trained model to tasks with heterogeneous features, but also enhances the discriminative ability of deep tabular models in standard classification and regression tasks. |
Qile Zhou · Han-Jia Ye · Leye Wang · De-Chuan Zhan 🔗 |
-
|
Augmentation for Context in Financial Numerical Reasoning over Textual and Tabular Data with Large-Scale Language Model
(
Poster
)
>
link
Constructing large-scale datasets for numerical reasoning over tabular and textual data in the financial domain is particularly challenging. Moreover, even the commonly used augmentation techniques for dataset construction prove ineffective at augmenting financial datasets. To address this challenge, this paper proposes a context augmentation methodology for enhancing financial datasets, which generates new contexts for the original questions. To do this, we leverage the hallucination capability of large-scale generative language models. Specifically, by prompting the language model with generation instructions and constraints, together with the original dataset's questions and arithmetic programs, we create plausible contexts that provide evidence for the given questions. The experimental results showed that reasoning performance improved when we augmented the FinQA dataset using our methodology and trained the model with it. |
Yechan Hwang · Jinsu Lim · Young-Jun Lee · Ho-Jin Choi 🔗 |
-
|
TabContrast: A Local-Global Level Method for Tabular Contrastive Learning
(
Poster
)
>
link
Representation learning is a cornerstone of contemporary artificial intelligence, significantly boosting performance across diverse downstream tasks. Notably, domains like computer vision and NLP have witnessed transformative advancements owing to self-supervised contrastive learning techniques. Yet, the translation of these techniques to tabular data remains an intricate challenge. Traditional approaches, especially within the tabular arena, tend to explore model architecture and loss function design, often overlooking the nuanced creation of positive and negative sample pairs. These pairs are vital, shaping the quality of the learned representations and the overall model efficacy. Recognizing this imperative, our paper probes the specificities of tabular data and the unique challenges it presents. As a solution, we introduce "TabContrast". This method adopts a local-global contrast approach, segmenting features into subsets and subsequently performing tailored clustering to unveil inherent data patterns. By aligning samples with cluster centroids and emphasizing clear semantic distinctions, TabContrast promises enhanced representation efficacy. Preliminary evaluations highlight its potential, particularly in tabular datasets with more features available. |
Hao Liu · Yixin Chen · Bradley A Fritz · Christopher King 🔗 |
-
|
Explaining Explainers: Necessity and Sufficiency in Tabular Data
(
Poster
)
>
link
In recent years, ML classifiers trained on tabular data have been used to make efficient and fast decisions for various decision-making tasks. The lack of transparency in the decision-making processes of these models has led to the emergence of EXplainable AI (XAI). However, discrepancies exist among XAI programs, raising concerns about their accuracy. The notion of what an "important" and "relevant" feature is differs across explanation strategies. Thus, grounding them using theoretically backed ideas of necessity and sufficiency can prove to be a reliable way to increase their trustworthiness. We propose a novel approach to quantify these two concepts in order to provide a means to explore which explanation method might be suitable for tasks involving sparse, high-dimensional tabular datasets. Moreover, our global necessity and sufficiency scores aim to help experts correlate their domain knowledge with our findings, and also provide an extra basis for evaluating the results of popular local explanation methods like LIME and SHAP. |
Prithwijit Chowdhury · Mohit Prabhushankar · Ghassan AlRegib 🔗 |
-
|
Beyond Individual Input for Deep Anomaly Detection on Tabular Data
(
Poster
)
>
link
Anomaly detection is vital in many domains, such as finance, healthcare, and cybersecurity. In this paper, we propose a novel deep anomaly detection method for tabular data that leverages Non-Parametric Transformers (NPTs), a model initially proposed for supervised tasks, to capture both feature-feature and sample-sample dependencies. In a reconstruction-based framework, we train the NPT to reconstruct masked features of normal samples. In a non-parametric fashion, we leverage the whole training set during inference and use the model's ability to reconstruct the masked features to generate an anomaly score. To the best of our knowledge, this is the first work to successfully combine feature-feature and sample-sample dependencies for anomaly detection on tabular datasets. Through extensive experiments on 31 benchmark tabular datasets, we demonstrate that our method achieves state-of-the-art performance, outperforming existing methods by 2.4% and 1.2% in terms of F1-score and AUROC, respectively. Our ablation study provides evidence that modeling both types of dependencies is crucial for anomaly detection on tabular data. |
Hugo Thimonier · Fabrice Popineau · Arpad Rimmel · Bich-Liên DOAN 🔗 |
-
|
GradTree: Learning Axis-Aligned Decision Trees with Gradient Descent
(
Poster
)
>
link
Decision Trees (DTs) are commonly used for many machine learning tasks due to their high degree of interpretability. However, learning a DT from data is a difficult optimization problem, as it is non-convex and non-differentiable. Therefore, common approaches learn DTs using a greedy growth algorithm that minimizes the impurity locally at each internal node. Unfortunately, this greedy procedure can lead to inaccurate trees. In this paper, we present a novel approach for learning hard, axis-aligned DTs with gradient descent. The proposed method uses backpropagation with a straight-through operator on a dense DT representation to jointly optimize all tree parameters. Our approach outperforms existing methods on a wide range of binary classification benchmarks and is available at: https://github.com/s-marton/GradTree |
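The straight-through trick the abstract refers to can be sketched generically as follows (an illustrative reading, not the authors' implementation; the sigmoid surrogate and `temperature` are assumptions):

```python
import numpy as np

def soft_split(x, threshold, temperature=0.1):
    # differentiable surrogate: probability of routing a sample right
    return 1.0 / (1.0 + np.exp(-(x - threshold) / temperature))

def straight_through_split(x, threshold, temperature=0.1):
    """Hard axis-aligned split in the forward pass. In an autograd
    framework one would return soft + stop_gradient(hard - soft), so
    the forward value is hard but gradients flow through the sigmoid."""
    hard = (x > threshold).astype(float)
    soft = soft_split(x, threshold, temperature)
    return hard, soft

x = np.array([0.2, 0.8, 1.5])
hard, soft = straight_through_split(x, threshold=1.0)
# hard routing: only the last sample crosses the threshold
```

This lets all split thresholds of a dense tree representation be updated jointly by gradient descent instead of grown greedily.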
Sascha Marton · Stefan Lüdtke · Christian Bartelt · Heiner Stuckenschmidt 🔗 |
-
|
Elephants Never Forget: Testing Language Models for Memorization of Tabular Data
(
Poster
)
>
link
While many have shown how Large Language Models (LLMs) can be applied to a diverse set of tasks, the critical issues of data contamination and memorization are often glossed over. In this work, we address this concern for tabular data. Starting with simple qualitative tests for whether an LLM knows the names and values of features, we introduce a variety of different techniques to assess the degrees of contamination, including statistical tests for conditional distribution modeling and four tests that identify memorization. Our investigation reveals that LLMs are pre-trained on many popular tabular datasets. This exposure can lead to invalid performance evaluation on downstream tasks because the LLMs have, in effect, been fit to the test set. Interestingly, we also identify a regime where the language model reproduces important statistics of the data, but fails to reproduce the dataset verbatim. On these datasets, although seen during training, good performance on downstream tasks might not be due to overfitting. Our findings underscore the need for ensuring data integrity in machine learning tasks with LLMs. To facilitate future research, we release an open-source tool that can perform various tests for memorization: https://github.com/tabmem/tool. |
Sebastian Bordt · Harsha Nori · Rich Caruana 🔗 |
-
|
InterpreTabNet: Enhancing Interpretability of Tabular Data Using Deep Generative Models and Large Language Models
(
Poster
)
>
link
Tabular data are omnipresent in various sectors of industry. Neural networks for tabular data such as TabNet have been proposed to make predictions while leveraging the attention mechanism for interpretability. We find that the inferred attention masks on high-dimensional data are often dense, hindering interpretability. To remedy this, we propose InterpreTabNet, a variant of the TabNet model that models the attention mechanism as a latent variable sampled from a Gumbel-Softmax distribution. This enables us to regularize the model to learn distinct concepts in the attention masks via a KL divergence regularizer. It prevents overlapping feature selection, which maximizes the model's efficacy and improves interpretability. To automate the interpretation of the features from our model, we employ GPT-4 and use prompt engineering to map the learned feature mask onto natural language text describing the learned signal. Through comprehensive experiments on real-world datasets, we demonstrate that InterpreTabNet outperforms previous methods for interpreting tabular data while attaining competitive accuracy. |
Jacob Yoke Hong Si · Rahul Krishnan · Michael Cooper · Wendy Yusi Cheng 🔗 |
-
|
On Incorporating new Variables during Evaluation
(
Poster
)
>
link
Any classification or regression model needs access at inference time to the same features that were used to train it. In real-world scenarios, however, models often remain in operation for years, and new variables/features may become available only at the inference stage. To use such features conventionally, their values would have to be captured in the dataset that was used for training. We propose a model-agnostic approach in which a model trained without access to these features can still benefit from the additional features available during testing. We show that, without any access to the extra features during the training phase, the proposed approach improves model performance on four real-world tabular datasets. We provide extensive analysis of how and which variables drive the improvement over the model trained without the extra feature. |
Harsimran Bhasin · Soumyadeep Ghosh 🔗 |
-
|
GCondNet: A Novel Method for Improving Neural Networks on Small High-Dimensional Tabular Data
(
Poster
)
>
link
Neural network models often struggle with high-dimensional but small sample-size tabular datasets. One reason is that current weight initialisation methods assume independence between weights, which can be problematic when there are insufficient samples to estimate the model's parameters accurately. In such small data scenarios, leveraging additional structures can improve the model's performance and training stability. To address this, we propose GCondNet, a general approach to enhance neural networks by leveraging implicit structures present in tabular data. We create a graph between samples for each data dimension and utilise Graph Neural Networks (GNNs) to extract this implicit structure and to condition the parameters of the first layer of an underlying predictor network. By creating many small graphs, GCondNet exploits the data's high-dimensionality and thus improves the performance of the underlying predictor network. We demonstrate the effectiveness of our method on 9 real-world datasets, where GCondNet outperforms 15 standard and state-of-the-art methods. The results show that GCondNet is a versatile framework for injecting graph-regularisation into various types of neural networks, including MLPs and tabular Transformers. |
Andrei Margeloiu · Nikola Simidjievski · Pietro Lió · Mateja Jamnik 🔗 |
-
|
High-Performance Transformers for Table Structure Recognition Need Early Convolutions
(
Poster
)
>
link
Table structure recognition (TSR) aims to convert tabular images into a machine-readable format, where a visual encoder extracts image features and a textual decoder generates table-representing tokens. Existing approaches use classic convolutional neural network (CNN) backbones for the visual encoder and transformers for the textual decoder. However, this hybrid CNN-Transformer architecture introduces a complex visual encoder that accounts for nearly half of the total model parameters, markedly reduces both training and inference speed, and hinders the potential for self-supervised learning in TSR. In this work, we design a lightweight visual encoder for TSR without sacrificing expressive power. We discover that a convolutional stem can match classic CNN backbone performance, with a much simpler model. The convolutional stem strikes an optimal balance between two crucial factors for high-performance TSR: a higher receptive field (RF) ratio and a longer sequence length. This allows it to "see" an appropriate portion of the table and "store" the complex table structure within sufficient context length for the subsequent transformer. We conducted reproducible ablation studies and open-sourced our code at https://anonymous.4open.science/r/NeurIPS23-TRL-2 to enhance transparency, inspire innovations, and facilitate fair comparisons in our domain as tables are a promising modality for representation learning. |
ShengYun Peng · Seongmin Lee · Xiaojing Wang · Rajarajeswari Balasubramaniyan · Duen Horng Chau 🔗 |
-
|
Unnormalized Density Estimation with Root Sobolev Norm Regularization
(
Poster
)
>
link
Density estimation is one of the central problems in non-parametric statistical learning. While parametric neural network-based methods have achieved notable success in fields such as image and text, their non-parametric counterparts lag, particularly in higher dimensions. Non-parametric methods, known for their conceptual simplicity and explicit model bias, can offer enhanced interpretability and more effective regularization control in smaller data regimes or other data modalities. We propose a new approach to non-parametric density estimation that is based on regularizing a Sobolev norm of the density. This method is statistically consistent, is different from Kernel Density Estimation, and makes the inductive bias of the model clear and interpretable. Our method is assessed against the comprehensive ADBench suite for tabular anomaly detection, ranking second among over 15 algorithms, all of which are specifically tailored for anomaly detection in tabular data. The contributions of this paper are as follows: 1. While there is no closed analytic form for the associated kernel, we show that one can approximate it using sampling. 2. The optimization problem needed to determine the density is non-convex, and standard gradient methods do not perform well. However, we show that with an appropriate initialization and using natural gradients, one can obtain well-performing solutions. 3. While the approach provides unnormalized densities, which prevents the use of log-likelihood for cross-validation, we show that one can instead adapt Fisher Divergence-based score matching methods for this task. |
Mark Kozdoba · Binyamin Perets · Shie Mannor 🔗 |
-
|
Self-supervised Representation Learning from Random Data Projectors
(
Poster
)
>
link
Self-supervised representation learning (SSRL) has advanced considerably by exploiting the transformation-invariance assumption under artificially designed data augmentations. While augmentation-based SSRL algorithms push the boundaries of performance in computer vision and natural language processing, they are often not directly applicable to other data modalities such as tabular and time-series data. This paper presents an SSRL approach that can be applied to these data modalities because it does not rely on augmentations or masking. Specifically, we show that high-quality data representations can be learned by reconstructing random data projections. We evaluate the proposed approach on real-world applications with tabular and time-series data. We show that it outperforms multiple state-of-the-art SSRL baselines and is competitive with methods built on domain-specific knowledge. Due to its wide applicability and strong empirical results, we argue that learning from randomness is a fruitful research direction worthy of attention and further study. |
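The core recipe, learning by reconstructing fixed random projections instead of using augmentations, can be sketched as below (a minimal illustration; the dimensions and the MSE objective are assumptions, and the encoder itself is elided):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 16))            # raw tabular samples

# Fixed random projectors define the self-supervised targets:
# no augmentations or masking are required, only matrix products.
n_projectors, proj_dim = 4, 8
projectors = [rng.normal(size=(16, proj_dim)) for _ in range(n_projectors)]
targets = [X @ P for P in projectors]     # one regression target per projector

def reconstruction_loss(pred, target):
    # an encoder plus one small head per projector would minimize this
    return float(np.mean((pred - target) ** 2))
```

Because the targets are plain linear maps of the input, the same setup applies unchanged to tabular or time-series rows.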
Yi Sui · Tongzi Wu · Jesse Cresswell · Ga Wu · George Stein · Xiao Shi Huang · Xiaochen Zhang · Maksims Volkovs 🔗 |
-
|
Tree-Regularized Tabular Embeddings
(
Poster
)
>
link
Tabular neural networks (NNs) have attracted remarkable attention, and their recent advances have gradually narrowed the performance gap with respect to tree-based models on many public datasets. While mainstream work focuses on calibrating NNs to fit tabular data, we emphasize the importance of homogeneous embeddings and instead concentrate on regularizing tabular inputs through supervised pretraining. Specifically, we extend a recent work named DeepTLF and utilize the structure of pretrained tree ensembles to transform raw variables into a single vector (T2V) or an array of tokens (T2T). Without loss of space efficiency, these binarized embeddings can be directly consumed by canonical tabular NNs with fully-connected or attention-based building blocks. Through quantitative experiments on 88 OpenML datasets with binary classification tasks, we validate that the proposed tree-regularized representations not only narrow the gap with respect to tree-based models, but also achieve on-par or better performance compared with advanced NN models. Most importantly, they possess better robustness and can easily be scaled and generalized as standalone encoders for the tabular modality. |
Xuan Li · Yun Wang · Bo Li 🔗 |
-
|
Binning as a Pretext Task: Improving Self-Supervised Learning in Tabular Domains
(
Poster
)
>
link
The ability of deep networks to learn superior representations hinges on leveraging the proper inductive biases, considering the inherent properties of datasets. In tabular domains, it is critical to effectively handle heterogeneous features (both categorical and numerical) in a unified manner and to grasp irregular functions like piecewise constant functions. To address these challenges in the self-supervised learning framework, we propose a novel pretext task based on the classical binning method. The idea is straightforward: reconstructing the bin indices (either orders or classes) rather than the original values. This pretext task provides the encoder with an inductive bias to capture the irregular dependencies, mapping from continuous inputs to discretized bins, and mitigates the feature heterogeneity by setting all features to have category-type targets. Our empirical investigations ascertain several advantages of binning: compatibility with encoder architecture and additional modifications, standardizing all features into equal sets, grouping similar values within a feature, and providing ordering information. Comprehensive evaluations across diverse tabular datasets corroborate that our method consistently improves tabular representation learning performance for a wide range of downstream tasks. |
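The binning pretext target can be computed in a few lines (a sketch assuming quantile bins; the encoder and the reconstruction head are elided):

```python
import numpy as np

def bin_targets(X, n_bins=10):
    """Replace each numerical feature by its quantile-bin index; the
    pretext task reconstructs these indices instead of raw values, so
    every feature becomes a category-type target."""
    X = np.asarray(X, dtype=float)
    targets = np.empty(X.shape, dtype=int)
    for j in range(X.shape[1]):
        # interior quantile edges group similar values within a feature
        edges = np.quantile(X[:, j], np.linspace(0, 1, n_bins + 1)[1:-1])
        targets[:, j] = np.searchsorted(edges, X[:, j], side="right")
    return targets
```

Quantile edges equalize bin occupancy while preserving the ordering information the abstract highlights.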
Kyungeun Lee · Ye Seul Sim · Hyeseung Cho · Suhee Yoon · Sanghyu Yoon · Woohyung Lim 🔗 |
-
|
A Deep Learning Blueprint for Relational Databases
(
Poster
)
>
link
We introduce a modular neural message-passing scheme that closely follows the formal model of relational databases, effectively enabling end-to-end deep learning directly from database storages. We experiment with several instantiations of the scheme, including notably the use of cross-attention modules to capture the referential constraints of the relational model. We address the issues of efficient learning data representation and loading, salient to the database setting, and compare against representative models from a number of related fields, demonstrating favorable initial results. |
Lukáš Zahradník · Jan Neumann · Gustav Šír 🔗 |
-
|
Scaling TabPFN: Sketching and Feature Selection for Tabular Prior-Data Fitted Networks
(
Poster
)
>
link
Tabular classification has traditionally relied on supervised algorithms, which estimate the parameters of a prediction model using its training data. Recently, Prior-Data Fitted Networks such as TabPFN have successfully learned to classify tabular data in-context: the model parameters are designed to classify new samples based on labelled training samples given after the model training. While such models show great promise, their applicability to real-world data remains limited due to the computational scale needed. We conduct an initial investigation of sketching and feature-selection methods for TabPFN, and note certain key differences between it and conventionally fitted tabular models. |
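The kind of preprocessing under study can be illustrated as follows (a generic sketch combining uniform row sketching with a variance-based feature filter; the actual methods compared in the paper may differ):

```python
import numpy as np

def sketch_for_icl(X, y, max_rows=1000, max_feats=100, rng=None):
    # Shrink a training table to fit an in-context learner's budget
    # (models like TabPFN have hard limits on rows and features).
    rng = rng or np.random.default_rng(0)
    if X.shape[0] > max_rows:
        idx = rng.choice(X.shape[0], size=max_rows, replace=False)
        X, y = X[idx], y[idx]
    if X.shape[1] > max_feats:
        # keep the highest-variance columns (one simple selection rule)
        keep = np.argsort(X.var(axis=0))[::-1][:max_feats]
        X = X[:, keep]
    return X, y
```

The reduced table is then passed as in-context "training" examples alongside the query points, with no parameter updates.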
Benjamin Feuer · Niv Cohen · Chinmay Hegde 🔗 |
-
|
Modeling string entries for tabular data prediction: do we need big large language models?
(
Poster
)
>
link
Tabular data are often characterized by numerical and categorical features, but these co-exist with features made of text entries, such as names or descriptions. Here, we investigate whether language models can extract information from these text entries. Studying 19 datasets and varying training sizes, we find that using language models to encode text features improves predictions over no encoding and over character-level approaches based on substrings. Furthermore, we find that larger, more advanced language models yield more significant improvements. |
Leo Grinsztajn · Myung Jun Kim · Edouard Oyallon · Gael Varoquaux 🔗 |
-
|
HyperFast: Instant Classification for Tabular Data
(
Poster
)
>
link
Training deep learning models and performing hyperparameter tuning can be computationally demanding and time-consuming. Meanwhile, traditional machine learning methods like gradient-boosting algorithms remain the preferred choice for most tabular data applications, while neural network alternatives require extensive hyperparameter tuning or work only on toy datasets under limited settings. In this paper, we introduce HyperFast, a meta-trained hypernetwork designed for instant classification of tabular data in a single forward pass. HyperFast generates a task-specific neural network tailored to an unseen dataset that can be directly used for classification inference, removing the need to train a model. We report extensive experiments with OpenML and genomic data, comparing HyperFast to competing tabular data neural networks, traditional ML methods, AutoML systems, and boosting machines. HyperFast shows highly competitive results, while being significantly faster. Additionally, our approach demonstrates robust adaptability across a variety of classification tasks with little to no fine-tuning, positioning HyperFast as a strong solution for numerous applications and rapid model deployment. HyperFast introduces a promising paradigm for fast classification, with the potential to substantially decrease the computational burden of deep learning. Our code, which offers a scikit-learn-like interface, along with the trained HyperFast model, can be found at www.url-hidden-for-submission. |
David Bonet · Daniel Mas Montserrat · Xavier Giró-i-Nieto · Alexander Ioannidis 🔗 |
-
|
Hopular: Modern Hopfield Networks for Tabular Data
(
Poster
)
>
link
While Deep Learning excels in structured data as encountered in vision and natural language processing, it has failed to meet expectations on tabular data. For tabular data, Support Vector Machines (SVMs), Random Forests, and Gradient Boosting are the best performing techniques, with Gradient Boosting in the lead. Recently, we saw a surge of Deep Learning methods that were tailored to tabular data but still underperform compared to Gradient Boosting on small-sized datasets. We suggest "Hopular", a novel Deep Learning architecture for medium- and small-sized datasets, where each layer is equipped with continuous modern Hopfield networks. The modern Hopfield networks use stored data to identify feature-feature, feature-target, and sample-sample dependencies. Hopular's novelty is that every layer can directly access the original input as well as the whole training set via stored data in the Hopfield networks. Therefore, Hopular can step-wise update its current model and the resulting prediction at every layer like standard iterative learning algorithms. In experiments on small-sized tabular datasets with less than 1,000 samples, Hopular surpasses Gradient Boosting, Random Forests, SVMs, and in particular several Deep Learning methods. In experiments on medium-sized tabular data with about 10,000 samples, Hopular outperforms XGBoost, CatBoost, LightGBM, and a state-of-the-art Deep Learning method designed for tabular data. Thus, Hopular is a strong alternative to these methods on tabular data. |
Bernhard Schäfl · Lukas Gruber · Angela Bitto · Sepp Hochreiter 🔗 |
-
|
Training-Free Generalization on Heterogeneous Tabular Data via Meta-Representation
(
Poster
)
>
link
Tabular data is prevalent across various machine learning domains. Yet, the inherent heterogeneities in attribute and class spaces across different tabular datasets hinder the effective sharing of knowledge, limiting a tabular model's ability to benefit from other datasets. In this paper, we propose Tabular data Pre-Training via Meta-representation (TabPTM), which pre-trains one tabular model on a set of heterogeneous datasets. This pre-trained model can then be directly applied to unseen datasets that have diverse attributes and classes without additional training. Specifically, TabPTM represents an instance through its distance to a fixed number of prototypes, thereby standardizing heterogeneous tabular datasets. A deep neural network is then trained to associate these meta-representations with dataset-specific classification confidences, endowing TabPTM with the ability of training-free generalization. Experiments validate that TabPTM achieves promising performance on new datasets, even under few-shot scenarios. |
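The meta-representation described above amounts to a plain distance computation (an illustrative reading of the abstract; how the prototypes themselves are chosen is not specified here):

```python
import numpy as np

def meta_representation(X, prototypes):
    """Represent each instance by its Euclidean distance to a fixed set
    of prototypes, so tables with different numbers of attributes all
    map to vectors of the same length, len(prototypes)."""
    diffs = X[:, None, :] - prototypes[None, :, :]   # (n, k, d)
    return np.sqrt((diffs ** 2).sum(axis=-1))        # (n, k)
```

Because every dataset is reduced to the same k-dimensional distance vector, a single downstream network can consume instances from heterogeneous tables.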
Han-Jia Ye · Qile Zhou · De-Chuan Zhan 🔗 |
-
|
NeuroDB: Efficient, Privacy-Preserving and Robust Query Answering with Neural Networks
(
Poster
)
>
link
The Neural Database framework, or NeuroDB for short, is a novel means of query answering using neural networks. It uses neural networks as a form of data storage by training them to directly answer queries: networks are trained to take queries as input and output query answer estimates. In doing so, relational tables are represented by neural network weights and are queried through a model forward pass. NeuroDB has shown significant practical advantages in (1) approximate query processing, (2) privacy-preserving query answering, and (3) querying incomplete datasets. The success of the NeuroDB framework can be attributed to the approach learning patterns present in the query answers, which it uses to build a compact representation of the dataset with respect to the queries. This allows learning small neural networks that accurately and efficiently represent query answers. Meanwhile, learning such patterns improves accuracy in the presence of error, with this robustness to noise allowing for improved accuracy in private query answering and query answering on incomplete datasets. This paper presents an overview of the NeuroDB framework and its applications to the three aforementioned scenarios. |
Sepanta Zeighami · Cyrus Shahabi 🔗 |
-
|
A DB-First approach to query factual information in LLMs
(
Poster
)
>
link
In many use-cases, information is stored in text but not available in structured data. However, extracting data from natural language (NL) text to precisely fit a schema, and thus enable querying, is a challenging task. With the rise of pre-trained Large Language Models (LLMs), there is now an effective solution to store and use information extracted from massive corpora of text documents. Thus, we envision the use of SQL queries to cover a broad range of data that is not captured by traditional databases (DBs) by tapping the information in LLMs. This ability enables querying the factual information in LLMs with the SQL interface, which is more precise than NL prompts. We present a traditional DB architecture using physical operators for querying the underlying LLM. The key idea is to execute some operators of the query plan with prompts that retrieve data from the LLM. For a large class of SQL queries, querying LLMs returns well structured relations, with encouraging qualitative results. |
Mohammed SAEED · Nicola De Cao · Paolo Papotti 🔗 |
-
|
A Performance-Driven Benchmark for Feature Selection in Tabular Deep Learning
(
Poster
)
>
link
Academic tabular benchmarks often contain small sets of curated features. In contrast, data scientists typically collect as many features as possible into their datasets, and even engineer new features from existing ones. To prevent over-fitting in subsequent downstream modeling, practitioners commonly use automated feature selection methods that identify a reduced subset of informative features. Existing benchmarks for tabular feature selection consider classical downstream models, toy synthetic datasets, or do not evaluate feature selectors on the basis of downstream performance. We construct a challenging feature selection benchmark evaluated on downstream neural networks including transformers, using real datasets and multiple methods for generating extraneous features. We also propose an input-gradient-based analogue of LASSO for neural networks that outperforms classical feature selection methods on challenging problems such as selecting from corrupted or second-order features. |
Valeriia Cherepanova · Roman Levin · Gowthami Somepalli · Jonas Geiping · C. Bayan Bruss · Andrew Wilson · Tom Goldstein · Micah Goldblum 🔗 |
-
|
Incorporating LLM Priors into Tabular Learners
(
Poster
)
>
link
We present a method to integrate Large Language Models (LLMs) and traditional tabular data classification techniques, addressing LLMs’ challenges like data serialization sensitivity and biases. We introduce two strategies utilizing LLMs for ranking categorical variables and generating priors on correlations between continuous variables and targets, enhancing performance in few-shot scenarios. We focus on Logistic Regression, introducing MonotonicLR that employs a non-linear monotonic function for mapping ordinals to cardinals while preserving LLM-determined orders. Validation against baseline models reveals the superior performance of our approach, especially in low-data scenarios, while remaining interpretable. |
Max Zhu · Siniša Stanivuk · Andrija Petrovic · Mladen Nikolic · Pietro Lió 🔗 |
-
|
CHORUS: Foundation Models for Unified Data Discovery and Exploration
(
Poster
)
>
link
We apply foundation models to data discovery and exploration tasks. Foundation models are large language models (LLMs) that show promising performance on a range of diverse tasks unrelated to their training. We show that these models are highly applicable to the data discovery and data exploration domain. When carefully used, they have superior capability on three representative tasks: table-class detection, column-type annotation, and join-column prediction. On all three tasks, we show that a foundation-model-based approach outperforms the task-specific models and thus the state of the art. Further, our approach often surpasses human-expert task performance. We investigate the fundamental characteristics of this approach, including generalizability across several foundation models and dataset contamination. All in all, this suggests a future direction in which disparate data management tasks can be unified under foundation models. |
Moe Kayali · Anton Lykov · Ilias Fountalis · Nikolaos Vasiloglou · Dan Olteanu · Dan Suciu 🔗 |
-
|
Tabular Representation, Noisy Operators, and Impacts on Table Structure Understanding Tasks in LLMs
(
Poster
)
>
link
Large language models (LLMs) are increasingly applied to tabular tasks using in-context learning. The prompt representation of a table may play a role in the LLM's ability to process it. Inspired by prior work, we generate a collection of self-supervised structural tasks (e.g. navigate to a cell and row; transpose the table) and evaluate the performance differences when using 8 formats. In contrast to past work, we introduce 8 noise operations inspired by real-world messy data and adversarial inputs, and show that such operations can impact LLM performance across formats for different structural understanding tasks. |
Ananya Singha · José Cambronero · Sumit Gulwani · Vu Le · Chris Parnin 🔗 |
-
|
Introducing the Observatory Library for End-to-End Table Embedding Inference
(
Poster
)
>
link
Transformer-based tabular language models have become prevalent for a wide range of applications involving tabular data. Such models require the serialization of a table as a sequence of tokens for model ingestion and embedding inference. Different downstream tasks require different kinds or levels of embeddings such as column or entity embeddings. Hence, various serialization and encoding methods have been proposed and implemented. Surprisingly, this conceptually simple process of creating table embeddings is not straightforward in practice for a few reasons: 1) a model may not natively expose a certain level of embedding; 2) choosing the correct table serialization and input preprocessing methods is difficult because there are many available; and 3) tables with a massive number of rows and columns cannot fit the input limit of models. In this work, we extend Observatory, a framework for characterizing embeddings of relational tables, by streamlining end-to-end inference of table embeddings, which eases the use of tabular language models in practice. The codebase of Observatory is publicly available at https://github.com/superctj/observatory. |
Tianji Cong · Zhenjie Sun · Paul Groth · H. V. Jagadish · Madelon Hulsebos 🔗 |
-
|
Scaling Experiments in Self-Supervised Cross-Table Representation Learning
(
Poster
)
>
link
To analyze the scaling potential of deep tabular representation learning models, we introduce a novel Transformer-based architecture specifically tailored to tabular data and cross-table representation learning, utilizing table-specific tokenizers and a shared Transformer backbone. Our training approach encompasses both single-table and cross-table models, trained via missing value imputation through a self-supervised masked cell recovery objective. To understand the scaling behavior of our method, we train models of varying sizes, ranging from approximately $10^4$ to $10^7$ parameters. These models are trained on a carefully curated pretraining dataset consisting of 135M training tokens sourced from 76 diverse datasets. We assess the scaling of our architecture in both single-table and cross-table pretraining setups by evaluating the pretrained models using linear probing on a curated set of benchmark datasets and comparing the results with conventional baselines.
|
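The masked-cell-recovery objective can be set up as below (a minimal sketch; the mask rate and the NaN placeholder stand in for whatever mask token the table-specific tokenizer uses):

```python
import numpy as np

def mask_cells(X, mask_rate=0.15, rng=None):
    # Hide a random fraction of cells; the model is trained to impute
    # the hidden values from the remaining, visible cells.
    rng = rng or np.random.default_rng(0)
    mask = rng.random(X.shape) < mask_rate
    X_masked = X.copy()
    X_masked[mask] = np.nan
    return X_masked, mask
```

The same objective works for single-table and cross-table pretraining, since only the tokenizer is table-specific.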
Maximilian Schambach · Dominique Paul · Johannes Otterbach 🔗 |
-
|
Benchmarking Tabular Representation Models in Transfer Learning Settings
(
Poster
)
>
link
Deep learning has revolutionized the transfer of knowledge between similar tasks in data modalities such as images, text, and graphs. However, the same level of success has not been attained for tabular data. This disparity can be attributed to the inherent absence of structural characteristics, such as spatial and temporal correlations, within common tabular datasets. Moreover, classic methods such as logistic regression and decision trees have been shown to perform competitively with deep learning methods. In this work, we benchmark classic and deep learning methods specifically in the transfer learning setting. We offer new benchmarking results for the EHR phenotyping task on the MetaMIMIC dataset and propose a new transfer learning setting of transferring mortality prediction from common to rare cancers with The Cancer Genome Atlas (TCGA). |
Qixuan Jin · Talip Ucar 🔗 |
-
|
Exploring the Retrieval Mechanism for Tabular Deep Learning
(
Poster
)
>
link
While interest in tabular deep learning has grown significantly, conventional tree-based models still outperform deep learning methods. To narrow this performance gap, we explore the retrieval mechanism, a methodology that allows neural networks to refer to other data points while making predictions. Our experiments reveal that retrieval-based training, especially when fine-tuning the pretrained TabPFN model, notably surpasses existing methods. Moreover, extensive pretraining plays a crucial role in enhancing model performance. These insights imply that blending the retrieval mechanism with pretraining and transfer learning schemes offers considerable potential for advancing the field of tabular deep learning. |
Felix den Breejen · Sangmin Bae · Stephen Cha · Tae-Young Kim · Seoung Hyun Koh · Se-Young Yun 🔗 |
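At its simplest, the retrieval mechanism above lets a model condition its prediction for a query on nearby training rows. A minimal nearest-neighbour stand-in conveys the intuition (data and function names are toy illustrations; the paper fine-tunes TabPFN to attend over retrieved rows rather than averaging labels):

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def retrieve_and_predict(query, train_X, train_y, k=3):
    """Predict for `query` by retrieving the k nearest training rows
    and averaging their labels, a minimal stand-in for letting a
    network attend over retrieved neighbours."""
    neighbours = sorted(range(len(train_X)),
                        key=lambda i: euclidean(query, train_X[i]))[:k]
    return sum(train_y[i] for i in neighbours) / k

train_X = [[0.0, 0.0], [0.1, 0.1], [1.0, 1.0], [1.1, 0.9]]
train_y = [0.0, 0.0, 1.0, 1.0]
print(retrieve_and_predict([0.05, 0.05], train_X, train_y, k=2))  # → 0.0
```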
-
|
In Defense of Zero Imputation for Tabular Deep Learning
(
Poster
)
>
link
Missing values are a common problem in many supervised learning contexts. While a wealth of literature exists on missing value imputation, far less has focused on the impact of imputation on downstream supervised learning. Recently, impute-then-predict neural networks have been proposed as a powerful solution to this problem, allowing for joint optimization of imputations and predictions. In this paper, we illustrate a somewhat surprising result: multi-layer perceptrons (MLPs) paired with zero imputation perform as well as more powerful deep impute-then-predict models on real-world data. To support this finding, we analyze the results of various deep impute-then-predict models to better understand why they fail to outperform zero imputation. Our analysis sheds light on the difficulties of imputation in real-world contexts and highlights the utility of zero imputation for tabular deep learning. |
John Van Ness · Madeleine Udell 🔗 |
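The zero-imputation baseline defended above is simple to state in code. A minimal sketch, assuming missing entries are represented as `None`; the helper name and the optional missingness-indicator columns are illustrative, not the paper's implementation:

```python
def zero_impute(rows, add_mask=False):
    """Replace missing entries (None) with 0.0; optionally append
    per-feature missingness indicators, a common companion trick."""
    out = []
    for row in rows:
        filled = [0.0 if v is None else float(v) for v in row]
        if add_mask:
            filled += [1.0 if v is None else 0.0 for v in row]
        out.append(filled)
    return out

rows = [[1.0, None], [None, 2.0]]
print(zero_impute(rows))                 # [[1.0, 0.0], [0.0, 2.0]]
print(zero_impute(rows, add_mask=True))  # [[1.0, 0.0, 0.0, 1.0], [0.0, 2.0, 1.0, 0.0]]
```

The imputed matrix can then be fed directly to an MLP; the paper's point is that this simple pairing is a surprisingly strong baseline.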
-
|
Data Ambiguity Strikes Back: How Documentation Improves GPT's Text-to-SQL
(
Poster
)
>
link
Text-to-SQL allows experts to use databases without in-depth knowledge of them. However, real-world tasks involve both query and data ambiguities. Most work on Text-to-SQL has focused on query ambiguities and designed chat interfaces for experts to provide clarifications. In contrast, the data management community has long studied data ambiguities, but mainly to detect and correct errors rather than to document ambiguities for disambiguation in data tasks. This work delves into data ambiguities in real-world datasets. We identify prevalent data ambiguities (value consistency, data coverage, and data granularity) that affect downstream tasks. We examine how documentation, originally created to help humans disambiguate data, can help GPT-4 with Text-to-SQL tasks. By offering documentation on these ambiguities, we find that GPT-4's performance improves by 28.9%.
|
Zachary Huang · Pavan Kalyan Damalapati · Eugene Wu 🔗 |
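The intervention studied above amounts to prepending data documentation to the model's prompt. A sketch of such prompt construction, with a hypothetical format, schema, and notes (not the paper's exact prompt template):

```python
def build_prompt(schema, docs, question):
    """Assemble a text-to-SQL prompt that prepends data documentation
    (value conventions, coverage, granularity notes) to the schema."""
    doc_block = "\n".join(f"-- NOTE: {d}" for d in docs)
    return f"{schema}\n{doc_block}\n-- Question: {question}\nSELECT"

prompt = build_prompt(
    "CREATE TABLE sales(region TEXT, amount REAL);",
    ["region uses ISO codes, e.g. 'US', not 'USA'",
     "amount is recorded monthly, not daily"],
    "Total sales for the US?",
)
print(prompt)
```

Without the notes, a model might generate `WHERE region = 'USA'`; the documentation resolves exactly this kind of value-consistency ambiguity.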
-
|
IngesTables: Scalable and Efficient Training of LLM-Enabled Tabular Foundation Models
(
Poster
)
>
link
There is a massive amount of tabular data that can be taken advantage of via `foundation models' to improve prediction performance for downstream tabular prediction tasks. However, numerous challenges constitute bottlenecks in building tabular foundation models, including learning semantic relevance between tables and features, mismatched schemas, arbitrarily high cardinality for categorical values, and scalability to many tables, rows, and features. We propose \texttt{IngesTables}, a novel canonical tabular foundation model building framework designed to address the aforementioned challenges. \texttt{IngesTables} employs LLMs to encode representations of table/feature semantics and their relationships, which are then modeled via an attention-based tabular architecture. Unlike other LLM-based approaches, \texttt{IngesTables} is much cheaper to train and faster at inference because of how LLM-generated embeddings are defined and cached. We show that \texttt{IngesTables} demonstrates significant improvements over commonly-used models like XGBoost on clinical trial datasets in standard supervised learning settings, and is competitive with tabular prediction models that are specialized for clinical trial datasets, without incurring LLM-level cost and latency. |
Scott Yak · Yihe Dong · Javier Gonzalvo · Sercan Arik 🔗 |
-
|
Pool-Search-Demonstrate: Improving Data-wrangling LLMs via better in-context examples
(
Poster
)
>
link
Data-wrangling is a process that transforms raw data for further analysis and for use in downstream tasks. Recently, it has been shown that foundation models can be successfully used for data-wrangling tasks (Narayan et al., 2022). An important aspect of data wrangling with LMs is properly constructing prompts for the given task. Within these prompts, a crucial component is the choice of in-context examples. In the previous study of Narayan et al., demonstration examples were chosen manually by the authors, which may not scale to new datasets. In this work, we propose a simple demonstration strategy that individualizes demonstration examples for each input by selecting them from a pool based on their distance in the embedding space. Additionally, we propose a postprocessing method that exploits the embedding of labels under a closed-world assumption. Empirically, our embedding-based example retrieval and postprocessing improve foundation models' performance by up to 84\% over randomly selected examples and 49\% over manually selected examples in the demonstration. Ablation tests reveal the effect of class embeddings, and of various factors in demonstration such as quantity, quality, and diversity. |
Joon Suk Huh · Changho Shin · Elina Choi 🔗 |
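The pool-based selection described above can be sketched as ranking a demonstration pool by embedding similarity to the query. Toy 2-d vectors stand in for a real embedding model, and all names here are illustrative:

```python
import math

def nearest_demonstrations(query_emb, pool, k=2):
    """Pick the k pool examples whose embeddings lie closest to the
    query embedding (cosine similarity); these become the in-context
    demonstrations for that particular input."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb)
    ranked = sorted(pool, key=lambda ex: cos(query_emb, ex["emb"]), reverse=True)
    return [ex["text"] for ex in ranked[:k]]

pool = [
    {"text": "Acme Corp -> Acme", "emb": [1.0, 0.0]},
    {"text": "12/31/21 -> 2021-12-31", "emb": [0.0, 1.0]},
    {"text": "ACME Inc. -> Acme", "emb": [0.9, 0.1]},
]
print(nearest_demonstrations([1.0, 0.05], pool, k=2))
```

A query about company-name normalization thus retrieves the two name-cleaning demonstrations rather than the unrelated date-formatting one.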
-
|
How to Prompt LLMs for Text-to-SQL: A Study in Zero-shot, Single-domain, and Cross-domain Settings
(
Poster
)
>
link
Large language models (LLMs) with in-context learning have demonstrated remarkable capability in the text-to-SQL task. Previous research has prompted LLMs with various demonstration-retrieval strategies and intermediate reasoning steps to enhance the performance of LLMs. However, those works often employ varied strategies when constructing the prompt text for text-to-SQL inputs, such as databases and demonstration examples. This leads to a lack of comparability in both the prompt constructions and their primary contributions. Furthermore, selecting an effective prompt construction has emerged as a persistent problem for future research. To address this limitation, we comprehensively investigate the impact of prompt constructions across various settings and provide insights into prompt constructions for future text-to-SQL studies. |
Shuaichen Chang · Eric Fosler-Lussier 🔗 |
-
|
TabPFGen – Tabular Data Generation with TabPFN
(
Poster
)
>
link
Advances in deep generative modelling have not translated well to tabular data. We argue that this is caused by a mismatch in structure between popular generative models and discriminative models of tabular data. We thus devise a technique to turn TabPFN -- a highly performant transformer initially designed for in-context discriminative tabular tasks -- into an energy-based generative model, which we dub TabPFGen. This novel framework leverages the pre-trained TabPFN as part of the energy function and does not require any additional training or hyperparameter tuning, thus inheriting TabPFN's in-context learning capability. We can sample from TabPFGen analogously to other energy-based models. We demonstrate strong results on standard generative modelling tasks, including data augmentation, class-balancing, and imputation, unlocking a new frontier of tabular data generation. |
Jeremy (Junwei) Ma · Apoorv Dankar · George Stein · Guangwei Yu · Anthony Caterini 🔗 |
-
|
Multitask-Guided Self-Supervised Tabular Learning for Patient-Specific Survival Prediction
(
Poster
)
>
link
Survival prediction, central to the analysis of clinical trials, has the potential to be transformed by the availability of RNA-seq data, which reveals the underlying molecular and genetic mechanisms of disease and outcomes. However, the number of RNA-seq samples available for understudied or rare diseases is often limited. To address this, leveraging data across different cancer types can be a viable solution, necessitating the application of self-supervised learning techniques. Yet this wealth of data often comes in a tabular format without a known structure, hindering the development of a generally effective augmentation method for survival prediction. While traditional methods have been constrained by a one-cancer-one-model philosophy or have relied solely on a single modality, our approach, Guided-STab, instead offers a comprehensive alternative: pretraining on all available RNA-seq data from various cancer types while guiding the representation by incorporating sparse clinical features as auxiliary tasks. With a multitask-guided self-supervised representation learning framework, we maximize the potential of vast unlabeled datasets from various cancer types, leading to genomics-driven survival predictions. The auxiliary clinical tasks then guide the learned representations to emphasize critical survival factors. Extensive experiments reinforce the promise of our approach, as Guided-STab consistently outperforms established benchmarks on the TCGA dataset. |
You Wu · Omid Bazgir · Yongju Lee · Tommaso Biancalani · James Lu · Ehsan Hajiramezanali 🔗 |
-
|
Testing the Limits of Unified Sequence to Sequence LLM Pretraining on Diverse Table Data Tasks
(
Poster
)
>
link
Tables stored in databases and tables present in web pages and articles account for a large part of the semi-structured data available on the internet. This motivates the need to develop a modeling approach with large language models (LLMs) that can be used to solve diverse table tasks such as semantic parsing, question answering, and classification. Traditionally, there existed separate sequence-to-sequence models specialized for each table task individually. This raises the question of how far we can go in building a unified model that works well on some table tasks without significant degradation on others. To that end, we attempt to create a shared modeling approach in the pretraining stage with encoder-decoder-style LLMs that can cater to diverse tasks. We evaluate our approach, which continually pretrains and finetunes different T5 model families with data from tables and their surrounding context, on these downstream tasks at different model scales. Through multiple ablation studies, we observe that our pretraining with self-supervised objectives can significantly boost the performance of the models on these tasks. Our work is the first attempt at studying the advantages of a unified approach to table-specific pretraining when scaling from 770M to 11B sequence-to-sequence models, while also comparing the instruction-finetuned variants of the models. |
Soumajyoti Sarkar · Leonard Lausen 🔗 |