14 Results
Poster | Tue 15:15 | MADLAD-400: A Multilingual And Document-Level Large Audited Dataset | Sneha Kudugunta · Isaac Caswell · Biao Zhang · Xavier Garcia · Derrick Xin · Aditya Kusupati · Romi Stella · Ankur Bapna · Orhan Firat
Workshop | | Navigating Dataset Documentation in ML: A Large-Scale Analysis of Dataset Cards on Hugging Face | Xinyu Yang · Weixin Liang · James Zou
Workshop | | Latent Diffusion for Document Generation with Sequential Decoding | Zihuiwen Ye · Elle Michelle Yang · Phil Blunsom
Workshop | | Knowledge Graph Prompting for Multi-Document Question Answering | Yu Wang · Nedim Lipka · Ryan Rossi · Alexa Siu · Ruiyi Zhang · Tyler Derr
Workshop | | Learning Interpretable Libraries by Compressing and Documenting Code | Gabriel Grand · Catherine Wong · Matthew Bowers · Theo X. Olausson · Muxin Liu · Josh Tenenbaum · Jacob Andreas
Workshop | Fri 7:38 | Data Ambiguity Strikes Back: How Documentation Improves GPT's Text-to-SQL | Zachary Huang · Pavan Kalyan Damalapati · Eugene Wu
Competition | Fri 9:25 | Presentation from Participants – Winner FL Only Track: Communication Tuned Low-Rank Adaptation of Document Encoder | Aashiq Muhamed
Competition | Fri 7:00 | Privacy Preserving Federated Learning Document VQA | Dimosthenis Karatzas · Rubèn Tito · Lei Kang · Mohamed Ali Souibgui · Khanh Nguyen · Raouf Kerkouche · Kangsoo Jung · Marlon Tobaben · Joonas Jälkö · Vincent Poulain d'Andecy · Aurélie JOSEPH · Ernest Valveny · Josep Llados · Antti Honkela · Mario Fritz
Poster | Thu 15:00 | D4: Improving LLM Pretraining via Document De-Duplication and Diversification | Kushal Tirumala · Daniel Simig · Armen Aghajanyan · Ari Morcos
Poster | Tue 8:45 | M5HisDoc: A Large-scale Multi-style Chinese Historical Document Analysis Benchmark | Yongxin Shi · Chongyu Liu · Dezhi Peng · Cheng Jian · Jiarong Huang · Lianwen Jin
Poster | Thu 15:00 | OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents | Hugo Laurençon · Lucile Saulnier · Leo Tronchon · Stas Bekman · Amanpreet Singh · Anton Lozhkov · Thomas Wang · Siddharth Karamcheti · Alexander Rush · Douwe Kiela · Matthieu Cord · Victor Sanh
Poster | Tue 8:45 | WordScape: a Pipeline to extract multilingual, visually rich Documents with Layout Annotations from Web Crawl Data | Maurice Weber · Carlo Siebenschuh · Rory Butler · Anton Alexandrov · Valdemar Thanner · Georgios Tsolakis · Haris Jabbar · Ian Foster · Bo Li · Rick Stevens · Ce Zhang