

Search All 2023 Events
 

14 Results

Poster
Tue 15:15 MADLAD-400: A Multilingual And Document-Level Large Audited Dataset
Sneha Kudugunta · Isaac Caswell · Biao Zhang · Xavier Garcia · Derrick Xin · Aditya Kusupati · Romi Stella · Ankur Bapna · Orhan Firat
Workshop
Navigating Dataset Documentation in ML: A Large-Scale Analysis of Dataset Cards on Hugging Face
Xinyu Yang · Weixin Liang · James Zou
Workshop
Latent Diffusion for Document Generation with Sequential Decoding
Zihuiwen Ye · Elle Michelle Yang · Phil Blunsom
Workshop
Knowledge Graph Prompting for Multi-Document Question Answering
Yu Wang · Nedim Lipka · Ryan Rossi · Alexa Siu · Ruiyi Zhang · Tyler Derr
Workshop
Learning Interpretable Libraries by Compressing and Documenting Code
Gabriel Grand · Catherine Wong · Matthew Bowers · Theo X. Olausson · Muxin Liu · Josh Tenenbaum · Jacob Andreas
Workshop
Fri 7:38 Data Ambiguity Strikes Back: How Documentation Improves GPT's Text-to-SQL
Zachary Huang · Pavan Kalyan Damalapati · Eugene Wu
Competition
Fri 9:25 Presentation from Participants – Winner FL Only Track: Communication Tuned Low-Rank Adaptation of Document Encoder
Aashiq Muhamed
Competition
Fri 7:00 Privacy Preserving Federated Learning Document VQA
Dimosthenis Karatzas · Rubèn Tito · Lei Kang · Mohamed Ali Souibgui · Khanh Nguyen · Raouf Kerkouche · Kangsoo Jung · Marlon Tobaben · Joonas Jälkö · Vincent Poulain d'Andecy · Aurélie Joseph · Ernest Valveny · Josep Llados · Antti Honkela · Mario Fritz
Poster
Thu 15:00 D4: Improving LLM Pretraining via Document De-Duplication and Diversification
Kushal Tirumala · Daniel Simig · Armen Aghajanyan · Ari Morcos
Poster
Tue 8:45 M5HisDoc: A Large-scale Multi-style Chinese Historical Document Analysis Benchmark
Yongxin Shi · Chongyu Liu · Dezhi Peng · Cheng Jian · Jiarong Huang · Lianwen Jin
Poster
Thu 15:00 OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents
Hugo Laurençon · Lucile Saulnier · Leo Tronchon · Stas Bekman · Amanpreet Singh · Anton Lozhkov · Thomas Wang · Siddharth Karamcheti · Alexander Rush · Douwe Kiela · Matthieu Cord · Victor Sanh
Poster
Tue 8:45 WordScape: a Pipeline to extract multilingual, visually rich Documents with Layout Annotations from Web Crawl Data
Maurice Weber · Carlo Siebenschuh · Rory Butler · Anton Alexandrov · Valdemar Thanner · Georgios Tsolakis · Haris Jabbar · Ian Foster · Bo Li · Rick Stevens · Ce Zhang