Document Intelligence
Nigel Duffy · Rama Akkiraju · Tania Bedrax Weiss · Paul Bennett · Hamid Reza Motahari-Nezhad

Sat Dec 14th 08:00 AM -- 06:00 PM @ West 208 + 209
Business documents are central to the operation of business. Such documents include sales agreements, vendor contracts, mortgage terms, loan applications, purchase orders, invoices, financial statements, employment agreements and many more. The information in such business documents is presented in natural language and can be organized in a variety of ways, from straight text to multi-column formats and a wide variety of tables. Understanding these documents is challenging because of inconsistent formats, poor quality scans and OCR, internal cross references, and complex document structure. Furthermore, these documents often reflect complex legal agreements and reference, explicitly or implicitly, regulations, legislation, case law and standard business practices.
The ability to read, understand and interpret business documents, collectively referred to here as “Document Intelligence”, is a critical and challenging application of artificial intelligence (AI) in business. While a variety of research has advanced the fundamentals of document understanding, the majority has focused on documents found on the web, which fail to capture the complexity of analysis and types of understanding needed across business documents. Realizing the vision of document intelligence remains a research challenge that requires a multi-disciplinary perspective spanning not only natural language processing and understanding, but also computer vision, knowledge representation and reasoning, information retrieval, and more, all of which have been profoundly impacted and advanced by neural network-based approaches and deep learning in the last few years.
We propose to organize a workshop for AI researchers, academics and industry practitioners to discuss the opportunities and challenges for document intelligence.

08:00 AM Opening Remarks (Discussion)
08:10 AM David Lewis: Artificial Intelligence in Legal Discovery (Invited Talk) Dave Lewis
09:05 AM Ndapa Nakashole: Generalizing Representations of Language for Documents Analysis across Different Domains (Invited Talk) Ndapa Nakashole
10:00 AM Coffee Break (Break)
10:30 AM Poster Teaser Presentations (Spotlights)
12:05 PM Posters (Poster Session / Lunch)
Timo I. Denk, Ion Androutsopoulos, Oleg Bakhteev, Hassan Kane, Petar Stojanov, Seunghyun Park, Bharat Mamidibathula, Kostiantyn Liepieshov, Johannes Höhne, Song Feng, Zikri Bayraktar, Kehinde Aruleba, Aleksandr Ogaltsov, Rita Kuznetsova, Paul Bennett, Kshtij Fadnis, Luis Lastras, Mehrdad Jabbarzadeh Gangeh, Christian Reisswig, Emad Elwany, Ilias Chalkidis, Jonathan DeGange, Kaixuan Zhang, Luke de Oliveira, Muhammed Koçyiğit, Haoyu Dong, Vera Liao, Wonseok Hwang
01:30 PM Rajasekar Krishnamurthy: Document Intelligence for Enterprise AI Applications: Requirements & Research Challenges (Invited Talk) Rajasekar Krishnamurthy
02:30 PM Asli Celikyilmaz: Learning Structure in Text Generation (Invited Talk) Asli Celikyilmaz
03:30 PM Coffee Break (Break)
04:00 PM Discussion: Document Intelligence Research Challenges & Directions (Discussion)
05:00 PM Best Paper Talk: BERTGrid Contextualized Embedding for 2D Document Representation and Understanding (Talk)
05:30 PM Summary of Workshop and Closing Remarks (Discussion)

Author Information

Nigel Duffy (EY)
Rama Akkiraju (IBM Research - Almaden)
Tania Bedrax Weiss (Google)
Paul Bennett (Microsoft Research)
Hamid Reza Motahari-Nezhad (EY AI Lab, USA)