Cross-Lingual Technologies: Text to Logic Mapping, Search and Classification over 100 Languages
Jan Rupnik · Andrej Muhic · Blaz Fortuna · Janez Starc · Marko Grobelnik · Michael J Witbrock

We demonstrate two approaches that enable language-independent document representations. The first is based on factorization techniques, in which documents are expressed in terms of multilingual topic vectors. The second is based on mapping documents to statements in first-order logic. Our demonstration has two parts. The first part demonstrates a scalable solution to cross-lingual document retrieval and classification over 100 languages. Users will be able to input a document in any of the supported languages, find similar documents in other languages, and classify the document in the Open Directory Project taxonomy. The second part demonstrates a solution to the problem of text understanding, which aims to enable machines to “understand” and reason about the semantics of human-composed text. The goal is to extract inferentially capable knowledge from textual documents of a given domain at an economically viable cost. The audience will be able to participate in the process of making “text to logic” mappings and reasoning about the extracted knowledge.
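
As a rough illustration of the first approach, the sketch below maps documents in two languages into one shared topic space and ranks them by cosine similarity. It is a toy under stated assumptions, not the demonstrated system: the abstract does not specify the factorization method, so the hand-written `TOPIC_WEIGHTS` table stands in for projections that would be learned from multilingual data, and the names `topic_vector`, `cosine`, and `most_similar` are hypothetical.

```python
import math
from collections import Counter

# Hypothetical, hand-written word-to-topic weights (assumption: a real system
# would learn these by factorizing multilingual data). Each language maps its
# own vocabulary onto the same two topics: (sports, finance).
TOPIC_WEIGHTS = {
    "en": {"football": (1.0, 0.0), "goal": (0.8, 0.1),
           "bank": (0.0, 1.0), "loan": (0.1, 0.9)},
    "de": {"fussball": (1.0, 0.0), "tor": (0.8, 0.1),
           "bank": (0.0, 1.0), "kredit": (0.1, 0.9)},
}
NUM_TOPICS = 2


def topic_vector(text, lang):
    """Project a document onto the shared, language-independent topic space."""
    counts = Counter(text.lower().split())
    vec = [0.0] * NUM_TOPICS
    for word, n in counts.items():
        weights = TOPIC_WEIGHTS[lang].get(word)
        if weights is None:
            continue  # words outside the toy vocabulary are ignored
        for k in range(NUM_TOPICS):
            vec[k] += n * weights[k]
    return vec


def cosine(u, v):
    """Cosine similarity of two topic vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar(query, query_lang, corpus):
    """Return the (lang, text) pair in corpus closest to the query."""
    q = topic_vector(query, query_lang)
    return max(corpus, key=lambda doc: cosine(q, topic_vector(doc[1], doc[0])))


corpus = [("de", "fussball tor"), ("de", "bank kredit")]
best = most_similar("football goal", "en", corpus)
```

Because both languages are projected onto the same topic axes, the English query and the German documents can be compared directly, without translating either side.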

Jan Rupnik (Jozef Stefan Institute)
Andrej Muhic (Jozef Stefan Institute)
Blaz Fortuna (Jozef Stefan Institute)
Janez Starc (Jozef Stefan Institute)
Marko Grobelnik (Jozef Stefan Institute)
Michael J Witbrock (Cycorp)

