Knowledge Extraction from Text (KET)
Marko Grobelnik · Blaz Fortuna · Estevam Hruschka · Michael J Witbrock

Tue Dec 10 07:30 AM -- 06:30 PM (PST) @ Harvey's Emerald Bay 2

Text understanding is an old, yet unsolved, AI problem consisting of a number of nontrivial steps. The critical step in solving the problem is knowledge acquisition from text, i.e. a transition from non-formalized text into a formalized, actionable language (one that supports reasoning). Other steps in the text understanding pipeline include linguistic processing, reasoning, text generation, search, and question answering, which are more or less solved to a degree that allows the composition of a text understanding service. Knowledge acquisition, on the other hand, remains the key bottleneck: it can be done by humans, but automating the process in its full breadth is still out of reach.

After failed attempts in the past (due to a lack of theoretical and technological prerequisites), interest in text understanding and knowledge acquisition from text has been growing in recent years. A number of AI research groups are dealing with various aspects of the problem in the areas of computational linguistics, machine learning, probabilistic and logical reasoning, and the semantic web. What the newer approaches have in common is the use of machine learning to deal with representational change. Some of the groups working in the area:

• Carnegie Mellon University (Never-Ending Language Learning: http://rtw.ml.cmu.edu/rtw/)
• Cycorp (Semantic Construction Grammar: http://www.cyc.com/)
• IBM Research (Watson project: http://www.ibm.com/watson)
• IDIAP Research Institute (Deep Learning for NLP: http://publications.idiap.ch/index.php/authors/show/336)
• Jozef Stefan Institute (Cross-Lingual Knowledge-Extraction: http://xlike.org)
• KU Leuven (Spatial Role Labelling via Machine Learning for SEMEVAL)
• Max Planck Institut (YAGO project: http://www.mpi-inf.mpg.de/yago-naga/yago/)
• MIT Media Lab (ConceptNet: http://conceptnet5.media.mit.edu/)
• University of Washington (Open Information Extraction: http://openie.cs.washington.edu/)
• Vulcan Inc. (Semantic Inferencing on Large Knowledge: http://silk.semwebcentral.org/)

Apart from the above projects, there is a noticeable increase of interest among technology companies (such as Google, Microsoft, IBM) as well as big publishers (such as NYTimes, BBC, Bloomberg) in employing semantic technologies in their services, moving towards understanding of unstructured data beyond shallow, representation-poor text-mining and information-retrieval techniques.

Workshop objective: Since all of the attempts listed above make extensive use of machine learning and probabilistic approaches, the goal of the workshop is to bring together key researchers and practitioners from the area to exchange ideas, approaches and techniques used to deal with text understanding and related knowledge acquisition problems.

Author Information

Marko Grobelnik (Jozef Stefan Institute)
Blaz Fortuna (Jozef Stefan Institute)
Estevam Hruschka (Amazon)
Michael J Witbrock (Cycorp Inc)

More from the Same Authors