UCT CS Research Document Archive

Multilingual Querying

Mustafa Ali, Mohammed and Hussein Suleman (2011) Multilingual Querying. In Proceedings of the Arabic Language Technology International Conference (ALTIC) 2011, Alexandria, Egypt.

Abstract

Non-English-speaking users, such as Arabic speakers, are not always able to express terminology in their native languages, especially in scientific domains. This difficulty forces many Arabic authors and scholars to use English terms to explain precise concepts, resulting in mixed (multilingual) queries that contain both English and Arabic terms. Current cross-language information retrieval (CLIR) techniques are optimized for monolingual queries, even when those queries are translated, and neither mixed-language queries nor searches for mixed-language documents have yet been adequately studied. This paper attempts to address the problem of multilingual querying in CLIR. It shows experimentally that current search engines and IR systems are not language-aware and are inadequate for multilingual querying. The paper then presents the main ingredients that every language-aware solution should address.

EPrint Type: Conference Paper
Keywords: Multilingual Query, Mixed document, Multilingual Information Retrieval, Language-aware
Subjects: H Information Systems: H.3 INFORMATION STORAGE AND RETRIEVAL
ID Code: 746
Deposited By: Suleman, Hussein
Deposited On: 12 December 2011