UCT CS Research Document Archive

Combining Dictionary and Web-mined Resources for Arabic-English Cross Language Information Retrieval

Abdalla, Hisham and Hussein Suleman (2008) Combining Dictionary and Web-mined Resources for Arabic-English Cross Language Information Retrieval. In Proceedings 10th Annual Conference of WWW Applications, University of Cape Town, Cape Town.

Abstract

In cross-language information retrieval (CLIR), queries in a source language are used to retrieve relevant documents in a target language. Dictionary-based translation is a traditional approach used by CLIR systems. However, it performs poorly when queries contain words that do not appear in the dictionary, known as out-of-vocabulary (OOV) words. Combining resources such as dictionaries with Web mining techniques can play an important role in solving this problem.
In this research we propose combining Machine Readable Dictionaries (MRD) with Web mining techniques. Previous work and experimentation on other language pairs, such as Chinese-English, showed that the OOV problem can be effectively alleviated using these techniques. This project therefore adapts these techniques to Arabic-English CLIR.
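
The combination described above can be illustrated with a minimal sketch. It is not the authors' implementation: the dictionary entries, the web-mined candidates with their scores, and the function names `translate_term` and `translate_query` are all hypothetical, chosen only to show a dictionary lookup that falls back to web-mined translations for OOV terms.

```python
from typing import Dict, List, Tuple

# Toy machine-readable dictionary (MRD): Arabic term -> English translations.
# These entries are illustrative only.
MRD: Dict[str, List[str]] = {
    "كتاب": ["book"],
    "مكتبة": ["library", "bookshop"],
}

# Hypothetical web-mined candidates for terms missing from the MRD.
# Each candidate carries a made-up confidence score.
WEB_MINED: Dict[str, List[Tuple[str, float]]] = {
    "غوغل": [("google", 0.9), ("search", 0.2)],
}


def translate_term(term: str, top_k: int = 1) -> List[str]:
    """Translate one query term, preferring the dictionary."""
    if term in MRD:
        return list(MRD[term])
    # OOV case: fall back to the highest-scoring web-mined candidates.
    candidates = sorted(WEB_MINED.get(term, []), key=lambda p: p[1], reverse=True)
    if candidates:
        return [translation for translation, _ in candidates[:top_k]]
    # No resource covers the term: keep it untranslated.
    return [term]


def translate_query(query: str) -> List[str]:
    """Translate a whitespace-separated source-language query."""
    translated: List[str] = []
    for term in query.split():
        translated.extend(translate_term(term))
    return translated
```

In this sketch the dictionary takes priority and web-mined candidates are consulted only for OOV terms; other ways of combining the two resources are equally possible.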

EPrint Type: Conference Poster
Subjects: H Information Systems: H.3 INFORMATION STORAGE AND RETRIEVAL
ID Code: 470
Deposited By: Abdalla, Hisham
Deposited On: 18 September 2008