ISSUES AND PROBLEMS IN THE RECOGNITION OF ARABIC PRINTED TEXTS

Authors

  • Ali M. Obaid

Abstract

Arabic text recognition is currently attracting a wave of interest after a long period of moderate activity. The difficulty lies in the complexity of the problem, which shows in both the cursive shapes and the close similarity of Arabic characters. Optical character recognition is usually performed by detecting and quantifying isolated characters, which implies that the text must first be meaningfully segmented into simpler shapes. In this paper we study the properties of the Arabic script and review the problems encountered in its segmentation. To bypass the need for segmentation, a new technique, the so-called N-markers, is proposed. It combines the advantages of both global and structural recognition methods and is intuitively close to the human recognition process. The technique is tailored to single-font printed texts rich in ligatures, as encountered in good-quality books and journals. It can be extended in a straightforward way to other fonts and to degraded texts. Preliminary experiments show encouraging results.

Keywords

Arabic optical character recognition, pattern recognition, global methods, ligatures

How to Cite

Obaid, A. M. “ISSUES AND PROBLEMS IN THE RECOGNITION OF ARABIC PRINTED TEXTS”, Periodica Polytechnica Electrical Engineering, 41(4), pp. 315–335, 1997.

Section

Articles