Arabic CWR Based on Correlation of Normalized Signatures of Words Images
Hala S. Zaghloul, Taymoor Nazmy
Traditional methods for Arabic OCR (AOCR) are based on
segmenting each word into a set of characters. Arabic
script is cursive, and a character's shape depends on its
position in the word. About 100 character shapes have to
be classified, and some of them may overlap.
Our approach uses a normalized signature of the time
signal of the pulse coupled neural network (PCNN),
supported by some shape primitives that represent the
number of the word's complementaries and their positions
within the image of the word. A lookup dictionary of
words with their signatures was constructed and structured
in groups using a decision tree.
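
As an illustration of what a normalized PCNN time signature can look like in code, the following is a minimal sketch and not the authors' implementation. It assumes a simplified PCNN with one neuron per pixel, a leaky linking input from the 8 neighbours, and a decaying threshold; the signature is the number of neurons firing at each iteration. The function names, all parameter values, and the choice of normalizing by the total firing count are assumptions made for this example.

```python
import math


def pcnn_time_signature(image, steps=30, beta=0.2, link_decay=0.1,
                        theta_decay=0.3, theta_gain=20.0):
    """Count firing neurons per iteration of a simplified PCNN.

    image: 2D list of pixel intensities in [0, 1], one neuron per pixel.
    All parameter values are illustrative, not taken from the paper.
    """
    rows, cols = len(image), len(image[0])
    fired = [[0] * cols for _ in range(rows)]      # pulses from the previous step
    link = [[0.0] * cols for _ in range(rows)]     # leaky linking input
    theta = [[1.0] * cols for _ in range(rows)]    # dynamic thresholds
    signature = []
    for _ in range(steps):
        new_fired = [[0] * cols for _ in range(rows)]
        for r in range(rows):
            for c in range(cols):
                # Sum the pulses of the 8 neighbours from the previous step.
                neighbours = sum(
                    fired[rr][cc]
                    for rr in range(max(r - 1, 0), min(r + 2, rows))
                    for cc in range(max(c - 1, 0), min(c + 2, cols))
                    if (rr, cc) != (r, c)
                )
                link[r][c] = link[r][c] * math.exp(-link_decay) + neighbours
                # Internal activity: stimulus modulated by the linking input.
                activity = image[r][c] * (1.0 + beta * link[r][c])
                if activity > theta[r][c]:
                    new_fired[r][c] = 1
        for r in range(rows):
            for c in range(cols):
                # Thresholds decay, then jump for neurons that just fired.
                theta[r][c] = (theta[r][c] * math.exp(-theta_decay)
                               + theta_gain * new_fired[r][c])
        fired = new_fired
        signature.append(sum(map(sum, fired)))
    return signature


def normalize_signature(signature):
    """Scale a signature so its entries sum to 1 (all zeros if nothing fired)."""
    total = sum(signature)
    if total == 0:
        return [0.0] * len(signature)
    return [count / total for count in signature]
```

Dividing by the total firing count is one simple way to make signatures of differently sized word images comparable; other normalizations, such as dividing by the image area, would fit the same sketch.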
The tested signature is routed through the tree to the
nearest group, and the word in the selected group whose
signature has the highest correlation with the tested
signature is taken as the classification result. This
method overcomes many difficulties that arise in cursive
word recognition (CWR) for printed script with different
font types and sizes; it also shows a high classification
accuracy of 96%.
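
The matching step can be sketched as follows. This is a hedged illustration under stated assumptions, not the method as implemented by the authors: the correlation measure is taken to be the Pearson coefficient, the decision tree is replaced by a simple stand-in that routes a query to the group with the nearest integer key (a hypothetical count of complementaries), and the names `pearson`, `classify`, and the placeholder words are invented for the example.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant signature carries no shape information
    return cov / math.sqrt(var_a * var_b)


def classify(query_signature, query_key, groups):
    """Pick the word whose stored signature correlates best with the query.

    groups: dict mapping a group key (here, a hypothetical count of
    complementaries) to a list of (word, signature) pairs.
    """
    # Stand-in for the decision tree: route to the group with the nearest key.
    nearest_key = min(groups, key=lambda k: abs(k - query_key))
    best_word, _ = max(groups[nearest_key],
                       key=lambda entry: pearson(query_signature, entry[1]))
    return best_word


# Toy dictionary with made-up signatures and placeholder words.
groups = {
    0: [("word_a", [0.1, 0.4, 0.3, 0.2]), ("word_b", [0.4, 0.1, 0.2, 0.3])],
    2: [("word_c", [0.25, 0.25, 0.3, 0.2])],
}
print(classify([0.12, 0.38, 0.31, 0.19], 0, groups))  # prints "word_a"
```

Restricting the correlation search to one group keeps the number of comparisons small, which is the apparent purpose of grouping the dictionary before matching.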