Tifinagh Character Recognition Using Geodesic Distances, Decision Trees & Neural Networks

The recognition of Tifinagh characters cannot be carried out reliably using conventional methods based on invariance, because some characters differ from each other only by size or rotation. New methods are therefore needed to overcome this shortcoming. In this paper we propose a direct method based on the calculation of what we call Geodesic Descriptors, which have shown significant reliability with respect to changes of scale, the presence of noise and geometric distortions. For classification, we have opted for a method based on the hybridization of decision trees and neural networks.
Keywords: Tifinagh character recognition; Neural networks; Decision trees; Riemannian geometry; Geodesic distances.


I. INTRODUCTION
Recently, computer vision has become one of the most appealing fields of research, with shape recognition standing as one of its main pillars.
In the classical scheme of shape recognition, we distinguish two major phases: (i) the extraction and (ii) the classification of descriptors [1][2]. Descriptor extraction can be defined as a particular form of dimensionality reduction, which aims to reduce the amount of resources needed to describe a large set of data accurately. Different techniques have been used [3][4][5].
In this paper, we present a new approach for the extraction process which is based on the calculation of geodesic distances within images containing Tifinagh characters. The geodesic distance is one of the basic concepts of Riemannian geometry and appears in many contexts where Euclidean geometry is insufficient. For instance, it is used in mapping to calculate the length of a path on a spherical surface; it is also used for adaptive mesh generation and the representation of 3D objects [6][7]. The objective is to adapt these tools in order to use them for Tifinagh character recognition.
To test our approach, we have opted for a classifier based on the hybridization of neural networks (NN) and decision trees.
This paper is organized as follows: section two provides an overview of Tifinagh characters, section three describes some of the basic notions of Riemannian geometry and explains the method we applied, section four focuses on the classification process and the last section is dedicated to experimental results.

II. THE TIFINAGH CHARACTERS
Historically, Tifinagh characters were popular with Moroccan theologians under the name "Khath Ramal", that is, "sand characters". This was the writing of caravan traders, who used it to exchange messages by leaving signs on caravan routes. Tifinagh characters have almost become mystical due to the importance of communication in finding paths during journeys in the desert.
These characters have been preserved by the Saharan community and represent today the ancient writing of the "Touaregue". Archeologists have found Tifinagh texts in different shapes: geometrical, human or even divine. They have also noticed a resemblance to characters from foreign civilizations: Phoenician, Russian and Aramaic.
According to researchers, the name Tifinagh is composed of two words: Tifi (that is, "discovering") and Nagh (that is, "one's self"). The Royal Institute of Amazigh Culture (IRCAM) has proposed a standardization of Tifinagh characters composed of 33 elements [8], [9].
Figure 1. Tifinagh characters adopted by the IRCAM

A. Theoretical approach

1) Basic concept of Riemannian geometry
Riemannian geometry was first put forward by Bernhard Riemann in the nineteenth century. It deals with a broad range of geometries whose metric properties vary from one point to another. We define Riemannian geometry as the study of Riemannian manifolds: smooth manifolds with a Riemannian metric [10]. To better understand this, we present some basic definitions:
• A manifold is a topological space that is locally Euclidean (i.e., around every point, there is a neighborhood that is topologically the same as an open subset of Euclidean space) [11][12].
• An inner product is a generalization of the dot product. In a vector space, it is a way to multiply vectors together, with the result of this multiplication being a scalar. The inner product of two vectors u and v, with respect to a symmetric positive-definite matrix M, is given by:
$$\langle u, v \rangle_M = {}^t u \, M \, v \qquad (1)$$
• The collection of all inner products of a manifold is called the Riemannian metric.
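As a small illustration of the inner product defined above, the following Python sketch evaluates ⟨u, v⟩_M for 2D vectors. The function name `inner_product` and the example matrices are assumptions introduced here for illustration; they do not come from the paper.

```python
def inner_product(u, v, M):
    """Return <u, v>_M = u^T M v for vectors u, v and a square matrix M.

    M is assumed to be symmetric positive-definite (illustrative helper,
    not taken from the paper).
    """
    # First compute the matrix-vector product M v ...
    Mv = [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]
    # ... then the ordinary dot product of u with M v.
    return sum(u[i] * Mv[i] for i in range(len(u)))


identity = [[1.0, 0.0], [0.0, 1.0]]   # M = I gives the usual dot product
stretched = [[4.0, 0.0], [0.0, 1.0]]  # M that weights the first axis more
u = [1.0, 2.0]
v = [3.0, 1.0]

dot_euclidean = inner_product(u, v, identity)
dot_stretched = inner_product(u, v, stretched)
```

With the identity matrix the result is the ordinary dot product; changing M changes how lengths and angles are measured, which is the idea the Riemannian metric generalizes from point to point.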

2) Geodesic distance
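In a Riemannian metric space (x, M(x)), the length of a path [a, b] is calculated using the parameterization $\gamma(t) = a + t\,\vec{ab}$, where t belongs to [0, 1] [13]. Then:
$$l_M(ab) = \int_0^1 \sqrt{{}^t\vec{ab} \; M(a + t\,\vec{ab}) \; \vec{ab}} \; dt \qquad (2)$$
The geodesic distance $Dl_M(ab)$ is the length of the shortest path between two points a and b, or of one of the shortest paths if there are several:
$$Dl_M(ab) = \min_{\gamma} \int_0^1 \sqrt{{}^t\gamma'(t) \; M(\gamma(t)) \; \gamma'(t)} \; dt, \qquad \gamma(0) = a, \; \gamma(1) = b$$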
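The length of a straight path under a position-dependent metric can be approximated numerically. The sketch below is only an illustration under stated assumptions: it uses a midpoint rule with a hypothetical `steps` parameter, and the two example metrics are invented for the demonstration rather than taken from the paper.

```python
import math


def path_length(a, b, metric, steps=1000):
    """Approximate the length of the segment [a, b] under a metric M(x).

    `metric` is a callable returning the matrix M at a point x. The integral
    of sqrt(ab^T M(a + t ab) ab) over t in [0, 1] is evaluated with the
    midpoint rule (an illustrative choice).
    """
    ab = [b[i] - a[i] for i in range(len(a))]
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) / steps                        # midpoint of sub-interval
        x = [a[i] + t * ab[i] for i in range(len(a))]
        M = metric(x)
        quad = sum(ab[i] * M[i][j] * ab[j]
                   for i in range(len(ab)) for j in range(len(ab)))
        total += math.sqrt(quad) / steps
    return total


def euclidean_metric(x):
    """Constant identity metric: lengths are the usual Euclidean lengths."""
    return [[1.0, 0.0], [0.0, 1.0]]


def growing_metric(x):
    """Hypothetical metric whose scale grows along the first coordinate."""
    s = (1.0 + x[0]) ** 2
    return [[s, 0.0], [0.0, s]]


length_euclidean = path_length([0.0, 0.0], [3.0, 4.0], euclidean_metric)
length_growing = path_length([0.0, 0.0], [1.0, 0.0], growing_metric)
```

With the identity metric the result matches the Euclidean length of the segment; with the growing metric the same unit segment measures 1.5, since the integrand becomes 1 + t.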

B. Proposed method
The proposed extraction process is based on the calculation of geodesic distances between the four geometric extremities of the sought character.

1) Pretreatment
The pretreatment that we have integrated is composed of two standard processes: (i) noise elimination and (ii) contour detection [14] (Figure 2).
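The paper does not state which filters were used for these two steps. As a minimal sketch, assuming a binary image stored as nested lists, a 3x3 majority (median) filter for noise elimination and a 4-neighbour boundary test for contour detection, the pretreatment could look as follows. Both function names and both technique choices are assumptions made here.

```python
def majority_filter(image):
    """Remove isolated pixels from a binary image (assumed noise filter).

    Each output pixel takes the majority value of its 3x3 window, clipped
    at the image borders. This is equivalent to a binary median filter.
    """
    rows, cols = len(image), len(image[0])
    cleaned = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [image[rr][cc]
                      for rr in range(max(0, r - 1), min(rows, r + 2))
                      for cc in range(max(0, c - 1), min(cols, c + 2))]
            cleaned[r][c] = 1 if 2 * sum(window) > len(window) else 0
    return cleaned


def contour_pixels(image):
    """Return foreground pixels that touch the background or the border."""
    rows, cols = len(image), len(image[0])
    contour = []
    for r in range(rows):
        for c in range(cols):
            if not image[r][c]:
                continue
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                outside = not (0 <= nr < rows and 0 <= nc < cols)
                if outside or not image[nr][nc]:
                    contour.append((r, c))
                    break
    return contour


# Toy image: a 5x5 filled block plus one isolated noise pixel at (0, 7).
image = [[1 if 1 <= r <= 5 and 1 <= c <= 5 else 0 for c in range(8)]
         for r in range(7)]
image[0][7] = 1

cleaned = majority_filter(image)
contour = contour_pixels(cleaned)
```

Note that a majority filter also rounds off sharp corners of the shape, which is one reason the actual filter choice matters for the extremity detection that follows.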

2) Extremities detection
In order to detect the extremities, we have used an algorithm that traverses the character contour and detects the contour points closest to each of the four corners of the image.
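The extremity search can be sketched as a nearest-point query against each image corner. In the Python sketch below, the function name `find_extremities`, the (row, column) coordinate convention and the corner labels are assumptions; the paper does not specify which corner corresponds to which of the extremities a, b, c and d.

```python
def find_extremities(contour, height, width):
    """Return, for each image corner, the closest contour point.

    `contour` is a list of (row, col) pixels; `height` and `width` are the
    image dimensions. Corner names are illustrative labels only.
    """
    corners = {
        "top_left": (0, 0),
        "top_right": (0, width - 1),
        "bottom_left": (height - 1, 0),
        "bottom_right": (height - 1, width - 1),
    }
    extremities = {}
    for name, (cr, cc) in corners.items():
        # Squared Euclidean distance is enough to rank candidates.
        extremities[name] = min(
            contour, key=lambda p: (p[0] - cr) ** 2 + (p[1] - cc) ** 2)
    return extremities


# Toy contour: the perimeter of a square spanning rows/cols 2..5
# inside an 8x8 image.
square = [(r, c) for r in range(2, 6) for c in range(2, 6)
          if r in (2, 5) or c in (2, 5)]
extremities = find_extremities(square, 8, 8)
```

For the square above, the four extremities are simply its four corners.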

3) Geodesic descriptors
We call "Geodesic Descriptors" the geodesic distances between the four extremities of the character divided by the corresponding Euclidean distances. To compute geodesic distances on a binary image, we have applied an algorithm that uses a scan function in which each iteration performs a forward and a backward pass so as to determine the shortest path. The algorithm accounts for orthogonal and diagonal pixel distances by assigning a weight of 1 to orthogonal neighbors and a weight of √2 to diagonal neighbors [15][16]. To ensure that the proposed descriptors are robust to scale changes, we divided the geodesic distance of each path by the corresponding Euclidean distance. Notice that the proposed descriptors have allowed:
• a clear distinction between the tested characters; and
• a distinction between characters which are geometrically close (obtained by rotation, like the "Yars" & "Yass" characters, see Table II).
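The forward and backward scanning scheme described above can be sketched as a chamfer-style geodesic distance transform. The code below is a generic illustration under stated assumptions, not the authors' implementation from [15][16]: the function names, the nested-list mask representation and the stopping rule (repeat both passes until no distance changes) are all choices made here.

```python
import math

SQRT2 = math.sqrt(2.0)
# Neighbours already visited in a forward (top-left to bottom-right) scan,
# and in a backward (bottom-right to top-left) scan, with their weights.
FORWARD = [(-1, -1, SQRT2), (-1, 0, 1.0), (-1, 1, SQRT2), (0, -1, 1.0)]
BACKWARD = [(1, 1, SQRT2), (1, 0, 1.0), (1, -1, SQRT2), (0, 1, 1.0)]


def geodesic_distance_map(mask, seed):
    """Geodesic distance from `seed` to every foreground pixel of `mask`.

    Paths may only cross pixels where mask is truthy. Orthogonal steps cost
    1 and diagonal steps cost sqrt(2).
    """
    rows, cols = len(mask), len(mask[0])
    dist = [[math.inf] * cols for _ in range(rows)]
    dist[seed[0]][seed[1]] = 0.0

    def sweep(row_order, col_order, neighbours):
        changed = False
        for r in row_order:
            for c in col_order:
                if not mask[r][c]:
                    continue
                for dr, dc, weight in neighbours:
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols and mask[nr][nc]:
                        candidate = dist[nr][nc] + weight
                        if candidate < dist[r][c]:
                            dist[r][c] = candidate
                            changed = True
        return changed

    changed = True
    while changed:  # repeat forward + backward passes until stable
        fwd = sweep(range(rows), range(cols), FORWARD)
        bwd = sweep(range(rows - 1, -1, -1), range(cols - 1, -1, -1), BACKWARD)
        changed = fwd or bwd
    return dist


def geodesic_descriptor(mask, p, q):
    """Geodesic distance between p and q divided by their Euclidean distance."""
    geodesic = geodesic_distance_map(mask, p)[q[0]][q[1]]
    return geodesic / math.hypot(p[0] - q[0], p[1] - q[1])


# Toy U-shaped character: the path between the two top tips must go around.
u_shape = [[1, 0, 1],
           [1, 0, 1],
           [1, 1, 1]]
descriptor = geodesic_descriptor(u_shape, (0, 0), (0, 2))
```

On the U-shaped toy image, the two tips are 2 pixels apart in a straight line, but the shortest path inside the shape has length 2 + 2√2, so the descriptor is 1 + √2. A ratio well above 1 signals that the shape forces a detour between the two extremities.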
Table II. Geodesic descriptors for the "Yars" & "Yass" characters

The proposed descriptors have also shown considerable resistance to scale changes (Table III).

IV. CLASSIFICATION
At first glance, it seems that the proposed descriptors can distinguish between all Tifinagh characters (Figures 5, 6 & 7). However, confusion still remains when it comes to composed characters (Figure 6) or other characters that have a circular shape (Figure 7).
To deal with these particular cases, we have chosen to operate with a hybrid classifier made of decision trees and neural networks.
On the one hand, decision trees have a discriminatory characteristic which allowed us to separate the characters into four classes (Figure 8). On the other hand, neural networks allow character recognition, thanks to their ability to implicitly detect complex nonlinear relationships between dependent and independent variables, and to detect all possible interactions between predictor variables [17][18].
In practice, we used a multilayer neural network (two layers) with supervised learning, driven by back propagation of the gradient. This consists in determining the error made by each neuron and then modifying the weight values in order to minimize this error.
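To make the back propagation step concrete, here is a minimal sketch of a two-layer network (one hidden layer and one output layer) with sigmoid units, trained on a toy problem. Everything specific in it is an assumption made for illustration: the class name `TwoLayerNet`, the layer sizes, the learning rate, the squared-error loss and the toy data. The paper's network would take the geodesic descriptors as inputs, and its actual hyperparameters are not given.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TwoLayerNet:
    """Hidden layer + output layer, trained by gradient back propagation."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # One row of weights per neuron; the last entry is the bias.
        self.w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [[rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
                      for _ in range(n_out)]

    def forward(self, x):
        hidden = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0])))
                  for row in self.w_hidden]
        output = [sigmoid(sum(w * v for w, v in zip(row, hidden + [1.0])))
                  for row in self.w_out]
        return hidden, output

    def train_step(self, x, target, lr=0.5):
        hidden, output = self.forward(x)
        # Error attributed to each output neuron (squared-error gradient).
        delta_out = [(o - t) * o * (1 - o) for o, t in zip(output, target)]
        # Error propagated back to each hidden neuron.
        delta_hidden = [
            h * (1 - h) * sum(delta_out[k] * self.w_out[k][j]
                              for k in range(len(delta_out)))
            for j, h in enumerate(hidden)]
        # Weight updates in the direction that reduces the error.
        for k, row in enumerate(self.w_out):
            for j, v in enumerate(hidden + [1.0]):
                row[j] -= lr * delta_out[k] * v
        for j, row in enumerate(self.w_hidden):
            for i, v in enumerate(x + [1.0]):
                row[i] -= lr * delta_hidden[j] * v

    def loss(self, samples):
        return sum((o - t) ** 2
                   for x, target in samples
                   for o, t in zip(self.forward(x)[1], target)) / len(samples)


# Toy training set (logical OR), standing in for descriptor vectors.
samples = [([0.0, 0.0], [0.0]), ([0.0, 1.0], [1.0]),
           ([1.0, 0.0], [1.0]), ([1.0, 1.0], [1.0])]
net = TwoLayerNet(n_in=2, n_hidden=3, n_out=1)
initial_loss = net.loss(samples)
for _ in range(2000):
    for x, target in samples:
        net.train_step(x, target)
final_loss = net.loss(samples)
```

Comparing `initial_loss` and `final_loss` shows the effect of training: the repeated weight corrections drive the mean squared error down.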
For the decision tree, we used the following rules:
• R1: detect the number of motifs N in the image. If N > 1, apply R22; otherwise, apply R21.
• R22: if the size of the first motif is twice (or more) the size of the second motif, then N3; otherwise, N4.
• R21: if the ratio of geodesic distances (D1/D3) is between 0.8 & 1.2 and (D2/D6) is between 0.8 & 1.2, then N2; otherwise, N1.
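The three rules above translate directly into a small routing function. In this sketch the function name `route_character`, the argument layout and the inclusive interval bounds are assumptions; the paper only states the rules themselves.

```python
def route_character(n_motifs, motif_sizes, descriptors=None):
    """Assign a character to one of the classes N1..N4 using rules R1, R21, R22.

    n_motifs    : number of motifs N detected in the image
    motif_sizes : sizes of the motifs, first motif first (used when N > 1)
    descriptors : dict with keys "D1", "D2", "D3", "D6" (used when N <= 1)
    """
    # R1: composed characters (several motifs) go to R22, the others to R21.
    if n_motifs > 1:
        # R22: compare the size of the first motif with the second one.
        first, second = motif_sizes[0], motif_sizes[1]
        return "N3" if first >= 2 * second else "N4"

    # R21: both descriptor ratios must lie in [0.8, 1.2]
    # (inclusive bounds are an assumption).
    ratio_13 = descriptors["D1"] / descriptors["D3"]
    ratio_26 = descriptors["D2"] / descriptors["D6"]
    if 0.8 <= ratio_13 <= 1.2 and 0.8 <= ratio_26 <= 1.2:
        return "N2"
    return "N1"


class_composed_big = route_character(2, [100, 40])
class_composed_even = route_character(2, [100, 80])
class_balanced = route_character(
    1, [100], {"D1": 1.0, "D2": 1.2, "D3": 1.1, "D6": 1.2})
class_unbalanced = route_character(
    1, [100], {"D1": 2.0, "D2": 1.2, "D3": 1.0, "D6": 1.2})
```

Routing a character first keeps each subsequent recognition step restricted to the characters of a single class.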

V. EXPERIMENTAL RESULTS
We tested our approach to Tifinagh character recognition on the "Y.Ouguengay" database [12]. This database includes 2175 characters printed in different sizes and writing styles (Figure 9). Each character is described using geodesic descriptors and then identified by neural networks alone and by neural networks combined with decision trees.
We tested our approach on different characters of the database. Table IV reports the recognition rates obtained for the database objects. Notice that despite the limited size of the database (16 samples of each character), the suggested descriptors have proven effective using neural networks. The integration of decision trees has raised the recognition rates markedly.
In order to test the reliability of our recognition approach, we used it on images presenting different kinds of alterations. As shown in Table V, recognition rates are excellent in the presence of noise and for handwritten characters. Rates are good when it comes to variation of luminosity and changes in scale.

VI. CONCLUSION
In this study, we have used geodesic distances as a new approach for shape descriptor extraction, and we have opted for a hybridization of neural networks and decision trees for classification. The robustness of our recognition system was tested and illustrated on a Tifinagh database supplemented by images with different alterations such as luminance variation, the presence of white Gaussian noise with a variance of 10% and alterations due to handwriting. The recognition system proved efficient as we obtained:

Considering:
• Dl_M(xy): the geodesic distance between x and y
• d_xy: the Euclidean distance between x and y
• a, b, c & d: the geometric extremities of each character (Figure 2)
We will name:
• 1st metric descriptor: D1 = Dl_M(ab) / d_ab
• 2nd metric descriptor: D2 = Dl_M(ac) / d_ac
• 3rd metric descriptor: D3 = Dl_M(ad) / d_ad
• 4th metric descriptor: D4 = Dl_M(bc) / d_bc
• 5th metric descriptor: D5 = Dl_M(bd) / d_bd
• 6th metric descriptor: D6 = Dl_M(cd) / d_cd
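The six metric descriptors defined above can be assembled from any geodesic distance routine. The sketch below is an illustration only: `metric_descriptors` is a hypothetical helper, and the geodesic function passed in the example is a stand-in (the Manhattan distance, i.e. a path restricted to axis-aligned moves), not the scan-based algorithm of the paper.

```python
import math
from itertools import combinations


def metric_descriptors(extremities, geodesic):
    """Return {"D1": ..., ..., "D6": ...} for the extremities a, b, c, d.

    `extremities` maps "a", "b", "c", "d" to (row, col) points and
    `geodesic` is a callable returning the geodesic distance between two
    points. The pairs are visited in the order ab, ac, ad, bc, bd, cd,
    which matches the numbering D1..D6.
    """
    descriptors = {}
    for index, (p, q) in enumerate(combinations("abcd", 2), start=1):
        euclidean = math.dist(extremities[p], extremities[q])
        descriptors[f"D{index}"] = (
            geodesic(extremities[p], extremities[q]) / euclidean)
    return descriptors


def manhattan(p, q):
    """Stand-in geodesic: shortest path using axis-aligned moves only."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


points = {"a": (0, 0), "b": (0, 4), "c": (3, 0), "d": (3, 4)}
doubled = {name: (2 * r, 2 * c) for name, (r, c) in points.items()}

descriptors = metric_descriptors(points, manhattan)
descriptors_doubled = metric_descriptors(doubled, manhattan)
```

Because each descriptor is a ratio of two lengths, doubling every coordinate leaves all six values unchanged, which is the scale robustness the normalization by the Euclidean distance is meant to provide.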

Table I illustrates the obtained results for the six descriptors used in this article:

Table III. Metric descriptors calculated for different sizes of the character "Yass"

Table IV. Recognition rate depending on the number of characters to identify
*: Results obtained for centered images

Table V
With: Hm: handwritten characters; Lu: variation of luminosity; Pn: presence of noise; Sc: scale change