Neural Network Classification of White Blood Cell using Microscopic Images

With technological advances in the medical field, faster and more accurate analysis tools have become essential for better patient diagnosis. In this work, the image recognition problem of white blood cells (WBCs) is investigated. Five types of white blood cells are classified using a feed-forward back-propagation neural network. After segmentation of the blood cells obtained from microscopic images, the 16 most significant features of these cells are fed as inputs to the neural network. Half of the 100 WBC sub-images found after segmentation are used to train the neural network, while the other half are used for testing. The results are promising, with a classification accuracy of 96%.
Keywords: White Blood Cell; Neural networks; Image analysis; Leukocytes; Lymphocyte; Feature extraction


I. INTRODUCTION
In the fields of haematology and infectious diseases, classifying different kinds of blood cells can be used as a diagnostic tool. By counting the relative frequencies of certain cells and comparing them with normal values, conclusions can be drawn about possible blood diseases. Blood consists of several elements: white blood cells (WBCs), red blood cells (RBCs), platelets, and plasma. The quantity of blood cells plays an important role in a person's health. Human blood contains five major types of WBC, also referred to as leukocytes. The WBC types, which are illustrated in Figure 1 together with their typical relative frequencies, are: neutrophils, basophils, eosinophils, lymphocytes and monocytes. In a human adult, the normal average number of WBCs is about 7000 per microlitre, which is about 1% of the total blood cells in the body. An increase in the number of WBCs in the body is referred to as leucocytosis, while a decrease is called leucopenia; leucocytosis is more likely to occur than leucopenia [1].
Due to the different morphological features of white blood cells, manual classification of such cells is a cumbersome process that is time-consuming and susceptible to human error, as it depends largely on the haematologist's experience. This emphasises a crucial need for a fast, automated method for identifying the different blood cells.
Implementation techniques for automated differential blood cell counting systems are of two kinds [2]: one is based on flow cytometry, while the other is based on image processing. Image processing techniques have an advantage over flow cytometry based systems in that the images of the blood samples can be saved and referred to for further verification if abnormal conditions are detected. In this work, microscopic images of blood cells are processed, and a neural network is adopted as an efficient decision maker for white blood cell type recognition. Neural networks have powerful features for analysing complex data; their wide and varied application areas include system identification and control [3], image recognition and decision making [4], speech and pattern recognition [5], and financial applications [6]. Artificial neural networks have also been used successfully in medical applications to diagnose several cancers [7].
As for the subject of this paper, the segmentation and classification of microscopic blood cell images using neural networks, a number of studies have been published. Ongun et al. [8] applied a multilayer perceptron network trained using conjugate gradient descent (CGD), learning vector quantisation (LVQ), and a k-nearest neighbour classifier, which produced accuracies of 89.74%, 83.33% and 80.76%, respectively. Hsieh et al. [9] used an information gain technique based on a support vector machine (SVM) for feature selection. By determining entropy and calculating the correlation in a training dataset, the usefulness of a feature was estimated while classifying the training data. The proposed technique was used to classify two types of leukaemia: acute lymphoblastic leukaemia (ALL) and acute myelogenous leukaemia (AML).
Abdul Nasir et al. [10] proposed the application of MLP and simplified fuzzy ARTMAP (SFAM) neural networks for classifying individual WBCs as lymphoblast, myeloblast or normal cells, based on features extracted from both ALL and AML blood samples. A total of 42 features (6 size, 24 shape, and 12 colour) were used and a classification accuracy of 93.82% was achieved.
In this work, a multilayer perceptron back-propagation (MLP-BP) neural network is used to classify the five best-known types of WBC, segmented from blood smear microscopic images, using the most distinguishing features. The adopted algorithm comprises three stages. The first stage is image segmentation, the second stage is labelling, which returns the number and location of each WBC, and the third stage is the extraction of descriptive features measured from the segmented cells.

II. PRE-PROCESSING AND SEGMENTATION
The algorithm adopted in this paper for image pre-processing and segmentation is essentially the one proposed in [11]. Its three main steps, which are segmentation, labelling, and feature extraction, are illustrated in Figure 2. The next image processing step is WBC subtype recognition, which is achieved using neural networks.

III. FEATURE EXTRACTION OF WBC
The choice of features strongly affects classifier performance. The features must characterise each WBC subtype and must be independent of each other for robust classification, better judgment and comparison. Indeed, extensive work has focused on determining the features that crucially distinguish each type, or group of types, of WBC. These features can be grouped into shape features, intensity features, and texture features.

A. Shape Features
There are many techniques for shape description and recognition. These techniques can be broadly categorised into two types: (1) boundary-based and (2) region-based [12]. The most successful representatives of these two categories are Fourier descriptors and moment invariants, respectively. The idea behind moment invariants is to use region-based moments that are invariant to transformations as the shape feature.
The regular moment invariants were presented by Hu [13], who derived a set of invariants using algebraic invariants.
In [11], it was found that, compared with the other invariant moments in the set, the seventh moment invariant feature $\phi_7$ has a noticeable effect on classification performance. The two-dimensional seventh moment invariant of a digitally sampled M × M image with gray-level function f(x, y), (x, y = 0, ..., M − 1), is computed from central moments taken about $(\bar{X}, \bar{Y})$, the centre of the image.

B. Intensity Features
These features are based only on the absolute value of the intensity measurements in the image. A histogram describes the relative frequency of occurrence of the intensity values of the pixels in an image. The intensity features considered here are the first four statistical moments of this histogram: mean, standard deviation, skewness, and kurtosis.
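For a grayscale image, the mean of the blood image is equal to the average brightness or intensity, and it is given by:

$$\bar{X} = \frac{1}{N} \sum_{i=1}^{N} X_i$$

where $\bar{X}$ is the mean, $N$ is the number of pixels, and $X_1, \ldots, X_N$ are the grayscale image data.
The image variance gives an estimate of the spread of pixel values around the image mean. The skewness measures the symmetry about the mean [14]. It is defined as:

$$\text{Skewness} = \frac{1}{N} \sum_{i=1}^{N} \left( \frac{X_i - \bar{X}}{\sigma} \right)^{3}$$

where $\sigma$ is the standard deviation of the pixel values.
The kurtosis (K) is a measure of whether the data are peaked or flattened relative to a normal distribution, and can be computed as [15]:

$$\text{Kurtosis} = \frac{1}{N} \sum_{i=1}^{N} \left( \frac{X_i - \bar{X}}{\sigma} \right)^{4}$$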
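The four intensity features can be sketched in a few lines of NumPy. This is an illustrative sketch only: it assumes the population (1/N) form of the standard deviation and the standardised third and fourth moments for skewness and kurtosis, and the function name `intensity_features` is hypothetical.

```python
import numpy as np

def intensity_features(gray):
    """Mean, standard deviation, skewness and kurtosis of the pixel intensities."""
    x = np.asarray(gray, dtype=float).ravel()
    mean = x.mean()
    std = x.std()              # population form (divides by N)
    z = (x - mean) / std       # undefined for a constant image (std == 0)
    skewness = np.mean(z ** 3)
    kurtosis = np.mean(z ** 4)  # no "-3" correction applied here
    return mean, std, skewness, kurtosis

# Tiny hypothetical 2 x 2 "image": three dark pixels and one bright pixel.
mean, std, skewness, kurtosis = intensity_features([[0, 0], [0, 4]])
```

A positive skewness, as in this toy image, indicates a histogram with a tail towards bright values.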

C. Textural Features
These features contain information about the spatial distribution of tonal variations within a band. Texture representation methods can be classified into three categories: (1) statistical techniques, (2) structural techniques, and (3) spectral techniques. Statistical techniques are the most important for texture classification because they result in computing texture properties [16].

1) The statistical techniques using gray level co-occurrence matrix (GLCM)
The identification of specific textures in an image is achieved primarily by modelling texture as a two-dimensional gray level variation. This two-dimensional array is called the gray level co-occurrence matrix (GLCM). The GLCM was originally proposed by R. M. Haralick; the features generated from it are therefore known as Haralick features [17]. The co-occurrence matrix is a tabulation of how often different combinations of pixel values occur in an image. Based on the co-occurrence matrices, five texture features, namely (1) contrast, (2) homogeneity, (3) entropy, (4) energy, and (5) correlation, are calculated as in [16].
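$$\text{Contrast} = \sum_{i=0}^{N_g-1} \sum_{j=0}^{N_g-1} (i - j)^2 \, P(i, j)$$

$$\text{Homogeneity} = \sum_{i=0}^{N_g-1} \sum_{j=0}^{N_g-1} \frac{P(i, j)}{1 + (i - j)^2}$$

$$\text{Entropy} = -\sum_{i=0}^{N_g-1} \sum_{j=0}^{N_g-1} P(i, j) \log P(i, j)$$

$$\text{Energy} = \sum_{i=0}^{N_g-1} \sum_{j=0}^{N_g-1} P(i, j)^2$$

$$\text{Correlation} = \sum_{i=0}^{N_g-1} \sum_{j=0}^{N_g-1} \frac{(i - \mu_x)(j - \mu_y) \, P(i, j)}{\sigma_x \sigma_y} \tag{12}$$

where $i$ and $j$ are the horizontal and vertical cell coordinates, $P(i, j)$ is the cell value, that is, the $(i, j)$th entry of the normalised co-occurrence matrix, and $N_g$ is the number of gray levels of the blood image.
In Equation (12), $\mu_x$, $\mu_y$, $\sigma_x$, and $\sigma_y$ are the means and standard deviations of the marginal probabilities $P_x(i)$ and $P_y(j)$, obtained by summing the rows or the columns of the co-occurrence matrix $P(i, j)$, respectively.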
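The five co-occurrence features can be illustrated with a small NumPy sketch. Several choices here are assumptions rather than details from the paper or from [16]: the pixel pair offset (one pixel to the right), non-symmetric counting, the natural logarithm in the entropy, and the inverse-difference-moment form of homogeneity. The names `glcm` and `texture_features` are hypothetical.

```python
import numpy as np

def glcm(img, levels, dx=1, dy=0):
    """Normalised co-occurrence matrix for a non-negative offset (dx, dy)."""
    img = np.asarray(img).astype(int)
    h, w = img.shape
    ref = img[:h - dy, :w - dx].ravel()   # reference pixels
    nbr = img[dy:, dx:].ravel()           # neighbours at the chosen offset
    P = np.zeros((levels, levels))
    np.add.at(P, (ref, nbr), 1)           # count each (reference, neighbour) pair
    return P / P.sum()

def texture_features(P):
    """Contrast, homogeneity, entropy, energy and correlation of a GLCM."""
    n = P.shape[0]
    i, j = np.mgrid[:n, :n]
    mu_x, mu_y = np.sum(i * P), np.sum(j * P)
    sd_x = np.sqrt(np.sum((i - mu_x) ** 2 * P))
    sd_y = np.sqrt(np.sum((j - mu_y) ** 2 * P))
    nonzero = P[P > 0]                    # skip empty cells to avoid log(0)
    return {
        "contrast": np.sum((i - j) ** 2 * P),
        "homogeneity": np.sum(P / (1 + (i - j) ** 2)),
        "entropy": -np.sum(nonzero * np.log(nonzero)),
        "energy": np.sum(P ** 2),
        "correlation": np.sum((i - mu_x) * (j - mu_y) * P) / (sd_x * sd_y),
    }

# Hypothetical two-level image, used only to exercise the functions.
P = glcm([[0, 0, 1], [0, 1, 1]], levels=2)
features = texture_features(P)
```

In practice the gray levels of a cell image would first be quantised to a small number of levels so that the matrix stays compact and well populated.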

2) The statistical techniques using colour moments
The colour moments are the statistical moments of the probability distributions of colours; those used here are the first-order moment (mean) and the second-order moment, which is used in the variance computation.
The mean of colour intensity of the RGB model is defined as:

$$\mu_i = \frac{1}{N} \sum_{j=1}^{N} p_{ij} \tag{13}$$

where $p_{ij}$ is the value of the $i$th colour channel at the $j$th image pixel and $N$ is the number of pixels in the image [18].
The variance and the standard deviation are defined mathematically by Equations (14) and (15):

$$\sigma_i^2 = \frac{1}{N} \sum_{j=1}^{N} \left( f_{ij} - \mu_i \right)^2 \tag{14}$$

$$\sigma_i = \sqrt{\frac{1}{N} \sum_{j=1}^{N} \left( f_{ij} - \mu_i \right)^2} \tag{15}$$

where $f_{ij}$ is the value of the $i$th colour component of image pixel $j$, $N$ is the number of features over the whole database, and $\mu_i$ is the mean of colour $i$.
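A short NumPy sketch of the per-channel colour moments follows. It assumes an image stored as a height × width × 3 RGB array and takes the sums over the pixels of that image; the function name `colour_moments` and the two-pixel example are hypothetical.

```python
import numpy as np

def colour_moments(rgb):
    """Per-channel mean, variance and standard deviation of an RGB image."""
    pixels = np.asarray(rgb, dtype=float).reshape(-1, 3)  # one row per pixel
    mean = pixels.mean(axis=0)
    variance = ((pixels - mean) ** 2).mean(axis=0)        # divides by N
    return mean, variance, np.sqrt(variance)

# Hypothetical 1 x 2 image: two RGB pixels.
mean, variance, std = colour_moments([[[0, 10, 4], [2, 10, 8]]])
```

The mean and variance of the three channels give six values per cell image, which is consistent with the count of 16 features once the shape, intensity and co-occurrence features are included.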

IV. NEURAL NETWORK CLASSIFICATION
The features considered significant for representing an image of a white blood cell are extracted and accumulated in a vector, which we refer to as the feature vector. The feature vector is then mapped to one of a set of classes using a neural network to solve the WBC classification problem. This technique adopts a learning algorithm to identify the model that best fits the relationship between the feature set and the class label of the input data. A key objective of the learning algorithm is therefore to build a predictive model that accurately predicts the class labels of previously unseen records.
The feed-forward back-propagation neural network, which is a very popular model in biological and biomedical applications, is used. This type of neural network configuration does not have feedback connections, but errors are propagated back during training using the least mean squared error. The back-propagation neural network is a multi-layer, feed-forward, supervised learning network, which requires pairs of input and target vectors. A feed-forward neural network consists of three kinds of layer, namely (1) an input layer, (2) a number of hidden layers, and (3) an output layer. The input layer and the hidden layer are connected by synaptic links called weights, and the hidden layer and output layer likewise have connection weights.
The input layer contains 16 neurons representing the 16 extracted features. The output layer contains 5 neurons, which represent the WBC types. It was found that 10 nodes in a single hidden layer are adequate to reach a minimum error (less than …). The learning rate is 0.35 and the number of epochs is set to 1000.
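The 16-10-5 architecture described above can be sketched in NumPy as follows. This is a minimal sketch and not the authors' implementation: the sigmoid activation, batch gradient descent, weight initialisation, and one-hot target coding are assumptions, and the feature vectors are synthetic random clusters standing in for real WBC features. Only the layer sizes, the learning rate of 0.35, and the 1000 epochs come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_mlp(X, T, hidden=10, lr=0.35, epochs=1000):
    """Train a single-hidden-layer MLP by back-propagating the squared error."""
    W1 = rng.normal(0, 0.1, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.1, (hidden, T.shape[1]))
    b2 = np.zeros(T.shape[1])
    history = []
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)             # forward pass: hidden layer
        Y = sigmoid(H @ W2 + b2)             # forward pass: output layer
        err = Y - T
        history.append(np.mean(err ** 2))    # mean squared error per epoch
        dY = err * Y * (1 - Y)               # output delta (constant factor dropped)
        dH = (dY @ W2.T) * H * (1 - H)       # error propagated back to hidden layer
        W2 -= lr * H.T @ dY / len(X)
        b2 -= lr * dY.mean(axis=0)
        W1 -= lr * X.T @ dH / len(X)
        b1 -= lr * dH.mean(axis=0)
    return (W1, b1, W2, b2), history

def predict(params, X):
    W1, b1, W2, b2 = params
    return sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2).argmax(axis=1)

# Synthetic stand-in data: 50 training vectors, 16 features, 5 classes.
labels = np.repeat(np.arange(5), 10)
centres = rng.normal(0, 1, (5, 16))
X_train = centres[labels] + 0.1 * rng.normal(size=(50, 16))
T_train = np.eye(5)[labels]                  # assumed one-hot target coding

params, history = train_mlp(X_train, T_train)
predicted = predict(params, X_train)
```

Real feature vectors would normally be scaled to a common range before training, since the 16 features have very different magnitudes.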

V. THE EXPERIMENTAL RESULTS
To illustrate the proposed procedure, 40 microscopic images are taken of a stained blood smear. The WBCs are then segmented from those images, yielding a total of 100 sub-images that cover all the WBC types.
In order to classify the WBCs into five classes, the target values for the WBC neural network classifier are set as shown in Table 1. First, 50 sub-images are selected for network training, while the other 50 sub-images are used for testing.
The classification test results are shown in Table 2. From this table, it is clear that the overall correct classification rate is 96%, while the overall false classification rate is 4%. The percentages of correct classification for each WBC type are shown in Table 3.
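The relationship between the per-class and overall figures can be checked with a short calculation. The per-class test counts below are hypothetical: an even split of 10 test sub-images per type is assumed here because the actual split is not stated in the text, and the per-class accuracies are those reported in the conclusions.

```python
# Assumed (not stated in the text): 10 test sub-images per WBC type.
totals = {"neutrophil": 10, "lymphocyte": 10, "basophil": 10,
          "eosinophil": 10, "monocyte": 10}
# Correct counts chosen to match the reported 100% and 90% per-class accuracies.
correct = {"neutrophil": 10, "lymphocyte": 10, "basophil": 10,
           "eosinophil": 9, "monocyte": 9}

per_class = {k: 100 * correct[k] / totals[k] for k in totals}
overall = 100 * sum(correct.values()) / sum(totals.values())
```

Under this assumed split, 48 of the 50 test sub-images are classified correctly, which agrees with the reported overall accuracy of 96%.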
Finally, Table 4 compares the classification performance of the proposed technique with that of some similar work published in the literature. The table shows that the proposed technique gives better overall correct classification for the five types of white blood cells considered.

VI. CONCLUSIONS
An MLP trained by the back-propagation (BP) algorithm has been used to classify five types of WBC, namely (1) neutrophils, (2) basophils, (3) eosinophils, (4) lymphocytes, and (5) monocytes. Sixteen features are used as inputs to the neural network. These features are categorised as shape features (the seventh moment invariant), intensity features (the mean, standard deviation, skewness, and kurtosis of the intensity histogram), and textural features (the energy, the entropy, the correlation, the contrast, the homogeneity, and the mean and variance for each colour). The choice of features and the type of classifier play a significant role in classification accuracy. With the selected features and the proposed neural network classifier, a classification accuracy of 100% has been obtained for the neutrophil, lymphocyte, and basophil types of WBC, while 90% accuracy has been obtained for the other two types. The achieved results have been compared with other related research work, as shown in Table 4. The proposed neural network classifier has better classification accuracy with a smaller number of features relative to the number of classified WBC types. As future work, the authors would like to focus on using these WBC and RBC images to increase the capability of diagnosing some common regional blood diseases.

Fig. 1. Typical Images of Common White Blood Cells

Fig. 2. The Pre-processing and Segmentation Procedure

TABLE I. NEURAL NETWORK TARGET VALUES FOR WBC CLASSIFICATION