Classification of Freshwater Zooplankton by Pre-trained Convolutional Neural Network in Underwater Microscopy

Zooplankton are an enormously diverse and fundamental group of microorganisms that exist in almost every freshwater body, shape its ecology and play a vital role in the food chain. Given this significance, the study of freshwater zooplankton is essential, and it relies heavily on the classification of images. However, routine manual analysis and classification is laborious, time-consuming and expensive, and poses a significant challenge to experts. Thus, over the past decade much research has focused on the development of underwater imaging technologies and intelligent zooplankton classification systems. This work is devoted to the observation of freshwater zooplankton with a purpose-designed underwater microscope and to modeling a system for automatic classification among four different taxa. Unlike most existing zooplankton image classification systems, this model is trained on a comparatively small dataset collected from freshwater with the designed underwater microscope. Transfer learning with a pretrained AlexNet Convolutional Neural Network (CNN) model proved to be a promising approach in the system design. Among four networks trained over two datasets, the best overall classification accuracy of up to 93.1%, comparable to other existing systems, was achieved on the test dataset (92.5% for Calanoid and Cyclopoid (Female), 90% for Cyclopoid (Male) and 97.5% for Daphnia). A Graphical User Interface (GUI) for the model, built in MATLAB, makes it easy for users to collect images for building a database, train the network and classify images of different taxa. Moreover, the designed system can be extended with more classes in the future.
Keywords: AlexNet; automatic image classification; Convolutional Neural Networks (CNN); freshwater zooplankton; transfer learning; underwater microscope


I. INTRODUCTION
Zooplankton are microorganisms, also known as "drifters", that are found in large numbers suspended in freshwater bodies and other large aquatic ecosystems [1]. The freshwater zooplankton community is diverse (>20 types) and occurs in almost every lake, with body sizes ranging from a few tens of microns to >2 mm. Crustaceans and rotifers are the dominant groups of zooplankton found in freshwater [2]. Because they are very sensitive to ecological variations, zooplankton act as bioindicators for monitoring changes in aquatic conditions. Besides serving as markers of water quality, zooplankton are also considered an integral component of the global carbon cycle and the foundation of the food chain for aquatic life [3] [4] [5]. To observe changes in aquatic ecology efficiently and to protect it, the distribution of the zooplankton population must be monitored, since changes in it can severely damage the aquatic ecosystem and have undesirable effects on terrestrial communities [6].
Underwater microscopic study of zooplankton is an ongoing and active research area that is mostly linked to taxonomic classification. Precise taxa identification provides the basis for biodiversity research, which, along with the evolution and distribution of zooplankton, is a vital component of the workflow of biological investigation. However, manual research on zooplankton, which includes sample collection by Niskin bottles or towed plankton nets and manual classification by individual experts, is laborious, time-consuming and expensive [7]. Automating such tasks will therefore help taxonomists and pharmacologists and ease the workload of biological experts.
A few underwater imaging devices have been designed and tested for marine zooplankton studies over the years, which has largely removed the need for manual sampling by plankton nets [8]. However, the coarse quality and the size of underwater image datasets make analysis and classification challenging, because morphological traits are unclear and automatic classification models are difficult to train [9] [10]. Given the critical importance of zooplankton, researchers remain interested in developing more advanced, robust and automatic systems for its imaging and classification [11].
The rest of this paper is organized as follows. Section II reviews imaging devices used in underwater imaging of plankton and some recent work on zooplankton classification. Section III describes the methodology followed in this work, which consists of the imaging of freshwater zooplankton, including the imaging device and method, and the design of the neural network for zooplankton classification. Section IV presents the training and testing results of the CNN. Section V concludes the work, summarizing the results, significance and future directions of this study.

II. RELATED WORK
A great deal of effort has been put into imaging and automatic classification of microorganisms over the last few decades [12]. Underwater imaging devices such as the Shadow Image Particle Profiling Evaluation Recorder (SIPPER) [13], the ZOOplankton Visualization and Imaging System (ZOOVIS) [14], the In Situ Ichthyoplankton Imaging System (ISIIS) [15] and the Underwater Vision Profiler 5 (UVP5) [16] have been in service for marine zooplankton imaging. Approaches ranging from handcrafted feature extraction and classifier design [17] to advanced deep learning methods [18] have been applied in different zooplankton classification setups. Some of the recent work on automatic classification of marine zooplankton is summarized in Table I and discussed below.
The author in [1] proposed a hybrid CNN model for plankton classification that consists of three AlexNet networks fused together at the final fully connected layer. This three-channel, pyramid-structured network takes the original image and two preprocessed copies of it as its respective inputs, and is trained over the WHOI-Plankton dataset containing 30000 images of 30 classes.
In [20], the author studied the behavior of a CNN trained over two very different datasets collected with different imaging devices, ISIIS and IFCB, and tested it over an out-of-domain dataset collected with SPC. A CNN trained from scratch and a fine-tuned pre-trained AlexNet were chosen for the experiment.
Following the work in [1], the authors of [21] designed a hybrid system with an additional concatenation layer before the convolution layers, which resulted in a lower time cost and slightly better overall accuracy. The system was trained and tested over the WHOI-Plankton dataset.
The CEAL approach was presented in [12] to train a CNN for zooplankton classification, with the AlexNet architecture adapted for training. Two image datasets collected by ISIIS, ILES and CZECH, were used. The experiments showed that with the CEAL approach a human expert does not need to annotate a large number of images in the dataset; a small number of annotated images is enough to reach the maximum achievable accuracy of the CNN.
In [10], both deep learning and transfer learning approaches were developed for plankton classification. Three major datasets, WHOI, Kaggle and ZOOscan, were used. The authors applied fine-tuning and transfer learning of pretrained models and showed that pre-processing coupled with a CNN can enhance feature extraction capability. The AlexNet, GoogleNet, Inception V3, VGGNet, ResNet, DenseNet and SqueezeNet models were evaluated, and DenseNet was found to be the best model for classification.
Unlike the approach proposed in this paper, all of the previous work was carried out using large, publicly available datasets of plankton images acquired during marine zooplankton surveys.

III. METHODOLOGY
The system flowchart in Fig. 1 presents the key phases followed in this study of automatic zooplankton classification: sample collection and observation under the microscope, data acquisition, and neural network training and testing.

A. Sample Collection
Freshwater samples were collected from two stations, a pond and a canal, located at the Ocean College of Zhejiang University, Zhoushan. No plankton net or other type of plankton catcher was used during sample collection. The water samples were kept in a glass tank without any biological sample preservation, such as the addition of formaldehyde solution. The samples were studied with an underwater microscope designed by the Ocean Optics lab. Visual data in the form of images and videos was collected from surface and underwater observations, which confirmed the presence of several types of zooplankton, as shown in Fig. 2.
The zooplankton in Fig. 2 were named with reference to the "Practical guide to identifying freshwater crustacean zooplankton" [22]. Because the morphological features of the zooplankton found in the freshwater samples were difficult to distinguish, identification was carried out only to the lowest taxonomic level that could be determined. Table II shows the detailed taxonomy of the observed types of zooplankton.

B. Data Acquisition
Water samples were observed through the designed underwater microscope, which contains a 14-megapixel Lapsun (M102) Charge-Coupled Device (CCD) camera chip combined with lenses, offering a total magnification of 1524x and a Field of View (FOV) ranging from 1.7 mm to 2.6 mm. A white Light Emitting Diode (LED) lamp powered by a 12-volt battery was used as the light source. The microscope was deployed vertically in the glass sample tank, and a total of 16 videos were captured at a frame rate of 60 frames per second. Fig. 3 shows the working setup of the microscope. The CCD camera's built-in auto focus, as well as manual focus controlled by a variable objective lens ranging from 1.7x to 4.5x, was used during underwater microscopy. A total of 2900 raw images of size 1920x1080x3 in Joint Photographic Experts Group (JPEG) format, each containing the desired Region of Interest (ROI), were acquired from the microscope.
www.ijacsa.thesai.org (IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 11, No. 7, 2020

C. Building Image Dataset
After screening the acquired images, two of the zooplankton types in Fig. 2, Cyclopoid Nauplii and Rotifer, were discarded from the image database because they had too few images and those images were of coarse quality. Two Image Datasets (IMDS) were then built by the following operations.
Microscopic images also contain other particles, for example algae, and the easiest way to discard such noisy particles is to crop the desired ROI. Cropping the ROI also reduced the image size, resulting in faster processing and lower memory consumption on the Graphics Processing Unit (GPU) when training the model. The first IMDS was created by resizing the cropped images to 227x227x3, as shown in Fig. 4, and allocating three hundred images of each type of zooplankton to four classes. Each class was named according to its taxonomy after verification against the key for zooplankton nomenclature. 80% of the images in each class were used for training and the remaining 20% for validation of the neural network.
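The per-class 80/20 split described above can be sketched as follows. This is a minimal Python illustration rather than the authors' MATLAB workflow; the file names, the `split_per_class` helper and the fixed random seed are assumptions introduced only for the example.

```python
import random

CLASSES = ["Calanoid", "Cyclopoid (Female)", "Cyclopoid (Male)", "Daphnia"]

def split_per_class(files_by_class, train_fraction=0.8, seed=0):
    """Shuffle each class independently and split it into train/validation lists."""
    rng = random.Random(seed)  # fixed seed (assumption) so the split is repeatable
    train, val = {}, {}
    for label, files in files_by_class.items():
        shuffled = list(files)
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * train_fraction)
        train[label] = shuffled[:cut]
        val[label] = shuffled[cut:]
    return train, val

# Hypothetical file names standing in for the 300 cropped 227x227x3 images per class.
files_by_class = {c: [f"{c}_{i:03d}.jpg" for i in range(300)] for c in CLASSES}
train, val = split_per_class(files_by_class)

for c in CLASSES:
    print(c, len(train[c]), len(val[c]))
```

Splitting inside each class, rather than over the pooled images, keeps both subsets balanced at 240 training and 60 validation images per class.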
Underwater microscopic images are affected by factors such as an out-of-focus lens and the continuous drifting of zooplankton, which cause blurry images and the loss of the basic morphological structure of the microorganism. The second IMDS was built by including offline-augmented (preprocessed) replicas of the images along with the blurry raw images, keeping the dataset balanced. The total number of images was thus increased to six hundred per class, of which 80% were used for training and the remainder for validation. Processing consisted of contrast enhancement by contrast stretching, followed by a Gaussian filter to smooth the edges and bring out the texture of the enhanced image. Sample raw images of each of the four classes and the processed images, along with their respective histograms, are shown in Fig. 5.
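The two preprocessing steps can be illustrated on a tiny grayscale patch. The sketch below is pure Python and makes several assumptions not stated in the paper: a min-max form of contrast stretching, a 3x3 binomial approximation of the Gaussian kernel, and edge replication at the borders. The actual stretch limits and filter width used by the authors are not given.

```python
def contrast_stretch(img, lo=0, hi=255):
    """Linearly map the image's [min, max] intensity range onto [lo, hi]."""
    flat = [p for row in img for p in row]
    p_min, p_max = min(flat), max(flat)
    if p_max == p_min:  # flat image: nothing to stretch
        return [row[:] for row in img]
    scale = (hi - lo) / (p_max - p_min)
    return [[round((p - p_min) * scale + lo) for p in row] for row in img]

# 3x3 binomial approximation of a Gaussian kernel; the weights sum to 16.
KERNEL = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]

def gaussian_smooth(img):
    """Convolve with the 3x3 kernel, replicating edge pixels at the border."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp row index
                    xx = min(max(x + dx, 0), w - 1)  # clamp column index
                    acc += KERNEL[dy + 1][dx + 1] * img[yy][xx]
            out[y][x] = round(acc / 16)
    return out

# Hypothetical low-contrast patch with intensities confined to 100..140.
low_contrast = [[100, 110, 120], [110, 120, 130], [120, 130, 140]]
stretched = contrast_stretch(low_contrast)
smoothed = gaussian_smooth(stretched)
```

After stretching, the patch spans the full 0 to 255 range; the smoothing pass then averages neighbouring pixels, which softens hard edges at the cost of some fine detail.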
A separate test dataset, containing cropped images of the same size as those in the training dataset, was created to estimate the accuracy of the trained CNN. Table III provides an overall quantitative summary of the IMDS.

D. CNN Training
The proposed CNN model comprises four different training scenarios based on the IMDS, as shown in Fig. 6. Because the image dataset is small, the pre-trained AlexNet CNN model developed in [23] was adapted for this study. The pre-trained model is fine-tuned by replacing the final three layers with a fully connected layer whose size equals the number of classes, a SoftMax layer that computes the class probabilities, and an output classification layer. The Rectified Linear Unit (ReLU) activation function is applied after each convolutional layer and fully connected layer, except for the last one, which is followed by the SoftMax layer. Cross-channel normalization layers and max pooling layers are included in the network. A dropout of 50% is used during training to prevent the network from overfitting.
Table IV shows the CNN hyperparameter configuration, which was selected to make the system more robust to overfitting across the four training models. To allow a fair comparison of the models' accuracies, the same hyperparameters are used to train all the models. A lower learning rate is considered more effective when fine-tuning a pre-trained architecture. Data augmentation was included to improve the accuracy of the network and to prevent it from overfitting. To keep a good trade-off between validation accuracy and overfitting during training, only reflection of images about the x-axis is used for online augmentation.
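To make the role of the replaced head concrete, the sketch below shows how the new four-unit fully connected layer, the SoftMax layer and the classification output relate to one another. It is a plain-Python illustration, not the MATLAB implementation; the logit values are hypothetical and the `softmax` and `classify` helpers are names introduced only for this example.

```python
import math

CLASSES = ["Calanoid", "Cyclopoid (Female)", "Cyclopoid (Male)", "Daphnia"]

def softmax(logits):
    """Turn raw scores from the final fully connected layer into probabilities."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(logits):
    """Classification output: the class with the highest SoftMax probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs

# Hypothetical outputs of the new 4-unit fully connected layer for one image.
logits = [1.2, 0.3, -0.5, 3.1]
label, probs = classify(logits)
print(label, [round(p, 3) for p in probs])
```

The probabilities produced here are the same kind of per-class scores that the paper later reports as classification probability scores.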

IV. SYSTEM MODELING AND EXPERIMENTAL RESULTS
The classification system was designed in MATLAB 2019a and trained on an Nvidia GeForce 920MX GPU. The image acquisition, image processing, neural network and parallel computing toolboxes, together with CUDA-enabled GPU support, were used in modeling the zooplankton classification system. Four networks, Net1, Net2, Net3 and Net4, were trained on the two datasets, IMDS1 and IMDS2, with the same hyperparameters, and their results were compared on the basis of test dataset classification. Net1 and Net2 denote the networks trained on IMDS1 with and without augmentation, respectively. Similarly, Net3 and Net4 denote the networks trained on IMDS2.
The results of all four networks were evaluated on the basis of dataset size, the training progress (which shows the network's response to the data), the confusion matrix, and the probability scores for each class after classification.
Precision, recall and accuracy were computed from the confusion matrix.
Another useful quantity is the F-measure, which is the harmonic mean of recall and precision. Its value ranges from 0 to 1; a value closer to 1 indicates a better-performing network, and a value closer to 0 a poorer one. It is computed as:
F-measure = 2 * (Recall * Precision) / (Recall + Precision)
The experimental results of the four networks during training and testing are summarized in Table V. Net4 yielded better classification results on the test data than the other three networks.
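As a worked illustration of these metrics, the sketch below derives accuracy and per-class precision, recall and F-measure from a confusion matrix. It uses the standard textbook definitions (precision = TP / (TP + FP), recall = TP / (TP + FN)), which are assumed to match the paper's usage; the two-class matrix is hypothetical and is not taken from Table V.

```python
def per_class_metrics(cm):
    """cm[i][j] = number of images of true class i predicted as class j."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[i][i] for i in range(n)) / total
    metrics = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, true class differs
        fn = sum(cm[k]) - tp                       # true k, predicted otherwise
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = recall + precision
        f_measure = 2 * recall * precision / denom if denom else 0.0
        metrics.append({"precision": precision, "recall": recall, "f_measure": f_measure})
    return accuracy, metrics

# Hypothetical 2-class confusion matrix, used only to exercise the formulas.
cm = [[8, 2],
      [1, 9]]
accuracy, metrics = per_class_metrics(cm)
print(accuracy, metrics[0])
```

For the first class of this toy matrix, precision is 8/9 and recall is 8/10, so the F-measure is 16/19, or about 0.842. It lies between the two values and closer to the smaller one, as expected of a harmonic mean.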
Since Net4 showed the best outcomes, it was adopted in this study of zooplankton classification. The network was tested on an in-domain test dataset captured with the same underwater microscope. It achieved individual accuracies of 92.5% for Calanoid and Cyclopoid (Female), 90.0% for Cyclopoid (Male) and 97.5% for Daphnia, as shown in Fig. 7. In addition, the network yielded more confident classification probability scores for Calanoid and Daphnia, 83.0% and 84.5% respectively. The probability scores for Cyclopoid (Female), 76.2%, and Cyclopoid (Male), 79.9%, showed minor confusion in the classification. Fig. 8 shows the average classification probability scores for the four classes of the test dataset. Furthermore, the system responded well to out-of-domain images during classification, as shown in Fig. 9.
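The per-class figures above can be checked against the overall accuracy of 93.1% reported for the test dataset. The short sketch below assumes that the test set holds the same number of images for every class, so that overall accuracy equals the unweighted mean of the per-class accuracies; this balance is an assumption of the example, since Table III is not reproduced here.

```python
# Per-class test accuracies (%) reported for Net4.
per_class_accuracy = {
    "Calanoid": 92.5,
    "Cyclopoid (Female)": 92.5,
    "Cyclopoid (Male)": 90.0,
    "Daphnia": 97.5,
}

# With equally sized classes (assumption), overall accuracy is the plain mean.
overall = sum(per_class_accuracy.values()) / len(per_class_accuracy)
print(round(overall, 1))
```

The mean is 93.125%, which is consistent with the reported 93.1% under the balanced-test-set assumption.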
The GUI designed in MATLAB makes routine use of the system easy. It includes modules for developing the dataset, training the network and classifying images, and it displays results during training and probability scores during classification. Through the GUI, the system can easily be modified and improved by adding other classes for classification.

V. CONCLUSION
In this study, a model is proposed for underwater microscopy of zooplankton and automatic classification of underwater microscopic images of zooplankton. Two different and comparatively small datasets of four classes of zooplankton, captured with the same imaging system, were developed for training the CNN model. A maximum overall classification accuracy of 93.1% was obtained when the trained network was tested on an independent test dataset. Both online and offline data augmentation are applied in the system to enlarge the dataset and reduce the chance of overfitting during network training. The experimental results of the architecture based on a pre-trained convolutional neural network show that this system can classify zooplankton images effectively with little time cost and low computing requirements.
The distribution of freshwater zooplankton is very diverse and inhomogeneous, and research on it is of great importance for ecological protection. A main focus of zooplankton study is effective classification, which will help aquatic ecologists carry out robust sampling and classification without much effort.
How to sample and classify zooplankton more effectively remains a major challenge, and much can still be done on zooplankton imaging and classification. Future work on the system will focus on developing more effective and enhanced zooplankton imaging systems and on designing improved classification models. In addition, other pre-trained CNN models will be applied to zooplankton classification, and more genera, as well as other aspects such as the life cycle stages of zooplankton, will be added to the classification model.