The Method of Braille Embossed Dots Segmentation for Braille Document Images Produced on Reusable Paper

Braille is the system of written communication for blind and visually impaired people. Braille characters are formed by embossed dots that convey meaning. Braille documents are typically produced on plain paper, but they can also be produced on reusable paper, also known as third-page paper; this reduces paper cost and makes more documents available to support learning for blind or visually impaired persons. This research presents a method of Braille embossed dots segmentation for Braille document images produced on reusable paper, to support the availability of cheaper learning material. First, Braille documents were imported with a calibrated scanner. Layer separation was then performed on the Braille document image, followed by edge removal, Braille embossed dot recovery, noise removal, and localization of the Braille embossed dots. The research used four scanners, which scanned Braille documents under four different lighting conditions. For each lighting condition, the Braille document image area was cropped to the desired size, considering the possible event conditions. The cropped images were used to create over 200,000 Braille cells, with over 12 billion patterns. Averaged over all lighting conditions, the method achieved Precision 1.0000, Recall 0.7817, Accuracy 0.8545, and F-Measure 0.8756. Effective Braille embossed dots segmentation in turn makes Braille document recognition more efficient.
Keywords: Braille; embossed dots; document images; reusable paper; segmentation; recognition; blind; visually impaired


I. INTRODUCTION
Louis Braille invented Braille to allow blind or visually impaired persons to communicate in writing; consequently, they need to become proficient in writing and reading Braille. In Braille, one cell has six dot positions, and the pattern of embossed dots represents a meaning. The basic tools for creating Braille documents are a writing pad (slate) and a sharp-tipped tool (stylus), which are portable and cheap. Other devices, such as a Braille typewriter or a Braille printer, can also create Braille documents, but these devices are expensive and require expensive specialized paper. It is common practice for blind or visually impaired people to produce Braille documents on reusable paper, known as the third page. Such documents contain Braille embossed dots together with printed characters, tables, and pictures, which are referred to as patterns. This practice reduces the cost of purchasing Braille paper and is a cost-effective use of natural resources.
Braille documents created on reusable paper are widespread. This research aims to accurately extract the Braille embossed dots [1-6] from those Braille documents. The extracted dots are then used for Braille recognition [7-12] and converted to characters. Ultimately, these characters can be used to produce conventional books.
Scope and limitations: This research created reusable paper by printing characters with LaserJet and Inkjet printers on A4 paper of 80 GSM weight. Braille documents were created on the reusable paper with a portable Braille device and scanned with a flatbed scanner at a resolution of 300 DPI.
Contribution: (1) To develop a method of Braille embossed dots segmentation for Braille documents produced on reusable paper. (2) To reduce the complexity and the cost of purchasing paper to create Braille documents. (3) To support improved communication channels between people.
The paper is organized as follows: Section II summarizes the relevant research and describes what is new in this research. Section III explains the proposed method. Section IV describes the dataset, experimental design, evaluation, and discussion of the results. Section V summarizes the results. The final section, Section VI, outlines our future work.

II. RELATED WORK
The evolution of Braille document image processing is shown in Fig. 1. It is divided into two groups: (1) Group 1 is document image processing for typical documents. (2) Group 2 is document image processing for Braille documents created on plain paper for the visually impaired, in which the Braille characters are embossed dots.
Group 1 can be divided into two subgroups. (1) Subgroup 1 covers documents created on plain paper without an overlaid pattern. Relevant research topics include text/non-text classification in online handwritten notes [16], 2D chemical structure recognition in document images [17], detection of math equations in scientific document images [18], Arabic word recognition in historical document images [19], Vietnamese character recognition for ID card verification [20], document zoning for document layout analysis [21], structure analysis of musical document images [22], bibliographic reference extraction [23], text and figure extraction from document images [24,25], document localization in natural scene images [26], and table detection and segmentation in document images [27]. (2) Subgroup 2 covers documents created on plain paper with overlapping patterns. Related research topics include image restoration and segmentation of historical document images affected by ink bleeding [28-30], glare detection on captured document images [31], shadow removal on captured document images [32], and text segmentation from areas highlighted with colors [33]. [...] [13,14], and recovering the Braille embossed points of an old Braille document [15].
www.ijacsa.thesai.org
Comparing the document image processing diagram for Braille documents in Fig. 1 with Fig. 2, the area to the right of the red dashed line in Fig. 2 is a new research topic that has not been studied previously. This research deals with Braille embossed dot segmentation for Braille document images produced on reusable paper, which has an overlaid pattern, as shown in Fig. 3. Therefore, this research is classified in Subgroup 2 of Group 2.
[...] [2], Braille document images are modeled using a Beta distribution for thresholding, and a grid is used for Braille embossed dots segmentation. A. S. Al-Salman et al. [3] use image enhancement and image rotation, and then a grid method to extract the Braille cells. A.-S. Amany et al. [4] use thresholding based on a Beta distribution and then create a grid for dot detection. M. Y. Babadi et al. [5] perform skew correction and create grids for Braille cell segmentation. A. AlSalman et al. [6] use between-class variance with a Gamma distribution to separate Braille embossed dots from the background. J. Mennens et al. [7] import Braille documents with a scanner and use a mask and a grid to extract the Braille embossed dots. L. Wong et al. [8] use a half-character recognition technique and then generate a grid to extract Braille embossed dots. L. Jie et al. [9,10] use a Support Vector Machine (SVM), a sliding-window technique, and the Haar wavelet to extract Braille embossed dots from Braille document images obtained from a scanner. B.-M. Hsu [11] uses the Ratio Character Segmentation Algorithm (RCSA) for Braille embossed dots extraction. A. AlSalman et al. [12] use a Deep Convolutional Neural Network (DCNN) for Braille document recognition. M. Yousefi et al. [13,14] estimate the parameters of Braille documents (skewness, scaling, and line spacing) to obtain the Braille dots. H. Kawabe et al. [15] use deep learning to classify Braille dots in long-preserved or ancient Braille documents. All of these studies focus on Braille documents created on plain paper. No prior research has addressed Braille created on reusable paper.

III. PROPOSED METHOD
This research proposed a method of Braille embossed dots segmentation for Braille document images. Braille documents were produced on reusable paper and plain paper, and flatbed scanners were used to scan them. The method comprises six steps, as shown in Fig. 4. The first step was scanner calibration, which used specific calibration plates to calculate the threshold values of the edges and the black areas. The second step was layer separation, which used the threshold values from the previous step. The third step was edge removal, which eliminated the edges. The fourth step was data recovery, which used the eroded mask's data to restore the details of the Braille dots. The fifth step was noise removal, which applied an explosion algorithm to diffuse the black pixels and then enhanced the image by spatial filtering. The final step was Braille localization, which calculated the positions of the Braille dots from the positions of the black pixels. The details are as follows.
A. Scanner Calibration Process
1) Plates for scanner calibration: Images for scanner calibration, a black circle and a black square, were designed and saved to image files. They were then printed on plain paper with a laser printer and an inkjet printer; the printed sheets are called the plates. The plates were then imported via a flatbed scanner.
2) The plates were converted to grayscale images: The plates were in the RGB color model. They were converted to the HSV color model. The color values of the V plane were applied to all three planes of the RGB color model. The grayscale images are shown in Fig. 5(a) and 5(b), comprising the black area in P-01 and the background in P-02.
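The grayscale conversion described in this step can be sketched as follows. In the HSV model, V is the per-pixel maximum of the R, G, and B values, so no full color-space conversion is needed. This is a minimal sketch using NumPy; the function name `rgb_to_v_gray` and the two-pixel sample plate are illustrative assumptions, not part of the original method.

```python
import numpy as np

def rgb_to_v_gray(rgb: np.ndarray) -> np.ndarray:
    """Return a 3-plane grayscale image whose planes all equal the HSV V plane."""
    # In HSV, V is the maximum of R, G and B at each pixel.
    v = rgb.max(axis=2)
    # Copy V into all three planes of the RGB image.
    return np.stack([v, v, v], axis=2)

# Hypothetical 1x2 plate: one dark printed pixel, one light paper pixel.
plate = np.array([[[30, 20, 10], [240, 250, 245]]], dtype=np.uint8)
gray = rgb_to_v_gray(plate)
```

Because the three planes are identical, any single plane of `gray` can be used for the later thresholding steps.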
3) Plate images are eroded with a mask: The plate images were converted to binary images and then eroded with a small mask and a large mask, respectively. As a result, the plate images eroded by the small mask have a black area in P-03 of [...]
B. Layer Separation Process
1) Braille document images importing: A Braille document was imported using the calibrated scanner from the previous step. The resulting color images in the RGB model were converted to grayscale in the same way as in step 2) of the previous process. Fig. 6(a) contains the background of the Braille document in P-01, the Braille embossed dot in P-02, and the printed character, called a pattern, in P-03.
[...] values, black color values were recorded in the white image for the black area image at the same pixel positions. The result was called the image of the black areas, as shown in Fig. 6(c). It included the black areas of the pattern in P-07 and the puncture holes in P-08.
C. Edge Removal Process
[...] Fig. 7(a). The image of the black areas was inverted; the result is shown in Fig. 7(b). The black areas of the pattern were then dilated with a mask, so that these areas became larger and the puncture holes were filled. The result is the image of the dilated black areas, shown in Fig. 7(c), in which P-03 is larger than P-01 in Fig. 7(b).
2) Edges removed: An OR logic operation was applied to the image of the dilated black areas, shown in Fig. 7(c), and the image of the edges obtained in the previous process, shown in Fig. 7(d), which removes the edges of the pattern. The result, shown in Fig. 7(e), is the image of removed edges, which still contains Braille embossed dots in P-04 and P-06 and some noise in P-05.
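The dilation and OR steps of edge removal can be illustrated with a small sketch. It assumes 8-bit images in which black is 0 and white is 255, a square dilation mask, and an inverted black-areas image in which the pattern is white, so that the OR turns every edge pixel under the dilated pattern white. The function names, the mask size, and the 9x9 sample images are hypothetical; the paper does not state its mask sizes here.

```python
import numpy as np

def dilate(mask: np.ndarray, size: int) -> np.ndarray:
    """Binary dilation of a boolean mask with a size x size square (size odd)."""
    r = size // 2
    padded = np.pad(mask, r, mode="constant", constant_values=False)
    h, w = mask.shape
    out = np.zeros_like(mask)
    # A pixel is set if any pixel under the square mask is set.
    for dy in range(size):
        for dx in range(size):
            out |= padded[dy:dy + h, dx:dx + w]
    return out

def remove_pattern_edges(black_areas: np.ndarray, edges: np.ndarray,
                         mask_size: int = 3) -> np.ndarray:
    """OR the dilated, inverted black-areas image with the edges image."""
    inverted = 255 - black_areas                       # pattern becomes white
    grown = dilate(inverted == 255, mask_size)         # enlarge the pattern
    dilated = np.where(grown, 255, 0).astype(np.uint8)
    return np.bitwise_or(dilated, edges)               # pattern edges turn white

# Hypothetical 9x9 sample: a printed block plus one Braille dot edge pixel.
black_areas = np.full((9, 9), 255, dtype=np.uint8)
black_areas[1:4, 1:4] = 0                              # printed pattern
edges = np.full((9, 9), 255, dtype=np.uint8)
edges[4, 2] = 0                                        # edge of the pattern
edges[0, 1] = 0                                        # edge of the pattern
edges[7, 7] = 0                                        # edge of a Braille dot
cleaned = remove_pattern_edges(black_areas, edges)
```

In this sample the two pattern edge pixels fall inside the dilated pattern and are removed, while the distant Braille dot edge survives.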

D. Data Recovery Process
1) Image of the black area's erosion: The inverted image of the dilated black areas from the previous process, Fig. 8(a), was eroded with a mask. In Fig. 8(b), the white areas are smaller than before. Fig. 8(c) shows the result after color inversion, called the image of the eroded black areas, as seen in P-01.
2) The Braille dots recovery: An OR logic operation was applied to the image of the eroded black areas, shown in Fig. 8(c), and the image of the edges, shown in Fig. 8(d), which recovers the details of the Braille embossed dots. The result was then combined with the image of the edges, as shown in Fig. 8(e). The final result was the image of removed edges and details, shown in Fig. 8(f), in which P-02 is a Braille embossed dot and P-03 is some noise.

E. Noise Removal Process
1) Diffusion of the edges: The image of removed edges and details, Fig. 9(a), was processed with a mask. The mask moved one pixel at a time and checked the color value. If the pixel at the mask's center was black, it was moved to a new position inside the mask, and the black value at the new position was saved in a white image. The result was the image of diffused edges, shown in Fig. 9(b), in which P-01 to P-03 are diffused dots and some noise.
2) Image enhancement: An averaging filter was applied to the image of diffused edges, which produced a clearer Braille embossed dot image. The result was the enhanced image shown in Fig. 9(c).
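One possible reading of the diffusion and averaging steps is sketched below. The text does not say how the new position inside the mask is chosen, so this sketch assumes a uniformly random offset within the mask; that choice, the mask sizes, the function names, and the sample image are all assumptions made for illustration.

```python
import numpy as np

def diffuse_black_pixels(img: np.ndarray, mask_size: int = 3,
                         seed: int = 0) -> np.ndarray:
    """Move each black pixel to a random position inside a mask centered on it."""
    rng = np.random.default_rng(seed)
    r = mask_size // 2
    h, w = img.shape
    out = np.full_like(img, 255)              # start from a white image
    ys, xs = np.nonzero(img == 0)             # positions of black pixels
    for y, x in zip(ys, xs):
        dy, dx = rng.integers(-r, r + 1, size=2)
        ny = min(max(y + dy, 0), h - 1)       # keep the new position in bounds
        nx = min(max(x + dx, 0), w - 1)
        out[ny, nx] = 0
    return out

def mean_filter(img: np.ndarray, size: int = 3) -> np.ndarray:
    """Averaging (box) filter with edge replication at the borders."""
    r = size // 2
    padded = np.pad(img.astype(np.float64), r, mode="edge")
    h, w = img.shape
    acc = np.zeros((h, w), dtype=np.float64)
    for dy in range(size):
        for dx in range(size):
            acc += padded[dy:dy + h, dx:dx + w]
    return (acc / (size * size)).round().astype(np.uint8)

# Hypothetical sample: a dense cluster (dot) and one isolated noise pixel.
img = np.full((12, 12), 255, dtype=np.uint8)
img[4:7, 4:7] = 0
img[10, 1] = 0
diffused = diffuse_black_pixels(img, mask_size=3, seed=0)
enhanced = mean_filter(diffused, size=3)
```

After averaging, a dense cluster of black pixels stays dark while an isolated pixel is pulled toward white, which is consistent with the clearer dot image the text describes.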

F. Braille Localization Process
1) Braille embossed dot dilation: The enhanced image, Fig. 10(a), was dilated with a mask. The result was groups of black values that show the Braille embossed dots more clearly, as in Fig. 10(b).
2) The centroid of Braille embossed dot: Connected component labeling was applied to the groups of black values from the previous step. The centroid of each group was then calculated and taken as the location of a Braille embossed dot, as shown in P-01 of Fig. 10(c).
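The labeling and centroid step can be sketched with a plain breadth-first search. This is a minimal illustration, assuming 8-connectivity and a boolean input in which True marks a black pixel; the paper does not state which connectivity it uses, and the function name and sample image are hypothetical.

```python
from collections import deque
import numpy as np

def dot_centroids(binary: np.ndarray) -> list:
    """Label 8-connected groups of True pixels; return (row, col) centroids."""
    h, w = binary.shape
    seen = np.zeros((h, w), dtype=bool)
    centroids = []
    for sy in range(h):
        for sx in range(w):
            if not binary[sy, sx] or seen[sy, sx]:
                continue
            # Breadth-first search collects one connected component.
            queue = deque([(sy, sx)])
            seen[sy, sx] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny, nx] and not seen[ny, nx]):
                            seen[ny, nx] = True
                            queue.append((ny, nx))
            # The centroid is the mean row and mean column of the component.
            mean_row = sum(p[0] for p in pixels) / len(pixels)
            mean_col = sum(p[1] for p in pixels) / len(pixels)
            centroids.append((mean_row, mean_col))
    return centroids

# Hypothetical sample with two separate groups of black pixels.
dots = np.zeros((10, 10), dtype=bool)
dots[1:4, 1:4] = True
dots[6:8, 5:9] = True
locations = dot_centroids(dots)   # [(2.0, 2.0), (6.5, 6.5)]
```

Each returned pair is one candidate Braille dot location in image coordinates.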

IV. EXPERIMENTS AND RESULTS
A. Dataset
This research produced a dataset of Braille embossed dots named KU-Braille-Dot. Braille documents were produced on reusable paper using a slate and stylus. As detailed in Table I, the reusable paper was created by printing text on 80 GSM A4 paper with a LaserJet printer and an Inkjet printer, and the Braille documents were then created on it with the slate and stylus. These Braille documents were scanned with four scanners, each with a different lighting environment, as shown in Fig. 11. The Braille document images had a resolution of 300 DPI. They were cropped to 40 × 40 pixels under six events; each event contains 100 images, 50 from the Inkjet printer and 50 from the LaserJet printer. The details of the six events were: (1) Event 1: Braille embossed dots were in the middle of any text, 100 images. [...]

B. Experimental Design
The proposed method was tested by creating single Braille cells with various patterns from the KU-Braille-Dots dataset. A single Braille cell contains six dot positions, each of which can occur in six events, so there are 6 × 6 × 6 × 6 × 6 × 6 = 46,656 patterns in total. For each dot position, the six events were arranged in order, and each event was represented by an image drawn at random from its 100 images; this is called a data group in the A form, as shown in Fig. 12. Because a single Braille cell contains six dots, it requires six data groups in the A form, which were arranged into a data group called the B form, as shown in Fig. 13. Each row of the B form was a data group in the A form corresponding to one dot position of the cell. One data group in the B form can therefore create a single Braille cell with 46,656 patterns.
This research created 150,000 groups of data in the A form. Each group had a unique event image or could contain no more than two duplicate event images. Those data were then randomly grouped into 262,144 groups in the B form, and each group had a unique arrangement of event images. This gives 262,144 Braille cells, with 262,144 × 46,656 = 12,230,590,464 patterns.
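The pattern counts in this section can be checked with a few lines of arithmetic. The sketch below enumerates the event assignments for one cell and multiplies by the number of B-form groups; the variable names are illustrative only.

```python
from itertools import product

EVENTS = range(1, 7)        # six events per dot position
DOTS_PER_CELL = 6           # six dot positions in one Braille cell
B_GROUPS = 262_144          # number of B-form groups (Braille cells)

# Closed form: one event choice per dot position.
patterns_per_cell = len(EVENTS) ** DOTS_PER_CELL

# Explicit enumeration of every event assignment for one cell.
enumerated = sum(1 for _ in product(EVENTS, repeat=DOTS_PER_CELL))

total_patterns = B_GROUPS * patterns_per_cell
print(patterns_per_cell, enumerated, total_patterns)
# prints: 46656 46656 12230590464
```

Both the closed form and the enumeration agree with the figures reported in the text.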

C. Performance Measurement
The performance of the proposed method was measured while doubling the number of Braille cells at each step, from a single Braille cell up to 262,144 Braille cells, as shown in Table II. The performance measurement is illustrated for a single Braille cell in Fig. 14.
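The four measures used in this paper are commonly computed from counts of true and false positives and negatives. The sketch below assumes these standard definitions apply to dot positions; the function name and the example counts are hypothetical and are not taken from the experiments.

```python
def segmentation_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard Precision, Recall, Accuracy and F-Measure from confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": accuracy,
        "f_measure": f_measure,
    }

# Hypothetical single cell: 5 embossed positions, 1 flat position,
# with one embossed dot missed and no false detections.
scores = segmentation_metrics(tp=4, fp=0, tn=1, fn=1)
```

With these example counts, one missed dot in a six-position cell gives an Accuracy of 5/6, which is about 0.83.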

D. Results and Discussion
The presented method was tested with the datasets described in the previous sections. This research aimed to extract the Braille embossed dots from Braille documents created on reusable paper. An example of the calculation according to the proposed method for a single Braille embossed dot is shown in Fig. 15. The documents were scanned with four flatbed scanners, that is, under four lighting conditions. The results for each lighting condition are plotted in Fig. 16 to 19; each graph has four lines representing Precision, Recall, Accuracy, and F-Measure, respectively. The y-axis is the value of the measure, which lies between 0.00 and 1.00, but the graphs are plotted from 0.6 for clarity. The x-axis is the group number of the Braille datasets used for testing. Since the Accuracy and F-Measure values are greater than 0.80, and five correct positions out of six correspond to about 0.83, there is approximately one dot position error in a single Braille cell with six dot positions. This research extracts Braille embossed dots on reusable paper, which differs from other research [1]-[15] that considers only Braille documents on plain paper.

V. CONCLUSION
This paper presented a Braille embossed dots segmentation method for Braille document images and tested it as detailed above. The results were good, with average Accuracy and F-Measure above 0.85 over all lighting conditions. The research confirmed that the proposed method is effective for Braille embossed dot segmentation of Braille documents produced on reusable paper, which in turn benefits Braille recognition.
The research included diverse flatbed scanner devices and lighting conditions. In this field, no previous research was found on Braille documents produced on reusable paper. The method presented here makes better use of paper resources and increases access to education for the blind. This research has opened a research path that gives visually impaired or blind people more opportunities to study and learn.

VI. FUTURE WORK
This research is a starting point for creating digital documents from documents produced by the blind with a slate and stylus on reusable paper. It also facilitates communication between sighted people and the blind or visually impaired, and supports their desire to be treated as ordinary. In the future, this method will be tested on the KU-Braille-Partial dataset to achieve higher accuracy, and further developed methods will be applied to actual Braille documents created on reusable paper. KU-Braille-Partial is a dataset of parts of Braille document images produced on reusable paper using a slate and stylus. The reusable paper was created by printing text on 80 GSM A4 paper with a LaserJet printer and an Inkjet printer. The documents were scanned with four scanners at a resolution of 300 DPI and cropped to 585x915 pixels; sample images are shown in Fig. 20(a) and 20(c). Ground truth images were also created; samples are shown in Fig. 20(b) and 20(d). The white rectangular areas are the Braille embossed dots, and the black areas are the non-Braille embossed dots. A red dot inside a white rectangle means that the proposed method was correct; in all other cases it was wrong. The results of this preliminary experiment showed that the proposed method was practical and that the research could be scaled up.