A Method to Accommodate Backward Compatibility on the Learning Application-based Transliteration to the Balinese Script

This research proposes a method to accommodate backward compatibility in learning application-based transliteration to the Balinese Script. The objective is to accommodate the standard transliteration rules from the Balinese Language, Script, and Literature Advisory Agency. This is considered the main contribution, since no such workaround has yet been reported in this research area. This multi-discipline collaboration is one of the efforts to digitally preserve the endangered Balinese local language knowledge in Indonesia. The proposed method covers two aspects: (1) backward compatibility allows interoperability, at a certain level, with the older transliteration rules; and (2) breaking backward compatibility at a certain level is unavoidable since, for the same aspect, the standard rule and the older rule prescribe contradictory treatments. The study was conducted on BaliScript, a web-based transliteration learning application that converts Latin text input into Balinese Script output using a dedicated Balinese Unicode font. In the experiment, the proposed method gave the expected transliteration results for the accommodation of backward compatibility.
Keywords: Backward compatibility; Balinese Script; learning application; transliteration


I. INTRODUCTION
As part of the diversity of local language knowledge in Indonesia, the endangered knowledge of Balinese Script transliteration [1]-[3] raises concerns about its preservation. The Bali Government has conducted preservation efforts through the Bali Governor Regulation [4], [5] and strengthened them with the Bali Governor Circular Letter [6]. These efforts make the Balinese Language, including Balinese Script transliteration knowledge, a mandatory local subject from elementary school to senior high school in Bali Province.
Approaches other than the governmental one should strengthen the preservation effort and give it a greater impact. This research joins the effort through a technological approach, in a multi-discipline collaboration between the Computer Science and Language disciplines. It proposes a method to accommodate backward compatibility in learning application-based transliteration to the Balinese Script. Such a method has not previously been developed or applied to the previous works, which were still based on the older transliteration rules (for short, the older rules) from The Balinese Alphabet document 1. The method accommodates the standard transliteration rules (for short, the standard rules) from the Balinese Language, Script, and Literature Advisory Agency [7]. This Bali Province government agency [4] carries out guidance and formulates programs for the maintenance, study, development, and preservation of the Balinese Language, Script, and Literature.
This study was conducted on BaliScript, the developed web-based transliteration learning application, to support further ubiquitous Balinese Language learning, since the proposed method is reusable in mobile applications [8], [9]. It also advances the previous work by (1) accommodating special words [1], [10] through a dedicated table structure in the database rather than hard-coding them in the application code; (2) using the more mature and less buggy Noto Serif Balinese (NSB) font 2, 3 [11] to represent the Balinese Script rather than the Noto Sans Balinese font 4. The NSB font is a dedicated Balinese Unicode font, so it is recognized by computer systems, including mobile devices, which makes the proposed method reusable in mobile applications; and (3) improving the learning experience in applications that use this method by adding the Indonesian and English translations of the transliterated word (see Fig. 3). Together, these advances are considered the contribution of this work.
This paper is organized as follows. Section I (Introduction) states the problem background related to the transliteration to the Balinese Script; Section II (Related Works) describes the related works in the area of the transliteration to the Balinese Script and its backward compatibility aspect; Section III (Research Method) presents the supporting algorithm, the implementation, and the testing of the proposed method; Section IV (Result and Analysis) covers the analysis of the testing result; and finally, Section V (Conclusion) presents the important conclusions and future work points.

II. RELATED WORKS
Several related works on Latin-to-Balinese Script transliteration have been conducted [10], [12]-[20]. All of them except [20] were still based on the older rules from The Balinese Alphabet document. In those works, the Balinese Script output was displayed using non-dedicated Balinese Unicode fonts (i.e. Bali Simbar 5 and Bali Simbar Dwijendra [21]) and dedicated Balinese Unicode fonts 2 [11] (i.e. Noto Sans Balinese and Noto Serif Balinese). The Bali Simbar (BS) font was used in [12] and gave relatively good accuracy on testing cases from The Balinese Alphabet document. It was also used in a robotic system that writes the Balinese Script from Latin text input [13], and in the exploration of line break handling during transliteration [14]. The Bali Simbar Dwijendra (BSD) font, an improvement of the BS font, was used in [15], which added testing cases from the Balinese Script dictionary [7] to the testing cases of [12]. It was also used in the exploration of mathematical expression transliteration [16]. Ten transliteration lessons were also learned by using this font on other testing data [17]. The Noto Sans Balinese font was used in [10] with the same testing cases as [12] and gave relatively good accuracy. It was also used in a robotic system that writes the Balinese Script from Latin text input [18]. An extensive accuracy analysis of the developed algorithm [10] was carried out in [19] for future improvement. The Noto Serif Balinese font was used in [20] for the unavoidable affixed words that need to be transliterated.
The other direction, Balinese Script-to-Latin transliteration, used GNU Optical Character Recognition (OCR), i.e. Ocrad 6 [22]. That research was limited to basic syllable recognition (see The Balinese Alphabet document) from Balinese Script images based on the glyph shapes of the Bali Simbar font. To advance functionality and mobile adoption for ubiquitous learning, Tesseract 7 OCR was also used, although it needs several future improvements [23].

III. RESEARCH METHOD
The proposed method to accommodate backward compatibility in transliteration to the Balinese Script covers two aspects related to the older transliteration rules from The Balinese Alphabet document: (1) backward compatibility allows interoperability, at a certain level, with the older rules; and (2) breaking backward compatibility at a certain level is unavoidable since, for the same aspect, the standard rule and the older rule prescribe contradictory treatments.
This section describes (1) the supporting algorithm of the proposed method; (2) the implementation on BaliScript, the web-based transliteration learning application; and (3) the testing, which uses the testing cases of The Balinese Alphabet document updated to comply with the standard transliteration rules from the Balinese Language, Script, and Literature Advisory Agency [4], [7].
A. The Algorithm
Those two aspects should be handled by the proposed method. Fig. 1 shows the flowchart of the algorithm, whose implementation uses regular expressions [25], [26].

B. The Implementation
Fig. 2 (a) shows the Model-View-Controller (MVC) architecture [27]-[29] of BaliScript, the web-based transliteration learning application used by the proposed method. The supporting database table (Fig. 2 (b)) consists of records from the Balinese Script dictionary [7]. Fig. 3 shows the Indonesian and English translations of an example transliterated word, which improve the learning experience in the application. As described previously, this feature is one of the advances contributed by this work. BaliScript was built with the Apache web server, the MySQL database server, and PHP code combined with JavaScript code. This application was also used to explore scriptio continua management in the previous work [30]. Its output view displays the transliteration result and the results for the closest similar words in the database, where the similarity calculation is based on the Levenshtein distance [31], [32]. Fig. 3 (b) shows the transliteration output of an example homonym word [33], [34] from the similarity list, selected through AJAX-based switching (clicking on the word "USE" next to the chosen word).
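The similarity list based on the Levenshtein distance can be illustrated with a minimal Python sketch. The function names `levenshtein` and `closest_words` and the sample word list are hypothetical and serve only as an illustration; BaliScript itself is built with PHP and JavaScript, and its actual database query is not reproduced here.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the edit distance between a and b (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def closest_words(query: str, vocabulary: list[str], limit: int = 3) -> list[str]:
    """Rank vocabulary entries by distance to the query (hypothetical helper)."""
    return sorted(vocabulary, key=lambda w: (levenshtein(query, w), w))[:limit]


# Hypothetical stand-in for the words stored in the database table.
words = ["ak\u0113h", "sela", "bisama"]
print(closest_words("akeh", words, limit=2))
```

Under these assumptions, an input typed as "akeh" ranks the registered word "akēh" first because the two differ by a single substitution.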

C. The Testing
The testing of the proposed method was conducted on BaliScript, running on an Intel Core i7-4600U CPU @2.09GHz platform with 8 GB RAM and the Windows 8 64-bit operating system.

IV. RESULT AND ANALYSIS
Table I shows the testing cases, which contain sections of interest (the marked sections) related to the backward compatibility results (see Fig. 4). Note that the testing used the updated testing cases that comply with the standard transliteration rules from the Balinese Language, Script, and Literature Advisory Agency [7] rather than the original testing cases [10] that refer to The Balinese Alphabet document.
[Table I: testing cases. Recoverable glosses: Alphabet; "God's name"; Ceremony; One; "A Javanese King"; "One holy letter"; "Symbol of God".]
In Table I, since the vowel "e" of the Balinese word "Sēla" (Yam) has the sound [e] [24] for a certain meaning (the other "e", U+0065, with the sound [ə], has a different meaning), the writing of that vowel should be changed to "ē" to comply with the standard rule. This condition breaks the backward compatibility of the transliteration since the vowel "e" is a member of the letter set BBC (see The Algorithm section).
There are several sections of interest in Table I related to backward compatibility:
• The bold underlined section on the updated testing case shows a section of interest that has backward compatibility, where its transliteration result adheres to [7] and is the same as the transliteration result of the original testing case. This backward compatibility was achieved by the related process in the algorithm.
• The bold dotted-underlined section on the updated testing case shows a section of interest whose transliteration result adheres to [7] but differs from the transliteration result of the original testing case using The Balinese Alphabet document.
• The bold gray section on the updated testing case shows a section of interest that has broken backward compatibility by using a different writing, where its transliteration result adheres to [7] and is the same as the transliteration result of the original testing case.
• The underline-across-space section on the updated testing case shows a section of interest whose transliteration result adheres to [7] and differs from the transliteration result of the original testing case. This is because continuous (phrase or sentence) transliteration was used rather than word-by-word transliteration. If both the updated and the original testing cases use the same kind of transliteration, the results should be the same. This is mentioned as a perspective that is largely unrelated to backward compatibility. For example, in case 2 of Table I, the Balinese phrase "Kādep Jěro" (Sold House) has the continuous transliteration result "ᬓᬵᬤᭂᬧ᭄ᬚᭂᬭᭀ", which adheres to [7] and differs from the word-by-word transliteration result "ᬓᬵᬤᭂᬧ᭄ ᬚᭂᬭᭀ". In continuous transliteration, the consonant "J" of the second word "Jěro" (House) is transliterated in appended form as "ᬧ᭄ᬚᭂ" (hanging below the regular form of the consonant "p" of the word "Kādep"), while its vowel "ě" is transliterated as a vowel sign (upper form). In word-by-word transliteration, the consonant "J" of the second word "Jěro" is transliterated in regular form as "ᬚᭂ" (positioned to the side, after the sound killer adeg-adeg "᭄" that kills the inherent sound of the consonant "p" of the word "Kādep"), while its vowel "ě" is transliterated as a vowel sign (upper form).
Even though the Balinese Script employs the scriptio continua style [35], Fig. 4 shows the transliteration result in a non-scriptio continua style (including preserved line breaks), which the BaliScript learning application can generate for ease of visual analysis. This style is supported by the white-space 9 property of Cascading Style Sheets (CSS), set to pre-line. It has the same space and line break format as the Latin text input from the testing cases of Table I, and so provides a clear mapping between each input section of the Latin text (i.e. alphabet, syllable, word, or punctuation) and its related output section of the Balinese Script. That clear mapping results from the spaces and line breaks between those sections being preserved by the transliteration algorithm [30]. The backward compatibility analysis of the transliteration results in Fig. 4 was based on the marked sections in Table I. The algorithm maintains backward compatibility and, on the other hand, unavoidably breaks backward compatibility to comply with the standard transliteration rules [7].
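The idea of preserving spaces and line breaks so that each input section maps to one output section can be sketched in Python as follows. This is only an illustration under stated assumptions: the function `transliterate_preserving_layout` is a hypothetical name, and `str.upper` stands in for the real Latin-to-Balinese conversion, which is not reproduced here.

```python
import re
from typing import Callable


def transliterate_preserving_layout(text: str, convert: Callable[[str], str]) -> str:
    """Convert each non-whitespace section and keep whitespace runs untouched."""
    # The capturing group makes re.split keep the whitespace runs in the result.
    pieces = re.split(r"(\s+)", text)
    return "".join(p if p == "" or p.isspace() else convert(p) for p in pieces)


# str.upper is a placeholder for the actual transliteration step.
demo = "ka da\npa"
print(transliterate_preserving_layout(demo, str.upper))
```

Because the separators pass through unchanged, the output keeps the same space and line break positions as the input, which is what makes the section-by-section comparison possible.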
Regarding the maintenance of backward compatibility, the bold underlined section on the updated testing case shows a section of interest whose transliteration result adheres to [7] and is the same as the transliteration result of the original testing case. For example, in case 2 of Table I, the Balinese phrase "Kādep Jěro" (Sold House) has the continuous transliteration result "ᬓᬵᬤᭂᬧ᭄ᬚᭂᬭᭀ", which adheres to [7] and is the same as the continuous transliteration result from The Balinese Alphabet document (see the underline-across-space section described previously).
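The contrast between continuous and word-by-word transliteration of "Kādep Jěro" can be made concrete with a toy Python sketch. It is not the algorithm of Fig. 1: the mapping tables cover only the letters of this one phrase, the function names are hypothetical, and the code points are taken from the Unicode Balinese block. A consonant not followed by a vowel receives adeg-adeg (U+1B44); when another consonant follows it directly, a font renders that consonant in appended form.

```python
# Toy tables limited to the letters of "Kādep Jěro" (illustrative assumption).
CONSONANTS = {"k": "\u1B13", "d": "\u1B24", "p": "\u1B27", "j": "\u1B1A", "r": "\u1B2D"}
VOWEL_SIGNS = {
    "a": "",             # inherent vowel, no sign written
    "\u0101": "\u1B35",  # ā -> tedung
    "e": "\u1B42",       # e -> pepet (as in the quoted result)
    "\u011B": "\u1B42",  # ě -> pepet
    "o": "\u1B40",       # o -> taling tedung
}
ADEG_ADEG = "\u1B44"


def transliterate_run(latin: str) -> str:
    """Transliterate one unbroken run of consonant(+vowel) units."""
    chars = latin.lower()
    out = []
    i = 0
    while i < len(chars):
        if chars[i] not in CONSONANTS:
            raise ValueError(f"unsupported letter: {chars[i]!r}")
        out.append(CONSONANTS[chars[i]])
        following = chars[i + 1] if i + 1 < len(chars) else None
        if following in VOWEL_SIGNS:
            out.append(VOWEL_SIGNS[following])
            i += 2
        else:
            out.append(ADEG_ADEG)  # kill the inherent vowel
            i += 1
    return "".join(out)


def continuous(phrase: str) -> str:
    return transliterate_run("".join(phrase.split()))


def word_by_word(phrase: str) -> str:
    return " ".join(transliterate_run(w) for w in phrase.split())


phrase = "K\u0101dep J\u011Bro"
print(continuous(phrase))
print(word_by_word(phrase))
```

The two outputs contain the same code points; the only difference is the space after adeg-adeg in the word-by-word result, which prevents the "j" from being stacked below the "p".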
Among the cases with bold underline marks, some were also marked with a bold dotted underline, since each has a transliteration result that adheres to [7] but differs from the transliteration result of the original testing case. For example, the Balinese words "Işwara" (God's name) and "Bhiśama" (Decree), in case 3 and case 6 of Table I respectively, have variant words in [7], i.e. "Iswara" and "bisama". Each variant should be transliterated the same as its counterpart, as "ᬈᬰ᭄ᬯᬭ" and "ᬪᬷᬱᬫ" [7], which differ from the original results "ᬇᬰ᭄ᬯᬭ" (without vowel sign tedung "ᬵ") and "ᬪᬶᬱᬫ" (without vowel sign ulu sari "ᬷ"). This condition should be handled by the effort to maintain backward-compatible transliteration. Beyond that, the variants "Işwara", "Iswara", "Bhiśama", "bisama", and others should be registered with the same related value in the column "sword" of the database table (see Fig. 2 (b)) so that they yield the same transliteration result that adheres to [7].
Among the cases with bold underline marks, some could safely have their vowel "e" (U+0065) associated with the vowel "ē" (U+0113) through database registration because of its sound [e] [24]. This was possible since no counterpart word has the vowel "e" (U+0065) with the sound [ə]. This condition is related to the testing cases with the bold gray section discussed next. For example, the Balinese word "Akeh" (Many), in case 16 of Table I, has a variant word in [7], i.e. "akēh", and both should be transliterated the same, as "ᬳᬓᬾᬄ" [7]. This is an exception to the standard rule [7], under which the vowel "e" (U+0065) is transliterated using "ᭂ" (Balinese vowel sign pepet, U+1B42) while the vowel "ē" (U+0113) is transliterated using "ᬾ" (Balinese vowel sign taling, U+1B3E). The variants "Akeh" and "akēh" should therefore be registered with the same related value in the column "sword" of the database table (see Fig. 2 (b)) so that they yield the same transliteration result that adheres to [7]. This condition should be handled by the effort to maintain backward-compatible transliteration since people naturally write a word in the easiest way (writing "e" rather than "ē"), including when entering text into the transliteration application.
Regarding the unavoidable breaking of backward compatibility to comply with the standard transliteration rules [7], the bold gray section on the updated testing cases shows a section of interest that has broken backward compatibility by using a different writing, where its transliteration result adheres to [7] and is the same as the transliteration result of the original testing case. For example, in case 2 of Table I, the Balinese word "Sēla" (Yam) with its vowel "ē" (U+0113) has the transliteration result "ᬲᬾᬮ", which adheres to [7] and is the same as the transliteration result from The Balinese Alphabet document (see the standard rule [7] described previously, under which the vowels "e" and "ē" are transliterated using the vowel signs pepet and taling, respectively). If the Balinese word "Sela" with its vowel "e" (U+0065) from the original testing case is used, its transliteration result "ᬲᭂᬮ" does not adhere to [7], even though it is the same as the transliteration result from The Balinese Alphabet document.
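The two situations discussed above, a registered variant such as "Akeh" that keeps backward compatibility and an unregistered word such as "Sela" that breaks it, can be sketched in Python. The dictionary `REGISTERED` is a hypothetical stand-in for the "sword" column, and the assumption that the shared value is a Latin spelling is ours; the vowel signs used are pepet (U+1B42) and taling (U+1B3E), the latter being the sign that appears in the quoted results "ᬳᬓᬾᬄ" and "ᬲᬾᬮ".

```python
PEPET = "\u1B42"   # Balinese vowel sign pepet
TALING = "\u1B3E"  # Balinese vowel sign taling

# Hypothetical stand-in for the "sword" column: every registered variant
# spelling points to one shared value (assumed here to be a Latin spelling).
REGISTERED = {
    "akeh": "ak\u0113h",
    "ak\u0113h": "ak\u0113h",
}

# Rule for the two vowels: plain "e" -> pepet, "ē" (U+0113) -> taling.
VOWEL_SIGN = {"e": PEPET, "\u0113": TALING}


def resolve(word: str) -> str:
    """Return the registered shared value, or the lowercased word itself."""
    key = word.lower()
    return REGISTERED.get(key, key)


def e_signs(word: str) -> list[str]:
    """List the vowel signs chosen for each "e" or "ē" in the resolved word."""
    return [VOWEL_SIGN[ch] for ch in resolve(word) if ch in VOWEL_SIGN]


# Registered variants agree, so the old spelling still works.
print(e_signs("Akeh") == e_signs("ak\u0113h"))   # True
# Unregistered spellings diverge, so the writing must change to "ē".
print(e_signs("Sela") == e_signs("S\u0113la"))   # False
```

Registration is only safe when, as stated for "Akeh", no counterpart word uses the plain "e" with the sound [ə]; "Sela" has no such guarantee, which is why it is left out of the table in this sketch.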

V. CONCLUSION AND FUTURE WORK
A method to accommodate backward compatibility was proposed for learning application-based transliteration to the Balinese Script. It covers two aspects related to particular sets of letters. The first aspect concerns the transliteration of a set of letters for which backward compatibility is maintained. The second aspect concerns the transliteration of a set of letters for which backward compatibility is unavoidably broken to comply with the standard rules from the Balinese Language, Script, and Literature Advisory Agency.