Stanza Type Identification using Systematization of Versification System of Hindi Poetry

Poetry forms a large part of the literature of any language, and Hindi poetry likewise accounts for a substantial share of Hindi literature. Composing Hindi poetry requires following various rules of versification. This paper focuses on automatic metadata generation from such poems using computational linguistics together with systematic, prosody rule-based modeling and detection procedures designed specifically for Hindi poetry. The paper covers the challenges involved and the best available solutions to them, and describes the methodology for automatically generating “Chhand” metadata from the stanzas of poems. It also provides some advanced information and techniques for metadata generation for “Muktak Chhands”. The “Chhand” rules incorporated in this research were identified, verified, and modeled from a computational linguistics perspective for the first time, which required considerable effort and time. In this research work, 111 different “Chhand” rules were found, and this paper presents rule-based modeling of all of them. Of the modeled “Chhands”, the research work covers 53 for which between 20 and 277 examples were found and used for automatic processing and metadata generation. The automatic metadata generator processed 3120 UTF-8 encoded inputs of 53 Hindi “Chhand” types and achieved 95.02% overall accuracy, with an overall failure rate of 4.98%. The minimum processing time for metadata generation was 1.12 seconds, and the maximum was 91.79 seconds.
Keywords: Chhand; computational linguistics; Hindi; metadata; poetry; prosody; stanza; verse


I. INTRODUCTION
Hindi ('हिंदी') is a widely spoken language. According to India's 2011 census, 322 million people spoke Hindi as their first language [1]. Hindi is written in the Devanagari ('देवनागरी') script, which is the fourth most widely adopted writing system in the world [2]. More than 120 languages are written in the Devanagari script. As per The Unicode Standard, Version 13.0, the Devanagari Unicode range is 0900-097F [3].
Poetry holds an irreplaceable place in the literature of every language. The creation of a poem usually follows specific patterns or rules known as prosody or poetics. The prosody rules make it possible to detect and decide what kind of poem, or part of a poem, a given text is. These patterns differ from language to language, and even within the same language there can be many prosody rule-based patterns [4].
Computational linguistics research uses two types of language processing approaches: text-based and speech-based [5]. Both play a vital role in research on poetry. The text-based approach handles the large body of text-oriented rules, while the speech-based approach is useful for speech-related practices and patterns. The choice of approach depends on the need and the nature of the research problem [6].
This research work revolves around Hindi poetry and its prosody rules. A significant part of these rules concerns the order of letters and their frequency. Using the text-based approach of computational linguistics, the practices of stanza composition were first classified systematically. The rules were then used for rule-based modeling so that metadata can be generated automatically. This research work introduces a proper classification structure for Hindi verses and attempts to detect and identify Hindi verses based on their formation rules.
Many verses have been written in Hindi, and the tradition of composing them goes back to ancient times. The knowledge hidden in the rules of verse creation inspired this research work. The authors strongly believe that this work will prove a milestone in preserving the composition of these verses and will give a new direction to computational linguistics research.
The rest of the paper has five major sections: literature review, the knowledge base about the Hindi stanza, methodology, results, and conclusion. The literature review gives a picture of the current research need or gap. The next part explains the systematic classification of stanzas introduced by this research work, and discusses the formation and calculation details of the stanzas in depth as a basis for the methodology. The results section reports the outcomes of the various tests and experiments. Finally, the conclusion section summarizes the significant findings, developments, and results of the research work.
*Corresponding Author www.ijacsa.thesai.org

II. LITERATURE REVIEW
A literature review is a fundamental part of any research work, and every effort was made to find the relevant studies. No research work was found that deals directly with metadata generation for Hindi prosody on the basis of computational linguistics. Some closely related work on computational linguistics or metadata generation was found and is discussed here.
Research on Indian regional languages was surveyed to establish the state of research in the Indian languages segment. Different studies were found to focus on different aspects of problems in Indian language-based research. Audichya and Saini [7] introduced a computational linguistics approach for automatic metadata generation for Hindi poetry based on a unified rule-based technique. They achieved nearly 98% correct results; the remaining errors were due to input and data-related issues. Joshi and Kushwah [8] studied the detection of 'चौपाई' (Chaupai, a type of Hindi verse) and achieved more than 85% detection accuracy. They attributed the errors to poets who increase or decrease the 'Matras' to maintain rhythm or flow, owing to differently structured but similar-sounding words at the end. They also studied the detection of another Hindi verse, 'रोला' (Rola, a type of Hindi verse), and achieved around 89% accuracy [9]. The remaining cases failed because of liberties taken by the poets during composition and because the sum of the 'मात्रा' (Matras, quantity) exceeded the expected threshold.
Bafna and Saini [10] classified Hindi verses using various machine learning algorithms, comparing SVM, Decision Tree, Neural Network, and Naive Bayes on the classification of 697 poems. In another work, they classified Hindi poetry using eager supervised machine learning algorithms and evaluated the results using the misclassification error [11]. The same researchers also predicted the Hindi verse class using concept learning and found that K-nearest neighbors performed better [12].
Kaur and Saini [13] worked on the classification of Punjabi poems using ten different machine learning algorithms. In several other works, they studied Punjabi poems using poetic features, linguistic features, and weighting [14][15]. In another work, they designed a content-based Punjabi poetry classifier in WEKA using different machine learning algorithms and found that the Support Vector Machine performed best, with an accuracy of 76.02% [16][17].
Pal and Patel [18] developed a model based on the nine 'Ras' and tried to classify them using machine learning. Saini and Kaur [19] also carried out emotion detection research focusing on 'Navrasa' using the Naïve Bayes (NB) and Support Vector Machine (SVM) algorithms; SVM performed better, with 70.02% overall accuracy. Bafna and Saini [20] also worked on applications for prose and verse in Hindi and Marathi. In another work, the same researchers presented a technique for identifying context-based standard tokens in Hindi verse and prose [21]. Some research on the Sanskrit language based on computational linguistics was also found [22]. In addition, research on automatic metadata generation [23], text-based document classification [24][25], and computational linguistics-based metadata building (CLiMB) was explored [26].
Well-known international research on other languages was reviewed to understand current trends and progress. Several notable studies were found. Alsharif, AlShamaa, and Ghneim [27] carried out emotion classification of Arabic poetry using machine learning. Hamidi, Razzazi, and Ghaemmaghami [28] used Support Vector Machines for automatic meter classification of Persian poetry. Other researchers have explored Arabic, Bengali, Chinese, English, Hindi, Marathi, Oriya, Persian, Punjabi, and Urdu [29][30][31].
Based on this literature review, essential research on automatic metadata generation for Hindi poems is still lacking. Before that, the universal, unified prosody ('Chhand') rules need to be structured systematically, and a robust rule-based method for metadata generation needs to be built on them.

A. Quantity Calculation Rules Simplified
Many poetry rules were tested and authenticated to find the appropriate stanza formation rules. Some manual calculations were carried out to learn the rules and to draw facts from the results. To validate the identified rules, information from various sources was considered: advice from experts and the notes they provided, some ancient books [32,33,40], online articles, and books that contain only small parts on 'Chhands' [38][39][40][41]. The research work is designed so that new or missing rules can be adopted smoothly and systematically if additional information or practices are found in the future. Only the rules considered for the calculations are listed here; further rules may or may not be found in the future.
Whether carrying out research or aiming to become an accomplished 'Chhand' writer, one must know the rules in order to understand poems and 'Chhands' thoroughly. The basic simplified rules of 'Matra Ganna', or quantity calculation, for Hindi are as follows.
The 'Chhand' rules were identified, verified, and modeled from a computational linguistics perspective for the first time in this research area. Scrutinizing and modeling something that is not already standardized requires considerable effort and time. The special exceptional rules were incorporated as well.

1) Common Rules a) Each 'Hrasva' Vowel
2) Special Exceptional Rules:
a) If the 'Anunasik' or 'Ardh Chandrabindu' ('ँ') is applied to a consonant that is considered 'Laghu' and carries no 'Hrasva' diacritic, then the quantity is counted as one, as the consonant is treated as 'Laghu'. If it is used on the 'Hrasva', there is no change in quantity, and it is counted as two quantities.
b) A ligature or joint character at the start of a word is considered 'Laghu' and counted as one quantity only.
c) A ligature or joint character at the start of a word together with a 'Dirgh' vowel diacritic is considered 'Guru' and counted as two quantities, and the value of the half character becomes 0. Example: त =22, न=21, य न=21.
d) A 'Laghu' character before a ligature is considered 'Guru', and its quantity becomes two instead of one.
e) A 'Guru' character before a ligature remains 'Guru' and is counted as two quantities.
f) If the character after the ligature is 'ह' with a 'Dirgh' vowel diacritic, then the value of the half character becomes zero. Example:
g) If the character after the ligature is 'ह' with a 'Laghu' vowel diacritic, or is a 'Laghu' character, there is no change in the calculation rules. Example: हन=211, अ ह =211.

3) Most affecting exceptions:
Sometimes a 'Laghu' at the end of a stanza is considered 'Guru' as per the pronunciation. Some poets also add or drop diacritics to maintain the flow or by their own choice. Some poets take the existing 'Chhand' creation rules as a reference but do not follow them completely, which affects automatic detection. In addition, poorly formatted input and junk characters that ignore the formation rules, such as emojis, universal special characters, and special characters from other languages, lower the accuracy of the results.
After applying these rules and obtaining the allocated quantities, one needs to sum them on whatever basis is required. Here is an example of a word-level quantity sum.
Example: सीताराम = 2221 = 2+2+2+1 = 7, सत्य = 21 = 2+1 = 3
4) 'Gana' Sequence: 'Varnik' verses use the 'Gana' in their sequence-related rules, so one needs to know more about the 'Gana'; Table I gives the details.
Table I
These are the calculation-related rules and methods, which help in understanding this research work and should help future research as well. The rules had to be simplified and put into a properly standardized flow, which required extensive effort and experimentation. Managing everything was challenging because no pertinent research-level articles or standard references were found.
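The word-level sum and the 'Gana' grouping can be illustrated with a minimal sketch. It assumes a 'Matra' allocation is already available as a string of digits (1 for 'Laghu', 2 for 'Guru'). The `GANA` mapping is the conventional eight-'Gana' scheme of prosody and is assumed, not confirmed, to match Table I; the function names are hypothetical.

```python
from __future__ import annotations

# Conventional eight Gana, written with 1 = Laghu and 2 = Guru.
# Assumption: this matches the scheme listed in the paper's Table I.
GANA = {
    "122": "Ya", "222": "Ma", "221": "Ta", "212": "Ra",
    "121": "Ja", "211": "Bha", "111": "Na", "112": "Sa",
}


def matra_sum(allocation: str) -> int:
    """Sum the per-character quantities, e.g. '2221' -> 7."""
    return sum(int(digit) for digit in allocation)


def gana_sequence(allocation: str) -> list[str]:
    """Group an allocation of 1s and 2s into Gana, three characters at a time.

    Leftover characters that do not fill a group are reported
    individually as 'Laghu' or 'Guru'.
    """
    names = []
    for start in range(0, len(allocation), 3):
        group = allocation[start:start + 3]
        if len(group) == 3:
            names.append(GANA[group])
        else:
            names.extend("Laghu" if d == "1" else "Guru" for d in group)
    return names


print(matra_sum("2221"), matra_sum("21"))
print(gana_sequence("122122122122"))
```

The sketch expects only the digits 1 and 2 in `gana_sequence`; allocations that contain a 0 for a half character would need to be merged first.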

B. Structure Creation of Hindi 'Chhands' from a Research Perspective
When the authors started looking for information on 'Chhands', they found that only a small amount of properly arranged information is available [34], and that what is available is contradictory across sources. For instance, different rules and creation information for 'Bujangi Chhand' were found in different places [42][43]. The available data therefore cannot be used directly for research. The information collected from the sources mentioned in Section 3.1 was manually analyzed, validated, and authenticated, and it was decided that it first needs to be systematized, which requires an adequate structure. This section is an effort in that direction.
'Laukik Chhands' are verses written by people and are not part of the Vedas. They are found in both Sanskrit and Hindi. As the research work focuses on Hindi poetry, only Hindi 'Chhands' are discussed. These verses can be classified into three classes based on the nature of the 'Chhand' writing rules.
Table II shows the names of the 'Chhands', their classification types and subtypes, and the 'Matra' counts used for detection during rule-based modeling. Apart from these classifications and the 'Matra' count, further rules are associated with specific verses, and these rules differ for every verse. A few 'Sam Matrik Chhands', some 'Ardh Sam Matrik Chhands', and 'Visham Matrik Chhands' are shown in Tables III and IV.
Table III shows information about the 'Ardh Sam Matrik Chhands', in which the 'Matra' counts of even and odd stanzas differ. The 'Visham Matrik Chhands' are considered as well.
2) 'Varnik Chhands': In the creation of 'Varnik Chhands', the 'Gana' plays a vital role, as these verses are based on characters, or 'Varnas'. For each 'Chhand', predefined sequences must be maintained according to specific arrangements of 'Gana', 'Laghu', and 'Guru' characters. Table V shows some of the 'Varnik Chhands' with their rules and the symbolic representation of those rules.
Here it can be seen that the sequence of characters matters most and is based on the eight 'Gana' and the 'Guru' and 'Laghu' characters. Only a few 'Matrik' and 'Varnik' verses are included here. 'Ardh Sam Varnik' and 'Visham Varnik' information can be managed and organized in the same way.
Plenty of verses exist, so it is currently impossible to say how many 'Chhands' there are. That does not mean the remaining possibilities should be left out, so the significant rules and classification-related information were added for many 'Chhands', including those for which few examples or little information are available. Information on fifty-three different verses was added here; adding all verses does not seem appropriate, as the list goes on and on and little information is available for the remaining 'Chhands'. The managed information holds the relevant details of already written poems, and if new types follow specific rules, such new classes can be organized in the same systematic way. If research demands it, new or additional information blocks can also be added for each 'Chhand', which was not possible until now because the raw information was cluttered and unmanaged.
Note that this information is only the core detail of the verse creation structures. Some further aspects still need to be managed, and they differ from one verse type to another. The conceptual part is discussed in the methodology section.

IV. METHODOLOGY
Based on the systematized rules, an attempt was made to provide a concrete concept for automatic metadata generation for Hindi poetry by incorporating unified rule-based modeling. This research work covers multiple existing Hindi verses and tries to cover every poem that follows a systematic writing rule, whether it has already been written but is not yet found or will be written in the future.

A. Data Pre-Processing
UTF-8 encoded Devanagari data was chosen as the input for the automatic metadata generator so that it will be easier to use and integrate with the latest technologies in the future. The generator expects UTF-8 input in the form of complete poems, a few lines, or parts of poems [3]. The input is first trimmed and cleaned, and the cleaned data is then passed on to the separation operations. Separation takes place at several levels, as this research work demands. The first stage is the line-level break, with the newline character '\n' as the standard delimiter. The lines are then split further into parts known as 'Charans', or stanzas of the verse, using a few delimiters (',', '|| ', '| '). Any other delimiter can easily be added if needed. After the 'Charan' separation, the stanzas are split into words by word-level separation, and each word is further divided into characters and diacritics.
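The separation levels described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and the Devanagari danda and double danda (U+0964, U+0965) are added to the delimiter set as an assumption alongside the ',' and '|' delimiters named in the text.

```python
import re

# Delimiters named in the text (',' and '|', which also covers '||'),
# plus the Devanagari danda and double danda (an assumption of this sketch).
CHARAN_DELIMITERS = re.compile(r"[,|\u0964\u0965]+")


def split_lines(poem: str) -> list:
    """Line-level separation on the newline character, dropping blank lines."""
    return [line.strip() for line in poem.split("\n") if line.strip()]


def split_charans(line: str) -> list:
    """'Charan'-level separation on the configured delimiters."""
    return [part.strip() for part in CHARAN_DELIMITERS.split(line) if part.strip()]


def split_words(charan: str) -> list:
    """Word-level separation on whitespace."""
    return charan.split()


def split_characters(word: str) -> list:
    """Character-level separation: each Unicode code point, so base
    letters and dependent diacritics come out as separate items."""
    return list(word)


poem = "राम नाम, सुख धाम |\nजय जय राम ||"
for line in split_lines(poem):
    for charan in split_charans(line):
        print([split_characters(word) for word in split_words(charan)])
```

Splitting by code point is what makes the later diacritic counts possible, since a Devanagari syllable such as 'रा' is stored as a consonant followed by a separate vowel sign.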

B. 'Chhand' Detection based on the Classification
One might wonder why this much separation is required. The straightforward answer is that the different separations are needed at different phases. With the data separated in this way, the calculations can be performed efficiently and in an organized manner. The line, stanza, word, and character-level separations yield meaningful information such as the word count, character count, diacritic count, stanza count, and the sequence of words and characters, which in turn helps retrieve more meaningful data during further processing.
Let us now understand this concept with an example. Here is an example of the Hindi verse type 'Doha' from 'Hanuman Chalisa' by the well-known poet-saint Tulsidas.
These are the different types of separation. The diacritic count was separated into 'Guru' and 'Laghu' diacritic counts for the diacritic statistics. The count of the ' ' sign is used to record how many times half characters were used. After this comes the main part of the implementation: the calculation based on all the rules given under 3.1 Quantity Calculation Rules Simplified. This calculation mechanism detects the verse, its type, its subtype, and much more. Let us see the stanza-wise 'Matra' allocation and counting for the given input:
The 'Matra' allocation and the 'Matra' count are used to detect the verse automatically after the rule-based modeling of the verse rules. The 'Matra' allocation is also used, after a few modifications and merging, for character sequence mapping, specifically for 'Varnik' verses.
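A simplified 'Matra' allocation for a single word might look like the sketch below. It covers only the common rules and the ligature rules (a half consonant counts as 0, lengthens a preceding 'Laghu', and has no such effect at the start of a word). The 'ह' exceptions, the end-of-stanza exception, and nukta letters are deliberately left out. The function and constant names are hypothetical and the code is not the authors' implementation.

```python
HRASVA_VOWELS = set("\u0905\u0907\u0909\u090b")                   # अ इ उ ऋ
DIRGH_VOWELS = set("\u0906\u0908\u090a\u090f\u0910\u0913\u0914")  # आ ई ऊ ए ऐ ओ औ
DIRGH_SIGNS = set("\u093e\u0940\u0942\u0947\u0948\u094b\u094c")   # long vowel signs
ANUSVARA_VISARGA = set("\u0902\u0903")                             # these make a syllable Guru
HALANT = "\u094d"                                                  # marks a half consonant


def is_consonant(ch: str) -> bool:
    """True for the basic Devanagari consonants, U+0915 (ka) to U+0939 (ha)."""
    return "\u0915" <= ch <= "\u0939"


def allocate_matra(word: str) -> list:
    """Return one quantity per full character: 1 = Laghu, 2 = Guru."""
    values = []
    i = 0
    while i < len(word):
        ch = word[i]
        if ch in HRASVA_VOWELS:
            values.append(1)
        elif ch in DIRGH_VOWELS:
            values.append(2)
        elif is_consonant(ch):
            if i + 1 < len(word) and word[i + 1] == HALANT:
                # Half consonant: contributes 0 itself and turns a preceding
                # Laghu into Guru; at the start of a word there is nothing to change.
                if values and values[-1] == 1:
                    values[-1] = 2
                i += 2
                continue
            values.append(1)
        elif ch in DIRGH_SIGNS or ch in ANUSVARA_VISARGA:
            if values:
                values[-1] = 2
        # Short vowel signs and the Chandrabindu leave the value unchanged.
        i += 1
    return values


for word in ("सीताराम", "सत्य"):
    allocation = allocate_matra(word)
    print(word, allocation, sum(allocation))
```

Summing the allocations of all words in a 'Charan' gives the 'Matra' count that the detection rules are applied to.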
After this allocation and 'Matra' counting, the input passes through the sets of rule-based methods designed for the individual verses. Each verse follows its own unique set of rules. The 53 verses were modeled by rules and are detected in a bottom-up approach in which the verse is identified first. The verse type and subtype are then mapped from the systematically managed list of type and subtype relationships.
To be detected automatically as a particular verse, the input must follow all the rules associated with that verse. Consider the given input after the different parts of the processing.
At this point, the automatic metadata generator does not yet know which kind of verse the input is. However, the generator is built on rule-based modeling, and the rules for the verse named 'Doha' are already available, so once it processes the data, it can say that the input is a 'Doha'. The rule for writing a 'Doha' is that the 'Matra' count of the odd stanzas must be 13 and the 'Matra' count of the even stanzas must be 11.
In addition, the even stanzas must end with a 'Laghu' character. When the 'Matra' count operation is performed, the 1st and 3rd stanzas have 'Matra' counts of 13 and 13, the 2nd and 4th stanzas have counts of 11 and 11, and the 2nd and 4th stanzas each end with a 'Laghu' character. With this modeling, the metadata generator can say that the input is a 'Doha'. From the systematically organized hierarchy, 'Doha' is mapped to the 'Matrik' type and the 'Ardh Sam Matrik' subtype.
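The 'Doha' rule stated above translates almost directly into code. The following sketch assumes each stanza is already available as a list of per-character quantities (1 for 'Laghu', 2 for 'Guru'); the function name, the `HIERARCHY` dictionary, and the sample allocations are illustrative only.

```python
# Type and subtype mapping for the detected verse, as described in the text.
HIERARCHY = {"Doha": ("Matrik", "Ardh Sam Matrik")}


def is_doha(stanzas: list) -> bool:
    """Check four stanzas against the Doha rule: odd stanzas sum to 13,
    even stanzas sum to 11, and even stanzas end with a Laghu (1)."""
    if len(stanzas) != 4:
        return False
    for number, allocation in enumerate(stanzas, start=1):
        expected = 13 if number % 2 == 1 else 11
        if sum(allocation) != expected:
            return False
        if number % 2 == 0 and allocation[-1] != 1:
            return False
    return True


# Made-up allocations with the right shape, not taken from a real verse.
odd = [2, 2, 2, 2, 2, 1, 2]   # sums to 13
even = [2, 2, 2, 2, 2, 1]     # sums to 11 and ends with Laghu
if is_doha([odd, even, odd, even]):
    print("Doha", HIERARCHY["Doha"])
```

Checking the sum before the final character keeps an empty stanza from raising an error, since its sum can never match the expected count.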
Similarly, the 'Varnik' verses have character sequence-based rules that are specified for each verse and are based only on the 'Gana', 'Guru', and 'Laghu' sequences. Each 'Varnik' verse has its own unique rule and needs to be managed separately. After a 'Varnik' verse is detected, its type and subtype can be found from the mapping with the systematically organized hierarchy of verses.
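For 'Varnik' verses, detection reduces to comparing each stanza's 'Laghu'/'Guru' sequence against a stored pattern. The sketch below assumes the sequence is given as a string of 1s and 2s. The two patterns are the commonly cited textbook definitions (four Ya-'Gana' and four Sa-'Gana' per stanza) and are assumptions of this example rather than rows taken from Table V; the names are hypothetical.

```python
from typing import Optional

# Assumed patterns: one full stanza per verse, 1 = Laghu, 2 = Guru.
VARNIK_PATTERNS = {
    "Bhujangprayat": "122" * 4,  # four Ya-gana
    "Totak": "112" * 4,          # four Sa-gana
}


def detect_varnik(stanzas: list) -> Optional[str]:
    """Return the verse whose pattern every stanza matches exactly, else None."""
    if not stanzas:
        return None
    for name, pattern in VARNIK_PATTERNS.items():
        if all(stanza == pattern for stanza in stanzas):
            return name
    return None


print(detect_varnik(["122122122122"] * 4))
print(detect_varnik(["121"]))
```

Because every 'Varnik' verse has its own rule, adding a verse in this sketch means adding one entry to `VARNIK_PATTERNS`; verses whose odd and even stanzas differ would need a pattern per stanza position instead.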
Even after this much processing, if any verses are not found or detected, they are considered the 'Muktak' verses.

C. Advanced 'Muktak' Detection
An advancement, in computational research terms, means something that does not yet exist or is not yet available. The need for one was felt most whenever 'Chhand' detection classified the input as 'Muktak'. In that case, the input is split again into smaller parts, down to the 'Charan' level where possible. The separated parts are processed again from scratch as separate inputs, and the results are recorded. In the end, the metadata generator can report whether any part of a 'Muktak' verse uses 'Matrik' or 'Varnik' verse rules; if so, those rules are detected as well. The result can be one 'Chhand' rule or a combination of several.
This mechanism also helps when an input uses only one type of 'Chhand' rule but some specific part of it has issues. Such an input first goes into the 'Muktak' path. All parts except the faulty one are then detected as the specific 'Chhand', so the metadata generator can report that the input contains this particular 'Chhand'. This is the primary reason for the high accuracy of the metadata generator.
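The fallback described in this subsection can be sketched as a recursive re-detection over smaller parts. The text does not say how the input is divided, so the halving strategy here is an assumption, as are the function names and the toy detector used to exercise the code.

```python
from typing import Callable, Optional


def detect_parts(charans: list, detect: Callable) -> list:
    """Re-run detection on ever smaller parts, down to a single Charan,
    and collect every Chhand name that some part matches."""
    name = detect(charans)
    if name is not None:
        return [name]
    if len(charans) <= 1:
        return []
    middle = len(charans) // 2  # halving is an assumption of this sketch
    return detect_parts(charans[:middle], detect) + detect_parts(charans[middle:], detect)


def classify(charans: list, detect: Callable) -> dict:
    """Detect the whole input first; fall back to 'Muktak' with part-level results."""
    name = detect(charans)
    if name is not None:
        return {"chhand": name, "parts": [name]}
    return {"chhand": "Muktak", "parts": sorted(set(detect_parts(charans, detect)))}


def toy_detect(matra_counts: list) -> Optional[str]:
    """Stand-in detector working on Matra counts only: a 13, 11 pair is a match."""
    return "Doha-like" if matra_counts == [13, 11] else None


print(classify([13, 11], toy_detect))
print(classify([13, 11, 9, 9], toy_detect))
```

In the second call the whole input matches nothing, but the first half does, so the result is 'Muktak' with one recognized part. That mirrors the case where a faulty portion keeps an otherwise regular input from being detected as a whole.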

D. Stop Words Filtering
A further advancement in the metadata generator is the detection of stop words in the given input. Stop words are filtered using the list from existing hybrid research work on Hindi stop words [48]. Filtering is essential because stop words should not be processed in the next stage, when the terms are looked up in the wordnet. Skipping the stop words saves considerable execution time and improves the overall efficiency of the metadata generator.
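Stop word filtering amounts to a set membership test, as the sketch below shows. The stop word set is a tiny illustrative sample, not the list from [48], and the function name is hypothetical.

```python
# A few common Hindi function words, for illustration only.
HINDI_STOP_WORDS = {"का", "की", "के", "है", "और", "में", "से", "को"}


def split_stop_words(words: list, stop_words: set = HINDI_STOP_WORDS) -> tuple:
    """Separate the words of an input into content words and stop words,
    keeping the original order in both lists."""
    content = [word for word in words if word not in stop_words]
    stops = [word for word in words if word in stop_words]
    return content, stops


content_words, stop_words_found = split_stop_words(["राम", "का", "नाम"])
print(content_words, stop_words_found)
```

Only `content_words` would be passed on to the wordnet lookup, while `stop_words_found` can still be reported in the metadata.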

E. Wordnet Integration for Meanings and Examples of Words
A user might want to know the meaning of the words used in a verse. For this purpose, after the stop words are removed, the Hindi wordnet package 'pyiwn' was integrated to provide word meanings and examples [49]. The wordnet integration makes the metadata generator more valuable, because a beginner who wants to learn and understand Hindi poetry should know the meaning of the words in a poem. The user also gets an example of the word in use, which shows how the specific word should be used.
The wordnet integration also helps improve the wordnet itself. It separates out the list of words that do not yet exist in the wordnet or whose meaning is not yet incorporated. This list can be considered for improving ongoing wordnet research.
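The split between words that have a wordnet entry and words that do not can be sketched independently of any particular wordnet library. In the snippet below, `lookup` is a hypothetical callable that returns an entry or `None`; a plain dictionary stands in for the wordnet, and no actual 'pyiwn' calls are shown or implied.

```python
from typing import Callable


def annotate_words(words: list, lookup: Callable) -> tuple:
    """Look up each word once; return (found entries, missing words).

    `lookup` is any callable returning an entry for a word, or None
    when the word is absent from the wordnet.
    """
    found = {}
    missing = []
    for word in words:
        if word in found or word in missing:
            continue  # skip words that were already looked up
        entry = lookup(word)
        if entry is None:
            missing.append(word)
        else:
            found[word] = entry
    return found, missing


# A dictionary standing in for the wordnet in this sketch.
toy_wordnet = {"नाम": {"meaning": "name", "example": "(example sentence)"}}
found, missing = annotate_words(["नाम", "राम", "नाम"], toy_wordnet.get)
print(found, missing)
```

The `missing` list corresponds to the words that could be handed back to wordnet maintainers.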

F. Example Suggestions of Detected 'Chhand'
This mechanism was integrated with the scenario in mind that a user who wants metadata about an input might also be interested in another example of the detected 'Chhand' type. Comparing the input with the suggested example helps the user understand the 'Chhand' formation better. The example sets are stored as key-value pairs in a JavaScript Object Notation (JSON) file, from which a random example is picked whenever the specific 'Chhand' type is detected.
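A key-value example store of this kind might be used as in the following sketch. The JSON layout (a 'Chhand' name mapped to a list of example strings), the function name, and the placeholder example texts are assumptions; the JSON is passed as a string here so the snippet runs without a file.

```python
import json
import random
from typing import Optional


def suggest_example(store_json: str, chhand: str,
                    rng: Optional[random.Random] = None) -> Optional[str]:
    """Pick a random stored example for the detected Chhand, or None
    when the store has no examples for that type."""
    examples = json.loads(store_json).get(chhand, [])
    if not examples:
        return None
    return (rng or random).choice(examples)


# Placeholder content; a real store would hold actual verses.
store = json.dumps({"Doha": ["example verse 1", "example verse 2"]})
print(suggest_example(store, "Doha"))
print(suggest_example(store, "Rola"))
```

Passing a seeded `random.Random` instance makes the selection reproducible, which is convenient when testing.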

G. Several Additional Utilities
Data collection was an issue during the research work, as no dataset for this kind of research is directly available at present. A systematic approach was required to avoid redundant data and to maintain the collected data in a clean form. A data collection utility was therefore designed that checks whether the entered data already exists and adds it only if it does not. The utility can take single inputs as well as multiple inputs from Comma Separated Value (CSV) files or text files.
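The duplicate check of the data collection utility can be sketched as below. The two-column CSV layout ('Chhand' name, verse text), the in-memory set used as the collection, and the function name are assumptions made for illustration.

```python
import csv
import io


def add_records(existing: set, csv_text: str) -> list:
    """Add (chhand, text) rows from CSV text to the collection,
    skipping incomplete rows and rows that are already present.
    Returns the rows that were actually added."""
    added = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 2:
            continue  # incomplete row
        record = (row[0].strip(), row[1].strip())
        if record[1] and record not in existing:
            existing.add(record)
            added.append(record)
    return added


collection = {("Doha", "verse a")}
new_rows = add_records(collection, "Doha,verse a\nDoha,verse b\nRola,verse c\nDoha,verse b\n")
print(new_rows)
```

Because the verse text itself is part of the key, the same verse entered twice under one 'Chhand' is stored only once.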
A utility was also developed to generate random text-based 'Chhands' from a collection of words according to given character-set parameters. It tests the metadata generator's ability to handle 'Chhand' types that do not exist currently or may come up in the future. By combining all the parts of the methodology described above, the metadata generator produces robust metadata about the provided input.

V. RESULTS
Hindi poems can be written using any 'Chhand' or any combination of 'Chhands'. The research work supports the automatic generation of metadata from Hindi poetry: it already covers the majority of 'Chhands' and can incorporate new kinds of 'Chhands' systematically and with ease. Along with detecting 'Chhands', the metadata generator provides several pieces of useful information, including the word count, character count, diacritic count, quantity ('Matra') count, symbolic string representation, stop words, and the meaning of the words along with an example of their use. The approach can be understood better from this automatically generated metadata. The result is an example of the metadata output produced by the metadata generator, for the same example that has been discussed in this paper since the beginning.
The metadata generator is modeled on the rules of 111 different 'Chhands' with their classification types, subtypes, and subtypes of subtypes. Data for the 53 different 'Chhand' types shown in Table VI were collected for testing and validation. Each class has at least 20 and at most 277 records, depending on the availability of data. In total, 3120 records were collected from various sources and tested, of which 2992 were detected successfully. The remaining 128 records were not detected for reasons such as poorly formatted data, grammar-related issues, manipulated words from regional languages, unnecessary use of special symbols, and junk characters.
The overall accuracy rate based on the results is 95.02%, and the failure rate is 4.98%. Fig. 1 shows the 'Chhand' data found, detected, and not detected, along with the individual accuracy and failure rates of the different 'Chhands'. The accuracy rate of the individual 'Chhands' varies between 84% and 100%, and the failure rate lies between 0% and 16%. Of the 53 'Chhand' types, only 5 had an accuracy rate below 90%; the rest were between 90% and 100%, and 29 of those had an accuracy rate of 95% or more. The lowest accuracy, 84%, is for 'Chhapay Chhand', which is made up of two different 'Chhands'; this makes its construction and detection complex, so more errors occur. The six best performing 'Chhands', with a 100% accuracy rate, were 'Aansu', 'Bhujangprayag', 'Janak', 'Mandakranta', 'Muktak' and 'Tilka'. The authors were unable to find much existing work in this area, so a full comparison of results is not possible. No exactly similar work on the classification and identification of Hindi verses was found, which makes this work novel. Only two closely related research works were found. The first identified 'Chaupai' with 85% accuracy, while this research gives 98% accuracy. The second, on 'Rola', reported 89% accuracy, whereas this research provides 95.24% accuracy.
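As a quick arithmetic check, the pooled detection rate can be recomputed from the record counts reported above (3120 tested, 2992 detected). The variable names are illustrative. The pooled ratio comes out slightly higher than the reported overall figure, which suggests, as an assumption rather than something the paper states, that 95.02% is an average of the per-'Chhand' accuracy rates and not a pooled ratio.

```python
total_records = 3120   # records tested, as reported
detected = 2992        # records detected successfully, as reported

not_detected = total_records - detected
pooled_accuracy = round(100 * detected / total_records, 2)
pooled_failure = round(100 * not_detected / total_records, 2)

print(not_detected, pooled_accuracy, pooled_failure)
```

An unweighted mean over the 53 classes gives small classes the same weight as large ones, so it can differ from the pooled ratio when class sizes range from 20 to 277 records.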
Execution time is also a significant aspect of any research work, so it was tracked. Fig. 2 shows the average, minimum, and maximum time taken by each 'Chhand' for metadata generation. The minimum time taken by the different 'Chhands' is between 1.12 seconds and 1.72 seconds, the average time is between 1.45 seconds and 28.15 seconds, and the maximum time is between 1.80 seconds and 91.79 seconds. Analysis of the timing results shows that inputs with more data took more time: 'Chhands' with fewer lines, words, or characters are detected more quickly than 'Chhands' with more lines, words, and characters.

VI. CONCLUSIONS AND FUTURE WORK
The authors carried out a thorough literature review and found no research study on the classification of Hindi verses; work on their detection and identification is still missing. Even the first requirement for this type of research, a taxonomical hierarchical structure of Hindi verses, was absent.
To conclude, the authors first structured the Hindi verse hierarchy taxonomically. The 'Chhand' rules were then identified, verified, and structured from the perspective of computational linguistics research. 'Chhands' were classified into standard classes for smooth hierarchical management. Rule-based 'Chhand' detection proved to be a complex process. 'Chhands' with complicated rules take more time to detect. Special exception rules slow down execution, as more time is needed to process and check the data. 'Chhands' made up of a combination of 'Chhand' rules also take more time, as the input has to be processed more than once during detection and metadata generation. Additionally, this research can extract stop words by filtering against an existing list of stop words. Wordnet was incorporated to provide the meanings and example uses of words. In total, 111 'Chhand' rules were found and modeled, of which 53 'Chhands' had at least 20 examples, so the example data of these 53 'Chhands' was considered for this research study. 'Aansu', 'Bhujangprayag', 'Janak', 'Mandakranta', 'Muktak' and 'Tilka' were the six best performing 'Chhands' out of the 53, with 100% accuracy. 'Chhapay Chhand' was the lowest performer, with 84% accuracy. Examples of the detected 'Chhand' were provided, making it easier to understand the 'Chhand' creation rules through another example of the same type. This research work shows that the systematized Hindi prosody rules and the concept of automatic metadata generation for Hindi poetry are capable of generating meaningful metadata automatically.
The research work done so far opens a path for upcoming researchers to consider other relevant aspects and to contribute further to the Natural Language Processing and computational linguistics research domain. As future work, more in-depth automatic grammatical correction and a speech-based approach for inputs can be explored and integrated.