Attribute Analysis for Bangla Words for Universal Networking Language (UNL)

The Universal Networking Language (UNL) is an artificial language that serves as a machine-independent digital platform for defining, summarizing, amending, storing and disseminating knowledge or information among people of different affiliations. The theoretical and practical research associated with this interdisciplinary endeavor supports a number of practical applications in most domains of human activity, such as fostering globalization trends in markets or geopolitical independence among nations. In this work we develop analysis rules for Bangla parts of speech, which help create a doorway for converting Bangla to UNL and vice versa and for overcoming the barrier between Bangla and other languages.


I. INTRODUCTION
Today regional economies, societies, cultures and education are integrated through a globe-spanning network of communication and trade. This globalization trend calls for a homogeneous platform on which each member can understand what the others communicate and carry on the discussion smoothly. However, language barriers throughout the world continue to prevent it from converging into a single domain for sharing knowledge and information. As a consequence, the United Nations University/Institute of Advanced Studies (UNU/IAS) decided to develop an inter-language translation program. Their continued research led to a common form of language known as the Universal Networking Language (UNL) [1].
The UNL acts as an intermediate, computer-oriented semantic language through which text written in one language can be converted into text in any other language [2]. In other words, UNL is an artificial language that lets computers express information and knowledge that can be expressed in natural language. The rest of the paper is organized as follows. Section II outlines the general structure of UNL.
Section III describes Bangla parts of speech, and Section IV presents rule generation for Bangla parts of speech for UNL expressions.

II. STRUCTURE OF UNL
The UNL system is composed of three parts: universal words, attribute labels and relational labels. A universal word (UW) is an English-like word and is represented by a node in a hypergraph [1,7]. Nodes associated with a sentence are connected by relations known as symbolic relations. Each UW has attributes that uniquely specify the word and is placed according to a conceptual hierarchy derived from a knowledge base. Each UW comprises a headword along with some constraints. The headword takes the form of a native-language word and is known as the label, whereas each constraint in the constraint list of the UW corresponds to a concept of that word. The attribute lists associated with an individual UW represent the subjectivity of the word based on its grammatical properties [5,6].
The knowledge base (KB), which holds every possible combination of semantic relations, plays two roles: it defines the semantics of UWs, and it provides linguistic knowledge of concepts. The KB thus provides linguistic knowledge in a computer-understandable format as well as the semantic background of UNL expressions [8].
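The structure described above can be pictured with a small data model. The following is only a minimal sketch: the class names `UniversalWord` and `Relation`, the field names, and the relation label `agt` are illustrative assumptions and are not taken from this paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UniversalWord:
    """A node of the graph: headword, constraint list and attribute list."""
    headword: str
    constraints: tuple = ()   # e.g. ("icl>do",)
    attributes: tuple = ()    # e.g. ("@present",)

    def __str__(self):
        uw = self.headword
        if self.constraints:
            uw += "(" + ",".join(self.constraints) + ")"
        # Attributes are appended after the UW, each preceded by a dot.
        return uw + "".join("." + a for a in self.attributes)


@dataclass(frozen=True)
class Relation:
    """A labelled, directed link between two universal words."""
    label: str               # hypothetical label, chosen for illustration
    source: UniversalWord
    target: UniversalWord

    def __str__(self):
        return f"{self.label}({self.source}, {self.target})"


go = UniversalWord("go", ("icl>do",), ("@present",))
he = UniversalWord("he")
print(Relation("agt", go, he))
```

Keeping constraints and attributes as separate fields mirrors the distinction drawn in the text: constraints narrow the concept, while attributes carry grammatical information.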
In addition to the above parts, the UNL system has a language server that can be divided into two parts, the en-converter (EnCo) and the de-converter (DeCo). The en-converter builds a framework, independent of the diversity of languages, for morphological and semantic analysis and converts native language text into UNL expressions autonomously [14]. To perform the conversion, the EnCo uses a word dictionary, the knowledge base and en-conversion rules. The DeCo works in the reverse direction of the EnCo [15]. The general format of a word dictionary entry is defined by UNL as follows:
[HW] "UW" (ATTRIBUTE1, ATTRIBUTE2, ...)
Morphology is the field of linguistics that studies the structure of words. It focuses on the patterns of word formation within and across languages and attempts to formulate rules that model the knowledge of speakers of those languages. In natural language processing (NLP), words in a text must be identified in order to determine their syntactic and semantic properties [10,11]. In the following section we analyze the morphology of the different Bangla parts of speech so that we can develop efficient rules for UNL expressions.

A. Grammatical Construction of words
In this section, we point out some essential grammatical issues concerning the different parts of speech of Bangla that are needed for an English to Bangla MT dictionary.

B. Parts of speech
In Bangla, a word may be placed in one of five categories: noun, pronoun, adjective, verb and indeclinable [16]. Here, an adverb is treated as an adjective, and the indeclinable category covers prepositions, conjunctions and interjections.
If we consider 'চল' (meaning go) as a root, we can represent this root in the dictionary as [চল]{} "go(icl>do)" (V, @present) <B,0,0>. The root takes different forms depending on person and tense. Using the same procedure, we can make dictionary entries for the transformations of other roots such as কর (do), লিখ (write), দে (give), etc.
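A dictionary entry of the shape shown above for the root meaning "go" can be split into its parts with a regular expression. This is a rough sketch under stated assumptions: the field names `flag`, `frequency` and `priority` for the three values in angle brackets are our own labels, and the parser `parse_entry` is hypothetical rather than part of any UNL tool.

```python
import re
from dataclasses import dataclass

# [headword]{id} "universal word" (attr, attr, ...) <x,y,z>
ENTRY_RE = re.compile(
    r'\[(?P<hw>[^\]]*)\]\s*'
    r'\{(?P<id>[^}]*)\}\s*'
    r'"(?P<uw>[^"]*)"\s*'
    r'\((?P<attrs>[^)]*)\)\s*'
    r'<(?P<flag>[^,>]*),(?P<freq>[^,>]*),(?P<prio>[^,>]*)>'
)


@dataclass
class DictEntry:
    headword: str
    entry_id: str
    uw: str
    attributes: list
    flag: str        # assumed meaning: language flag
    frequency: int   # assumed meaning: frequency of use
    priority: int    # assumed meaning: entry priority


def parse_entry(line):
    """Parse one dictionary line; raise ValueError if it does not fit."""
    m = ENTRY_RE.fullmatch(line.strip())
    if m is None:
        raise ValueError(f"not a dictionary entry: {line!r}")
    attrs = [a.strip() for a in m["attrs"].split(",") if a.strip()]
    return DictEntry(m["hw"], m["id"], m["uw"], attrs,
                     m["flag"].strip(), int(m["freq"]), int(m["prio"]))


entry = parse_entry('[চল]{} "go(icl>do)" (V, @present) <B,0,0>')
print(entry.uw, entry.attributes)
```

Entries for inflected forms of the same root would differ only in the headword and the attribute list, so the same parser applies to them.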

D. Number
In both Bangla and English, there are two types of number: (i) singular and (ii) plural.

IV. RULES FOR UNL TEXT GENERATION
In this section, we have presented some Bangla morphological rules for regular inflections, derivations and compounding with additional explicit rules for irregular inflection, derivation and compounding.

A. Analysis Rules
An analysis rule describes the rule application conditions, a method to rewrite the attributes of the nodes that satisfy those conditions, and the method of constructing the syntax tree. While applying rules, the EnConverter analyzes morphemes, syntax and semantics. Finally, it generates a syntax tree and a network.
The description format of the analysis rules is as follows [11]:

Symbol Explanation:
" " represents a terminal symbol, [ ] represents zero or more repetitions, and { } and ( ) designate analysis windows in the node list.

Description of Condition:
<PRE> Describes the condition of the nodes to the left of the left analysis window (LAW).
<SUF> Describes the condition of the nodes to the right of the right analysis window (RAW).

Description of Action:
<ACTION1> Describes the rewriting of the grammatical attributes in the LAW.
<ACTION2> Describes the rewriting of the grammatical attributes in the RAW.

Direction of Semantic Relation:
It describes the semantic relation between the left node (LN) and the right node (RN).
<RELATION1> Describes the semantic relation of the RAW to the LAW.
<RELATION2> Describes the semantic relation of the LAW to the RAW.
<PRIORITY> Describes the priority of the rule. A value from 0 to 255 is used to specify the priority.
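The fields described above can be gathered into one record per rule. The sketch below is an assumption-laden illustration: the class `AnalysisRule`, the helper `pick_rule`, the representation of conditions as attribute sets, and the convention that a larger priority value wins are all ours, and the <PRE> and <SUF> conditions are stored but not checked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnalysisRule:
    rule_type: str                    # "+", "-", "<", ">", "L", "R" or ":"
    pre: tuple = ()                   # conditions left of the LAW (not checked here)
    law_cond: frozenset = frozenset() # attributes required on the LAW node
    raw_cond: frozenset = frozenset() # attributes required on the RAW node
    suf: tuple = ()                   # conditions right of the RAW (not checked here)
    action1: tuple = ()               # attribute rewriting in the LAW
    action2: tuple = ()               # attribute rewriting in the RAW
    relation1: str = ""               # relation of the RAW to the LAW
    relation2: str = ""               # relation of the LAW to the RAW
    priority: int = 0

    def __post_init__(self):
        if not 0 <= self.priority <= 255:
            raise ValueError("priority must lie in 0-255")


def pick_rule(rules, law_attrs, raw_attrs):
    """Return the applicable rule with the largest priority, or None."""
    law_attrs, raw_attrs = set(law_attrs), set(raw_attrs)
    candidates = [r for r in rules
                  if r.law_cond <= law_attrs and r.raw_cond <= raw_attrs]
    # Assumption: a larger priority value means the rule is tried first.
    return max(candidates, key=lambda r: r.priority, default=None)


combine = AnalysisRule("-", law_cond=frozenset({"V"}),
                       raw_cond=frozenset({"SUF"}), priority=100)
fallback = AnalysisRule("R", priority=10)
print(pick_rule([combine, fallback], {"V", "ROOT"}, {"SUF"}).rule_type)
```

A fallback rule with empty conditions, as in the usage lines, matches any pair of nodes and so only fires when no more specific rule applies.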

B. Types of the Analysis Rules
This part explains the actions and functions of the rule types that can be described with <TYPE> in analysis rules.

Left Composition "+ | +:+ | +:c | +:*"
The RN is combined with the LN to make one composition node. The syntax tree and the attributes of the left node are inherited. When the RN attributes are also inherited, "@" is put in the action column of the LN. The original two nodes are deleted from the node list, and the composition node is inserted into the node list. After applying the rule, the composition node takes a position in the RAW.

Right Composition "- | -:+ | -:c | -:*"
The LN is combined with the RN to make one composition node. The composition node is inserted into the node list. After applying the rule, the composition node takes a position in the LAW.
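The two composition rules can be sketched on a plain list of nodes. This is an illustrative approximation only: the `Node` class, the `compose` helper and its `inherit_other` switch are hypothetical, and the choice that right composition inherits the attributes of the right node is assumed by symmetry with the left case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    text: str
    attrs: frozenset = frozenset()


def compose(nodes, law, side, inherit_other=False):
    """Merge nodes[law] (LN) and nodes[law + 1] (RN) into one node.

    side="left":  keep the LN attributes; the new node ends up in the RAW.
    side="right": keep the RN attributes (assumed); the new node ends up in the LAW.
    Returns the new node list and the new index of the LAW.
    """
    if side not in ("left", "right"):
        raise ValueError("side must be 'left' or 'right'")
    ln, rn = nodes[law], nodes[law + 1]
    keep, other = (ln, rn) if side == "left" else (rn, ln)
    attrs = keep.attrs | other.attrs if inherit_other else keep.attrs
    merged = Node(ln.text + rn.text, attrs)
    # The two original nodes are removed and the composition node inserted.
    new_nodes = nodes[:law] + [merged] + nodes[law + 2:]
    # Left composition: the merged node is the RAW, so the LAW moves one
    # step left (this is -1 when the merged node is first in the list).
    new_law = law - 1 if side == "left" else law
    return new_nodes, new_law


nodes = [Node("x", frozenset({"N"})),
         Node("root", frozenset({"V"})),
         Node("sfx", frozenset({"SUF"}))]
print(compose(nodes, 1, "right"))
```

Returning a new list instead of editing in place keeps the original node list available, which helps when a rule has to be undone.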

Left Modification "<"
When the RN modifies the LN, the RN is deleted from the node list and only the LN remains. The node for which the <RELATION> is described is the to-node, and the other node is the from-node.

Right Modification ">"
When the LN modifies the RN, the LN is deleted from the node list and only the RN remains.
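The two modification rules can be illustrated in a few lines. In this sketch nodes are plain strings, the helper `modify` and its parameter `relation_on` are hypothetical, and the relation label `agt` is an assumed example that does not come from this paper.

```python
def modify(nodes, law, rule_type, relation, relation_on):
    """Apply a "<" (left) or ">" (right) modification at nodes[law], nodes[law + 1].

    relation_on says on which node ("LN" or "RN") the relation is described;
    that node becomes the to-node and the other one the from-node.
    Returns the shortened node list and the edge (relation, from, to).
    """
    if rule_type not in ("<", ">"):
        raise ValueError("rule_type must be '<' or '>'")
    if relation_on not in ("LN", "RN"):
        raise ValueError("relation_on must be 'LN' or 'RN'")
    ln, rn = nodes[law], nodes[law + 1]
    to_node, from_node = (ln, rn) if relation_on == "LN" else (rn, ln)
    edge = (relation, from_node, to_node)
    # "<": the RN modifies the LN, so the RN is removed.
    # ">": the LN modifies the RN, so the LN is removed.
    drop = law + 1 if rule_type == "<" else law
    return nodes[:drop] + nodes[drop + 1:], edge


remaining, edge = modify(["he", "go"], 0, ">", "agt", "LN")
print(remaining, edge)
```

The removed node is not lost: it survives in the returned edge, which is how the semantic network grows while the node list shrinks.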

Left Shift "L"
Shift the analysis window to the left.

Right Shift "R"
Shift the analysis window to the right.

Attribute Changing Rule ":"
This rule adds or deletes attributes from a particular node.
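The shift and attribute changing rules are simple enough to sketch directly. The helpers `shift_window` and `change_attributes` are hypothetical, and the `+ATTR` / `-ATTR` notation for adding and deleting an attribute is an assumption made for this illustration.

```python
def shift_window(law, rule_type, node_count):
    """Return the new LAW index after an "L" or "R" shift."""
    if rule_type not in ("L", "R"):
        raise ValueError("rule_type must be 'L' or 'R'")
    new_law = law + (1 if rule_type == "R" else -1)
    # The RAW sits at new_law + 1, so both windows must stay inside the list.
    if not 0 <= new_law <= node_count - 2:
        raise IndexError("analysis window would leave the node list")
    return new_law


def change_attributes(attrs, actions):
    """Apply "+ATTR" (add) and "-ATTR" (delete) actions to an attribute set."""
    result = set(attrs)
    for action in actions:
        if action.startswith("+"):
            result.add(action[1:])
        elif action.startswith("-"):
            result.discard(action[1:])
        else:
            raise ValueError(f"unknown action: {action!r}")
    return frozenset(result)


print(shift_window(1, "R", 4))
print(sorted(change_attributes({"V"}, ["+@present", "-V"])))
```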

C. Morphological Rule Generation for Bangla Parts of Speech
Bangla is a morphologically rich language in which most words are derived from roots. Inflections and derivations are generated by changing vowels and inserting consonants. Bangla sentences are characterized by a strong tendency for agreement between their constituents, such as between verb and noun and between noun and adjective, in matters of number, gender, definiteness, case, person, etc. These properties are expressed by a comprehensive system of affixation. To satisfy these grammatical properties, the generation rules are expected to be complex, since they must generate grammatically correct Bangla sentences from UNL expressions and structure. A database system has been developed for classifying each entry in the dictionary and adding features to it [3,4]. The selected Bangla word is then classified as a noun, verb or particle. The relation mapping is implemented in the en-conversion rules.
Bangla part-of-speech conversion rules are mainly for noun ↔ adjective. Some conversion rules also exist for indeclinable ↔ noun and indeclinable ↔ adjective, and there are some exceptions.
From the analysis of Bangla parts of speech, gender and number, one can readily see that they follow the right composition rule.

V. CONCLUSION
In this paper we have presented morphological rules for Bangla parts of speech, number and gender, based on a morphological analysis of Bangla parts of speech. We hope that these rules will be useful for converting Bangla sentences to UNL expressions and vice versa.
Although only a limited number of rules are considered in this paper, they show in principle that the designed model works for Bangla words. The full set of Bangla words and rules will be considered in future work.