Automatic Melakarta Raaga Identification System: Carnatic Music

Experience suggests that the Nearest Neighbour classifier is one of the most straightforward classifiers in the arsenal of machine learning techniques. The automatic Melakarta raaga identification system presented here works by identifying the nearest neighbours of a query example and using those neighbours to determine the class of the query. This approach to classification is of particular importance today because its poor run-time performance is much less of a problem given the computational power now available. This paper presents an overview of techniques for Nearest Neighbour classification, focusing on: mechanisms for measuring the distance between neighbours using Cosine Distance (CD) and Earth Mover Distance (EMD); the formulas used to identify the nearest neighbours; and the training and testing algorithm used to identify Melakarta raagas in Carnatic music. From the results obtained, it is concluded that Earth Mover Distance produces better results than the Cosine Distance measure.


I. INTRODUCTION
Performance in Indian classical music is always within a Melakarta raaga, except for solo percussion. A Melakarta raaga is a system within which performers improvise and compose. Melakarta raagas are often summarized by the notes they use, though many Melakarta raagas in fact share the same notes. Melakarta raaga recognition is a difficult task even for humans. A Melakarta raaga is popularly defined as a specified combination of notes within a mode, decorated with embellishments and graceful consonances, which has the power of evoking a unique feeling distinct from all other joys and sorrows. It possesses something of a transcendental element.
A Melakarta raaga is characterized by several attributes, such as its Vaadi-Samvaadi, Aarohana-Avarohana and Pakad [17], besides the sequence of notes. It is of utmost importance to note here that no two performances of the same Melakarta raaga, even two performances by the same artist, will be identical. A music piece is considered to be in a certain Melakarta raaga as long as the attributes associated with that raaga are satisfied. In that way, this concept of Indian classical music is very open.
Based on this work, the following major contributions are made to the study of musical raagas and of KNN with CD and EMD. First, our solutions are based primarily on techniques from speech processing and pattern matching, which shows that techniques from other domains can be purposefully extended to solve problems in the computational study of musical raagas. Second, the two note transcription methods presented are novel ways to extract notes from sample raagas of Indian classical music. This approach has given very encouraging results.
The rest of the paper is organized as follows. Section 2 highlights some of the useful and related previous research work in the area. The solution strategy is discussed in detail in Section 3. The test procedures and experimental results are presented in Section 4. Finally, Section 5 lists the conclusions.

II. PREVIOUS WORK
Work on Carnatic music retrieval has progressed slowly compared to work on western music. Some work has been done on Swara identification [1] and Singer identification [2] in Carnatic music. In Hindustani music, work has been done on identifying the raaga of a piece [3]. In [3][19] the authors created an HMM with which they identified two raagas of Hindustani music. The fundamental difference between the Hindustani raaga pattern and the Carnatic raaga pattern is that Hindustani music has R1 and R2, as against R1, R2 and R3 in Carnatic music. Similarly, G, D and N each have three distinct frequencies in Carnatic music, compared to two frequencies in Hindustani music [8]. This reduces the confusion in identifying the distinct frequencies in Hindustani music as compared to Carnatic music. The authors did not use a polyphonic music signal and assumed that the input music signal is a voice-only signal.
The fundamental frequency of the signal was also assumed, and based on these features the raaga identification process was carried out for two Hindustani raagas. On the western music side, melody retrieval is being performed by researchers. The method proposed in [9] is based on identifying the change in frequency in the given query.
The query is received in the form of a hummed tune and, based on the rise and fall in the pitch of the received query, the melody pattern that matches the query's rise and fall of pitch is retrieved. Melody retrieval has also been based on features such as distance measures and gestalt principles. Our approach is based on low-level signal features, and the Melakarta raaga is identified by considering different instrument signals as input to our system. In the present work, Melakarta raaga identification is done using KNN with two different distance metrics, CD and EMD.

III. PRESENT WORK
K Nearest Neighbour has rapidly become one of the booming technologies in today's world for developing convoluted control systems. Melakarta raaga recognition is one of the fascinating applications of KNN, and in many cases Melakarta raaga detection is treated as a rudimentary nearest neighbour problem. The problem becomes more fascinating because the content is audio: given an audio query, find the audio closest to the query in the trained database.
The basic idea is shown in Figure 1, which depicts a 3-Nearest Neighbour classifier on a two-class problem in a two-dimensional feature space. In this example the decision for q_1 is straightforward: all three of its nearest neighbours are of class O, so it is classified as an O. The situation for q_2 is a bit more complicated, as it has two neighbours of class X and one of class O. This can be resolved by simple majority voting or by distance weighted voting (see below). So k-NN classification has two stages: the first is the determination of the nearest neighbours and the second is the determination of the class using those neighbours. The following sections describe the CD and EMD techniques that are used for raaga classification.

A. COSINE DISTANCE (CD)
Cosine similarity has a special property that makes it suitable for metric learning: the resulting similarity measure is always within the range of -1 to +1. This property allows the objective function to be simple and effective.

B. EARTH MOVER DISTANCE (EMD)
The Earth Mover Distance is based on the solution to a discrete optimal mass transportation problem. EMD represents the minimum cost of moving earth from some source locations to fill up holes at some sink locations. In other words, given any two mass (or probability) distributions, one of them can be viewed as a distribution of earth and the other as a distribution of holes; the EMD between the two distributions is then the minimum cost of rearranging the mass in one distribution to obtain the other. In the continuous setting, this problem is known as the Monge-Kantorovich optimal mass transfer problem and has been well studied over the past 100 years. The importance here is that EMD can be used to measure the discrepancy between two multidimensional distributions.

C. METHODOLOGY/ALGORITHM FOR MELAKARTA RAAGA RECOGNITION SYSTEM
The following methodology is used for training and testing in the Melakarta raaga recognition system. First, a k-Nearest Neighbour classifier is considered on a two-class problem in a two-dimensional feature space, shown in the following diagram with raagas on the horizontal axis and the neighbours of each raaga on the vertical axis. In this proposed approach the decision for a raaga is straightforward: one of its nearest neighbours is of class O and one is of class X. For each $x_i \in D$ the distance between $q$ and $x_i$ is calculated as follows:

$$d(q, x_i) = \sum_{f} w_f \, \delta(q_f, x_{if})$$

where $x_i$ = trained raaga, $q$ = testing raaga, $f$ = feature (flow pattern) and $w_f$ = weight of feature $f$ of the raaga. There is a huge range of possibilities for this distance metric; a basic version for continuous and discrete attributes would be:

$$\delta(q_f, x_{if}) = \begin{cases} 0 & f \text{ discrete and } q_f = x_{if} \\ 1 & f \text{ discrete and } q_f \neq x_{if} \\ |q_f - x_{if}| & f \text{ continuous} \end{cases}$$

The k nearest neighbours are selected based on this distance metric. To determine the class of $q$, the majority class among the nearest neighbours is assigned to the query. It will often make sense to assign more weight to the nearer neighbours in deciding the class of the query.
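The selection and majority-voting steps described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the feature tuples and the class labels O and X are hypothetical, and the per-feature difference is assumed to be 0/1 for discrete values and the absolute difference for numeric ones.

```python
from collections import Counter

def delta(a, b):
    """Per-feature difference: |a - b| for numeric values, 0/1 match for discrete ones."""
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return abs(a - b)
    return 0.0 if a == b else 1.0

def distance(q, x, weights):
    """Weighted sum of per-feature differences between query q and training example x."""
    return sum(w * delta(qf, xf) for w, qf, xf in zip(weights, q, x))

def knn_classify(q, training, weights, k=3):
    """Return the majority class among the k training examples nearest to q."""
    ranked = sorted(training, key=lambda ex: distance(q, ex[0], weights))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical two-dimensional training set with two classes, O and X.
training = [
    ((1.0, 1.0), "O"), ((1.2, 0.8), "O"), ((0.9, 1.1), "O"),
    ((3.0, 3.2), "X"), ((3.1, 2.9), "X"),
]
weights = (1.0, 1.0)  # assumed equal feature weights
print(knn_classify((1.1, 1.0), training, weights))  # all three neighbours are O
print(knn_classify((2.4, 2.5), training, weights))  # two X neighbours, one O
```

With k = 3 the second query has two neighbours of class X and one of class O, so simple majority voting assigns it to X.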

DETERMINATION OF THE CLASS USING THOSE NEIGHBOURS:
If more than one neighbour is identified, the class can be resolved by simple majority voting or by distance weighted voting. A fairly general technique to achieve this is distance weighted voting, where the neighbours get to vote on the class of the query case with votes weighted by the inverse of their distance to the query:

$$\mathrm{Vote}(y_j) = \sum_{c=1}^{k} \frac{1}{d(q, x_c)^{n}} \, 1(y_j, y_c)$$ -----(4)

Thus the vote assigned to class $y_j$ by neighbour $x_c$ is 1 divided by the distance to that neighbour, where $1(y_j, y_c)$ returns 1 if the class labels match and 0 otherwise. In the above equation $n$ would normally be 1, but values greater than 1 can be used to further reduce the influence of more distant neighbours. The application of the Cosine and EMD distance measures to our KNN process is discussed next.
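A minimal sketch of distance weighted voting is given below, assuming the k nearest neighbours have already been found and are supplied as (distance, label) pairs. The function name, the example distances and the small `eps` guard against a zero distance are illustrative assumptions, not part of the original method.

```python
from collections import defaultdict

def weighted_vote(neighbours, n=1, eps=1e-12):
    """Each neighbour votes for its own class with weight 1 / distance**n.

    neighbours: list of (distance, label) pairs for the k nearest neighbours.
    eps: assumed guard so that a zero distance does not divide by zero.
    """
    votes = defaultdict(float)
    for dist, label in neighbours:
        votes[label] += 1.0 / (max(dist, eps) ** n)
    winner = max(votes, key=votes.get)
    return winner, dict(votes)

# Hypothetical neighbours: one close O and two more distant X.
neighbours = [(0.5, "O"), (2.0, "X"), (2.5, "X")]
winner, votes = weighted_vote(neighbours, n=1)
print(winner, votes)
```

In this example a simple majority vote would pick X (two of three neighbours), while the weighted vote picks O because the single O neighbour is much closer to the query.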

1) COSINE DISTANCE MEASURE
The cosine similarity measure is the cosine of the angle between two vectors. Suppose $d_i$ and $d_j$ are the paths between $a_i$ and $a_j$ in instance $x_i$ and instance $x_j$, respectively. $d_i$ and $d_j$ are represented as vectors of term frequencies in the vector-space model. The cosine is calculated using the following formula, where $t$ indexes the vector components:

$$\cos(d_i, d_j) = \frac{\sum_{t} d_i(t)\, d_j(t)}{\sqrt{\sum_{t} d_i(t)^2}\; \sqrt{\sum_{t} d_j(t)^2}}$$ -----(5)
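The cosine formula can be sketched in a few lines. The function names are illustrative, and converting the similarity into a distance as 1 minus the similarity is a common convention assumed here; the paper does not state which conversion it uses.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)

def cosine_distance(u, v):
    """Assumed convention: distance = 1 - similarity, so it lies between 0 and 2."""
    return 1.0 - cosine_similarity(u, v)

# Hypothetical term-frequency vectors.
print(cosine_similarity([1, 2, 3], [2, 4, 6]))  # parallel vectors
print(cosine_similarity([1, 0], [0, 1]))        # orthogonal vectors
print(cosine_distance([1, 0], [-1, 0]))         # opposite vectors
```

Because the measure depends only on the angle between the vectors, scaling a term-frequency vector (for example, a longer recording of the same raaga) leaves the similarity unchanged.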

2) EARTH MOVER DISTANCE
The Earth Mover Distance (EMD) is a distance measure that overcomes many of the problems that arise from the arbitrariness of binning. As the name implies, the distance is based on the amount of effort required to convert one piece of instrumental music into another, using the analogy of transporting mass from one distribution to another. If two pieces of instrumental music are viewed as distributions, with one distribution seen as a mass of earth in space and the other as a hole (or set of holes) in the same space, then the EMD is the minimum amount of work involved in filling the holes with the earth. Some researchers, in their analysis of the EMD, argue that a measure based on the notion of a signature is better than one based on a histogram. A signature $\{s_j = (m_j, w_{m_j})\}$ is a set of $j$ clusters, where $m_j$ is a vector describing the mode of cluster $j$ and $w_{m_j}$ is the fraction of features falling into that cluster. Thus, a signature is a generalization of the notion of a histogram in which the boundaries and the number of partitions are not set in advance; instead $j$ should be 'appropriate' to the complexity of the instrumental music. The example in Figure 3 illustrates this idea. The clustering can be thought of as a quantization of the instrumental music in some frequency space, so that the instrumental music is represented by a set of cluster modes and their weights. In the figure the source instrumental music is represented in a 2D space as two points with weights 0.6 and 0.4; the target instrumental music is represented by three points with weights 0.5, 0.3 and 0.2. In this example the EMD is calculated as the sum of the amounts moved (0.2, 0.2, 0.1 and 0.5) multiplied by the distances they are moved. Calculating the EMD involves discovering an assignment that minimizes this amount. Consider two pieces of instrumental music described by signatures $S = \{(m_j, w_{m_j})\}_{j=1}^{n}$ and $Q = \{(p_k, w_{p_k})\}_{k=1}^{r}$. The work required to transfer from one to the other for a given flow pattern $F$ is:

$$\mathrm{WORK}(S, Q, F) = \sum_{j=1}^{n} \sum_{k=1}^{r} d_{jk}\, f_{jk}$$ -----(6)

where $d_{jk}$ is the distance between clusters $m_j$ and $p_k$, and $f_{jk}$ is the flow between $m_j$ and $p_k$ that minimizes the overall cost. Once the transportation problem of identifying the flow that minimizes effort has been solved by using dynamic programming, the EMD is defined as:

$$\mathrm{EMD}(S, Q) = \frac{\sum_{j=1}^{n} \sum_{k=1}^{r} d_{jk}\, f_{jk}}{\sum_{j=1}^{n} \sum_{k=1}^{r} f_{jk}}$$ -----(7)

EMD is expensive to compute, with cost increasing more than linearly with the number of clusters. Nevertheless, it is an effective measure for capturing similarity between pieces of instrumental music. In our experiments the EMD approach gives better results than the Cosine measure.
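The work and normalization steps of the EMD can be sketched as below for a flow that is already known. This is only an evaluation of a given flow, not a solver: finding the minimizing flow is the transportation problem and is not shown. The cluster coordinates and the assignment of the moved amounts to particular cluster pairs are hypothetical; only the weights (0.6, 0.4 and 0.5, 0.3, 0.2) and the amounts moved (0.5, 0.1, 0.2, 0.2) come from the example in the text.

```python
import math

def work(dist, flow):
    """Sum of distance times flow over all (source cluster, target cluster) pairs."""
    return sum(d * f for d_row, f_row in zip(dist, flow)
               for d, f in zip(d_row, f_row))

def emd_for_flow(dist, flow):
    """Work normalized by total flow; equals the EMD only if `flow` is the minimizing flow."""
    total = sum(sum(row) for row in flow)
    if total == 0:
        raise ValueError("total flow must be positive")
    return work(dist, flow) / total

def is_feasible(flow, src_w, dst_w, tol=1e-9):
    """Check that no source sends more than its weight and no target receives more than its weight."""
    nonneg = all(f >= 0 for row in flow for f in row)
    rows_ok = all(sum(row) <= w + tol for row, w in zip(flow, src_w))
    cols_ok = all(sum(col) <= w + tol for col, w in zip(zip(*flow), dst_w))
    return nonneg and rows_ok and cols_ok

# Hypothetical 2D cluster positions; weights are those of the example in the text.
src_pts, src_w = [(0, 0), (4, 0)], [0.6, 0.4]
dst_pts, dst_w = [(0, 3), (4, 3), (8, 3)], [0.5, 0.3, 0.2]
dist = [[math.dist(m, p) for p in dst_pts] for m in src_pts]

# Assumed flow: rows are source clusters, columns are target clusters.
flow = [[0.5, 0.1, 0.0],
        [0.0, 0.2, 0.2]]

print(is_feasible(flow, src_w, dst_w))
print(emd_for_flow(dist, flow))
```

In practice the minimizing flow would be obtained from a transportation or linear programming solver, after which the same two functions give the EMD.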

IV. RESULTS AND DISCUSSION
The input signal is sampled at 44.1 kHz. Different raagams are considered for the purpose of evaluating this algorithm. For the purpose of Melakarta raaga identification, seven different instruments are considered. The signal is passed through the signal separation algorithm and the segmentation algorithm.
The segmentation points for one input are shown in the figures below. This is the first level of segmentation, where the protruding lines indicate the points of segmentation. After identifying the segmentation points, the frequency components are determined using the HPS algorithm, and the frequency values that have the dominant energy are tabulated. Using the raaga identification system, the confusion matrix is determined.
The following figure shows the plot graphs and edge detection graphs. The intuition underlying Nearest Neighbour classification is quite straightforward: examples are classified based on the class of their nearest neighbours. It is often useful to take more than one neighbour into account, so the technique is more commonly referred to as k-Nearest Neighbour (k-NN) classification, where the k nearest neighbours are used in determining the class. Since the training examples are needed at run time, i.e. they need to be in memory at run time, it is sometimes also called Memory-Based Classification. Because induction is delayed to run time, it is considered a Lazy Learning technique. Because classification is based directly on the training examples, it is also called Example-Based Classification or Case-Based Classification.
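The frequency estimation step can be illustrated with a small sketch, assuming that HPS denotes the Harmonic Product Spectrum method: the magnitude spectrum is multiplied by copies of itself downsampled by 2, 3, ..., and the bin with the largest product is taken as the fundamental. The paper does not give its implementation, so the function names, the number of harmonics, the naive DFT and the synthetic test tone below are all illustrative assumptions.

```python
import cmath
import math

def magnitude_spectrum(samples):
    """Magnitudes of the first N/2 DFT bins (naive O(N^2) DFT, adequate for a short frame)."""
    n = len(samples)
    return [abs(sum(s * cmath.exp(-2j * math.pi * k * t / n)
                    for t, s in enumerate(samples)))
            for k in range(n // 2)]

def hps_pitch(samples, sample_rate, harmonics=3):
    """Estimate the fundamental frequency (Hz) of one frame by the harmonic product spectrum."""
    mags = magnitude_spectrum(samples)
    limit = len(mags) // harmonics  # keeps k * h inside the spectrum
    best_bin, best_val = 1, -1.0
    for k in range(1, limit):       # skip the DC bin
        product = 1.0
        for h in range(1, harmonics + 1):
            product *= mags[k * h]
        if product > best_val:
            best_bin, best_val = k, product
    return best_bin * sample_rate / len(samples)

# Synthetic frame: 250 Hz fundamental plus two weaker harmonics (hypothetical values).
sample_rate, n = 8000, 512
frame = [sum(math.sin(2 * math.pi * 250 * h * t / sample_rate) / h
             for h in (1, 2, 3))
         for t in range(n)]
print(hps_pitch(frame, sample_rate))
```

A production system would use an FFT and a window function, and would usually guard against the octave errors to which this method is prone.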

Fig. 2. 1-Nearest Neighbour classification of raagas

A training dataset D is made up of training samples $x_i$, $i \in [1, |D|]$, where $x_i$ is a raaga. Each raaga is divided into 15 samples by eliminating unwanted frequencies (disturbances, accompanying instruments) using a low-level filter, the Fourier Transform of a Signal (Spft). The same process is repeated for each raaga in database D. These samples are then trained using Self-Organizing and Learning Vector Quantization Nets. The grouping process is carried out by us. Each training example is labeled with a class label $y_j \in Y$. Our objective is to classify an unknown example raaga q. This completes the training process. Next, the testing phase is performed using KNN classification. The KNN approach is carried out in two phases.

Fig. 3. An example of the EMD between two 2D signatures with two points (clusters) in one signature and three in the other.

Cosine Distance: the data is the same for both Train and Sample
Table 1. Confusion Matrix: Same data for Train and Sample

Cosine Distance: the data is different for Train and Sample
Table 2. Confusion Matrix: Different data for Train and Sample

EMD: the data is the same for both Train and Sample
Table 3. Confusion Matrix: Same data for Train and Sample

EMD: the data is different for Train and Sample
Table 4. Confusion Matrix: Different data for Train and Sample
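A confusion matrix of the kind reported in these tables can be tallied as in the following sketch. The raaga labels and predictions are hypothetical placeholders, not results from this paper, and the function names are illustrative.

```python
def confusion_matrix(actual, predicted, labels):
    """Rows are actual classes, columns are predicted classes, in the order of `labels`."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix

def accuracy(matrix):
    """Fraction of examples on the diagonal, i.e. classified correctly."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total

# Hypothetical labels and classifier outputs, for illustration only.
labels = ["R1", "R2", "R3"]
actual = ["R1", "R1", "R2", "R2", "R3"]
predicted = ["R1", "R2", "R2", "R2", "R3"]
matrix = confusion_matrix(actual, predicted, labels)
for label, row in zip(labels, matrix):
    print(label, row)
print(accuracy(matrix))
```

Running the same tally once with the Cosine Distance predictions and once with the EMD predictions gives matrices that can be compared entry by entry.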