Double Competition for Information-Theoretic SOM

(IJARAI) International Journal of Advanced Research in Artificial Intelligence, Vol. 3, No. 11, 2014, www.ijarai.thesai.org

In this paper, we propose a new type of information-theoretic method for self-organizing maps (SOM) that takes into account competition between input neurons as well as between competitive (output) neurons. The method is called "double competition" because it considers competition among output neurons as well as among input neurons. By increasing the information in input neurons, we expect to obtain more detailed information on input patterns. We applied the information-theoretic methods to two well-known data sets from the machine learning database, namely, the glass and dermatology data sets. We found that the information-theoretic method with double competition explicitly separated the different classes. Without considering input neurons, on the other hand, class boundaries could not be explicitly identified. In addition, without considering input neurons, quantization and topographic errors were inversely related: when the quantization errors decreased, the topographic errors increased. With double competition, this inverse relation was neutralized. The experimental results show that, by incorporating information in input neurons, class structure could be clearly identified without degrading the map quality too severely.

Keywords: double competition, self-organizing maps, mutual information, class structure

I. INTRODUCTION


A. Goal of the Present Paper
The present paper aims to show that the concept of competition among components in neural networks should be extended to all components of neural networks. Many methods have been developed to realize competition in neural networks. However, we think that they address only one aspect of competition. For example, competitive learning is specialized in the competition between output neurons. In standard competitive learning, output neurons compete with each other to represent input patterns. If a neuron wins the competition, it tries to represent input patterns as efficiently as possible. A number of variants have been developed to overcome problems such as dead neurons, the number of neurons, and initial conditions [1], [2], [3], [4], [5], [6], [7], [8], [9]. However, the focus in competitive learning remains on competition between output neurons. As mentioned above, competition can be realized in any component of neural networks. Thus, in addition to output neurons, we can consider input neurons in competitive neural networks. We can imagine a case where output as well as input neurons compete with each other to represent input patterns. The goal of the present paper is to show that extending competition to input neurons can improve the performance of neural networks.

B. Information-Theoretic SOM
We apply the information-theoretic method to SOM (information-theoretic SOM), which is based on competition between neurons. The self-organizing map is one of the most important techniques in neural networks [10], [11] and has been used to visualize complex and highly structured data. In SOM, much attention has been paid to topological preservation, and many methods to measure topological consistency have been proposed [12], [13], [14], [15], [16], [17], [18]. In addition, many visualization methods have been developed to interpret the SOM knowledge obtained by learning [19], [20], [21], [22], [23], [24], [25], [26], [27], [28], [29]. However, in spite of its good reputation for visualization, SOM has faced difficulty in visualizing the results obtained by learning. In the SOM, competition and cooperation between neurons are performed simultaneously in learning. In particular, cooperation processes need extensive fine tuning to maintain topological preservation. However, as more focus is placed on cooperation processes, it becomes more difficult to visualize class structure or class boundaries, since cooperation processes act to diminish the discontinuity between neurons that marks class boundaries. Though several methods have been developed to measure and extract discontinuity on the output space [30], [31], it is still difficult to extract clear class structure.
To overcome this shortcoming of SOM, we have introduced several information-theoretic methods to realize SOM [32], [33]. Information-theoretic methods are numerous in neural learning [34], [35], [36], [37], [38], [39], [40], [41], [42], [43], [44], [45], [46], [47]. From the information-theoretic point of view, learning in neural networks lies in the acquisition of information content on input patterns. Though measuring the information content or mutual information requires expensive computations, a number of information-theoretic methods are available to do this. In particular, we have pointed out the similarity between competitive learning and mutual information maximization. When mutual information is defined between input patterns and output neurons and is maximized, just one neuron fires, while all the others cease to do so. Thus, mutual information maximization corresponds to the competitive processes of competitive learning. One of the main merits of this information-theoretic method is that it is easy to control the process of competition and cooperation. Depending on the information obtained by the information-theoretic method, we can control final connection weights and corresponding outputs. For example, when the information obtained in learning is larger, competition between neurons becomes more intense. On the other hand, when the obtained information is smaller, competition between neurons becomes weaker and all neurons tend to fire equally. This means that just by adjusting the information to be obtained by learning, we can control the competition processes. In addition, the method is not winner-take-all, and many neurons can participate in competition and cooperation. By controlling the information content in a neural network, we can easily control its performance.
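The link between the amount of information and the intensity of competition can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes Gaussian outputs with a spread σ = 1/β, uniform pattern probabilities p(s) = 1/S, and random data and weights. The function names `firing_rates` and `mutual_information` are hypothetical.

```python
import numpy as np

def firing_rates(X, W, beta):
    """Normalized Gaussian outputs p(j|s); sigma = 1/beta is an assumption."""
    sigma = 1.0 / beta
    d2 = ((X[:, None, :] - W[None, :, :]) ** 2).sum(axis=2)   # (S, M) squared distances
    logits = -d2 / (2.0 * sigma ** 2)
    logits -= logits.max(axis=1, keepdims=True)               # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def mutual_information(p_js):
    """I = sum_s p(s) sum_j p(j|s) log(p(j|s) / p(j)), with uniform p(s) (assumed)."""
    n_patterns = p_js.shape[0]
    p_s = np.full(n_patterns, 1.0 / n_patterns)
    p_j = p_s @ p_js                                          # marginal firing rates p(j)
    eps = 1e-12                                               # guards log(0)
    log_ratio = np.log((p_js + eps) / (p_j + eps))
    return float((p_s[:, None] * p_js * log_ratio).sum())

rng = np.random.default_rng(0)
X = rng.random((30, 4))      # 30 illustrative patterns with 4 input variables
W = rng.random((5, 4))       # 5 output neurons; row j holds w_j
betas = (0.1, 1.0, 5.0, 20.0)
mi = {beta: mutual_information(firing_rates(X, W, beta)) for beta in betas}
for beta in betas:
    print(f"beta={beta:5.1f}  mutual information={mi[beta]:.4f}")
```

With a small β all neurons fire almost equally and the mutual information is close to zero. With a large β the firing rates approach winner-take-all and the mutual information approaches its upper bound of log M.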

C. Necessity of Double Competition
The information-theoretic SOM has shown good performance in clarifying class structure. However, the method cannot always detect clear class structure, in particular when the problems are complex. To resolve this, we introduce the concept of competition into input neurons, as mentioned above. In our framework, the input neurons must compete with each other to represent input patterns. In addition, if an output neuron fires at the same time as an input neuron, the corresponding connection weight between the input and output neuron should be stronger. We call this competition "double competition", because input and output neurons compete with each other to represent input patterns.
We have so far tried to introduce competition in input neurons, which is called "information enhancement". In information enhancement, we tried to enhance competition between neurons by focusing on specific input neurons [48], [49]. We have also combined information maximization in input neurons with that in output neurons, with the two defined separately [50]. Those methods have shown improved performance for several problems. However, they are not always effective in taking into account the combined effect of input and output neurons. In this double competition, we suppose two types of actions, namely, competition and mutual interaction. In the competition, input as well as output neurons compete with each other. In addition, we suppose some interaction between input and output neurons. Concretely, when an input neuron and an output neuron fire in the same way, the interaction between them becomes stronger.

D. Outline
In Section II, we first explain the correspondence between information maximization and competitive learning. We explain the concept of double competition, which includes input and output neurons. Then, we present the information-theoretic learning method that realizes double competition by using the free energy. Finally, we explain how to estimate the firing probabilities of input neurons. In Section III, we present two experimental results on data sets from the well-known machine learning database. Using principal component analysis, we try to show that class structure can be clarified by the present method. However, we point out that topological preservation may be sacrificed for this better visualization. Thus, it is important to examine more closely the relations between better visualization and topological preservation.

II. THEORY AND COMPUTATIONAL METHODS

A. Double Competition
Competitive learning has been considered to be one of the most important learning methods in neural networks [51], [2], [52], [53], [4], [5], [3], [54], [55], [1], [56], [7], [57], [58]. In particular, we have introduced information-theoretic competitive learning [59], [60], [61]. Contrary to the computational methods so far developed, we have supposed that competitive learning is a realization of mutual information maximization between output neurons and input patterns. In competitive learning, attention has been paid mainly to output neurons. However, we can imagine that any components in a neural network compete with each other, and we try to apply the concept of competition to input neurons. For input neurons, we focus on their importance. Because input neurons correspond to input variables, the importance of input variables should be taken into account.
Let us explain how to produce self-organizing maps by using the network architecture in Figure 1. The sth input pattern can be represented by $\mathbf{x}^s = [x_1^s, x_2^s, \ldots, x_L^s]^T$, $s = 1, 2, \ldots, S$. Connection weights into the jth competitive neuron are denoted by $\mathbf{w}_j = [w_{1j}, w_{2j}, \ldots, w_{Lj}]^T$, $j = 1, 2, \ldots, M$. Supposing that the firing rate p(k | s) of the kth input neuron for the sth input pattern can be computed, the distance between input patterns and connection weights is computed with this firing rate taken into account (equation (1)). The firing rate p(k | s) is considered to be the importance of the kth input neuron for the sth input pattern. The output from an output neuron is computed from this distance (equation (2)) by using a spread parameter σ, which is defined in terms of a parameter β larger than zero. By normalizing the output, we have the firing rate p(j | s). We should also compute the collective output from the neuron. In the self-organizing maps, the collective outputs are determined by distances from the winner. The winner c is the output neuron whose connection weights are closest to the input pattern. Following the formulation of SOM, we compute a neighborhood function of the distance between the winner and the other neurons, where $\mathbf{r}_j$ denotes the position of the jth neuron on the output map and $\sigma_{ngh}$ is the spread parameter of the neighborhood. The expected output q(j | s) is approximated by this function.

Fig. 1. Network architecture for the information-theoretic self-organizing maps where only connection weights from a winner at the center are plotted.
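The computation of actual and expected firing rates described above can be sketched as follows. This is a sketch under stated assumptions rather than the paper's exact formulas, which did not survive in this text. It assumes a squared distance weighted by p(k | s), Gaussian outputs with σ = 1/β, and a Gaussian neighborhood around the winner that is normalized to give q(j | s). All function names are hypothetical.

```python
import numpy as np

def weighted_sq_dist(X, W, p_ks):
    """d2[s, j] = sum_k p(k|s) * (x_k^s - w_kj)^2 (assumed form of the distance)."""
    diff2 = (X[:, None, :] - W[None, :, :]) ** 2      # shape (S, M, L)
    return (p_ks[:, None, :] * diff2).sum(axis=2)      # shape (S, M)

def actual_firing_rates(d2, beta):
    """p(j|s): Gaussian outputs with spread sigma = 1/beta (assumed), normalized over j."""
    sigma = 1.0 / beta
    logits = -d2 / (2.0 * sigma ** 2)
    logits -= logits.max(axis=1, keepdims=True)        # avoids underflow for large beta
    v = np.exp(logits)
    return v / v.sum(axis=1, keepdims=True)

def expected_firing_rates(d2, grid, sigma_ngh):
    """q(j|s): Gaussian neighborhood around the winner on the map, normalized over j."""
    winners = d2.argmin(axis=1)                                      # winner c per pattern
    g2 = ((grid[:, None, :] - grid[None, :, :]) ** 2).sum(axis=2)    # (M, M) map distances
    phi = np.exp(-g2[winners] / (2.0 * sigma_ngh ** 2))              # (S, M)
    return phi / phi.sum(axis=1, keepdims=True), winners

rng = np.random.default_rng(0)
S, L = 8, 4
grid = np.array([(r, c) for r in range(3) for c in range(3)], dtype=float)  # 3x3 output map
M = len(grid)
X = rng.random((S, L))
W = rng.random((M, L))             # row j holds w_j; the paper indexes weights as w_kj
p_ks = np.full((S, L), 1.0 / L)    # uniform importance of input neurons as a starting point

d2 = weighted_sq_dist(X, W, p_ks)
p = actual_firing_rates(d2, beta=5.0)
q, winners = expected_firing_rates(d2, grid, sigma_ngh=1.0)
print(winners, p.shape, q.shape)
```

Both p(j | s) and q(j | s) peak at the winner. They differ in how the remaining probability is spread: p follows distances in input space, while q follows distances on the output map.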

B. Free Energy Minimization
Learning should be performed to reduce the difference between actual and expected outputs. We represent this difference by using the Kullback-Leibler (KL) divergence between the actual firing rates p(j | s) and the expected firing rates q(j | s). In addition to this KL divergence, other errors must be minimized, namely the quantization errors between connection weights and input patterns. Fixing these quantization errors and minimizing the KL divergence, we have the optimal firing rates (equation (10)). From these, we have the quantity called "free energy". Finally, by differentiating the free energy with respect to the connection weights, we can obtain the re-estimation formula. As shown in the equation (10), connection weights are modified to make the actual outputs closer to the expected outputs.
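The steps above can be sketched in code. The forms used here are assumptions, since the paper's equations are not recoverable from this text: optimal firing rates proportional to q(j | s) exp(-d/2σ²), a free energy of log-sum-exp form, and a re-estimation formula obtained by setting the derivative of that free energy to zero with q(j | s) and p(k | s) held fixed. Pattern probabilities are taken as uniform, q(j | s) is random for illustration, and all names are hypothetical.

```python
import numpy as np

def weighted_sq_dist(X, W, p_ks):
    """d2[s, j] = sum_k p(k|s) * (x_k^s - w_kj)^2 (assumed form of the distance)."""
    diff2 = (X[:, None, :] - W[None, :, :]) ** 2
    return (p_ks[:, None, :] * diff2).sum(axis=2)

def optimal_firing_rates(d2, q, sigma):
    """p*(j|s) proportional to q(j|s) * exp(-d2[s, j] / (2 sigma^2))."""
    logits = np.log(q) - d2 / (2.0 * sigma ** 2)
    logits -= logits.max(axis=1, keepdims=True)       # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def free_energy(d2, q, sigma):
    """F = -2 sigma^2 * mean_s log sum_j q(j|s) exp(-d2[s, j] / (2 sigma^2))."""
    z = (q * np.exp(-d2 / (2.0 * sigma ** 2))).sum(axis=1)
    return float(-2.0 * sigma ** 2 * np.log(z).mean())

def update_weights(X, p_star, p_ks):
    """w_kj = sum_s p*(j|s) p(k|s) x_k^s / sum_s p*(j|s) p(k|s); returns shape (M, L)."""
    num = np.einsum("sj,sk,sk->jk", p_star, p_ks, X)
    den = np.einsum("sj,sk->jk", p_star, p_ks)
    return num / den

rng = np.random.default_rng(1)
S, M, L = 20, 6, 3
X = rng.random((S, L))
W = rng.random((M, L))
p_ks = np.full((S, L), 1.0 / L)          # uniform input firing rates for illustration
q = rng.random((S, M)) + 0.1             # stand-in expected firing rates
q /= q.sum(axis=1, keepdims=True)
sigma = 0.5

d2 = weighted_sq_dist(X, W, p_ks)
p_star = optimal_firing_rates(d2, q, sigma)
f_before = free_energy(d2, q, sigma)
quant = (p_star * d2).sum(axis=1).mean()                  # quantization error term
kl = (p_star * np.log(p_star / q)).sum(axis=1).mean()     # KL divergence term
W_new = update_weights(X, p_star, p_ks)
f_after = free_energy(weighted_sq_dist(X, W_new, p_ks), q, sigma)
print(f_before, quant + 2.0 * sigma ** 2 * kl, f_after)
```

Under these assumptions the free energy equals the quantization error plus 2σ² times the KL divergence at the optimal firing rates, and one re-estimation step with q(j | s) fixed does not increase it.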

C. Estimating Firing Rates of Input Neurons
To obtain connection weights $w_{kj}$, we must estimate the firing rates of input neurons p(k | s) in the equation (1). For this purpose, we first compute the outputs from the jth output neuron. Normalizing these outputs, we have the estimated firing rates p(j | s). By using p(j | s), we have the output from the kth input neuron, from which we obtain the firing rate p(k | s) of the kth input neuron. Putting this firing rate p(k | s) into the equations (1) and (2), we have the distance that takes input neurons into account.
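The estimation procedure above can be sketched as a two-stage normalization. The exact form of the output from an input neuron is not recoverable from this text, so the sketch uses a hypothetical stand-in: the per-variable Gaussian match between the input and the weights, averaged over output neurons with p(j | s). The spread σ = 1/β is also an assumption.

```python
import numpy as np

def estimate_input_firing_rates(X, W, beta):
    """Return (p_js, p_ks); the form of the input-neuron output is hypothetical."""
    sigma = 1.0 / beta                                   # assumed definition of the spread
    diff2 = (X[:, None, :] - W[None, :, :]) ** 2         # (S, M, L)
    # Step 1: outputs of output neurons without input weighting, normalized to p(j|s).
    logits = -diff2.sum(axis=2) / (2.0 * sigma ** 2)
    logits -= logits.max(axis=1, keepdims=True)
    p_js = np.exp(logits)
    p_js /= p_js.sum(axis=1, keepdims=True)
    # Step 2 (hypothetical): output of input neuron k as the expected
    # per-variable Gaussian match under p(j|s).
    v_ks = (p_js[:, :, None] * np.exp(-diff2 / (2.0 * sigma ** 2))).sum(axis=1)   # (S, L)
    # Step 3: normalize over input neurons to obtain p(k|s).
    p_ks = v_ks / v_ks.sum(axis=1, keepdims=True)
    return p_js, p_ks

rng = np.random.default_rng(2)
X = rng.random((10, 5))      # inputs scaled to [0, 1]
W = rng.random((4, 5))
_, p_ks_small = estimate_input_firing_rates(X, W, beta=0.01)
_, p_ks_large = estimate_input_firing_rates(X, W, beta=5.0)
print(p_ks_small[0])
print(p_ks_large[0])
```

With a very small β the estimated firing rates are almost uniform over input neurons, and they become more differentiated as β grows. For inputs in [0, 1] the exponentials stay above floating-point underflow for the range of β used in the experiments.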

A. Experiment Outline
We applied the method to two data sets, namely, the glass and dermatology data. Both were taken from the well-known machine learning database [62]. The numbers of input patterns and input neurons for the glass data were 214 and 10, respectively. The numbers of input patterns and input neurons for the dermatology data were 366 and 34, respectively. All the data were normalized to range between zero and one. For quantitative evaluation, we used the well-known quantization and topographic errors. There have been many attempts [12], [13], [14], [15], [16], [17], [18] to measure map quality quantitatively. Among them, these two errors are very simple and easy to implement. For visual evaluation, we used principal component analysis (PCA) to summarize connection weights. As mentioned in the introduction, SOM knowledge is difficult to interpret, and a number of methods have been developed to clarify it [20], [21], [22], [23], [24], [25], [26], [27], [28], [29]. In this study, we used PCA for clarification, in particular for simplifying the knowledge. It would be easy to demonstrate the performance by using techniques specific to the SOM, such as the U-matrix. However, we used PCA so that the present results could be widely interpreted and reproduced.
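The evaluation measures mentioned above can be sketched as follows, using their common definitions. The quantization error is the mean distance from each pattern to its best-matching unit, and the topographic error is the fraction of patterns whose best and second-best units are not adjacent on the map. The rectangular lattice and the adjacency rule are assumptions, since the lattice used in the experiments is not stated here, and the function names are hypothetical.

```python
import numpy as np

def quantization_error(X, W):
    """Mean Euclidean distance from each pattern to its best-matching unit."""
    d = np.linalg.norm(X[:, None, :] - W[None, :, :], axis=2)    # (S, M)
    return float(d.min(axis=1).mean())

def topographic_error(X, W, grid):
    """Fraction of patterns whose best and second-best units are not map neighbors."""
    d = np.linalg.norm(X[:, None, :] - W[None, :, :], axis=2)
    order = np.argsort(d, axis=1)
    first, second = order[:, 0], order[:, 1]
    # Assumption: rectangular lattice; units are adjacent when they differ by
    # at most one step in every grid coordinate.
    step = np.abs(grid[first] - grid[second]).max(axis=1)
    return float((step > 1).mean())

def pca_project(W, n_components=2):
    """Project connection weights onto their leading principal components."""
    centered = W - W.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

grid = np.array([[0, 0], [0, 1], [0, 2]])        # a 1x3 map
W_ordered = np.array([[0.0], [1.0], [2.0]])      # weights ordered along the map
W_folded = np.array([[0.0], [2.0], [1.0]])       # same weights, folded map
X = np.array([[0.1], [1.9]])
print(quantization_error(X, W_ordered))
print(topographic_error(X, W_ordered, grid))
print(topographic_error(X, W_folded, grid))
```

In this toy case both maps have the same quantization error, but only the folded map has a nonzero topographic error (0.5). This illustrates why the two measures are reported together.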

B. Glass Data
1) Firing Rates of Input Neurons: First, we examine how the firing rates changed when the parameter β, and thus the information content in input neurons, was increased. When the parameter β was increased from one in Figure 2(a) to three in Figure 2(c), little change in the firing rates could be seen. When the parameter β was increased from four in Figure 2(d) to eight in Figure 2(f), the firing rates became gradually differentiated. When the parameter β was increased from ten in Figure 2(g) to 14 in Figure 2(i), higher and lower firing rates became clearer. Finally, when the parameter β was increased from 16 in Figure 2(j) to 20 in Figure 2(l), the clearest firing rates could be seen. Input neurons No. 6 and No. 8 had the highest firing rates, while input neuron No. 5 had the lowest firing rate. The results show that when the parameter β was increased, the firing rates became gradually clearer.
2) Results of PCA for Connection Weights: Figure 3 shows the results of the PCA for connection weights by the SOM and the information-theoretic method with double competition. Figure 3(a) shows the results of PCA by using the conventional SOM. A condensed group could be seen on the right hand side of the map, and the remaining connection weights were scattered widely on the left hand side of the map. When the parameter β was two and three in Figures 3(b) and (c), connection weights were close to those by the conventional SOM in Figure 3(a). When the parameter β was increased from four in Figure 3(d) to eight in Figure 3(f), a group on the right hand side began to separate from the others. When the parameter β was increased from 10 in Figure 3(g) to 14 in Figure 3(i), two explicit groups of connection weights on the left and right hand sides began to form. Finally, when the parameter β was increased from 16 in Figure 3(j) to 20 in Figure 3(l), connection weights were separated into two groups on the left hand side and right hand side. In addition, another group could be seen in the middle of the map.
Figure 4 shows the results of PCA by the information-theoretic method without considering input neurons. When the parameter β was increased from two in Figure 4(a) to 20 in Figure 4(c), the condensed group on the right hand side remained the same, but connection weights on the left hand side became more scattered. These results show that when the parameter β was increased, input patterns were separated into explicit groups by using the information on input neurons. On the other hand, without considering input neurons, explicit groups could not be obtained.
3) Quantization and Topographic Errors: We have seen that class structure is clearer by using the information-theoretic method with double competition. The next step is to quantify the map quality obtained using this method. Figure 5(a) shows the quantization errors by the information-theoretic method with double competition in red, without considering input neurons in blue, and by SOM in black. The information-theoretic method without considering input neurons showed a sharp decrease in quantization errors, while the SOM and the method with double competition had relatively higher quantization errors. Topographic errors using the information-theoretic method without considering input neurons were quite large. On the other hand, topographic errors did not increase when using the information-theoretic method with double competition. The decrease in quantization errors and increase in topographic errors by the information-theoretic method without considering input neurons can be inferred from the free energy equation (18) (for a more detailed discussion, see the discussion section). On the other hand, quantization and topographic errors did not change excessively with the double competition information-theoretic method, and were close to the errors obtained by the conventional SOM. Thus, it can be said that the introduction of information on input neurons attenuated the operation of the free energy.

C. Dermatology Data
1) Firing Rates of Input Neurons: We applied the information-theoretic method to the well-known dermatology data set from the machine learning database. Figure 6 shows the firing rates of input neurons when the parameter β was increased from one (a) to 15 (i). Even if we increased the parameter β beyond this point, little change could be seen in the firing rates. When the parameter β was one in Figure 6(a), the firing rates were almost uniform. When the parameter β was two and three in Figure 6(b) and (c), small changes in the firing rates appeared. When the parameter β was increased from five in Figure 6(d) to nine in Figure 6(f), differences between higher and lower rates became larger. When the parameter β was increased from 11 in Figure 6(g) to 15 in Figure 6(i), the differences between higher and lower firing rates were at their largest.
2) Results of the PCA for Connection Weights: Figure 7 shows the results of the PCA for connection weights by the SOM (a) and the information-theoretic method with double competition when the parameter β was increased from one (b) to 15 (i). By using the SOM, as shown in Figure 7(a), connection weights seemed to be divided into three groups with weak boundaries. When the parameter β was one and three in Figures 7(b) and (c), the results of the PCA were almost equivalent to those by the SOM in Figure 7(a). When the parameter β was increased from five in Figure 7(d) to nine in Figure 7(f), a distinct group became separated on the right hand side of the map. When the parameter β was further increased from 11 in Figure 7(g) to 15 in Figure 7(i), three groups were clearly separated.
Figure 8 shows the results of the PCA by the information-theoretic method without considering input neurons. When the parameter β was increased from one in Figure 8(a) to nine in Figure 8(b), three groups became more apparent. Then, even when the parameter β was increased from nine in Figure 8(b) to 15 in Figure 8(c), the results of the PCA remained almost the same. The results of the PCA by the information-theoretic method without considering input neurons were inferior to those by the information-theoretic method with double competition in terms of class structure. This shows that the information of input neurons is critical in clarifying class structure.
3) Quantization and Topographic Errors: Figure 9 shows the quantization and topographic errors when the parameter β was increased from one to 15. Figure 9(a) shows quantization errors by the SOM in black, the information-theoretic method with double competition in red, and without considering input neurons in blue. By using the information-theoretic method without considering input neurons, the quantization errors decreased sharply from the beginning onwards. On the other hand, by using the information-theoretic method with double competition, quantization errors increased and became larger than those by the conventional SOM. Figure 9(b) shows topographic errors when the parameter β was increased from one to 15. By using the information-theoretic method without considering input neurons, the topographic error increased sharply and eventually became much larger than the error obtained by the conventional SOM. On the other hand, by using the information-theoretic method with double competition, the topographic error increased less than by using the information-theoretic method without double competition. The behavior of the information-theoretic method without considering input neurons can be inferred from the free energy equation (18). By introducing the firing rates of input neurons, this tendency was attenuated. When using the information-theoretic method with double competition, the quantization and topographic errors did not increase or decrease to the extent observed when using the information-theoretic method without considering input neurons.

Fig. 2. Firing rates of input neurons by the information-theoretic method with double competition when the parameter β was increased from one (a) to 20 (l) for the glass data.

1) Validity of Methods and Experimental Results:
In this paper, we have proposed a new type of information-theoretic method which takes into account the firing rates of input neurons. We have so far shown that competitive learning as well as self-organizing maps aim to maximize mutual information between input patterns and output neurons [59], [60], [61]. However, little attention has been paid to information content in input neurons. In particular, we have not fully used any information on input neurons in learning processes. Thus, we have introduced the firing rates of input neurons in the learning procedure of the self-organizing maps. We succeeded in determining the re-estimation formula for connection weights. We applied the method to two well-known data sets from the machine learning database, namely, the glass and dermatology data. In both data sets, we succeeded in extracting clearer class structure, particularly by detecting clear class boundaries.
In addition, we could see that quantization and topographic errors were inversely related when we used the method without considering input neurons. This inverse relation can be predicted by examining the free energy equation. In its expanded form, equation (18), the free energy consists of two terms whose relative weight is governed by the spread parameter σ, which is defined by using the parameter β. When the parameter β was increased, and the spread parameter σ was thus decreased, the first term of the free energy became more effective. This means that quantization errors decreased, as shown in Figures 5(a) and 9(a). On the other hand, when the parameter β is decreased and the spread parameter σ is increased, the effect of the second term of the free energy becomes dominant. The second term is the KL divergence, which is used to imitate the collective behavior of output neurons. Thus, when the parameter β is decreased, the topographic errors should decrease as well. This is shown in Figures 5(b) and 9(b). The introduction of input neuron firing rates in the learning processes attenuated this tendency.
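The two-term structure referred to above can be written out under assumptions that are not recoverable from this text: a squared distance $d_j^s$ weighted by $p(k \mid s)$, optimal firing rates $p^*(j \mid s)$ proportional to $q(j \mid s)\exp(-d_j^s/2\sigma^2)$, and a free energy of log-sum-exp form. A sketch of the expansion under these assumptions:

```latex
\begin{align}
d_j^s &= \sum_{k=1}^{L} p(k \mid s)\,\bigl(x_k^s - w_{kj}\bigr)^2, \\
p^*(j \mid s) &= \frac{q(j \mid s)\,\exp\!\bigl(-d_j^s / 2\sigma^2\bigr)}
                     {\sum_{m=1}^{M} q(m \mid s)\,\exp\!\bigl(-d_m^s / 2\sigma^2\bigr)}, \\
F &= -2\sigma^2 \sum_{s=1}^{S} p(s)\,\log \sum_{j=1}^{M} q(j \mid s)\,
      \exp\!\left(-\frac{d_j^s}{2\sigma^2}\right) \\
  &= \sum_{s=1}^{S} p(s) \sum_{j=1}^{M} p^*(j \mid s)\, d_j^s
     \;+\; 2\sigma^2 \sum_{s=1}^{S} p(s) \sum_{j=1}^{M} p^*(j \mid s)\,
     \log\frac{p^*(j \mid s)}{q(j \mid s)}.
\end{align}
```

In this form the first term is the quantization error and the second is the KL divergence scaled by $2\sigma^2$. Decreasing σ (increasing β) therefore shifts the balance toward the quantization term, and increasing σ shifts it toward the KL term, which is consistent with the behavior described in the text.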

2) Problems of the Method:
There are two problems with this information-theoretic method, namely, the estimation of the firing rates of input neurons and the degradation in terms of quantization and topographic errors.
First, there is a problem with estimating the firing rates of input neurons, which must be computed in order to realize competitive processes. In the computation of the firing rates of competitive neurons, we must insert the firing rates of input neurons into equation (1). In Section II.C, we briefly presented how to estimate the firing rates of input neurons. However, in this estimation, we must in turn insert the firing rates of competitive neurons into equation (2). The two computations thus depend on each other. We should therefore examine more carefully whether the firing rates of input neurons can be stabilized for the precise computation of the information content, and for producing stable self-organizing maps.
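The mutual dependence between the two sets of firing rates can be examined with an alternating iteration that stops when the input firing rates no longer change. The following is only a sketch of that procedure. Equations (1) and (2) are not reproduced in this section, so `output_rates` and `input_rates` are hypothetical stand-ins (softmax forms chosen for illustration), and the names `alternate`, `tol`, and `max_iter` are likewise assumptions.

```python
import numpy as np

def output_rates(x, w, p_in, beta):
    # Hypothetical stand-in for equation (1): softmax over output neurons of
    # the negative squared distance, weighted by the input firing rates.
    d2 = (p_in * (x[:, None, :] - w[None, :, :]) ** 2).sum(axis=2)  # (S, M)
    logits = -0.5 * beta * d2
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def input_rates(x, w, p_out, beta):
    # Hypothetical stand-in for equation (2): input neurons whose variable is
    # reproduced with a smaller expected squared error fire more strongly.
    d2 = (x[:, None, :] - w[None, :, :]) ** 2                # (S, M, K)
    err = (p_out[:, :, None] * d2).sum(axis=1).mean(axis=0)  # (K,)
    logits = -0.5 * beta * err
    logits -= logits.max()
    p = np.exp(logits)
    return p / p.sum()

def alternate(x, w, beta, tol=1e-8, max_iter=200):
    """Alternate the two estimates until the input rates stabilize."""
    p_in = np.full(x.shape[1], 1.0 / x.shape[1])  # start from uniform rates
    for n_iter in range(1, max_iter + 1):
        p_out = output_rates(x, w, p_in, beta)
        new_p_in = input_rates(x, w, p_out, beta)
        change = np.abs(new_p_in - p_in).max()
        p_in = new_p_in
        if change < tol:
            return p_in, p_out, n_iter, True
    return p_in, p_out, max_iter, False  # did not stabilize within max_iter

rng = np.random.default_rng(0)
x = rng.random((20, 3))  # 20 input patterns, 3 input neurons
w = rng.random((4, 3))   # 4 output neurons
p_in, p_out, n_iter, converged = alternate(x, w, beta=5.0)
print(p_in, n_iter, converged)
```

The returned flag makes the stability question explicit: if the iteration does not settle for some value of β, the information content computed from the input firing rates cannot be trusted at that setting.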
Second, we have a problem of degradation in terms of quantization and topographic errors. In Figures 5 and 9, quantization and topographic errors increased with double competition, though they did not reach extreme values as was the case with the method without considering input neurons. We must explain why and how the degradation occurred and try to improve quantization and topographic errors.

3) Possibilities of the Method:
The method presented in this paper can be considered as a new type of input variable selection in the SOM, and it opens up the possibility of having competition in all components of neural networks. First, this method is an extension of the self-organizing maps which takes into account the importance of input neurons or input variables. The competition between input neurons can be considered as the introduction of the importance of input variables into the self-organizing maps. As is well known, variable selection has played important roles in learning, in particular in supervised learning [63], [64]. In unsupervised learning, such as the SOM, the criteria for choosing important variables have not been determined. However, in the information-theoretic method, a criterion to measure the importance of neurons is naturally introduced: the importance is measured in terms of the information content in neurons. When this information increases, the importance of the neurons becomes larger. We use the importance of input neurons to visualize input patterns by the SOM, where it plays an important role. Thus, it is important to examine relations between the importance of input neurons and the visualization of the SOM.
Second, there is the possibility of having competition among all components in neural networks. In the present model of a neural network, in addition to input and output neurons, there are connection weights from the input neurons to the output neurons. If it is possible to take into account the competition between all these connection weights, much better performance of a network can be expected. This means that in a neural network, all components compete with one another to process outside stimuli most efficiently.
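As an illustration of measuring importance by information content, the sketch below assumes one common definition: the information of a set of firing rates is the decrease from maximum entropy, log K minus the entropy of the rates. Whether this matches the exact definition used earlier in the paper is an assumption, and the function name `information_content` is hypothetical.

```python
import numpy as np

def information_content(p):
    """Information as decrease from maximum entropy: log K - H(p)."""
    p = np.asarray(p, dtype=float)
    p = p / p.sum()               # normalize the firing rates
    nz = p[p > 0]                 # treat 0 * log 0 as 0
    entropy = -(nz * np.log(nz)).sum()
    return np.log(len(p)) - entropy

uniform = information_content([0.25, 0.25, 0.25, 0.25])  # all inputs equal
peaked = information_content([1.0, 0.0, 0.0, 0.0])       # one input dominates
print(uniform, peaked)
```

Under this definition, uniform firing rates carry no information and every input variable is equally important, while information reaches its maximum, log K, when a single input neuron fires. Intermediate values indicate a partial ranking of the input variables.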

IV. CONCLUSION
In this paper, we have introduced an information-theoretic method that considers information in input neurons to realize competitive learning as well as the self-organizing maps. When mutual information between neurons and input patterns is maximized, just one neuron wins the competition. Namely, mutual information maximization corresponds to competitive learning. However, we can imagine that any component in a neural network should contain information on input patterns. Thus, we tried to take into account input neurons in addition to the output or competitive neurons usually used in competitive learning. We applied the information-theoretic method to the self-organizing maps by adding cooperation processes to competitive learning. Then, we applied the information-theoretic methods to two well-known data sets, namely, the glass and dermatology data sets from the machine learning database. We found that by increasing information in input neurons, connection weights tended to be divided into clear groups. In addition, the inverse relation between quantization and topographic errors, which was observed in the information-theoretic competitive learning without considering input neurons, was neutralized by considering these input neurons. However, map quality in terms of quantization and topographic errors tended to degrade when using the information-theoretic method. Thus, we should examine how and why this deterioration occurred to realize the information-theoretic method with better quantization and visualization performance.
Fig. 1. Network architecture for the information-theoretic self-organizing maps, where only connection weights from a winner at the center are plotted.
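The correspondence between mutual information maximization and winner-take-all competition can be checked numerically. The sketch below assumes equiprobable input patterns and the standard definition of mutual information between input patterns s and output neurons j, computed from the conditional firing rates p(j|s). The function name `mutual_information` and the toy firing rates are assumptions for illustration.

```python
import numpy as np

def mutual_information(p_out):
    """Mutual information between patterns and output neurons.

    p_out: (S, M) array of p(j|s); each row sums to one.
    Input patterns are assumed equiprobable, p(s) = 1/S.
    """
    p_out = np.asarray(p_out, dtype=float)
    p_j = p_out.mean(axis=0)                   # marginal firing rates p(j)
    safe_p_j = np.where(p_j > 0, p_j, 1.0)     # avoid division by zero
    ratio = np.where(p_out > 0, p_out / safe_p_j, 1.0)  # 0 * log 0 -> 0
    return (p_out * np.log(ratio)).sum(axis=1).mean()

# Four patterns, two output neurons.
no_competition = np.full((4, 2), 0.5)         # every neuron responds equally
soft = np.array([[0.8, 0.2], [0.8, 0.2], [0.2, 0.8], [0.2, 0.8]])
winner_take_all = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])

mi_none = mutual_information(no_competition)
mi_soft = mutual_information(soft)
mi_wta = mutual_information(winner_take_all)
print(mi_none, mi_soft, mi_wta)
```

In this toy case, mutual information is zero when all neurons respond equally, increases as the responses become more selective, and reaches log M when exactly one neuron fires for each pattern and the neurons are used equally often.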

Fig. 3. Results by PCA for connection weights by double competition for the glass data.

Fig. 5. Quantization and topographic errors by the SOM (in black), and by the information-theoretic method with double competition (in red) and without considering input neurons (in blue), for the glass data.

Fig. 6. Firing rates of input neurons when the parameter β was increased from one to 20 for the dermatology data.

Fig. 7. Results by PCA for connection weights by double competition for the dermatology data.

Fig. 8. Results of the PCA for connection weights without considering input neurons for the dermatology data.

Fig. 9. Quantization and topographic errors by the SOM, and by the information-theoretic method with double competition and without considering input neurons, for the dermatology data.