I. INTRODUCTION

The self-organizing map (SOM) proposed by Kohonen [1] has been widely used in industrial applications such as pattern recognition, biological modeling, data compression, signal processing and data mining [2]. It is an unsupervised and nonparametric neural network approach. The success of the SOM algorithm lies in its simplicity, which makes it easy to understand, simulate and use in many applications.

The basic SOM consists of neurons usually arranged in a two-dimensional structure, so that there are neighborhood relations among the neurons. After training is complete, each neuron is attached to a feature vector of the same dimension as the input space. By assigning each input vector to the neuron with the nearest feature vector, the SOM divides the input space into regions (clusters) with common nearest feature vectors. This process can be considered a form of vector quantization (VQ) [3]. Moreover, because of the neighborhood relations contributed by the interconnections among neurons, the SOM exhibits another important property: topology preservation.

Clustering algorithms attempt to organize unlabeled input vectors into clusters such that points within a cluster are more similar to each other than to vectors belonging to different clusters [4]. Clustering methods fall into five types: hierarchical, partitioning, density-based, grid-based and model-based clustering [5].

In this paper, a new two-level clustering algorithm is proposed. The idea is that the first level trains the data with the SOM neural network, and the clustering at the second level is a rough set based incremental clustering approach [6], which is applied to the output of the SOM and requires only a single scan of the neurons.
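The nearest-feature-vector assignment described above can be sketched in a few lines (a minimal illustration of the vector quantization view, not the paper's implementation; the array names are assumptions):

```python
import numpy as np

def assign_clusters(X, W):
    """Assign each input vector in X (n x d) to the neuron whose
    feature vector in W (m x d) is nearest -- vector quantization."""
    # Pairwise Euclidean distances between every input and every feature vector
    dists = np.linalg.norm(X[:, None, :] - W[None, :, :], axis=2)
    return np.argmin(dists, axis=1)  # index of the nearest neuron per input

X = np.array([[0.0, 0.0], [1.0, 1.0], [0.9, 1.1]])
W = np.array([[0.0, 0.1], [1.0, 1.0]])
print(assign_clusters(X, W))  # first input -> neuron 0, last two -> neuron 1
```

Each distinct index value returned here corresponds to one Voronoi region of the input space, which is exactly the cluster structure the second level operates on.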
The optimal number of clusters can be found by rough set theory, which groups the given neurons into a set of overlapping clusters (and thereby clusters the mapped data as well).

This paper is organized as follows: in Section II the basics of the SOM algorithm are outlined. The basics of incremental clustering and the rough set based approach are described in Section III. In Section IV the proposed algorithm is presented. Section V is dedicated to experimental results, and Section VI provides a brief conclusion and future work.

II. SELF ORGANIZING MAP AND CLUSTERING

Competitive learning is an adaptive process in which the neurons in a neural network gradually become sensitive to different input categories, i.e. sets of samples in a specific domain of the input space. After training, a division of neural nodes emerges in the network to represent different patterns of the inputs. The division is enforced by competition among the neurons: when an input x arrives, the neuron that is best able to represent it wins the competition and is allowed to learn it even better. If there exists an ordering between the neurons, i.e. the neurons are located on a discrete lattice, the competitive learning algorithm can be generalized: not only the winning neuron but also its neighboring neurons on the lattice are allowed to learn. The overall effect is that the final map becomes an ordered map in the input space. This is the essence of the SOM algorithm.

The SOM consists of m neurons located on a regular low-dimensional grid, usually one- or two-dimensional. The lattice of the grid is either hexagonal or rectangular.
The basic SOM algorithm is iterative. Each neuron i has a d-dimensional feature vector w_i = [w_i1, ..., w_id]. At each training step t, a sample data vector x(t) is randomly chosen from the training set. The distances between x(t) and all feature vectors are computed. The winning neuron, denoted by c, is the neuron whose feature vector is closest to x(t):

c = arg min_{i ∈ {1, ..., m}} || x(t) - w_i ||.  (1)
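The winner selection in Eq. (1) is a single argmin over the m feature vectors; a minimal sketch (variable names are assumptions):

```python
import numpy as np

def winner(x, W):
    """Return the index c of the neuron whose feature vector is
    closest (in Euclidean distance) to the sample x, per Eq. (1)."""
    return int(np.argmin(np.linalg.norm(W - x, axis=1)))

W = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
print(winner(np.array([0.9, 0.1]), W))  # -> 1
```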
The set of neighboring nodes of the winning node is denoted by N_c. We define h_ic(t) as the neighborhood kernel function around the winning neuron c at time t. The neighborhood kernel function is a non-increasing function of time and of the distance of neuron i from the winning neuron c. The kernel can be taken as a Gaussian function:
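The Gaussian form commonly used for this kernel is h_ic(t) = exp(−‖r_i − r_c‖² / 2σ(t)²), where r_i is the lattice coordinate of neuron i and σ(t) is a decreasing neighborhood width; since the formula itself is not shown in the text above, this exact form is an assumption here. A sketch of the kernel and of one standard SOM training step built on it:

```python
import numpy as np

def gaussian_kernel(grid, c, sigma):
    """h_ic = exp(-||r_i - r_c||^2 / (2 sigma^2)) for every neuron i,
    where grid (m x 2) holds the lattice coordinates r_i."""
    d2 = np.sum((grid - grid[c]) ** 2, axis=1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def som_step(x, W, grid, sigma, lr):
    """One SOM training step: find the winner c via Eq. (1), then move
    every feature vector toward x, weighted by the kernel h_ic."""
    c = int(np.argmin(np.linalg.norm(W - x, axis=1)))
    h = gaussian_kernel(grid, c, sigma)
    W += lr * h[:, None] * (x - W)  # neighbors of c learn too
    return c
```

The kernel equals 1 at the winner and decays with lattice distance, so neurons near c on the grid learn almost as much as c itself; shrinking sigma over the training run gradually restricts learning to the winner, which is what produces the ordered map.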