Fuzzy Possibilistic C-Means Clustering
Partitions a numeric data set by using the Fuzzy and Possibilistic C-Means (FPCM) clustering algorithm (Pal et al., 1997).
fpcm(x, centers, memberships, m=2, eta=2, dmetric="sqeuclidean", pw=2, alginitv="kmpp", alginitu="imembrand", nstart=1, iter.max=1000, con.val=1e-09, fixcent=FALSE, fixmemb=FALSE, stand=FALSE, numseed)
x |
a numeric vector, data frame or matrix. |
centers |
an integer specifying the number of clusters or a numeric matrix containing the initial cluster centers. |
memberships |
a numeric matrix containing the initial membership degrees. If missing, it is internally generated. |
m |
a number greater than 1 to be used as the fuzziness exponent or fuzzifier. The default is 2. |
eta |
a number greater than 1 to be used as the typicality exponent. The default is 2. |
dmetric |
a string for the distance metric. The default is sqeuclidean for the squared Euclidean distances. See |
pw |
a number for the power of Minkowski distance calculation. The default is 2 if the |
alginitv |
a string for the initialization of cluster prototypes matrix. The default is kmpp for K-means++ initialization method (Arthur & Vassilvitskii, 2007). For the list of alternative options see |
alginitu |
a string for the initialization of the membership degrees matrix. The default is imembrand for random sampling of initial membership degrees. |
nstart |
an integer for the number of starts for clustering. The default is 1. |
iter.max |
an integer for the maximum number of iterations allowed. The default is 1000. |
con.val |
a number for the convergence value between the iterations. The default is 1e-09. |
fixcent |
a logical flag to keep the initial cluster centers unchanged across the different starts of the algorithm. The default is FALSE. |
fixmemb |
a logical flag to keep the initial membership degrees unchanged across the different starts of the algorithm. The default is FALSE. |
stand |
a logical flag to standardize data. Its default value is FALSE. |
numseed |
a seeding number to set the seed of R's random number generator. |
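The interplay of nstart, numseed and the per-start results can be illustrated with a short call. This is only a sketch: it assumes the ppclust package is installed (the call is skipped otherwise), and the values 3 and 123 are arbitrary choices for illustration.

```r
# Run FPCM with several random starts and a fixed seed.
# Assumption: the ppclust package is installed; otherwise the block is skipped.
if (requireNamespace("ppclust", quietly = TRUE)) {
  x <- iris[, -5]
  res <- ppclust::fpcm(x, centers = 3, nstart = 3, numseed = 123)

  # Objective function value reached in each start
  print(res$func.val)
  # Index of the start with the smallest objective function value
  print(res$best.start)
  # Number of iterations used in each start
  print(res$iter)
}
```

Setting numseed makes the random initializations repeatable, which helps when comparing runs with different values of m or eta.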
The Fuzzy and Possibilistic C-Means (FPCM) algorithm, proposed by Pal et al. (1997), is intended to combine the characteristics of FCM and PCM; hence it is also called the Mixed C-Means (MCM) algorithm.
The objective function of FPCM is:
J_{FPCM}(\mathbf{X}; \mathbf{V}, \mathbf{U}, \mathbf{T})=∑\limits_{j=1}^k ∑\limits_{i=1}^n (u_{ij}^m + t_{ij}^η) \; d^2(\vec{x}_i, \vec{v}_j)
In the above equation:
\mathbf{X} = {\vec{x}_1, \vec{x}_2, …, \vec{x}_n} \subseteq\Re^p is the data set of n objects in the p-dimensional data space \Re^p,
\mathbf{V} = {\vec{v}_1, \vec{v}_2, …, \vec{v}_k} \subseteq\Re^p is the prototype matrix of the clusters,
\mathbf{U} = {u_{ij}} is the matrix for a fuzzy partition of \mathbf{X},
\mathbf{T} = {t_{ij}} is the matrix for a possibilistic partition of \mathbf{X},
d^2(\vec{x}_i, \vec{v}_j) is the squared Euclidean distance between the object \vec{x}_i and the cluster prototype \vec{v}_j:
d^2(\vec{x}_i , \vec{v}_j) = ||\vec{x}_i - \vec{v}_j||^2 = (\vec{x}_i - \vec{v}_j)^T (\vec{x}_i - \vec{v}_j)
m is the fuzzifier to specify the amount of fuzziness for the clustering; 1 < m < ∞. It is usually chosen as 2.
η is the typicality exponent to specify the amount of typicality for the clustering; 1 < η < ∞. It is usually chosen as 2.
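As a worked illustration of the objective function, the following base R sketch evaluates J_FPCM on a tiny hand-made example. The helper names sq_dist and fpcm_objective and the toy matrices are hypothetical and are not part of the package.

```r
# Squared Euclidean distances between n objects (rows of x)
# and k prototypes (rows of v); returns an n x k matrix.
sq_dist <- function(x, v) {
  x <- as.matrix(x)
  v <- as.matrix(v)
  d2 <- matrix(0, nrow(x), nrow(v))
  for (j in seq_len(nrow(v))) {
    d2[, j] <- rowSums(sweep(x, 2, v[j, ])^2)
  }
  d2
}

# J_FPCM = sum over j and i of (u_ij^m + t_ij^eta) * d^2(x_i, v_j)
fpcm_objective <- function(x, v, u, typ, m = 2, eta = 2) {
  sum((u^m + typ^eta) * sq_dist(x, v))
}

# Toy data: 4 objects in 2 dimensions, 2 prototypes
x <- matrix(c(0, 0,
              0, 1,
              4, 4,
              4, 5), ncol = 2, byrow = TRUE)
v <- matrix(c(0, 0.5,
              4, 4.5), ncol = 2, byrow = TRUE)
# Fuzzy memberships (each row sums to 1)
u <- matrix(c(1, 0,
              1, 0,
              0, 1,
              0, 1), ncol = 2, byrow = TRUE)
# Typicalities (each column sums to 1)
typ <- matrix(c(0.5, 0,
                0.5, 0,
                0, 0.5,
                0, 0.5), ncol = 2, byrow = TRUE)

J <- fpcm_objective(x, v, u, typ)
print(J)  # 1.25
```

Each object lies at squared distance 0.25 from its own prototype and carries weight 1^2 + 0.5^2 = 1.25 there, so J = 4 * 1.25 * 0.25 = 1.25.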
FPCM must satisfy the following constraints:
∑\limits_{j=1}^k u_{ij} = 1 \;\;;\; 1 ≤ i ≤ n
∑\limits_{i=1}^n t_{ij} = 1 \;\;;\; 1 ≤ j ≤ k
The objective function of FPCM is minimized by using the following update equations:
u_{ij} =\Bigg[∑\limits_{l=1}^k \Big(\frac{d^2(\vec{x}_i, \vec{v}_j)}{d^2(\vec{x}_i, \vec{v}_l)}\Big)^{1/(m-1)} \Bigg]^{-1} \;\;; 1 ≤ i ≤ n,\; 1 ≤ j ≤ k
t_{ij} =\Bigg[∑\limits_{l=1}^n \Big(\frac{d^2(\vec{x}_i, \vec{v}_j)}{d^2(\vec{x}_l, \vec{v}_j)}\Big)^{1/(η-1)} \Bigg]^{-1} \;\;; 1 ≤ i ≤ n, \; 1 ≤ j ≤ k
\vec{v}_{j} =\frac{∑\limits_{i=1}^n (u_{ij}^m + t_{ij}^η) \vec{x}_i}{∑\limits_{i=1}^n (u_{ij}^m + t_{ij}^η)} \;\;; 1 ≤ j ≤ k
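The update equations can be sketched in a few lines of base R. This is a minimal illustration, not the package's implementation: the function name fpcm_step, the simulated data and the stopping rule are assumptions, and the sketch assumes that no object coincides exactly with a prototype (no zero distances).

```r
# One FPCM iteration: update memberships, typicalities and prototypes.
fpcm_step <- function(x, v, m = 2, eta = 2) {
  x <- as.matrix(x)
  v <- as.matrix(v)
  n <- nrow(x)
  k <- nrow(v)

  # n x k matrix of squared Euclidean distances
  d2 <- matrix(0, n, k)
  for (j in seq_len(k)) {
    d2[, j] <- rowSums(sweep(x, 2, v[j, ])^2)
  }

  # u_ij: normalize d2^(-1/(m-1)) over the clusters (each row sums to 1)
  wu <- d2^(-1 / (m - 1))
  u <- wu / rowSums(wu)

  # t_ij: normalize d2^(-1/(eta-1)) over the objects (each column sums to 1)
  wt <- d2^(-1 / (eta - 1))
  typ <- sweep(wt, 2, colSums(wt), "/")

  # v_j: mean of the objects weighted by u_ij^m + t_ij^eta
  w <- u^m + typ^eta
  v_new <- (t(w) %*% x) / colSums(w)

  list(u = u, t = typ, v = v_new)
}

# Two simulated groups of 20 objects each in 2 dimensions
set.seed(1)
x <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
           matrix(rnorm(40, mean = 5), ncol = 2))
# Start near one object of each group; the offset avoids zero distances
v <- x[c(1, 21), ] + 0.1

# Iterate until the prototypes stop moving (or 50 iterations are reached)
for (it in 1:50) {
  step <- fpcm_step(x, v)
  if (max(abs(step$v - v)) < 1e-9) break
  v <- step$v
}

range(rowSums(step$u))  # fuzzy constraint: every row sums to 1
colSums(step$t)         # possibilistic constraint: every column sums to 1
```

The normalized forms used for u and t are algebraically the same as the bracketed sums in the update equations; they only avoid computing the distance ratios explicitly.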
An object of class ‘ppclust’, which is a list consisting of the following items:
x |
a numeric matrix containing the processed data set. |
v |
a numeric matrix containing the final cluster prototypes (centers of clusters). |
u |
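a numeric matrix containing the fuzzy membership degrees of the data objects. |
t |
a numeric matrix containing the possibilistic (typicality) membership degrees of the data objects. |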
d |
a numeric matrix containing the distances of objects to the final cluster prototypes. |
k |
an integer for the number of clusters. |
m |
a number for the fuzzifier. |
eta |
a number for the typicality exponent. |
cluster |
a numeric vector containing the cluster labels found by defuzzifying the fuzzy membership degrees of the objects. |
csize |
a numeric vector containing the number of objects in the clusters. |
iter |
an integer vector for the number of iterations in each start of the algorithm. |
best.start |
an integer for the index of the start that produced the minimum objective function value. |
func.val |
a numeric vector for the objective function values in each start of the algorithm. |
comp.time |
a numeric vector for the execution time in each start of the algorithm. |
stand |
a logical value, |
wss |
a numeric vector containing the within-cluster sum of squares for each cluster. |
bwss |
a number for the between-cluster sum of squares. |
tss |
a number for the total sum of squares. |
twss |
a number for the total within-cluster sum of squares. |
algorithm |
a string for the name of the partitioning algorithm. It is ‘FPCM’ with this function. |
call |
a string for the matched function call generating this ‘ppclust’ object. |
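To make the derived items concrete, the sketch below shows how hard labels and the sums of squares could be obtained from a membership matrix. It assumes that defuzzification assigns each object to its highest-membership cluster and uses the usual hard-partition definitions of the sums of squares; the package may compute these items differently, and the toy values are invented for illustration.

```r
# Toy fuzzy membership matrix: 5 objects, 2 clusters
u <- matrix(c(0.9, 0.1,
              0.8, 0.2,
              0.3, 0.7,
              0.4, 0.6,
              0.2, 0.8), ncol = 2, byrow = TRUE)

# Defuzzify: label each object with its highest-membership cluster (assumption)
cluster <- max.col(u, ties.method = "first")
# Number of objects per cluster
csize <- tabulate(cluster, nbins = ncol(u))

# Toy one-dimensional data for the same 5 objects
x <- c(1, 2, 8, 9, 10)

# Within-cluster sum of squares for each cluster, and their total
wss <- as.numeric(tapply(x, cluster, function(z) sum((z - mean(z))^2)))
twss <- sum(wss)
# Total sum of squares around the grand mean
tss <- sum((x - mean(x))^2)
# Between-cluster sum of squares as the remainder
bwss <- tss - twss

print(cluster)               # 1 1 2 2 2
print(csize)                 # 2 3
print(c(twss, bwss, tss))    # 2.5 67.5 70.0
```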
Zeynel Cebeci, Alper Tuna Kavlak & Figen Yildiz
Arthur, D. & Vassilvitskii, S. (2007). K-means++: The advantages of careful seeding. In Proc. of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1027-1035. <http://ilpubs.stanford.edu:8090/778/1/2006-13.pdf>
Pal, N.R., Pal, K. & Bezdek, J.C. (1997). A mixed c-means clustering model. In Proc. of the 6th IEEE Int. Conf. on Fuzzy Systems, 1, pp. 11-21. <doi:10.1109/FUZZY.1997.616338>
# Load the ppclust package and the iris data set
library(ppclust)
data(iris)
x <- iris[,-5]

# Initialize the prototype matrix using K-means++
v <- inaparc::kmpp(x, k=3)$v

# Initialize the membership degrees matrix
u <- inaparc::imembrand(nrow(x), k=3)$u

# Run FPCM with the initial prototypes and memberships
fpcm.res <- fpcm(x, centers=v, memberships=u, m=2, eta=2)

# Show the fuzzy membership degrees for the top 5 objects
head(fpcm.res$u, 5)

# Show the possibilistic membership degrees for the top 5 objects
head(fpcm.res$t, 5)