API: pytsk.cluster

pytsk.cluster mainly provides fuzzy clustering algorithms. Each algorithm implements a transform method, which converts the raw feature matrix \(X \in \mathbb{R}^{N,D}\) into the consequent input matrix \(P \in \mathbb{R}^{N,T}\) of a TSK fuzzy system, where \(N\) is the number of samples and \(D\) is the input dimension. For a zero-order TSK fuzzy system, \(T=R\), where \(R\) is the number of rules (equal to the number of clusters of the fuzzy clustering algorithm); for a first-order TSK fuzzy system, \(T = (D+1)\times R\).
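As a quick check of the shapes described above, the following minimal sketch computes the number of columns \(T\) of \(P\) for both orders. The function name consequent_width is hypothetical and is not part of pytsk.

```python
def consequent_width(n_features: int, n_rules: int, order: int) -> int:
    """Return T, the number of columns of the consequent input matrix P."""
    if order == 0:
        # Zero-order TSK: one column per rule.
        return n_rules
    if order == 1:
        # First-order TSK: (D + 1) columns per rule.
        return (n_features + 1) * n_rules
    raise ValueError("order must be 0 or 1")


# Example with D = 4 features and R = 3 rules.
print(consequent_width(4, 3, order=0))  # 3
print(consequent_width(4, 3, order=1))  # 15
```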

class pytsk.cluster.BaseFuzzyClustering

Parent: object.

The parent class of fuzzy clustering classes.

set_params(self, **params): Sets attributes. Implemented for compatibility with the scikit-learn API.
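The scikit-learn convention that set_params follows can be illustrated with a small stand-in class. This is only a sketch: ParamHolder is hypothetical, and the actual body of BaseFuzzyClustering.set_params is not shown on this page.

```python
class ParamHolder:
    """Hypothetical stand-in for a clustering class; not part of pytsk."""

    def __init__(self, n_cluster=2, order=1):
        self.n_cluster = n_cluster
        self.order = order

    def set_params(self, **params):
        # Assign each keyword argument as an attribute and return self,
        # which is the convention scikit-learn estimators follow.
        for name, value in params.items():
            setattr(self, name, value)
        return self


holder = ParamHolder().set_params(n_cluster=5, order=0)
print(holder.n_cluster, holder.order)  # 5 0
```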

class pytsk.cluster.FuzzyCMeans(n_cluster, fuzzy_index='auto', sigma_scale='auto', init='random', tol_iter=100, error=1e-06, dist='euclidean', verbose=0, order=1)

Parent: BaseFuzzyClustering, sklearn.base.BaseEstimator, sklearn.base.TransformerMixin.

The fuzzy c-means (FCM) clustering algorithm [1]. This implementation is adopted from the scikit-fuzzy package. When constructing a TSK fuzzy system, a fuzzy clustering algorithm is usually used to compute the antecedent parameters; the consequent parameters can then be computed by least-squares algorithms such as Ridge regression [2]. See Quick start for how to use this class.
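To illustrate the second step (fitting the consequent parameters by regularized least squares), here is a minimal NumPy sketch for a zero-order TSK, where the consequent input matrix is the membership matrix itself. The memberships are random placeholders rather than the output of FuzzyCMeans, the penalty alpha is a hypothetical value, and no intercept is fitted.

```python
import numpy as np

rng = np.random.default_rng(0)
N, R = 50, 3

# Placeholder memberships: each row is non-negative and sums to 1.
U = rng.dirichlet(np.ones(R), size=N)
y = rng.normal(size=N)

# Zero-order TSK: the consequent input matrix P is U itself.
P = U
alpha = 1e-2  # hypothetical ridge penalty

# Closed-form ridge solution: w = (P^T P + alpha * I)^{-1} P^T y
w = np.linalg.solve(P.T @ P + alpha * np.eye(R), P.T @ y)

# Model output: one consequent weight per rule, mixed by the memberships.
y_hat = P @ w
```

In practice a library regressor could replace the closed-form solve; the point is only that, once \(P\) is available, the consequent part is a linear regression problem.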

The objective function of the FCM is:

\[\begin{split}&J = \sum_{i=1}^{N}\sum_{j=1}^{C} U_{i,j}^m\|\mathbf{x}_i - \mathbf{v}_j\|_2^2\\ &\text{s.t.}\ \sum_{j=1}^{C}U_{i,j} = 1,\ i = 1,...,N,\end{split}\]

where \(N\) is the number of samples, \(C\) is the number of clusters (which also corresponds to the number of rules of the TSK fuzzy system), \(m\) is the fuzzy index, \(\mathbf{x}_i\) is the \(i\)-th input vector, \(\mathbf{v}_j\) is the \(j\)-th cluster center vector, and \(U_{i,j}\) is the membership degree of the \(i\)-th input vector in the \(j\)-th cluster. The FCM algorithm yields the centers \(\mathbf{v}_j, j=1,...,C\) and the membership degrees \(U_{i,j}\).
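The objective can be minimized by alternating two closed-form updates, which is the textbook FCM scheme for the Euclidean distance. The sketch below is a minimal NumPy illustration with \(U\) stored as \([N, C]\); it is not the code used by this class, and the helper names are hypothetical.

```python
import numpy as np


def sq_dists(X, V):
    """Squared Euclidean distances between samples and centers: [N, C]."""
    return ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)


def fcm_objective(X, V, U, m):
    """J = sum_i sum_j U[i, j]**m * ||x_i - v_j||^2."""
    return float(((U ** m) * sq_dists(X, V)).sum())


def update_centers(X, U, m):
    """v_j = sum_i U_ij^m x_i / sum_i U_ij^m (membership-weighted means)."""
    W = U ** m                                   # [N, C]
    return (W.T @ X) / W.sum(axis=0)[:, None]    # [C, D]


def update_memberships(X, V, m, eps=1e-12):
    """U_ij proportional to d_ij^(-2/(m-1)), with each row normalised to 1."""
    d2 = np.fmax(sq_dists(X, V), eps)            # avoid division by zero
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, (20, 2)), rng.normal(4.0, 0.5, (20, 2))])
U = rng.dirichlet(np.ones(2), size=len(X))       # random start, rows sum to 1
m = 2.0

history = []
for _ in range(10):
    V = update_centers(X, U, m)
    U = update_memberships(X, V, m)
    history.append(fcm_objective(X, V, U, m))    # non-increasing sequence
```

Each update minimizes \(J\) with the other block held fixed, so the recorded objective values do not increase from one iteration to the next.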

Parameters

n_cluster (int) – Number of clusters, equal to the number of rules \(R\) of a TSK model.
fuzzy_index (float/str) – Fuzzy index of the FCM algorithm, default "auto". If fuzzy_index="auto", the fuzzy index is computed as \(\min(N, D-1) / (\min(N, D-1)-2)\) according to [3]; if \(\min(N, D-1)<3\), the fuzzy index is set to 2. Otherwise the given float value is used.
sigma_scale (float/str) – The scale parameter \(h\) that adjusts the actual standard deviation \(\sigma\) of the Gaussian membership function in the TSK antecedent part. If sigma_scale="auto", sigma_scale is set to \(\sqrt{D}\), where \(D\) is the input dimension [4]. Otherwise the given float value is used.
init (str/np.array) – The initialization strategy of the membership grade matrix \(U\). Supports “random” or a numpy array of size \([R, N]\), where \(R\) is the number of clusters/rules and \(N\) is the number of training samples. If init="random", the initial membership grade matrix is randomly initialized; otherwise the given matrix is used.
tol_iter (int) – The maximum number of iterations of the FCM algorithm.
error (float) – The stopping threshold: iteration ends early, before the maximum number of iterations is reached, once the error falls below this value.
dist (str) – The distance type for the scipy.spatial.distance.cdist() function, default “euclidean”. The distance function can also be “braycurtis”, “canberra”, “chebyshev”, “cityblock”, “correlation”, “cosine”, “dice”, “euclidean”, “hamming”, “jaccard”, “jensenshannon”, “kulsinski”, “kulczynski1”, “mahalanobis”, “matching”, “minkowski”, “rogerstanimoto”, “russellrao”, “seuclidean”, “sokalmichener”, “sokalsneath”, “sqeuclidean”, “yule”.
verbose (int) – If > 0, the loss of the FCM objective function is shown during iterations.
order (int) – 0 or 1. Decides whether to construct a zero-order or a first-order TSK.
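The two "auto" rules in the parameter list can be written out directly. The sketch below only restates the formulas quoted above; the function names are hypothetical and are not part of pytsk.

```python
import math


def auto_fuzzy_index(n_samples: int, n_features: int) -> float:
    """k / (k - 2) with k = min(N, D - 1), or 2 when k < 3."""
    k = min(n_samples, n_features - 1)
    if k < 3:
        return 2.0
    return k / (k - 2)


def auto_sigma_scale(n_features: int) -> float:
    """sigma_scale="auto" is described as sqrt(D)."""
    return math.sqrt(n_features)


print(auto_fuzzy_index(100, 11))  # 1.25
print(auto_fuzzy_index(100, 3))   # 2.0
print(auto_sigma_scale(16))       # 4.0
```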

fit(self, X, y=None)

Run the FCM algorithm.

Parameters

X (numpy.array) – Input array of size \([N, D]\), where \(N\) is the number of training samples and \(D\) is the number of features.
y (numpy.array) – Not used. Pass None.

predict(self, X, y=None)

Predict the membership degrees of X on each cluster.

Parameters

X (numpy.array) – Input array of size \([N, D]\), where \(N\) is the number of samples and \(D\) is the number of features.
y (numpy.array) – Not used. Pass None.

Returns

Returns the membership degree matrix \(U\) of size \([N, R]\), where \(N\) is the number of samples in X and \(R\) is the number of clusters/rules. \(U_{i,r}\) represents the membership degree of the \(i\)-th sample in the \(r\)-th cluster.
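This page does not spell out the formula that predict applies. As an illustration only, the sketch below computes normalized membership degrees from Gaussian membership functions with centers \(V\) and standard deviations \(\Sigma\) (both \([R, D]\)), which is one common construction for a TSK antecedent. It assumes all standard deviations are positive and leaves out the sigma_scale factor, since how that factor enters is not stated here; the function name is hypothetical.

```python
import numpy as np


def gaussian_memberships(X, V, sigma):
    """Normalised Gaussian firing levels; X: [N, D], V and sigma: [R, D] -> [N, R]."""
    # Log firing level of rule r for sample i:
    # -sum_d (x_id - v_rd)^2 / (2 * sigma_rd^2)
    diff2 = (X[:, None, :] - V[None, :, :]) ** 2
    z = -(diff2 / (2.0 * sigma[None, :, :] ** 2)).sum(axis=2)
    # Softmax over rules, shifted by the row maximum for numerical stability.
    z -= z.max(axis=1, keepdims=True)
    f = np.exp(z)
    return f / f.sum(axis=1, keepdims=True)


X = np.array([[0.0, 0.0], [4.0, 4.0], [2.0, 2.0]])
V = np.array([[0.0, 0.0], [4.0, 4.0]])
sigma = np.ones_like(V)

U = gaussian_memberships(X, V, sigma)
# Rows sum to 1; the midpoint sample (2, 2) is shared equally by both rules.
```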

transform(self, X, y=None)

Compute the membership degree matrix \(U\), then use \(X\) and \(U\) to build the consequent input matrix \(P\) with the function x2xp(X, U, order).

Parameters

X (numpy.array) – Input array of size \([N, D]\), where \(N\) is the number of samples and \(D\) is the number of features.
y (numpy.array) – Not used. Pass None.

Returns

Returns the consequent input matrix \(P\). For order=1 its size is \([N, (D+1)\times R]\); for order=0 it is the membership degree matrix \(U\) of size \([N, R]\). Here \(N\) is the number of test samples, \(D\) is the number of features, and \(R\) is the number of clusters/rules.

x2xp(X, U, order)

Convert the feature matrix \(X\) and the membership degree matrix \(U\) into the consequent input matrix \(P\).

Each row of \(X\in \mathbb{R}^{N,D}\) is a \(D\)-dimensional input vector. Let \(\mathbf{x}\) be one such row and \(u_r\) its membership degree in the \(r\)-th rule. For a first-order TSK, the corresponding row of the consequent input matrix \(P\) is computed as follows [5]:

\[\begin{split}&\mathbf{x}_e = (1, \mathbf{x}),\\ &\tilde{\mathbf{x}}_r = u_r \mathbf{x}_e,\\ &\mathbf{p} = (\tilde{\mathbf{x}}_1, \tilde{\mathbf{x}}_2, ...,\tilde{\mathbf{x}}_R),\end{split}\]

where \(\mathbf{p}\) is the corresponding row of \(P\), a vector of dimension \((D+1)\times R\). The consequent parameters of the TSK can then be optimized by any linear regression algorithm.

Parameters

X (numpy.array) – size: \([N,D]\). Input features.
U (numpy.array) – size: \([N,R]\). Corresponding membership degree matrix.
order (int) – 0 or 1. The order of TSK models.

Returns

If order=0, returns \(U\) directly; if order=1, returns the matrix \(P\) of size \([N, (D+1)\times R]\). Details can be found in [2].
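The first-order construction can be sketched in NumPy by following the three formulas above literally. This is an illustrative re-implementation under the name x2xp_sketch (hypothetical), not the library function; the column ordering used inside pytsk may differ.

```python
import numpy as np


def x2xp_sketch(X, U, order):
    """Build P from X [N, D] and U [N, R] following the formulas above."""
    if order == 0:
        return U
    if order != 1:
        raise ValueError("order must be 0 or 1")
    N = X.shape[0]
    # x_e = (1, x): prepend a column of ones -> [N, D + 1]
    X_e = np.hstack([np.ones((N, 1)), X])
    # x~_r = u_r * x_e for every rule r -> [N, R, D + 1]
    blocks = U[:, :, None] * X_e[:, None, :]
    # p = (x~_1, ..., x~_R): concatenate the R blocks -> [N, R * (D + 1)]
    return blocks.reshape(N, -1)


X = np.array([[2.0, 3.0]])      # one sample, D = 2
U = np.array([[0.25, 0.75]])    # memberships in R = 2 rules
P = x2xp_sketch(X, U, order=1)
# The row of P is 0.25 * (1, 2, 3) followed by 0.75 * (1, 2, 3).
```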

compute_variance(X, U, V)

Compute the standard deviation of the Gaussian membership function in TSK fuzzy systems. After running FCM, one can use \(\mathbf{v}_j\) and \(U_{i,j}\) to construct the Gaussian-membership-function-based antecedent of a TSK fuzzy system. The center of the Gaussian membership function can be set directly to the cluster center \(\mathbf{v}_j\), and its standard deviation can be computed as follows:

\[\sigma_{r,d}=\left[\sum_{i=1}^N U_{i,r}(x_{i,d}-v_{r,d})^2 / \sum_{i=1}^N U_{i,r} \right]^{1/2},\]

where \(v_{r,d}\) represents the cluster center of the \(d\)-th dimension in the \(r\)-th rule.

Parameters

X (numpy.array) – Input matrix \(X\) with the size of \([N, D]\).
U (numpy.array) – Membership degree matrix \(U\) with the size of \([R, N]\), i.e. rows index rules and columns index samples.
V (numpy.array) – Cluster center matrix \(V\) with the size of \([R, D]\).

Returns

The standard deviation matrix \(\Sigma\) with the size of \([R, D]\).
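The formula for \(\sigma_{r,d}\) maps directly onto array operations. The sketch below is an illustrative NumPy version that takes \(U\) in the \([R, N]\) layout listed under Parameters; the name compute_sigma_sketch is hypothetical and the code is not the library implementation.

```python
import numpy as np


def compute_sigma_sketch(X, U, V):
    """sigma[r, d] = sqrt(sum_i U[r, i] * (X[i, d] - V[r, d])^2 / sum_i U[r, i])."""
    # Squared deviations of every sample from every center: [R, N, D]
    dev2 = (X[None, :, :] - V[:, None, :]) ** 2
    # Membership-weighted sum over samples: [R, D]
    weighted = (U[:, :, None] * dev2).sum(axis=1)
    # Divide by the total membership of each rule, then take the square root.
    return np.sqrt(weighted / U.sum(axis=1)[:, None])


X = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 4.0], [10.0, 8.0]])
U = np.array([[1.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 1.0]])     # [R, N], crisp for readability
V = np.array([[1.0, 0.0], [10.0, 6.0]])

sigma = compute_sigma_sketch(X, U, V)
# Rule 1 spreads only along the first feature, rule 2 only along the second.
```

A zero entry in \(\Sigma\), as in this toy example, would make a Gaussian membership function degenerate, so in practice some lower bound on the result may be needed.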

[1] Bezdek J C, Ehrlich R, Full W. FCM: The fuzzy c-means clustering algorithm[J]. Computers & geosciences, 1984, 10(2-3): 191-203.

[2] Wang S, Chung K F L, Zhaohong D, et al. Robust fuzzy clustering neural network based on ɛ-insensitive loss function[J]. Applied Soft Computing, 2007, 7(2): 577-584.

[3] Yu J, Cheng Q, Huang H. Analysis of the weighting exponent in the FCM[J]. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 2004, 34(1): 634-639.

[4] Cui Y, Wu D, Xu Y. Curse of dimensionality for tsk fuzzy neural networks: Explanation and solutions[C]//2021 International Joint Conference on Neural Networks (IJCNN). IEEE, 2021: 1-8.

[5] Deng Z, Choi K S, Chung F L, et al. Scalable TSK fuzzy modeling for very large datasets using minimal-enclosing-ball approximation[J]. IEEE Transactions on Fuzzy Systems, 2010, 19(2): 210-226.