Reference code:
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score, davies_bouldin_score, calinski_harabasz_score

# Generate synthetic data: 300 points around 4 centers
X, y = make_blobs(n_samples=300, centers=4, cluster_std=0.60, random_state=0)

# Fit k-means with 4 clusters
kmeans = KMeans(n_clusters=4, random_state=0).fit(X)

# Read the cluster label assigned to each sample during fitting
labels = kmeans.labels_

# Silhouette coefficient (range -1 to 1, higher is better)
silhouette_avg = silhouette_score(X, labels)
print(f"Silhouette coefficient: {silhouette_avg}")

# Within-cluster sum of squares (WCSS), exposed by scikit-learn as inertia_
wcss = kmeans.inertia_
print(f"Within-cluster sum of squares (WCSS): {wcss}")

# Davies-Bouldin index (lower is better)
dbi = davies_bouldin_score(X, labels)
print(f"Davies-Bouldin index: {dbi}")

# Calinski-Harabasz index (higher is better)
chi = calinski_harabasz_score(X, labels)
print(f"Calinski-Harabasz index: {chi}")
Output:
Silhouette coefficient: 0.6819938690643478
Within-cluster sum of squares (WCSS): 212.00599621083472
Davies-Bouldin index: 0.43756400782378396
Calinski-Harabasz index: 1210.0899142587816
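The reference code reads WCSS directly from `kmeans.inertia_`. To make clear what that number measures, the following minimal sketch recomputes it by hand as the sum of squared Euclidean distances from each sample to its assigned centroid. It reuses the same data and model settings as above; passing `n_init=10` explicitly is an assumption added here so the run does not depend on the installed scikit-learn version's default, and `assigned_centers` and `manual_wcss` are illustrative names.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Same synthetic data as the reference code
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.60, random_state=0)

# n_init=10 is set explicitly here (an assumption, not part of the original code)
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)

# Look up the centroid assigned to each sample via its label
assigned_centers = kmeans.cluster_centers_[kmeans.labels_]

# WCSS: sum over all samples of the squared distance to the assigned centroid
manual_wcss = float(np.sum((X - assigned_centers) ** 2))

print(f"Manual WCSS: {manual_wcss:.6f}")
print(f"inertia_:    {kmeans.inertia_:.6f}")
```

The two printed values should agree up to floating-point rounding. Note that WCSS generally keeps shrinking as the number of clusters grows, so it is usually compared across several values of k rather than read in isolation, whereas the silhouette coefficient and Calinski-Harabasz index are better when higher and the Davies-Bouldin index is better when lower.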