René's URL Explorer Experiment


Title: A Comprehensive Guide to K-means Clustering (Python) · Issue #42 · aialgorithm/Blog · GitHub

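Description: 1. Introduction to Clustering. Clustering is a common unsupervised learning method. Put simply, it groups similar data samples into the same group (cluster). During clustering we do not know what any given class represents (there is usually no label information); the only goal is to bring similar samples together, relying solely on the distribution of the sample data itself. Clustering algorithms fall roughly into traditional clustering algorithms and deep clustering algorithms: traditional clustering algorithms work on the original features and apply partition-based, density-based, hierarchical, or similar methods. Deep clustering methods apply a traditional clustering algorithm to features obtained through representation learning. 2. kmean...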

Opengraph URL: https://github.com/aialgorithm/Blog/issues/42

X: @github

direct link

Domain: github.com


Hey, it has json ld scripts:
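{"@context":"https://schema.org","@type":"DiscussionForumPosting","headline":"A Comprehensive Guide to K-means Clustering (Python)","articleBody":"### 1. Introduction to Clustering\r\n\r\nClustering is a common unsupervised learning method. Put simply, it groups similar data samples into the same group (cluster). During clustering we do not know what any given class represents (there is usually no label information); the only goal is to bring similar samples together, relying solely on the distribution of the sample data itself.\r\n\r\nClustering algorithms fall roughly into traditional clustering algorithms and deep clustering algorithms:\r\n\r\n- Traditional clustering algorithms work on the original features and apply partition-based, density-based, hierarchical, or similar methods.\r\n\r\n![](https://upload-images.jianshu.io/upload_images/11682271-0e38a702cec2e569.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)\r\n- Deep clustering methods apply a traditional clustering algorithm to features obtained through representation learning.\r\n![](https://upload-images.jianshu.io/upload_images/11682271-27edeeb5de2d7a8d.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)\r\n\r\n\r\n\r\n\r\n### 2. How K-means Clustering Works\r\n\r\nK-means is arguably the most common clustering algorithm. It is a partition-based method: it first initializes k cluster centers, assigns each sample to a cluster based on its distance to the centers, and iterates to minimize the distance between each sample and the center of the cluster it belongs to (see the objective function below).\r\n![](https://upload-images.jianshu.io/upload_images/11682271-505d51ed511d7ea2.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)\r\n\r\n\r\nThe optimization proceeds as follows:\r\n\r\n1. Randomly select k samples as the initial cluster centers (k is a hyperparameter giving the number of clusters; it can be chosen from prior knowledge or by validation);\r\n\r\n2. For each sample in the dataset, compute its distance to the k cluster centers and assign it to the cluster whose center is nearest;\r\n\r\n3. For each cluster, recompute the position of its center;\r\n\r\n4. Repeat steps 2 and 3 until a stopping condition is met (for example, a maximum number of iterations is reached or the cluster centers no longer move).\r\n![](https://upload-images.jianshu.io/upload_images/11682271-b8e1b4f09e39bed0.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)\r\n\r\n```\r\n.... 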
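\r\n# Full code: https://github.com/aialgorithm/Blog\r\n\r\n# K-means starts from k randomly chosen center points\r\nrandom.seed(1)\r\ncenter = [[self.data[i][r] for i in range(1, len(self.data))]  \r\n                      for r in random.sample(range(len(self.data)), k)]\r\n\r\n# Iterate at most self.iters times\r\nfor i in range(self.iters):\r\n    class_dict = self.count_distance()  # compute each sample's distance to every center and assign it to the nearest one\r\n    self.locate_center(class_dict)  # recompute the center points\r\n    #print(self.data_dict)\r\n    print(\"----------------iteration %d----------------\" % i)\r\n    print(self.center_dict)  # clustering result {k:{{center:[]},{distance:{item:0.0},{classify:[]}}}}\r\n    if sorted(self.center) == sorted(self.new_center):\r\n        break\r\n    else:\r\n        self.center = self.new_center\r\n...\r\n```\r\nAs the code shows, the iterative K-means algorithm is in effect an instance of the EM algorithm. EM solves the parameter-estimation problem for probabilistic models that contain unobservable latent variables. In K-means the latent variable is the cluster each sample belongs to. Re-labeling the samples once the centers are fixed corresponds to the E step of EM (computing the expectation under the current parameters), and recomputing the centers from the labels corresponds to the M step (finding the parameters that maximize the likelihood, that is, minimize the loss). A drawback of EM is that it easily gets trapped in a local minimum, which is why K-means sometimes returns a locally optimal solution.\r\n\r\n### 3. Choosing a Distance Metric\r\nK-means relies on distance-based similarity to determine the nearest center for each sample. Common distance metrics are Manhattan distance and Euclidean distance; for details see the article [A Comprehensive Summary of Distance and Similarity Measures (7 Types)](https://mp.weixin.qq.com/s?__biz=MzI4MDE1NjExMQ==\u0026mid=2247486660\u0026idx=1\u0026sn=09fca0715a7c120c721aa26d295b7b97\u0026scene=19#wechat_redirect)\r\n- Manhattan distance formula:\r\n\r\n![](https://upload-images.jianshu.io/upload_images/11682271-86f9852ba8ca496a.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)\r\n\r\n\r\n\r\n- Euclidean distance formula:\r\n\r\n![](https://upload-images.jianshu.io/upload_images/11682271-0d1161aa10669e59.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)\r\nBoth Manhattan and Euclidean distance are simple to compute: they total up the distance between two samples (x, y) across each feature i.\r\nIn the figure below (two-dimensional features), the blue line is the Manhattan distance (imagine driving from one intersection in Manhattan to another: the distance actually driven is the Manhattan distance, also called city block distance), and the red line is the Euclidean distance:\r\n\r\n![](https://upload-images.jianshu.io/upload_images/11682271-70fbf7d8b61024df.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)\r\n\r\n\r\n### 4. Determining the Value of k\r\n K-means partitions the data into k clusters, and the results can differ greatly for different k. Common ways to choose k include the prior-knowledge method and the elbow method.\r\n- 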
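Prior-knowledge method\r\n\r\nThis approach is simple: choose k from domain knowledge. For the iris dataset, for example, we know there are three species, so we can cluster with k=3 and check the result. The figures below show that the predicted clusters agree fairly well with the actual iris species.\r\n![](https://upload-images.jianshu.io/upload_images/11682271-99bd7c675d3e6316.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)\r\n![](https://upload-images.jianshu.io/upload_images/11682271-733e6789d94d8d4a.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)\r\n\r\n- Elbow method\r\nThe larger k is, the more clusters there are and the lower the sum of squared distances from each point to its cluster center (the within-cluster sum of squares, WSS). We take the inflection point of the curve of WSS decreasing as k increases as the value of k; this is the widely used elbow method.\r\n![](https://upload-images.jianshu.io/upload_images/11682271-778ef765be194376.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)\r\n\r\n\r\nThe drawback of the elbow method is that it requires human judgment and is hard to automate. Other methods include:\r\n- Using the Gap statistic to determine k.\r\n- Comparing the average silhouette coefficient across different k; the closer to 1, the better the clustering.\r\n- Computing the ratio of within-cluster distance to between-cluster distance; smaller is better.\r\n- The ISODATA algorithm: it extends k-means with \"merge\" and \"split\" operations on the clustering result to arrive at the final clusters, so k need not be specified by hand.\r\n\r\n\r\n\r\n\r\n### 5. Limitations of K-means \r\n#### 5.1 Center Initialization\r\nK-means initializes its centers randomly, and different initial centers can strongly affect the result. The K-means++ algorithm was introduced to address this. Its idea is that cluster centers should be as far apart as possible: the k centers are selected one at a time at random, weighted by each point's distance to the centers already chosen. The farther a point is from the existing centers, the more likely it is (with probability proportional to the squared distance) to be chosen as the next center. See the code below.\r\n\r\n```\r\n# K-means++: choose k centers with distance-based probabilities\r\n            # 1. Pick the first center at random\r\n            center = []\r\n            center.append(random.choice(range(len(self.data[0]))))\r\n            # 2. Pick the remaining centers with probability based on distance\r\n            for _ in range(self.k - 1):\r\n                weights = [self.distance_closest(self.data[0][x], center) \r\n                         for x in range(len(self.data[0])) if x not in center]\r\n                dp = [x for x in range(len(self.data[0])) if x not in center]\r\n                total = sum(weights)\r\n                # Normalize the distances into selection weights\r\n                weights = [weight/total for weight in weights]\r\n                num = random.random()\r\n                x = -1\r\n                cumulative = 0  # running sum of the weights\r\n                while cumulative \u003c= num and x \u003c len(weights) - 1:\r\n                    x += 1\r\n                    cumulative += weights[x]\r\n                center.append(dp[x])\r\n            center = [self.data_dict[self.data[0][center[k]]] for k in 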
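range(len(center))]\r\n```\r\n\r\n#### 5.2 Kernel K-means\r\nK-means based on Euclidean distance assumes that every cluster has the same prior probability and a spherical distribution, but such distributions are uncommon in practice. For non-convex data shapes we can introduce a kernel function; the resulting algorithm is called kernel K-means, one kind of kernel clustering method. The main idea of kernel clustering is to map the data points in the input space into a high-dimensional feature space through a nonlinear mapping and to cluster in that new feature space. The nonlinear mapping increases the chance that the data points become linearly separable, so where classical clustering algorithms fail, introducing a kernel function can yield more accurate clusters.\r\n#### 5.3 Feature Types\r\n\r\nK-means targets numerical features; categorical features need one-hot or another encoding. There are also the K-Modes and K-Prototypes algorithms, which can be used to cluster data of mixed types: for numerical features the cluster center is the per-feature mean, while for categorical features the center is the mode and the distance is the Hamming distance (0 if the values match, 1 otherwise).\r\n\r\n\r\n#### 5.4 Feature Weights\r\n\r\nClustering is based on distances between features, so when computing distances we must watch for differences in feature scale: the larger a feature's scale, the greater its effective weight. Suppose each sample has two features, age and salary. When computing Euclidean distance, (age1-age2)² is far smaller than (salary1-salary2)², which means that without feature scaling the distance is dominated by the salary variable (the one with large values). We therefore use feature scaling to bring all values onto the same order of magnitude. The usual remedy is to standardize or normalize the data, for example mapping all numerical features onto a common range such as 0~1.\r\n![](https://upload-images.jianshu.io/upload_images/11682271-6cde6ee02a4907fe.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)\r\n\r\nAfter normalization all features carry equal weight, but sometimes we want to give certain features more weight. Suppose we want feature1 to have weight 1 and feature2 weight 2. After 0~1 normalization, when computing something like the Euclidean distance (without the square root),\r\n![](https://upload-images.jianshu.io/upload_images/11682271-84b46fb3181c64c4.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)\r\nwe simply multiply the values of feature2 by √2; feature2's term in the expression above then doubles, which is a quick and simple way to apply the weight. With Manhattan distance, multiplying the feature by 2 directly gives it a weight of 2.\r\n\r\nWhen weighting features obtained by embedding a categorical feature, say a 256-dimensional embedding, we normalize the embedding to 0~1 and then multiply every embedding dimension by √(1/256), so that the category's total contribution to the distance is reduced to 1. This prevents a large embedding size from making the K-means result depend heavily on the embedding, which is in essence a single categorical feature.\r\n\r\n#### 5.5 Feature Selection\r\nK-means in essence assigns clusters purely from the distances between sample features (the sample distribution), so the choice of features clearly affects the result. When unrepresentative features are used, the result may be far from what was expected! 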
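For example, to cluster bank customers into quality tiers, transaction count and deposit amount are important features, while customer gender and age may just be noise; using gender and age features yields clusters of customers of similar gender and age!\r\n\r\nFor feature selection in unsupervised clustering:\r\n- On the one hand, draw on business meaning and choose features close to the business scenario.\r\n\r\n- On the other hand, common feature selection (dimensionality reduction) methods based on missing rate, similarity, PCA, and so on can remove noise, reduce computation, and avoid the curse of dimensionality. Moreover, if the task has label information, feature importance with respect to the label is another option (such as xgboost feature importance or a feature's IV value).\r\n\r\n- Finally, features can also be derived from neural network representations (the idea behind deep clustering, to be covered in a later dedicated post); for example, word2vec can represent a high-dimensional word space as low-dimensional distributed vectors.\r\n\r\n\r\n\r\n\u003eReferences:\r\n1. https://www.bilibili.com/video/BV1H3411t7Vk?spm_id_from=333.999.0.0\r\n2. https://zhuanlan.zhihu.com/p/407343831\r\n3. https://zhuanlan.zhihu.com/p/78798251\r\n\r\n\r\n\r\n","author":{"url":"https://github.com/aialgorithm","@type":"Person","name":"aialgorithm"},"datePublished":"2021-12-24T07:33:41.000Z","interactionStatistic":{"@type":"InteractionCounter","interactionType":"https://schema.org/CommentAction","userInteractionCount":0},"url":"https://github.com/aialgorithm/Blog/issues/42"}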
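
The four optimization steps in Section 2 of the article and the K-means++ seeding in Section 5.1 can be sketched together in plain Python. This is a minimal illustration under stated assumptions, not the author's class-based implementation: the function names `squared_distance`, `kmeans_pp_init`, and `kmeans`, the tuple-based point format, and the toy data are all introduced here for the example, and the sketch presumes the data holds at least k distinct points.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_init(points, k, rng):
    """K-means++ seeding: each new center is drawn with probability
    proportional to its squared distance from the nearest chosen center."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        weights = [min(squared_distance(p, c) for c in centers) for p in points]
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


def kmeans(points, k, max_iters=100, seed=1):
    """Alternate assignment and update steps until the centers stop moving."""
    rng = random.Random(seed)
    centers = [tuple(c) for c in kmeans_pp_init(points, k, rng)]
    clusters = []
    for _ in range(max_iters):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: squared_distance(p, centers[j]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        new_centers = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else centers[j]
            for j, cluster in enumerate(clusters)
        ]
        if new_centers == centers:  # stopping condition: centers unchanged
            break
        centers = new_centers
    return centers, clusters


# Hypothetical toy data: two well-separated groups of 2-D points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, clusters = kmeans(points, k=2)
print(sorted(centers))
```

The sketch stops when the centers no longer change or after `max_iters` rounds, mirroring the two stopping conditions the article names. As the article notes, the result can still be a local optimum, so in practice one might rerun with several seeds and keep the run with the lowest within-cluster sum of squares.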

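The feature scaling and weighting described in Section 5.4 of the article can be illustrated with a short sketch: min-max normalize each feature to 0~1, then multiply a feature by the square root of its desired weight so that its term in the squared Euclidean distance is scaled by that weight. The helper names (`min_max_scale`, `apply_weights`, `squared_distance`) and the age/salary sample values are assumptions made up for this example.

```python
import math


def min_max_scale(rows):
    """Rescale every column of `rows` to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    # A constant column has zero span; divide by 1.0 instead to avoid a ZeroDivisionError.
    spans = [(max(col) - min(col)) or 1.0 for col in columns]
    return [
        tuple((v - lo) / span for v, lo, span in zip(row, lows, spans))
        for row in rows
    ]


def apply_weights(rows, weights):
    """Multiply each feature by sqrt(weight) so that its squared-distance
    term is scaled by `weight`."""
    factors = [math.sqrt(w) for w in weights]
    return [tuple(v * f for v, f in zip(row, factors)) for row in rows]


def squared_distance(a, b):
    """Euclidean distance without the square root."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Hypothetical samples: (age, salary). Unscaled, salary dominates the distance.
samples = [(25, 3000), (35, 8000), (45, 13000)]
scaled = min_max_scale(samples)
weighted = apply_weights(scaled, [1, 2])  # feature1 weight 1, feature2 weight 2

print(squared_distance(samples[0], samples[2]))    # dominated by the salary term
print(squared_distance(scaled[0], scaled[2]))      # both features contribute equally
print(squared_distance(weighted[0], weighted[2]))  # salary term counts twice as much
```

For Manhattan distance the same effect would come from multiplying the feature by the weight itself rather than its square root, as the article points out.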

Links:

A Comprehensive Guide to K-means Clustering (Python) https://github.com/aialgorithm/Blog/issues/42#top
aialgorithm https://github.com/aialgorithm
on Dec 24, 2021 https://github.com/aialgorithm/Blog/issues/42#issue-1088211051
A Comprehensive Summary of Distance and Similarity Measures (7 Types) https://mp.weixin.qq.com/s?__biz=MzI4MDE1NjExMQ==&mid=2247486660&idx=1&sn=09fca0715a7c120c721aa26d295b7b97&scene=19#wechat_redirect
https://www.bilibili.com/video/BV1H3411t7Vk?spm_id_from=333.999.0.0
https://zhuanlan.zhihu.com/p/407343831
https://zhuanlan.zhihu.com/p/78798251

Viewport: width=device-width


URLs of crawlers that visited me.