René's URL Explorer Experiment


Title: Python Feature Selection (Complete) · Issue #10 · aialgorithm/Blog · GitHub

Open Graph Title: Python Feature Selection (Complete) · Issue #10 · aialgorithm/Blog

X Title: Python Feature Selection (Complete) · Issue #10 · aialgorithm/Blog

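Description: 1 Purpose of feature selection: Feature selection is an important step in machine learning; it keeps the significant features and discards the non-significant ones. Its benefits are: fewer features (avoiding the curse of dimensionality), faster training, and lower compute cost; less interfering noise, lower risk of overfitting, and better model performance; fewer features also make the model easier to interpret. 2 Feature selection methods: Feature selection methods generally fall into three categories. 2.1 Filter methods for feature selection: Filter methods score each feature on indicators such as missing rate, dispersion, correlation, information content, and stability, and select features on those scores. Common techniques include missing-value analysis, single-value rate, variance checks, the Pearson correlation coefficient, the chi2 test, I...

Open Graph Description: 1 Purpose of feature selection: Feature selection is an important step in machine learning; it keeps the significant features and discards the non-significant ones. Its benefits are: fewer features (avoiding the curse of dimensionality), faster training, and lower compute cost; less interfering noise, lower risk of overfitting, and better model performance; fewer features also make the model easier to interpret. 2 Feature selection methods: Feature selection methods generally fall into three categories. 2.1 Filter methods for feature selection: Filter methods score each feature on indicators such as missing rate, dispersion, correlation, information content, and stability, and select features on those scores...

X Description: 1 Purpose of feature selection: Feature selection is an important step in machine learning; it keeps the significant features and discards the non-significant ones. Its benefits are: fewer features (avoiding the curse of dimensionality), faster training, and lower compute cost; less interfering noise, lower risk of overfitting, and better model performance; fewer features also make the model easier to interpret. 2 Feature selection methods: Feature selection methods generally fall into three categories. 2.1 Filter methods for feature selection: Filter methods score each feature on indicators such as missing rate, dispersion, correlation, information content, and stability, and select features on those scores...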

Opengraph URL: https://github.com/aialgorithm/Blog/issues/10

X: @github


Domain: github.com


Hey, it has json ld scripts:
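{"@context":"https://schema.org","@type":"DiscussionForumPosting","headline":"Python Feature Selection (Complete)","articleBody":"\r\n# 1 Purpose of feature selection\r\nFeature selection is an important step in machine learning: it keeps the significant features and discards the non-significant ones. Its benefits are:\r\n- Fewer features (avoiding the curse of dimensionality), faster training, and lower compute cost;\r\n- Less interfering noise, lower risk of overfitting, and better model performance;\r\n- Fewer features also make the model easier to interpret.\r\n\r\n# 2 Feature selection methods\r\nFeature selection methods generally fall into three categories (filter, embedded, and wrapper):\r\n\r\n![](https://upload-images.jianshu.io/upload_images/11682271-a558e7eec0d73b1e?imageMogr2/auto-orient/strip|imageView2/2/w/467/format/webp)\r\n\r\n## 2.1 Filter methods for feature selection\r\n\r\nFilter methods score each feature on indicators such as missing rate, dispersion, correlation, information content, and stability, and select features on those scores. Common techniques include missing-value analysis, single-value rate, variance checks, the Pearson correlation coefficient, the chi2 test, IV, information gain, and PSI.\r\n### 2.1.1 Missing rate\r\nCompute the missing rate of each feature and filter features against a threshold. The threshold can be an empirical value (for example, missing rate \u003c 0.9), or it can be chosen by looking at the distribution of missing rates across all features and using the outliers of that distribution as the cutoff.\r\n```python\r\n# Missing rate of each feature\r\nmiss_rate_df = df.isnull().sum().sort_values(ascending=False) / df.shape[0]\r\n```  \r\n\r\n![](https://upload-images.jianshu.io/upload_images/11682271-4181452c9527f8c1.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)\r\n\r\n### 2.1.2 Dispersion\r\nA feature with no dispersion takes nearly the same value for every sample and has no discriminative power. Assess dispersion from the largest share held by any single value (the single-value rate) and from the variance, then filter features against a threshold. The threshold can be an empirical value (for example, single-value rate \u003c 0.9, variance \u003e 0.001), or it can be chosen by looking at the distribution of these statistics across all features and using the outliers of that distribution as the cutoff.\r\n\r\n\r\n```python\r\n\r\n# Variance of each numeric feature\r\nvar_features = df.var(numeric_only=True).sort_values()\r\n \r\n# Single-value rate of each feature\r\nsingle_rate = {}\r\nfor var in df.columns:\r\n    single_rate[var] = df[var].value_counts().max() / df.shape[0]\r\n```  \r\n### 2.1.3 Correlation \r\nHighly correlated features waste compute and hurt the interpretability of the model. For linear models in particular, they make the fitted parameters unstable. Common ways to analyze feature correlation include:\r\n- Variance inflation factor (VIF):\r\n\r\nThe variance inflation factor (VIF) measures collinearity among numeric features. A VIF above 10 is generally taken to indicate high collinearity.\r\n```python\r\nimport numpy as np\r\nimport pandas as pd\r\nfrom statsmodels.stats.outliers_influence import variance_inflation_factor\r\n# Add an intercept column\r\ndf['c'] = 1\r\nname = df.columns\r\nx = np.asarray(df, dtype=float)\r\nVIF_list = [variance_inflation_factor(x, i) for i in range(x.shape[1])]\r\nVIF = pd.DataFrame({'feature': name, 'VIF': VIF_list})\r\n\r\n```\r\n\r\n- Pearson correlation coefficient:\r\n![](https://upload-images.jianshu.io/upload_images/11682271-b60c98445b41a119.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)\r\n\r\nMeasures the pairwise correlation between numeric features. Its value lies in [-1, 1].\r\n![](https://upload-images.jianshu.io/upload_images/11682271-f58e7a19e88a28ad.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)\r\n```python\r\nimport numpy as np\r\nimport seaborn as sns\r\ncorr_df = df.corr()\r\n# Heatmap\r\nsns.heatmap(corr_df)\r\n# Collect into corr_drop the features whose correlation with an earlier feature exceeds threshold\r\nthreshold = 0.9\r\nupper = corr_df.where(np.triu(np.ones(corr_df.shape), k=1).astype(bool))\r\ncorr_drop = [column for column in upper.columns if any(upper[column].abs() \u003e threshold)]\r\n```\r\n\r\n- Chi2 test\r\n\r\nThe classic chi-squared test checks whether one categorical variable is associated with another categorical variable.\r\n![](https://upload-images.jianshu.io/upload_images/11682271-1fa9769dff575b53.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)\r\nSklearn's implementation uses matrix multiplication to obtain the observed and expected values for all features at once, computes the χ2 statistic of each feature, and then ranks the features for selection. This extends chi2 to continuous (non-negative) variables and also makes feature selection convenient.\r\n```python\r\nfrom sklearn.datasets import load_iris\r\nfrom sklearn.feature_selection import SelectKBest\r\nfrom sklearn.feature_selection import chi2\r\nx, y = load_iris(return_X_y=True)\r\n\r\nx_new = SelectKBest(chi2, k=2).fit_transform(x, y)\r\n```\r\n### 2.1.4 Information content\r\nIn classification tasks, one can compute how much information a feature contributes to the classification outcome and then select the features that contribute the most. Common measures are IV and information gain.\r\n\r\n- Information gain\r\n\r\nIf the target variable D has entropy H(D) and its conditional entropy given feature A is H(D|A), then the information gain is G(D, A) = H(D) - H(D|A):\r\n![](https://upload-images.jianshu.io/upload_images/11682271-c718ba36986da490.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)\r\nThe size of the information gain (mutual information) indicates how much information feature A contributes.\r\n```python\r\nfrom sklearn.feature_selection import mutual_info_classif\r\nfrom sklearn.datasets import load_iris\r\nx, y = load_iris(return_X_y=True)\r\nmutual_info_classif(x, y)\r\n```\r\n\r\n- IV\r\n\r\nIV (Information Value) is an important information metric in risk control. It measures how strongly a feature (continuous variables must be discretized first) influences the target variable. The basic idea is to compare the ratio of bad to good samples hit by the feature with the overall ratio of bad to good samples and compute their degree of association. [GitHub code link](https://github.com/aialgorithm/Blog/tree/master/projects/feature_selector)\r\n\r\n![](https://upload-images.jianshu.io/upload_images/11682271-ca2343cadec6b825.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)\r\n\r\n\r\n### 2.1.5 Stability\r\nIn most data mining scenarios, and in risk control especially, the stability of a feature's distribution matters a great deal because it directly affects how stable the model stays over its service life. The usual measure is PSI (Population Stability Index).\r\n\r\n- PSI\r\n\r\nPSI measures the difference between the actual and the expected distribution: SUM( (actual share - expected share) * ln(actual share / expected share) ).\r\n![](https://upload-images.jianshu.io/upload_images/11682271-ceadf9063860e582.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)\r\n\r\nWhen modeling, the training sample (In the Sample, INS) usually serves as the expected distribution and the validation sample as the actual distribution. Validation samples generally include out-of-sample (OOS) and out-of-time (OOT) samples. [GitHub code link](https://github.com/aialgorithm/Blog/tree/master/projects/feature_selector)\r\n\r\n## 2.2 Embedded methods for feature selection\r\nEmbedded methods use the feature importance obtained directly from model training, so feature selection happens while the model is trained. The model yields a weight coefficient for each feature, and features are selected in descending order of those weights. Common choices are logistic regression with an L1 penalty and LightGBM feature importance.\r\n\r\n\r\n\r\n\r\n- Logistic regression with an L1 penalty \r\n\r\nL1 regularization tends to yield sparse solutions. Intuitively, in a two-dimensional solution space the L1-ball is a square, and the optimum is more likely to be reached at one of its vertices (for example the sparse solution W2=C, W1=0). L1-regularized models therefore tend to keep a small number of features and set the weights of the others to 0.\r\n\r\n![](https://upload-images.jianshu.io/upload_images/11682271-00c038655b1f429c.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)\r\n\r\n```python\r\nfrom sklearn.feature_selection import SelectFromModel\r\nfrom sklearn.linear_model import LogisticRegression\r\n\r\n# The L1 penalty needs a solver that supports it, such as liblinear or saga\r\nx_new = SelectFromModel(LogisticRegression(penalty=\"l1\", C=0.1, solver=\"liblinear\")).fit_transform(x, y)\r\n```\r\n\r\n\r\n- Feature ranking with tree models \r\n\r\nFor models built on decision trees (random forest, LightGBM, XGBoost, and so on), growing the trees is itself a heuristic search over feature subsets, so the trained model can output feature importance directly.\r\n```python\r\nimport matplotlib.pyplot as plt\r\nimport pandas as pd\r\nfrom lightgbm import plot_importance\r\nfrom lightgbm import LGBMClassifier\r\n\r\n\r\nmodel = LGBMClassifier()\r\nmodel.fit(x, y)\r\nplot_importance(model, max_num_features=20, figsize=(10, 5), importance_type='split')\r\nplt.show()\r\nfeature_importance = pd.DataFrame({\r\n        'feature': model.booster_.feature_name(),\r\n        'gain': model.booster_.feature_importance('gain'),\r\n        'split': model.booster_.feature_importance('split')\r\n    }).sort_values('gain', ascending=False)\r\n```\r\nWhen there are many features, a common approach is to set the lower threshold at the elbow of the importance curve and select the features above it.\r\n\r\n![](https://upload-images.jianshu.io/upload_images/11682271-c295ad6cf9df2c59.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)\r\n\r\n\r\n\r\n## 2.3 Wrapper methods for feature selection\r\nWrapper methods iteratively train the model on a subset of features and decide which features to keep or drop from the model's prediction score. They generally consist of four parts: a generation procedure, an evaluation function, a stopping criterion, and a validation procedure.\r\n\r\n(1) The generation procedure searches for feature subsets, starting by producing one subset from the full feature set. Search strategies include complete search (such as breadth-first search and beam search), heuristic search (such as bidirectional search and backward selection), and random search (such as random subset selection, simulated annealing, and genetic algorithms).\r\n(2) The evaluation function is the criterion that scores how good a feature subset is.\r\n(3) The stopping criterion is tied to the evaluation function. It is usually a threshold: once the evaluation function reaches it, the search can stop.\r\n(4) The validation procedure checks the actual performance of the selected feature subset on a validation dataset.\r\n![](https://upload-images.jianshu.io/upload_images/11682271-56d3b2e366b40069.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)\r\nFirst a feature subset is produced from the full feature set, then the evaluation function scores it, and the score is compared with the stopping criterion. If the score meets the criterion the search stops. Otherwise the next feature subset is generated and the selection continues. The finally selected subset usually still has to be validated for its actual performance.\r\n\r\n-  RFE\r\n\r\nRFE (recursive feature elimination) is a common feature selection method. The [principle](https://github.com/scikit-learn/scikit-learn/blob/main/sklearn/feature_selection/_rfe.py) is to build a model recursively on the remaining features, use the model to assess and rank each feature's contribution, and then select features.\r\n```python\r\nfrom sklearn.feature_selection import RFE\r\nfrom sklearn.linear_model import LogisticRegression\r\n# Any estimator exposing coef_ or feature_importances_ works; LogisticRegression is only an example\r\nestimator = LogisticRegression(max_iter=1000)\r\nrfe = RFE(estimator, n_features_to_select=2, step=1)\r\nrfe = rfe.fit(x, y)\r\nprint(rfe.support_)  # boolean mask of the selected features\r\nprint(rfe.ranking_)  # rank 1 marks a selected feature\r\n```\r\n\r\n- Bidirectional search feature selection\r\n\r\nBecause RFE only iterates backward, it can easily get stuck in a local optimum, and it does not support models such as LightGBM that handle missing values and categorical features automatically. I therefore wrote a simple feature selection method based on heuristic bidirectional search and the idea of simulated annealing [GitHub code link](https://github.com/aialgorithm/Blog/tree/master/projects/feature_selector), shown below:\r\n\r\n```python\r\n\"\"\"\r\nAuthor: WeChat official account 算法进阶\r\nFeature selection based on heuristic bidirectional search and simulated annealing.\r\n\"\"\"\r\n\r\n\r\nimport random \r\n\r\nfrom lightgbm import LGBMClassifier\r\nfrom sklearn.metrics import precision_score, recall_score, f1_score, accuracy_score, roc_curve, auc\r\n\r\n\r\ndef model_metrics(model, x, y, pos_label=1):\r\n    \"\"\" \r\n    Evaluation function (binary classification) \r\n    \r\n    \"\"\"\r\n    yhat = model.predict(x)\r\n    yprob = model.predict_proba(x)[:, 1]\r\n    fpr, tpr, _ = roc_curve(y, yprob, pos_label=pos_label)\r\n    result = {'accuracy_score': accuracy_score(y, yhat),\r\n              'f1_score_macro': f1_score(y, yhat, average=\"macro\"),\r\n              'precision': precision_score(y, yhat, average=\"macro\"),\r\n              'recall': recall_score(y, yhat, average=\"macro\"),\r\n              'auc': auc(fpr, tpr),\r\n              'ks': max(abs(tpr - fpr))\r\n             }\r\n    return result\r\n\r\ndef bidirectional_selection(model, x_train, y_train, x_test, y_test, annealing=True, anneal_rate=0.1, iters=10, best_metrics=0,\r\n                         metrics='auc', threshold_in=0.0001, threshold_out=0.0001, early_stop=True, \r\n                         verbose=True):\r\n    \"\"\"\r\n    model          model used to score feature subsets\r\n    annealing      whether to accept some non-improving features at random (simulated annealing)\r\n    threshold_in   a feature is added when it raises the metric by more than this value\r\n    threshold_out  a feature is dropped when removing it changes the metric by at least this value\r\n    \"\"\"\r\n    included = []\r\n    \r\n    for i in range(iters):\r\n        # forward step     \r\n        print(\"iters\", i)\r\n        changed = False \r\n        excluded = list(set(x_train.columns) - set(included))\r\n        random.shuffle(excluded) \r\n        for new_column in excluded:             \r\n            model.fit(x_train[included + [new_column]], y_train)\r\n            latest_metrics = model_metrics(model, x_test[included + [new_column]], y_test)[metrics]\r\n            if latest_metrics - best_metrics \u003e threshold_in:\r\n                included.append(new_column)\r\n                changed = True \r\n                if verbose:\r\n                    print('Add {} with metrics gain {:.6}'.format(new_column, latest_metrics - best_metrics))\r\n                best_metrics = latest_metrics\r\n            elif annealing:\r\n                if random.randint(0, iters) \u003c= iters * anneal_rate:\r\n                    included.append(new_column)\r\n                    if verbose:\r\n                        print('Annealing Add   {} with metrics gain {:.6}'.format(new_column, latest_metrics - best_metrics))\r\n                    \r\n        # backward step                      \r\n        random.shuffle(included)\r\n        for new_column in list(included):  # iterate over a copy because included is modified below\r\n            if len(included) \u003c 2:\r\n                break  # keep at least one feature so the model can still be fitted\r\n            included.remove(new_column)\r\n            model.fit(x_train[included], y_train)\r\n            latest_metrics = model_metrics(model, x_test[included], y_test)[metrics]\r\n            if latest_metrics - best_metrics \u003c threshold_out:\r\n                included.append(new_column)\r\n            else:\r\n                changed = True \r\n                if verbose:\r\n                    print('Drop {} with metrics gain {:.6}'.format(new_column, latest_metrics - best_metrics))\r\n                best_metrics = latest_metrics \r\n        if not changed and early_stop:\r\n            break \r\n    return included      \r\n\r\n# Example: x must be a pandas DataFrame (columns are selected by name) and y a binary target\r\nfrom sklearn.model_selection import train_test_split\r\n\r\nx_train, x_test, y_train, y_test = train_test_split(x, y)\r\n\r\nmodel = LGBMClassifier()\r\nincluded = bidirectional_selection(model, x_train, y_train, x_test, y_test, annealing=True, iters=50, best_metrics=0.5,\r\n                     metrics='auc', threshold_in=0.0001, threshold_out=0,\r\n                     early_stop=False, verbose=True)\r\n\r\n```\r\n\r\n---\r\n    Note: from the WeChat official account, tap 'Read the original' to reach the GitHub source\r\n![](https://upload-images.jianshu.io/upload_images/11682271-df3653f38442d6b4.jpg?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)\r\n","author":{"url":"https://github.com/aialgorithm","@type":"Person","name":"aialgorithm"},"datePublished":"2021-01-30T11:24:56.000Z","interactionStatistic":{"@type":"InteractionCounter","interactionType":"https://schema.org/CommentAction","userInteractionCount":0},"url":"https://github.com/10/Blog/issues/10"}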
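
The IV entry in the article body above describes the idea but only links out for code. The following is a minimal pure-Python sketch, not the linked repository's implementation. It assumes the commonly used definition, IV = sum over bins of (bad share - good share) * ln(bad share / good share), a feature that is already discretized, and a target coded 1 = bad, 0 = good. The function name `information_value` and the `eps` smoothing are illustrative assumptions.

```python
import math
from collections import Counter


def information_value(bins, target, eps=1e-6):
    """IV of one already-discretized feature against a binary target (1 = bad, 0 = good)."""
    bad = Counter(b for b, t in zip(bins, target) if t == 1)
    good = Counter(b for b, t in zip(bins, target) if t == 0)
    bad_total, good_total = sum(bad.values()), sum(good.values())
    iv = 0.0
    for b in set(bins):
        # Share of all bad / all good samples that fall into this bin.
        # eps (an assumption) keeps empty cells from producing log(0).
        bad_pct = max(bad[b] / bad_total, eps)
        good_pct = max(good[b] / good_total, eps)
        woe = math.log(bad_pct / good_pct)  # weight of evidence of the bin
        iv += (bad_pct - good_pct) * woe
    return iv


bins = ["low", "low", "high", "high"]
print(information_value(bins, [1, 0, 1, 0]))  # 0.0: each bin holds the same share of bad and good
print(information_value(bins, [1, 1, 0, 0]))  # large: the bins separate bad from good completely
```

Each term is non-negative, so a larger IV means the bins separate bad from good samples more strongly. The cutoffs used to call a feature weak or strong vary by team and are not specified in the article.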

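
The PSI formula quoted in the article body, SUM((actual share - expected share) * ln(actual share / expected share)), can be sketched in a few lines of pure Python. This is an illustrative version under stated assumptions, not the linked repository's code: the bin cut points are taken as approximate quantiles of the expected sample, empty bins are smoothed with a small `eps`, and the function name `psi` and its parameters are hypothetical.

```python
import math
from bisect import bisect_right


def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index of one numeric feature.

    expected: values from the reference sample (e.g. training, INS)
    actual:   values from the comparison sample (e.g. OOS or OOT)
    """
    ref = sorted(expected)
    # Assumption: cut points are approximate quantiles of the expected sample.
    cuts = sorted({ref[len(ref) * i // bins] for i in range(1, bins)})

    def shares(values):
        counts = [0] * (len(cuts) + 1)
        for v in values:
            counts[bisect_right(cuts, v)] += 1
        # eps keeps empty bins from producing log(0) or a division by zero.
        return [max(c / len(values), eps) for c in counts]

    exp_pct, act_pct = shares(expected), shares(actual)
    return sum((a - e) * math.log(a / e) for e, a in zip(exp_pct, act_pct))


train = list(range(100))
shifted = [v + 50 for v in train]
print(psi(train, train))    # 0.0: identical distributions
print(psi(train, shifted))  # clearly positive: the distribution has moved
```

Taking the cut points from the expected sample keeps the bins fixed, so the same bins can be reused when comparing several validation windows against one training sample.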

