# A Roundup of AI Hyperparameter Tuning ("调参炼丹") Methods · Issue #12 · aialgorithm/Blog
*Posted by [aialgorithm](https://github.com/aialgorithm) on 2021-03-05.*

# 1 Hyperparameter Optimization

Tuning, i.e. hyperparameter optimization, means selecting a suitable set of hyperparameters from the hyperparameter space so as to balance the model's bias and variance and thereby improve its accuracy and performance. Common tuning methods include:
- manual tuning
- grid / random search
- Bayesian optimization

> Note: hyperparameters vs. model parameters.
> Hyperparameters control the learning process (e.g. the number of network layers, the learning rate);
> model parameters are what training learns (e.g. the final network weights).

# 2 Manual Tuning

Manual tuning relies on an understanding of the data and of the algorithm: which parameters to tune first, and what their typical empirical values are.

The approach differs by model. A random forest, for example, is a bagging ensemble whose main hyperparameters are n_estimators (the number of trees), max_depth (the maximum tree depth), and max_leaf_nodes (the maximum number of leaf nodes); other parameters are not covered here.
- n_estimators: larger is usually better. With more trees taking part in each decision, the random errors of individual trees average out and prediction accuracy improves, which mainly reduces variance.
- max_depth / max_leaf_nodes: performance typically rises and then falls as the value grows. Larger values make each tree more complex, lowering bias but raising variance.

# 3 Grid / Random Search

- Grid search is the traditional approach to hyperparameter optimization: an exhaustive search over a subset of hyperparameter combinations to find the best-performing one.
- Random search instead evaluates a fixed number of randomly drawn combinations and keeps the best-performing one. For large parameter spaces, random search is usually more efficient.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.ensemble import RandomForestClassifier

# Example data (the original snippet assumed x, y were already defined)
x, y = load_iris(return_X_y=True)

# Choose the model
model = RandomForestClassifier()
# Hyperparameter search space
param_grid = {
    'max_depth': np.arange(1, 20, 1),
    'n_estimators': np.arange(1, 50, 10),
    'max_leaf_nodes': np.arange(2, 100, 10)
}
# Grid search over the parameter space
grid_search = GridSearchCV(model, param_grid, cv=5, scoring='f1_micro')
grid_search.fit(x, y)
print(grid_search.best_params_)
print(grid_search.best_score_)
print(grid_search.best_estimator_)
# Random search over the parameter space
rd_search = RandomizedSearchCV(model, param_grid, n_iter=200, cv=5, scoring='f1_micro')
rd_search.fit(x, y)
print(rd_search.best_params_)
print(rd_search.best_score_)
print(rd_search.best_estimator_)
```

# 4 Bayesian Optimization

The key difference between Bayesian optimization and grid/random search is that it takes the history of previous evaluations into account, making the search more efficient. (In high-dimensional parameter spaces, however, Bayesian optimization becomes expensive and its results approach those of random search.)

## 4.1 Algorithm Overview

The idea of Bayesian optimization can be summarized in two parts:
- Gaussian process (GP): learn a posterior distribution over the target (objective) function from the tuning history (the observations).
- Acquisition function (AC): choose where to evaluate next based on the learned objective, balancing two modes: 1. exploitation: sample in the region most likely to contain the global optimum; 2. exploration: also sample regions of high uncertainty, to avoid getting stuck in a local optimum.

## 4.2 Algorithm Flow

```
for n iterations:
    the acquisition function, based on the learned objective (or its initialization), proposes the next point X(n+1);
    evaluate X(n+1) to obtain Y(n+1);
    add the new sample (X(n+1), Y(n+1)) and update the Gaussian process model;
```

```python
"""
Tuning a random forest classifier on Iris with Bayesian optimization (hyperopt)
"""
from hyperopt import hp, anneal, Trials, STATUS_OK
from hyperopt.fmin import fmin
from sklearn.datasets import load_iris
from sklearn.metrics import f1_score
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

def model_metrics(model, x, y):
    """Evaluation metric"""
    yhat = model.predict(x)
    return f1_score(y, yhat, average='micro')

def bayes_fmin(train_x, test_x, train_y, test_y, eval_iters=50):
    """
    Optimize hyperparameters with Bayesian optimization.
    eval_iters: number of iterations
    """

    def factory(params):
        """Objective function to minimize"""
        fit_params = {
            'max_depth': int(params['max_depth']),
            'n_estimators': int(params['n_estimators']),
            'max_leaf_nodes': int(params['max_leaf_nodes'])
        }
        # Choose the model
        model = RandomForestClassifier(**fit_params)
        model.fit(train_x, train_y)
        # Minimize the negative test-set f1 score
        loss = -model_metrics(model, test_x, test_y)
        return {"loss": loss, "status": STATUS_OK}

    # Parameter space
    space = {
        'max_depth': hp.quniform('max_depth', 1, 20, 1),
        'n_estimators': hp.quniform('n_estimators', 2, 50, 1),
        'max_leaf_nodes': hp.quniform('max_leaf_nodes', 2, 100, 1)
    }
    # Bayesian search over the parameter space
    best_params = fmin(factory, space, algo=anneal.suggest,
                       max_evals=eval_iters, trials=Trials(), return_argmin=True)
    # Cast the parameters back to integers
    best_params["max_depth"] = int(best_params["max_depth"])
    best_params["max_leaf_nodes"] = int(best_params["max_leaf_nodes"])
    best_params["n_estimators"] = int(best_params["n_estimators"])
    return best_params

# Example data (the original snippet assumed the splits were already defined)
x, y = load_iris(return_X_y=True)
train_x, test_x, train_y, test_y = train_test_split(x, y, test_size=0.3, random_state=0)
# Search for the best parameters
best_params = bayes_fmin(train_x, test_x, train_y, test_y, 100)
print(best_params)
```

Note that `anneal.suggest` performs simulated annealing; hyperopt's Bayesian algorithm (TPE) is selected with `tpe.suggest`.

---

Tap **阅读原文** in the WeChat (公众号) post to access the [source code on GitHub](https://github.com/aialgorithm/Blog/tree/master/projects/%E4%B8%80%E6%96%87%E5%BD%92%E7%BA%B3Ai%E8%B0%83%E5%8F%82%E7%82%BC%E4%B8%B9%E4%B9%8B%E6%B3%95).
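To make the GP + acquisition loop of section 4.2 concrete, here is a toy sketch (not part of the original article) in pure NumPy: a Gaussian process posterior with an RBF kernel plus an upper-confidence-bound (UCB) acquisition, maximizing a 1-D objective over a grid. All names (`rbf`, `gp_posterior`, `bayes_opt`) and the kernel, acquisition, and objective choices are illustrative assumptions, not the hyperopt internals.

```python
import numpy as np

def rbf(a, b, ls=1.0):
    """Squared-exponential kernel between 1-D point sets a and b."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def gp_posterior(X, y, grid, noise=1e-5):
    """GP posterior mean and std on `grid`, given observations (X, y)."""
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, grid)
    mu = Ks.T @ np.linalg.solve(K, y)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)  # k(x, x) = 1 for RBF
    return mu, np.sqrt(np.maximum(var, 1e-12))

def bayes_opt(f, grid, n_iter=8, kappa=2.0):
    """The loop from section 4.2: propose via acquisition, evaluate, update the GP."""
    X = np.array([grid[0], grid[-1]])          # two initial observations at the bounds
    y = np.array([f(X[0]), f(X[1])])
    for _ in range(n_iter):
        mu, sd = gp_posterior(X, y, grid)
        ucb = mu + kappa * sd                  # exploitation (mu) + exploration (sd)
        x_next = grid[np.argmax(ucb)]          # next point proposed by the acquisition
        X = np.append(X, x_next)
        y = np.append(y, f(x_next))            # evaluate and add the new sample
    i = np.argmax(y)
    return X[i], y[i]

f = lambda x: -(x - 2.0) ** 2                  # toy objective, maximum at x = 2
grid = np.linspace(0.0, 4.0, 201)
best_x, best_y = bayes_opt(f, grid)
print(best_x, best_y)
```

With only a handful of evaluations the loop homes in on the maximizer at x = 2, because the UCB acquisition directs samples toward regions that are either promising (high posterior mean) or unexplored (high posterior std).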