René's URL Explorer Experiment

Title: The Road to Improving the Fitting Ability of Neural Networks (Python) · Issue #25 · aialgorithm/Blog · GitHub

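Description: This article focuses on a model's fitting capacity. Overfitting and generalization will be covered in a dedicated follow-up article. In principle, training a neural network amounts to fitting a mathematical function f(x) that maps inputs from the data distribution (x) to outputs (y); how well it fits depends on both the data and the model. So how can fitting capacity be improved? We start with the famous question of why a single-layer neural network cannot fit the XOR function. 1. The limitation of single-layer neural networks: single-layer networks such as logistic regression and the perceptron are essentially generalized linear classifiers (their decision boundary is linear). This can be seen from the logistic regression decision function Y=sigmoid(wx...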

Opengraph URL: https://github.com/aialgorithm/Blog/issues/25

X: @github

Domain: github.com

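JSON-LD (DiscussionForumPosting), unpacked:

Headline: The Road to Improving the Fitting Ability of Neural Networks (Python)
Author: aialgorithm (https://github.com/aialgorithm)
Date published: 2021-10-15T06:23:21.000Z
Comments: 0
URL: https://github.com/aialgorithm/Blog/issues/25

Article body:

> This article focuses on a model's fitting capacity. Overfitting and generalization will be covered in a dedicated follow-up article.

In principle, training a neural network amounts to fitting a mathematical function f(x) that maps inputs from the data distribution (x) to outputs (y); how well it fits depends on both the data and the model.
So how can fitting capacity be improved? We start with the famous question of why a single-layer neural network cannot fit the XOR function.

## 1. The limitation of single-layer neural networks
Single-layer neural networks such as logistic regression and the perceptron are essentially generalized linear classifiers (their decision boundary is linear). This can be seen from the decision function of logistic regression, Y = sigmoid(wx + b): when wx + b > 0, Y > 0.5; when wx + b < 0, Y < 0.5. The line wx + b = 0 therefore separates the Y = 0 predictions from the Y = 1 predictions (see the figure below), so the decision boundary is linear.
![](https://upload-images.jianshu.io/upload_images/11682271-50775a58058351a8.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

This leads to the historically famous XOR problem:
> In 1969, Marvin Minsky, a leading figure of the "symbolist" school, raised the XOR problem. XOR is the exclusive-or function: it takes two boolean inputs (0 or 1) and outputs 1 when the two values differ and 0 otherwise. As the figure below shows, XOR data cannot be separated correctly by the boundary of a linear model. ![](https://upload-images.jianshu.io/upload_images/11682271-acae99d898a15b66.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
Because a single-layer neural network is linear, it cannot correctly learn even the simple nonlinear XOR function, whereas we usually want a model to be able to learn nonlinear functions. This dealt neural network research a heavy blow, and the field entered a low period that lasted about 10 years. ![](https://upload-images.jianshu.io/upload_images/11682271-58b88cd6b28c3220.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

The code below uses logistic regression as an example and tries to learn the XOR function:

```
# Generate the XOR data
import numpy as np
import pandas as pd

xor_dataset = pd.DataFrame([[1,1,0],[1,0,1],[0,1,1],[0,0,0]], columns=['x0','x1','label'])
x, y = xor_dataset[['x0','x1']], xor_dataset['label']
xor_dataset.head()

from keras.layers import Dense
from keras.models import Sequential

np.random.seed(0)
model = Sequential()
# One sigmoid unit on two inputs is exactly logistic regression
model.add(Dense(1, input_dim=2, activation='sigmoid'))
model.summary()
model.compile(optimizer='adam', loss='binary_crossentropy')
model.fit(x, y, epochs=100000, verbose=0)
print("True labels:", y.values)
print("Model predictions:", model.predict(x).round())
# Output reported by the author:
# True labels: [0 1 1 0]   Model predictions: [1 0 1 0]
```
![](https://upload-images.jianshu.io/upload_images/11682271-297785534f68fa15.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

As the result shows, the linear LR model has limited fitting capacity and cannot learn the nonlinear XOR function. How can this be solved?
**This brings us to the fundamental limitation of linear models: they cannot use the nonlinear information carried by interactions between variables.**
## 2. How to learn the nonlinear XOR function
As discussed above, the key to learning a nonlinear function is that the model must use the nonlinear information carried by interactions between variables.

The way forward is then clear. One option is to manually add nonlinear features to the model's input (the feature generation approach).

The other option is to increase the nonlinear expressive power of the model itself (a nonlinear model), so that the model can apply nonlinear interaction transforms φ(x) to the features x on its own. If the original linear model is written f(x; w), the nonlinear model can be written f(x, φ(x); w).

### 2.1 Method 1: introduce nonlinear features
The simplest idea is to manually add nonlinear features in extra dimensions to increase the model's nonlinear expressive power. This also shows how much feature engineering matters: to a large extent, modeling is a trade-off between complex features with a simple model and simple features with a complex model.
```
# Add a nonlinear feature (reuses xor_dataset from the previous block)
import numpy as np
import tensorflow as tf
from keras.layers import Dense
from keras.models import Sequential

np.random.seed(5)  # fix the random seeds
tf.random.set_seed(5)

model = Sequential()
model.add(Dense(1, input_dim=3, activation='sigmoid'))

model.summary()
model.compile(optimizer='adam', loss='binary_crossentropy')

xor_dataset['x2'] = xor_dataset['x0'] * xor_dataset['x1']  # nonlinear interaction feature

x, y = xor_dataset[['x0','x1','x2']], xor_dataset['label']

model.fit(x, y, epochs=10000, verbose=0)

print("True labels:", y.values)
print("Model predictions:", model.predict(x).round())
# Output reported by the author:
# True labels: [0 1 1 0]   Model predictions: [0 1 1 0]
```
True labels: [0 1 1 0]; model predictions: [0 1 1 0]. The predictions are correct.


### 2.2 Method 2: deep neural network (MLP)
Recall the universal approximation theorem: "A feedforward network with a linear output layer and at least one hidden layer with any 'squashing' activation function can approximate any Borel measurable function from one finite-dimensional space to another to arbitrary accuracy, provided the network is given enough hidden units." Put simply, a feedforward network with enough hidden units and at least one hidden layer with a nonlinear activation function can approximate such functions as closely as desired. Note that the theorem is a statement about the number of hidden units, not about depth.

Here we add one hidden layer to logistic regression, upgrading it to a two-layer neural network (MLP):
```
# Reuses xor_dataset from the first block
import numpy as np
import tensorflow as tf
from keras.layers import Dense
from keras.models import Sequential

np.random.seed(0)  # fix the random seeds
tf.random.set_seed(0)

model = Sequential()
model.add(Dense(10, input_dim=2, activation='relu'))   # hidden layer
model.add(Dense(1, activation='sigmoid'))  # output layer

model.summary()
model.compile(optimizer='adam', loss='binary_crossentropy')

x, y = xor_dataset[['x0','x1']], xor_dataset['label']
model.fit(x, y, epochs=100000, verbose=0)  # train the model

print("True labels:", y.values)
print("Model predictions:", model.predict(x).round())
```
True labels: [0 1 1 0]; model predictions: [[0.][1.][1.][0.]]. The predictions are correct.

### 2.3 Method 3: kernel functions in support vector machines

A support vector machine (SVM) can be viewed as an improvement on a single-hidden-layer neural network (the author plans a separate article on how SVMs work). For linearly non-separable problems, an SVM does not add nonlinear hidden layers as a deep neural network does; it uses a nonlinear kernel function instead. In essence both approaches perform a nonlinear transformation of the feature space, which increases the model's nonlinear expressive power.

![](https://upload-images.jianshu.io/upload_images/11682271-90608491a29f1dce.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

```
from sklearn.svm import SVC

# x, y are the two-feature XOR inputs and labels from the previous block
svm = SVC()  # the default kernel is the nonlinear RBF kernel
svm.fit(x, y)
print(svm.predict(x))
```
True labels: [0 1 1 0]; model predictions: [0 1 1 0]. The predictions are correct.

## Summary
Ultimately, a machine learning model can be viewed as a function whose essential capability is to control the feature representation through its parameters w in order to fit the target Y, which yields the learned decision function f(x; w). The key to improving fitting capacity is to **control and exploit the nonlinear information in feature interactions, achieving a nonlinear transformation of the feature space.** Improvements in fitting capacity come down to two aspects:

- Data: construct complex features through feature engineering.

- Model: use complex nonlinear models, such as neural networks with nonlinear hidden layers, SVMs with nonlinear kernels, and tree ensembles, which are nonlinear by nature. Empirically, ensembling these heterogeneous models tends to work even better.



---
This article was first published on the WeChat official account "算法进阶"; the original post links to the [related code](https://github.com/aialgorithm/Blog).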
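The article's three central claims can also be checked by hand, without training anything. The sketch below is an illustration only: the weights are hand-picked assumptions, not values learned by the article's Keras models, and the function names (`linear_with_product_feature`, `tiny_relu_network`) are hypothetical. It shows that a coarse grid of linear classifiers on `[x0, x1]` never gets more than 3 of the 4 XOR rows right, that a linear threshold on `[x0, x1, x0*x1]` gets all 4, and that two hidden ReLU units are enough for a small network to do the same.

```python
import itertools

# (x0, x1, label) rows in the same order as the article's xor_dataset
XOR_ROWS = [(1, 1, 0), (1, 0, 1), (0, 1, 1), (0, 0, 0)]


def linear_with_product_feature(x0, x1):
    """Linear threshold on [x0, x1, x0*x1] with hand-picked weights."""
    x2 = x0 * x1  # the interaction feature from section 2.1
    score = 1.0 * x0 + 1.0 * x1 - 2.0 * x2 - 0.5
    return int(score > 0)


def relu(z):
    return max(0.0, z)


def tiny_relu_network(x0, x1):
    """Two hidden ReLU units plus a linear threshold output (hand-picked weights)."""
    h1 = relu(x0 + x1)        # number of active inputs
    h2 = relu(x0 + x1 - 1.0)  # positive only when both inputs are 1
    score = h1 - 2.0 * h2 - 0.5
    return int(score > 0)


def count_correct(w0, w1, b):
    """How many XOR rows a plain linear threshold on [x0, x1] classifies correctly."""
    return sum(int(w0 * x0 + w1 * x1 + b > 0) == label for x0, x1, label in XOR_ROWS)


# Coarse grid over (w0, w1, b) in [-4, 4]; an illustration, not a proof
grid = [i / 2 for i in range(-8, 9)]
best_linear = max(count_correct(w0, w1, b) for w0, w1, b in itertools.product(grid, repeat=3))

labels = [label for _, _, label in XOR_ROWS]
feature_preds = [linear_with_product_feature(x0, x1) for x0, x1, _ in XOR_ROWS]
network_preds = [tiny_relu_network(x0, x1) for x0, x1, _ in XOR_ROWS]

print("labels:                  ", labels)
print("best linear (rows right):", best_linear, "of", len(XOR_ROWS))
print("with product feature:    ", feature_preds)
print("tiny ReLU network:       ", network_preds)
```

The product feature and the second ReLU unit play the same role: both are nonzero only for the input (1, 1), which is exactly the interaction information that a linear model on `x0` and `x1` alone has no access to.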
