Title: The Road to Improving Neural Network Fitting Capacity (Python) · Issue #25 · aialgorithm/Blog · GitHub

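Description: This article focuses on a model's fitting capacity. Overfitting and generalization will be covered in a dedicated follow-up article. In principle, training a neural network amounts to fitting a mathematical function f(x) that maps the data distribution (x) to the output (y), and the quality of the fit depends on both the data and the model. So how can fitting capacity be improved? We start with the well-known question of why a single-layer neural network cannot fit the XOR function. 1. The limitation of single-layer neural networks: Single-layer neural networks such as logistic regression and the perceptron are, in essence, generalized linear classifiers (their decision boundary is linear). This is visible in the decision function of logistic regression, Y=sigmoid(wx...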

Opengraph URL: https://github.com/aialgorithm/Blog/issues/25

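Headline: The Road to Improving Neural Network Fitting Capacity (Python)
Author: aialgorithm (https://github.com/aialgorithm)
Published: 2021-10-15T06:23:21.000Z
URL: https://github.com/aialgorithm/Blog/issues/25
Comments: 0

> This article focuses on a model's fitting capacity. Overfitting and generalization will be covered in a dedicated follow-up article.

In principle, training a neural network amounts to fitting a mathematical function f(x) that maps the data distribution (x) to the output (y). How good the fit is depends on both the data and the model.
So how can fitting capacity be improved? We start with the well-known question of why a single-layer neural network cannot fit the XOR function.

## 1. The limitation of single-layer neural networks
Single-layer neural networks such as logistic regression and the perceptron are, in essence, generalized linear classifiers (their decision boundary is linear). This is visible in the decision function of logistic regression, Y = sigmoid(wx + b): when wx + b > 0, Y > 0.5; when wx + b < 0, Y < 0.5. The line wx + b = 0 therefore separates Y = 0 from Y = 1 (see the figure below), so the decision boundary is linear.
![](https://upload-images.jianshu.io/upload_images/11682271-50775a58058351a8.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

This led to the historically famous XOR problem:
> In 1969, Marvin Minsky, a leading figure of "symbolism", raised the XOR problem. XOR is the exclusive-or function: it takes two boolean inputs (each 0 or 1) and outputs 1 when the two inputs differ and 0 otherwise. As the figure below shows, no linear decision boundary can separate the XOR data correctly. ![](https://upload-images.jianshu.io/upload_images/11682271-acae99d898a15b66.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
Because a single-layer neural network is linear, it cannot learn even the simple nonlinear XOR function correctly, yet we usually want models that can learn nonlinear functions. This dealt a heavy blow to neural network research, which then entered a low period lasting about ten years. ![](https://upload-images.jianshu.io/upload_images/11682271-58b88cd6b28c3220.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

The following logistic regression code tries to learn the XOR function:

```
# Build the XOR dataset
import numpy as np
import pandas as pd

xor_dataset = pd.DataFrame([[1,1,0],[1,0,1],[0,1,1],[0,0,0]], columns=['x0','x1','label'])
x, y = xor_dataset[['x0','x1']], xor_dataset['label']
xor_dataset.head()

from keras.layers import *
from keras.models import Sequential, Model

np.random.seed(0)
# Logistic regression: a single sigmoid unit on the two raw inputs
model = Sequential()
model.add(Dense(1, input_dim=2, activation='sigmoid'))
model.summary()
model.compile(optimizer='adam', loss='binary_crossentropy')
model.fit(x, y, epochs=100000, verbose=False)
print("True labels:", y.values)
print("Model prediction:", model.predict(x).round())
# Reported result: true labels [0 1 1 0], model prediction [1 0 1 0]
```
![](https://upload-images.jianshu.io/upload_images/11682271-297785534f68fa15.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

The result shows that the linear logistic regression model has limited fitting capacity and cannot learn the nonlinear XOR function. How can this be solved?
**This brings us to the fundamental limitation of linear models: they cannot use nonlinear information from interactions between variables.**
## 2. How to learn the nonlinear XOR function
As discussed above, the key to learning a nonlinear function is that the model must use nonlinear information from interactions between variables.

The approach is now clear. One option is to add nonlinear features to the model's input by hand (feature generation).

The other is to increase the nonlinear expressive power of the model itself (a nonlinear model), so that the model can apply nonlinear interaction transforms φ(x) to the features x on its own. If the original linear model is written f(x; w), the nonlinear model is f(x, φ(x); w).

### 2.1 Method 1: introduce nonlinear features
The simplest idea is to add extra nonlinear feature dimensions by hand to raise the model's nonlinear expressive power. This also shows how much feature engineering matters: to a large extent, modeling is a trade-off between complex features with a simple model and simple features with a complex model.
```
# Add a nonlinear feature (reuses xor_dataset from the previous block)
import numpy as np
from keras.layers import *
from keras.models import Sequential, Model
from tensorflow import random

np.random.seed(5) # fix the random seeds
random.set_seed(5)


model = Sequential()
model.add(Dense(1, input_dim=3, activation='sigmoid'))

model.summary()
model.compile(optimizer='adam', loss='binary_crossentropy')


xor_dataset['x2'] = xor_dataset['x0'] * xor_dataset['x1'] # nonlinear (interaction) feature


x, y = xor_dataset[['x0','x1','x2']], xor_dataset['label']

model.fit(x, y, epochs=10000, verbose=False)

print("True labels:", y.values)
print("Model prediction:", model.predict(x).round())
# Reported result: true labels [0 1 1 0], model prediction [0 1 1 0]
```
True labels: [0 1 1 0]; model prediction: [0 1 1 0]. The prediction is correct.


### 2.2 Method 2: deep neural network (MLP)
Enter the universal approximation theorem: "a feedforward network with a linear output layer and at least one hidden layer with any 'squashing' activation function can approximate any Borel measurable function from one finite-dimensional space to another to any desired accuracy, provided that the network is given enough hidden units." Put simply, a feedforward network with at least one hidden layer that has a nonlinear activation, and enough hidden units in it, can approximate such functions as closely as needed.

Here we add one hidden layer to logistic regression, upgrading it to a two-layer neural network (MLP):
```
# Reuses xor_dataset from the first block
import numpy as np
from keras.layers import *
from keras.models import Sequential, Model
from tensorflow import random

np.random.seed(0) # fix the random seeds
random.set_seed(0)

model = Sequential()
model.add(Dense(10, input_dim=2, activation='relu'))   # hidden layer
model.add(Dense(1, activation='sigmoid'))  # output layer

model.summary()
model.compile(optimizer='adam', loss='binary_crossentropy')


x, y = xor_dataset[['x0','x1']], xor_dataset['label']
model.fit(x, y, epochs=100000, verbose=False)  # train the model


print("True labels:", y.values)
print("Model prediction:", model.predict(x).round())
```
True labels: [0 1 1 0]; model prediction: [[0.][1.][1.][0.]]. The prediction is correct.

### 2.3 Method 3: kernel functions in support vector machines

A support vector machine (SVM) can be viewed as an improvement on the single-hidden-layer neural network (the author plans a dedicated article on how SVMs work). For problems that are not linearly separable, a deep neural network adds nonlinear hidden layers, whereas an SVM uses a nonlinear kernel function. In essence both apply a nonlinear transformation to the feature space and so raise the model's nonlinear expressive power.

![](https://upload-images.jianshu.io/upload_images/11682271-90608491a29f1dce.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

```
from sklearn.svm import SVC

# Reuses x (columns x0, x1) and y from the previous block
svm = SVC()  # the default kernel is RBF, a nonlinear kernel
svm.fit(x, y)
print("Model prediction:", svm.predict(x))

```
True labels: [0 1 1 0]; model prediction: [0 1 1 0]. The prediction is correct.

## Summary
Ultimately, a machine learning model can be viewed as a function. Its core ability is to control the feature representation through the parameters w so as to fit the target Y, and what is finally learned is the decision function f(x; w). The key to improving fitting capacity is to **control and exploit nonlinear information from interactions between features, achieving a nonlinear transformation of the feature space.** Improvements in fitting capacity come down to two aspects:

- Data: construct complex features through feature engineering.

- Model: use a complex nonlinear model, for example a neural network with nonlinear hidden layers, an SVM with a nonlinear kernel, or tree ensembles, which are nonlinear by nature. Empirically, ensembling these heterogeneous models tends to work even better.



---
This article was first published on the WeChat official account "算法进阶"; the "read original" link there leads to the article's [related code](https://github.com/aialgorithm/Blog).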
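The claim in sections 1 and 2.1 can be checked without training a network. The sketch below is not the author's code: it swaps Keras for ordinary least squares in NumPy, and the helper name `least_squares_fit` is made up for illustration. It fits the XOR labels once on the raw inputs and once with the interaction feature x0 * x1 added.

```python
import numpy as np

# XOR truth table, in the same row order as xor_dataset in the article.
X = np.array([[1, 1], [1, 0], [0, 1], [0, 0]], dtype=float)
y = np.array([0, 1, 1, 0], dtype=float)


def least_squares_fit(features, targets):
    """Fit targets ~ features @ w + b by ordinary least squares.

    Hypothetical helper for this sketch. Returns the coefficient
    vector (bias last) and the fitted values.
    """
    design = np.column_stack([features, np.ones(len(features))])
    coef, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return coef, design @ coef


# Raw inputs only: a linear function of x0 and x1.
coef_linear, pred_linear = least_squares_fit(X, y)

# Add the interaction feature x2 = x0 * x1, as in section 2.1.
X_cross = np.column_stack([X, X[:, 0] * X[:, 1]])
coef_cross, pred_cross = least_squares_fit(X_cross, y)

print("fit on x0, x1:        ", np.round(pred_linear, 3))
print("fit on x0, x1, x0*x1: ", np.round(pred_cross, 3))
print("coefficients (w0, w1, w2, b):", np.round(coef_cross, 3))
```

On the raw inputs the best linear fit is flat at 0.5 for all four points, so it carries no information about the label. With the product feature the fit is exact, because XOR can be written as x0 + x1 - 2·x0·x1, which is linear in the three features.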


