René's URL Explorer Experiment


Title: dashes in word_count.txt cause errors with WordCount.py · Issue #12 · jleetutorial/python-spark-tutorial · GitHub

Description: Issue: The ndash characters in word_count.txt cause an error when following the "Run your first Spark Job" tutorial. There are only two occurrences of this character: "from 1913–74." and "near–bankruptcy". To Recreate: using spa...

Opengraph URL: https://github.com/jleetutorial/python-spark-tutorial/issues/12

Domain: patch-diff.githubusercontent.com


JSON-LD data (type: DiscussionForumPosting):

Headline: dashes in word_count.txt cause errors with WordCount.py
Author: HarryCaveMan (https://github.com/HarryCaveMan)
Date published: 2018-11-04T05:16:14.000Z
Comments: 1
URL: https://github.com/jleetutorial/python-spark-tutorial/issues/12

Article body:

### Issue:
The `ndash` characters in `word_count.txt` cause an error when following the "Run your first Spark Job" tutorial. There are only two occurrences of this character: "`from 1913–74.`" and "`near–bankruptcy`".

#### To Recreate:
Using `spark-2.3.2-bin-hadoop2.7` on Ubuntu 18 with pyspark/Python 2.7, installed following the instructions from lecture 5, go to the directory where you cloned `python-spark-tutorial` and run the following command from lecture 6:

`spark-submit ./rdd/WordCount.py`

The execution halts about halfway through the frequency counter with the following error:

```
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2013' in position 4: ordinal not in range(128)
```
Spoiler, it's the dash. I'm not sure whether the Unicode en dash (U+2013) was intentional, so I'm posting.

#### Work-Around:

I changed the two `ndash` characters to "`from 1913-74.`" and "`near-bankruptcy`", which solved the issue for me. A related [stackoverflow thread](https://stackoverflow.com/questions/20329896/python-2-7-character-u2013) describes someone else who ran into a similar problem with Python 2.7 and used the same solution.
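
The failure reported in the issue can be reproduced without Spark. The following is a minimal sketch, not the tutorial's actual `WordCount.py`: it assumes the script ends up encoding each word with the default ASCII codec (as Python 2.7 does implicitly when mixing byte strings and Unicode), and the variable names are illustrative only. It also shows two possible ways around the error: encoding explicitly as UTF-8, or replacing the en dash with a hyphen as in the work-around above.

```python
# -*- coding: utf-8 -*-
# Minimal sketch: reproduce the encoding failure without Spark.
# Names below are hypothetical and not taken from WordCount.py.

word = u"near\u2013bankruptcy"  # U+2013 EN DASH, as in word_count.txt

# Encoding to ASCII fails because U+2013 is outside the 0-127 range.
try:
    word.encode("ascii")
    failed_at = None
except UnicodeEncodeError as err:
    failed_at = err.start  # index of the offending character

# Option 1: encode explicitly as UTF-8 instead of relying on ASCII.
utf8_bytes = word.encode("utf-8")

# Option 2: the work-around from the issue, swapping the en dash for a hyphen.
ascii_word = word.replace(u"\u2013", u"-")

print(failed_at)
print(ascii_word)
```

Here `failed_at` is 4, which is consistent with "position 4" in the reported error message: in both "near–bankruptcy" and "1913–74." the en dash is the fifth character.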

Links:

dashes in word_count.txt cause errors with WordCount.py: https://patch-diff.githubusercontent.com/jleetutorial/python-spark-tutorial/issues/12#top
HarryCaveMan: https://github.com/HarryCaveMan
on Nov 4, 2018: https://github.com/jleetutorial/python-spark-tutorial/issues/12#issue-377122208
stackoverflow thread: https://stackoverflow.com/questions/20329896/python-2-7-character-u2013