| Unicode in Python: Working With Character Encodings | https://realpython.com/courses/python-unicode/ |
| Christopher Trudeau | https://realpython.com/courses/python-unicode/#team |
| Recommended Tutorial | https://realpython.com/python-encodings-guide/ |
| Course Slides (.pdf) | https://realpython.com/courses/python-unicode/downloads/unicode-slides/ |
| Sample Code (.zip) | https://realpython.com/courses/python-unicode/downloads/unicode-sample-code/ |
| 00:00 | https://realpython.com/videos/ascii-python-string-module/#t=0.51 |
| In the previous lesson, I introduced you to characters, | https://realpython.com/videos/ascii-python-string-module/#t=0.51 |
| code points, and the encoding thereof. In this lesson, | https://realpython.com/videos/ascii-python-string-module/#t=2.76 |
| I’m going to dive further into ASCII and its support in the Python string | https://realpython.com/videos/ascii-python-string-module/#t=5.79 |
| module. | https://realpython.com/videos/ascii-python-string-module/#t=8.97 |
| 00:10 | https://realpython.com/videos/ascii-python-string-module/#t=10.68 |
| ASCII became one of the most common standards for encoding because it was used by | https://realpython.com/videos/ascii-python-string-module/#t=10.68 |
| PCs early on. ASCII only encodes the basic Latin alphabet. | https://realpython.com/videos/ascii-python-string-module/#t=13.92 |
| There are no accented characters. The original encoding is 7 bits, | https://realpython.com/videos/ascii-python-string-module/#t=18.6 |
| so 128 characters in total, and it can be divided up into a series of groups. | https://realpython.com/videos/ascii-python-string-module/#t=22.83 |
| 00:28 | https://realpython.com/videos/ascii-python-string-module/#t=28.44 |
| The first 32 are control characters. They’re non-printable. | https://realpython.com/videos/ascii-python-string-module/#t=28.44 |
| These include things like printer controls, the bell sound, and carriage return. | https://realpython.com/videos/ascii-python-string-module/#t=32.07 |
| The next chunk is the space, a series of symbols, and numbers. | https://realpython.com/videos/ascii-python-string-module/#t=37.11 |
| 00:42 | https://realpython.com/videos/ascii-python-string-module/#t=42.06 |
| After that come capital letters, a few more symbols, lowercase letters, | https://realpython.com/videos/ascii-python-string-module/#t=42.06 |
| a few more symbols, and then finally, the character for deletion. | https://realpython.com/videos/ascii-python-string-module/#t=47.04 |
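As a rough sketch of the groups just described, the function below sorts a single ASCII character into one of those ranges. The name `ascii_group` and the group labels are illustrative assumptions, not something from the lesson or the standard library.

```python
def ascii_group(ch):
    """Return a label for the ASCII group a single character falls into."""
    code = ord(ch)
    if code > 127:
        raise ValueError(f"{ch!r} is outside the 7-bit ASCII range")
    if code < 32:
        return "control"      # code points 0-31: non-printable controls
    if code == 127:
        return "delete"       # the final code point is DEL
    if ch == " ":
        return "space"
    if ch.isdigit():
        return "digit"
    if ch.isupper():
        return "uppercase"
    if ch.islower():
        return "lowercase"
    return "symbol"           # everything else is punctuation


for sample in ["\a", "\r", " ", "7", "A", "a", "~", "\x7f"]:
    print(repr(sample), ord(sample), ascii_group(sample))
```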
| The original ASCII was a 7-bit encoding, | https://realpython.com/videos/ascii-python-string-module/#t=51.27 |
| and so went from 0 to 127. PCs used 8-bit bytes, | https://realpython.com/videos/ascii-python-string-module/#t=53.64 |
| so oftentimes, the leading 8th bit was used for parity checks during | https://realpython.com/videos/ascii-python-string-module/#t=58.59 |
| transmission. | https://realpython.com/videos/ascii-python-string-module/#t=62.37 |
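To make the parity idea concrete, here is a minimal sketch of even parity, assuming the 8th bit is set so that the total number of 1 bits in the byte is even. The function names are hypothetical, and real hardware used several parity schemes (even, odd, mark, space), so treat this as one possibility rather than the scheme any particular system used.

```python
def add_even_parity(code):
    """Set bit 7 so the byte has an even number of 1 bits."""
    if not 0 <= code < 128:
        raise ValueError("expected a 7-bit ASCII code")
    ones = bin(code).count("1")
    return code | ((ones % 2) << 7)


def has_even_parity(byte):
    """Check that a received byte still has an even number of 1 bits."""
    return bin(byte).count("1") % 2 == 0


sent = add_even_parity(ord("C"))   # 'C' is 0b1000011, three 1 bits
print(f"{sent:08b}")               # parity bit set, so four 1 bits in total
print(has_even_parity(sent))       # True: arrived intact
print(has_even_parity(sent ^ 1))   # False: one bit flipped in transit
```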
| 01:04 | https://realpython.com/videos/ascii-python-string-module/#t=64.17 |
| It didn’t take long to figure out that ASCII was insufficient to handle other | https://realpython.com/videos/ascii-python-string-module/#t=64.17 |
| kinds of languages. | https://realpython.com/videos/ascii-python-string-module/#t=67.83 |
| Accented characters for Latin and Germanic languages were added by extending | https://realpython.com/videos/ascii-python-string-module/#t=69.36 |
| ASCII to use the full 8 bits. This wasn’t the only extension. | https://realpython.com/videos/ascii-python-string-module/#t=73.17 |
| 01:17 | https://realpython.com/videos/ascii-python-string-module/#t=77.97 |
| Another one was called Latin-1. Latin-1 was then modified by Microsoft to | https://realpython.com/videos/ascii-python-string-module/#t=77.97 |
| create Windows-1252. Latin-1 and 1252 are very, | https://realpython.com/videos/ascii-python-string-module/#t=83.01 |
| very close, | https://realpython.com/videos/ascii-python-string-module/#t=87.48 |
| which causes all sorts of problems because it looks like you can interchange | https://realpython.com/videos/ascii-python-string-module/#t=88.74 |
| them, but every once in a while, | https://realpython.com/videos/ascii-python-string-module/#t=92.07 |
| you’re going to run into a character difference. | https://realpython.com/videos/ascii-python-string-module/#t=93.3 |
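A short sketch of where the two encodings part ways, using Python's built-in `latin-1` and `cp1252` codecs. The sample bytes are chosen for illustration: bytes in the 0x80 to 0x9F range are where the difference shows up, while a byte such as 0xE9 decodes the same way in both.

```python
# 0x93 and 0x94 are curly double quotes in Windows-1252,
# but invisible control characters in Latin-1.
raw = bytes([0x93, 0x48, 0x69, 0x94])

as_cp1252 = raw.decode("cp1252")
as_latin1 = raw.decode("latin-1")

print(repr(as_cp1252))
print(repr(as_latin1))

# Outside that range the two agree, which is why they look interchangeable.
shared = bytes([0xE9])
print(shared.decode("cp1252") == shared.decode("latin-1"))
```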
| 01:36 | https://realpython.com/videos/ascii-python-string-module/#t=96.24 |
| If you’re wondering why I’m spending so much time on ASCII when this is a course | https://realpython.com/videos/ascii-python-string-module/#t=96.24 |
| on Unicode, well, | https://realpython.com/videos/ascii-python-string-module/#t=99.81 |
| it turns out that Unicode, Latin-1, Windows-1252— | https://realpython.com/videos/ascii-python-string-module/#t=101.16 |
| they all use the first 128 code points from ASCII. | https://realpython.com/videos/ascii-python-string-module/#t=105.63 |
| 01:50 | https://realpython.com/videos/ascii-python-string-module/#t=110.58 |
| So, if you’re sticking with the characters that I described in the previous | https://realpython.com/videos/ascii-python-string-module/#t=110.58 |
| screen, then the encoding is compatible across all four of these standards. | https://realpython.com/videos/ascii-python-string-module/#t=114.48 |
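You can check that compatibility with a quick sketch. It assumes UTF-8 stands in for Unicode here, since UTF-8 is the Unicode encoding that keeps ASCII characters as single identical bytes (UTF-16, for example, would not produce the same bytes). The sample text is arbitrary.

```python
text = "Hello, World!"  # only characters from the 128 ASCII code points

encodings = ["ascii", "latin-1", "cp1252", "utf-8"]
encoded = {name: text.encode(name) for name in encodings}

for name, data in encoded.items():
    print(f"{name:8} {data!r}")

# Every encoding produced exactly the same bytes.
all_identical = len(set(encoded.values())) == 1
print(all_identical)
```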
| Although Unicode is quickly becoming the de facto encoding, due to history, | https://realpython.com/videos/ascii-python-string-module/#t=119.34 |
| you still run into other encodings quite frequently. | https://realpython.com/videos/ascii-python-string-module/#t=123.24 |
| 02:06 | https://realpython.com/videos/ascii-python-string-module/#t=126.42 |
| The web is one of those places. | https://realpython.com/videos/ascii-python-string-module/#t=126.42 |
| Latin-1 was the original default encoding for documents delivered over HTTP. | https://realpython.com/videos/ascii-python-string-module/#t=128.639 |
| Anything with a MIME type of text/, unless you specify otherwise, is using Latin-1. | https://realpython.com/videos/ascii-python-string-module/#t=134.1 |
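As a sketch of that rule, the hypothetical helper below picks a charset from a `Content-Type` value and falls back to Latin-1 for `text/` types when none is declared. The name `charset_for` is an assumption, and the standard library's `email.message.Message` is used here only as a convenient header parser, not because HTTP clients work this way internally.

```python
from email.message import Message


def charset_for(content_type):
    """Return the declared charset, or the historical Latin-1 default for text/*."""
    msg = Message()
    msg["Content-Type"] = content_type
    declared = msg.get_content_charset()
    if declared is not None:
        return declared
    if msg.get_content_maintype() == "text":
        return "iso-8859-1"   # Latin-1, the original HTTP default
    return None               # non-text types have no default charset


print(charset_for("text/html; charset=UTF-8"))
print(charset_for("text/plain"))
print(charset_for("image/png"))
```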
| 02:19 | https://realpython.com/videos/ascii-python-string-module/#t=139.26 |
| Of course, standards are, well, not always so standard, | https://realpython.com/videos/ascii-python-string-module/#t=139.26 |
| so depending on what web server you were using and what browsers you were using, | https://realpython.com/videos/ascii-python-string-module/#t=144.54 |
| there were subtle differences to this. In order to get around this, browsers | https://realpython.com/videos/ascii-python-string-module/#t=148.26 |
| try to guess the encoding. This works with a varying degree of success, | https://realpython.com/videos/ascii-python-string-module/#t=152.19 |
| although they’ve gotten much better in the recent past. Old coders like me used | https://realpython.com/videos/ascii-python-string-module/#t=155.91 |
| to spend a lot of time on Slashdot. If you’re not familiar, | https://realpython.com/videos/ascii-python-string-module/#t=160.44 |
| this is a website that aggregates technology news. It’s been around since 1997 | https://realpython.com/videos/ascii-python-string-module/#t=163.2 |
| and I’m pretty sure some of the code in there is still the original code. It’s | https://realpython.com/videos/ascii-python-string-module/#t=168.96 |
| notorious for not supporting Unicode, | https://realpython.com/videos/ascii-python-string-module/#t=173.04 |
| and you can see this in a comment that I’ve clipped here. | https://realpython.com/videos/ascii-python-string-module/#t=175.2 |
| 02:58 | https://realpython.com/videos/ascii-python-string-module/#t=178.71 |
| This isn’t because a cat ran across this person’s keyboard— | https://realpython.com/videos/ascii-python-string-module/#t=178.71 |
| this is because the apostrophe has been interpreted in a different encoding and | https://realpython.com/videos/ascii-python-string-module/#t=181.81 |
| you get a whole bunch of garbage instead of the poster’s intent. | https://realpython.com/videos/ascii-python-string-module/#t=186.46 |
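One common way this kind of garbage appears can be reproduced in a couple of lines. The sketch assumes the text was written as UTF-8 and then read as Windows-1252. The actual path the comment took through the site isn't known, so this is only an illustration of the effect.

```python
original = "don\u2019t"  # "don't" with a curly apostrophe (U+2019)

# Written as UTF-8, then misread as Windows-1252.
garbled = original.encode("utf-8").decode("cp1252")
print(garbled)

# Reversing the two steps recovers the text, as long as nothing was lost.
repaired = garbled.encode("cp1252").decode("utf-8")
print(repaired == original)
```

The single apostrophe takes three bytes in UTF-8, and each of those bytes is shown as its own Windows-1252 character, which is why one character turns into three.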
| 03:10 | https://realpython.com/videos/ascii-python-string-module/#t=190.15 |
| You don’t see them as often anymore, | https://realpython.com/videos/ascii-python-string-module/#t=190.15 |
| but in the early 2000s, frequently web pages would be littered with these | https://realpython.com/videos/ascii-python-string-module/#t=192.01 |
| little question marks and blocks. Before browsers got better at guessing the | https://realpython.com/videos/ascii-python-string-module/#t=195.43 |
| encoding, this was the character that was shown if the character on the page | https://realpython.com/videos/ascii-python-string-module/#t=198.76 |
| couldn’t be shown in the browser’s current encoding. Thankfully, | https://realpython.com/videos/ascii-python-string-module/#t=202.69 |
| this problem is mostly solved | https://realpython.com/videos/ascii-python-string-module/#t=206.59 |
| now. The Python string module defines a whole bunch of constants that are useful | https://realpython.com/videos/ascii-python-string-module/#t=208.21 |
| for looking at ASCII. Let’s take a look at a few of them. | https://realpython.com/videos/ascii-python-string-module/#t=213.13 |
| 03:36 | https://realpython.com/videos/ascii-python-string-module/#t=216.61 |
| string.whitespace defines tab, | https://realpython.com/videos/ascii-python-string-module/#t=216.61 |
| newline, and others to be whitespace characters. ascii_lowercase | https://realpython.com/videos/ascii-python-string-module/#t=218.56 |
| and ascii_uppercase show the alphabet letters. | https://realpython.com/videos/ascii-python-string-module/#t=223.21 |
| 03:46 | https://realpython.com/videos/ascii-python-string-module/#t=226.03 |
| ascii_letters is the combination of those two. digits are the numbers. | https://realpython.com/videos/ascii-python-string-module/#t=226.03 |
| hexdigits are the numbers | https://realpython.com/videos/ascii-python-string-module/#t=231.43 |
| plus the first six letters in both lowercase and uppercase. octdigits are the | https://realpython.com/videos/ascii-python-string-module/#t=232.99 |
| first eight numbers. punctuation is the set of symbols. | https://realpython.com/videos/ascii-python-string-module/#t=238.6 |
| 04:02 | https://realpython.com/videos/ascii-python-string-module/#t=242.08 |
| And finally, string.printable shows all of these combined. | https://realpython.com/videos/ascii-python-string-module/#t=242.08 |
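Here is a short sketch that prints each of those constants so you can see what they hold. The `combined` variable is just for illustration. It rebuilds `string.printable` from the other constants to show how they fit together.

```python
import string

# repr() makes the whitespace characters visible as escape sequences.
print(repr(string.whitespace))
print(string.ascii_lowercase)
print(string.ascii_uppercase)
print(string.ascii_letters)
print(string.digits)
print(string.hexdigits)
print(string.octdigits)
print(string.punctuation)

# string.printable is digits, letters, punctuation, and whitespace together.
combined = (
    string.digits + string.ascii_letters + string.punctuation + string.whitespace
)
print(combined == string.printable)
```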
| 04:08 | https://realpython.com/videos/ascii-python-string-module/#t=248.05 |
| Let’s crack open the REPL and take a look at this in practice. | https://realpython.com/videos/ascii-python-string-module/#t=248.05 |
| I’m going to import string so I can get access to those constants that I just | https://realpython.com/videos/ascii-python-string-module/#t=250.96 |
| showed you. Type in a question. | https://realpython.com/videos/ascii-python-string-module/#t=254.83 |
| 04:20 | https://realpython.com/videos/ascii-python-string-module/#t=260.2 |
| Now, let’s say you wanted to pull the punctuation and space off the right-hand | https://realpython.com/videos/ascii-python-string-module/#t=260.2 |
| side. The .rstrip() method will strip characters off the right-hand end of a string. | https://realpython.com/videos/ascii-python-string-module/#t=263.38 |
| 04:28 | https://realpython.com/videos/ascii-python-string-module/#t=268.99 |
| If you pass in values to .rstrip(), | https://realpython.com/videos/ascii-python-string-module/#t=268.99 |
| it’ll tell it what characters to pull. Passing in string.punctuation and | https://realpython.com/videos/ascii-python-string-module/#t=270.91 |
| string.whitespace will pull all of the question marks and the exclamation marks | https://realpython.com/videos/ascii-python-string-module/#t=275.8 |
| and the space between them off the right-hand side of that string. | https://realpython.com/videos/ascii-python-string-module/#t=279.94 |
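A minimal sketch of that step follows. The sample question is made up for illustration, since the exact text typed in the video isn't in the transcript.

```python
import string

question = "Is this a question?! ?!"  # hypothetical sample text

# .rstrip() treats its argument as a set of characters to remove,
# and stops at the first character from the right that isn't in the set.
trimmed = question.rstrip(string.punctuation + string.whitespace)
print(repr(trimmed))
```

Because the argument is a set of characters rather than a suffix, the order in which the punctuation and whitespace appear at the end of the string doesn't matter.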
| 04:45 | https://realpython.com/videos/ascii-python-string-module/#t=285.22 |
| You can use the .isascii() method to see whether or not a value is ASCII. | https://realpython.com/videos/ascii-python-string-module/#t=285.22 |
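For example, assuming Python 3.7 or later (where `str.isascii()` is available), a quick sketch with made-up sample strings:

```python
plain = "cafe"
accented = "caf\u00e9"  # "café" with an accented e, outside the ASCII range

print(plain.isascii())     # every character is below code point 128
print(accented.isascii())  # the accented character is not
print("".isascii())        # an empty string counts as ASCII
```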
| 04:51 | https://realpython.com/videos/ascii-python-string-module/#t=291.28 |
| You can use the .isprintable() method to see whether or not it contains printable | https://realpython.com/videos/ascii-python-string-module/#t=291.28 |
| characters. One word of caution: .isprintable() | https://realpython.com/videos/ascii-python-string-module/#t=294.67 |
| doesn’t actually use string.printable, | https://realpython.com/videos/ascii-python-string-module/#t=298.15 |
| so there’s a subtle difference between the two. | https://realpython.com/videos/ascii-python-string-module/#t=300.64 |
| 05:06 | https://realpython.com/videos/ascii-python-string-module/#t=306.64 |
| .isprintable() on whitespace such as tab and newline is False, | https://realpython.com/videos/ascii-python-string-module/#t=306.64 |
| even though string.printable includes the tab and newline characters. | https://realpython.com/videos/ascii-python-string-module/#t=309.58 |
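The mismatch can be seen in a short sketch. Note that the plain space character does count as printable for `.isprintable()`. It is the other whitespace characters in `string.printable` that fail the check.

```python
import string

print(" ".isprintable())               # the space itself is printable
print("\t".isprintable())              # tab is not
print("\n".isprintable())              # newline is not
print(string.printable.isprintable())  # so the whole constant is not

# The characters from string.printable that .isprintable() rejects.
rejected = [ch for ch in string.printable if not ch.isprintable()]
print(rejected)

# repr() shows these characters as escape sequences, not literally.
print(repr("\t"))
```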
| 05:13 | https://realpython.com/videos/ascii-python-string-module/#t=313.81 |
| This is because .isprintable() is an older method that tells you whether or not | https://realpython.com/videos/ascii-python-string-module/#t=313.81 |
| something is printable within the repr() representation. That repr() representation | https://realpython.com/videos/ascii-python-string-module/#t=317.47 |
| shows tabs and newlines as escape sequences rather than as literal characters, | https://realpython.com/videos/ascii-python-string-module/#t=323.26 |
| so you get into the strange situation where string.printable— | https://realpython.com/videos/ascii-python-string-module/#t=326.08 |
| which does include those characters— | https://realpython.com/videos/ascii-python-string-module/#t=329.74 |
| 05:33 | https://realpython.com/videos/ascii-python-string-module/#t=333.7 |
| isn’t printable. Before digging into Unicode and how it’s represented, | https://realpython.com/videos/ascii-python-string-module/#t=333.7 |
| you’re going to need a little bit of computer science math. | https://realpython.com/videos/ascii-python-string-module/#t=339.1 |
| So in the next episode, | https://realpython.com/videos/ascii-python-string-module/#t=341.98 |
| I’ll be reviewing bits, bytes, octal, and hex representations. | https://realpython.com/videos/ascii-python-string-module/#t=343.24 |
|
Unicode in Python: Working With Character Encodings (Overview) 07:56
| https://realpython.com/videos/python-unicode-overview/ |
|
Working With ASCII and the Python String Module 05:49
| https://realpython.com/videos/ascii-python-string-module/ |
|
Working in Binary: Bits, Bytes, Oct, and Hex 06:26
| https://realpython.com/lessons/bits-bytes-oct-hex/ |
|
Using Unicode 04:15
| https://realpython.com/lessons/using-unicode/ |
|
Encoding UTF-8 06:19
| https://realpython.com/lessons/encoding-utf8/ |
|
Combining Characters 05:40
| https://realpython.com/lessons/combining-characters/ |
|
Using Built-In Functions 05:38
| https://realpython.com/lessons/built-in-functions/ |
|
Using Other Encodings 04:45
| https://realpython.com/lessons/other-encodings/ |
|
Unicode in Python: Working With Character Encodings (Summary) 04:53
| https://realpython.com/lessons/python-unicode-summary/ |