Title: The preview of the new Python 3 port has broken HTML escaping in the XML feeds · Issue #582 · python/planet · GitHub
Description: I am using: O.S: Fedora 40 Browser: Firefox 131.0.2 Platform: desktop Problem The preview of the new Python 3 port has broken HTML escaping in the XML feeds eg try to view this in the browser: https://planetpython.org/3/rss10.xml and it ...
Opengraph URL: https://github.com/python/planet/issues/582
Author: berrange (https://github.com/berrange)
Date Published: 2024-10-24T13:28:20.000Z
Comments: 3

I am using:
**O.S**: Fedora 40
**Browser**: Firefox 131.0.2
**Platform**: desktop

## Problem
The preview of the new Python 3 port has broken HTML escaping in the XML feeds.

E.g. try to view this in the browser:

 https://planetpython.org/3/rss10.xml

and it will complain about undefined entities, due to having raw unescaped HTML in the XML document.

By comparison, the original Python 2 code escaped HTML in the feed:

```
$ wget https://planetpython.org/rss10.xml
$ grep "content:encoded" rss10.xml | head -1
	<content:encoded>&lt;p&gt;As is probably apparent from the sequence of blog posts about the topic in the
$ wget https://planetpython.org/3/rss10.xml
$ grep "content:encoded" rss10.xml.1 | head -1
	<content:encoded><p>As is probably apparent from the sequence of blog posts about the topic in the
```

## Details

This problem is caused by a mistake in the Python 3 conversion done in #577, specifically commit https://github.com/python/planet/pull/577/commits/86e31f90403c4659471396beeba922584e08d12e, which replaced code patterns like:

```
feed[key] = sanitize.HTML(feed[key])
```

with

```
feed[key] = Markup(feed[key])
```

which does not provide functionally equivalent behaviour.

The `sanitize.HTML` method would parse the HTML and strip out various undesirable elements and attributes, and escaping was later performed by the template processor.

The `Markup` method will not parse anything; it'll just wrap the `str` in a `Markup` class, as a way to designate it as being safe to use as-is without further escaping. As a result, when you later try to escape the variable in Jinja using `... | e`, it will do nothing at all, resulting in raw HTML being put into the XML document, leading to the later parsing errors.

I think either the original sanitizer code needs to be reinstated and made to work with py3, or perhaps an external library such as https://github.com/matthiask/html-sanitizer/ could be leveraged?
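
The failure mode described under Details can be reproduced with the standard library alone. The following is a minimal sketch, not the planet code: `SafeString`, `escape`, `render_item` and `is_well_formed` are hypothetical names, and `escape` is a simplified stand-in for the template `| e` filter, which skips any value exposing an `__html__` method. The element is named `encoded` rather than `content:encoded` only to avoid declaring a namespace in the example.

```python
import html
import xml.etree.ElementTree as ET


class SafeString(str):
    """Hypothetical stand-in for Markup: a str flagged as already safe."""

    def __html__(self):
        return self


def escape(value):
    """Simplified stand-in for the template `| e` filter.

    Values that expose __html__ are assumed to be escaped already and pass
    through untouched; everything else is entity-escaped.
    """
    if hasattr(value, "__html__"):
        return value
    return html.escape(str(value), quote=False)


def render_item(content):
    """Build a tiny XML fragment the way a feed template might."""
    return "<item><encoded>" + escape(content) + "</encoded></item>"


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Blog post HTML containing an entity that XML does not define.
post = "<p>Hello&nbsp;world</p>"

# Plain str: the filter escapes it, so the feed stays well-formed.
escaped_feed = render_item(post)

# Wrapped as "safe": the filter does nothing and raw HTML lands in the XML.
raw_feed = render_item(SafeString(post))

print(is_well_formed(escaped_feed))
print(is_well_formed(raw_feed))
```

The sketch suggests why wrapping alone cannot replace sanitizing: marking a string as safe only disables later escaping, so the feed content needs either to stay a plain `str` until the template escapes it or to be cleaned by a real sanitizer first.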