Title: HTMLTokenizer.stream.chunkOffset not updating on string with no html elements · Issue #571 · html5lib/html5lib-python · GitHub
Opengraph URL: https://github.com/html5lib/html5lib-python/issues/571
Author: ehsmeng (https://github.com/ehsmeng)
Date published: 2023-07-29T15:31:34.000Z
Comments: 0

```
from html5lib._tokenizer import HTMLTokenizer
from io import StringIO

class T():
    def __init__(self, data):
        print("Object from string: " + data)
        self.src = StringIO()
        self.tokenizer = HTMLTokenizer(self.src)

        pos = self.src.tell()
        self.src.write(data)
        self.src.seek(pos)
        self.handle_tokens()
        self.src.close()

    def handle_tokens(self):
        for token in self.tokenizer:
            print(str(self.tokenizer.stream.chunkOffset))

T("klas katt")
T("klas katt<br>")
```

->

Object from string: klas katt
0

Object from string: klas katt<br>
9
13

I expected the first number printed to be 9.

Apologies if this is an internal variable I should not use.
I'm trying to deduce tag offsets (start, stop) in the HTML document.
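If the end goal is (start, stop) offsets for tags, one possible workaround is to avoid `stream.chunkOffset` altogether, since it may well be internal state as the post suspects. The sketch below swaps in a different technique: it uses the standard library's `html.parser.HTMLParser` rather than html5lib, so its error recovery will not match html5lib's. The class name `TagOffsets` and the attributes `line_starts` and `spans` are made up for illustration. It only records start tags, and it assumes the input is a single `str` fed in one call.

```python
from html.parser import HTMLParser


class TagOffsets(HTMLParser):
    """Collect (tag, start, stop) character offsets for every start tag."""

    def __init__(self, data):
        super().__init__(convert_charrefs=True)
        # Absolute offset of the first character of each line.
        # HTMLParser counts lines by "\n" only, so split on "\n" as well.
        self.line_starts = []
        offset = 0
        for line in data.split("\n"):
            self.line_starts.append(offset)
            offset += len(line) + 1  # +1 for the "\n" that split() removed
        self.spans = []
        self.feed(data)
        self.close()

    def handle_starttag(self, tag, attrs):
        # getpos() gives a 1-based line number and a 0-based column
        # for the "<" that opens the tag currently being handled.
        lineno, col = self.getpos()
        start = self.line_starts[lineno - 1] + col
        # get_starttag_text() is the raw source text of that start tag.
        stop = start + len(self.get_starttag_text())
        self.spans.append((tag, start, stop))


print(TagOffsets("klas katt").spans)
print(TagOffsets("klas katt<br>").spans)
```

For the two strings from the report, this gives an empty list for `"klas katt"` and a single `("br", 9, 13)` entry for `"klas katt<br>"`, which lines up with the 9 and 13 seen in the second run above. End tags could be handled in a similar way in `handle_endtag`, but `HTMLParser` has no counterpart to `get_starttag_text()` for them, so the stop offset would have to be found by searching the source for the closing `>`.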