Title: Improve the error message when `build_multivariate_dataframe` has the list of stat_vars more than the batch_size · Issue #184 · datacommonsorg/api-python · GitHub
Description: cc: @shifucun I was using a script to build_multivariate_dataframe for a stat_var list of length more than 50 and got the following error: Traceback (most recent call last): File "/home/sharadshriram/accessible_charts/datasets/datacommon...
Opengraph URL: https://github.com/datacommonsorg/api-python/issues/184
{"@context":"https://schema.org","@type":"DiscussionForumPosting","headline":"Improve the error message when `build_multivariate_dataframe` has the list of stat_vars more than the batch_size","articleBody":"cc: @shifucun \r\n\r\nI was using a script to `build_multivariate_dataframe` for a stat_var list of length more than 50 and got the following error:\r\n\r\n```shell\r\nTraceback (most recent call last):\r\n File \"/home/sharadshriram/accessible_charts/datasets/datacommons/get_data.py\", line 88, in \u003cmodule\u003e\r\n save_statvar_to_csv(place, 'data.csv')\r\n File \"/home/sharadshriram/accessible_charts/datasets/datacommons/get_data.py\", line 67, in save_statvar_to_csv\r\n df = dpd.build_multivariate_dataframe([place], stat_vars)\r\n File \"/home/sharadshriram/env/lib/python3.10/site-packages/datacommons_pandas/df_builder.py\", line 314, in build_multivariate_dataframe\r\n df = pd.DataFrame.from_records(_multivariate_pd_input(places, stat_vars))\r\n File \"/home/sharadshriram/env/lib/python3.10/site-packages/datacommons_pandas/df_builder.py\", line 238, in _multivariate_pd_input\r\n rows_dict = _group_stat_all_by_obs_options(places,\r\n File \"/home/sharadshriram/env/lib/python3.10/site-packages/datacommons_pandas/df_builder.py\", line 88, in _group_stat_all_by_obs_options\r\n stat_all = dc.get_stat_all(places, stat_vars)\r\n File \"/home/sharadshriram/env/lib/python3.10/site-packages/datacommons_pandas/stat_vars.py\", line 226, in get_stat_all\r\n batches = -(-len(places) // places_per_batch)\r\nZeroDivisionError: integer division or modulo by zero\r\n```\r\n\r\nHowever, `ZeroDivisionError: integer division or modulo by zero` did not help me understand what caused the ZeroDivisionError. 
After backtracking, I found that the error was caused not by batching itself, but by the length of the `stat_vars` list passed to `dc.get_stat_all(places, stat_vars)` being greater than 50.\r\n\r\nCould the error message state that the length of the `stat_vars` list passed exceeds the batch_size limit of 50?\r\n\r\nI also wonder whether we can extend the `get_stat_all()` method to split long `stat_vars` lists into chunks of at most 50 and query the API once per chunk. I would like to hear your thoughts.","author":{"url":"https://github.com/sharadshriram","@type":"Person","name":"sharadshriram"},"datePublished":"2022-10-13T04:40:12.000Z","interactionStatistic":{"@type":"InteractionCounter","interactionType":"https://schema.org/CommentAction","userInteractionCount":0},"url":"https://github.com/datacommonsorg/api-python/issues/184"}
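To make the two requests above concrete, here is a minimal sketch, not the library's actual code. It assumes that `places_per_batch` comes from an integer division of the batch size by `len(stat_vars)`, which would give `50 // 51 == 0` and explain the division by zero on the `batches = ...` line of the traceback. The names `ASSUMED_BATCH_SIZE`, `places_per_batch()`, `chunked()`, and `get_stat_all_chunked()` are hypothetical, and `fetch` stands in for `dc.get_stat_all`, which is assumed here to return a dict of the form `{place: {stat_var: data}}`.

```python
from typing import Callable, Dict, Iterator, List, Sequence

# Assumption: 50 is the limit reported in this issue; the library's real
# constant name and value may differ.
ASSUMED_BATCH_SIZE = 50


def places_per_batch(stat_vars: Sequence[str],
                     batch_size: int = ASSUMED_BATCH_SIZE) -> int:
    """Return how many places fit in one batch, or fail with a readable message."""
    if not stat_vars:
        raise ValueError("stat_vars must contain at least one entry.")
    if len(stat_vars) > batch_size:
        # Without this check, batch_size // len(stat_vars) is 0 and the later
        # ceiling division by it raises a bare ZeroDivisionError.
        raise ValueError(
            f"Got {len(stat_vars)} stat_vars, which exceeds the batch size "
            f"limit of {batch_size}. Pass at most {batch_size} stat_vars per call.")
    return batch_size // len(stat_vars)


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each with at most `size` entries."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def get_stat_all_chunked(places: Sequence[str],
                         stat_vars: Sequence[str],
                         fetch: Callable[[Sequence[str], List[str]], Dict[str, dict]],
                         batch_size: int = ASSUMED_BATCH_SIZE) -> Dict[str, dict]:
    """Call `fetch` once per chunk of stat_vars and merge the results by place.

    `fetch` is a hypothetical stand-in for dc.get_stat_all and is assumed to
    return {place: {stat_var: data}}.
    """
    merged: Dict[str, dict] = {}
    for chunk in chunked(list(stat_vars), batch_size):
        for place, by_stat_var in fetch(places, chunk).items():
            merged.setdefault(place, {}).update(by_stat_var)
    return merged
```

With this shape, a caller could pass `dc.get_stat_all` as `fetch`. Whether chunking belongs inside `get_stat_all()` or in `build_multivariate_dataframe` is a design choice for the maintainers.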