René's URL Explorer Experiment


Title: Cannot do udaf that returns list of timestamps · Issue #1339 · apache/datafusion-python · GitHub

Description: Describe the bug I'm trying to generate a udaf that returns multiple timestamps for each partition id. To Reproduce import datafusion as dfn from datafusion import udf, udaf, Accumulator, col import pyarrow as pa import pyarrow.compute a...

Opengraph URL: https://github.com/apache/datafusion-python/issues/1339

Domain: patch-diff.githubusercontent.com


Hey, it has json ld scripts:
{"@context":"https://schema.org","@type":"DiscussionForumPosting","headline":"Cannot do udaf that returns list of timestamps","articleBody":"**Describe the bug**\nI'm trying to generate a udaf that returns multiple timestamps for each partition id.\n\n**To Reproduce**\n```python\nimport datafusion as dfn\nfrom datafusion import udf, udaf, Accumulator, col\nimport pyarrow as pa\nimport pyarrow.compute as pc\nimport numpy as np\n\nclass ResampleAccumulator(Accumulator):\n    def __init__(self):\n        self._min = float('inf')\n        self._max = 0\n        # 10 Hz\n        self._timestep = 100 # ms\n\n    def update(self, array):\n        # Logic to update the sum and count from an input array\n        # In a real implementation, you would process the pyarrow array efficiently\n        print(\"Enter update\")\n        local_min, local_max = pc.min_max(array).values()\n        local_min_ns = local_min.cast(pa.timestamp('ns')).value\n        local_max_ns = local_max.cast(pa.timestamp('ns')).value\n\n        self._min = min(local_min_ns, self._min)\n        self._max = max(local_max_ns, self._max)\n        print(f\"update {self._min=}, {self._max=}\")\n\n    def merge(self, states_array):\n        print(\"Enter merge\")\n        # Is there a better way to do this with pc?\n        # or maybe just throw it into numpy\n        self._min = min(states_array[0][0].as_py(), self._min)\n        self._max = max(states_array[1][0].as_py(), self._max)\n        print(f\"merge {self._min=}, {self._max=}\")\n\n\n    def state(self):\n        print(\"Enter state\")\n        # Return the current state as a list of scalars\n        return pa.array([self._min, self._max], type=pa.int64())\n\n    def evaluate(self):\n        print(\"Enter evaluate\")\n        desired_timestamps = np.arange(np.datetime64(self._min, 'ns'), np.datetime64(self._max, 'ns'), np.timedelta64(self._timestep, \"ms\"))\n        print(f\"{len(desired_timestamps)=}\")\n        array_result = 
pa.array(desired_timestamps, type=pa.timestamp('ns'))\n        print(array_result)\n        return array_result\nresample_udaf = udaf(ResampleAccumulator, [pa.timestamp('ns')], pa.list_(pa.timestamp('ns')), [pa.int64(), pa.int64()], volatility=\"stable\")\n\nctx = dfn.SessionContext()\n\ndf = ctx.from_pydict({\"id\": [0,1], \"time\": [np.datetime64(0, 'ns'), np.datetime64(1_000_000_000, 'ns')]})\nprint(df)\n\nresult = df.aggregate(\n    \"id\",\n    [resample_udaf(col(\"time\"))]\n)\nprint(result.schema())\nresult.collect()\n```\nOutput\n```bash\nTraceback (most recent call last):\n  File \"\u003cpath\u003e/\u003cfile\u003e.py\", line 60, in \u003cmodule\u003e\n    result.collect()\n  File \"\u003cpath\u003e/.venv/lib/python3.12/site-packages/datafusion/dataframe.py\", line 729, in collect\n    return self.df.collect()\n           ^^^^^^^^^^^^^^^^^\nException: DataFusion error: Execution(\"ArrowTypeError: object of type \u003cclass 'pyarrow.lib.TimestampArray'\u003e cannot be converted to int\")\n```\n\n**Expected behavior**\nThis works or provides a clearer error.\n\n**Additional context**\nFails on datafusion 51. If I return just a single timestamp and update the udaf call then this works.","author":{"url":"https://github.com/ntjohnson1","@type":"Person","name":"ntjohnson1"},"datePublished":"2026-01-15T15:53:22.000Z","interactionStatistic":{"@type":"InteractionCounter","interactionType":"https://schema.org/CommentAction","userInteractionCount":0},"url":"https://github.com/apache/datafusion-python/issues/1339"}
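
The accumulator in the issue body tracks a running minimum and maximum timestamp, merges partial states, and then expands the range into a fixed-step grid. The sketch below restates only that logic in plain Python with integer nanoseconds, so it can be read and run without DataFusion, pyarrow, or NumPy. The class name `ResampleState`, the `_absorb` helper, and the use of `None` as the empty sentinel are illustrative assumptions, not part of the issue or of any library API, and the sketch does not reproduce or work around the reported `ArrowTypeError`.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

NS_PER_MS = 1_000_000


@dataclass
class ResampleState:
    """Hypothetical stand-in for the issue's accumulator, using plain ints (ns)."""

    min_ns: Optional[int] = None
    max_ns: Optional[int] = None
    timestep_ms: int = 100  # 10 Hz, the step used in the issue

    def _absorb(self, lo: int, hi: int) -> None:
        # Widen the tracked [min, max] range to include [lo, hi].
        self.min_ns = lo if self.min_ns is None else min(self.min_ns, lo)
        self.max_ns = hi if self.max_ns is None else max(self.max_ns, hi)

    def update(self, values_ns: List[int]) -> None:
        # Fold one batch of input timestamps into the running range.
        if values_ns:
            self._absorb(min(values_ns), max(values_ns))

    def merge(self, other: Tuple[Optional[int], Optional[int]]) -> None:
        # Combine with a partial (min, max) state from another partition.
        lo, hi = other
        if lo is not None and hi is not None:
            self._absorb(lo, hi)

    def state(self) -> Tuple[Optional[int], Optional[int]]:
        # The intermediate state is just the two range endpoints.
        return (self.min_ns, self.max_ns)

    def evaluate(self) -> List[int]:
        # Half-open grid [min, max), the same convention as np.arange.
        if self.min_ns is None or self.max_ns is None:
            return []
        return list(range(self.min_ns, self.max_ns, self.timestep_ms * NS_PER_MS))


# Two partial states that together span one second.
left = ResampleState()
left.update([0, 250_000_000])
right = ResampleState()
right.update([1_000_000_000])
left.merge(right.state())
grid = left.evaluate()
```

Because the grid is half-open, a group that contains a single timestamp (as each `id` does in the issue's two-row input) yields an empty list under this sketch.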


Links:

#1347 https://github.com/apache/datafusion-python/pull/1347
Cannot do udaf that returns list of timestamps https://patch-diff.githubusercontent.com/apache/datafusion-python/issues/1339#top
bug (Something isn't working) https://github.com/apache/datafusion-python/issues?q=state%3Aopen%20label%3A%22bug%22
ntjohnson1 https://github.com/ntjohnson1
on Jan 15, 2026 https://github.com/apache/datafusion-python/issues/1339#issue-3818124290
