Title: array.device is sometimes returning raw pointers · Issue #1450 · arrayfire/arrayfire · GitHub
Description: I am mixing CUDA and arrayfire code extensively. I am using CUDA surface objects, so I need to do a copy from the arrayfire device memory to the cuda array memory. Currently, I'm using cudaMemcpy3D(), and I am assuming that the arrayfire...
Opengraph URL: https://github.com/arrayfire/arrayfire/issues/1450
Author: AEHousman (https://github.com/AEHousman)
Date published: 2016-06-06T00:07:41.000Z
Comments: 11

I am mixing CUDA and ArrayFire code extensively. I am using CUDA surface objects, so I need to copy from the ArrayFire device memory to the CUDA array memory. Currently, I'm using cudaMemcpy3D(), and I am assuming that the ArrayFire data is in packed form, in H x W x C order (height varies fastest). I'm using float32 elements here.

Here's the code I use:

```
// release any previous content, make a cuda array, copy from arrayfire array
void CudaArray::to_surface(const af::array &in)
{
    if (in.dims()[3] != 1)
        throw "CudaArray: attempt to allocate a 4D CUDA surface";

    // free up old CUDA array and allocate new one
    free();
    dims_ = in.dims();
    allocate();

    void *d_srcptr = in.device<float>(); // does an implicit eval()
    af::sync(); // ensure all arrayfire work is complete before we trigger CUDA code

    cudaMemcpy3DParms params = { 0 };
    params.srcPtr = make_cudaPitchedPtr(d_srcptr, dims_[0] * sizeof(float), dims_[0], dims_[1]); // dims[0] is what varies fastest
    params.dstArray = cuda_array;
    params.kind = cudaMemcpyDeviceToDevice;
    params.extent = make_cudaExtent(dims_[0], dims_[1], dims_[2]);

    checkCuda(cudaMemcpy3D(&params));
    in.unlock(); // done with d_srcptr
}
```

I use the following to get the data out of the surface and back into an array:

```
// make a new arrayfire array, copy data from surface to the array
af::array CudaArray::to_array()
{
    // ensure CUDA is done before we start copying
    checkCuda(cudaDeviceSynchronize());
    checkCuda(cudaGetLastError());

    af::array out(dims_);

    void *d_dstptr = out.device<float>();

    cudaMemcpy3DParms params = { 0 };
    params.srcArray = cuda_array;
    params.dstPtr = make_cudaPitchedPtr(d_dstptr, dims_[0] * sizeof(float), dims_[0], dims_[1]);
    params.kind = cudaMemcpyDeviceToDevice;
    params.extent = make_cudaExtent(dims_[0], dims_[1], dims_[2]);

    checkCuda(cudaMemcpy3D(&params));
    out.unlock(); // done with d_dstptr

    return out;
}
```

Everything works fine as long as to_surface() is invoked only on a fresh array, i.e., an array created by `array foo = loadImage("file.png", true); CudaArray baz(foo); // this works just fine`

However, if I do any kind of ArrayFire processing on the array first, the round trip produces a garbled image, e.g.,

```
array subwindow(const array &in)
{
    int h = in.dims()[0], w = in.dims()[1];
    array out = in(seq(h/4, h/4+h/2-1), seq(w/4, w/4+w/2-1), span);
    return out;
}

int main()
{
    af::setDevice(0); // .. make a window ..

    array in = loadImage("baz.jpg", true);
    in = subwindow(in); // remove this line and the code works
    CudaArray baz(in);
    win.image(baz.to_array()); // shows what appears to be result of a bad stride
}
```

So the question is: after the call to subwindow, is the memory layout of the array `in` still tightly packed with floats, with no padding and a pitch in bytes equal to the height * 4?
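To make the question concrete, here is a minimal host-side sketch of the situation the symptom suggests. It assumes, without confirmation from the issue, that the indexed array is a view that still carries the parent's strides and an offset into the parent's buffer. Everything in it (`View`, `make_subwindow`, `is_linear`, `copy_packed`) is a hypothetical illustration in plain C++ and is not part of the ArrayFire or CUDA APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of a strided view into a packed, column-major H x W x C buffer.
struct View {
    const float* base;       // start of the parent allocation
    std::size_t  offset;     // element offset of the view's first element
    std::size_t  dims[3];    // extent of the view: rows, cols, channels
    std::size_t  strides[3]; // element strides inherited from the parent
};

// Describe rows [r0, r0+h) and columns [c0, c0+w) of a packed parent, without copying.
View make_subwindow(const std::vector<float>& parent,
                    std::size_t H, std::size_t W, std::size_t C,
                    std::size_t r0, std::size_t h,
                    std::size_t c0, std::size_t w) {
    assert(parent.size() == H * W * C);
    assert(r0 + h <= H && c0 + w <= W);
    return View{parent.data(), r0 + c0 * H, {h, w, C}, {1, H, H * W}};
}

// True when the view's strides match a tightly packed column-major layout,
// i.e. when a pitch of dims[0] * sizeof(float) would be valid.
bool is_linear(const View& v) {
    return v.strides[0] == 1 &&
           v.strides[1] == v.dims[0] &&
           v.strides[2] == v.dims[0] * v.dims[1];
}

// Gather the view into a new tightly packed buffer (rows vary fastest).
std::vector<float> copy_packed(const View& v) {
    std::vector<float> out;
    out.reserve(v.dims[0] * v.dims[1] * v.dims[2]);
    for (std::size_t k = 0; k < v.dims[2]; ++k)
        for (std::size_t j = 0; j < v.dims[1]; ++j)
            for (std::size_t i = 0; i < v.dims[0]; ++i)
                out.push_back(v.base[v.offset + i * v.strides[0] +
                                     j * v.strides[1] + k * v.strides[2]]);
    return out;
}
```

In this model, a 2 x 2 window taken from the middle of a 4 x 4 parent has a column stride of 4 elements, so `is_linear` returns false and a pitch of `dims[0] * sizeof(float)` would walk into the wrong columns. Gathering the window with `copy_packed` first gives a buffer for which the packed-pitch assumption in `to_surface()` holds. Whether this is what actually happens inside ArrayFire after indexing is exactly what the issue asks.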