Title: Memory Leak Caused by the circular reference. · Issue #559 · python-graphblas/python-graphblas · GitHub
Description: First of all, thank you so much for developing such a powerful and interesting project. We are a group working on libraries such as LIBSVM, LIBLINEAR, LibMultiLabel, and so on. Recently, we have been integrating python-graphblas into our code...
Opengraph URL: https://github.com/python-graphblas/python-graphblas/issues/559
Posted by zhi-bao (https://github.com/zhi-bao) on 2025-07-10.

First of all, thank you so much for developing such a powerful and interesting project.
We are a group working on libraries such as **LIBSVM**, **LIBLINEAR**, **LibMultiLabel**, and so on.
Recently, we have been integrating **python-graphblas** into our codebase to speed up sparse matrix operations.
However, during the integration, we encountered a memory issue when using sparse matrix multiplication.
We would like to share our observations and findings, and hope to get some feedback on potential improvements.

The information about our sparse matrices and predefined variables is shown below:
```
weights shape: (135909, 826017) # CSR and NNZ: 1,818,756,815
instances shape: (153025, 135909) # CSR and NNZ: 14,013,838
batch_size = 256
```
Our usage is simplified in the following example:
```python
import graphblas as gb

# `instances`, `weights`, and `batch_size` are the objects described above.

def predict_values(A, B):
    C = gb.Matrix(float, A.shape[0], B.shape[1])
    C << A.mxm(B, op=gb.semiring.min_plus)
    return C.to_dense(fill_value=0)

def main():
    for idx in range(0, instances.shape[0], batch_size):
        batch_x = instances[idx : idx + batch_size, :]
        results = predict_values(batch_x, weights)
```
We initially assumed that once `predict_values` returns and the local variable `C` goes out of scope, its memory would be released (since no reference to `C` remains).
However, based on our observations, the memory occupied by `C` is not released.
Instead, memory usage grows linearly with each function call until Python's garbage collector reaches its threshold and eventually releases it.
The issue becomes more severe when `C` is large: the delayed release causes the system to consume all available memory (including swap), which in turn degrades the performance of matrix multiplication.
Upon investigation, we found that the cause of the memory issue is a circular reference in python-graphblas.

```python
class Matrix(BaseType):
    # SKIP
    def __new__(cls, dtype=FP64, nrows=0, ncols=0, *, name=None):
        # SKIP
        if backend == "suitesparse":
            self.ss = ss(self)
```
copied from [graphblas/core/matrix.py](https://github.com/python-graphblas/python-graphblas/blob/main/graphblas/core/matrix.py#L167-L203)
```python
class ss:
    __slots__ = "_parent", "config"

    def __init__(self, parent):
        self._parent = parent
        self.config = MatrixConfig(parent)
```
copied from [graphblas/core/ss/matrix.py](https://github.com/python-graphblas/python-graphblas/blob/main/graphblas/core/ss/matrix.py#L183-L188).
The simplified reference structure is as follows:
```text
Matrix M
 └── .ss (attribute of M)
      └── (references back to) M
```
This creates a circular reference between the `Matrix` object and its `.ss` attribute.
Since CPython relies primarily on reference counting to manage memory, the reference count of objects involved in a circular reference never drops to zero, even when local variables go out of scope (as in our simplified example).
As a result, these objects remain in memory until Python's cyclic garbage collector detects and frees them.
However, the garbage collector does not detect them immediately.
It runs only when an allocation-count threshold is reached, and that threshold counts objects rather than bytes, so a small number of very large matrices can accumulate before a collection is triggered.
Therefore, as long as the threshold is not reached, the unreleased memory may gradually consume all available memory, which is consistent with our observations.

There are several ways to address this memory issue:
* Call `gc.collect()` frequently to force the garbage collector to clean up circular references.
  However, this can introduce significant overhead and degrade performance.
* Manually break the reference in the `.ss` attribute (e.g., `C.ss = None` in our example) after the matrix is no longer needed. However, this is not a general or scalable solution, as it requires explicit intervention.
* Replace the strong reference with `weakref`, which avoids reference cycles entirely.

Would replacing the strong reference with `weakref` be a safe and effective solution in this case?

We were curious about the use of circular references in `python-graphblas`, so we investigated it further.
Currently, we know that some functions in the `Matrix` class depend on the `.ss` attribute, and that the `ss` object also needs to access information from the `Matrix` itself (e.g., `nrows` and `ncols`).
However, there might be alternative ways to support this interaction without the circular reference.
That said, we may be missing some context. Is there a specific reason or design requirement for using this circular reference structure?

Is it possible to refactor this and improve this memory behavior in `python-graphblas`?

Thanks
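
To make the cycle between `Matrix` and `.ss` concrete, here is a minimal sketch that reproduces the behavior with toy classes and compares it with the `weakref` option. It does not use python-graphblas: `Owner`, `StrongNamespace`, `WeakNamespace`, and `freed_without_gc` are hypothetical names introduced only for illustration, and the result assumes CPython's reference-counting behavior.

```python
import gc
import weakref


class StrongNamespace:
    """Toy stand-in for the `ss` helper: holds a strong reference to its parent."""

    __slots__ = ("_parent",)

    def __init__(self, parent):
        self._parent = parent


class WeakNamespace:
    """Same helper, but the parent is held through a weak reference."""

    __slots__ = ("_parent_ref",)

    def __init__(self, parent):
        self._parent_ref = weakref.ref(parent)

    @property
    def _parent(self):
        parent = self._parent_ref()
        if parent is None:
            raise ReferenceError("parent has already been freed")
        return parent


class Owner:
    """Toy stand-in for `Matrix`; `namespace_cls` is a hypothetical parameter."""

    def __init__(self, namespace_cls):
        self.ss = namespace_cls(self)


def freed_without_gc(namespace_cls):
    """Return True if an Owner is freed by reference counting alone."""
    gc_was_enabled = gc.isenabled()
    gc.disable()  # rule out the cyclic collector during the check
    try:
        obj = Owner(namespace_cls)
        probe = weakref.ref(obj)  # observes the object without keeping it alive
        del obj
        freed = probe() is None
    finally:
        gc.collect()  # clean up any cycle left behind
        if gc_was_enabled:
            gc.enable()
    return freed


strong_freed = freed_without_gc(StrongNamespace)
weak_freed = freed_without_gc(WeakNamespace)
print("strong back-reference freed immediately:", strong_freed)
print("weak back-reference freed immediately:", weak_freed)
```

One trade-off of the weak variant in this sketch: a namespace object that outlives its parent (for example, `ss = Owner(WeakNamespace).ss`) can no longer reach the parent, so accessing `_parent` raises `ReferenceError`. Whether that matters for the real `ss` class depends on how the library uses it, which this sketch does not cover.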