Title: STIMP result mechanism · stumpy-dev/stumpy · Discussion #1043 · GitHub
Description: STIMP result mechanism
Opengraph URL: https://github.com/stumpy-dev/stumpy/discussions/1043
{"@context":"https://schema.org","@type":"QAPage","mainEntity":{"@type":"Question","name":"STIMP result mechanism","text":"Hi community,
I have a question about the STIMP class, which is an implementation of the SKIMP algorithm. To understand it better, I used a dataset from the official SKIMP datasets page, Termite DNA: https://sites.google.com/view/pan-matrix-profile/datasets
The dataset length is 16326.
- I ran `stimp` and got the `pmp` results.
- I checked the lengths of the outputs, `len(pmp.M_)` and `len(pmp.PAN_)`. They have the same length, which is what I expected.
- I wanted to check the `M_` values with `pmp.M_[:100]`. The largest value is 13060. Why is this value greater than half of the dataset length? It is close to the length of the dataset.
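The 13060 figure is consistent with a cap on the window size that depends on the exclusion zone rather than on half of the series length. As a rough sketch, assuming an exclusion zone of ceil(m / 4) (which I believe is stumpy's default, and which `stumpy.core.get_max_window_size` appears to encode), the largest usable m for n = 16326 can be worked out as below. The helper name `max_window_size` is made up for illustration and is not part of stumpy.

```python
def max_window_size(n, excl_zone_denom=4):
    # Hypothetical helper (not stumpy's API): largest window size m for a
    # series of length n such that a match outside the exclusion zone can
    # still exist, assuming the zone is ceil(m / excl_zone_denom).
    return n - (n + excl_zone_denom - 1) // (excl_zone_denom + 1) - 1


n = 16326  # length of the dataset in the question
print(max_window_size(n))  # 13060
```

Under this assumption the cap is roughly 80% of the series length, which would explain a value much larger than n / 2.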
- My other question: let's say I pick the first element of the `M_` array, `M_[0]`. As I understand it, this is a subsequence length. I then look up the corresponding row in `PAN_`. I would expect the number of values equal to 1 in that `PAN_` row to be `M_[0]*2`.
","upvoteCount":1,"answerCount":1,"acceptedAnswer":{"@type":"Answer","text":"\nI mean based on the 1000 length T that you have provided, why pmp._PAN[0] differs from m = 401 stump = stumpy.stump(T, m) stump[:,0]
\n
I think I see the problem. `pmp._PAN` stores the matrix profiles ordered from smallest `m` to largest `m`. However, the matrix profiles are NOT computed in this order. Instead, they are computed in breadth-first-search (BFS) order. Thus, to retrieve the matrix profiles in the correct BFS order, we must follow the procedure in the pan transformation function (shown in the permalink above), but this example should work:
```python
import numpy as np
import stumpy

T = np.random.rand(1000)
pmp = stumpy.stimp(T, percentage=1.0)
pmp.update()

# Retrieve the BFS indices for the matrix profiles that have already been processed
idx = pmp._bfs_indices[: pmp._n_processed]

# Retrieve the specific raw matrix profiles from the pan matrix profile and make a copy
# Note that `_PAN` is a private variable and should NEVER be accessed directly as there is no guarantee that it will not change!!
processed_mps = pmp._PAN[idx].copy()  # Don't do this!!

# Compute the first matrix profile by hand
m = pmp.M_[0]
mp = stumpy.stump(T, m)

# Compare `mp[:, 0]` and `processed_mps[0, :len(T)-m+1]`
print(np.allclose(mp[:, 0].astype(np.float64), processed_mps[0, : len(T) - m + 1]))
```
I think what you've helped me identify is that it is somewhat difficult to retrieve the raw matrix profiles from the `pmp` object. I will take a look at whether there is a more convenient way to do this. Perhaps it would make sense to add another class attribute, something like `pmp.P_`, that represents the raw matrix profiles ordered by BFS?
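To illustrate why the processing order differs from the sorted order, here is a minimal sketch of a level-order (BFS) traversal over the midpoints of a sorted list of window sizes. This is not stumpy's code: the function name `bfs_midpoint_order` is hypothetical, and the exact ordering that stumpy produces internally may differ in its details.

```python
from collections import deque


def bfs_midpoint_order(n):
    # Hypothetical illustration: visit indices 0..n-1 level by level,
    # taking the midpoint of a range first, then the midpoints of its
    # left and right halves, and so on.
    order = []
    queue = deque([(0, n - 1)])
    while queue:
        lo, hi = queue.popleft()
        if lo > hi:
            continue
        mid = (lo + hi) // 2
        order.append(mid)
        queue.append((lo, mid - 1))
        queue.append((mid + 1, hi))
    return order


window_sizes = list(range(3, 10))  # sorted window sizes 3..9
order = bfs_midpoint_order(len(window_sizes))
print(order)                             # [3, 1, 5, 0, 2, 4, 6]
print([window_sizes[i] for i in order])  # [6, 4, 8, 3, 5, 7, 9]
```

With an ordering like this, the first processed matrix profile belongs to a mid-range window size, so row 0 of an array sorted by `m` is generally not the first profile that was computed.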
","upvoteCount":0,"url":"https://github.com/stumpy-dev/stumpy/discussions/1043#discussioncomment-11177428"}}}