René's URL Explorer Experiment


Title: Some of the step tasks have been OOM Killed. · Issue #189 · modAL-python/modAL · GitHub

Description: I am facing "oom_kill event in StepId=866679.batch. Some of the step tasks have been OOM Killed." while using avg_confidence strategy for my multilabel dataset with around 38000 images of size 224. I use torch Dataloader with batch size ...

Opengraph URL: https://github.com/modAL-python/modAL/issues/189

Issue body (from the page's JSON-LD metadata):

Headline: Some of the step tasks have been OOM Killed.
Author: shubhamgp47 (https://github.com/shubhamgp47)
Date published: 2024-07-27T13:07:40.000Z
Comments: 0
URL: https://github.com/modAL-python/modAL/issues/189

I am facing "oom_kill event in StepId=866679.batch. Some of the step tasks have been OOM Killed." while using the avg_confidence strategy on my multilabel dataset of around 38000 images of size 224. I use a torch Dataloader with batch size 8 to load the data. Here's a snippet of the code covering the Active Learning loop:

    n_queries = 14
    for i in range(n_queries):
        if i == 0:
            n_instances = 8
        else:
            power += 0.25
            n_instances = batch(int(np.ceil(np.power(10, power))), batch_size)
        total_samples += n_instances
        n_instances_list.append(total_samples)

        print(f"\nQuery {i + 1}: Requesting {n_instances} samples.")
        print(f"Number of samples in pool before query: {X_pool.shape[0]}")

        with torch.device("cpu"):
            query_idx, _ = learner.query(X_pool, n_instances=n_instances)
            query_idx = np.unique(query_idx)
            query_idx = np.array(query_idx).flatten()

        # Extract the samples based on the query indices
        X_query = X_pool[query_idx]
        y_query = y_pool[query_idx]
        filenames_query = [filenames_pool[idx] for idx in query_idx]

        print("Shape of X_query after indexing:", X_query.shape)

        if X_query.ndim != 4:
            raise ValueError(f"Unexpected number of dimensions in X_query: {X_query.ndim}")
        if X_query.shape[1:] != (224, 224, 3):
            raise ValueError(f"Unexpected shape in X_query dimensions: {X_query.shape}")

        X_cumulative = np.vstack((X_cumulative, X_query))
        y_cumulative = np.vstack((y_cumulative, y_query))
        filenames_cumulative.extend(filenames_query)

        save_checkpoint(i + 1, X_cumulative, y_cumulative, filenames_cumulative, save_dir)

        learner.teach(X=X_cumulative, y=y_cumulative)

        y_pred = learner.predict(X_test_np)
        accuracy = accuracy_score(y_test_np, y_pred)
        f1 = f1_score(y_test_np, y_pred, average='macro')
        acc_test_data.append(accuracy)
        f1_test_data.append(f1)

        print(f"Accuracy after query {i + 1}: {accuracy}")
        print(f"F1 Score after query {i + 1}: {f1}")

        # Early stopping check
        if f1 > best_f1_score:
            best_f1_score = f1
            wait = 0  # reset the wait counter
        else:
            wait += 1  # increment the wait counter
            if wait >= patience:
                print("Stopping early due to no improvement in F1 score.")
                break

        # Remove queried instances from the pool
        X_pool = np.delete(X_pool, query_idx, axis=0)
        y_pool = np.delete(y_pool, query_idx, axis=0)
        filenames_pool = [filename for idx, filename in enumerate(filenames_pool) if idx not in query_idx]
        print(f"Number of samples in pool after query: {X_pool.shape[0]}")

This code runs fine for 11 iterations, but in the 12th iteration I get the OOM kill error.

I am using an A100 GPU with 40 GB of memory, which should be sufficient for this loop. Could you please help me identify what is going wrong and causing the excessive memory requirement? Is there a bottleneck in my code that I should address? Could it be that the data from every iteration is held in main memory, and can it be freed somehow without breaking the code or distorting the results?
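
The figures quoted in the issue (around 38000 images, shape 224 x 224 x 3) allow a rough estimate of how much host RAM the pool array alone may need. The sketch below is only an illustration: the helper name `array_gib` is hypothetical, and the dtype of `X_pool` is not stated in the issue, so several common dtypes are shown as assumptions. It also demonstrates that `np.delete` returns a new array rather than shrinking the original in place, so the old and new pool can briefly coexist in memory; `np.vstack` likewise builds a new array. An "oom_kill event" reported by a job scheduler typically concerns the job's host memory limit rather than GPU memory, which is why host-side array sizes may matter here.

```python
import numpy as np


def array_gib(n_images, height=224, width=224, channels=3, dtype=np.float32):
    """Size in GiB of one dense array holding n_images images (hypothetical helper)."""
    n_bytes = n_images * height * width * channels * np.dtype(dtype).itemsize
    return n_bytes / 2**30


# The dtype of X_pool is an assumption; the issue does not state it.
for dtype in (np.uint8, np.float32, np.float64):
    size = array_gib(38000, dtype=dtype)
    print(f"38000 images as {np.dtype(dtype).name}: {size:.2f} GiB")

# np.delete allocates a new array, so the original pool and the reduced pool
# exist at the same time until the old reference is released.
small_pool = np.zeros((10, 4, 4, 3), dtype=np.float32)
reduced = np.delete(small_pool, [0, 3], axis=0)
print("reduced shape:", reduced.shape)
print("shares memory with original:", np.shares_memory(small_pool, reduced))
```

If the estimate for the actual dtype is a large fraction of the memory granted to the job, the temporary copies made each iteration could plausibly push the process over the limit; whether that is the cause in this case cannot be confirmed from the issue alone.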
