René's URL Explorer Experiment


Title: CUDA out of memory during inference with SQL-R1-14B · Issue #29 · DataArcTech/SQL-R1 · GitHub

Description: Hello!! I tried to inference SQL-R1-14B on 1*A100 80GB. No matter I set the gpu_memory_utilization=0.9\0.8\0.5, I always get the CUDA out of memory error. The SQL-R1-3B and 7B have all been successfully ran on my device. Besides, I can a...

Opengraph URL: https://github.com/DataArcTech/SQL-R1/issues/29

X: @github

Domain: patch-diff.githubusercontent.com


Hey, it has json ld scripts:
{"@context":"https://schema.org","@type":"DiscussionForumPosting","headline":"CUDA out of memory during inference with SQL-R1-14B","articleBody":"Hello!! I tried to inference SQL-R1-14B on 1*A100 80GB. No matter I set the gpu_memory_utilization=0.9\\0.8\\0.5, I always get the CUDA out of memory error. The SQL-R1-3B and 7B have all been successfully ran on my device. Besides, I can also run other models around 14B on my device. Do you have any ideas about this error? Thanks :)\n\nHere is the config log of vLLM:\n```\nInitializing an LLM engine (vdev) with config: model='MPX0222forHF/SQL-R1-14B', speculative_config=None, tokenizer='MPX0222forHF/SQL-R1-14B', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, override_neuron_config=None, rope_scaling=None, rope_theta=None, tokenizer_revision=None, trust_remote_code=True, dtype=torch.bfloat16, max_seq_len=8192, download_dir=None, load_format=LoadFormat.AUTO, tensor_parallel_size=1, pipeline_parallel_size=1, disable_custom_all_reduce=True, quantization=None, enforce_eager=True, kv_cache_dtype=auto, quantization_param_path=None, device_config=cuda, decoding_config=DecodingConfig(guided_decoding_backend='outlines'), observability_config=ObservabilityConfig(otlp_traces_endpoint=None, collect_model_forward_time=False, collect_model_execute_time=False), seed=0, served_model_name=MPX0222forHF/SQL-R1-14B, use_v2_block_manager=True, num_scheduler_steps=1, chunked_prefill_enabled=False multi_step_stream_outputs=True, enable_prefix_caching=False, use_async_output_proc=False, use_cached_outputs=False, mm_processor_kwargs=None)\n```\n\nHere is the error message:\n```\ntorch.OutOfMemoryError: CUDA out of memory. Tried to allocate 270.00 MiB. GPU 0 has a total capacity of 79.15 GiB of which 236.69 MiB is free. Process 2457479 has 60.40 GiB memory in use. Including non-PyTorch memory, this process has 18.51 GiB memory in use. Of the allocated memory 18.01 GiB is allocated by PyTorch, and 12.92 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)\n```","author":{"url":"https://github.com/thatmee","@type":"Person","name":"thatmee"},"datePublished":"2026-01-20T10:27:32.000Z","interactionStatistic":{"@type":"InteractionCounter","interactionType":"https://schema.org/CommentAction","userInteractionCount":1},"url":"https://github.com/DataArcTech/SQL-R1/issues/29"}
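
The error message quoted in the issue body already contains the figures needed for a rough memory budget. The sketch below is an illustration under stated assumptions, not a confirmed diagnosis: it parses those figures and compares the memory not held by the other process (PID 2457479) with an estimated bf16 weight footprint. The parameter count of 14e9 is an assumption read off the model name, and `to_gib` and `summarize` are hypothetical helper names.

```python
import re

# Error text copied from the issue body (first four sentences only).
ERROR = (
    "CUDA out of memory. Tried to allocate 270.00 MiB. "
    "GPU 0 has a total capacity of 79.15 GiB of which 236.69 MiB is free. "
    "Process 2457479 has 60.40 GiB memory in use. "
    "Including non-PyTorch memory, this process has 18.51 GiB memory in use."
)

# Assumption: nominal parameter count taken from the "14B" in the model name.
ASSUMED_PARAMS = 14e9
BYTES_PER_BF16 = 2  # dtype=torch.bfloat16 in the vLLM config log


def to_gib(value: str, unit: str) -> float:
    """Convert a '<number> <GiB|MiB>' pair to GiB."""
    return float(value) / 1024 if unit == "MiB" else float(value)


def summarize(message: str) -> dict:
    """Extract the memory figures PyTorch reports in an OOM message."""
    patterns = {
        "total_gib": r"total capacity of ([\d.]+) (GiB|MiB)",
        "free_gib": r"of which ([\d.]+) (GiB|MiB) is free",
        "other_process_gib": r"Process \d+ has ([\d.]+) (GiB|MiB) memory in use",
        "this_process_gib": r"this process has ([\d.]+) (GiB|MiB) memory in use",
    }
    return {
        key: to_gib(*re.search(pattern, message).groups())
        for key, pattern in patterns.items()
    }


budget = summarize(ERROR)
# Memory on the card that the other process is not holding.
left_for_this_process = budget["total_gib"] - budget["other_process_gib"]
# Estimated size of the bf16 weights alone (no KV cache, no activations).
weights_gib = ASSUMED_PARAMS * BYTES_PER_BF16 / 2**30

print(f"left for this process: {left_for_this_process:.2f} GiB")
print(f"estimated bf16 weights: {weights_gib:.2f} GiB")
print("weights fit:", weights_gib <= left_for_this_process)
```

Under these assumptions, the other process holds 60.40 GiB of the 79.15 GiB card, leaving about 18.75 GiB, while the bf16 weights alone would need roughly 26 GiB. If that reading is right, changing `gpu_memory_utilization` would likely not help, and identifying process 2457479 (for example with `nvidia-smi`) may be a more useful first step.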

Links:

CUDA out of memory during inference with SQL-R1-14Bhttps://patch-diff.githubusercontent.com/DataArcTech/SQL-R1/issues/29#top
thatmeehttps://github.com/thatmee
on Jan 20, 2026https://github.com/DataArcTech/SQL-R1/issues/29#issue-3833075541
