# CUDA out of memory during inference with SQL-R1-14B

Issue #29 · DataArcTech/SQL-R1 — https://github.com/DataArcTech/SQL-R1/issues/29
Opened by **thatmee** on 2026-01-20

Hello! I tried to run inference with SQL-R1-14B on a single A100 80GB. No matter whether I set `gpu_memory_utilization` to 0.9, 0.8, or 0.5, I always get a CUDA out-of-memory error. SQL-R1-3B and 7B both run successfully on my device, and I can also run other models of around 14B on it. Do you have any idea what causes this error? Thanks :)

Here is the vLLM config log:

```
Initializing an LLM engine (vdev) with config: model='MPX0222forHF/SQL-R1-14B', speculative_config=None, tokenizer='MPX0222forHF/SQL-R1-14B', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, override_neuron_config=None, rope_scaling=None, rope_theta=None, tokenizer_revision=None, trust_remote_code=True, dtype=torch.bfloat16, max_seq_len=8192, download_dir=None, load_format=LoadFormat.AUTO, tensor_parallel_size=1, pipeline_parallel_size=1, disable_custom_all_reduce=True, quantization=None, enforce_eager=True, kv_cache_dtype=auto, quantization_param_path=None, device_config=cuda, decoding_config=DecodingConfig(guided_decoding_backend='outlines'), observability_config=ObservabilityConfig(otlp_traces_endpoint=None, collect_model_forward_time=False, collect_model_execute_time=False), seed=0, served_model_name=MPX0222forHF/SQL-R1-14B, use_v2_block_manager=True, num_scheduler_steps=1, chunked_prefill_enabled=False, multi_step_stream_outputs=True, enable_prefix_caching=False, use_async_output_proc=False, use_cached_outputs=False, mm_processor_kwargs=None)
```

Here is the error message:

```
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 270.00 MiB. GPU 0 has a total capacity of 79.15 GiB of which 236.69 MiB is free. Process 2457479 has 60.40 GiB memory in use. Including non-PyTorch memory, this process has 18.51 GiB memory in use. Of the allocated memory 18.01 GiB is allocated by PyTorch, and 12.92 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
```
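One reading of the figures in that traceback (my own interpretation, not confirmed in the thread): process 2457479 is a *different* process already holding 60.40 GiB of GPU 0, so the vLLM process never sees more than roughly 18.75 GiB of the 80 GB card, and `gpu_memory_utilization` cannot help because vLLM's budget is a fraction of memory that is simply not free. A quick sanity check of the reported numbers:

```python
# Figures copied verbatim from the torch.OutOfMemoryError message above.
total_gib = 79.15          # GPU 0 total capacity
other_process_gib = 60.40  # held by process 2457479 (a different process)
this_process_gib = 18.51   # this vLLM process, including non-PyTorch memory

# Memory the vLLM process could ever get on this card:
available_gib = total_gib - other_process_gib
print(f"available to vLLM: {available_gib:.2f} GiB")   # 18.75 GiB

# bfloat16 weights of a 14B-parameter model, before KV cache or activations:
weights_gib = 14e9 * 2 / 2**30   # 2 bytes per parameter
print(f"bf16 weights alone: {weights_gib:.1f} GiB")    # ~26.1 GiB
```

On this reading the 14B model fails where 3B and 7B succeed simply because their bf16 weights (~6 GiB and ~14 GiB) fit in the leftover ~18.75 GiB while 14B's ~26 GiB do not.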
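If the diagnosis above is right, possible checks before retrying (a sketch assuming a standard CUDA install with `nvidia-smi` available; these commands are my suggestion, not from the repo):

```shell
# 1. See which processes hold GPU memory (PID 2457479 appears in the error):
nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv

# 2. If another job owns GPU 0, kill it or run on a free GPU instead, e.g.:
export CUDA_VISIBLE_DEVICES=1

# 3. The traceback's own suggestion for fragmentation (only relevant when
#    "reserved but unallocated" is large; here it is only 12.92 MiB):
export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
```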