René's URL Explorer Experiment


Title: GitHub - gpu-mode/resource-stream: GPU programming related news and material links

Description: GPU programming related news and material links. Contribute to gpu-mode/resource-stream development by creating an account on GitHub.

Opengraph URL: https://github.com/gpu-mode/resource-stream

X: @github

Domain: patch-diff.githubusercontent.com

Links:

gpu-mode https://patch-diff.githubusercontent.com/gpu-mode
resource-streamhttps://patch-diff.githubusercontent.com/gpu-mode/resource-stream
discord.gg/gpumodehttps://discord.gg/gpumode
MIT license https://patch-diff.githubusercontent.com/gpu-mode/resource-stream/blob/main/LICENSE
2k stars https://patch-diff.githubusercontent.com/gpu-mode/resource-stream/stargazers
113 forks https://patch-diff.githubusercontent.com/gpu-mode/resource-stream/forks
Branches https://patch-diff.githubusercontent.com/gpu-mode/resource-stream/branches
Tags https://patch-diff.githubusercontent.com/gpu-mode/resource-stream/tags
Activity https://patch-diff.githubusercontent.com/gpu-mode/resource-stream/activity
Code https://patch-diff.githubusercontent.com/gpu-mode/resource-stream
Issues 1 https://patch-diff.githubusercontent.com/gpu-mode/resource-stream/issues
Pull requests 1 https://patch-diff.githubusercontent.com/gpu-mode/resource-stream/pulls
Actions https://patch-diff.githubusercontent.com/gpu-mode/resource-stream/actions
Projects 0 https://patch-diff.githubusercontent.com/gpu-mode/resource-stream/projects
Security 0 https://patch-diff.githubusercontent.com/gpu-mode/resource-stream/security
Insights https://patch-diff.githubusercontent.com/gpu-mode/resource-stream/pulse
107 Commitshttps://patch-diff.githubusercontent.com/gpu-mode/resource-stream/commits/main/
.gitignorehttps://patch-diff.githubusercontent.com/gpu-mode/resource-stream/blob/main/.gitignore
LICENSEhttps://patch-diff.githubusercontent.com/gpu-mode/resource-stream/blob/main/LICENSE
README.mdhttps://patch-diff.githubusercontent.com/gpu-mode/resource-stream/blob/main/README.md
https://patch-diff.githubusercontent.com/gpu-mode/resource-stream#gpu-mode-resource-stream
https://discord.gg/gpumodehttps://discord.gg/gpumode
Tritonhttps://triton-lang.org
How to contributehttps://patch-diff.githubusercontent.com/gpu-mode/resource-stream#how-to-contribute
https://patch-diff.githubusercontent.com/gpu-mode/resource-stream#lectures--reading-group-live-sessions
discord serverhttps://discord.gg/gpumode
YouTube channelhttps://www.youtube.com/@GPUMODE
lectureshttps://github.com/gpu-mode/lectures
https://patch-diff.githubusercontent.com/gpu-mode/resource-stream#1st-contact-with-cuda
An Easy Introduction to CUDA C and C++https://developer.nvidia.com/blog/easy-introduction-cuda-c-and-c/
An Even Easier Introduction to CUDAhttps://developer.nvidia.com/blog/even-easier-introduction-cuda/
CUDA Toolkit Documentation https://docs.nvidia.com/cuda/
Accelerated Computing Hubhttps://github.com/NVIDIA/accelerated-computing-hub/
Wiki: Thread Blockhttps://en.wikipedia.org/wiki/Thread_block_(CUDA_programming)
A tour of CUDAhttps://tbetcke.github.io/hpc_lecture_notes/cuda_introduction.html
GPU Performance Background User's Guidehttps://docs.nvidia.com/deeplearning/performance/dl-performance-gpu-background/index.html
OLCF NVIDIA CUDA Training Serieshttps://www.olcf.ornl.gov/cuda-training-series/
exerciseshttps://github.com/olcf/cuda-training-series
GTC 2022 - CUDA: New Features and Beyond - Stephen Joneshttps://www.youtube.com/watch?v=SAm4gwkj2Ko
Writing Code That Runs FAST on a GPUhttps://youtu.be/8sDg-lD1fZQ
Introduction of CUDA and writing kernels in CUDAhttps://www.youtube.com/watch?v=86FAWCzIe_4
https://patch-diff.githubusercontent.com/gpu-mode/resource-stream#2nd-contact
CUDA Refresherhttps://developer.nvidia.com/blog/tag/cuda-refresher/
https://patch-diff.githubusercontent.com/gpu-mode/resource-stream#hazy-research
Building Blocks for AI Systemshttps://github.com/HazyResearch/aisys-building-blocks
Data-Centric AIhttps://github.com/HazyResearch/data-centric-ai
Bloghttps://hazyresearch.stanford.edu/blog
ThunderKittenshttps://hazyresearch.stanford.edu/blog/2024-05-12-tk
Systems for Foundation Models, and Foundation Models for Systemshttps://neurips.cc/virtual/2023/invited-talk/73990
https://patch-diff.githubusercontent.com/gpu-mode/resource-stream#papers-case-studies
A Case Study in CUDA Kernel Fusion: Implementing FlashAttention-2 on NVIDIA Hopper Architecture using the CUTLASS Libraryhttps://arxiv.org/abs/2312.11918
How to Optimize a CUDA Matmul Kernel for cuBLAS-like Performance: a Workloghttps://siboehm.com/articles/22/CUDA-MMM
Anatomy of high-performance matrix multiplicationhttps://dl.acm.org/doi/10.1145/1356052.1356053
https://patch-diff.githubusercontent.com/gpu-mode/resource-stream#books
Programming Massively Parallel Processors: A Hands-on Approachhttps://www.amazon.com/Programming-Massively-Parallel-Processors-Hands/dp/0323912311
CUDA by Example: An Introduction to General-Purpose GPU Programminghttps://edoras.sdsu.edu/~mthomas/docs/cuda/cuda_by_example.book.pdf
codehttps://github.com/tpn/cuda-by-example
The CUDA Handbookhttps://www.cudahandbook.com/
The Book of Shadershttps://thebookofshaders.com/
Art of HPChttps://theartofhpc.com/
https://patch-diff.githubusercontent.com/gpu-mode/resource-stream#cuda-courses
HetSys: Programming Heterogeneous Computing Systems with GPUs and other Acceleratorshttps://safari.ethz.ch/projects_and_seminars/fall2022/doku.php?id%253Dheterogeneous_systems
Heterogeneous Parallel Programming Classhttps://www.youtube.com/playlist?list=PLzn6LN6WhlN06hIOA_ge6SrgdeSiuf9Tb
Official YouTube channel for "Programming Massively Parallel Processors: A Hands-on Approach"https://www.youtube.com/@pmpp-book
Applied Parallel Programminghttps://www.youtube.com/playlist?list=PLRRuQYjFhpmvu5ODQoY2l7D0ADgWEcYAX
Programming Parallel Computershttps://ppc-exercises.cs.aalto.fi/courses
Open Course Versionhttps://ppc-exercises.cs.aalto.fi/course/open2024a
https://patch-diff.githubusercontent.com/gpu-mode/resource-stream#cuda-grandmasters
https://patch-diff.githubusercontent.com/gpu-mode/resource-stream#tri-dao
@tri_daohttps://twitter.com/tri_dao
tridaohttps://github.com/tridao
Dao-AILab/flash-attentionhttps://github.com/Dao-AILab/flash-attention
paperhttps://arxiv.org/abs/2205.14135
state-spaces/mambahttps://github.com/state-spaces/mamba
Mamba: Linear-Time Sequence Modeling with Selective State Spaceshttps://arxiv.org/abs/2312.00752
mamba-minimalhttps://github.com/johnma2006/mamba-minimal
https://patch-diff.githubusercontent.com/gpu-mode/resource-stream#tim-dettmers
@Tim_Dettmershttps://twitter.com/Tim_Dettmers
TimDettmershttps://github.com/TimDettmers
TimDettmers/bitsandbyteshttps://github.com/TimDettmers/bitsandbytes
docshttps://bitsandbytes.readthedocs.io/en/latest/
QLoRA: Efficient Finetuning of Quantized LLMshttps://arxiv.org/abs/2305.14314
https://patch-diff.githubusercontent.com/gpu-mode/resource-stream#sasha-rush
@srush_nlphttps://twitter.com/srush_nlp
srushhttps://github.com/srush
Sasha Rush's GPU Puzzleshttps://github.com/srush/GPU-Puzzles
CUDA C++ versionhttps://github.com/dshah3/GPU-Puzzles
walkthrough videohttps://www.youtube.com/watch?v=3frRR6fycgM
Mamba: The Hard Wayhttps://srush.github.io/annotated-mamba/hard.html
srush/annotated-mambahttps://github.com/srush/annotated-mamba
https://patch-diff.githubusercontent.com/gpu-mode/resource-stream#practice
Adnan Aziz and Anupam Bhatnagar GPU Puzzlershttp://www.gpupuzzlers.com/
https://patch-diff.githubusercontent.com/gpu-mode/resource-stream#pytorch-performance-optimization
Accelerating Generative AI with PyTorch: Segment Anything, Fasthttps://pytorch.org/blog/accelerating-generative-ai/
Accelerating Generative AI with PyTorch II: GPT, Fasthttps://pytorch.org/blog/accelerating-generative-ai-2/
Speed, Python: Pick Two. How CUDA Graphs Enable Fast Python Code for Deep Learninghttps://blog.fireworks.ai/speed-python-pick-two-how-cuda-graphs-enable-fast-python-code-for-deep-learning-353bf6241248
Performance Debugging of Production PyTorch Models at Metahttps://pytorch.org/blog/performance-debugging-of-production-pytorch-models-at-meta/
https://patch-diff.githubusercontent.com/gpu-mode/resource-stream#pytorch-internals--debugging
TorchDynamo Deep Divehttps://pytorch.org/docs/stable/torch.compiler_dynamo_overview.html
PyTorch Compiler Troubleshootinghttps://github.com/pytorch/pytorch/blob/main/docs/source/torch.compiler_troubleshooting.rst
PyTorch internalshttp://blog.ezyang.com/2019/05/pytorch-internals/
PyTorch 2 internalshttps://drive.google.com/file/d/1XBox0G3FI-71efQQjmqGh0-VkCd-AHPL/view
1: Visualizing All Allocations over Timehttps://pytorch.org/blog/understanding-gpu-memory-1/
2: Finding and Removing Reference Cycleshttps://pytorch.org/blog/understanding-gpu-memory-2/
Debugging PyTorch memory use with snapshotshttps://zdevito.github.io/2022/08/16/memory-snapshots.html
https://zdevito.github.io/2022/08/04/cuda-caching-allocator.htmlhttps://zdevito.github.io/2022/08/04/cuda-caching-allocator.html
PyTorch Trace Analysis for the Masseshttps://pytorch.org/blog/trace-analysis-for-masses/
Holistic Trace Analysis (HTA)https://hta.readthedocs.io/en/latest/
facebookresearch/HolisticTraceAnalysishttps://github.com/facebookresearch/HolisticTraceAnalysis
https://patch-diff.githubusercontent.com/gpu-mode/resource-stream#code--libs
NVIDIA/cutlasshttps://github.com/NVIDIA/cutlass
https://patch-diff.githubusercontent.com/gpu-mode/resource-stream#essentials
Triton compiler tutorialshttps://triton-lang.org/main/getting-started/tutorials/index.html
CUDA C++ Programming Guidehttps://docs.nvidia.com/cuda/cuda-c-programming-guide/
PyTorch: Custom C++ and CUDA Extensionshttps://pytorch.org/tutorials/advanced/cpp_extension.html
pytorch/extension-cpphttps://github.com/pytorch/extension-cpp/tree/master
PyTorch C++ APIhttps://pytorch.org/cppdocs/index.html
pybind11 documentationhttps://pybind11.readthedocs.io/en/stable/
NVIDIA Tensor Core Programminghttps://leimao.github.io/blog/NVIDIA-Tensor-Core-Programming/
GPU Programming: When, Why and How?https://enccs.github.io/gpu-programming/
How GPU Computing Works | GTC 2021https://youtu.be/3l10o0DYJXg?si=t5FHswnibAbo3s0t
How CUDA Programming Works | GTC 2022https://youtu.be/n6M8R8-PlnE?si=cJ4dWtpYaPoIuJ0q
CUDA Kernel optimization Part 1https://www.youtube.com/watch?v=hOi3NWOPVR8
Part 2https://www.youtube.com/watch?v=NrWhZMHrP4w
PTX and ISA Programming Guidehttps://docs.nvidia.com/cuda/parallel-thread-execution/index.html
div 256 -> shr 8 examplehttps://godbolt.org/z/odb3191vK
https://patch-diff.githubusercontent.com/gpu-mode/resource-stream#profiling
Nsight Compute Profiling Guidehttps://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html
mcarilli/nsight.shhttps://gist.github.com/mcarilli/376821aa1a7182dfcf59928a7cde3223
Profiling GPU Applications with Nsight Systemshttps://www.youtube.com/watch?v=kKANP0kL_hk
https://patch-diff.githubusercontent.com/gpu-mode/resource-stream#python-gpu-computing
PyTorchhttps://pytorch.org/
Tritonhttps://triton-lang.org/main/index.html
openai/tritonhttps://github.com/openai/triton/
numba @cuda.jithttps://numba.readthedocs.io/en/stable/cuda/kernels.html
Apache TVMhttps://tvm.apache.org/
JAX Pallashttps://jax.readthedocs.io/en/latest/pallas/index.html
CuPyhttps://cupy.dev/
NVIDIA Fuserhttps://github.com/NVIDIA/Fuser/
Codon @gpu.kernelhttps://docs.exaloop.io/codon/advanced/gpu
exaloop/codonhttps://github.com/exaloop/codon
Mojohttps://docs.modular.com/mojo/manual/
MAX Platformhttps://www.modular.com/max
Modularhttps://www.modular.com
CUDA Pythonhttps://github.com/NVIDIA/cuda-python
cuDNN FrontEnd (FE) APIhttps://github.com/NVIDIA/cudnn-frontend
CUTLASS Python Interfacehttps://github.com/NVIDIA/cutlass/tree/main/python
https://patch-diff.githubusercontent.com/gpu-mode/resource-stream#advanced-topics-research-compilers
TACOhttp://tensor-compiler.org/
tensor-compiler/tacohttps://github.com/tensor-compiler/taco
Mosaic compilerhttps://github.com/manya-bansal/mosaic
paperhttps://dl.acm.org/doi/10.1145/3591236
presentationhttps://aha.stanford.edu/mosaic-interoperable-compiler-tensor-algebra
https://patch-diff.githubusercontent.com/gpu-mode/resource-stream#news
SemiAnalysishttps://www.semianalysis.com/
https://patch-diff.githubusercontent.com/gpu-mode/resource-stream#technical-blog-posts
Cooperative Groups: Flexible CUDA Thread Programminghttps://developer.nvidia.com/blog/cooperative-groups/
A friendly introduction to machine learning compilers and optimizershttps://huyenchip.com/2021/09/07/a-friendly-introduction-to-machine-learning-compilers-and-optimizers.html
https://patch-diff.githubusercontent.com/gpu-mode/resource-stream#hardware-architecture
NVIDIA H100 Whitepaperhttps://resources.nvidia.com/en-us-tensor-core/gtc22-whitepaper-hopper
NVIDIA GH200 Whitepaperhttps://resources.nvidia.com/en-us-grace-cpu/nvidia-grace-hopper
AMD CDNA 3 Whitepaperhttps://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/white-papers/amd-cdna-3-white-paper.pdf
AMD MI300X Data Sheethttps://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/data-sheets/amd-instinct-mi300x-data-sheet.pdf
Can SRAM Keep Shrinking?https://youtu.be/2G4_RZo41Zw
Asianometryhttps://www.asianometry.com/
https://patch-diff.githubusercontent.com/gpu-mode/resource-stream#gpu-mode-community-projects
https://patch-diff.githubusercontent.com/gpu-mode/resource-stream#ring-attention
ring-attentionhttps://github.com/gpu-mode/ring-attention
https://patch-diff.githubusercontent.com/gpu-mode/resource-stream#pscan
Parallel Prefix Sum (Scan) with CUDAhttps://developer.nvidia.com/gpugems/gpugems3/part-vi-gpu-computing/chapter-39-parallel-prefix-sum-scan-cuda
PDF version (2007)https://developer.download.nvidia.com/compute/cuda/1.1-Beta/x86_website/projects/scan/doc/scan.pdf
Stack Overflowhttps://stackoverflow.com/a/30835030/387870
mattdean1/cudahttps://github.com/mattdean1/cuda
Accelerating Reduction and Scan Using Tensor Core Unitshttps://arxiv.org/abs/1811.09736
Prefix Sumshttps://docs.nvidia.com/cuda/thrust/index.html#prefix-sums
scan variantshttps://thrust.github.io/doc/group__prefixsums.html
CUBhttps://nvlabs.github.io/cub/
NVIDIA/cccl/tree/main/cubhttps://github.com/NVIDIA/cccl/tree/main/cub
Higher-Order and Tuple-Based Massively-Parallel Prefix Sumshttps://userweb.cs.txstate.edu/~mb92/papers/pldi16.pdf
Single-pass Parallel Prefix Scan with Decoupled Look-backhttps://research.nvidia.com/publication/2016-03_single-pass-parallel-prefix-scan-decoupled-look-back
johnryan465/pscanhttps://github.com/johnryan465/pscan
andreaskoepf/pscan_kernelhttps://github.com/andreaskoepf/pscan_kernel
https://patch-diff.githubusercontent.com/gpu-mode/resource-stream#triton-kernels--examples
unslothhttps://github.com/unslothai/unsloth
linkhttps://github.com/pytorch-labs/segment-anything-fast/blob/main/segment_anything_fast/flash_4.py
flash_attn_triton.pyhttps://github.com/Dao-AILab/flash-attention/blob/main/flash_attn/flash_attn_triton.py
Triton Conference 2023https://www.youtube.com/watch?v=ZGU0Yw7mORE&list=PLc_vA1r0qoiRZfUC3o4_yjj0FtWvodKAz
LightLLMhttps://github.com/ModelTC/lightllm
https://patch-diff.githubusercontent.com/gpu-mode/resource-stream#how-to-contribute
editing fileshttps://docs.github.com/en/repositories/working-with-files/managing-files/editing-files
https://discord.gg/gpumodehttps://discord.gg/gpumode
46 watchinghttps://patch-diff.githubusercontent.com/gpu-mode/resource-stream/watchers
Contributors 18https://patch-diff.githubusercontent.com/gpu-mode/resource-stream/graphs/contributors
https://github.com/andreaskoepf
https://github.com/msaroufim
https://github.com/usainzg
https://github.com/ericauld
https://github.com/t-vi
https://github.com/megaserg
https://github.com/seemanne
https://github.com/melvinebenezer
https://github.com/ngc92
https://github.com/RicardoHS
https://github.com/philipbutler
https://github.com/mory91
https://github.com/jrhemstad
https://github.com/shreyansh26
+ 4 contributorshttps://patch-diff.githubusercontent.com/gpu-mode/resource-stream/graphs/contributors
