René's URL Explorer Experiment

Title: GitHub - tmc/llama.cpp at server-parallel

Description: Port of Facebook's LLaMA model in C/C++. Contribute to tmc/llama.cpp development by creating an account on GitHub.

Opengraph URL: https://github.com/tmc/llama.cpp

X: @github

Domain: patch-diff.githubusercontent.com

go-import	github.com/tmc/llama.cpp git https://github.com/tmc/llama.cpp.git
octolytics-dimension-user_id	3977
octolytics-dimension-user_login	tmc
octolytics-dimension-repository_id	730934953
octolytics-dimension-repository_nwo	tmc/llama.cpp
octolytics-dimension-repository_public	true
octolytics-dimension-repository_is_fork	true
octolytics-dimension-repository_parent_id	612354784
octolytics-dimension-repository_parent_nwo	ggml-org/llama.cpp
octolytics-dimension-repository_network_root_id	612354784
octolytics-dimension-repository_network_root_nwo	ggml-org/llama.cpp

Links:

tmc	https://patch-diff.githubusercontent.com/tmc
llama.cpp	https://patch-diff.githubusercontent.com/tmc/llama.cpp
ggml-org/llama.cpp	https://patch-diff.githubusercontent.com/ggml-org/llama.cpp
MIT license	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/master/LICENSE
1 star	https://patch-diff.githubusercontent.com/tmc/llama.cpp/stargazers
14.6k forks	https://patch-diff.githubusercontent.com/tmc/llama.cpp/forks
Branches	https://patch-diff.githubusercontent.com/tmc/llama.cpp/branches
Tags	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tags
Activity	https://patch-diff.githubusercontent.com/tmc/llama.cpp/activity
Code	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel
Pull requests 0	https://patch-diff.githubusercontent.com/tmc/llama.cpp/pulls
Actions	https://patch-diff.githubusercontent.com/tmc/llama.cpp/actions
Projects 0	https://patch-diff.githubusercontent.com/tmc/llama.cpp/projects
Security 0	https://patch-diff.githubusercontent.com/tmc/llama.cpp/security
Insights	https://patch-diff.githubusercontent.com/tmc/llama.cpp/pulse
1,334 Commits	https://patch-diff.githubusercontent.com/tmc/llama.cpp/commits/server-parallel/
.devops	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel/.devops
.github	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel/.github
ci	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel/ci
common	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel/common
docs	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel/docs
examples	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel/examples
gguf-py	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel/gguf-py
grammars	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel/grammars
media	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel/media
models	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel/models
pocs	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel/pocs
prompts	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel/prompts
scripts	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel/scripts
spm-headers	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel/spm-headers
tests	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel/tests
.clang-tidy	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/.clang-tidy
.dockerignore	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/.dockerignore
.ecrc	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/.ecrc
.editorconfig	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/.editorconfig
.flake8	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/.flake8
.gitignore	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/.gitignore
.pre-commit-config.yaml	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/.pre-commit-config.yaml
CMakeLists.txt	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/CMakeLists.txt
LICENSE	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/LICENSE
Makefile	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/Makefile
Package.swift	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/Package.swift
README.md	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/README.md
SHA256SUMS	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/SHA256SUMS
build.zig	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/build.zig
codecov.yml	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/codecov.yml
convert-baichuan-hf-to-gguf.py	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/convert-baichuan-hf-to-gguf.py
convert-falcon-hf-to-gguf.py	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/convert-falcon-hf-to-gguf.py
convert-gptneox-hf-to-gguf.py	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/convert-gptneox-hf-to-gguf.py
convert-llama-ggml-to-gguf.py	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/convert-llama-ggml-to-gguf.py
convert-lora-to-ggml.py	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/convert-lora-to-ggml.py
convert-refact-hf-to-gguf.py	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/convert-refact-hf-to-gguf.py
convert-starcoder-hf-to-gguf.py	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/convert-starcoder-hf-to-gguf.py
convert.py	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/convert.py
flake.lock	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/flake.lock
flake.nix	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/flake.nix
ggml-alloc.c	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/ggml-alloc.c
ggml-alloc.h	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/ggml-alloc.h
ggml-cuda.cu	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/ggml-cuda.cu
ggml-cuda.h	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/ggml-cuda.h
ggml-metal.h	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/ggml-metal.h
ggml-metal.m	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/ggml-metal.m
ggml-metal.metal	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/ggml-metal.metal
ggml-mpi.c	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/ggml-mpi.c
ggml-mpi.h	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/ggml-mpi.h
ggml-opencl.cpp	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/ggml-opencl.cpp
ggml-opencl.h	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/ggml-opencl.h
ggml.c	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/ggml.c
ggml.h	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/ggml.h
k_quants.c	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/k_quants.c
k_quants.h	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/k_quants.h
llama.cpp	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/llama.cpp
llama.h	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/llama.h
mypy.ini	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/mypy.ini
requirements.txt	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/requirements.txt
run_with_preset.py	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/run_with_preset.py
unicode.h	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/unicode.h
	https://user-images.githubusercontent.com/1991296/230134379-7181e485-c521-4d23-a0d6-f7b3b61ba524.png
	https://github.com/ggerganov/llama.cpp/actions
	https://opensource.org/licenses/MIT
Roadmap	https://github.com/users/ggerganov/projects/7
Project status	https://github.com/ggerganov/llama.cpp/discussions/3471
Manifesto	https://github.com/ggerganov/llama.cpp/discussions/205
ggml	https://github.com/ggerganov/ggml
LLaMA	https://arxiv.org/abs/2302.13971
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#hot-topics
#3401	https://github.com/ggerganov/llama.cpp/pull/3401
#3228	https://github.com/ggerganov/llama.cpp/pull/3228
Description	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#description
Usage	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#usage
Get the Code	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#get-the-code
Build	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#build
BLAS Build	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#blas-build
Prepare Data & Run	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#prepare-data--run
Memory/Disk Requirements	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#memorydisk-requirements
Quantization	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#quantization
Interactive mode	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#interactive-mode
Constrained output with grammars	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#constrained-output-with-grammars
Instruction mode with Alpaca	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#instruction-mode-with-alpaca
Using OpenLLaMA	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#using-openllama
Using GPT4All	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#using-gpt4all
Using Pygmalion 7B & Metharme 7B	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#using-pygmalion-7b--metharme-7b
Obtaining the Facebook LLaMA original model and Stanford Alpaca model data	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#obtaining-the-facebook-llama-original-model-and-stanford-alpaca-model-data
Verifying the model files	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#verifying-the-model-files
Seminal papers and background on the models	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#seminal-papers-and-background-on-the-models
Perplexity (measuring model quality)	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#perplexity-measuring-model-quality
Android	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#android
Docker	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#docker
Contributing	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#contributing
Coding guidelines	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#coding-guidelines
Docs	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#docs
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#description
hacked in an evening	https://github.com/ggerganov/llama.cpp/issues/33#issuecomment-1465108022
ggml	https://github.com/ggerganov/ggml
Alpaca	https://github.com/ggerganov/llama.cpp#instruction-mode-with-alpaca
GPT4All	https://github.com/ggerganov/llama.cpp#using-gpt4all
Chinese LLaMA / Alpaca	https://github.com/ymcui/Chinese-LLaMA-Alpaca
Chinese LLaMA-2 / Alpaca-2	https://github.com/ymcui/Chinese-LLaMA-Alpaca-2
Vigogne (French)	https://github.com/bofenghuang/vigogne
Vicuna	https://github.com/ggerganov/llama.cpp/discussions/643#discussioncomment-5533894
Koala	https://bair.berkeley.edu/blog/2023/04/03/koala/
OpenBuddy 🐶 (Multilingual)	https://github.com/OpenBuddy/OpenBuddy
Pygmalion 7B / Metharme 7B	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#using-pygmalion-7b--metharme-7b
WizardLM	https://github.com/nlpxucan/WizardLM
Baichuan-7B	https://huggingface.co/baichuan-inc/baichuan-7B
baichuan-7b-sft	https://huggingface.co/hiyouga/baichuan-7b-sft
Aquila-7B	https://huggingface.co/BAAI/Aquila-7B
AquilaChat-7B	https://huggingface.co/BAAI/AquilaChat-7B
Starcoder models	https://github.com/ggerganov/llama.cpp/pull/3187
Mistral AI v0.1	https://huggingface.co/mistralai/Mistral-7B-v0.1
abetlen/llama-cpp-python	https://github.com/abetlen/llama-cpp-python
go-skynet/go-llama.cpp	https://github.com/go-skynet/go-llama.cpp
withcatai/node-llama-cpp	https://github.com/withcatai/node-llama-cpp
hlhr202/llama-node	https://github.com/hlhr202/llama-node
yoshoku/llama_cpp.rb	https://github.com/yoshoku/llama_cpp.rb
mdrokz/rust-llama.cpp	https://github.com/mdrokz/rust-llama.cpp
SciSharp/LLamaSharp	https://github.com/SciSharp/LLamaSharp
donderom/llm4s	https://github.com/donderom/llm4s
phronmophobic/llama.clj	https://github.com/phronmophobic/llama.clj
mybigday/llama.rn	https://github.com/mybigday/llama.rn
kherud/java-llama.cpp	https://github.com/kherud/java-llama.cpp
nat/openplayground	https://github.com/nat/openplayground
oobabooga/text-generation-webui	https://github.com/oobabooga/text-generation-webui
withcatai/catai	https://github.com/withcatai/catai
whisper.cpp	https://github.com/ggerganov/whisper.cpp
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#usage
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#get-the-code
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#build
w64devkit	https://github.com/skeeto/w64devkit/releases
DRM in FreeBSD	https://wiki.freebsd.org/Graphics
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#metal-build
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#mpi-build
MPICH	https://www.mpich.org
OpenMPI	https://www.open-mpi.org
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#blas-build
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#accelerate-framework
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#openblas
w64devkit	https://github.com/skeeto/w64devkit/releases
OpenBLAS for Windows	https://github.com/xianyi/OpenBLAS/releases
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#blis
BLIS.md	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/docs/BLIS.md
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#intel-mkl
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#cublas
CUDA Toolkit	https://developer.nvidia.com/cuda-downloads
CUDA_VISIBLE_DEVICES	https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#hipblas
ROCm Quick Start (Linux)	https://rocm.docs.amd.com/en/latest/deploy/linux/quick_start.html
HIP_VISIBLE_DEVICES	https://rocm.docs.amd.com/en/latest/understand/gpu_isolation.html#hip-visible-devices
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#clblast
CLBlast	https://github.com/CNugteren/CLBlast
OpenCL SDK	https://github.com/KhronosGroup/OpenCL-SDK
OpenCL Releases	https://github.com/KhronosGroup/OpenCL-SDK/releases
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#installing-clblast
CLBlast Releases	https://github.com/CNugteren/CLBlast/releases
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#building-llama-with-clblast
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#running-llama-with-clblast
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#prepare-data--run
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#memorydisk-requirements
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#quantization
k-quants	https://github.com/ggerganov/llama.cpp/pull/1684
#2707	https://github.com/ggerganov/llama.cpp/pull/2707
#2807	https://github.com/ggerganov/llama.cpp/pull/2807
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#perplexity-measuring-model-quality
https://huggingface.co/docs/transformers/perplexity	https://huggingface.co/docs/transformers/perplexity
https://paperswithcode.com/dataset/wikitext-2	https://paperswithcode.com/dataset/wikitext-2
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#interactive-mode
README	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/examples/main/README.md
	https://user-images.githubusercontent.com/1991296/224575029-2af3c7dc-5a65-4f64-a6bb-517a532aea38.png
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#persistent-interaction
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#constrained-output-with-grammars
GBNF Guide	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/grammars/README.md
https://grammar.intrinsiclabs.ai/	https://grammar.intrinsiclabs.ai/
its repo	http://github.com/intrinsiclabsai/gbnfgen
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#instruction-mode-with-alpaca
OpenLLaMA	https://github.com/openlm-research/open_llama
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#using-openllama
3B	https://huggingface.co/openlm-research/open_llama_3b
7B	https://huggingface.co/openlm-research/open_llama_7b
13B	https://huggingface.co/openlm-research/open_llama_13b
GPT4All	https://github.com/nomic-ai/gpt4all
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#using-gpt4all
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#using-pygmalion-7b--metharme-7b
LLaMA weights	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#obtaining-the-facebook-llama-original-model-and-stanford-alpaca-model-data
Pygmalion 7B	https://huggingface.co/PygmalionAI/pygmalion-7b/
Metharme 7B	https://huggingface.co/PygmalionAI/metharme-7b
the latest HF convert script	https://github.com/huggingface/transformers/blob/main/src/transformers/models/llama/convert_llama_weights_to_hf.py
xor_codec	https://huggingface.co/PygmalionAI/pygmalion-7b/blob/main/xor_codec.py
bfloat16	https://en.wikipedia.org/wiki/Bfloat16_floating-point_format
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#obtaining-the-facebook-llama-original-model-and-stanford-alpaca-model-data
Facebook's LLaMA repository	https://github.com/facebookresearch/llama/pull/73/files
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#obtaining-and-using-the-facebook-llama-2-model
Facebook's LLaMA download page	https://ai.meta.com/resources/models-and-libraries/llama-downloads/
TheBloke	https://huggingface.co/TheBloke
LLaMA 2 7B base	https://huggingface.co/TheBloke/Llama-2-7B-GGUF
LLaMA 2 13B base	https://huggingface.co/TheBloke/Llama-2-13B-GGUF
LLaMA 2 70B base	https://huggingface.co/TheBloke/Llama-2-70B-GGUF
LLaMA 2 7B chat	https://huggingface.co/TheBloke/Llama-2-7B-chat-GGUF
LLaMA 2 13B chat	https://huggingface.co/TheBloke/Llama-2-13B-chat-GGUF
LLaMA 2 70B chat	https://huggingface.co/TheBloke/Llama-2-70B-chat-GGUF
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#verifying-the-model-files
sha256 checksums	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/SHA256SUMS
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#seminal-papers-and-background-on-the-models
Introducing LLaMA: A foundational, 65-billion-parameter large language model	https://ai.facebook.com/blog/large-language-model-llama-meta-ai/
LLaMA: Open and Efficient Foundation Language Models	https://arxiv.org/abs/2302.13971
Language Models are Few-Shot Learners	https://arxiv.org/abs/2005.14165
Aligning language models to follow instructions	https://openai.com/research/instruction-following
Training language models to follow instructions with human feedback	https://arxiv.org/abs/2203.02155
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#how-to-run
https://s3.amazonaws.com/research.metamind.io/wikitext/wikitext-2-raw-v1.zip?ref=salesforce-research	https://s3.amazonaws.com/research.metamind.io/wikitext/wikitext-2-raw-v1.zip?ref=salesforce-research
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#android
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#building-the-project-using-android-ndk
termux	https://termux.dev/
Android NDK	https://developer.android.com/ndk
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#building-the-project-using-termux-f-droid
https://github.com/CNugteren/CLBlast	https://github.com/CNugteren/CLBlast
https://www.reddit.com/r/termux/comments/kc3ynp/opencl_working_in_termux_more_in_comments/	https://www.reddit.com/r/termux/comments/kc3ynp/opencl_working_in_termux_more_in_comments/
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#docker
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#prerequisites
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#images
.devops/	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/.devops
.github/workflows/docker.yml	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/.github/workflows/docker.yml
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#usage-1
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#docker-with-cuda
nvidia-container-toolkit	https://github.com/NVIDIA/nvidia-container-toolkit
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#building-locally
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#usage-2
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#contributing
Inference at the edge	https://github.com/ggerganov/llama.cpp/discussions/205
Changelog podcast	https://changelog.com/podcast/532
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#coding-guidelines
good first issues	https://github.com/ggerganov/llama.cpp/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22
	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#docs
main	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/examples/main/README.md
server	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/examples/server/README.md
embd-input	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/examples/embd-input/README.md
jeopardy	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/examples/jeopardy/README.md
BLIS	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/docs/BLIS.md
Performance troubleshooting	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/docs/token_generation_performance_tips.md
GGML tips & tricks	https://github.com/ggerganov/llama.cpp/wiki/GGML-Tips-&-Tricks
GBNF grammars	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/server-parallel/grammars/README.md
Readme	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tree/server-parallel#readme-ov-file
MIT license	https://patch-diff.githubusercontent.com/tmc/llama.cpp/blob/master/LICENSE
Activity	https://patch-diff.githubusercontent.com/tmc/llama.cpp/activity
1 star	https://patch-diff.githubusercontent.com/tmc/llama.cpp/stargazers
0 watching	https://patch-diff.githubusercontent.com/tmc/llama.cpp/watchers
0 forks	https://patch-diff.githubusercontent.com/tmc/llama.cpp/forks
Releases	https://patch-diff.githubusercontent.com/tmc/llama.cpp/releases
1,071 tags	https://patch-diff.githubusercontent.com/tmc/llama.cpp/tags
Packages 0	https://patch-diff.githubusercontent.com/users/tmc/packages?repo_name=llama.cpp


URLs of crawlers that visited me.