# merge API server example / module · Issue #2 · Stonelinks/llama-cpp-python

**Stonelinks** opened this issue on 2023-04-29 · 0 comments
Repository: Stonelinks/llama-cpp-python (fork of abetlen/llama-cpp-python)
https://github.com/Stonelinks/llama-cpp-python/issues/2
{"@context":"https://schema.org","@type":"DiscussionForumPosting","headline":"merge API server example / module","articleBody":"Currently there exist two server implementations:\r\n\r\n- `llama_cpp/server/__main__.py`, the module that's runnable by consumers of the library with `python3 -m llama_cpp.server`\r\n- `examples/high_level_api/fastapi_server.py`, which is probably a copy-pasted example by folks hacking around\r\n\r\nIMO this is confusing. As a new user of the library I see they've both been updated relatively recently but looking side-by-side there's a diff.\r\n\r\nThe one in the module seems better:\r\n- supports logits_all\r\n- supports use_mmap\r\n- has experimental cache support (with some mutex thing going on)\r\n- some stuff with streaming support was moved around more recently than fastapi_server.py\r\n\r\nSo IMO the example server should go away (perhaps just import the module's server and run it after #1 is done)\r\n\r\n\r\n\r\n","author":{"url":"https://github.com/Stonelinks","@type":"Person","name":"Stonelinks"},"datePublished":"2023-04-29T05:06:29.000Z","interactionStatistic":{"@type":"InteractionCounter","interactionType":"https://schema.org/CommentAction","userInteractionCount":0},"url":"https://github.com/2/llama-cpp-python/issues/2"}
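As an aside, for anyone puzzled by the "mutex thing" mentioned above: the pattern is presumably a lock serializing access to a shared model and its cache, since a single model instance isn't safe to call from concurrent request handlers. A rough illustration with hypothetical names (`_generate` stands in for the actual completion call; this is not the module's actual code):

```python
import threading

# Hypothetical stand-ins: the real module wraps a shared llama_cpp.Llama
# instance; a plain dict plus a Lock illustrates the guarded-cache pattern.
_cache: dict[str, str] = {}
_lock = threading.Lock()

def cached_completion(prompt: str) -> str:
    # Serialize access: the underlying model (and its cache) cannot be
    # shared across concurrent request handlers without a lock.
    with _lock:
        if prompt in _cache:
            return _cache[prompt]
        result = _generate(prompt)  # hypothetical call into the model
        _cache[prompt] = result
        return result

def _generate(prompt: str) -> str:
    # Placeholder for the actual completion call on the shared model.
    return f"(completion for {prompt!r})"
```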