llama.cpp native API provider #2
Labels
No labels
bug
commercial
documentation
duplicate
enhancement
feature
good first issue
help wanted
invalid
question
wontfix
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
jasoncouture/llama-shears#2
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Separate provider targeting
llama-server's native endpoints (/completion,/embedding, slot management) for the extra knobs OpenAI-compat hides — logprobs, multi-slot batching, finer-grained sampling controls.Sibling to the OpenAI-compatible provider; choose one or the other per agent based on which control surface the deployment needs.
Tracked in TASKS.md.
Closed by PR #43 — design pivoted from a separate
/completion//embeddingprovider to the OpenAI-compat provider withOpenAIProviderOptions.ExtraRequestParams(deep-merged into every request body). llama-server's native knobs (cache_prompt,slot_id,samplers,n_probs,min_p, etc.) round-trip via that field, so a single provider covers api.openai.com, llama-server, vLLM, LM Studio, and TabbyAPI. If a future need surfaces for the raw/completionendpoint specifically, reopen with the use case.