OpenAI / OpenAI-compatible provider #1

New issue

Closed

opened 2026-05-07 01:27:07 -04:00 by jasoncouture · 1 comment

jasoncouture commented

2026-05-07 01:27:07 -04:00

(Migrated from github.com)

Mirror LlamaShears.Provider.Ollama's shape for an OpenAI-compatible provider.

llama-server (ships with llama.cpp) exposes /v1/chat/completions, /v1/completions, /v1/embeddings, and /v1/models natively, so the same provider also covers any OpenAI-API-compatible local server (vLLM, LM Studio, TabbyAPI, etc.) — one provider, many backends.

Open questions:

Per-agent base URL (so an agent can target a specific local server) vs host-level default?
Tool-call surface — OpenAI's function-call schema vs Ollama's flatter shape; pick one and adapt the dispatcher, or carry both.

Tracked in TASKS.md.

Mirror `LlamaShears.Provider.Ollama`'s shape for an OpenAI-compatible provider. `llama-server` (ships with llama.cpp) exposes `/v1/chat/completions`, `/v1/completions`, `/v1/embeddings`, and `/v1/models` natively, so the same provider also covers any OpenAI-API-compatible local server (vLLM, LM Studio, TabbyAPI, etc.) — one provider, many backends. Open questions: - Per-agent base URL (so an agent can target a specific local server) vs host-level default? - Tool-call surface — OpenAI's function-call schema vs Ollama's flatter shape; pick one and adapt the dispatcher, or carry both. Tracked in [TASKS.md](../blob/main/TASKS.md).

jasoncouture commented

2026-05-07 20:41:05 -04:00

(Migrated from github.com)

Closed by PR #43 — LlamaShears.Provider.OpenAI ships against /v1/chat/completions with streaming, tool calls, multimodal, and reasoning-content support.

Closed by PR #43 — `LlamaShears.Provider.OpenAI` ships against `/v1/chat/completions` with streaming, tool calls, multimodal, and reasoning-content support.