# Ollama-OpenAI Proxy
This is a Go-based proxy server that enables applications designed to work with the Ollama API to interact seamlessly with an OpenAI-compatible endpoint. It translates and forwards requests and responses between the two APIs while applying custom transformations to the model names and data formats.
Note: This is a pet project I use to forward requests to LiteLLM for use with Kerlig and Enchanted, which doesn’t support custom OpenAI endpoints. As this is a personal project, there might be issues and rough edges. Contributions and feedback are welcome!
## Features
**Endpoint Proxying:**
- `/v1/models` & `/v1/completions`: These endpoints are forwarded directly to the downstream OpenAI-compatible server.
- `/api/tags`: Queries the downstream `/v1/models` endpoint, transforms the response into the Ollama-style model list, and appends `:proxy` to model names if they don't already contain a colon.
- `/api/chat`: Rewrites the request to the downstream `/v1/chat/completions` endpoint. It intercepts and transforms streaming NDJSON responses from the OpenAI format into the expected Ollama format, including stripping any trailing `:proxy` from model names.
- `/api/pull` and other unknown endpoints: Forwarded to a local Ollama instance running on `127.0.0.1:11505`.
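
For orientation, this dispatch can be expressed as a single handler that switches on the request path. The following is a minimal sketch, not the project's actual code; the `handleTags` and `handleChat` helpers are hypothetical stand-ins for the transformation logic described above:

```go
// Illustrative sketch of the path-based dispatch; names and wiring are
// hypothetical and not taken from the project's source.
package main

import (
	"net/http"
	"net/http/httputil"
	"net/url"
)

func main() {
	target, _ := url.Parse("http://127.0.0.1:4000")  // OpenAI-compatible downstream
	ollama, _ := url.Parse("http://127.0.0.1:11505") // local Ollama fallback

	downstream := httputil.NewSingleHostReverseProxy(target)
	local := httputil.NewSingleHostReverseProxy(ollama)

	http.ListenAndServe(":11434", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		switch r.URL.Path {
		case "/v1/models", "/v1/completions":
			downstream.ServeHTTP(w, r) // forwarded directly
		case "/api/tags":
			handleTags(w, r) // hypothetical: query /v1/models, build Ollama-style list
		case "/api/chat":
			handleChat(w, r) // hypothetical: rewrite to /v1/chat/completions, transform stream
		default:
			local.ServeHTTP(w, r) // e.g. /api/pull and anything unrecognized
		}
	}))
}

// Stubs so the sketch compiles; the real proxy implements these transformations.
func handleTags(w http.ResponseWriter, r *http.Request) {
	http.Error(w, "not implemented", http.StatusNotImplemented)
}

func handleChat(w http.ResponseWriter, r *http.Request) {
	http.Error(w, "not implemented", http.StatusNotImplemented)
}
```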
**Debug Logging:** When running in debug mode, the proxy logs:
- Every incoming request.
- The outgoing `/api/chat` payload.
- Raw downstream streaming chunks and their transformed equivalents.
**Model Name Handling:**
- For `/api/tags`, if a model ID does not contain a colon, the proxy appends `:proxy` to the name.
- For other endpoints, any `:proxy` suffix in model names is stripped before forwarding.
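
In practice this comes down to two small string helpers. The sketch below illustrates the rules with hypothetical function names (`tagName`, `upstreamName`) that are not taken from the project's source:

```go
// Sketch of the :proxy naming rules; function names are illustrative only.
package main

import (
	"fmt"
	"strings"
)

// tagName is applied to model IDs returned via /api/tags: IDs without a
// colon get a ":proxy" tag so clients see an Ollama-style name:tag pair.
func tagName(id string) string {
	if strings.Contains(id, ":") {
		return id
	}
	return id + ":proxy"
}

// upstreamName strips a trailing ":proxy" before a model name is forwarded
// to the downstream OpenAI-compatible server.
func upstreamName(name string) string {
	return strings.TrimSuffix(name, ":proxy")
}

func main() {
	fmt.Println(tagName("gpt-4o"))            // "gpt-4o:proxy"
	fmt.Println(upstreamName("gpt-4o:proxy")) // "gpt-4o"
}
```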
## Prerequisites
- Go 1.18+ installed.
- An OpenAI-compatible server endpoint (e.g., running on `http://127.0.0.1:4000`).
- (Optional) A local Ollama instance running on `127.0.0.1:11505` for endpoints not handled by the downstream server.
## Installation
Clone this repository:

```sh
git clone https://github.com/regismesquita/ollama_proxy.git
cd ollama_proxy
```

Build the project:

```sh
go build -o proxy-server ollama_proxy.go
```
## Usage
Run the proxy server with the desired flags:

```sh
./proxy-server --listen=":11434" --target="http://127.0.0.1:4000/v1" --api-key="YOUR_API_KEY" --debug
```
### Command-Line Flags
- `--listen`: The address and port the proxy server listens on (default `:11434`).
- `--target`: The base URL of the OpenAI-compatible downstream server (e.g., `http://127.0.0.1:4000`).
- `--api-key`: (Optional) The API key for the downstream server.
- `--debug`: Enable detailed debug logging for every request and response.
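
These options map naturally onto Go's standard `flag` package. A rough sketch of how the wiring might look (variable names and defaults other than `--listen` are assumptions, not the actual source):

```go
// Sketch of the command-line flags listed above; names are assumptions.
package main

import (
	"flag"
	"log"
)

func main() {
	listen := flag.String("listen", ":11434", "address and port to listen on")
	target := flag.String("target", "http://127.0.0.1:4000/v1", "base URL of the OpenAI-compatible downstream server")
	apiKey := flag.String("api-key", "", "optional API key for the downstream server")
	debug := flag.Bool("debug", false, "enable detailed debug logging")
	flag.Parse()

	if *debug {
		log.Printf("listening on %s, forwarding to %s (api key set: %v)", *listen, *target, *apiKey != "")
	}
	// ... construct the proxy handler and call http.ListenAndServe(*listen, handler) ...
}
```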
## How It Works
- **Request Routing:** The proxy intercepts requests and routes them based on the endpoint:
  - Requests to `/v1/models` and `/v1/completions` are forwarded directly.
  - Requests to `/api/tags` are handled locally by querying `/v1/models` on the downstream, transforming the JSON response, and appending `:proxy` where needed.
  - Requests to `/api/chat` are rewritten to `/v1/chat/completions`, with the payload and response processed to strip or add the `:proxy` suffix as appropriate.
  - All other endpoints are forwarded to the local Ollama instance.
- **Response Transformation:** Streaming responses from the downstream `/v1/chat/completions` endpoint (in NDJSON format) are read line by line. Each chunk is parsed, transformed into the Ollama format, and streamed back to the client (see the sketch after this list).
- **Logging:** With debug mode enabled, detailed logs of incoming requests, outgoing payloads, and both raw and transformed response chunks are printed.
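
To make the response transformation concrete, here is a minimal sketch of converting a single downstream streaming chunk into an Ollama-style chat chunk. The struct shapes are assumptions based on the common OpenAI and Ollama streaming formats, not this project's actual types:

```go
// Sketch: convert one OpenAI-style streaming chunk (one NDJSON line) into an
// Ollama /api/chat streaming chunk. Shapes are assumptions, not project code.
package main

import (
	"encoding/json"
	"fmt"
	"strings"
	"time"
)

// openAIChunk models the subset of an OpenAI chat-completions stream chunk we need.
type openAIChunk struct {
	Model   string `json:"model"`
	Choices []struct {
		Delta struct {
			Content string `json:"content"`
		} `json:"delta"`
		FinishReason *string `json:"finish_reason"`
	} `json:"choices"`
}

// ollamaChunk models one line of an Ollama /api/chat streaming response.
type ollamaChunk struct {
	Model     string    `json:"model"`
	CreatedAt time.Time `json:"created_at"`
	Message   struct {
		Role    string `json:"role"`
		Content string `json:"content"`
	} `json:"message"`
	Done bool `json:"done"`
}

func transformLine(line []byte) ([]byte, error) {
	var in openAIChunk
	if err := json.Unmarshal(line, &in); err != nil {
		return nil, err
	}
	var out ollamaChunk
	out.Model = strings.TrimSuffix(in.Model, ":proxy") // strip any trailing :proxy
	out.CreatedAt = time.Now().UTC()
	out.Message.Role = "assistant"
	if len(in.Choices) > 0 {
		out.Message.Content = in.Choices[0].Delta.Content
		out.Done = in.Choices[0].FinishReason != nil
	}
	return json.Marshal(out)
}

func main() {
	chunk := []byte(`{"model":"gpt-4o","choices":[{"delta":{"content":"Hello"},"finish_reason":null}]}`)
	if out, err := transformLine(chunk); err == nil {
		fmt.Println(string(out)) // one NDJSON line in Ollama's chat format
	}
}
```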
## Contributing
Contributions are welcome! As this is a pet project, there may be rough edges and issues. Please feel free to open issues or submit pull requests for improvements and bug fixes.
## License
This project is licensed under the MIT License. See the LICENSE file for details.