# Ollama-OpenAI Proxy
This is a Go-based proxy server that enables applications designed to work with the Ollama API to interact seamlessly with an OpenAI-compatible endpoint. It translates and forwards requests and responses between the two APIs while applying custom transformations to the model names and data formats.
> **Note:** This is a pet project I use to forward requests to LiteLLM for use with Kerlig, which doesn't support custom OpenAI endpoints. As this is a personal project, there might be issues and rough edges. Contributions and feedback are welcome!
## Features
- **Endpoint Proxying:**
  - `/v1/models` & `/v1/completions`: forwarded directly to the downstream OpenAI-compatible server.
  - `/api/tags`: queries the downstream `/v1/models` endpoint, transforms the response into the Ollama-style model list, and appends `:proxy` to model names if they don't already contain a colon.
  - `/api/chat`: rewrites the request to the downstream `/v1/chat/completions` endpoint, intercepting streaming NDJSON responses and transforming them from the OpenAI format into the expected Ollama format, including stripping any trailing `:proxy` from model names.
  - `/api/pull` and other unknown endpoints: forwarded to a local Ollama instance running on `127.0.0.1:11505`.
- **Debug Logging:** when running in debug mode, the proxy logs:
  - every incoming request,
  - the outgoing `/api/chat` payload,
  - raw downstream streaming chunks and their transformed equivalents.
- **Model Name Handling** (sketched below):
  - For `/api/tags`, if a model ID does not contain a colon, the proxy appends `:proxy` to the name.
  - For other endpoints, any `:proxy` suffix in model names is stripped before forwarding.
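The suffix handling is a small, self-contained rule. Here is a minimal sketch of the idea; the helper names are illustrative, not the actual identifiers in `main.go`:

```go
package main

import (
	"fmt"
	"strings"
)

// addProxySuffix tags a downstream model ID for /api/tags responses.
// IDs that already contain a colon (e.g. "llama3:8b") are left as-is.
func addProxySuffix(id string) string {
	if strings.Contains(id, ":") {
		return id
	}
	return id + ":proxy"
}

// stripProxySuffix undoes the tagging before a request is forwarded,
// so the downstream OpenAI-compatible server sees the original ID.
func stripProxySuffix(name string) string {
	return strings.TrimSuffix(name, ":proxy")
}

func main() {
	fmt.Println(addProxySuffix("gpt-4o"))         // gpt-4o:proxy
	fmt.Println(addProxySuffix("llama3:8b"))      // llama3:8b
	fmt.Println(stripProxySuffix("gpt-4o:proxy")) // gpt-4o
}
```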
## Prerequisites
- Go 1.18+ installed.
- An OpenAI-compatible server endpoint (e.g., running on `http://127.0.0.1:4000`).
- (Optional) A local Ollama instance running on `127.0.0.1:11505` for endpoints not handled by the downstream server.
## Installation
Clone this repository:

```bash
git clone https://github.com/yourusername/ollama-openai-proxy.git
cd ollama-openai-proxy
```
Build the project:

```bash
go build -o proxy-server main.go
```
## Usage
Run the proxy server with the desired flags:

```bash
./proxy-server --listen=":11434" --target="http://127.0.0.1:4000" --api-key="YOUR_API_KEY" --debug
```
### Command-Line Flags
- `--listen`: the address and port the proxy server listens on (default `:11434`).
- `--target`: the base URL of the OpenAI-compatible downstream server (e.g., `http://127.0.0.1:4000`).
- `--api-key`: (optional) the API key for the downstream server.
- `--debug`: enable detailed debug logging for every request and response.
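With the proxy running, point any Ollama client at it. As a quick smoke test against the defaults above (the model name is illustrative; use whatever your downstream server actually serves):

```bash
# List models Ollama-style; IDs without a colon gain a :proxy suffix.
curl http://127.0.0.1:11434/api/tags

# Chat through the proxy; the request is rewritten to /v1/chat/completions.
curl http://127.0.0.1:11434/api/chat -d '{
  "model": "gpt-4o:proxy",
  "messages": [{"role": "user", "content": "Hello"}]
}'
```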
## How It Works
1. **Request Routing:** the proxy intercepts requests and routes them based on the endpoint:
   - Requests to `/v1/models` and `/v1/completions` are forwarded directly.
   - Requests to `/api/tags` are handled locally by querying `/v1/models` on the downstream server, transforming the JSON response, and appending `:proxy` where needed.
   - Requests to `/api/chat` are rewritten to `/v1/chat/completions`, with the payload and response processed to strip or add the `:proxy` suffix as appropriate.
   - All other endpoints are forwarded to the local Ollama instance.
2. **Response Transformation:** streaming responses from the downstream `/v1/chat/completions` endpoint (in NDJSON format) are read line by line. Each chunk is parsed, transformed into the Ollama format, and streamed back to the client (see the sketch after this list).
3. **Logging:** with debug mode enabled, detailed logs of incoming requests, outgoing payloads, and both raw and transformed response chunks are printed.
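The transformation in step 2 is the heart of the proxy. The following is a minimal sketch of the idea, not the actual code from `main.go`: the chunk schemas are trimmed to the essential fields, and the field names are assumed from the OpenAI streaming format.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"io"
	"os"
	"strings"
)

// openaiChunk models only the fields of an OpenAI-style streaming
// chunk that this sketch needs; real chunks carry more.
type openaiChunk struct {
	Model   string `json:"model"`
	Choices []struct {
		Delta struct {
			Content string `json:"content"`
		} `json:"delta"`
		FinishReason *string `json:"finish_reason"`
	} `json:"choices"`
}

// ollamaChunk is one NDJSON line in the shape an Ollama /api/chat
// client expects, reduced to the essential fields.
type ollamaChunk struct {
	Model   string `json:"model"`
	Message struct {
		Role    string `json:"role"`
		Content string `json:"content"`
	} `json:"message"`
	Done bool `json:"done"`
}

// transformStream reads downstream chunks line by line, converts each
// one, strips the :proxy suffix, and writes Ollama-style NDJSON to w.
func transformStream(r io.Reader, w io.Writer) error {
	sc := bufio.NewScanner(r)
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			continue
		}
		var in openaiChunk
		if err := json.Unmarshal([]byte(line), &in); err != nil {
			continue // skip malformed chunks
		}
		var out ollamaChunk
		out.Model = strings.TrimSuffix(in.Model, ":proxy")
		out.Message.Role = "assistant"
		if len(in.Choices) > 0 {
			out.Message.Content = in.Choices[0].Delta.Content
			out.Done = in.Choices[0].FinishReason != nil
		}
		b, err := json.Marshal(out)
		if err != nil {
			return err
		}
		fmt.Fprintf(w, "%s\n", b)
	}
	return sc.Err()
}

func main() {
	// Two content chunks; the second carries finish_reason ("done").
	downstream := `{"model":"gpt-4o","choices":[{"delta":{"content":"Hel"}}]}
{"model":"gpt-4o","choices":[{"delta":{"content":"lo"},"finish_reason":"stop"}]}
`
	if err := transformStream(strings.NewReader(downstream), os.Stdout); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```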
## Contributing
Contributions are welcome! Please feel free to open issues or submit pull requests for improvements and bug fixes.
## License
This project is licensed under the MIT License. See the LICENSE file for details.