# Ollama-OpenAI Proxy
This is a Go-based proxy server that enables applications designed to work with the Ollama API to interact seamlessly with an OpenAI-compatible endpoint. It translates and forwards requests and responses between the two APIs while applying custom transformations to the model names and data formats.

**Note:** This is a pet project I use to forward requests to LiteLLM for use with Kerlig, which doesn’t support custom OpenAI endpoints. As this is a personal project, there might be issues and rough edges. Contributions and feedback are welcome!
## Features
**Endpoint Proxying:**

- **/v1/models & /v1/completions:** These endpoints are forwarded directly to the downstream OpenAI-compatible server.
- **/api/tags:** Queries the downstream `/v1/models` endpoint, transforms the response into the Ollama-style model list, and appends `:proxy` to model names if they don’t already contain a colon.
- **/api/chat:** Rewrites the request to the downstream `/v1/chat/completions` endpoint. It intercepts and transforms streaming NDJSON responses from the OpenAI format into the expected Ollama format, including stripping any trailing `:proxy` from model names.
- **/api/pull and other unknown endpoints:** Forwarded to a local Ollama instance running on `127.0.0.1:11505`.
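
To make the routing concrete, the sketch below shows how such a dispatcher could look in Go. It is illustrative only, not the proxy's actual code; the JSON and streaming transformations described above are reduced to comments, and the addresses simply match the defaults mentioned in this README.

```go
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
	"strings"
)

func main() {
	// Defaults as described in this README (illustrative only).
	downstream, _ := url.Parse("http://127.0.0.1:4000") // OpenAI-compatible server
	local, _ := url.Parse("http://127.0.0.1:11505")     // local Ollama instance

	toDownstream := httputil.NewSingleHostReverseProxy(downstream)
	toLocal := httputil.NewSingleHostReverseProxy(local)

	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		switch {
		case strings.HasPrefix(r.URL.Path, "/v1/"):
			// /v1/models and /v1/completions are forwarded directly.
			toDownstream.ServeHTTP(w, r)
		case r.URL.Path == "/api/tags":
			// The real proxy queries /v1/models downstream and rewrites the
			// JSON into an Ollama-style tag list; only the path rewrite is shown here.
			r.URL.Path = "/v1/models"
			toDownstream.ServeHTTP(w, r)
		case r.URL.Path == "/api/chat":
			// Rewritten to /v1/chat/completions; the streaming NDJSON
			// transformation is omitted in this sketch.
			r.URL.Path = "/v1/chat/completions"
			toDownstream.ServeHTTP(w, r)
		default:
			// Everything else (e.g. /api/pull) goes to the local Ollama instance.
			toLocal.ServeHTTP(w, r)
		}
	})

	log.Fatal(http.ListenAndServe(":11434", nil))
}
```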
**Debug Logging:**

When running in debug mode, the proxy logs:

- Every incoming request.
- The outgoing `/api/chat` payload.
- Raw downstream streaming chunks and their transformed equivalents.

**Model Name Handling:**

- For `/api/tags`, if a model ID does not contain a colon, the proxy appends `:proxy` to the name.
- For other endpoints, any `:proxy` suffix in model names is stripped before forwarding.
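
In code, the suffix handling can be pictured roughly like this; the helper names below are hypothetical and only illustrate the rule, they are not the proxy's actual functions:

```go
package main

import (
	"fmt"
	"strings"
)

// addProxySuffix mirrors the /api/tags behaviour described above:
// model IDs without a colon get a ":proxy" tag appended.
func addProxySuffix(model string) string {
	if strings.Contains(model, ":") {
		return model
	}
	return model + ":proxy"
}

// stripProxySuffix mirrors the behaviour for other endpoints:
// a trailing ":proxy" is removed before forwarding downstream.
func stripProxySuffix(model string) string {
	return strings.TrimSuffix(model, ":proxy")
}

func main() {
	fmt.Println(addProxySuffix("gpt-4o"))         // gpt-4o:proxy
	fmt.Println(addProxySuffix("llama3:latest"))  // llama3:latest (already has a colon)
	fmt.Println(stripProxySuffix("gpt-4o:proxy")) // gpt-4o
}
```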
## Prerequisites

- **Go 1.18+** installed.
- An OpenAI-compatible server endpoint (e.g., running on `http://127.0.0.1:4000`).
- *(Optional)* A local Ollama instance running on `127.0.0.1:11505` for endpoints not handled by the downstream server.

## Installation
Clone this repository:

```
git clone https://github.com/regismesquita/ollama_proxy.git
cd ollama_proxy
```

Build the project:

```
go build -o proxy-server ollama_proxy.go
```

## Usage
Run the proxy server with the desired flags:

```
./proxy-server --listen=":11434" --target="http://127.0.0.1:4000" --api-key="YOUR_API_KEY" --debug
```
## Command-Line Flags

- `--listen`: The address and port the proxy server listens on (default `:11434`).
- `--target`: The base URL of the OpenAI-compatible downstream server (e.g., `http://127.0.0.1:4000`).
- `--api-key`: *(Optional)* The API key for the downstream server.
- `--debug`: Enable detailed debug logging for every request and response.
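
Once the proxy is running you can sanity-check it with any Ollama-style client. A minimal Go sketch (assuming the default `:11434` listen address) that lists the proxied models:

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
)

func main() {
	// Ask the proxy for its Ollama-style model list; the response should
	// contain the downstream models with the ":proxy" suffix applied.
	resp, err := http.Get("http://127.0.0.1:11434/api/tags")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(body))
}
```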
## How It Works

1. **Request Routing:**

   The proxy intercepts requests and routes them based on the endpoint:

   - Requests to `/v1/models` and `/v1/completions` are forwarded directly.
   - Requests to `/api/tags` are handled locally by querying `/v1/models` on the downstream, transforming the JSON response, and appending `:proxy` where needed.
   - Requests to `/api/chat` are rewritten to `/v1/chat/completions`, with the payload and response processed to strip or add the `:proxy` suffix as appropriate.
   - All other endpoints are forwarded to the local Ollama instance.

2. **Response Transformation:**

   Streaming responses from the downstream `/v1/chat/completions` endpoint (in NDJSON format) are read line by line. Each chunk is parsed, transformed into the Ollama format, and streamed back to the client, as shown in the sketch after this list.

3. **Logging:**

   With debug mode enabled, detailed logs of incoming requests, outgoing payloads, and both raw and transformed response chunks are printed.
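
As a rough illustration of step 2, the sketch below maps a single downstream chunk to an Ollama-style NDJSON line. The struct fields are the commonly used ones for both APIs and are assumptions here, not a copy of the proxy's actual types:

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"strings"
	"time"
)

// openAIChunk models the part of a /v1/chat/completions streaming chunk
// that the transformation needs (an illustrative subset of the real schema).
type openAIChunk struct {
	Model   string `json:"model"`
	Choices []struct {
		Delta struct {
			Content string `json:"content"`
		} `json:"delta"`
		FinishReason *string `json:"finish_reason"`
	} `json:"choices"`
}

// ollamaChunk models an Ollama-style /api/chat streaming line.
type ollamaChunk struct {
	Model     string `json:"model"`
	CreatedAt string `json:"created_at"`
	Message   struct {
		Role    string `json:"role"`
		Content string `json:"content"`
	} `json:"message"`
	Done bool `json:"done"`
}

// transformLine converts one downstream JSON line into the Ollama format,
// stripping any ":proxy" suffix from the model name along the way.
func transformLine(line []byte) ([]byte, error) {
	var in openAIChunk
	if err := json.Unmarshal(line, &in); err != nil {
		return nil, err
	}
	var out ollamaChunk
	out.Model = strings.TrimSuffix(in.Model, ":proxy")
	out.CreatedAt = time.Now().UTC().Format(time.RFC3339)
	out.Message.Role = "assistant"
	if len(in.Choices) > 0 {
		out.Message.Content = in.Choices[0].Delta.Content
		out.Done = in.Choices[0].FinishReason != nil
	}
	return json.Marshal(out)
}

func main() {
	chunk := []byte(`{"model":"gpt-4o:proxy","choices":[{"delta":{"content":"Hello"},"finish_reason":null}]}`)
	outLine, err := transformLine(chunk)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(outLine)) // one NDJSON line in the Ollama format
}
```

In the real proxy this runs once per line of the streamed body, with each transformed line written back to the client immediately.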
## Contributing

Contributions are welcome! As this is a pet project, there may be rough edges and issues. Please feel free to open issues or submit pull requests for improvements and bug fixes.
## License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.