# vibe: Article Summarization & TTS Pipeline

vibe is a Python-based pipeline that automatically fetches the latest Computer Science research articles from arXiv, filters them for relevance using a language model (LLM), converts article PDFs to Markdown with Docling, generates narrative summaries, and synthesizes those summaries into an MP3 audio file using a text-to-speech (TTS) system. It is ideal for anyone who prefers listening to curated research summaries on the go, or who wants to integrate the process into a larger system via an API.

## Features

- **Fetch Articles:** Retrieves the latest Computer Science articles from arXiv.
- **Cache Mechanism:** Caches article metadata and converted content to speed up subsequent requests.
- **Relevance Filtering:** Uses an LLM to filter articles based on user-provided interests.
- **PDF Conversion:** Converts PDF articles to Markdown format using Docling.
- **Summarization:** Generates a fluid, narrative-style summary for each relevant article with the help of an LLM.
- **Text-to-Speech:** Converts the final narrative summary into an MP3 file using KPipeline.
- **Flask API:** Exposes the functionality via a RESTful endpoint for dynamic requests.
- **CLI and Server Modes:** Run the pipeline as a one-off CLI command or as a continuously running Flask server.

## Why Use vibe?

- **Stay Updated:** Automatically curate and summarize the latest research articles so you can keep up with advancements in your field.
- **Hands-Free Listening:** Enjoy audio summaries during your commute or while multitasking.
- **Automated Workflow:** Seamlessly integrate multiple processing steps, from fetching and filtering to summarization and TTS.
- **Flexible Deployment:** Use the CLI mode for quick summaries or deploy the Flask API for integration with other systems.

## Installation

1. **Prerequisites:**

   Ensure you have Python 3.x installed on your system.

2. **Clone the Repository:**

   Clone this repository to your local machine.

3. **Install Dependencies:**

   Navigate to the project directory and install the required packages (an optional virtual-environment setup is sketched just after these steps):

   ```
   pip install -r requirements.txt
   ```
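If you prefer to keep vibe's dependencies isolated from your system Python, a standard virtual environment works as well. This is a minimal, optional sketch rather than a required step:

```
# Optional: create and activate a virtual environment before installing
python -m venv .venv
source .venv/bin/activate   # on Windows: .venv\Scripts\activate
pip install -r requirements.txt
```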
## Usage

### CLI Mode

Run the pipeline once to generate an MP3 summary file. For example:

```
python vibe.py --generate --prompt "I live in a mid-sized European city, working in the tech industry on AI-driven automation solutions. I prefer content focused on deep learning and reinforcement learning applications, and I want to filter out less relevant topics. Only include articles that are rated 9 or 10 on a relevance scale from 0 to 10." --max-articles 10 --output summary_cli.mp3
```

This command fetches the latest articles from arXiv, filters and ranks them based on your specified interests, generates narrative summaries, and converts the final summary into an MP3 file named `summary_cli.mp3`.

### Server Mode

Alternatively, you can run vibe as a Flask server:

```
python vibe.py --serve
```

Once the server is running, you can process requests by sending a POST request to the `/process` endpoint. For example:

```
curl -X POST http://127.0.0.1:5000/process \
  -H "Content-Type: application/json" \
  -d '{"user_info": "Your interests here", "max_articles": 5, "new_only": false}'
```

The server processes the articles, generates an MP3 summary, and returns the file as a downloadable response.
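To save the returned audio straight to disk from the command line, you can add curl's standard `--output` flag to the request above; the output filename here is just an example:

```
# Same request as above, writing the MP3 response body to a local file
curl -X POST http://127.0.0.1:5000/process \
  -H "Content-Type: application/json" \
  -d '{"user_info": "Your interests here", "max_articles": 5, "new_only": false}' \
  --output summary_server.mp3
```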
## Environment Variables

The following environment variables can be set to customize the behavior of vibe:

- `ARXIV_URL`: The URL used to fetch the latest arXiv articles. Defaults to `https://arxiv.org/list/cs/new`.
- `LLM_URL`: The URL for the language model endpoint. Defaults to `http://127.0.0.1:4000/v1/chat/completions` (a local LiteLLM instance).
- `MODEL_NAME`: The model name to be used by the LLM. Defaults to `mistral-small-latest`.

Note that using the `mistral-small` model through Mistral's cloud service typically costs a few cents per run and completes the summarization in around 4 minutes. It is also possible to run vibe with local LLMs (such as Qwen 2.5 14B or Mistral Small), although local runs may take up to an hour.
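For example, you can export these variables in your shell before starting the server; the values below are simply the documented defaults, shown for illustration:

```
# Point vibe at the arXiv listing and LLM endpoint it should use (defaults shown)
export ARXIV_URL="https://arxiv.org/list/cs/new"
export LLM_URL="http://127.0.0.1:4000/v1/chat/completions"
export MODEL_NAME="mistral-small-latest"

python vibe.py --serve
```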
## Project Structure

- **vibe.py:** Main application file containing modules for:
  - Fetching and caching arXiv articles.
  - Filtering articles for relevance.
  - Converting PDFs to Markdown using Docling.
  - Summarizing articles via an LLM.
  - Converting text summaries to speech (MP3) using KPipeline.
  - Exposing a Flask API for processing requests.
- **requirements.txt:** Contains the list of Python packages required by the project.
- **CACHE_DIR:** Directory created at runtime for caching articles and processed files.

## Dependencies

The project relies on several key libraries:

- Flask
- requests
- beautifulsoup4
- soundfile
- docling
- kokoro

## Contributing

Contributions are welcome! Feel free to fork this repository and submit pull requests with improvements or bug fixes.

## License

This project is licensed under the MIT License.

## Acknowledgments

Thanks to the developers of [Docling](https://github.com/docling) and [Kokoro](https://github.com/kokoro), as well as the maintainers of BeautifulSoup and Flask, for providing great tools that made this project possible.