2025-02-24 10:06:44 +00:00
# vibe: Article Summarization & TTS Pipeline
2025-03-02 03:22:35 +00:00
vibe is a Python-based pipeline that automatically fetches the latest Computer Science research articles from arXiv, filters them for relevance with a large language model (LLM), converts the article PDFs to Markdown with Docling, generates narrative summaries, and synthesizes those summaries into an MP3 audio file with a text-to-speech (TTS) system.

This repository has been refactored into a modular structure for improved maintainability.

## Project Structure

- **vibe/** - Main package containing all modules:
  - `config.py` - Configuration, constants, and cache setup.
  - `fetcher.py` - Module to fetch articles from arXiv.
  - `filter.py` - Module for relevance filtering using an LLM.
  - `rerank.py` - Module to rerank articles.
  - `converter.py` - Module to convert PDFs to Markdown.
  - `summarizer.py` - Module to generate article summaries.
  - `tts.py` - Module for text-to-speech conversion.
  - `orchestrator.py` - Orchestrates the complete pipeline.
  - `server.py` - Flask server exposing a REST API.
  - `main.py` - CLI entry point.
- **tests/** - Contains unit tests.
- **requirements.txt** - Python package requirements.
- **Makefile** - Makefile to run common tasks.
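
The module list above suggests a staged flow: fetch, filter, convert, summarize, synthesize. The sketch below is only an illustration of how such stages could be chained. `Article`, `run_pipeline`, and every stage callable are hypothetical names, not the actual interfaces of `orchestrator.py`, and the rerank step is left out for brevity.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Article:
    """Hypothetical article record; the real project may store more fields."""
    title: str
    pdf_url: str


def run_pipeline(
    fetch: Callable[[], List[Article]],
    is_relevant: Callable[[Article], bool],
    to_markdown: Callable[[Article], str],
    summarize: Callable[[str], str],
    synthesize: Callable[[str], bytes],
    max_articles: int = 5,
) -> bytes:
    """Chain the stages: fetch -> filter -> convert -> summarize -> TTS."""
    # Keep only relevant articles, capped at max_articles.
    relevant = [a for a in fetch() if is_relevant(a)][:max_articles]
    # Convert each article to Markdown, then summarize the Markdown.
    summaries = [summarize(to_markdown(a)) for a in relevant]
    # Join all summaries into one script and hand it to the TTS stage.
    return synthesize("\n\n".join(summaries))


# Stub stages stand in for arXiv, the LLM, Docling, and the TTS engine.
demo_articles = [
    Article("Graph Neural Nets", "a.pdf"),
    Article("Cooking Tips", "b.pdf"),
]
audio = run_pipeline(
    fetch=lambda: demo_articles,
    is_relevant=lambda a: "Neural" in a.title,
    to_markdown=lambda a: f"# {a.title}",
    summarize=lambda md: md.lstrip("# ") + " (summary)",
    synthesize=lambda text: text.encode("utf-8"),
)
print(audio)
```

Passing the stages in as callables keeps each one replaceable with a stub, which is one way the unit tests in `tests/` could avoid network and model calls.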
## Installation
1. **Prerequisites:**
   - Python 3.x

2. **Clone the repository:**
   ```bash
   git clone <repository_url>
   cd <repository_directory>
   ```

3. **Install dependencies:**
   ```bash
   pip install -r requirements.txt
   ```

## Running the Application

### CLI Mode

To generate a summary MP3 using the CLI:

```bash
python vibe/main.py --generate --prompt "Your interests and context here" --max-articles 5 --output summary.mp3
```
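
For readers who want to see how these flags could be parsed, here is a minimal `argparse` sketch. Only the flag names come from this README. The default values, the help strings, and the choice to make `--generate` and `--serve` mutually exclusive are assumptions, and the real `main.py` may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the flags shown in this README."""
    parser = argparse.ArgumentParser(prog="vibe")

    # Assumption: exactly one mode must be chosen per invocation.
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("--generate", action="store_true",
                      help="Generate a summary MP3.")
    mode.add_argument("--serve", action="store_true",
                      help="Run the Flask server.")

    # Assumption: the defaults below are illustrative, not the project's.
    parser.add_argument("--prompt", default="",
                        help="Your interests and context.")
    parser.add_argument("--max-articles", type=int, default=5,
                        help="Maximum number of articles to summarize.")
    parser.add_argument("--output", default="summary.mp3",
                        help="Path of the MP3 file to write.")
    return parser


# Parse the same arguments as the command above.
args = build_parser().parse_args(
    ["--generate", "--prompt", "Your interests and context here",
     "--max-articles", "5", "--output", "summary.mp3"]
)
print(args.generate, args.serve, args.max_articles, args.output)
```

Note that `argparse` exposes `--max-articles` as the attribute `args.max_articles`.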

### Server Mode

To run the Flask server:

```bash
python vibe/main.py --serve
```

Then, you can make a POST request to `http://127.0.0.1:5000/process` with a JSON payload:

```bash
curl -X POST http://127.0.0.1:5000/process \
  -H "Content-Type: application/json" \
  -d '{"user_info": "Your interests here", "max_articles": 5, "new_only": false}'
```
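
The same request can be sent from Python. The sketch below uses only the standard library and mirrors the URL, header, and payload fields of the `curl` call. The helper names `build_process_request` and `process` are hypothetical, and because this README does not describe the response format, the body is returned as raw bytes.

```python
import json
import urllib.request


def build_process_request(
    user_info: str,
    max_articles: int = 5,
    new_only: bool = False,
    base_url: str = "http://127.0.0.1:5000",
) -> urllib.request.Request:
    """Build the POST request for the /process endpoint."""
    payload = {
        "user_info": user_info,
        "max_articles": max_articles,
        "new_only": new_only,
    }
    return urllib.request.Request(
        url=f"{base_url}/process",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def process(user_info: str, timeout: float = 600.0, **kwargs) -> bytes:
    """Send the request and return the raw response body.

    Assumption: the generous timeout is a guess, since the pipeline
    may take a while to fetch, summarize, and synthesize.
    """
    request = build_process_request(user_info, **kwargs)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


# Example (requires the server to be running):
# body = process("Your interests here", max_articles=5, new_only=False)
```

Building the request separately from sending it makes the payload easy to inspect or test without a running server.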

## Running Tests

The project includes basic tests to verify that modules are working as expected. To run the tests, execute:

```bash
make test
```

or

```bash
python -m unittest discover -s tests
```

## Makefile Commands

- `make test` - Run the unit tests.
- `make run` - Run the application in CLI mode (you can modify the command inside the Makefile).
- `make serve` - Run the Flask server.
- `make clean` - Clean up temporary files (e.g., remove the cache directory).

## Environment Variables

The following environment variables can be set to customize the behavior:

- `ARXIV_URL`
- `LLM_URL`
- `MODEL_NAME`
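
As an illustration of how these variables might be read at startup, the sketch below loads them into a small settings object with fallbacks. The `Settings` class, `load_settings`, and every default value are placeholders invented for this example; the real defaults live in `config.py` and are not documented here.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the three documented variables."""
    arxiv_url: str
    llm_url: str
    model_name: str


# Placeholder defaults, chosen only for illustration.
DEFAULTS = {
    "ARXIV_URL": "https://example.org/arxiv-feed",
    "LLM_URL": "http://localhost:8000/v1",
    "MODEL_NAME": "example-model",
}


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read each variable from env, falling back to the placeholder default."""
    def get(name: str) -> str:
        # Treat an unset or empty variable as "use the default".
        return env.get(name) or DEFAULTS[name]

    return Settings(
        arxiv_url=get("ARXIV_URL"),
        llm_url=get("LLM_URL"),
        model_name=get("MODEL_NAME"),
    )


settings = load_settings({"MODEL_NAME": "my-local-model"})
print(settings.model_name)
```

Accepting the environment as a parameter lets tests pass a plain dictionary instead of modifying `os.environ`.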

## License

This project is licensed under the MIT License.