Generative AI Project Template
Template for a new AI Cloud project.
Click on Use this template to start your own project!
This project is a generative AI template. It covers LLMs, information extraction, chat, RAG, and evaluation. It uses LLMs (local or cloud), NiceGUI (frontend), FastAPI (backend), and Promptfoo as an evaluation and red-teaming framework for your AI system.
Engineering tools:
- [x] Use UV to manage packages in a workspace (`frontend` and `backend`)
- [x] Pre-commit hooks: `ruff` to ensure code quality and `detect-secrets` to scan for secrets in the code (see the sketch after this list)
- [x] Logging using loguru (with colors)
- [x] Pytest for unit tests
- [x] Dockerized project (Dockerfile & docker-compose).
- [x] NiceGUI (frontend) & FastAPI (backend)
- [x] Make commands to handle everything for you: install, run, test
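The pre-commit setup lives in `.pre-commit-config.yaml`. A minimal sketch of what such a configuration can look like for the two hooks above (the revisions are illustrative pins, not necessarily the ones used by this repo):

```yaml
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.6.9            # illustrative pin; use a current release
    hooks:
      - id: ruff           # lint
      - id: ruff-format    # format
  - repo: https://github.com/Yelp/detect-secrets
    rev: v1.5.0            # illustrative pin
    hooks:
      - id: detect-secrets # scan staged files for committed secrets
```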
AI tools:
- [x] LLM running locally with Ollama or in the cloud with any LLM provider (LiteLLM); see the sketch after this list
- [x] Information extraction and Question answering from documents
- [x] Chat to test the AI system
- [x] Efficient async code using asyncio.
- [x] AI Evaluation framework: using Promptfoo, Ragas & more...
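To illustrate the local/cloud switch and the async style (a hedged sketch, not the project's actual backend code; the model names are examples), a minimal LiteLLM call looks roughly like this:

```python
import asyncio

import litellm


async def ask(question: str) -> str:
    # "ollama/llama3.1" targets a local model served by Ollama; swapping in a cloud
    # model such as "openai/gpt-4o-mini" (with the matching API key in .env) routes
    # the same call through that provider via LiteLLM.
    response = await litellm.acompletion(
        model="ollama/llama3.1",
        messages=[{"role": "user", "content": question}],
    )
    return response.choices[0].message.content


if __name__ == "__main__":
    print(asyncio.run(ask("Summarize this template in one sentence.")))
```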
CI/CD & Maintenance tools:
- [x] CI/CD pipelines: `.github/workflows` for GitHub (testing the AI system, local models with Ollama, and the dockerized app)
- [x] Local CI/CD pipelines: GitHub Actions run locally using `act`
- [x] GitHub Actions for deploying to GitHub Pages with mkdocs gh-deploy
- [x] Dependabot (`.github/dependabot.yml`) for automatic dependency and security updates (see the sketch after this list)
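For reference, a Dependabot configuration of this kind typically looks like the sketch below; the ecosystems and schedule shown are assumptions, and the repo's own `.github/dependabot.yml` is authoritative:

```yaml
version: 2
updates:
  - package-ecosystem: "pip"             # Python dependencies (pyproject/uv)
    directory: "/"
    schedule:
      interval: "weekly"
  - package-ecosystem: "github-actions"  # keep workflow actions up to date
    directory: "/"
    schedule:
      interval: "weekly"
```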
Documentation tools:
- [x] Wiki creation and setup of documentation website using Mkdocs
- [x] GitHub Pages deployment using mkdocs gh-deploy plugin
Upcoming features:

- [ ] Add RAG again
- [ ] Optimize caching in CI/CD
- [ ] [Pull requests templates](https://docs.github.com/en/communities/using-templates-to-encourage-useful-issues-and-pull-requests/creating-a-pull-request-template-for-your-repository)
- [ ] Additional MLOps templates: https://github.com/fmind/mlops-python-package
- [ ] Add MLflow
- [ ] Add Langfuse
1. Getting started
This project is a monorepo containing two main packages:

- `frontend`: A NiceGUI application.
- `backend`: A FastAPI application that serves the AI models and business logic.
The project uses `uv` as a package manager and is configured as a workspace, so dependencies for both packages can be installed with a single command.
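As an illustration (the repo's root `pyproject.toml` is authoritative), a uv workspace is declared roughly like this:

```toml
# Root pyproject.toml (sketch): the workspace members mirror the two packages,
# so a single `uv sync` from the repo root resolves and installs everything.
[tool.uv.workspace]
members = ["frontend", "backend"]
```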
Project Structure
- `frontend/`: The NiceGUI frontend application.
  - `pyproject.toml`: Frontend-specific dependencies.
  - `src/`: Source code for the frontend.
- `backend/`: The FastAPI backend application.
  - `pyproject.toml`: Backend-specific dependencies.
  - `src/`: Source code for the backend.
- `.github/`: GitHub-specific files.
  - `workflows/`: GitHub Actions CI/CD pipelines.
  - `dependabot.yml`: Dependabot configuration for dependency updates.
- `assets/`: Static assets like images.
- `scripts/`: Utility scripts.
- `tests/`: Pytest unit and integration tests.
- `.env.example`: Example environment variables. Create a `.env` from this.
- `Dockerfile`: To build the application container.
- `docker-compose.yml`: To run services like `frontend`, `backend`, and `ollama`.
- `.gitlab-ci.yml`: GitLab CI configuration file.
- `Makefile`: Shortcuts for common commands like `install`, `run`, `test`.
- `pyproject.toml`: Defines the `uv` workspace and shared dependencies.
- `uv.lock`: Lock file for the `uv` package manager.
- `.pre-commit-config.yaml`: Configuration for pre-commit hooks.
- `mkdocs.yml`: Configuration for the documentation site.
- `README.md`, `CONTRIBUTING.md`, `CODE_OF_CONDUCT.md`, `LICENSE`: Project documentation.
1.1. Local Prerequisites
- Ubuntu 22.04 or macOS
- git clone the repository
- UV & Python 3.12 (will be installed by the Makefile)
- Create a `.env` file (take a look at the `.env.example` file)
1.2 ⚙️ Steps for Installation
This project uses a `Makefile` to simplify the installation and execution process.
Local Installation
- For a CPU-based environment (or macOS), install all dependencies for both `frontend` and `backend` with:

  ```bash
  make install-dev
  ```

- For an NVIDIA GPU (CUDA) environment, if you want to use CUDA for acceleration, run:

  ```bash
  make install-dev-cuda
  ```

  This will install the CUDA-enabled version of PyTorch.
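A quick way to verify that the CUDA install actually sees your GPU (run inside the project environment, e.g. with `uv run python`):

```python
# Sanity check after `make install-dev-cuda`: confirms the installed PyTorch
# build has CUDA support and can see at least one GPU.
import torch

print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
```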
Using Docker
The project can be fully containerized using Docker. This is the recommended way to run the application as it handles all services and networks.
- The `docker-compose.yml` and `docker-compose-cuda.yml` files define the services.
- To build the main Docker image:

  ```bash
  make docker-build
  ```

- To run the entire application stack (frontend, backend, database, Ollama) using Docker Compose:

  ```bash
  make run-app
  ```
Running the Application
Once installed (either locally or via Docker), you can run the services.
- Run everything: the `make run-app` command is the easiest way to start all services, including the frontend, backend, database, and Ollama.
- Run services individually:
  - Run the frontend: `make run-frontend`
  - Run the backend: `make run-backend`

You can then access:

- Frontend (NiceGUI): http://localhost:8080 (or the configured port)
- Backend (FastAPI): http://localhost:8000 (or the configured port); API docs at http://localhost:8000/docs
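As a quick smoke test (assuming the default ports above, FastAPI's standard `/openapi.json` route, and that `httpx` is available in the environment), you can confirm the backend is reachable:

```python
# Minimal smoke test: fetch the OpenAPI schema exposed by FastAPI on the
# default backend port and list the available API paths.
import httpx

resp = httpx.get("http://localhost:8000/openapi.json", timeout=5)
resp.raise_for_status()
print("Backend is up. Paths:", sorted(resp.json()["paths"]))
```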
Using Local vs. Cloud LLMs
- Local model (Ollama):
  - Install Ollama: `make install-ollama`
  - Ensure Ollama is running (`make run-ollama` can help).
  - Set your `.env` file to point to the local Ollama endpoint (copy and paste from the `.env.example` file).
  - Download a model: `make download-ollama-model`
  - Test the Ollama connection: `make test-ollama`
  - Test LLM inference: `make test-inference-llm`
- Cloud model (OpenAI, Anthropic, etc.):
  - Update your `.env` file with the correct API keys and model names, following the LiteLLM naming convention (see the example below).
  - Test the connection: `make test-inference-llm`
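The authoritative variable names are in `.env.example`; the entries below are hypothetical and only illustrate the LiteLLM convention of prefixing model names with the provider:

```bash
# Hypothetical .env entries; copy the real keys from .env.example.
# LiteLLM model names are prefixed with the provider, e.g.:
#   ollama/llama3.1                        -> local model served by Ollama
#   openai/gpt-4o-mini                     -> cloud model, needs OPENAI_API_KEY
#   anthropic/claude-3-5-sonnet-20240620   -> cloud model, needs ANTHROPIC_API_KEY
OPENAI_API_KEY=sk-...
```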
1.3 ⚙️ Steps for Installation (Contributors and maintainers)
Check the CONTRIBUTING.md file for more information.
2. Contributing
Check the CONTRIBUTING.md file for more information.