Docker one-click setup + Gradio web UI for all collaboration styles #1 by danielesalpietro · Pull Request #23 · RecursiveMAS/RecursiveMAS

danielesalpietro · 2026-05-28T21:30:33Z

🎯 Goal

Make RecursiveMAS accessible to everyone — researchers, students, and curious minds — with no prior technical knowledge required. Clone the repo, fill in two lines in a .env file, run one command, and the full multi-agent reasoning system is up and running in your browser in under 60 seconds.

Tested on: HP OMEN 16 Pro · NVIDIA RTX 5080 · 32 GB RAM · NVMe SSD
Result: all 5 collaboration styles working, GPU inference in seconds, warm model cache between requests.

✨ What's new

🐳 Docker infrastructure

Dockerfile — GPU-ready image for batch inference (run.py)
Dockerfile.serve — separate image for the Gradio web UI (serve.py)
docker-compose.yml — orchestrates both services with a shared hf_cache named volume; reads HF_TOKEN and TAVILY_API_KEY from .env
.dockerignore — keeps secrets and cache out of the build context

🖥️ Gradio web UI (`serve.py`)

Chat interface exposing all 5 collaboration styles via dropdown
Warm model cache — models are loaded into VRAM on first request and stay resident; no reload between questions
Style switching evicts the old models and loads the new set automatically
Sliders for recursive rounds (1–5) and latent steps (8–64)
Informational note below the style selector: first-use download time, subsequent runs instant from cache
Compatible with Gradio 6.0 (messages format updated, theme moved to launch())

🩺 Health check (`healthcheck.py`)

Three-level check to verify the container environment before running inference:

Level 1 — Python deps + all 5 styles registered (no GPU needed)
Level 2 — CUDA device detection + tensor allocation
Level 3 — HuggingFace Hub reachability

🪟 Windows / no-GPU support

serve-cpu.bat — one double-click to launch the web UI on CPU; reads credentials from .env automatically
docker-compose.override.yml pattern documented in README with runtime: runc to bypass NVIDIA hook on WSL2 systems without GPU passthrough configured
WSL2 GPU fix checklist included (driver ≥ 470, wsl --list --verbose, Docker Desktop WSL integration)

🔒 Security

.gitignore added — .env is never tracked by git
.env.example provided as a safe template
All secrets passed at runtime via environment variables, never baked into images

📖 Documentation

New 🐳 Docker: One-Click Setup section in README covering prerequisites, build, batch inference, web UI launch, health check, and CPU fallback
Repository structure updated to reflect all new files

🚀 Quickstart (for reviewers)

git clone https://github.com/danielesalpietro/RecursiveMAS.git
cd RecursiveMAS

# Create .env
echo "HF_TOKEN=hf_your_token" > .env
echo "TAVILY_API_KEY=your_key" >> .env

# Build and launch web UI
docker compose build serve
docker compose up serve
# → open http://localhost:7860

No GPU on your machine right now? Create docker-compose.override.yml:

services:
  recursivemas:
    runtime: runc
    deploy: {}
  serve:
    runtime: runc
    deploy: {}

Then docker compose up serve — the UI runs on CPU (slower, but fully functional for exploration).

🗂️ Files changed

File	Change
`Dockerfile`	New — batch inference image
`Dockerfile.serve`	New — Gradio web UI image
`docker-compose.yml`	New — multi-service orchestration
`serve.py`	New — Gradio web UI with warm model cache
`healthcheck.py`	New — 3-level container health check
`serve-cpu.bat`	New — Windows one-click CPU launcher
`requirements-serve.txt`	New — gradio dependency
`.dockerignore`	New
`.gitignore`	New
`.env.example`	New
`README.md`	Updated — Docker setup section added

Adds Dockerfile (nvidia/cuda 12.4 base), docker-compose.yml with GPU reservation and a named volume for HF model cache, and .dockerignore. HF_TOKEN and TAVILY_API_KEY are passed as env vars at runtime — not baked into the image. https://claude.ai/code/session_01CE2uPEFeYKtN3hAXQ1m7jy

healthcheck.py verifies the environment in three progressive levels without downloading model weights: Python deps + internal imports (L1), CUDA device availability and allocation (L2), HF Hub reachability via lightweight metadata call (L3). https://claude.ai/code/session_01CE2uPEFeYKtN3hAXQ1m7jy

serve.py patches load_agent_model_and_tokenizer / release_resources in the base inference module before any submodule imports, keeping all agent models warm in VRAM across requests. Single questions are run through the existing pipeline via a temp medqa-format JSON dataset; structured output is captured with --result_jsonl. UI exposes all five collaboration styles via a dropdown, with sliders for recursive rounds and latent steps. Style switching evicts the VRAM cache automatically. Also adds Dockerfile.serve (inherits cuda base + installs gradio), requirements-serve.txt, and a `serve` service in docker-compose.yml. https://claude.ai/code/session_01CE2uPEFeYKtN3hAXQ1m7jy

.env must never be committed — it contains secrets (TAVILY_API_KEY). Remove it from git tracking and add .gitignore to prevent future accidental commits of .env and Python cache files. https://claude.ai/code/session_01CE2uPEFeYKtN3hAXQ1m7jy

NVIDIA dropped the cuDNN major version suffix from image tags. The correct tag format is now cudnn-runtime, not cudnn9-runtime. https://claude.ai/code/session_01CE2uPEFeYKtN3hAXQ1m7jy

Covers image build, docker compose up, Gradio web UI launch, 3-level health check procedure, and CPU fallback workaround for systems without GPU passthrough (including WSL2 fix steps). Also updates the repository structure listing with new Docker files. https://claude.ai/code/session_01CE2uPEFeYKtN3hAXQ1m7jy

…ructions - Add assets/webui.png reference after Step 4 (Gradio launch) - Fix CPU override to use runtime: runc (deploy: {} alone is insufficient) - Add docker run alternative for bypassing Compose GPU reservation - Add Linux/macOS and PowerShell variants for no-GPU docker run https://claude.ai/code/session_01CE2uPEFeYKtN3hAXQ1m7jy

Gradio 6.0 deprecated passing theme in Blocks() constructor. https://claude.ai/code/session_01CE2uPEFeYKtN3hAXQ1m7jy

Reads HF_TOKEN and TAVILY_API_KEY from .env and starts the Gradio web UI via docker run (no NVIDIA runtime required). https://claude.ai/code/session_01CE2uPEFeYKtN3hAXQ1m7jy

Gradio 6.0 requires messages as dicts with role/content keys instead of (user, bot) tuples. https://claude.ai/code/session_01CE2uPEFeYKtN3hAXQ1m7jy

https://claude.ai/code/session_01CE2uPEFeYKtN3hAXQ1m7jy

ENTRYPOINT in Dockerfile.serve already runs python serve.py. The command block should only pass arguments, not repeat the executable. https://claude.ai/code/session_01CE2uPEFeYKtN3hAXQ1m7jy

https://claude.ai/code/session_01CE2uPEFeYKtN3hAXQ1m7jy

…esalpietro/RecursiveMAS into claude/sharp-carson-nBKHX

Image not yet available — will be added in a follow-up commit. https://claude.ai/code/session_01CE2uPEFeYKtN3hAXQ1m7jy

https://claude.ai/code/session_01CE2uPEFeYKtN3hAXQ1m7jy

Docker one-click setup + Gradio web UI for all collaboration styles

claude and others added 19 commits May 25, 2026 08:37

Add .gitignore and untrack .env

5160853

.env must never be committed — it contains secrets (TAVILY_API_KEY). Remove it from git tracking and add .gitignore to prevent future accidental commits of .env and Python cache files. https://claude.ai/code/session_01CE2uPEFeYKtN3hAXQ1m7jy

Fix CUDA base image tag (cudnn9 → cudnn)

b3a89f3

NVIDIA dropped the cuDNN major version suffix from image tags. The correct tag format is now cudnn-runtime, not cudnn9-runtime. https://claude.ai/code/session_01CE2uPEFeYKtN3hAXQ1m7jy

chore: add .gitignore and .env.example

22ac206

fix: move Gradio theme to launch() for Gradio 6.0 compatibility

3e562c7

Gradio 6.0 deprecated passing theme in Blocks() constructor. https://claude.ai/code/session_01CE2uPEFeYKtN3hAXQ1m7jy

chore: add serve-cpu.bat for GPU-less Windows launch

42c7cb5

Reads HF_TOKEN and TAVILY_API_KEY from .env and starts the Gradio web UI via docker run (no NVIDIA runtime required). https://claude.ai/code/session_01CE2uPEFeYKtN3hAXQ1m7jy

fix: update Chatbot message format for Gradio 6.0 compatibility

dbacac8

Gradio 6.0 requires messages as dicts with role/content keys instead of (user, bot) tuples. https://claude.ai/code/session_01CE2uPEFeYKtN3hAXQ1m7jy

fix: add missing seed and sample_seed to fake_args in serve.py

1032efa

https://claude.ai/code/session_01CE2uPEFeYKtN3hAXQ1m7jy

fix: remove redundant python/serve.py from compose command for serve

3632cfc

ENTRYPOINT in Dockerfile.serve already runs python serve.py. The command block should only pass arguments, not repeat the executable. https://claude.ai/code/session_01CE2uPEFeYKtN3hAXQ1m7jy

feat: add first-use download info message below style dropdown

78b8a87

https://claude.ai/code/session_01CE2uPEFeYKtN3hAXQ1m7jy

Create webui.png

36dab1a

Merge branch 'claude/sharp-carson-nBKHX' of https://github.com/daniel…

1a7612d

…esalpietro/RecursiveMAS into claude/sharp-carson-nBKHX

docs: remove placeholder webui screenshot reference

cd0631f

Image not yet available — will be added in a follow-up commit. https://claude.ai/code/session_01CE2uPEFeYKtN3hAXQ1m7jy

docs: add Gradio web UI screenshot to README

533ad21

https://claude.ai/code/session_01CE2uPEFeYKtN3hAXQ1m7jy

Merge pull request #1 from danielesalpietro/claude/sharp-carson-nBKHX

4b5eea3

Docker one-click setup + Gradio web UI for all collaboration styles

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Docker one-click setup + Gradio web UI for all collaboration styles #1#23

Docker one-click setup + Gradio web UI for all collaboration styles #1#23
danielesalpietro wants to merge 19 commits into
RecursiveMAS:mainfrom
danielesalpietro:main

danielesalpietro commented May 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

danielesalpietro commented May 28, 2026

🎯 Goal

✨ What's new

🐳 Docker infrastructure

🖥️ Gradio web UI (serve.py)

🩺 Health check (healthcheck.py)

🪟 Windows / no-GPU support

🔒 Security

📖 Documentation

🚀 Quickstart (for reviewers)

🗂️ Files changed

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

🖥️ Gradio web UI (`serve.py`)

🩺 Health check (`healthcheck.py`)