Docker one-click setup + Gradio web UI for all collaboration styles #1#23
Open
danielesalpietro wants to merge 19 commits into
Open
Docker one-click setup + Gradio web UI for all collaboration styles #1#23danielesalpietro wants to merge 19 commits into
danielesalpietro wants to merge 19 commits into
Conversation
Adds Dockerfile (nvidia/cuda 12.4 base), docker-compose.yml with GPU reservation and a named volume for HF model cache, and .dockerignore. HF_TOKEN and TAVILY_API_KEY are passed as env vars at runtime — not baked into the image. https://claude.ai/code/session_01CE2uPEFeYKtN3hAXQ1m7jy
healthcheck.py verifies the environment in three progressive levels without downloading model weights: Python deps + internal imports (L1), CUDA device availability and allocation (L2), HF Hub reachability via lightweight metadata call (L3). https://claude.ai/code/session_01CE2uPEFeYKtN3hAXQ1m7jy
serve.py patches load_agent_model_and_tokenizer / release_resources in the base inference module before any submodule imports, keeping all agent models warm in VRAM across requests. Single questions are run through the existing pipeline via a temp medqa-format JSON dataset; structured output is captured with --result_jsonl. UI exposes all five collaboration styles via a dropdown, with sliders for recursive rounds and latent steps. Style switching evicts the VRAM cache automatically. Also adds Dockerfile.serve (inherits cuda base + installs gradio), requirements-serve.txt, and a `serve` service in docker-compose.yml. https://claude.ai/code/session_01CE2uPEFeYKtN3hAXQ1m7jy
.env must never be committed — it contains secrets (TAVILY_API_KEY). Remove it from git tracking and add .gitignore to prevent future accidental commits of .env and Python cache files. https://claude.ai/code/session_01CE2uPEFeYKtN3hAXQ1m7jy
NVIDIA dropped the cuDNN major version suffix from image tags. The correct tag format is now cudnn-runtime, not cudnn9-runtime. https://claude.ai/code/session_01CE2uPEFeYKtN3hAXQ1m7jy
Covers image build, docker compose up, Gradio web UI launch, 3-level health check procedure, and CPU fallback workaround for systems without GPU passthrough (including WSL2 fix steps). Also updates the repository structure listing with new Docker files. https://claude.ai/code/session_01CE2uPEFeYKtN3hAXQ1m7jy
…ructions
- Add assets/webui.png reference after Step 4 (Gradio launch)
- Fix CPU override to use runtime: runc (deploy: {} alone is insufficient)
- Add docker run alternative for bypassing Compose GPU reservation
- Add Linux/macOS and PowerShell variants for no-GPU docker run
https://claude.ai/code/session_01CE2uPEFeYKtN3hAXQ1m7jy
Gradio 6.0 deprecated passing theme in Blocks() constructor. https://claude.ai/code/session_01CE2uPEFeYKtN3hAXQ1m7jy
Reads HF_TOKEN and TAVILY_API_KEY from .env and starts the Gradio web UI via docker run (no NVIDIA runtime required). https://claude.ai/code/session_01CE2uPEFeYKtN3hAXQ1m7jy
Gradio 6.0 requires messages as dicts with role/content keys instead of (user, bot) tuples. https://claude.ai/code/session_01CE2uPEFeYKtN3hAXQ1m7jy
ENTRYPOINT in Dockerfile.serve already runs python serve.py. The command block should only pass arguments, not repeat the executable. https://claude.ai/code/session_01CE2uPEFeYKtN3hAXQ1m7jy
…esalpietro/RecursiveMAS into claude/sharp-carson-nBKHX
Image not yet available — will be added in a follow-up commit. https://claude.ai/code/session_01CE2uPEFeYKtN3hAXQ1m7jy
Docker one-click setup + Gradio web UI for all collaboration styles
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
🎯 Goal
Make RecursiveMAS accessible to everyone — researchers, students, and curious minds — with no prior technical knowledge required. Clone the repo, fill in two lines in a
.envfile, run one command, and the full multi-agent reasoning system is up and running in your browser in under 60 seconds.✨ What's new
🐳 Docker infrastructure
Dockerfile— GPU-ready image for batch inference (run.py)Dockerfile.serve— separate image for the Gradio web UI (serve.py)docker-compose.yml— orchestrates both services with a sharedhf_cachenamed volume; readsHF_TOKENandTAVILY_API_KEYfrom.env.dockerignore— keeps secrets and cache out of the build context🖥️ Gradio web UI (
serve.py)launch())🩺 Health check (
healthcheck.py)Three-level check to verify the container environment before running inference:
🪟 Windows / no-GPU support
serve-cpu.bat— one double-click to launch the web UI on CPU; reads credentials from.envautomaticallydocker-compose.override.ymlpattern documented in README withruntime: runcto bypass NVIDIA hook on WSL2 systems without GPU passthrough configuredwsl --list --verbose, Docker Desktop WSL integration)🔒 Security
.gitignoreadded —.envis never tracked by git.env.exampleprovided as a safe template📖 Documentation
🚀 Quickstart (for reviewers)
No GPU on your machine right now? Create
docker-compose.override.yml:Then
docker compose up serve— the UI runs on CPU (slower, but fully functional for exploration).🗂️ Files changed
DockerfileDockerfile.servedocker-compose.ymlserve.pyhealthcheck.pyserve-cpu.batrequirements-serve.txt.dockerignore.gitignore.env.exampleREADME.md