2.3.87 Satellite ML Intern
Handle: ml-intern
URL: http://localhost:34870
ML Intern is a Hugging Face ML engineering agent that can research papers, inspect Hugging Face resources, train models, and ship ML-related code. Harbor builds the web/API service from the upstream Dockerfile and wires it to local OpenAI-compatible backends through LiteLLM.

# Build the image from the upstream GitHub repository
harbor build ml-intern
# Start with Ollama as the default local model backend
harbor up ml-intern ollama --open
# Or start only the web service and configure an external provider
harbor up ml-intern --open

The first build clones the upstream repository and builds both the Node frontend and the Python backend, so it can take several minutes. The web UI is served on port 7860 inside the container and mapped to HARBOR_ML_INTERN_HOST_PORT on the host.
For cloud providers, set the relevant shared Harbor keys before starting:
harbor config set openai.key sk-...
harbor config set anthropic.key sk-ant-...
harbor config set hf.token hf_...
harbor config set ml-intern.github.token ghp_...
harbor env ml-intern GITHUB_TOKEN ghp_...

The following options can be set via harbor config:
# Host port for the ML Intern web UI/API
HARBOR_ML_INTERN_HOST_PORT=34870
# Upstream Git build context
HARBOR_ML_INTERN_GIT_REF="https://github.com/huggingface/ml-intern.git#main"
# Persistent workspace root
HARBOR_ML_INTERN_WORKSPACE="./services/ml-intern"
# Default model id passed to ML Intern
HARBOR_ML_INTERN_MODEL="ollama/qwen3.5:9b"
# Model names used by Harbor cross-service integrations
HARBOR_ML_INTERN_OLLAMA_MODEL="qwen3.5:9b"
HARBOR_ML_INTERN_LLAMACPP_MODEL="auto"
HARBOR_ML_INTERN_VLLM_MODEL="Qwen/Qwen3.5-4B"
# Shared local LiteLLM fallback endpoint for standalone launches
HARBOR_ML_INTERN_LOCAL_LLM_BASE_URL=""
HARBOR_ML_INTERN_LOCAL_LLM_API_KEY=""
# Optional GitHub token for repository tooling
HARBOR_ML_INTERN_GITHUB_TOKEN=""
# Session trace settings. Harbor disables trace sharing by default.
HARBOR_ML_INTERN_SHARE_TRACES=false
HARBOR_ML_INTERN_SESSION_DATASET_REPO="smolagents/ml-intern-sessions"
# Agent approval defaults
HARBOR_ML_INTERN_YOLO_MODE=false
HARBOR_ML_INTERN_CONFIRM_CPU_JOBS=true
HARBOR_ML_INTERN_AUTO_FILE_UPLOAD=true

ML Intern also receives these shared Harbor variables:
| Harbor Variable | Container Variable | Purpose |
|---|---|---|
| HARBOR_HF_TOKEN | HF_TOKEN | Hugging Face API access and sandbox/Hub operations |
| HARBOR_HF_CACHE | HF_HOME mount | Shared Hugging Face cache |
| HARBOR_OPENAI_KEY | OPENAI_API_KEY | OpenAI provider key |
| HARBOR_ANTHROPIC_KEY | ANTHROPIC_API_KEY | Anthropic provider key |

ML Intern uses LiteLLM model prefixes for local providers. Harbor cross-files set the right model id and base URL when the backend is started alongside ml-intern.
# Ollama
harbor up ml-intern ollama
# llama.cpp
harbor up ml-intern llamacpp
# vLLM
harbor up ml-intern vllm

The Ollama integration sets ML_INTERN_MODEL=ollama/${HARBOR_ML_INTERN_OLLAMA_MODEL} and OLLAMA_BASE_URL=${HARBOR_OLLAMA_INTERNAL_URL}. The vLLM integration sets the corresponding vllm/ model prefix plus an OpenAI-compatible /v1 endpoint.
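
The prefix convention described here can be sketched as a small shell helper. This is an illustration only: the function name ml_intern_model_id is hypothetical and not part of Harbor, and the duplicate-prefix stripping is an assumption added for convenience rather than documented behavior.

```shell
# Hypothetical helper: compose the prefixed model id that ML Intern receives.
# Backend names mirror the ones in this page; the function itself is illustrative.
ml_intern_model_id() {
  local backend="$1"
  local raw_id="$2"
  case "$backend" in
    ollama|llamacpp|vllm) ;;
    *) echo "unsupported backend: $backend" >&2; return 1 ;;
  esac
  # Assumption: drop an accidental leading prefix so it is never doubled.
  raw_id="${raw_id#"$backend"/}"
  printf '%s/%s\n' "$backend" "$raw_id"
}
```

For example, `ml_intern_model_id ollama "$HARBOR_ML_INTERN_OLLAMA_MODEL"` would print the same value the Ollama integration assigns to ML_INTERN_MODEL.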
The llama.cpp integration defaults to HARBOR_ML_INTERN_LLAMACPP_MODEL=auto. On startup, ML Intern queries http://llamacpp:8080/v1/models and selects a ranked text/code model from the advertised router catalog. The selector avoids obvious non-chat models such as image, vision, embedding, audio, STT, TTS, and reranker models, then prefers code, instruct, common chat model families, and explicit moderate model sizes.
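
To make the ranking idea concrete, here is a minimal sketch of a selector in shell. It is not the upstream implementation: the function name select_llamacpp_model, the keyword patterns, the example model families, and the score weights are all assumptions, and the preference for moderate model sizes is left out for brevity.

```shell
# Hypothetical selector: reads model ids on stdin (one per line) and prints
# the best-ranked text/code model. Keywords and weights are assumptions.
select_llamacpp_model() {
  local id lower score
  local best=""
  local best_score=-1
  while IFS= read -r id; do
    [ -n "$id" ] || continue
    lower=$(printf '%s' "$id" | tr '[:upper:]' '[:lower:]')
    # Skip obvious non-chat models.
    case "$lower" in
      *embed*|*vision*|*image*|*audio*|*stt*|*tts*|*rerank*) continue ;;
    esac
    score=0
    case "$lower" in *code*) score=$((score + 3)) ;; esac
    case "$lower" in *instruct*) score=$((score + 2)) ;; esac
    case "$lower" in *qwen*|*llama*|*mistral*|*gemma*) score=$((score + 1)) ;; esac
    # Strict comparison keeps the first id when scores tie.
    if [ "$score" -gt "$best_score" ]; then
      best="$id"
      best_score=$score
    fi
  done
  if [ -z "$best" ]; then
    echo "no text-capable model advertised" >&2
    return 1
  fi
  printf '%s\n' "$best"
}
```

Assuming jq is installed and the endpoint returns the usual OpenAI-style list, the catalog could be fed in with `curl -s http://llamacpp:8080/v1/models | jq -r '.data[].id' | select_llamacpp_model` from inside the Harbor network. A non-zero exit mirrors the documented behavior of failing when only non-text models are available.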
Auto selection is a convenience default, not a substitute for choosing the model you want. If llama.cpp only advertises non-text models, startup fails instead of selecting an unsuitable embedding, vision, audio, STT, TTS, or reranker model. For reproducible work, list available ids and pin one explicitly:
harbor llamacpp models
harbor config set ml-intern.llamacpp.model unsloth/Qwen3.5-4B-GGUF:Q4_K_M
harbor restart ml-intern

When pinned, use the raw llama.cpp model id from harbor llamacpp models. Do not include the llamacpp/ prefix in Harbor config; the integration adds that prefix for ML Intern.
Note: ML Intern's upstream UI currently labels the configured default-model slot as "Claude Opus 4.6" even when Harbor points it at a local backend. The authoritative value is the model id returned by curl http://localhost:34870/api/config/model.
| Mount | Description |
|---|---|
| ${HARBOR_ML_INTERN_WORKSPACE}/data:/app/session_logs | ML Intern session logs |
| ${HARBOR_ML_INTERN_WORKSPACE}/workspace:/workspace | User workspace for files created or edited by the agent |
| ${HARBOR_HF_CACHE}:/home/user/.cache/huggingface | Shared Hugging Face cache and token storage |
| ./services/ml-intern/config.json:/app/configs/*.json:ro | Harbor-managed ML Intern config |

Harbor runs an init sidecar before the main container to create and chown the workspace directories for the non-root upstream user.
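
What such an init step does can be sketched roughly as follows. This is not Harbor's actual sidecar script: the function name init_ml_intern_workspace is hypothetical, and the default owner of 1000:1000 is an assumption about the upstream non-root user. Only the data and workspace directory names come from the mount table.

```shell
# Hypothetical init step: create the workspace directories and hand them to
# the non-root user. The 1000:1000 default owner is an assumption.
init_ml_intern_workspace() {
  local root="$1"
  local owner="${2:-1000:1000}"
  mkdir -p "$root/data" "$root/workspace" || return 1
  # chown needs root; skip it otherwise so the sketch stays runnable anywhere.
  if [ "$(id -u)" -eq 0 ]; then
    chown -R "$owner" "$root/data" "$root/workspace"
  fi
}
```

Running `init_ml_intern_workspace ./services/ml-intern` by hand would recreate the same layout if the directories were removed or ended up with the wrong owner.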
harbor logs ml-intern
harbor down ml-intern

If the UI loads but agent calls fail, verify that the selected model has a reachable backend. For Ollama, start with harbor up ml-intern ollama and make sure HARBOR_ML_INTERN_OLLAMA_MODEL names a model available in Ollama. For llama.cpp, check what ML Intern selected:
curl http://localhost:34870/api/config/model
curl -f http://localhost:34870/api/health/llm

The LLM health endpoint returns HTTP 503 when the configured backend is not reachable or rejects the test request.
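
For scripted checks, the status code can be read directly rather than relying on curl's exit code. The helpers below are a sketch: both function names are hypothetical, and treating any 2xx status as healthy is an assumption, since only the 503 case is documented here.

```shell
# Hypothetical helper: turn the HTTP status of /api/health/llm into a hint.
describe_llm_health() {
  case "$1" in
    2??) echo "ok: backend reachable" ;;  # assumption: any 2xx means healthy
    503) echo "unavailable: backend unreachable or rejected the test request" ;;
    000) echo "no response: ml-intern itself is not reachable" ;;
    *)   echo "unexpected status: $1" ;;
  esac
}

# Print only the status code; curl reports 000 when no HTTP response arrives.
llm_health_status() {
  curl -s -o /dev/null -w '%{http_code}' "${1:-http://localhost:34870}/api/health/llm"
}
```

A typical call would be `describe_llm_health "$(llm_health_status)"`, which separates a stopped ml-intern container (000) from a running one whose backend is down (503).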
If the selected llama.cpp model is not the one you intended, pin a raw id from harbor llamacpp models:
harbor config set ml-intern.llamacpp.model unsloth/Qwen3.5-4B-GGUF:Q4_K_M
harbor restart ml-intern

If Hugging Face tools fail, set HARBOR_HF_TOKEN:
harbor config set hf.token hf_...
harbor restart ml-intern

If GitHub repository operations fail or hit rate limits, set a GitHub token through Harbor config or a service env override:
harbor config set ml-intern.github.token ghp_...
# Or set a raw container env override:
harbor env ml-intern GITHUB_TOKEN ghp_...
harbor restart ml-intern