Monorepo Service Mapping
Overview
Rune is built within an existing monorepo that provides shared infrastructure (event system, API scaffolding, model training utilities, data pipeline). This document maps each new Rune component to its position in the monorepo, identifies which existing services it extends or runs alongside, and names the specific integration points.
For the component build order and dependency chain, see Build Order.
Service Mapping
New Services
| Rune Service | Path | Extends / Runs Alongside | Integration Points |
|---|---|---|---|
| rune-agent | services/rune-agent/ | LangGraph state graph (generate → execute → reflect) with 4-phase pipeline (decompose → plan → code → integrate) | Consumes libs/adapter-registry for adapter selection; uses libs/inference providers for generation; uses libs/events-py for event publishing; manages sandbox containers via libs/shared |
| training-svc | services/training-svc/ | Extends libs/model-training (FastAPI) | Consumes PEFT utilities and hypernetwork from model-training; reads adapter corpus from adapter-registry; coordinates GPU via vLLM sleep/wake REST calls managed by scripts/swarm_workers.py |
| evolution-svc | services/evolution-svc/ | FastAPI stubs; primary logic in scripts/swarm_evolution.py | Reads adapter metadata from adapter-registry; evaluates adapter fitness using held-out test sets; writes promotion/pruning events via libs/events-py |
New Libraries
| Rune Library | Path | Extends / New | Consumers |
|---|---|---|---|
| adapter-registry | libs/adapter-registry/ | New (implemented) | rune-agent, training-svc, evolution-svc, api-service |
Extended Existing Components
| Component | Path | What Changes |
|---|---|---|
| model-training | libs/model-training/ | Hypernetwork (DocToLoraHypernetwork), D2L training pipeline (d2l_train, d2l_data, d2l_probe, d2l_config, d2l_lora, d2l_prep, d2l_mining), TIES/DARE merging (merging.py), QLoRA trainer, PEFT utilities, Sakana D2L integration |
| api-service | services/api-service/ | Add REST routes: /adapters (registry CRUD), /sessions (agent session state); new SQLModel tables for session tracking |
| inference | libs/inference/ | Provider-agnostic interface (InferenceProvider ABC) with TransformersProvider, LlamaCppProvider, OllamaProvider, VLLMProvider backends and factory for configuration-based selection |
| shared | libs/shared/ | Hardware probe, sandbox (SubprocessBackend), checkpoint DB, template loader, Rune data models (CodingSession, SwarmConfig, PipelinePhase), storage utils |
| evaluation | libs/evaluation/ | OOD benchmark, fitness scoring, Pass@k metrics, generalization delta |
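The Pass@k metric listed for libs/evaluation can be illustrated with the commonly used unbiased estimator, 1 - C(n - c, k) / C(n, k), where n samples were generated and c of them passed. This is a minimal sketch of that general formula; the function name `pass_at_k` is hypothetical, and whether libs/evaluation uses this exact estimator is an assumption.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the probability that at least one of k samples passes.

    n: total samples generated, c: samples that passed, k: sampling budget.
    Hypothetical helper, not the libs/evaluation API.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failures exist, so any k-subset contains a pass.
        return 1.0
    # Complement of the chance that all k drawn samples are failures.
    return 1.0 - comb(n - c, k) / comb(n, k)
```

With k = 1 the estimate reduces to the plain pass rate c / n.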
Integration Point Details
adapter-registry (dependency root)
Every Rune component depends on the adapter registry. It provides two interfaces:
| Interface | Protocol | Consumers |
|---|---|---|
| Python API (adapter_registry.registry.AdapterRegistry) | Direct import (in-process) | rune-agent, training-svc, evolution-svc |
| REST API (via api-service) | HTTP | External tools, UI, monitoring |
The registry owns the SQLite database and the filesystem adapter store. Key exceptions: AdapterAlreadyExistsError, AdapterNotFoundError. See Adapter Storage for schema and path conventions.
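To make the two exceptions concrete, here is a minimal sketch of a SQLite-backed registry that raises them. It is an illustration only: `SketchRegistry`, its method names, and the single-table schema are assumptions, and the exception classes are local stand-ins for the ones the real library exports. See Adapter Storage for the actual schema.

```python
import sqlite3


class AdapterAlreadyExistsError(Exception):
    """Local stand-in for the registry's duplicate-ID error."""


class AdapterNotFoundError(Exception):
    """Local stand-in for the registry's missing-ID error."""


class SketchRegistry:
    """Hypothetical registry: maps an adapter ID to its path on disk."""

    def __init__(self, db_path: str = ":memory:") -> None:
        self._db = sqlite3.connect(db_path)
        # Assumed schema; the primary key enforces one row per adapter ID.
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS adapters "
            "(adapter_id TEXT PRIMARY KEY, path TEXT NOT NULL)"
        )

    def register(self, adapter_id: str, path: str) -> None:
        try:
            with self._db:  # commits on success, rolls back on error
                self._db.execute(
                    "INSERT INTO adapters VALUES (?, ?)", (adapter_id, path)
                )
        except sqlite3.IntegrityError as exc:
            raise AdapterAlreadyExistsError(adapter_id) from exc

    def get_path(self, adapter_id: str) -> str:
        row = self._db.execute(
            "SELECT path FROM adapters WHERE adapter_id = ?", (adapter_id,)
        ).fetchone()
        if row is None:
            raise AdapterNotFoundError(adapter_id)
        return row[0]
```

Mapping the database's uniqueness violation onto a domain exception keeps in-process consumers independent of SQLite details.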
GPU Coordination (vLLM sleep/wake)
GPU coordination uses vLLM sleep/wake REST calls managed by scripts/swarm_workers.py. When a training job needs the GPU, the worker puts vLLM to sleep, runs QLoRA in a subprocess, then wakes vLLM:
flowchart LR
    TrainReq([Training Request]) --> Sleep[POST /sleep to vLLM]
    Sleep --> Train[QLoRA in subprocess]
    Train --> Wake[POST /wake_up to vLLM]
    Wake --> Resume[Inference resumes]
See GPU Strategy for the full GPU coordination protocol.
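The sleep, train, wake sequence above can be sketched as a small wrapper. This is not the code in scripts/swarm_workers.py: the function name `train_with_gpu_handoff` and both callables are hypothetical, and waking vLLM even when training fails is an assumption about the desired behavior.

```python
from typing import Callable


def train_with_gpu_handoff(
    post: Callable[[str], None],
    train: Callable[[], None],
) -> None:
    """Free the GPU for one training run, then hand it back to vLLM.

    post:  sends a POST to the given vLLM route (injected so the HTTP
           client and base URL stay outside this sketch).
    train: runs the QLoRA job, e.g. by launching a subprocess.
    """
    post("/sleep")  # vLLM releases GPU memory
    try:
        train()
    finally:
        # Assumption: inference should resume even if training raises.
        post("/wake_up")
```

Injecting `post` and `train` keeps the ordering logic testable without a GPU or a running vLLM server.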
rune-agent <-> inference providers
The agent uses the InferenceProvider interface from libs/inference/ for generation. The provider is selected by a configuration-driven factory that supports multiple backends:
| Field | Value |
|---|---|
| Interface | InferenceProvider ABC (libs/inference/) |
| Backends | TransformersProvider, LlamaCppProvider, OllamaProvider, VLLMProvider |
| Selection | Factory-based, driven by configuration |
| Concurrency | Single-tenant (one agent session at a time in v1) |
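The factory-based selection summarized above can be sketched as follows. Only the name InferenceProvider comes from this document; the `generate` method, the `EchoProvider` stand-in backend, the `create_provider` function, and the config keys `backend` and `options` are all assumptions made for illustration.

```python
from abc import ABC, abstractmethod


class InferenceProvider(ABC):
    """Sketch of a provider interface; the real ABC may differ."""

    @abstractmethod
    def generate(self, prompt: str) -> str:
        """Return generated text for the prompt."""


class EchoProvider(InferenceProvider):
    """Hypothetical stand-in backend that needs no model."""

    def __init__(self, prefix: str = "") -> None:
        self.prefix = prefix

    def generate(self, prompt: str) -> str:
        return f"{self.prefix}{prompt}"


# Maps a backend name from configuration to its provider class.
_BACKENDS: dict[str, type[InferenceProvider]] = {"echo": EchoProvider}


def create_provider(config: dict) -> InferenceProvider:
    """Instantiate the backend named in config (assumed keys)."""
    name = config.get("backend")
    if name not in _BACKENDS:
        raise ValueError(f"unknown inference backend: {name!r}")
    return _BACKENDS[name](**config.get("options", {}))


provider = create_provider({"backend": "echo", "options": {"prefix": "> "}})
print(provider.generate("hello"))  # prints "> hello"
```

Because callers depend only on the abstract interface, swapping backends becomes a configuration change rather than a code change.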
evolution-svc <-> adapter-registry (lifecycle)
The evolution service reads adapter metadata, evaluates fitness on held-out tests, and writes lifecycle events. Note: evolution logic primarily lives in scripts/swarm_evolution.py, not in the service endpoints (which are stubs).
| Operation | Description |
|---|---|
| Evaluate | Run held-out tests against adapter, compute pass rate |
| Promote | Move high-fitness task adapter to domain level |
| Prune | Mark low-fitness adapters as archived (not deleted — write-once) |
| Merge | Combine overlapping adapters into a new composite adapter |
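The Evaluate, Promote, and Prune operations above can be sketched as a pass-rate computation followed by a threshold decision. The function names, the threshold values, and the "keep" outcome are hypothetical; the actual criteria live in scripts/swarm_evolution.py and are not specified here. Merge is omitted from this sketch.

```python
def pass_rate(results: list[bool]) -> float:
    """Fraction of held-out tests that passed."""
    if not results:
        raise ValueError("no test results to evaluate")
    return sum(results) / len(results)


def lifecycle_action(
    rate: float,
    promote_at: float = 0.8,   # assumed threshold, for illustration only
    prune_below: float = 0.2,  # assumed threshold, for illustration only
) -> str:
    """Map a pass rate to a lifecycle decision."""
    if rate >= promote_at:
        return "promote"  # move the task adapter to domain level
    if rate < prune_below:
        return "prune"    # mark as archived; the adapter is not deleted
    return "keep"


rate = pass_rate([True, True, False, True])
print(rate, lifecycle_action(rate))  # prints "0.75 keep"
```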
scripts/ (fat orchestrator)
The scripts/ directory is the primary execution layer, collapsing the microservice architecture into single-process orchestration:
| Script | Role |
|---|---|
| rune_runner.py | 4-phase pipeline: decompose → plan → code → integrate |
| swarm.py | Multi-agent orchestrator: agents + training pool + evolution + watchdog |
| swarm_workers.py | Training pool manager: QLoRA in subprocess, vLLM sleep/wake |
| swarm_evolution.py | Evolution worker: TIES/DARE merge, pruning, lineage tracking |
| e2e_test.py | End-to-end test exercising full pipeline |
| bootstrap.py | Path setup for scripts importing from libs/ |
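As a rough illustration of what single-process orchestration of the 4-phase pipeline can look like, the sketch below runs the phases in order and passes each phase's output to the next. The `Phase` enum, `run_pipeline`, and the handler signature are hypothetical and do not reflect the structure of rune_runner.py or the PipelinePhase model in libs/shared.

```python
from enum import Enum
from typing import Any, Callable


class Phase(Enum):
    """The four phases, declared in execution order."""

    DECOMPOSE = "decompose"
    PLAN = "plan"
    CODE = "code"
    INTEGRATE = "integrate"


def run_pipeline(task: Any, handlers: dict[Phase, Callable[[Any], Any]]) -> Any:
    """Run every phase in order, feeding each result to the next phase."""
    state = task
    for phase in Phase:  # Enum members iterate in definition order
        state = handlers[phase](state)
    return state


# Toy handlers that only record which phase ran.
handlers = {p: (lambda s, p=p: s + [p.value]) for p in Phase}
print(run_pipeline([], handlers))
# prints ['decompose', 'plan', 'code', 'integrate']
```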
Existing Services Not Modified
Rune does not otherwise modify these existing monorepo libraries, which continue operating independently. The modules listed under Role are the Rune additions (see Extended Existing Components):
| Service / Library | Role | Rune Relationship |
|---|---|---|
| libs/shared | Extended: hardware.py, checkpoint_db.py, sandbox.py, template_loader.py, rune_models.py, storage_utils.py | Consumed by scripts, services, and other libs |
| libs/evaluation | Extended: ood_benchmark.py, metrics.py | Used by evolution-svc and scripts/swarm_evolution.py for fitness evaluation |
Monorepo Layout (Post-Rune)
rune/
    scripts/                  # Fat orchestrator layer
        rune_runner.py        # 4-phase pipeline
        swarm.py              # Multi-agent orchestrator
        swarm_workers.py      # Training pool, GPU coordination
        swarm_evolution.py    # TIES/DARE merge, pruning
        e2e_test.py           # End-to-end test
    services/
        api-service/          # REST API with adapter and session routes
        rune-agent/           # LangGraph state graph: generate → execute → reflect
        training-svc/         # LoRA and hypernetwork training jobs (FastAPI)
        evolution-svc/        # Adapter lifecycle endpoints (FastAPI, stubs)
    libs/
        adapter-registry/     # SQLite + filesystem adapter store
        model-training/       # Hypernetwork, D2L pipeline, TIES/DARE, trainer
        inference/            # Provider-agnostic: Transformers, llama.cpp, Ollama, vLLM
        shared/               # Hardware, sandbox, templates, models, checkpoint DB
        evaluation/           # OOD benchmark, Pass@k, fitness scoring
        events-py/            # Event envelope and helpers
docs/ # MkDocs documentation