Solution Strategy
The goal is to generate accurate, readable architecture diagrams with minimal friction for users and minimal context load for LLMs. The solution is an autonomous agent that:
- works locally-first (privacy, offline capability), with explicit opt-in for remote services;
- iterates on diagram source until syntax and visual quality meet an agreed bar;
- exposes both CLI and MCP modes to fit shell-first and tool-driven LLM workflows;
- returns concise outputs (source + rendered image or URL) and surfaces any remaining issues transparently.
Core Strategy: Autonomous Feedback Loop
The agent operates independently from the calling LLM to minimize context consumption:
1. Generate: Agent calls LLM to create diagram source code with examples
2. Validate: Submit to Kroki, capture syntax errors
3. Fix: If errors, agent provides feedback to LLM for correction
4. Analyze (if vision available): Render image, LLM evaluates layout/design
5. Iterate: Repeat steps 2-4 until quality threshold or limits reached
6. Return: Provide source + rendered outputs (or URL references)
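The loop above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's implementation: `run_feedback_loop`, `LoopResult`, and the three callables (`generate`, `validate`, `critique`) are hypothetical names, and the iteration limit of 3 is a placeholder. The LLM and Kroki calls are passed in as plain functions so the control flow stays independent of any particular client library.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class LoopResult:
    source: str                                      # last diagram source produced
    errors: list[str] = field(default_factory=list)  # unresolved issues, surfaced to the caller
    iterations: int = 0


def run_feedback_loop(
    description: str,
    generate: Callable[[str, Optional[str]], str],   # (description, feedback) -> diagram source
    validate: Callable[[str], Optional[str]],        # source -> error message, or None if it renders
    critique: Optional[Callable[[str], Optional[str]]] = None,  # source -> layout issue, or None
    max_iterations: int = 3,                         # hypothetical limit
) -> LoopResult:
    feedback: Optional[str] = None
    source = ""
    for i in range(1, max_iterations + 1):
        # Steps 1 and 3: first generation, or a correction driven by feedback.
        source = generate(description, feedback)
        # Step 2: syntax validation (for example, a render attempt against Kroki).
        error = validate(source)
        if error is not None:
            feedback = f"Syntax error: {error}"
            continue
        # Step 4: optional visual critique, only when a vision-capable model is available.
        if critique is not None:
            issue = critique(source)
            if issue is not None:
                feedback = f"Layout issue: {issue}"
                continue
        return LoopResult(source, [], i)
    # Limits reached: return the best attempt together with the last known issue.
    return LoopResult(source, [feedback] if feedback else [], max_iterations)
```

Because the calling LLM only sees the final `LoopResult`, the intermediate attempts and error payloads never enter its context.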
Handling Syntax Errors and Visual Quality
- Syntax resilience: Start from diagram-type templates, avoid directives known not to work, and always run a Kroki lint/render pass to surface exact line errors. The agent retries with the error payload (line numbers, message) and keeps a short history so that it does not repeat failed constructs. If two retries fail, it returns the best-attempt source plus the captured error for transparency.
- Visual quality: When vision is available, the agent asks the LLM to critique legibility (spacing, overlaps, label truncation) and requests a corrected version. Without vision, it applies heuristics: cap node/edge counts per diagram type, prefer left-to-right layouts, apply consistent skinparams/themes, and simplify (collapse groups) when thresholds are exceeded. A small library of “good defaults” per renderer (e.g., PlantUML skinparams) is injected into prompts to steer toward readable output.
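The non-vision heuristics could look roughly like the sketch below. Everything here is an assumption made for illustration: the thresholds in `MAX_EDGES`, the contents of `GOOD_DEFAULTS`, and the function names are hypothetical, and the edge counter is a deliberately crude line scan rather than a real parser. The two PlantUML directives used (`left to right direction`, `skinparam shadowing false`) are standard PlantUML syntax.

```python
# Hypothetical per-type edge caps; the document does not specify concrete numbers.
MAX_EDGES = {"plantuml": 25, "mermaid": 30}

# Hypothetical "good defaults" library, keyed by renderer.
GOOD_DEFAULTS = {
    "plantuml": ["left to right direction", "skinparam shadowing false"],
}


def count_edges(source: str) -> int:
    """Crude estimate: count lines that contain an arrow such as '->' or '<--'."""
    return sum(1 for line in source.splitlines() if "->" in line or "<-" in line)


def review_without_vision(source: str, diagram_type: str) -> list[str]:
    """Return human-readable issues that should trigger a simplification pass."""
    issues = []
    limit = MAX_EDGES.get(diagram_type)
    edges = count_edges(source)
    if limit is not None and edges > limit:
        issues.append(f"{edges} edges exceed the cap of {limit}; collapse groups")
    for directive in GOOD_DEFAULTS.get(diagram_type, []):
        if directive not in source:
            issues.append(f"missing default: {directive}")
    return issues


def prompt_hints(diagram_type: str) -> str:
    """Build the text that is injected into the generation prompt."""
    defaults = GOOD_DEFAULTS.get(diagram_type, [])
    if not defaults:
        return ""
    lines = "\n".join(f"  {d}" for d in defaults)
    return "Include these directives near the top of the diagram:\n" + lines
```

The issues returned by `review_without_vision` can be fed back to the LLM in the same way as a Kroki syntax error, so both paths share one correction mechanism.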
Design Decisions
Architecture decisions are maintained centrally in section 9 (Architecture Decisions) with dedicated ADR files (ADR-001 … ADR-008).
Technology Stack
| Component | Technology | Rationale |
|---|---|---|
| Runtime | Python 3.10+ | Existing codebase, rich ecosystem |
| LLM Abstraction | LiteLLM | Unified interface for 100+ models |
| CLI Framework | Click | Intuitive command structure, good help generation |
| MCP Implementation | FastMCP / MCP SDK | Standard protocol for LLM tool integration |
| Diagram Rendering | Kroki | Supports 20+ diagram types |
| Configuration | python-dotenv, PyYAML | Standard config management |
| HTTP Client | Requests / httpx | Kroki API communication |
| Containerization | Docker | Bundled Kroki Fat-JAR option |
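As an illustration of the Kroki row and of returning a URL reference instead of image bytes, the sketch below builds a Kroki GET URL using only the standard library. Kroki's GET endpoints expect the diagram source deflate-compressed and then base64url-encoded. The function name `kroki_url` is hypothetical, and the default `base_url` assumes a local Kroki container published on port 8000, in line with the local-first goal; adjust it for your deployment.

```python
import base64
import zlib


def kroki_url(
    source: str,
    diagram_type: str = "plantuml",
    output_format: str = "svg",
    base_url: str = "http://localhost:8000",  # assumption: local Kroki instance
) -> str:
    """Return a GET URL that renders `source` on a Kroki server."""
    # Kroki expects zlib/deflate compression followed by URL-safe base64.
    compressed = zlib.compress(source.encode("utf-8"), 9)
    encoded = base64.urlsafe_b64encode(compressed).decode("ascii")
    return f"{base_url.rstrip('/')}/{diagram_type}/{output_format}/{encoded}"


if __name__ == "__main__":
    print(kroki_url("@startuml\nAlice -> Bob: hello\n@enduml"))
```

Fetching the resulting URL with Requests or httpx returns the rendered image on success. A failing request can be treated as a validation error and its response body passed back to the agent as feedback.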