Solution Strategy

The goal is to generate accurate, readable architecture diagrams with minimal friction for users and minimal context load for LLMs. The solution is an autonomous agent that:

  • works local-first (privacy, offline capability), with explicit opt-in for remote services;

  • iterates on diagram source until syntax and visual quality meet an agreed bar;

  • exposes both CLI and MCP modes to fit shell-first and tool-driven LLM workflows;

  • returns concise outputs (source + rendered image or URL) and surfaces any remaining issues transparently.
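The local-first rule with explicit opt-in can be illustrated with a small sketch. Everything named here (`RenderConfig`, `allow_remote`, `resolve_kroki_url`, and the local address) is hypothetical and chosen for illustration only; the actual configuration keys are not specified in this section.

```python
from dataclasses import dataclass

# Assumed addresses: a Kroki instance on this machine and the public service.
LOCAL_KROKI_URL = "http://localhost:8000"
PUBLIC_KROKI_URL = "https://kroki.io"


@dataclass(frozen=True)
class RenderConfig:
    # Hypothetical flag: remote rendering stays off unless the user opts in.
    allow_remote: bool = False


def resolve_kroki_url(config: RenderConfig, local_available: bool) -> str:
    """Prefer the local renderer; use the remote one only after opt-in."""
    if local_available:
        return LOCAL_KROKI_URL
    if config.allow_remote:
        return PUBLIC_KROKI_URL
    raise RuntimeError(
        "No local Kroki instance available and remote rendering is not enabled"
    )
```

With the default configuration, a missing local renderer results in an error rather than a silent fallback to a remote service.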

Core Strategy: Autonomous Feedback Loop

The agent operates independently from the calling LLM to minimize context consumption:

  1. Generate: The agent prompts the LLM, with examples, to create diagram source code

  2. Validate: The agent submits the source to Kroki and captures any syntax errors

  3. Fix: If there are errors, the agent feeds them back to the LLM for correction

  4. Analyze (if vision is available): The agent renders the image and the LLM evaluates layout and design

  5. Iterate: Repeat steps 2-4 until the quality threshold is met or the iteration limits are reached

  6. Return: The agent provides the source plus rendered outputs (or URL references)

Figure: feedback loop
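The steps above can be sketched as a loop over injected callables. This is a minimal illustration under stated assumptions, not the agent's actual code: `run_feedback_loop`, `LoopResult`, the callable signatures, and the default of three iterations are all hypothetical. In a real setup `generate` would wrap the LLM call, `validate` the Kroki request, and `critique` the optional vision check.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class LoopResult:
    source: str                       # last diagram source produced
    error: Optional[str]              # remaining syntax error, if any
    issues: List[str] = field(default_factory=list)  # remaining layout issues
    iterations: int = 0


def run_feedback_loop(
    request: str,
    generate: Callable[[str, List[str]], str],
    validate: Callable[[str], Optional[str]],
    critique: Optional[Callable[[str], List[str]]] = None,
    max_iterations: int = 3,  # assumed limit
) -> LoopResult:
    feedback: List[str] = []  # short history passed back to the generator
    source, error, issues = "", None, []
    for iteration in range(1, max_iterations + 1):
        issues = []
        source = generate(request, feedback)   # steps 1 and 3
        error = validate(source)               # step 2
        if error is not None:
            feedback.append(f"Syntax error: {error}")
            continue
        if critique is not None:               # step 4, only with vision
            issues = critique(source)
        if not issues:
            return LoopResult(source, None, [], iteration)
        feedback.extend(f"Layout issue: {issue}" for issue in issues)
    # Limits reached: return the best attempt together with what is still wrong.
    return LoopResult(source, error, issues, max_iterations)
```

Because the loop runs inside the agent, the calling LLM only sees the final `LoopResult`, not the intermediate attempts.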

Handling Syntax Errors and Visual Quality

  • Syntax resilience: Start from diagram-type templates, avoid directives known not to work, and always run a Kroki lint/render pass to surface exact line errors. The agent retries with the error payload (line numbers, message) and keeps a short history so it does not repeat failed constructs. If two retries fail, it returns the best-attempt source plus the captured error for transparency.

  • Visual quality: When vision is available, the agent asks the LLM to critique legibility (spacing, overlaps, label truncation) and requests a corrected version. Without vision, it applies heuristics: cap node/edge counts per diagram type, prefer left-to-right layouts, apply consistent skinparams/themes, and simplify (collapse groups) when exceeding thresholds. A small library of “good defaults” per renderer (e.g., PlantUML skinparams) is injected into prompts to steer toward readable output.
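The non-vision heuristics could look roughly like the following sketch. The thresholds, the edge-counting regular expression, and the function name `heuristic_issues` are assumptions made for illustration; only the PlantUML directive `left to right direction` is taken from the renderer itself.

```python
import re
from typing import Dict, List

# Assumed per-type caps; real thresholds would need tuning per renderer.
MAX_EDGES: Dict[str, int] = {"plantuml": 25, "mermaid": 25}
DEFAULT_MAX_EDGES = 20

# Rough approximation: any line containing an arrow such as -> or --> is an edge.
EDGE_PATTERN = re.compile(r"-+>")


def heuristic_issues(diagram_type: str, source: str) -> List[str]:
    """Return readability warnings that can be derived without rendering."""
    issues: List[str] = []
    edges = sum(1 for line in source.splitlines() if EDGE_PATTERN.search(line))
    limit = MAX_EDGES.get(diagram_type, DEFAULT_MAX_EDGES)
    if edges > limit:
        issues.append(
            f"{edges} edges exceed the limit of {limit}; "
            "collapse groups or split the diagram"
        )
    if diagram_type == "plantuml" and "left to right direction" not in source:
        issues.append("no 'left to right direction' directive found")
    return issues
```

The returned strings can be fed back to the LLM in the same way as a vision critique, so both paths share one correction step.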

Design Decisions

Architecture decisions are maintained centrally in section 9 (Architecture Decisions) with dedicated ADR files (ADR-001 … ADR-008).

Technology Stack

Component           Technology             Rationale
------------------  ---------------------  -------------------------------------------------
Runtime             Python 3.10+           Existing codebase, rich ecosystem
LLM Abstraction     LiteLLM                Unified interface for 100+ models
CLI Framework       Click                  Intuitive command structure, good help generation
MCP Implementation  FastMCP / MCP SDK      Standard protocol for LLM tool integration
Diagram Rendering   Kroki                  Supports 20+ diagram types
Configuration       python-dotenv, PyYAML  Standard config management
HTTP Client         Requests / httpx       Kroki API communication
Containerization    Docker                 Bundled Kroki Fat-JAR option
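As an illustration of the Kroki API communication listed above, the following sketch builds a Kroki GET URL, which is one way to return a URL reference instead of an image. It follows the encoding described in the Kroki documentation (zlib deflate, then URL-safe Base64). The function name `kroki_url` and the local base URL in the usage example are assumptions.

```python
import base64
import zlib


def kroki_url(base_url: str, diagram_type: str, output_format: str, source: str) -> str:
    """Build a Kroki GET URL with the diagram source embedded in the path."""
    compressed = zlib.compress(source.encode("utf-8"), 9)
    encoded = base64.urlsafe_b64encode(compressed).decode("ascii")
    return f"{base_url.rstrip('/')}/{diagram_type}/{output_format}/{encoded}"


# Usage with an assumed local Kroki instance:
url = kroki_url("http://localhost:8000", "plantuml", "svg", "@startuml\nA -> B\n@enduml")
```

The same function works against a local container or the public service, since only `base_url` changes.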