Architecture

Published March 23, 2026 · By Equalify Tech Team

Equalify Reflow is composed of three services that work together: a conversion engine, a WordPress plugin, and a feedback service. This page covers the conversion engine architecture in detail.

System Components

                    ┌──────────────────┐
                    │  WordPress Site  │
                    │  (reflow-wp)     │
                    └────────┬─────────┘
                             │
              ┌──────────────┼──────────────┐
              │              │              │
              ▼              ▼              ▼
┌──────────────────────────────────────────────────┐
│              Equalify Reflow API                  │
│              (FastAPI + Uvicorn)                   │
│                                                    │
│  ┌──────────┐  ┌──────────┐  ┌──────────────────┐│
│  │ Document  │  │ Pipeline │  │ Approval         ││
│  │ Endpoints │  │ Viewer   │  │ Endpoints        ││
│  └─────┬────┘  └─────┬────┘  └────────┬─────────┘│
│        │             │                │           │
│  ┌─────┴─────────────┴────────────────┴─────────┐│
│  │           Service Layer                       ││
│  │  ┌────────────┐  ┌──────────┐  ┌───────────┐ ││
│  │  │ Processing │  │ Storage  │  │ Queue     │ ││
│  │  │ Service    │  │ Service  │  │ Service   │ ││
│  │  └──────┬─────┘  └─────┬───┘  └─────┬─────┘ ││
│  └─────────┼──────────────┼─────────────┼───────┘│
└────────────┼──────────────┼─────────────┼────────┘
             │              │             │
    ┌────────┼──────────────┼─────────────┼────────┐
    │        ▼              ▼             ▼        │
    │  ┌──────────┐  ┌──────────┐  ┌──────────┐   │
    │  │ Docling  │  │    S3    │  │  Redis   │   │
    │  │ Serve    │  │          │  │          │   │
    │  └──────────┘  └──────────┘  └──────────┘   │
    │              Infrastructure                   │
    └──────────────────────────────────────────────┘

System component diagram showing the three-layer architecture: a WordPress site connects to the Equalify Reflow API (FastAPI + Uvicorn), which contains Document, Pipeline Viewer, and Approval endpoint groups. These feed into a shared Service Layer (Processing, Storage, and Queue services), which communicates with the infrastructure layer containing Docling Serve (PDF extraction), S3 (document storage), and Redis (job state and queuing).

Conversion Engine (equalify-reflow)

The core service. A FastAPI application that accepts PDF uploads, runs the five-stage pipeline, and returns accessible markdown.

Key responsibilities:

WordPress Plugin (equalify-reflow-wp)

Integrates the conversion engine with WordPress. Administrators process PDFs from the Media Library; results are stored as WordPress posts and served through a built-in viewer.

See the WordPress Plugin Guide.

Feedback Service (equalify-reflow-feedback)

A separate FastAPI + SQLite service that collects issue reports and text corrections from viewers. Provides filtering, aggregation, and a Metabase dashboard for analyzing feedback patterns.

Data Flow

Standard Processing Flow

1. PDF uploaded → S3 temp bucket
2. PII scan (Microsoft Presidio)
   ├─ Pass → queue for processing
   └─ Fail → hold for human approval
3. Pipeline processing (5 stages)
   └─ Each stage: AI agent processes → edits recorded in change ledger
4. Results stored in S3 results bucket
5. Job marked completed → SSE event emitted
6. Client downloads markdown + figures via pre-signed S3 URLs

Data flow diagram showing the six steps of document processing: PDF upload to S3, PII scanning with pass/fail branching, five-stage pipeline processing with change ledger recording, results storage in S3, job completion notification via SSE, and client download of markdown and figures via pre-signed URLs.
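The pass/fail branch in step 2 can be sketched in a few lines of Python. This is a minimal in-memory illustration, not the engine's code: the `Intake` class, the `JobStatus` names, and the assumption that an approved job is released onto the same queue are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable


class JobStatus(str, Enum):
    # Hypothetical status names, chosen for this sketch only.
    QUEUED = "queued"
    PENDING_APPROVAL = "pending_approval"


@dataclass
class Intake:
    """In-memory stand-in for the processing queue and the approval hold."""

    scan: Callable[[str], bool]  # returns True when the text contains PII
    queue: deque = field(default_factory=deque)
    held: dict = field(default_factory=dict)

    def submit(self, job_id: str, text: str) -> JobStatus:
        # Fail: keep the document out of the queue until a human decides.
        if self.scan(text):
            self.held[job_id] = text
            return JobStatus.PENDING_APPROVAL
        # Pass: hand the job to the background workers.
        self.queue.append(job_id)
        return JobStatus.QUEUED

    def approve(self, job_id: str) -> JobStatus:
        # Assumption: approval releases the held job onto the normal queue.
        self.held.pop(job_id)
        self.queue.append(job_id)
        return JobStatus.QUEUED
```

With a toy scanner such as `Intake(scan=lambda t: "@" in t)`, a document containing an email address is held, and calling `approve` moves it onto the queue behind any jobs that passed the scan.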

SSE Streaming Architecture

Real-time progress is delivered via Server-Sent Events. The architecture is designed so the pipeline runs independently of client connections:

This pattern is used by both the built-in viewer and the WordPress plugin.
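One common way to decouple a pipeline from its SSE clients is for the pipeline to write to an append-only event log, with each connection reading from that log at its own pace. The sketch below assumes that design and keeps the log in memory. The real service may well use Redis pub/sub instead, and `JobEventLog`, `format_sse`, and the event names are hypothetical.

```python
import json


def format_sse(event: str, data: dict, event_id: int) -> str:
    """Serialize one event in the text/event-stream wire format."""
    return f"id: {event_id}\nevent: {event}\ndata: {json.dumps(data)}\n\n"


class JobEventLog:
    """Append-only log the pipeline writes to; it never waits on readers."""

    def __init__(self) -> None:
        self._events: list[tuple[str, dict]] = []

    def publish(self, event: str, data: dict) -> None:
        # Called by the pipeline whether or not any client is connected.
        self._events.append((event, data))

    def replay(self, last_event_id: int = 0):
        """Yield SSE frames after last_event_id so a reconnecting client catches up."""
        missed = self._events[last_event_id:]
        for event_id, (event, data) in enumerate(missed, start=last_event_id + 1):
            yield format_sse(event, data, event_id)
```

Because event IDs are just positions in the log, a client that reconnects with the last ID it saw receives only the frames it missed, and a client that never connects costs the pipeline nothing.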

Service Layer

ProcessingService

Orchestrates the conversion pipeline. Manages the dossier (document context that accumulates through pipeline stages), coordinates AI agents, and records the change ledger.
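To make the dossier and change ledger concrete, here is a minimal sketch of how a stage's output could be applied and recorded. The `Dossier`, `Change`, and `run_pipeline` names and their fields are assumptions for illustration; the actual data model is not described on this page.

```python
from dataclasses import dataclass, field


@dataclass
class Change:
    stage: str
    before: str
    after: str


@dataclass
class Dossier:
    markdown: str
    notes: dict = field(default_factory=dict)   # context accumulated across stages
    ledger: list = field(default_factory=list)  # every recorded edit, in order

    def apply(self, stage: str, new_markdown: str) -> None:
        # Record an entry only when the stage actually changed the document.
        if new_markdown != self.markdown:
            self.ledger.append(Change(stage, self.markdown, new_markdown))
            self.markdown = new_markdown


def run_pipeline(dossier: Dossier, stages) -> Dossier:
    # Each stage is a (name, function) pair; the function sees the whole dossier.
    for name, stage_fn in stages:
        dossier.apply(name, stage_fn(dossier))
    return dossier
```

Keeping the before and after text in each ledger entry means any edit can be reviewed or reverted without rerunning the stage that produced it.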

StorageService

Wraps S3 operations with circuit breakers and retry logic. Handles upload, download, and pre-signed URL generation for both temp and results buckets.
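The combination of a circuit breaker and retries can be sketched without any S3 dependency. The thresholds, delays, and class names below are illustrative assumptions, not the values or API that StorageService uses.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; skipping call")
            self.opened_at = None  # half-open: let one trial call through
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # any success closes the circuit again
        return result


def with_retry(fn, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Retry fn with exponential backoff, but never retry an open circuit."""
    for attempt in range(attempts):
        try:
            return fn()
        except CircuitOpenError:
            raise
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

A caller would wrap each storage operation as `with_retry(lambda: breaker.call(upload, key, body))`, where `upload` stands for whatever function performs the S3 request. Retries absorb transient errors, while the breaker stops further calls once failures keep accumulating.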

QueueService

Redis-based job queuing. Documents are enqueued after PII scanning and dequeued by background workers.

JobService

Manages job state in Redis using Lua scripts for atomic operations. Tracks status transitions, stores metadata, and publishes state-change events.
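An atomic status transition is typically a compare-and-set: change the status only if it still has the expected value, and publish the change in the same step. The Lua script below is a hedged sketch of that idea, paired with a pure-Python model of the same logic. The hash field, the channel argument, and the status names in `ALLOWED` are all hypothetical.

```python
# Sketch of a compare-and-set transition; the key layout is an assumption.
# KEYS[1] = job hash, ARGV[1] = expected status, ARGV[2] = new status,
# ARGV[3] = pub/sub channel for state-change events.
TRANSITION_LUA = """
local current = redis.call('HGET', KEYS[1], 'status')
if current ~= ARGV[1] then
  return 0
end
redis.call('HSET', KEYS[1], 'status', ARGV[2])
redis.call('PUBLISH', ARGV[3], ARGV[2])
return 1
"""

# Hypothetical transition table; the real status names may differ.
ALLOWED = {
    "pending_approval": {"queued", "rejected"},
    "queued": {"processing"},
    "processing": {"completed", "failed"},
}


def can_transition(current: str, new: str) -> bool:
    return new in ALLOWED.get(current, set())


class InMemoryJobs:
    """Pure-Python model of what the script does, useful for tests."""

    def __init__(self) -> None:
        self.status: dict = {}
        self.events: list = []

    def transition(self, job_id: str, expected: str, new: str) -> bool:
        if not can_transition(expected, new):
            raise ValueError(f"illegal transition {expected} -> {new}")
        if self.status.get(job_id) != expected:
            return False  # another worker changed the status first
        self.status[job_id] = new
        self.events.append((job_id, new))
        return True
```

Running the check and the write inside one script means two workers cannot both move the same job from `queued` to `processing`: the second call sees the updated status and returns 0.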

PIIDetectionService

Scans document text using Microsoft Presidio. Detects email addresses, phone numbers, SSNs, and other PII entity types. The confidence threshold is configurable.
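The effect of a confidence threshold can be shown with a small filter. The `Finding` class below only mirrors the fields Presidio reports for each detection (entity type, start and end offsets, score). It is a stand-in for this sketch, and the default threshold of 0.5 is an assumption rather than the service's setting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    entity_type: str
    start: int
    end: int
    score: float


def above_threshold(findings, threshold: float = 0.5):
    """Keep only detections whose confidence meets the threshold."""
    return [f for f in findings if f.score >= threshold]


def needs_approval(findings, threshold: float = 0.5) -> bool:
    # A document is held when at least one detection survives the filter.
    return bool(above_threshold(findings, threshold))
```

Raising the threshold reduces false positives that would send clean documents to human review, at the cost of letting lower-confidence detections through.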

AI Agent Architecture

The pipeline uses PydanticAI to define agents with tool-call interfaces. Each agent:

Tool registration is conditional — vision tools are only provided when the task involves images, reducing prompt token waste for text-only work.
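Conditional registration amounts to building the tool list per task before the agent is constructed. The sketch below uses plain Python functions and does not call the PydanticAI API; the tool names and the `tools_for` helper are hypothetical.

```python
from typing import Callable


def replace_text(old: str, new: str) -> str:
    """Hypothetical text-editing tool."""
    return f"replace {old!r} with {new!r}"


def describe_figure(figure_id: str) -> str:
    """Hypothetical vision tool that would inspect a page image."""
    return f"describe {figure_id}"


TEXT_TOOLS: list[Callable] = [replace_text]
VISION_TOOLS: list[Callable] = [describe_figure]


def tools_for(task_has_images: bool) -> list[Callable]:
    # Text tools are always offered; vision tools only when images are involved.
    tools = list(TEXT_TOOLS)
    if task_has_images:
        tools += VISION_TOOLS
    return tools
```

Since every registered tool's schema is sent to the model with each request, leaving vision tools out of text-only tasks keeps those prompts shorter.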

Model

The pipeline uses Claude Haiku (via AWS Bedrock) for all AI processing steps. Model configuration is managed centrally in src/agents/model_tiers.py.

Infrastructure

Local Development

make dev  # Starts everything via Docker Compose

Service         Port    Purpose
API Gateway     8080    FastAPI application
Redis           6379    Job state, queues, pub/sub
LocalStack      4566    S3 emulation
Docling Serve   5001    PDF extraction sidecar
Prometheus      9090    Metrics collection
Grafana         3001    Monitoring dashboards
Jaeger          16686   Distributed tracing

Code is mounted into the container with hot reload enabled: edit files on your host and changes take effect immediately.

Production (AWS ECS)

Resilience