Supported Document Types

Published April 16, 2026 · By Equalify Tech Team

Equalify Reflow is designed for course materials: syllabi, academic papers, policy documents, and presentations. This page is a quick lookup for what's in scope, what the pipeline produces, and where quality drops off. For guidance on evaluating a specific conversion, see "interpret the output".

Size limits

| Limit | Value | Behaviour when exceeded |
| --- | --- | --- |
| File size | 100 MB | API rejects with 413 Payload Too Large at submission |
| Page count | 50 pages | Job moves to failed status with error: PDF has N pages, which exceeds the maximum of 50. Please split into smaller documents. |

Cost is roughly linear in page count, so documents close to the 50-page ceiling are also the most expensive to process. Plan for about $0.08–0.10 per page on a Haiku-tier run.
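The limits and the per-page cost range above can be checked before submission. The following is a minimal sketch, not part of Reflow itself: the `preflight` function and `Preflight` class are hypothetical names, the page count is assumed to come from whatever PDF tooling you already use, and "100 MB" is assumed to mean 100 × 1024 × 1024 bytes.

```python
from dataclasses import dataclass, field

MAX_FILE_BYTES = 100 * 1024 * 1024  # assumption: "100 MB" interpreted as 100 MiB
MAX_PAGES = 50                      # page ceiling from the size-limits table
COST_PER_PAGE_USD = (0.08, 0.10)    # planning range for a Haiku-tier run


@dataclass
class Preflight:
    """Hypothetical result object: problems found plus a rough cost range."""
    problems: list[str] = field(default_factory=list)
    cost_low_usd: float = 0.0
    cost_high_usd: float = 0.0

    @property
    def ok(self) -> bool:
        return not self.problems


def preflight(file_size_bytes: int, page_count: int) -> Preflight:
    """Check a document against the documented limits and estimate cost."""
    result = Preflight()
    if file_size_bytes > MAX_FILE_BYTES:
        result.problems.append("file exceeds 100 MB; expect 413 Payload Too Large")
    if page_count > MAX_PAGES:
        result.problems.append(
            f"{page_count} pages exceeds the maximum of {MAX_PAGES}; split the document"
        )
    low, high = COST_PER_PAGE_USD
    # Cost is treated as strictly linear in page count, which is only approximate.
    result.cost_low_usd = round(page_count * low, 2)
    result.cost_high_usd = round(page_count * high, 2)
    return result


check = preflight(file_size_bytes=5_000_000, page_count=48)
print(check.ok, check.cost_low_usd, check.cost_high_usd)
```

Running the check locally avoids a round trip for documents that would be rejected anyway, and the cost range is a planning figure only.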

Quality by document type

| Document type | Typical quality | Common issues |
| --- | --- | --- |
| Syllabi and course materials | High | Occasional heading-level disagreements |
| Policy documents | High | Complex nested numbering schemes |
| Letters and memos | High | Letterhead content may be over-described |
| Academic chapters | Medium | Footnote ordering, reading order in multi-column layouts |
| Presentations (slides) | Medium | Slide boundaries, text embedded in images |
| Infographics and posters | Lower | Spatial relationships lost when linearised |
| Brochures with complex layouts | Lower | Multi-column reading order confusion |

The pipeline emits warnings on the job response for document types it handles poorly, visible in both the API response and the viewer.
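A client can surface those warnings, and the page-limit failure described under size limits, with a small helper. This is a sketch under stated assumptions: the job response is assumed to be a JSON object with `status`, `error`, and `warnings` keys, which this page does not confirm, and `summarise_job` is a hypothetical name. Only the error message wording is taken from the table above.

```python
import re

# Matches the documented failure message, e.g.
# "PDF has 62 pages, which exceeds the maximum of 50."
PAGE_LIMIT_ERROR = re.compile(
    r"PDF has (\d+) pages, which exceeds the maximum of (\d+)"
)


def summarise_job(job: dict) -> dict:
    """Summarise a decoded job response.

    The keys read here (`status`, `error`, `warnings`) are assumptions
    about the response shape, not a documented schema.
    """
    summary = {
        "status": job.get("status"),
        "warnings": list(job.get("warnings") or []),
        "pages_over_limit": None,
    }
    match = PAGE_LIMIT_ERROR.search(job.get("error") or "")
    if summary["status"] == "failed" and match:
        pages, limit = int(match.group(1)), int(match.group(2))
        summary["pages_over_limit"] = pages - limit
    return summary


failed = {
    "status": "failed",
    "error": "PDF has 62 pages, which exceeds the maximum of 50. "
             "Please split into smaller documents.",
}
print(summarise_job(failed))
```

Check the actual field names against a real job response before relying on this.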

Known limitations

The following are outside current scope and will produce lower-quality output:

  • Scanned multi-column academic chapters — reading order across columns is unreliable for scanned content
  • Heavy infographics — spatial relationships (flow diagrams, org charts) flatten into linear text
  • Mathematical equations — complex LaTeX formulas are not fully supported
  • Bilingual scanned documents — OCR quality degrades with mixed-language scanned content
  • Very long documents — while the technical limit is 50 pages, quality and cost scale with complexity; documents over ~40 pages may benefit from being split along natural section boundaries
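For the last item, splitting along section boundaries can be planned before any PDF is cut. The sketch below assumes you already know the first page of each section (for example from the table of contents); `plan_splits` is a hypothetical helper, and the 40-page default simply mirrors the rough threshold mentioned above.

```python
def plan_splits(
    section_starts: list[int], total_pages: int, max_pages: int = 40
) -> list[tuple[int, int]]:
    """Group consecutive sections into chunks of at most `max_pages` pages.

    section_starts: 1-based first page of each section, ascending, starting at 1.
    Returns inclusive (first_page, last_page) ranges, one per chunk.
    A single section longer than `max_pages` is kept whole as its own chunk.
    """
    if not section_starts or section_starts[0] != 1:
        raise ValueError("section_starts must begin at page 1")

    # Turn start pages into inclusive (first, last) ranges per section.
    ends = [start - 1 for start in section_starts[1:]] + [total_pages]
    sections = list(zip(section_starts, ends))

    chunks = []
    chunk_first, chunk_last = sections[0]
    for first, last in sections[1:]:
        if last - chunk_first + 1 <= max_pages:
            chunk_last = last  # section still fits in the current chunk
        else:
            chunks.append((chunk_first, chunk_last))
            chunk_first, chunk_last = first, last
    chunks.append((chunk_first, chunk_last))
    return chunks


# A 48-page document with sections starting on pages 1, 12, 30 and 41.
print(plan_splits([1, 12, 30, 41], total_pages=48, max_pages=30))
```

The grouping is greedy: it fills each chunk with as many whole sections as fit, so no section is cut in the middle.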

What the output contains

Every completed job produces:

  • result.md — a single markdown file with the full document content (semantic headings H1–H6, alt text on images, accessible tables with header rows, reconstructed lists, inline hyperlinks, logical reading order)
  • Figures — individual image files (PNG) for each extracted figure/chart/diagram, each tied to a figure_id referenced from the markdown
  • Change ledger — a JSON record of every edit the pipeline made, with before/after text and a one-sentence reason per edit
  • Bundle — optional ZIP of the above, downloadable from the /bundle endpoint
Decorative images (logos, spacers) are identified and left with empty alt text, following WCAG best practices. Informational images get descriptive alt text generated by the image sub-agent.
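One way to review that split is to list the images in result.md by whether their alt text is empty. This is a sketch only: it assumes figures appear as standard inline markdown images (`![alt](target)`), which this page does not spell out, and `audit_images` and the sample paths are hypothetical.

```python
import re

# Standard inline markdown image syntax: ![alt text](target "optional title")
IMAGE = re.compile(r"!\[(?P<alt>[^\]]*)\]\((?P<target>[^)\s]+)[^)]*\)")


def audit_images(markdown: str) -> dict[str, list[str]]:
    """Split image targets into decorative (empty alt) and informational."""
    report: dict[str, list[str]] = {"decorative": [], "informational": []}
    for match in IMAGE.finditer(markdown):
        kind = "informational" if match.group("alt").strip() else "decorative"
        report[kind].append(match.group("target"))
    return report


# Hypothetical excerpt; real figure paths and IDs will differ.
sample = (
    "![](logo.png)\n\n"
    "![Bar chart of enrolment by term](figures/fig_1.png)\n"
)
print(audit_images(sample))
```

A decorative entry that actually conveys information, or the reverse, is worth checking by hand.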