
Supported Document Types

Published April 16, 2026 · By Equalify Tech Team

Equalify Reflow is designed for course materials: syllabi, academic papers, policy documents, and presentations. This page is a quick lookup for what's in scope, what the pipeline produces, and where quality drops off. For the judgement side (how to evaluate a specific conversion), see "Interpret the output".

Size limits

Limit	Value	Behaviour when exceeded
File size	100 MB	API rejects with `413 Payload Too Large` at submission
Page count	50 pages	Job moves to `failed` status with error: `PDF has N pages, which exceeds the maximum of 50. Please split into smaller documents.`

Cost is roughly linear in page count, so documents close to the 50-page ceiling cost the most: plan for about $0.08–0.10 per page on a Haiku-tier run, or roughly $4–5 for a 50-page document.
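
The limits and the per-page planning figure can be checked before submission. The following is a minimal sketch, not part of the Equalify API: the function names are hypothetical, the page count is assumed to be known already, and treating 100 MB as 100 × 1024 × 1024 bytes is an assumption.

```python
MAX_FILE_BYTES = 100 * 1024 * 1024  # 100 MB limit (binary megabytes are an assumption)
MAX_PAGES = 50                      # page-count ceiling from the table above
COST_PER_PAGE_USD = (0.08, 0.10)    # planning range for a Haiku-tier run


def preflight(file_bytes: int, page_count: int) -> list[str]:
    """Return the reasons a submission would be rejected or fail (empty if none)."""
    problems = []
    if file_bytes > MAX_FILE_BYTES:
        problems.append("file exceeds 100 MB; the API would reject it with 413 Payload Too Large")
    if page_count > MAX_PAGES:
        problems.append(
            f"PDF has {page_count} pages, which exceeds the maximum of {MAX_PAGES}"
        )
    return problems


def estimate_cost(page_count: int) -> tuple[float, float]:
    """Return a (low, high) cost estimate in USD, linear in page count."""
    low, high = COST_PER_PAGE_USD
    return (round(page_count * low, 2), round(page_count * high, 2))
```

For example, `estimate_cost(50)` gives the planning range for a document at the page ceiling, and `preflight(file_bytes, page_count)` returns an empty list when both limits are respected.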

Quality by document type

Document type	Typical quality	Common issues
Syllabi and course materials	High	Occasional heading-level disagreements
Policy documents	High	Complex nested numbering schemes
Letters and memos	High	Letterhead content may be over-described
Academic chapters	Medium	Footnote ordering, reading order in multi-column layouts
Presentations (slides)	Medium	Slide boundaries, text embedded in images
Infographics and posters	Lower	Spatial relationships lost when linearised
Brochures with complex layouts	Lower	Multi-column reading order confusion

For document types it handles poorly, the pipeline attaches warnings to the job. They are visible in both the API response and the viewer.
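
A client can surface those warnings alongside the job status. The sketch below assumes a job response shaped as a dictionary with `status`, `error`, and `warnings` keys. Only the `failed` status and the error text are documented on this page; the `warnings` key name, its list shape, and the `completed` status value are assumptions for illustration.

```python
def summarise_job(job: dict) -> str:
    """Build a one-line summary from a job response (field names are hypothetical)."""
    status = job.get("status")
    if status == "failed":
        return f"failed: {job.get('error', 'no error message')}"
    # Assumed: warnings arrive as a list of strings; a missing key means none.
    warnings = job.get("warnings") or []
    if warnings:
        joined = "; ".join(str(w) for w in warnings)
        return f"{status} with {len(warnings)} warning(s): {joined}"
    return f"{status}, no warnings"


example = {
    "status": "completed",  # assumed status value
    "warnings": ["multi-column layout: reading order may be unreliable"],
}
print(summarise_job(example))
```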

Known limitations

The following are outside current scope and will produce lower-quality output:

Scanned multi-column academic chapters — reading order across columns is unreliable for scanned content
Heavy infographics — spatial relationships (flow diagrams, org charts) flatten into linear text
Mathematical equations — complex LaTeX formulas are not fully supported
Bilingual scanned documents — OCR quality degrades with mixed-language scanned content
Very long documents — while the technical limit is 50 pages, quality and cost scale with complexity; documents over ~40 pages may benefit from being split along natural section boundaries
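
Splitting along section boundaries can be planned from a list of section start pages. This is a sketch under stated assumptions: the function name is hypothetical, the section start pages come from the author (for example from a table of contents), and the 40-page target is the rough threshold mentioned above rather than a hard limit.

```python
def plan_splits(
    total_pages: int, section_starts: list[int], max_pages: int = 40
) -> list[tuple[int, int]]:
    """Group consecutive sections into parts of at most max_pages pages.

    section_starts holds the 1-based first page of each section, ascending,
    beginning with page 1. Returns inclusive (first_page, last_page) ranges.
    A single section longer than max_pages is kept whole, not cut mid-section.
    """
    if not section_starts or section_starts[0] != 1:
        raise ValueError("section_starts must begin with page 1")

    # Turn start pages into inclusive (start, end) ranges per section.
    bounds = []
    for i, start in enumerate(section_starts):
        is_last = i + 1 == len(section_starts)
        end = total_pages if is_last else section_starts[i + 1] - 1
        bounds.append((start, end))

    parts = []
    part_start, part_end = bounds[0]
    for start, end in bounds[1:]:
        if end - part_start + 1 <= max_pages:
            part_end = end  # this section still fits in the current part
        else:
            parts.append((part_start, part_end))
            part_start, part_end = start, end
    parts.append((part_start, part_end))
    return parts


# A 48-page document with sections starting on pages 1, 10, 26 and 39.
print(plan_splits(48, [1, 10, 26, 39]))  # [(1, 38), (39, 48)]
```

The greedy grouping keeps each part as long as possible without crossing the target, so the number of separate jobs stays small while every cut lands on a section boundary.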

What the output contains

Every completed job produces:

result.md — a single markdown file with the full document content (semantic headings H1–H6, alt text on images, accessible tables with header rows, reconstructed lists, inline hyperlinks, logical reading order)
Figures — individual image files (PNG) for each extracted figure/chart/diagram, each tied to a figure_id referenced from the markdown
Change ledger — a JSON record of every edit the pipeline made, with before/after text and a one-sentence reason per edit
Bundle — optional ZIP of the above, downloadable from the /bundle endpoint
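
The change ledger lends itself to a quick review script. The ledger schema is not specified on this page, so the sketch below assumes a JSON array of objects with `before`, `after`, and `reason` keys; those names and the top-level array are assumptions, and should be adjusted to match the actual file.

```python
import json


def summarise_ledger(ledger_json: str) -> list[str]:
    """Return one summary line per edit: its reason and the change in text length."""
    entries = json.loads(ledger_json)  # assumed: a top-level JSON array
    lines = []
    for number, entry in enumerate(entries, start=1):
        before, after = entry["before"], entry["after"]  # hypothetical keys
        lines.append(
            f"{number}. {entry['reason']} ({len(before)} -> {len(after)} chars)"
        )
    return lines


sample = json.dumps([
    {
        "before": "INTRODUCTION",
        "after": "## Introduction",
        "reason": "Promoted all-caps line to a heading.",
    }
])
for line in summarise_ledger(sample):
    print(line)
```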

Decorative images (logos, spacers) are identified and left with empty alt text, following WCAG best practices. Informational images get descriptive alt text generated by the image sub-agent.
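
Because decorative images carry empty alt text, the split between decorative and informational images can be audited directly from result.md. The following is a minimal sketch: it assumes images appear in standard markdown `![alt](target)` syntax, and the file paths in the sample are made up for illustration.

```python
import re

# Matches ![alt](target) and ![alt](target "title"); captures alt text and target.
IMAGE_RE = re.compile(r'!\[([^\]]*)\]\(([^)\s]+)[^)]*\)')


def classify_images(markdown: str) -> dict[str, list[str]]:
    """Sort image targets into decorative (empty alt) and informational (non-empty alt)."""
    result: dict[str, list[str]] = {"decorative": [], "informational": []}
    for alt, target in IMAGE_RE.findall(markdown):
        key = "informational" if alt.strip() else "decorative"
        result[key].append(target)
    return result


sample_md = (
    "# Course syllabus\n\n"
    "![](figures/logo.png)\n\n"
    "![Bar chart of enrolment by year](figures/fig_1.png)\n"
)
print(classify_images(sample_md))
```

Reviewing the `decorative` list by hand is a quick way to catch an informational image that was left without a description.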