Response Structure
Top-Level Fields
| Field | Type | Description |
|---|---|---|
| job_id | string | Unique identifier for this job |
| duration | number | Processing time in seconds |
| result | object | Contains parsed content (see below) |
| usage.num_pages | integer | Number of pages processed |
| usage.credits | number | Credits consumed |
| pdf_url | string | Temporary URL to download the processed PDF |
| studio_link | string | Link to view results in Reducto Studio |
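As a minimal sketch of reading these fields, the snippet below flattens a response that has already been decoded into a Python dict. It assumes that `usage` is a nested object (as the dotted names in the table suggest); the helper name `summarize_response` and every value in the sample are hypothetical placeholders, not real API output.

```python
from typing import Any


def summarize_response(response: dict[str, Any]) -> dict[str, Any]:
    """Collect the documented top-level fields into one flat dict."""
    # Assumption: "usage" is a nested object holding num_pages and credits.
    usage = response.get("usage") or {}
    return {
        "job_id": response.get("job_id"),
        "duration_s": response.get("duration"),
        "num_pages": usage.get("num_pages"),
        "credits": usage.get("credits"),
        "pdf_url": response.get("pdf_url"),
        "studio_link": response.get("studio_link"),
    }


# Placeholder response; every value here is made up for illustration.
sample_response = {
    "job_id": "job-placeholder",
    "duration": 2.5,
    "result": {"chunks": []},
    "usage": {"num_pages": 3, "credits": 3.0},
    "pdf_url": "https://example.com/processed.pdf",
    "studio_link": "https://example.com/studio/job-placeholder",
}

summary = summarize_response(sample_response)
print(summary)
```

Using `.get` keeps the helper tolerant of fields that are absent from a given response, at the cost of returning `None` rather than raising.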
Result Types: Full vs URL
Parse can return results in two ways:
- Full (inline)
- URL (external)

With a full (inline) result, you can read result.chunks directly.
Understanding Chunks
Chunks are the primary output unit, optimized for RAG and embedding workflows.
| Field | Description |
|---|---|
| content | Markdown-formatted content of this chunk |
| embed | Embedding-optimized version (may include table/figure summaries) |
| blocks | Array of individual content blocks with positions |
| enriched | AI-enriched content (when enrich config enabled) |
| enrichment_success | Whether enrichment completed successfully |
content vs embed
- content: Raw extracted content that preserves the original text
- embed: Optimized for vector embeddings; may include:
  - Table summaries (natural language descriptions)
  - Figure summaries (AI-generated descriptions)

Use embed for your vector database and content for display.
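The split between the two fields can be sketched as follows: one record per chunk, with `embed` reserved for the embedding step and `content` kept for display. The function name `to_vector_records`, the record keys, and the fallback to `content` when `embed` is empty are assumptions made for this illustration; the sample chunk is a placeholder.

```python
from typing import Any


def to_vector_records(chunks: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Build one record per chunk for a vector store."""
    records = []
    for index, chunk in enumerate(chunks):
        # Assumption: fall back to content if embed is missing or empty.
        embedding_text = chunk.get("embed") or chunk["content"]
        records.append(
            {
                "id": index,
                "embedding_text": embedding_text,  # send this to the embedding model
                "display_text": chunk["content"],  # show this to the user
            }
        )
    return records


# Placeholder chunks for illustration only.
sample_chunks = [
    {
        "content": "| Item | Price |\n|---|---|\n| Widget | $10 |",
        "embed": "Table listing one item, Widget, priced at $10.",
    },
    {"content": "Thank you for your business."},
]

records = to_vector_records(sample_chunks)
for record in records:
    print(record["id"], record["embedding_text"])
```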
Understanding Blocks
Blocks are the atomic content elements within each chunk. Every paragraph, table, header, and figure is a separate block.
Block Types
| Type | Description | Example Content |
|---|---|---|
| Title | Document title | "Invoice #12345" |
| Section Header | Section headings | "Payment Terms" |
| Header | Page headers | "Page 1 of 5" |
| Footer | Page footers | "Confidential" |
| Text | Body paragraphs | "Thank you for your business…" |
| Table | Tabular data | HTML/Markdown table |
| Figure | Images and charts | Caption or AI description |
| Key Value | Label-value pairs | "Total: $1,234.56" |
| List Item | Bulleted/numbered items | "• First item" |
| Checkbox | Form checkboxes | "☑ Agree to terms" |
Block Fields
| Field | Type | Description |
|---|---|---|
| type | string | Block type (see table above) |
| content | string | The actual content |
| bbox | object | Position and size on the page |
| confidence | string | "high" or "low" |
| granular_confidence | object | Numeric confidence scores |
| image_url | string or null | URL to block image (if return_images enabled) |
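Because each block carries a `type`, a common first step is to group blocks by type, for example to pull out every table. The sketch below assumes the chunks have already been loaded as a list of dicts shaped like the tables above; `blocks_by_type` is a hypothetical helper and the sample data is a placeholder.

```python
from collections import defaultdict
from typing import Any


def blocks_by_type(chunks: list[dict[str, Any]]) -> dict[str, list[dict[str, Any]]]:
    """Group every block across all chunks by its type string."""
    grouped: dict[str, list[dict[str, Any]]] = defaultdict(list)
    for chunk in chunks:
        for block in chunk.get("blocks", []):
            grouped[block["type"]].append(block)
    return dict(grouped)


# Placeholder chunks for illustration only.
sample_chunks = [
    {
        "blocks": [
            {"type": "Title", "content": "Invoice #12345"},
            {"type": "Text", "content": "Thank you for your business."},
        ]
    },
    {
        "blocks": [
            {"type": "Table", "content": "<table>...</table>"},
            {"type": "Text", "content": "Payment is due in 30 days."},
        ]
    },
]

grouped = blocks_by_type(sample_chunks)
print(sorted(grouped))
print(len(grouped["Text"]))
```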
Bounding Box Coordinates
Every block includes a bbox object describing its position on the page.
Coordinate System
All coordinates are normalized to [0, 1] relative to page dimensions:
| Field | Description |
|---|---|
| left | Distance from left edge (0 = left edge, 1 = right edge) |
| top | Distance from top edge (0 = top, 1 = bottom) |
| width | Block width as fraction of page width |
| height | Block height as fraction of page height |
| page | Page number (1-indexed) in the processed output |
| original_page | Page number in the source document |
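Since the coordinates are fractions of the page size, drawing a box on a rendered page image means scaling them by that image's pixel dimensions. The sketch below shows the arithmetic; the function name `bbox_to_pixels` and the 800 x 1000 page size are assumptions for illustration, and the bbox values are placeholders.

```python
def bbox_to_pixels(
    bbox: dict[str, float], page_width_px: int, page_height_px: int
) -> tuple[int, int, int, int]:
    """Convert a normalized bbox into (left, top, right, bottom) pixels."""
    left = bbox["left"] * page_width_px
    top = bbox["top"] * page_height_px
    # Right and bottom edges come from adding the normalized size.
    right = (bbox["left"] + bbox["width"]) * page_width_px
    bottom = (bbox["top"] + bbox["height"]) * page_height_px
    return (round(left), round(top), round(right), round(bottom))


# Placeholder bbox on a page rendered at 800 x 1000 pixels.
sample_bbox = {"left": 0.125, "top": 0.25, "width": 0.5, "height": 0.125, "page": 1}
pixel_box = bbox_to_pixels(sample_bbox, page_width_px=800, page_height_px=1000)
print(pixel_box)  # (100, 250, 500, 375)
```

The origin is the top-left corner, so `top` grows downward, which matches the convention most image libraries use.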
page and original_page differ when using page_range to process a subset of pages. For example, if you process pages 5-10, source page 5 is reported as page: 1 with original_page: 5.
Confidence Scores
Parse provides confidence scores to help you identify potentially problematic extractions.
String Confidence
- high: Extraction is reliable
- low: May need review; consider enabling agentic mode
Granular Confidence
| Field | Description |
|---|---|
| parse_confidence | Numeric score (0-1) for parsing accuracy |
| extract_confidence | Numeric score for extraction (when using Extract endpoint) |
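One way to act on these scores is to route questionable blocks to manual review. The sketch below flags a block when its string confidence is "low" or when parse_confidence falls under a cutoff. The helper name `needs_review` and the 0.8 threshold are assumptions chosen for illustration, not values recommended by the source; the sample blocks are placeholders.

```python
from typing import Any


def needs_review(block: dict[str, Any], min_parse_confidence: float = 0.8) -> bool:
    """Return True when a block looks unreliable by either confidence signal."""
    if block.get("confidence") == "low":
        return True
    # granular_confidence may be absent; treat a missing score as "no signal".
    granular = block.get("granular_confidence") or {}
    score = granular.get("parse_confidence")
    return score is not None and score < min_parse_confidence


# Placeholder blocks for illustration only.
sample_blocks = [
    {"type": "Text", "confidence": "high", "granular_confidence": {"parse_confidence": 0.97}},
    {"type": "Table", "confidence": "low", "granular_confidence": {"parse_confidence": 0.41}},
    {"type": "Key Value", "confidence": "high", "granular_confidence": {"parse_confidence": 0.62}},
    {"type": "Footer", "confidence": "high"},
]

flagged = [block["type"] for block in sample_blocks if needs_review(block)]
print(flagged)  # ['Table', 'Key Value']
```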
Complete Example
Here’s a complete example showing the full structure:
Full Response Example
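As an illustrative sketch of how the pieces nest, the Python literal below is assembled only from the fields documented in the tables above. Every value is a placeholder, the nesting of `chunks` under `result` and of `usage` as an object is assumed from the field names, and the enrichment fields are omitted because their value types are not specified here. It is not captured API output.

```python
# Illustrative response shape only: every value below is a placeholder.
example_response = {
    "job_id": "job-placeholder",
    "duration": 1.5,
    "usage": {"num_pages": 1, "credits": 1.0},
    "pdf_url": "https://example.com/processed.pdf",
    "studio_link": "https://example.com/studio/job-placeholder",
    "result": {
        "chunks": [
            {
                "content": "# Invoice #12345\n\nThank you for your business.",
                "embed": "# Invoice #12345\n\nThank you for your business.",
                "blocks": [
                    {
                        "type": "Title",
                        "content": "Invoice #12345",
                        "bbox": {
                            "left": 0.1,
                            "top": 0.05,
                            "width": 0.8,
                            "height": 0.05,
                            "page": 1,
                            "original_page": 1,
                        },
                        "confidence": "high",
                        "granular_confidence": {"parse_confidence": 0.98},
                        "image_url": None,
                    },
                ],
            }
        ]
    },
}

# Walk the structure: response -> result -> chunks -> blocks.
for chunk in example_response["result"]["chunks"]:
    for block in chunk["blocks"]:
        print(block["type"], block["bbox"]["page"], block["confidence"])
```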
Related
- Parse Overview: Quick start and basic usage.
- Best Practices: Optimize for your document types.
- Chunking Methods: Control how content is split.
- Table Output Formats: HTML, Markdown, JSON, CSV options.