Parse Response Format

Parse returns structured content in a format optimized for RAG and LLM applications. This page explains every field in the response.

Response Structure

{
  "job_id": "7600c8c5-a52f-49d2-8a7d-d75d1b51e141",
  "duration": 3.89,
  "result": {
    "type": "full",
    "chunks": [ ... ]
  },
  "usage": {
    "num_pages": 1,
    "credits": 2.0
  },
  "pdf_url": "https://...",
  "studio_link": "https://studio.reducto.ai/job/..."
}

Top-Level Fields

Field	Type	Description
`job_id`	string	Unique identifier for this job
`duration`	number	Processing time in seconds
`result`	object	Contains parsed content (see below)
`usage.num_pages`	integer	Number of pages processed
`usage.credits`	number	Credits consumed
`pdf_url`	string	Temporary URL to download the processed PDF
`studio_link`	string	Link to view results in Reducto Studio
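As a minimal sketch of reading these fields, assume the response has already been decoded into a Python dict. The `response` value below is a trimmed copy of the example above, and `summarize_job` is a hypothetical helper for illustration, not part of any SDK.

```python
from typing import Any


def summarize_job(response: dict[str, Any]) -> str:
    """Build a one-line summary from the top-level fields of a Parse response."""
    usage = response.get("usage", {})
    return (
        f"job {response['job_id']}: "
        f"{usage.get('num_pages', 0)} page(s), "
        f"{usage.get('credits', 0.0)} credit(s), "
        f"{response['duration']:.2f}s"
    )


# Trimmed copy of the example response shown earlier on this page.
response = {
    "job_id": "7600c8c5-a52f-49d2-8a7d-d75d1b51e141",
    "duration": 3.89,
    "result": {"type": "full", "chunks": []},
    "usage": {"num_pages": 1, "credits": 2.0},
}

summary = summarize_job(response)
```

Logging a summary like this alongside `studio_link` can make it easier to trace a job back to its visual result.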

Result Types: Full vs URL

Parse can return results in two ways:

Full (inline)

{
  "result": {
    "type": "full",
    "chunks": [
      { "content": "...", "blocks": [...] }
    ]
  }
}

When returned: The response is under roughly 6 MB, which covers most documents.
How to use: Access result.chunks directly.

URL (external)

{
  "result": {
    "type": "url",
    "url": "https://storage.reducto.ai/chunks/abc123.json"
  }
}

When returned: The document is large enough that an inline result would exceed HTTP response limits.
How to use: Fetch the JSON from result.url.

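import requests

# `result` is the response object returned by your Parse call.
if result.result.type == "url":
    # Large results are stored externally; download the JSON from the URL.
    response = requests.get(result.result.url, timeout=30)
    response.raise_for_status()  # fail loudly on an expired or invalid URL
    chunks_data = response.json()
else:
    # Inline results already carry the chunks.
    chunks_data = result.result.chunks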

To always receive a URL (useful for consistent handling), set force_url_result: true in your request.

Result URLs expire after 1 hour. Download or process the content promptly after receiving the response. This applies to:

result.url (when type: "url")
pdf_url (processed PDF)
image_url on blocks (when return_images is enabled)
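One way to guard against using a stale URL is to record when the response arrived and check its age before fetching. The sketch below assumes the one-hour window starts when you receive the response. `SAFETY_MARGIN` and `url_is_stale` are hypothetical names chosen for illustration, not API values.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

URL_LIFETIME = timedelta(hours=1)      # expiry window stated on this page
SAFETY_MARGIN = timedelta(minutes=5)   # assumed buffer, not an API value


def url_is_stale(received_at: datetime, now: Optional[datetime] = None) -> bool:
    """Return True once a URL is within SAFETY_MARGIN of its assumed expiry."""
    now = now or datetime.now(timezone.utc)
    return now - received_at >= URL_LIFETIME - SAFETY_MARGIN


# Example: a response received at noon UTC.
received_at = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
fresh = url_is_stale(received_at, received_at + timedelta(minutes=30))
stale = url_is_stale(received_at, received_at + timedelta(minutes=56))
```

If a URL is stale, the safest option is usually to treat the download as failed rather than attempt the request.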

Understanding Chunks

Chunks are the primary output unit, optimized for RAG and embedding workflows.

{
  "content": "# Invoice\n\nBill To: Acme Corp\n123 Main St...",
  "embed": "# Invoice\n\nBill To: Acme Corp\n123 Main St...",
  "blocks": [ ... ],
  "enriched": null,
  "enrichment_success": false
}

Field	Description
`content`	Markdown-formatted content of this chunk
`embed`	Embedding-optimized version (may include table/figure summaries)
`blocks`	Array of individual content blocks with positions
`enriched`	AI-enriched content (when enrich config enabled)
`enrichment_success`	Whether enrichment completed successfully

content vs embed

content: Raw extracted content, preserves original text
embed: Optimized for vector embeddings, may include:
- Table summaries (natural language descriptions)
- Figure summaries (AI-generated descriptions)

For RAG: Use embed for your vector database, content for display.
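The split between the two fields can be sketched as follows, assuming `chunks` is the decoded `result.chunks` list. `to_vector_records` and the record keys are hypothetical; adapt them to whatever your vector database expects.

```python
from typing import Any


def to_vector_records(chunks: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Pair each chunk's embedding text with the text to show to users."""
    records = []
    for index, chunk in enumerate(chunks):
        records.append(
            {
                "id": index,
                # Text to send to the embedding model; fall back to content
                # if embed is missing or empty.
                "text_to_embed": chunk.get("embed") or chunk["content"],
                # Original extracted text, kept for display.
                "display_text": chunk["content"],
            }
        )
    return records


# Sample chunk shaped like the example above.
chunks = [
    {
        "content": "# Invoice\n\nBill To: Acme Corp",
        "embed": "# Invoice\n\nBill To: Acme Corp\n\nSummary of the line-item table.",
    }
]

records = to_vector_records(chunks)
```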

Understanding Blocks

Blocks are the atomic content elements within each chunk. Every paragraph, table, header, and figure is a separate block.

{
  "type": "Table",
  "content": "<table><tr><th>Date</th><th>Amount</th></tr>...</table>",
  "bbox": {
    "left": 0.076,
    "top": 0.427,
    "width": 0.834,
    "height": 0.432,
    "page": 1,
    "original_page": 1
  },
  "confidence": "high",
  "granular_confidence": {
    "parse_confidence": 0.834,
    "extract_confidence": null
  },
  "image_url": null
}

Block Types

Type	Description	Example Content
`Title`	Document title	"Invoice #12345"
`Section Header`	Section headings	"Payment Terms"
`Header`	Page headers	"Page 1 of 5"
`Footer`	Page footers	"Confidential"
`Text`	Body paragraphs	"Thank you for your business…"
`Table`	Tabular data	HTML/Markdown table
`Figure`	Images and charts	Caption or AI description
`Key Value`	Label-value pairs	"Total: $1,234.56"
`List Item`	Bulleted/numbered items	"• First item"
`Checkbox`	Form checkboxes	"☑ Agree to terms"

Block Fields

Field	Type	Description
`type`	string	Block type (see table above)
`content`	string	The actual content
`bbox`	object	Position and size on the page
`confidence`	string	"high" or "low"
`granular_confidence`	object	Numeric confidence scores
`image_url`	string | null	URL to block image (if `return_images` enabled)
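Because every block carries a `type`, pulling out one kind of content is a simple filter. The sketch below assumes `chunks` is the decoded `result.chunks` list. `blocks_of_type` is a hypothetical helper, and the sample data is abbreviated.

```python
from typing import Any


def blocks_of_type(chunks: list[dict[str, Any]], block_type: str) -> list[dict[str, Any]]:
    """Return every block of the given type, in document order."""
    return [
        block
        for chunk in chunks
        for block in chunk.get("blocks", [])
        if block.get("type") == block_type
    ]


# Sample data shaped like the chunk and block examples on this page.
chunks = [
    {
        "content": "...",
        "blocks": [
            {"type": "Title", "content": "Bank Statement"},
            {"type": "Table", "content": "<table>...</table>"},
            {"type": "Text", "content": "Thank you for your business"},
        ],
    }
]

tables = blocks_of_type(chunks, "Table")
```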

Bounding Box Coordinates

Every block includes a bbox object describing its position on the page.

{
  "left": 0.076,
  "top": 0.427,
  "width": 0.834,
  "height": 0.432,
  "page": 1,
  "original_page": 1
}

Coordinate System

All coordinates are normalized to [0, 1] relative to page dimensions:

(0,0) ─────────────────────────── (1,0)
  │                                 │
  │    ┌─────────────────┐          │
  │    │     Block       │          │
  │    │  left=0.1       │          │
  │    │  top=0.2        │          │
  │    │  width=0.5      │          │
  │    │  height=0.3     │          │
  │    └─────────────────┘          │
  │                                 │
(0,1) ─────────────────────────── (1,1)

Field	Description
`left`	Distance from left edge (0 = left edge, 1 = right edge)
`top`	Distance from top edge (0 = top, 1 = bottom)
`width`	Block width as fraction of page width
`height`	Block height as fraction of page height
`page`	Page number (1-indexed) in the processed output
`original_page`	Page number in the source document

page and original_page differ when you use page_range to process a subset of pages. For example, if you process pages 5-10, source page 5 is returned with page: 1 and original_page: 5.
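To draw a block on a rendered page image, scale the normalized values by the image size. The sketch below assumes you know the pixel dimensions of the rendered page. `bbox_to_pixels` is a hypothetical helper, and the 1000 x 2000 page size is an arbitrary example.

```python
def bbox_to_pixels(bbox: dict, page_width: int, page_height: int) -> tuple:
    """Convert a normalized bbox to (left, top, right, bottom) in pixels."""
    left = round(bbox["left"] * page_width)
    top = round(bbox["top"] * page_height)
    right = round((bbox["left"] + bbox["width"]) * page_width)
    bottom = round((bbox["top"] + bbox["height"]) * page_height)
    return left, top, right, bottom


# The block from the coordinate diagram above, on an assumed 1000 x 2000 px page.
bbox = {"left": 0.1, "top": 0.2, "width": 0.5, "height": 0.3, "page": 1, "original_page": 1}
pixel_box = bbox_to_pixels(bbox, page_width=1000, page_height=2000)
```

When a page_range is in effect, use `original_page` rather than `page` to pick which page of the source document to draw on.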

Confidence Scores

Parse provides confidence scores to help you identify potentially problematic extractions.

String Confidence

"confidence": "high"  // or "low"

high: Extraction is reliable
low: May need review; consider enabling agentic mode

Granular Confidence

"granular_confidence": {
  "parse_confidence": 0.834,
  "extract_confidence": null
}

Field	Description
`parse_confidence`	Numeric score (0-1) for parsing accuracy
`extract_confidence`	Numeric score for extraction (when using Extract endpoint)
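The two signals can be combined into a simple review filter. This is a sketch only: `REVIEW_THRESHOLD` is an assumed cut-off chosen for illustration, not an API default, and `needs_review` is a hypothetical helper.

```python
REVIEW_THRESHOLD = 0.8  # assumed cut-off for illustration, not an API default


def needs_review(block: dict, threshold: float = REVIEW_THRESHOLD) -> bool:
    """Flag a block if its string confidence is low or its parse score is below threshold."""
    if block.get("confidence") == "low":
        return True
    # granular_confidence or parse_confidence may be missing or null.
    score = (block.get("granular_confidence") or {}).get("parse_confidence")
    return score is not None and score < threshold


# Blocks shaped like the examples on this page.
blocks = [
    {"type": "Title", "confidence": "high", "granular_confidence": {"parse_confidence": 0.95}},
    {"type": "Table", "confidence": "high", "granular_confidence": {"parse_confidence": 0.78}},
    {"type": "Text", "confidence": "low", "granular_confidence": None},
]

flagged = [block["type"] for block in blocks if needs_review(block)]
```

Tune the threshold against your own documents; a stricter value flags more blocks for manual checking.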

Complete Example

The following response shows every field described above for a single-page bank statement:

{
  "job_id": "7600c8c5-a52f-49d2-8a7d-d75d1b51e141",
  "duration": 3.89,
  "result": {
    "type": "full",
    "chunks": [
      {
        "content": "# Bank Statement\n\nAccount: 12345678\nPeriod: Jan 1 - Jan 31, 2024\n\n## Transactions\n\n| Date | Description | Amount |\n|------|-------------|--------|\n| 01/05 | Direct Deposit | $2,500.00 |\n| 01/10 | Grocery Store | -$85.32 |",
        "embed": "# Bank Statement\n\nAccount: 12345678\nPeriod: Jan 1 - Jan 31, 2024\n\n## Transactions\n\nThis table shows transactions including a direct deposit of $2,500 and a grocery purchase of $85.32.",
        "blocks": [
          {
            "type": "Title",
            "content": "Bank Statement",
            "bbox": {
              "left": 0.35,
              "top": 0.02,
              "width": 0.30,
              "height": 0.03,
              "page": 1,
              "original_page": 1
            },
            "confidence": "high",
            "granular_confidence": {
              "parse_confidence": 0.95,
              "extract_confidence": null
            },
            "image_url": null
          },
          {
            "type": "Key Value",
            "content": "Account: 12345678",
            "bbox": {
              "left": 0.10,
              "top": 0.08,
              "width": 0.25,
              "height": 0.02,
              "page": 1,
              "original_page": 1
            },
            "confidence": "high",
            "granular_confidence": {
              "parse_confidence": 0.92,
              "extract_confidence": null
            },
            "image_url": null
          },
          {
            "type": "Section Header",
            "content": "Transactions",
            "bbox": {
              "left": 0.10,
              "top": 0.15,
              "width": 0.20,
              "height": 0.02,
              "page": 1,
              "original_page": 1
            },
            "confidence": "high",
            "granular_confidence": {
              "parse_confidence": 0.88,
              "extract_confidence": null
            },
            "image_url": null
          },
          {
            "type": "Table",
            "content": "| Date | Description | Amount |\n|------|-------------|--------|\n| 01/05 | Direct Deposit | $2,500.00 |\n| 01/10 | Grocery Store | -$85.32 |",
            "bbox": {
              "left": 0.10,
              "top": 0.20,
              "width": 0.80,
              "height": 0.25,
              "page": 1,
              "original_page": 1
            },
            "confidence": "high",
            "granular_confidence": {
              "parse_confidence": 0.78,
              "extract_confidence": null
            },
            "image_url": null
          }
        ],
        "enriched": null,
        "enrichment_success": false
      }
    ]
  },
  "usage": {
    "num_pages": 1,
    "credits": 2.0
  },
  "pdf_url": "https://storage.reducto.ai/pdfs/7600c8c5.pdf?...",
  "studio_link": "https://studio.reducto.ai/job/7600c8c5-a52f-49d2-8a7d-d75d1b51e141"
}

Related

Parse Overview: Quick start and basic usage.
Best Practices: Optimize for your document types.
Chunking Methods: Control how content is split.
Table Output Formats: HTML, Markdown, JSON, CSV options.
