Extract
POST /extract
import requests

url = "https://platform.reducto.ai/extract"

payload = {
    "input": "<string>",
    "parsing": {
        "enhance": {
            "agentic": [],
            "summarize_figures": True
        },
        "retrieval": {
            "chunking": { "chunk_mode": "disabled" },
            "embedding_optimized": False,
            "filter_blocks": []
        },
        "formatting": {
            "add_page_markers": False,
            "include": [],
            "merge_tables": False,
            "table_output_format": "dynamic"
        },
        "spreadsheet": {
            "clustering": "accurate",
            "exclude": [],
            "include": [],
            "split_large_tables": {
                "enabled": True,
                "size": 50
            }
        },
        "settings": {
            "embed_pdf_metadata": False,
            "force_url_result": False,
            "ocr_system": "standard",
            "persist_results": False,
            "return_images": [],
            "return_ocr_data": False
        }
    },
    "instructions": {
        "schema": {},
        "system_prompt": "Be precise and thorough."
    },
    "settings": {
        "include_images": False,
        "optimize_for_latency": False,
        "array_extract": False,
        "citations": {
            "enabled": False,
            "numerical_confidence": True
        }
    }
}
headers = {
    "Authorization": "Bearer <token>",
    "Content-Type": "application/json"
}

response = requests.post(url, json=payload, headers=headers)

print(response.json())
Example response:
{
  "usage": {
    "num_pages": 123,
    "num_fields": 123,
    "credits": 123
  },
  "result": "<unknown>",
  "job_id": "<string>",
  "studio_link": "<string>"
}

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json
  • SyncExtractConfig
  • AsyncExtractConfig
input
required

For parse/split/extract pipelines, the URL of the document to be processed. You can provide one of the following:
  1. A publicly available URL
  2. A presigned S3 URL
  3. A reducto:// prefixed URL obtained from the /upload endpoint after directly uploading a document
  4. A jobid:// prefixed URL obtained from a previous /parse invocation

For edit pipelines, this should be a string containing the edit instructions
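The four accepted input forms above can be distinguished by their prefixes. As a rough sketch (the helper name and return labels here are ours, not part of the API):

```python
def classify_input(value: str) -> str:
    """Illustrative classifier for the input forms accepted by /extract."""
    if value.startswith("reducto://"):
        return "upload"           # obtained from the /upload endpoint
    if value.startswith("jobid://"):
        return "prior job"        # obtained from a previous /parse invocation
    if value.startswith(("http://", "https://")):
        return "url"              # a public URL or a presigned S3 URL
    return "edit instructions"    # for edit pipelines, a plain string
```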
parsing
ParseOptions · object

The configuration options for parsing the document. If you are passing in a jobid:// URL for the file, then this configuration will be ignored.

instructions
Instructions · object

The instructions to use for the extraction.
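The schema in the request sample above is empty. As an assumed example (the field names are illustrative, not prescribed by the API), a JSON-Schema-style object can describe the fields to extract:

```python
# Hypothetical extraction instructions: "invoice_number" and "total_amount"
# are example field names, not part of the Reducto API itself.
instructions = {
    "schema": {
        "type": "object",
        "properties": {
            "invoice_number": {"type": "string"},
            "total_amount": {"type": "number"},
        },
        "required": ["invoice_number"],
    },
    "system_prompt": "Be precise and thorough.",
}
```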

settings
ExtractSettings · object

The settings to use for the extraction.

Response

Successful Response

  • V3ExtractResponse
  • AsyncExtractResponse
usage
ExtractUsage · object
required
result
required

The extracted response in your provided schema. This is a list of dictionaries. If disable_chunking is True (default), then it will be a list of length one.
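Because the result is a list of length one by default, a caller can unpack it directly. A minimal sketch, assuming a response body shaped like the example above (the field values are made up for illustration):

```python
# Assumed response body with the default single-chunk result list.
response_body = {
    "usage": {"num_pages": 1, "num_fields": 2, "credits": 1},
    "result": [{"invoice_number": "INV-001", "total_amount": 42.5}],
    "job_id": "job_123",
}

# Unpacking succeeds only when the list has exactly one element,
# which is the default behaviour described above.
[record] = response_body["result"]
```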

job_id
string | null

studio_link
string | null

The link to the studio pipeline for the document.