One month ago, we opened a playground for Information Extract. The waitlist filled fast, and developers tested it with real-world documents: insurance packets, scanned forms, and multi-page tables.

This early traffic helped us refine schema alignment, layout handling, and batch performance.

Today, Information Extract becomes a production-ready REST API—turning unstructured PDFs into structured, schema-aware JSON. No training. No templates. No prompt tuning.

What makes Information Extract different

Zero-training extraction: Works on any document—no templates, no fine-tuning required
Schema-aligned output: Returns structured JSON that matches your schema—types, nesting, and required fields included
Layout understanding: Accurately handles tables, checkboxes, multi-page layouts, and rotated content
Flat per-page pricing: Predictable billing, regardless of token count or content complexity
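To make "schema-aligned" concrete, here is a minimal sketch of how you might check on the client side that a returned JSON object agrees with your schema on types, nesting, and required fields. The helper `find_schema_errors` and the sample schema are hypothetical and not part of the Information Extract API; the check covers only a small subset of JSON Schema keywords.

```python
from typing import Any

# Map JSON Schema type names to the Python types that json.loads produces.
_JSON_TYPES = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
    "object": dict,
    "array": list,
}


def find_schema_errors(value: Any, schema: dict, path: str = "$") -> list[str]:
    """Return human-readable mismatches between `value` and `schema`.

    Hypothetical helper: handles only `type`, `properties`, `required`,
    and `items`. A full validator covers many more keywords.
    """
    errors: list[str] = []
    expected = schema.get("type")
    if expected in _JSON_TYPES:
        # bool is a subclass of int in Python, so reject it for numeric types.
        bool_as_number = isinstance(value, bool) and expected in ("number", "integer")
        if not isinstance(value, _JSON_TYPES[expected]) or bool_as_number:
            return [f"{path}: expected {expected}, got {type(value).__name__}"]
    if isinstance(value, dict):
        for name in schema.get("required", []):
            if name not in value:
                errors.append(f"{path}.{name}: required field is missing")
        for name, sub_schema in schema.get("properties", {}).items():
            if name in value:
                errors.extend(find_schema_errors(value[name], sub_schema, f"{path}.{name}"))
    elif isinstance(value, list) and "items" in schema:
        for index, item in enumerate(value):
            errors.extend(find_schema_errors(item, schema["items"], f"{path}[{index}]"))
    return errors


# Sample schema with one nested array, for illustration only.
sample_schema = {
    "type": "object",
    "required": ["bank_name"],
    "properties": {
        "bank_name": {"type": "string"},
        "accounts": {
            "type": "array",
            "items": {
                "type": "object",
                "required": ["number"],
                "properties": {"number": {"type": "string"}},
            },
        },
    },
}

for problem in find_schema_errors({"accounts": [{}]}, sample_schema):
    print(problem)
```

A check like this is optional; it is a cheap guard to run before extracted values flow into downstream systems.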

From document to JSON—in one call

Information Extract turns layout-heavy PDFs into clean, typed JSON—aligned to your schema, without templates or scripting.

In this example, a multi-page rent roll PDF is converted into structured JSON.

Each row is mapped to typed fields like rent, deposits, concessions, and parking fees—with no templates or custom scripts.
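As a rough illustration of the rent roll case, the schema for such a document might look like the sketch below. The field names (`units`, `unit`, `rent`, `deposit`, `concessions`, `parking_fee`) are assumptions chosen for this example, not names taken from the original demo, and this post does not confirm which JSON Schema keywords the API accepts.

```python
# Hypothetical rent roll schema: one array entry per table row.
# Rename the fields to match your own documents.
RENT_ROLL_SCHEMA = {
    "type": "object",
    "properties": {
        "units": {
            "type": "array",
            "description": "One entry per row of the rent roll",
            "items": {
                "type": "object",
                "properties": {
                    "unit": {"type": "string", "description": "Unit identifier"},
                    "rent": {"type": "number", "description": "Monthly rent"},
                    "deposit": {"type": "number", "description": "Security deposit"},
                    "concessions": {"type": "number", "description": "Rent concessions"},
                    "parking_fee": {"type": "number", "description": "Monthly parking fee"},
                },
                "required": ["unit", "rent"],
            },
        }
    },
    "required": ["units"],
}

# Wrap the schema in the same response_format shape used in this post's request.
response_format = {
    "type": "json_schema",
    "json_schema": {"name": "rent_roll", "schema": RENT_ROLL_SCHEMA},
}

# List the typed fields that each extracted row would carry.
row_fields = RENT_ROLL_SCHEMA["properties"]["units"]["items"]["properties"]
for name, spec in row_fields.items():
    print(f"{name}: {spec['type']}")
```

Declaring the amounts as `number` rather than `string` is what lets downstream code sum or compare them without parsing.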

How to extract with schema in one call

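# Information Extract request with an inline JSON schema.
# `client` (an OpenAI-style client) and `base64_data` (the base64-encoded page image) are assumed to be defined earlier.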
extraction_response = client.chat.completions.create(
    model="information-extract",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/png;base64,{base64_data}"}
                }
            ]
        }
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "document_schema",
            "schema": {
                "type": "object",
                "properties": {
                    "bank_name": {
                        "type": "string",
                        "description": "The name of the bank in the bank statement"
                    }
                }
            }
        }
    }
)
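The request above relies on `client` and `base64_data` being set up elsewhere. The sketch below shows one way to prepare the payload and read the result using only the standard library. The helpers `to_data_url`, `build_request`, and `parse_extraction` are hypothetical names introduced here. It is also an assumption, based on the chat-completions call shape, that the extracted JSON comes back as a string in the first choice's message content.

```python
import base64
import json


def to_data_url(file_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw file bytes as a base64 data URL like the one in the request."""
    encoded = base64.b64encode(file_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_request(data_url: str, schema: dict, schema_name: str = "document_schema") -> dict:
    """Assemble keyword arguments in the same shape as the request in this post."""
    return {
        "model": "information-extract",
        "messages": [
            {
                "role": "user",
                "content": [{"type": "image_url", "image_url": {"url": data_url}}],
            }
        ],
        "response_format": {
            "type": "json_schema",
            "json_schema": {"name": schema_name, "schema": schema},
        },
    }


def parse_extraction(message_content: str) -> dict:
    """Parse the JSON string assumed to be returned as the message content."""
    return json.loads(message_content)


bank_schema = {
    "type": "object",
    "properties": {
        "bank_name": {"type": "string", "description": "The name of the bank in the bank statement"}
    },
}

# Placeholder bytes stand in for a real file read with open(path, "rb").read().
request = build_request(to_data_url(b"hi"), bank_schema)
print(request["messages"][0]["content"][0]["image_url"]["url"])
```

With a configured client, the dictionary could be passed as `client.chat.completions.create(**request)`, and the result read with `parse_extraction(extraction_response.choices[0].message.content)`. Both steps rest on the assumptions stated above.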

Available now

Upstage Console: Create a workspace and get $10 in free credits

Let your apps understand documents at scale. Start building with the Information Extract API.


Extract structured data from any document—Information Extract API is live

Minjee Kang • Products • May 29, 2025

