space ocr

Guides & articles

Practical guides to turning receipts, invoices, and any document into structured, verifiable data — every value traceable to its source.

Guide
Document OCR with an Audit Trail: Verify Every Extracted Value
Why document AI needs an audit trail, and how per-field bounding boxes make every extracted value checkable against the source image.
7 min read
Guide
How to Convert Scanned Documents Into CSV (Step by Step)
A practical tutorial: capture a document image, define your columns once, upload to auto-fill rows, and export a clean UTF-8 CSV — line items and all.
7 min read
Guide
API for Extracting Data From Invoices: A Developer Guide
A developer guide to the space-ocr API for extracting data from invoices — curl and Python examples, invoice templates, custom field schemas, and bounding-box provenance.
8 min read
Guide
Best OCR Software for Receipts and Invoices (2026 Guide)
What to look for in receipt and invoice OCR — verifiable accuracy, line items, export, API, and an audit trail — and how space-ocr delivers each, proven with a live demo.
8 min read
Guide
Receipt OCR to CSV: Convert Receipts and Import Into freee, Money Forward & Yayoi
Replace the drudgery of typing receipts by hand with snapping a photo and exporting a CSV. Line items expand to one row per product, the UTF-8 BOM prevents garbled text, and every value can be checked against the original, so the data drops straight into freee / Money Forward / Yayoi.
8 min read
Guide
Invoice & Delivery Note OCR API: Extract Invoice Data to CSV (Developer Guide)
Turn invoices and delivery notes into structured data with a single API call. A developer-focused walkthrough of POST /ocr/fields — curl/Python examples, templates and custom fields, exploding line items into rows, per-value bbox verification, and CSV export.
9 min read
verification
How to Validate OCR Output Using Bounding Boxes
Use per-field bounding boxes and match ratios to spot-check extracted data fast — without re-running OCR.
5 min read
receipts
Extract Line Items From Invoices Automatically
Declare the line-item table as an array field and get one structured, verifiable row per item — then expand it straight into CSV.
7 min read
convert
Convert a Scanned PDF to Excel: Page Images to CSV
A scanned page is an image, not a spreadsheet — here's how to read it into structured rows and export a CSV that opens directly in Excel.
7 min read
comparison
Amazon Textract Alternative: a Verifiable OCR API (2026)
When a Textract alternative makes sense — verifiable per-value coordinates, Japanese/Korean/Chinese support, a queryable sheet, and flat per-image pricing with no AWS account — and where Textract still shines.
8 min read
developer
OCR API with Bounding Boxes: Verify Every Value (2026)
Most OCR APIs return bounding boxes — but coordinate systems differ and a box only says where, not how sure. How to get source coordinates plus a per-value match ratio you can verify.
8 min read
comparison
Google Vision vs Space OCR: Raw Text vs Structured Fields (2026)
Google Cloud Vision is excellent raw OCR — text plus pixel boundingPoly and a recognition confidence — but structured key-value fields are a separate product (Document AI). Here's the honest dividing line, with a live, checkable demo.
8 min read
comparison
Tesseract vs Google Vision vs Space OCR for Receipts (2026)
Receipt OCR three ways: self-hosted Tesseract, cloud Google Vision, and verification-first space-ocr. Where each wins — pixel boxes vs. structured fields, recognition confidence vs. character-coverage match ratio — proven with a live demo.
8 min read
developer
OCR API with Source Coordinates: Verify Every Value (2026)
An OCR API where every value carries source coordinates and a match_ratio you can gate on — flag low-confidence values for review, keep an audit trail, and query stored results server-side without re-running OCR.
8 min read
developer
Extract Invoice Line Items to JSON: the API Contract (2026)
How to ask the OCR API for invoice line items as a structured array, the exact JSON you get back (each cell with its own bbox, vertices and match_ratio), and how to parse it in code and unfold it to CSV.
8 min read
convert
Convert Scanned PDF to Excel (Japanese, No Garbled Text) — Get Tables Into CSV
A scanned PDF is a picture of a table. Skip the retyping and the copy-paste: read each page into structured fields and write it out to a CSV (UTF-8 BOM) that opens straight in Excel. Here's the real-world workflow.
8 min read