October 16, 2025

New: Vision Language Models for Document Processing

Tensorlake now uses Vision Language Models (VLMs) across multiple features including page classification, figure/table summarization, and structured extraction, enabling faster and more intelligent document understanding.

Key Highlights

  • VLM-powered page classification for efficient large document processing
  • Direct visual understanding for figures, tables, and structured data extraction
  • Skip OCR entirely with VLM-based extraction for more accurate results on harder-to-parse documents

What's New

We've expanded our use of Vision Language Models (VLMs) across multiple DocumentAI features, enabling faster and more accurate processing of documents with hundreds of pages:

  • Page Classification: Identify relevant pages in large documents
  • Figure and Table Summarization: Extract insights from visual elements
  • Structured Extraction (with `skip_ocr`): Direct visual understanding for more accurate extraction on harder-to-parse documents (e.g. scanned documents, engineering diagrams, or documents with confusing reading order)
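As a rough illustration of the `skip_ocr` extraction path, the sketch below defines a JSON-Schema-style structure for the title block of a scanned engineering drawing. The schema field names and the commented call shape are assumptions for illustration, not confirmed SDK signatures; check the Tensorlake docs for the exact parameter names.

```python
# Illustrative only: the schema fields and the commented-out call shape are
# assumptions about how a skip_ocr extraction might be configured.
title_block_schema = {
    "type": "object",
    "properties": {
        "part_number": {"type": "string"},
        "revision": {"type": "string"},
        "drawn_by": {"type": "string"},
    },
    "required": ["part_number"],
}

# Hypothetical call shape, mirroring the parse example later in this post:
# doc_ai.parse_and_wait(
#     file=drawing_url,
#     structured_extraction_options=[...],  # schema above, with skip_ocr enabled
# )
print(sorted(title_block_schema["properties"]))
```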

This changelog focuses on the enhanced page classification capabilities as a demonstration. With VLM support, you can quickly process large documents by identifying and extracting from only the relevant pages.

Key Improvements

Scale & Performance

  • Handle Large Documents: Classify documents with hundreds of pages without performance degradation
  • VLM-Powered Classification: Replaced OCR with Vision Language Models for faster, more accurate classification
  • Selective Processing: Only parse pages that matter, reducing processing time and costs

Recommended Workflow

  1. Classify First: Use the classify endpoint to identify relevant pages based on your criteria
  2. Parse Selectively: Set page_range to only process the classified relevant pages
  3. Extract Efficiently: Apply structured extraction only to pages containing the information you need
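Step 2 reduces to joining the classified page numbers into the comma-separated string that `page_range` expects. A minimal sketch (the sorting and de-duplication here are defensive additions, not something the SDK requires):

```python
def to_page_range(page_numbers):
    """Join classified page numbers into the comma-separated string
    accepted by page_range, normalizing order and duplicates."""
    return ",".join(str(n) for n in sorted(set(page_numbers)))

print(to_page_range([12, 7, 30, 7]))  # 7,12,30
```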

Use Case Example: SEC Filings Analysis

This approach is particularly powerful for extracting specific information from lengthy documents like SEC filings. For example, when analyzing cryptocurrency holdings across multiple companies' 10-K and 10-Q reports:

  • Challenge: Each filing can be 100-200+ pages, but crypto-related information might only appear on 10-20 pages
  • Solution: First classify pages containing "digital assets holdings", then extract structured data only from those pages
  • Result: 80-90% reduction in processing time and more focused, accurate extractions
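The claimed reduction is simple arithmetic: if only the classified pages are parsed, the fraction of work avoided is 1 − relevant/total. Using illustrative numbers drawn from the ranges above (a 150-page filing with 15 relevant pages):

```python
total_pages = 150      # illustrative filing length (within the 100-200+ page range)
relevant_pages = 15    # illustrative count of crypto-related pages
reduction = 1 - relevant_pages / total_pages
print(f"{reduction:.0%} of pages skipped")  # 90% of pages skipped
```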

Code Example

```python
from tensorlake.documentai import DocumentAI, PageClassConfig

doc_ai = DocumentAI()

# Step 1: Classify pages
page_classifications = [
    PageClassConfig(
        name="digital_assets_holdings",
        description="Pages showing cryptocurrency holdings on balance sheet..."
    )
]

parse_id = doc_ai.classify(
    file_url=filing_url,
    page_classifications=page_classifications
)

result = doc_ai.wait_for_completion(parse_id=parse_id)

# Step 2: Parse only relevant pages
relevant_pages = result.page_classes[0].page_numbers
page_range = ",".join(str(i) for i in relevant_pages)

final_result = doc_ai.parse_and_wait(
    file=filing_url,
    page_range=page_range,
    structured_extraction_options=[...]
)
```

Benefits

  • Cost Efficiency: Process only what you need
  • Speed: Reduce processing time by focusing on relevant content
  • Accuracy: VLM classification provides better understanding of page content
  • Scalability: Handle large document sets without compromising performance

Try It Out

Check out our example notebook demonstrating how to extract cryptocurrency metrics from SEC filings using the new classification approach.

Getting Started

Update to the latest version of Tensorlake:

```shell
pip install --upgrade tensorlake
```

Then start classifying, summarizing, and extracting with improved efficiency!
