September 17, 2025

Fixed token limit issues with large CSV/Excel tables

Fixed token limit issues with large, dense CSV and Excel tables through automatic splitting and intelligent result merging.

Key Highlights

  • Handles 500+ row spreadsheets and extensive financial reports that previously failed
  • Automatic table splitting preserves relationships and maintains extraction accuracy
  • Transparent processing - no configuration changes or manual preprocessing required

What's new

Large, dense CSV and Excel tables that previously failed due to token limits now process correctly. The system automatically splits oversized tables into manageable chunks during processing, then merges the structured extraction results back together seamlessly.
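
Why these tables hit limits in the first place: as a rough back-of-the-envelope sketch (the ~4 characters-per-token heuristic and the example numbers below are assumptions for illustration, not Tensorlake internals), the raw text of a dense spreadsheet alone can exceed a single model call's context window.

```python
import csv

def estimate_table_tokens(path: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate for a CSV file: total characters / chars-per-token."""
    total_chars = 0
    with open(path, newline="") as f:
        for row in csv.reader(f):
            # Count cell text plus one character per delimiter/newline in each row.
            total_chars += sum(len(cell) for cell in row) + len(row)
    return int(total_chars / chars_per_token)

# A dense 500-row x 40-column expense report with ~20 characters per cell
# works out to roughly 500 * 40 * 20 / 4 = 100,000 tokens before any prompt
# or extraction schema is added.
```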

Why it matters

  • Dense spreadsheets with hundreds of rows/columns no longer fail
  • Financial reports with extensive data tables process reliably
  • Data exports from other systems maintain full extraction accuracy
  • No manual preprocessing required - handled automatically

Highlights

  • Automatic table splitting when token limits are approached
  • Intelligent result merging preserves table relationships
  • Maintains extraction accuracy across table chunks
  • Transparent to the user - no configuration changes needed

How it works

  1. System detects when a table would exceed token limits
  2. Intelligently splits table while preserving headers and context
  3. Processes each chunk with full context awareness
  4. Merges extraction results back into complete table structure
  5. Returns unified results as if the entire table had been processed at once (see the sketch after this list)
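
A minimal sketch of steps 2 through 4, assuming a hypothetical extract_rows callback that runs structured extraction on a single chunk (the chunk size, names, and merge logic are illustrative, not the production implementation):

```python
from typing import Callable

def process_large_table(
    header: list[str],
    rows: list[list[str]],
    extract_rows: Callable[[list[str], list[list[str]]], list[dict]],
    max_rows_per_chunk: int = 100,
) -> list[dict]:
    """Split a table into chunks, run extraction per chunk with the header
    attached, and merge the per-chunk results back in the original row order."""
    results: list[dict] = []
    for start in range(0, len(rows), max_rows_per_chunk):
        chunk = rows[start:start + max_rows_per_chunk]
        # The header travels with every chunk so column context is preserved.
        results.extend(extract_rows(header, chunk))
    return results
```

The real pipeline additionally processes each chunk with broader document context and reconciles relationships across chunk boundaries, as described in the steps above; the sketch only shows that headers accompany every chunk and that merged results keep the original row order.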

How to use

This works automatically - no changes needed to your existing code.
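
For example, an existing parse call over a large spreadsheet keeps working unchanged. The snippet below is a hedged sketch based on the Tensorlake Python SDK's DocumentAI client; treat the exact method names and parameters as illustrative, since they may differ across SDK versions:

```python
from tensorlake.documentai import DocumentAI

# Client and method names follow the Tensorlake Python SDK's DocumentAI
# interface; verify them against your installed SDK version.
doc_ai = DocumentAI(api_key="your-api-key")

# Upload a 500+ row spreadsheet and parse it exactly as before; table
# splitting and result merging happen server-side, so no new options are needed.
file_id = doc_ai.upload("q3_expense_report.xlsx")
result = doc_ai.parse(file_id)
```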

Scenarios that previously failed and now work

  • 500+ row expense reports
  • Multi-sheet Excel files with dense data tables
  • CSV exports from database systems
  • Financial statements with extensive line items

Status

✅ Live now. Large table processing works automatically with existing extraction configurations.
