
Exstruct

LLM / RAG Pipeline · CLI + MCP · Open Source
Best for: Developers building LLM and RAG pipelines that need to ingest Excel data — Exstruct parses tables, shapes, and charts into structured JSON that Claude can reason over effectively
Not ideal for: Writing back to Excel — Exstruct is read/extract only; use Excel MCP Server or SV Excel Agent when you need to edit spreadsheets in place

Data Extraction Pipeline

Excel elements → Structured JSON
📋 Tables
{"tables": [{"name": "Sales", "headers": [...], "rows": [...]}]}
🔷 Shapes
{"shapes": [{"type": "TextBox", "text": "...", "position": {...}}]}
📊 Charts
{"charts": [{"type": "BarChart", "series": [...], "title": "..."}]}
🖼️ Images
{"images": [{"sheet": "Sheet1", "alt_text": "...", "index": 0}]}

All Excel elements are extracted into a single structured JSON document — pass it directly to Claude as context for analysis, summarization, or RAG retrieval.
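As a rough illustration of passing the extracted document to a model, the sketch below flattens it into compact plain text for a prompt. It assumes the field names shown in the snippets above (`tables`, `shapes`, `charts`, `images`) and that each row is a list of cell values; the elided parts of the real schema may differ. The `summarize_workbook` helper and the sample data are hypothetical.

```python
def summarize_workbook(doc: dict) -> str:
    """Render an extracted workbook document as compact text for an LLM prompt.

    Field names follow the JSON snippets on this page; everything else
    (row layout, missing-key defaults) is an assumption.
    """
    lines = []
    for table in doc.get("tables", []):
        headers = " | ".join(str(h) for h in table.get("headers", []))
        lines.append(f"Table {table.get('name', '?')}: {headers}")
        for row in table.get("rows", []):
            lines.append("  " + " | ".join(str(cell) for cell in row))
    for shape in doc.get("shapes", []):
        lines.append(f"Shape ({shape.get('type', '?')}): {shape.get('text', '')}")
    for chart in doc.get("charts", []):
        lines.append(f"Chart ({chart.get('type', '?')}): {chart.get('title', '')}")
    for image in doc.get("images", []):
        lines.append(f"Image on {image.get('sheet', '?')}: {image.get('alt_text', '')}")
    return "\n".join(lines)


# Hypothetical sample document, shaped like the snippets above.
sample = {
    "tables": [{"name": "Sales", "headers": ["Region", "Revenue"],
                "rows": [["EMEA", 1200], ["APAC", 950]]}],
    "shapes": [{"type": "TextBox", "text": "Draft figures"}],
    "charts": [{"type": "BarChart", "title": "Revenue by region", "series": []}],
    "images": [{"sheet": "Sheet1", "alt_text": "Company logo", "index": 0}],
}

print(summarize_workbook(sample))
```

Plain text like this is usually cheaper in tokens than raw JSON, though passing the JSON unchanged works too when structure matters more than size.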

Key Capabilities

📋
Full Excel Element Parsing
Tables, shapes, charts, images, merged cells, named ranges — all parsed into structured JSON, not just raw cell values
🔍
LLM-Optimized Output
JSON structure is designed to be directly passed to Claude or other LLMs — minimal noise, maximum information density for reasoning tasks
🗄️
RAG Pipeline Ready
Use Exstruct as a pre-processor in RAG pipelines — extract spreadsheet content once, chunk it, embed it, and make Excel data searchable
🖥️
CLI + MCP Interfaces
Run it as a standalone CLI command in scripts, or as an MCP server in Claude Desktop, where a chat prompt can trigger extraction from a spreadsheet
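The "extract once, chunk it, embed it" flow can be sketched as follows. This is a minimal example covering the chunking step only, assuming the table layout shown earlier on this page (`name`, `headers`, `rows` as lists of cell values). The `table_chunks` helper and its `rows_per_chunk` parameter are hypothetical, and the embedding step is left to whichever embedding model the pipeline uses.

```python
def table_chunks(doc: dict, rows_per_chunk: int = 2) -> list[dict]:
    """Split each extracted table into small text chunks ready for embedding.

    Every chunk repeats the column headers as keys so that it stays
    understandable when retrieved on its own.
    """
    chunks = []
    for table in doc.get("tables", []):
        headers = table.get("headers", [])
        rows = table.get("rows", [])
        for start in range(0, len(rows), rows_per_chunk):
            batch = rows[start:start + rows_per_chunk]
            text = "\n".join(
                ", ".join(f"{h}: {v}" for h, v in zip(headers, row))
                for row in batch
            )
            # Keep the table name and row offset as metadata for citations.
            chunks.append({"table": table.get("name"), "first_row": start, "text": text})
    return chunks


# Hypothetical extracted document with a single three-row table.
extracted = {
    "tables": [{
        "name": "Sales",
        "headers": ["Region", "Revenue"],
        "rows": [["EMEA", 1200], ["APAC", 950], ["AMER", 1780]],
    }]
}

for chunk in table_chunks(extracted):
    print(chunk)
```

Repeating the headers in every chunk costs some tokens but avoids retrieving bare numbers with no column context.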

Signals

GitHub Stars
133
Primary Use
Extract → LLM/RAG
Read / Write
Read-only (by design)
License
Open Source

Quality Assessment

Extraction Completeness
4.5
LLM-Readability of Output
4.4
Setup Ease
4.1
Write-back Support
N/A

Write-back is N/A by design — Exstruct is a read-only extraction tool. If you need write-back, combine with Excel MCP Server. Extraction completeness is strong for a 133-star project.

Install & Use

# Install via pip
pip install exstruct

# Extract all elements from a workbook to JSON
exstruct extract report.xlsx --output report.json

# Extract specific sheets only
exstruct extract report.xlsx --sheets "Q3 Data,Summary" --output q3.json

# Use as MCP server in Claude Desktop
# Then ask: "Extract the data from budget.xlsx and summarize by department"
# Claude calls exstruct to get JSON, then analyzes it directly in context

# In Claude Code: use Bash to extract, then pipe the JSON to Claude in print mode
exstruct extract data.xlsx | claude -p "Summarize this spreadsheet data"
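For scripted pipelines, the CLI calls above could be wrapped in a small Python helper. The sketch below assumes the `extract` subcommand and the `--output` and `--sheets` flags behave exactly as shown on this page; they have not been checked against the tool's own documentation. The function names are hypothetical.

```python
import json
import subprocess
from pathlib import Path


def build_extract_command(workbook, output, sheets=None) -> list[str]:
    """Build the argument list for the extract command shown on this page."""
    cmd = ["exstruct", "extract", str(workbook), "--output", str(output)]
    if sheets:
        # The page passes sheet names as one comma-separated value.
        cmd += ["--sheets", ",".join(sheets)]
    return cmd


def extract_to_dict(workbook, output, sheets=None) -> dict:
    """Run the extraction (requires exstruct on PATH) and load the JSON it wrote."""
    subprocess.run(build_extract_command(workbook, output, sheets), check=True)
    return json.loads(Path(output).read_text(encoding="utf-8"))


# Building the command does not require exstruct to be installed.
print(build_extract_command("report.xlsx", "q3.json", ["Q3 Data", "Summary"]))
```

Passing the arguments as a list rather than a shell string avoids quoting problems with sheet names that contain spaces.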

Alternatives & Tradeoffs

Excel MCP Server
Bidirectional — reads and writes. 3.6k stars, far more tested. Exstruct wins for RAG pipelines where structured extraction is more important than write-back
SV Excel Agent
Autonomous agent that also reads and writes. Exstruct wins when all you need is clean JSON extraction for downstream processing
OfficeCLI
Reads XLSX plus DOCX and PPTX — broader format coverage. Exstruct wins for Excel-specific deep extraction including charts and shapes

Community Outputs

LLM and RAG pipelines built with Exstruct.
RAG Pipeline
Exstruct · LLM · 4.5/5
Data Analysis
Claude Code · JSON · 4.4/5
Chart Summary
Claude · Charts · 4.2/5
Table Extract
Pipeline · Bulk · 4.1/5