LlamaIndex

Technology, Information and Internet

San Francisco, California 273,277 followers

Redefine document workflows with AI agents

About us

LlamaIndex empowers developers to build agents that extract insights and take action on complex enterprise documents. It combines industry-leading document parsing and extraction with a trusted framework for building intelligent agents that reason over documents, adapt to business logic, and scale to production. LlamaIndex is loved by developers and trusted by enterprises. Its open source framework is downloaded more than 4 million times every month, and more than 200 million documents have been processed on LlamaCloud.

Website: https://www.llamaindex.ai/
Industry: Technology, Information and Internet
Company size: 11-50 employees
Headquarters: San Francisco, California
Type: Public Company

Updates

  • LlamaIndex reposted this

    Building “RAG 2.0” is just running Claude Code over your filesystem 🤖🗂️ To make this work well, you need to solve three things 1️⃣ Virtualize your filesystem to prevent the agent from messing stuff up. AgentFS by Turso is a nice example of how you can give the agent access to a copy of all your files without messing up your raw data. 2️⃣ Parse unstructured documents like PDFs, pptx, and Word files into an LLM-ready format. Agentic OCR solutions like LlamaParse can help here. 3️⃣ Create an agentic loop with human-in-the-loop. If you want to control the agent implementation instead of using Claude Code out of the box, you can use LlamaIndex workflows to help orchestrate these long-running agent tasks. Shoutout Clelia Astra Bertelli, check it out! Blog: https://lnkd.in/gAdF2eta Repo: https://lnkd.in/guHeBcSh

  • Secure your coding agents with virtual filesystems and better document understanding. Building safe AI coding agents requires solving two critical challenges: filesystem access control and handling unstructured documents. We've created a solution using AgentFS, LlamaParse, and Claude. 🛡️ Virtual filesystem isolation: agents work with copies, not your real files, preventing accidental deletions while maintaining full functionality 📄 Enhanced document processing: LlamaParse converts PDFs, Word docs, and presentations into high-quality text that agents can actually understand ⚡ Workflow orchestration: LlamaIndex Workflows provide stepwise execution with human-in-the-loop controls and resumable sessions 🔧 Custom tool integration: replace built-in filesystem tools with secure MCP server alternatives that enforce safety boundaries This approach uses AgentFS as a SQLite-based virtual filesystem, our LlamaParse for state-of-the-art document extraction, and Claude for the coding interface - all orchestrated through LlamaIndex Agent Workflows. Read the full technical deep-dive with implementation details: https://lnkd.in/e4cMNN2Z Find the code on GitHub: https://lnkd.in/eaMBdfdv

  • LlamaIndex reposted this

    GPT-5.2 Thinking is really good at parsing charts 📊 I threw some charts into the raw ChatGPT UI after OpenAI hyped up GPT-5.2’s visual capabilities 👇 The native visual understanding capability of GPT-5.2 is not amazing - see the plotted graph for GPT-5.2. But both GPT-5.2 Thinking and Pro make up for that by spending a *ton* of reasoning tokens to break down the chart image and plot every point. The points plotted by GPT-5.2 Thinking and Pro are spot on (there may be small discrepancies, but they are really hard to tell by the human eye). If you look at the reasoning trace within the ChatGPT UI, you’ll find that GPT-5.2 will spend a lot of reasoning tokens on writing code to break down the image, analyzing each axis, and getting the line values. Check out the results in the image 🖼️ The cool finding here is that models can make up for poor “one-shot” understanding by just adding a ton of thinking tokens on top. ⚠️ Of course if you’re actually trying to parse a bunch of chart data efficiently this isn’t very practical and quite slow/expensive. If you’re looking for good/much cheaper chart understanding check out LlamaCloud!

  • LlamaSheets is our new way to handle complex, messy spreadsheets: many sheets disguised as one, multiple regions that provide different sets of information, and much more. Check out this example of a (generated, fake) company budget sheet. It actually has 4 sub-sheets, each containing multiple regions. LlamaSheets identifies each sub-region, creates summaries about what information they provide, and returns all of this information as a parquet file! Try it out and let us know what you think while it's in public beta! Get started here: https://lnkd.in/e_Y9refB

  • LlamaIndex reposted this

    We just launched a specialized agent for document splitting 📑✂️ This is like semantic chunking on steroids, across complex document packets. A lot of documents are stapled together collections of mini “sub-documents”. Each document packet can contain a bunch of subdocs of one or multiple types: - A packet of resumes - Expense reports containing reimbursement form + receipt images - Court filings: complaint/exhibits/orders in a single PDF Our agent lets you do this automatically and route it to downstream workflows: extraction with separate schemas per doc, document parsing with different settings, or higher-order chunking for knowledge base/RAG/agentic workflows. Come check it out 🔥: https://lnkd.in/g2NxADcW Docs: https://lnkd.in/gvs57Sah Signup: https://lnkd.in/g9Wpqn7w

  • Split documents into distinct sections automatically with our new LlamaSplit API 📄✂️ We're excited to introduce LlamaSplit (now in beta), which uses AI to automatically separate bundled documents into clear, targeted sections based on categories you define - no more manual splitting of document stacks. 📋 Analyze page content and classify pages into your defined categories with natural language descriptions 🎯 Get back precise segments with exact page ranges and confidence scores for each section ⚡ Handle real-world scenarios like resume stacks, mixed financial documents, court filings, and research paper collections 🔗 Combine with LlamaExtract to run targeted extraction on each segment or route to appropriate agent workflows Perfect for processing resume bundles, handling mixed document types, organizing court filings for legal teams, categorizing patient charts and more. Watch an example of segmenting an (AI-generated) bundle of resumes below 👇 Read the full announcement and get started with LlamaSplit: https://lnkd.in/e6DkkFdc Docs: https://lnkd.in/eU5rZFVS

  • Need to parse multiple PDFs efficiently? Learn how to use LlamaParse with async batch processing. 📁 Process entire folders of PDFs simultaneously instead of one-by-one ⚡ Use asyncio and semaphores to control how many files parse concurrently 🎯 Prevent API rate limit errors while maximizing throughput 📊 Get detailed progress tracking and summary statistics for batch operations This is perfect for processing large document collections, research papers, or any scenario where you need to parse dozens or hundreds of PDFs quickly and reliably. Full tutorial with working code examples: https://lnkd.in/eFSDB7R8

  • LlamaIndex reposted this

    “Intelligent Document Processing” 📑🧪 as an industry is gone. With our latest release this week, *anyone* can build and deploy a specialized document agent in seconds ⚡️🤖, and customize the steps via code. Let’s take a tour through our invoice processing and contract matching agent: given an invoice, extract vendor details and line items, and match it against the corresponding MSA with the vendor. 1️⃣ Put in your name and API key, and deploy the agent in 5 seconds 2️⃣ Upload some sample contracts and invoices, and watch the workflow run. 3️⃣ If you want to customize it, you can clone our source repository, modify the internals, and deploy the agent! It is both more accurate and more customizable than existing IDP solutions. With coding agents today, the ease of use is equivalent too. Click on the “agents” tab in LlamaCloud to check it out! https://lnkd.in/g9Wpqn7w Invoice processing repo: https://lnkd.in/gXTGv2mz LlamaAgents Docs: https://lnkd.in/gJdCNvn2

  • LlamaIndex reposted this

    We’re building out an applied research team to push SOTA on document understanding using LLMs/VLMs and other emerging techniques 📈📑 We’re on a mission to understand and orchestrate the most complex document types, from PDFs to Excel. You’re responsible for research, evals, and productization. The work you do will impact thousands to millions of developers across large enterprise to digital-native startups in unlocking context from any unstructured data. Simon and I have a deep appreciation for high-quality research from top conferences (NeurIPS, CVPR, ACL, etc.) to 0-1 work. We have a lot of ideas and GPUs but need additional resources to help us out! If this sounds fun come join us: https://lnkd.in/g_gxQTvv
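
The async batch-parsing update above (asyncio plus a semaphore to cap how many files parse at once) can be sketched roughly as follows. This is a minimal illustration, not the code from the linked tutorial: `parse_pdf` is a hypothetical stand-in for a real parser call, and `MAX_CONCURRENT`, `parse_one`, and `parse_folder` are names invented for this sketch.

```python
import asyncio
from pathlib import Path

MAX_CONCURRENT = 3  # assumed cap; tune it to your API rate limit


async def parse_pdf(path: Path) -> str:
    """Hypothetical stand-in for a real parser request (not a LlamaParse API)."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"parsed:{path.name}"


async def parse_one(path: Path, semaphore: asyncio.Semaphore):
    # The semaphore lets at most `max_concurrent` parses run at the same time.
    async with semaphore:
        try:
            return path, await parse_pdf(path), None
        except Exception as exc:  # record the failure instead of aborting the batch
            return path, None, exc


async def parse_folder(paths, max_concurrent: int = MAX_CONCURRENT):
    semaphore = asyncio.Semaphore(max_concurrent)
    # Schedule every file at once; the semaphore throttles actual execution.
    results = await asyncio.gather(*(parse_one(p, semaphore) for p in paths))
    ok = {p.name: text for p, text, err in results if err is None}
    failed = {p.name: str(err) for p, _, err in results if err is not None}
    return ok, failed


if __name__ == "__main__":
    # "docs" is an assumed folder name; an empty or missing folder yields zero files.
    pdfs = sorted(Path("docs").glob("*.pdf"))
    ok, failed = asyncio.run(parse_folder(pdfs))
    print(f"parsed {len(ok)} files, {len(failed)} failed")
```

Collecting failures per file, rather than letting one exception cancel the batch, is a design choice made here so that summary statistics can be reported at the end; a real pipeline might also add retries or progress reporting.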
