Retrieval-augmented generation pipelines built on fresh, indexed data.
Index a codebase for RAG and AI coding agents with CocoIndex V1 and Tree-sitter: language-aware chunking, embedding, and a live vector index in async Python.
Define query handlers in CocoIndex and trace search results back to source data in CocoInsight to close the loop on indexing strategy.
Build a unified visual document index from multiple file formats (including PDFs, images, and slides) using CocoIndex and ColPali. No OCR needed.
Index academic research papers by extracting metadata (e.g., title, authors, abstract) with LLMs and CocoIndex, for use in AI agents and workflows.
Build a semantic text index with CocoIndex and text embeddings, then query it with natural language: a beginner's guide to embeddings and vector search.
Step-by-step tutorial to build text embeddings from Google Drive docs with CocoIndex and store them in Postgres for semantic search and RAG.
First release of the CocoIndex Changelog: LLM support, codebase indexing, custom functions, and assorted core and performance improvements.
Index a codebase for RAG in real time with CocoIndex and Tree-sitter: chunking, embedding, semantic search, and building a vector index for efficient retrieval.
CocoIndex is now open source: the first engine built specifically for data indexing that combines custom transformation logic with incremental processing.
What customizable data indexing pipelines are and why custom transformation logic matters, with practical CocoIndex examples.
CocoIndex is a data indexing platform for AI: ingestion, chunking, embedding, and pipeline management for RAG, semantic search, and knowledge graphs.