CocoIndex updates: in-process API, CLI improvements, EmbedText support, codebase indexing enhancements, and more.
CocoIndex updates: Amazon S3 as a data source, updates on query handling, standalone, and more.
CocoIndex updates: Knowledge Graphs, Qdrant, Supabase, KTable/LTable, and more LLM providers.
CocoIndex updates: Incremental processing with live update mode, evaluation utilities, support for date/time types, Google Drive, and assorted core/performance improvements.
CocoIndex continuously watches source changes and keeps derived data in sync, with low latency and minimal performance overhead.
CocoIndex keeps your index up to date with source changes, efficiently and with low latency, through incremental processing.
Extract structured data from patient intake forms in PDF or Word format using an LLM with CocoIndex.
A tutorial on creating text embeddings from documents on Google Drive and saving them in a vector store for semantic search and RAG, using CocoIndex.
First release of CocoIndex Changelog: LLM support, codebase indexing, custom functions, and assorted core/performance improvements.
Index a codebase for RAG in real time with CocoIndex and Tree-sitter: chunking, embedding, semantic search, and building a vector index for efficient retrieval.
Learn to use CocoIndex to extract structured data from PDF/Markdown with local LLMs served by Ollama, all running on premises without sending data to external APIs.
CocoIndex is the world's first open-source engine that supports both custom transformation logic and incremental processing specialized for data indexing. We are now officially open source!
An explanation of what customizable data indexing pipelines are, through comparisons and examples.
What makes indexing pipelines different from other data systems — and why they need special handling for incremental processing and persistence.
How CocoIndex handles system updates in indexing flows: automatic schema inference and managing data + logic evolution without downtime.
Handle large files in data indexing: processing granularity, fan-in/fan-out, and memory pressure, walked through with a patent XML example in CocoIndex.
Data consistency in indexing pipelines: concurrent updates, exposure risks, and how CocoIndex's data-driven approach keeps indexes converging.
Fundamentals of data indexing pipelines for RAG: what makes a good one, common production pitfalls, and how CocoIndex addresses them.
CocoIndex is a data indexing platform for AI applications — ingestion, processing, and management for RAG and semantic search.
Welcome to the official CocoIndex blog! We're excited to share our journey in building high-performance indexing infrastructure for AI applications.