Step-by-step guides for building pipelines with CocoIndex.
Build a CocoIndex pipeline that generates a wiki page for each project in your codebase using an LLM, and keeps it fresh with incremental processing.
Extract Pydantic-typed structured data from patient intake forms using DSPy and CocoIndex, combining vision-model OCR with incremental processing.
Build a custom incremental HackerNews connector with CocoIndex's Custom Source API and export to Postgres for semantic search and analytics.
How to use BAML and CocoIndex to continuously extract structured data from patient intake forms (PDF/Word) with LLMs in production.
Extract invoice fields from PDFs in Azure Blob Storage and load them into Snowflake with an incremental CocoIndex + GPT-4o pipeline: open-source unstructured ETL.
CocoIndex now supports custom targets. Export indexed data to any destination: a local file, cloud storage, a REST API, or your own bespoke system.
Build a scalable face detection and recognition pipeline with CocoIndex: embed faces, structure for search, and export to a vector DB.
How to index academic research papers by extracting metadata (e.g., title, authors, abstract) with LLMs and CocoIndex, for use by AI agents and workflows.
Index images in real time with CocoIndex and a vision model: generate multimodal embeddings and build a vector index for efficient retrieval.
Build a semantic text index with CocoIndex and text embeddings, then query it with natural language: a beginner's guide to embeddings and vector search.
CocoIndex now supports knowledge graphs with incremental processing, making it straightforward to build live knowledge for agents.
Index a codebase for RAG in real time with CocoIndex and Tree-sitter: chunk and embed the code, build a vector index for efficient retrieval, and run semantic search.
Learn to use CocoIndex to extract structured data from PDF/Markdown with local LLMs running on Ollama, all on-premises without sending data to external APIs.