Using large language models for extraction, transformation, and indexing.
Build a pipeline that turns YouTube podcasts into a knowledge graph: extract speakers, statements, and entities with an LLM, then dedupe them with embeddings.
Build a CocoIndex pipeline that generates a wiki page for each project in your codebase using an LLM, and keeps it fresh with incremental processing.
Extract Pydantic-typed structured data from patient intake forms using DSPy and CocoIndex, combining vision-model OCR with incremental processing.
Build a self-updating knowledge graph from meeting notes: extract decisions, tasks, owners, and relationships from your documents with CocoIndex and an LLM.
Build a real-time HackerNews trending topics detector with CocoIndex: a deep dive into Custom Sources and AI-powered topic extraction.
How to use BAML and CocoIndex to continuously extract structured data from PDF and Word patient intake forms with LLMs in production.
CocoIndex updates: in-process setup/drop API, the EmbedText building block, major SplitRecursively codebase-indexing improvements, union and NumPy type support, more LLM APIs, and the Kuzu graph target.
Build a real-time product recommendation engine with an LLM and a graph database, driven by an understanding of product categories (taxonomy).
CocoIndex updates: knowledge graph support, Qdrant and Supabase targets, KTable and LTable data types, additional LLM providers, and more.
CocoIndex now supports knowledge graphs with incremental processing. Building live knowledge for agents is super easy with CocoIndex!
Extract structured data from patient intake forms in PDF and Word documents using an LLM and CocoIndex: a practical healthcare document extraction example.
First release of the CocoIndex Changelog: LLM support, codebase indexing, custom functions, and assorted core and performance improvements.
Learn to use CocoIndex to extract structured data from PDF/Markdown with local LLMs running on Ollama. Everything runs on-premises, without sending data to external APIs.