
24 posts tagged with "Examples"

CocoIndex examples and implementation guides


Extracting Structured Data from Patient Intake Forms with DSPy and CocoIndex

· 13 min read
Linghua Jin
CocoIndex Maintainer

Extracting Structured Data from Patient Intake Forms with DSPy and CocoIndex

Patient intake forms are a rich source of structured clinical data, but traditional OCR + regex pipelines fail to reliably capture their nested, conditional, and variable structure, leaving most of that value locked in unstructured text or manual data entry.

In today's example, we are going to show how to extract clean, typed, Pydantic-validated structured data directly from PDFs, using:

  • DSPy (for multimodal structured extraction with Gemini 2.5 Flash vision)
  • CocoIndex (for incremental processing, caching, and database storage)

No manual text extraction, no brittle Markdown conversion — just connect to the source, transform the PDFs, and get validated patient models that are ready for production.

The entire example is open sourced under the Apache 2.0 license. To see more examples built with CocoIndex, refer to the examples page. ⭐ Star the project on GitHub if you find it helpful!

Why DSPy + CocoIndex?

Before jumping in, here’s what each component contributes:

DSPy: A programming framework for LLMs

Traditional LLM apps rely on prompt engineering: you write a prompt with instructions, few‑shot examples, and formatting, then call the model and parse the raw text. This approach is fragile:

  • Small changes in the prompt, model, or data can break the output format or quality.
  • Logic is buried in strings, making it hard to test, compose, or version.

DSPy replaces this with a programming model: you define what each LLM step should do (inputs, outputs, constraints), and the framework figures out how to prompt the model to satisfy that spec.

CocoIndex: An ultra-performant data processing engine for AI workloads

CocoIndex is an ultra-performant compute framework for AI workloads with incremental processing. You write simple in-memory computations in Python, and CocoIndex runs them as a resilient, scalable data pipeline (backed by a Rust engine) with fresh data always ready for serving. The same flow definition you use in a notebook can be lifted into production with little change.

With CocoIndex, changes in sources or transformation logic only trigger minimal recompute, cutting cold-start “backfill” latencies from hours to seconds while reducing GPU/API spend. In production, this manifests as always-fresh targets: you run in “live” mode with change data capture or polling, and CocoIndex keeps derived stores in sync with complex unstructured sources like codebases, PDFs, and multi-hop API compositions.

Because every transformation step is observable with lineage, teams get auditability and explainability out of the box, which helps for regulated scenarios like healthcare extraction or financial workflows.

DSPy & CocoIndex Synergy

The synergy shows up most clearly in end-to-end AI data products: DSPy defines robust, typed extractors or decision modules, and CocoIndex wires them into a resilient, incremental pipeline that can meet SLOs and compliance needs. Any change in documents, code, or business rules is reflected quickly and explainably in the targets and features those agents consume.

Flow Overview

Prerequisites

Before getting started, make sure you have the following set up:

  1. Install Postgres if you don't have one, and ensure you can connect to it from your development environment.
  2. Install Python dependencies:
pip install -U cocoindex dspy-ai pydantic pymupdf
  3. Create a .env file:
# Postgres database address for cocoindex
COCOINDEX_DATABASE_URL=postgres://cocoindex:cocoindex@localhost/cocoindex

# Gemini API key
GEMINI_API_KEY=YOUR_GEMINI_API_KEY

Pydantic Models: Define the structured schema

We define Pydantic models (Contact, Address, Insurance, etc.) to match a FHIR-inspired patient schema, enabling structured and validated representations of patient data. Each model corresponds to a key aspect of a patient's record, ensuring both type safety and nested relationships.

Patient schema

1. Contact Model

class Contact(BaseModel):
    name: str
    phone: str
    relationship: str
  • Represents an emergency or personal contact for the patient.
  • Fields:
    • name: Contact's full name.
    • phone: Contact phone number.
    • relationship: Relation to the patient (e.g., parent, spouse, friend).

2. Address Model

class Address(BaseModel):
    street: str
    city: str
    state: str
    zip_code: str
  • Represents a postal address.
  • Fields:
    • street, city, state, zip_code: Standard address fields.

3. Pharmacy Model

class Pharmacy(BaseModel):
    name: str
    phone: str
    address: Address
  • Represents the patient’s preferred pharmacy.
  • Fields:
    • name: Pharmacy name.
    • phone: Pharmacy contact number.
    • address: Uses the Address model for structured address information.

4. Insurance Model

class Insurance(BaseModel):
    provider: str
    policy_number: str
    group_number: str | None = None
    policyholder_name: str
    relationship_to_patient: str
  • Represents the patient’s insurance information.
  • Fields:
    • provider: Insurance company name.
    • policy_number: Unique policy number.
    • group_number: Optional group number.
    • policyholder_name: Name of the person covered under the insurance.
    • relationship_to_patient: Relationship to patient (e.g., self, parent).

5. Condition Model

class Condition(BaseModel):
    name: str
    diagnosed: bool
  • Represents a medical condition.
  • Fields:
    • name: Condition name (e.g., Diabetes).
    • diagnosed: Boolean indicating whether it has been officially diagnosed.

6. Medication Model

class Medication(BaseModel):
    name: str
    dosage: str
  • Represents a current medication the patient is taking.
  • Fields:
    • name: Medication name.
    • dosage: Dosage information (e.g., "10mg daily").

7. Allergy Model

class Allergy(BaseModel):
    name: str
  • Represents a known allergy.
  • Fields:
    • name: Name of the allergen (e.g., peanuts, penicillin).

8. Surgery Model

class Surgery(BaseModel):
    name: str
    date: str
  • Represents a surgery or procedure the patient has undergone.
  • Fields:
    • name: Surgery name (e.g., Appendectomy).
    • date: Surgery date (as a string, ideally ISO format).

9. Patient Model

class Patient(BaseModel):
    name: str
    dob: datetime.date
    gender: str
    address: Address
    phone: str
    email: str
    preferred_contact_method: str
    emergency_contact: Contact
    insurance: Insurance | None = None
    reason_for_visit: str
    symptoms_duration: str
    past_conditions: list[Condition] = Field(default_factory=list)
    current_medications: list[Medication] = Field(default_factory=list)
    allergies: list[Allergy] = Field(default_factory=list)
    surgeries: list[Surgery] = Field(default_factory=list)
    occupation: str | None = None
    pharmacy: Pharmacy | None = None
    consent_given: bool
    consent_date: str | None = None
  • Represents a complete patient record with personal, medical, and administrative information.
  • Key fields:
    • name, dob, gender: Basic personal info.
    • address, phone, email: Contact info.
    • preferred_contact_method: How the patient prefers to be reached.
    • emergency_contact: Nested Contact model.
    • insurance: Optional nested Insurance model.
    • reason_for_visit, symptoms_duration: Visit details.
    • past_conditions, current_medications, allergies, surgeries: Lists of nested models for comprehensive medical history.
    • occupation: Optional job info.
    • pharmacy: Optional nested Pharmacy model.
    • consent_given, consent_date: Legal/administrative consent info.

Why Use Pydantic Here?

  1. Validation: Ensures all fields are the correct type (e.g., dob is a date).
  2. Structured Nested Models: Patient has nested objects like Address, Contact, and Insurance.
  3. Default Values & Optional Fields: Handles optional fields and defaults (Field(default_factory=list) ensures empty lists if no data).
  4. Serialization: Easily convert models to JSON for APIs or databases.
  5. Error Checking: Automatically raises errors if invalid data is provided.
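
To make the validation point concrete, here is a minimal sketch (the inline record is hypothetical) showing how the Patient model parses nested dicts, coerces dob into a datetime.date, and rejects incomplete records:

from pydantic import ValidationError

record = Patient.model_validate({
    "name": "Jane Doe",
    "dob": "1990-04-12",  # coerced into datetime.date automatically
    "gender": "female",
    "address": {"street": "1 Main St", "city": "Springfield", "state": "IL", "zip_code": "62704"},
    "phone": "555-0100",
    "email": "jane@example.com",
    "preferred_contact_method": "email",
    "emergency_contact": {"name": "John Doe", "phone": "555-0101", "relationship": "spouse"},
    "reason_for_visit": "annual checkup",
    "symptoms_duration": "2 weeks",
    "consent_given": True,
})
print(record.dob)  # datetime.date(1990, 4, 12)

try:
    Patient.model_validate({"name": "Incomplete Record"})  # missing required fields
except ValidationError as exc:
    print(exc.error_count(), "validation errors")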

DSPy Vision Extractor

DSPy Signature

Let’s define PatientExtractionSignature. A Signature describes what data your module expects and what it will produce. Think of it as a schema for an AI task.

PatientExtractionSignature is a dspy.Signature, which is DSPy's way of declaring what the model should do, not how it does it.

# DSPy Signature for patient information extraction from images
class PatientExtractionSignature(dspy.Signature):
    """Extract structured patient information from a medical intake form image."""

    form_images: list[dspy.Image] = dspy.InputField(
        desc="Images of the patient intake form pages"
    )
    patient: Patient = dspy.OutputField(
        desc="Extracted patient information with all available fields filled"
    )

This signature defines the task contract for patient information extraction.

  • Inputs: form_images – a list of images of the intake form.
  • Outputs: patient – a structured Patient object.

From DSPy's point of view, this Signature is a "spec": a mapping from an image-based context to a structured, Pydantic-backed semantic object that can later be optimized, trained, and composed with other modules.

PatientExtractor Module

PatientExtractor is a dspy.Module, which in DSPy is a composable, potentially trainable building block that implements the Signature.

class PatientExtractor(dspy.Module):
    """DSPy module for extracting patient information from intake form images."""

    def __init__(self) -> None:
        super().__init__()
        self.extract = dspy.ChainOfThought(PatientExtractionSignature)

    def forward(self, form_images: list[dspy.Image]) -> Patient:
        """Extract patient information from form images and return as a Pydantic model."""
        result = self.extract(form_images=form_images)
        return result.patient  # type: ignore
  • In __init__, ChainOfThought is a DSPy primitive module that knows how to call an LLM with reasoning-style prompting to satisfy the given Signature. In other words, it is a default "strategy" for solving the "extract patient from images" task.
  • The forward method is DSPy's standard interface for executing a module. You pass form_images into self.extract(). DSPy then handles converting this call into an LLM interaction (or a trained program) that produces a patient field as declared in the Signature.

Conceptually, PatientExtractor is an ETL operator: the Signature describes the input/output types, and the internal ChainOfThought module is the function that fills that contract.

Single-Step Extraction

Now let’s wire the DSPy module to extract from a single PDF. At a high level:

  • The extractor receives PDF bytes directly
  • Internally converts PDF pages to DSPy Image objects using PyMuPDF
  • Processes images with vision model
  • Returns Pydantic model directly
@cocoindex.op.function(cache=True, behavior_version=1)
def extract_patient(pdf_content: bytes) -> Patient:
    """Extract patient information from PDF content."""

    # Convert PDF pages to DSPy Image objects
    pdf_doc = pymupdf.open(stream=pdf_content, filetype="pdf")

    form_images = []
    for page in pdf_doc:
        # Render page to pixmap (image) at 2x resolution for better quality
        pix = page.get_pixmap(matrix=pymupdf.Matrix(2, 2))
        # Convert to PNG bytes
        img_bytes = pix.tobytes("png")
        # Create DSPy Image from bytes
        form_images.append(dspy.Image(img_bytes))

    pdf_doc.close()

    # Extract patient information using DSPy with vision
    extractor = PatientExtractor()
    patient = extractor(form_images=form_images)

    return patient  # type: ignore

This function is a CocoIndex function (decorated with @cocoindex.op.function) that takes PDF bytes as input and returns a fully structured Patient Pydantic object.

  • cache=True allows repeated calls with the same PDF to reuse results.
  • behavior_version=1 ensures versioning of the function for reproducibility.

Create DSPy Image objects

We open PDF from bytes using PyMuPDF (pymupdf), then we iterate over each page.

  • Useful trick: Render the page as a high-resolution image (2x) for better OCR/vision performance.
  • Convert the rendered page to PNG bytes.
  • Wrap the PNG bytes in a DSPy Image object.

DSPy Extraction

The list of form_images is passed to the DSPy module:

  1. ChainOfThought reasoning interprets each image.
  2. Vision + NLP extract relevant text fields.
  3. Populate Pydantic Patient object with structured patient info.

CocoIndex Flow

CocoIndex Flow

  • Loads PDFs from local directory as binary
  • For each document, applies single transform: PDF bytes → Patient data
  • Exports the results in a PostgreSQL table

Declare Flow

Declare a CocoIndex flow, connect to the source, add a data collector to collect processed data.

@cocoindex.flow_def(name="PatientIntakeExtractionDSPy")
def patient_intake_extraction_dspy_flow(
flow_builder: cocoindex.FlowBuilder, data_scope: cocoindex.DataScope
) -> None:
data_scope["documents"] = flow_builder.add_source(
cocoindex.sources.LocalFile(path="data/patient_forms", binary=True)
)

patients_index = data_scope.add_collector()
  • @cocoindex.flow_def tells CocoIndex that this function is a flow definition, not regular runtime code.
  • add_source() registers a LocalFile source that traverses the data/patient_forms directory and creates a logical table named documents.

Ingesting Data

You can connect to various sources, or even a custom source, with CocoIndex if native connectors are not available. CocoIndex is designed to keep your indexes synchronized with your data sources through a feature called live updates, which automatically detects changes in your sources and updates your indexes accordingly, so your search results and data analysis are always based on the most current information. You can read more at https://cocoindex.io/docs/tutorials/live_updates.
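
As a hedged sketch, the same LocalFile source above can be polled periodically by passing refresh_interval to add_source (the parameter usage mirrors the HackerNews example later on this page; the five-minute interval is just an illustration):

import datetime

data_scope["documents"] = flow_builder.add_source(
    cocoindex.sources.LocalFile(path="data/patient_forms", binary=True),
    refresh_interval=datetime.timedelta(minutes=5),  # poll for new/changed PDFs in live mode
)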

Process documents

with data_scope["documents"].row() as doc:
# Extract patient information directly from PDF using DSPy with vision
# (PDF->Image conversion happens inside the extractor)
doc["patient_info"] = doc["content"].transform(extract_patient)

# Collect the extracted patient information
patients_index.collect(
filename=doc["filename"],
patient_info=doc["patient_info"],
)

This iterates over each document. We transform doc["content"] (the bytes) by our extract_patient function. The result is stored in a new field patient_info.

Then we collect a row with the filename and extracted patient_info.

Transforming Data

Nested Data

Export to Postgres

    patients_index.export(
        "patients",
        cocoindex.storages.Postgres(table_name="patients_info_dspy"),
        primary_key_fields=["filename"],
    )

We export the collected index to Postgres. This will create and maintain a table patients_info_dspy keyed by filename, automatically deleting or updating rows if inputs change.

Because CocoIndex tracks data lineage, it will handle updates/deletions of source files incrementally.
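
Once an update has run, you can sanity-check the exported rows with a short script. This is only a sketch: it assumes the table name patients_info_dspy from the export above and a Postgres instance reachable via COCOINDEX_DATABASE_URL.

import os
import psycopg

with psycopg.connect(os.environ["COCOINDEX_DATABASE_URL"]) as conn:
    with conn.cursor() as cur:
        cur.execute("SELECT filename, patient_info FROM patients_info_dspy LIMIT 5")
        for filename, patient_info in cur.fetchall():
            print(filename, patient_info)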

Configure CocoIndex settings

Define a CocoIndex settings function that configures the AI model for DSPy:

@cocoindex.settings
def cocoindex_settings() -> cocoindex.Settings:
    # Configure the model used in DSPy
    lm = dspy.LM("gemini/gemini-2.5-flash")
    dspy.configure(lm=lm)

    return cocoindex.Settings.from_env()

It returns a cocoindex.Settings object initialized from environment variables, enabling the system to use the configured model and environment settings for all DSPy operations.

Running the Pipeline

Update the index:

cocoindex update main

CocoInsight

I used CocoInsight (in free beta) to troubleshoot the index generation and understand the data lineage of the pipeline. It connects to your local CocoIndex server with zero pipeline data retention.

cocoindex server -ci main

Scalable Open ecosystem, not a closed box

CocoIndex is intentionally "composable by default": it gives you a fast, incremental data engine and a clean flow abstraction, but never locks you into a specific model, vector DB, processing module, or orchestration stack.

CocoIndex treats everything — sources, ops, and storages — as pluggable interfaces rather than proprietary primitives. You can read from local files, S3, APIs, or custom sources, call any data transformation logic (beyond SQL, DSPy modules, any complex Python transformations, generated parsers etc), and export to relational databases, vector databases, search engines, or custom sinks through its storage layer.

Why DSPy + CocoIndex fits this philosophy

DSPy is itself a compositional framework for LLMs: you define typed Signatures and Modules, and it learns how to implement them, making the LLM layer programmable, testable, and optimizable.

CocoIndex treats these modules as first-class operators in the flow, so you get a clean separation of concerns: DSPy owns “how the model thinks,” while CocoIndex owns “how data moves, is cached, and is served” across changing PDFs, code, or APIs.

This pairing is powerful because neither system tries to be the entire stack: CocoIndex does not prescribe a prompt framework, and DSPy does not prescribe a data pipeline engine. Instead, they interlock: DSPy modules become composable building blocks inside CocoIndex flows, and CocoIndex gives those modules a production context with retries, batching, caching, and live updates.

Comparison of DSPy vs BAML examples

For this DSPy and BAML example with CocoIndex, there are some differences in the data flow and the schema enforcement.

| Aspect | This tutorial (DSPy) | Previous intake tutorials (OpenAI/BAML) |
| --- | --- | --- |
| Model interaction | DSPy Signatures + Modules, trainable and optimizable. | Direct prompts or BAML functions with generated clients. |
| Input format | PDF bytes → images → Gemini Vision (no Markdown step). | PDF/DOCX → Markdown via parsers (e.g., MarkItDown) → text LLM. |
| Schema enforcement | Pydantic models used directly as DSPy output types. | Dataclasses or BAML types mapped into CocoIndex collectors. |
| Pipeline engine | CocoIndex incremental flow with LocalFile source + Postgres sink. | Same engine. |

DSPy and BAML both help build LLM applications with structured outputs, but they emphasize different things: DSPy focuses on modular Python workflows and optimizer-driven prompt tuning, while BAML focuses on a typed DSL, schema guarantees, and multi-language client generation.

Support CocoIndex ❤️

If this example was helpful, the easiest way to support CocoIndex is to give the project a ⭐ on GitHub.

Your stars help us grow the community, stay motivated, and keep shipping better tools for real-time data ingestion and transformation.

Building a Knowledge Graph from Meeting Notes that automatically updates

· 15 min read
Linghua Jin
CocoIndex Maintainer

Building a Knowledge Graph from Meeting Notes that automatically updates

Meeting notes are goldmines of organizational intelligence. They capture decisions, action items, participant information, and the relationships between people and tasks. Yet most organizations treat them as static documents — searchable only through basic text search. Imagine instead being able to query your meetings like a database: "Who attended meetings where the topic was 'budget planning'?" or "What tasks did Sarah get assigned across all meetings?"

This is where knowledge graphs shine. By extracting structured information from unstructured meeting notes and building a graph representation, you unlock powerful relationship-based queries and insights that would be impossible with traditional document storage.

In this post, we'll explore a practical CocoIndex example that demonstrates exactly this — building a knowledge graph of meetings from Markdown documents stored in Google Drive, powered by LLM-based extraction, and persisted in Neo4j.

The source code is open sourced and available at Meeting Notes Graph Code.

Building a Real-Time HackerNews Trending Topics Detector with CocoIndex: A Deep Dive into Custom Sources and AI

· 17 min read
Linghua Jin
CocoIndex Maintainer

Building a Real-Time HackerNews Trending Topics Detector with CocoIndex: A Deep Dive into Custom Sources and AI

In the age of information overload, understanding what's trending—and why—is crucial for developers, researchers, and data engineers. HackerNews is one of the most influential tech communities, but manually tracking emerging topics across thousands of threads and comments is practically impossible.

What if you could automatically index HackerNews content, extract topics using AI, and query trending discussions in real-time? That's exactly what CocoIndex enables through its Custom Sources framework combined with LLM-powered extraction.

In this post, we'll explore the HackerNews Trending Topics example, a production-ready pipeline that demonstrates some of the most powerful concepts in CocoIndex: incremental data syncing, LLM-powered information extraction, and queryable indexes.

Extract structured information from HackerNews with a Custom Source and keep it in sync with Postgres

· 12 min read
Linghua Jin
CocoIndex Maintainer

Extract structured information from HackerNews with a Custom Source and export to Postgres

Custom Sources are one of the most powerful concepts in CocoIndex. They let you turn any API — internal or external — into a first-class, incremental data stream that the framework can automatically diff, track, and sync.

Think of it as React for data flows: you describe the shape of your data, and CocoIndex handles incremental updates, state persistence, lineage, and downstream sync. You get predictable, debuggable, fault-tolerant pipelines without the usual orchestration overhead.

In this example, we build a custom connector for HackerNews. It fetches recent stories + nested comments, indexes them, and exposes a simple search interface powered by Postgres full-text search.

If this example is helpful, we’d appreciate a ⭐ on CocoIndex GitHub!

Why Use a Custom Source?

In many scenarios, pipelines don't just read from clean tables. They depend on:

  • Internal REST services
  • Partner APIs
  • Legacy systems
  • Non-standard data models that don’t fit traditional connectors

CocoIndex’s Custom Source API makes these integrations declarative, incremental, and safe by default. Instead of writing ad-hoc scripts, you wrap your API as a “source component,” and CocoIndex takes it from there.

Project Walkthrough — Building a HackerNews Index

Goals

  1. Call HackerNews Search API
  2. Fetch nested comments
  3. Update only modified threads
  4. Store content in Postgres
  5. Expose a text search interface

CocoIndex handles change detection, idempotency, lineage, and state sync automatically.

Overview

HackerNews Custom Source Pipeline

The pipeline consists of three major parts:

  1. Define a custom source (HackerNewsConnector)
    • Calls HackerNews API
    • Emits rows for changed/updated threads
    • Pulls full thread + comment tree
  2. Build an index with CocoIndex Flow
    • Collect thread content
    • Collect all comments recursively
    • Export to a Postgres table (hn_messages)
  3. Add a lightweight query handler
    • Uses PostgreSQL full-text search
    • Returns ranked matches for a keyword query

Each cocoindex update only processes changed HN threads and keeps everything in sync.

The project is open source and available on GitHub.

Prerequisites

As with the other examples on this page, you need a running Postgres instance and the project's Python dependencies installed; see "Running Your HackerNews Custom Source" below for the install commands.

Defining the Data Model

Every custom source defines two lightweight data types:

  • Key Type → uniquely identifies an item
  • Value Type → the full content for that item

In HackerNews, each news item is a thread, and each thread can have multiple comments. HackerNews Thread and Comments

For HackerNews, let’s define keys like this:

class _HackerNewsThreadKey(NamedTuple):
    """Row key type for HackerNews source."""
    thread_id: str

Keys must be:

  • hashable
  • serializable
  • stable (doesn’t change over time)

Values hold the actual dataset:

@dataclasses.dataclass
class _HackerNewsComment:
    id: str
    author: str | None
    text: str | None
    created_at: datetime | None

@dataclasses.dataclass
class _HackerNewsThread:
    """Value type for HackerNews source."""
    author: str | None
    text: str
    url: str | None
    created_at: datetime | None
    comments: list[_HackerNewsComment]

This tells CocoIndex exactly what every HackerNews “item” looks like when fully fetched. _HackerNewsThread holds a post and all its comments, while _HackerNewsComment represents individual comments.

Building a Custom Source Connector

A Custom Source has two parts:

  1. SourceSpec — declarative configuration
  2. SourceConnector — operational logic for reading data

Writing the SourceSpec

A SourceSpec in CocoIndex is a declarative configuration that tells the system what data to fetch and how to connect to a source. It doesn’t fetch data itself — that’s handled by the source connector.

class HackerNewsSource(SourceSpec):
    """Source spec for HackerNews API."""
    tag: str | None = None
    max_results: int = 100

Fields:

  • tag
    • Optional filter for the type of HackerNews content.
    • Example: "story", "job", "poll".
    • If None, it fetches all types.
  • max_results
    • Maximum number of threads to fetch from HackerNews at a time.
    • Helps limit the size of the index for performance or testing.

Defining the connector

Sets up the connector's configuration and HTTP session so it can fetch HackerNews data efficiently.

@source_connector(
    spec_cls=HackerNewsSource,
    key_type=_HackerNewsThreadKey,
    value_type=_HackerNewsThread,
)
class HackerNewsConnector:
    """Custom source connector for HackerNews API."""

    _spec: HackerNewsSource
    _session: aiohttp.ClientSession

    def __init__(self, spec: HackerNewsSource, session: aiohttp.ClientSession):
        self._spec = spec
        self._session = session

    @staticmethod
    async def create(spec: HackerNewsSource) -> "HackerNewsConnector":
        """Create a HackerNews connector from the spec."""
        return HackerNewsConnector(spec, aiohttp.ClientSession())
  • source_connector tells CocoIndex that this class is a custom source connector. It specifies:
    • spec_cls: the configuration class (HackerNewsSource)
    • key_type: how individual items are identified (_HackerNewsThreadKey)
    • value_type: the structure of the data returned (_HackerNewsThread)
  • create() is called by CocoIndex to initialize the connector, and it sets up a fresh aiohttp.ClientSession for making HTTP requests.

Listing Available Threads

The list() method in HackerNewsConnector is responsible for discovering all available HackerNews threads that match the given criteria (tag, max results) and returning metadata about them. CocoIndex uses this to know which threads exist and which may have changed.

    async def list(
        self,
    ) -> AsyncIterator[PartialSourceRow[_HackerNewsThreadKey, _HackerNewsThread]]:
        """List HackerNews threads using the search API."""
        # Use HackerNews search API
        search_url = "https://hn.algolia.com/api/v1/search_by_date"
        params: dict[str, Any] = {"hitsPerPage": self._spec.max_results}

        if self._spec.tag:
            params["tags"] = self._spec.tag
        async with self._session.get(search_url, params=params) as response:
            response.raise_for_status()
            data = await response.json()
            for hit in data.get("hits", []):
                if thread_id := hit.get("objectID", None):
                    utime = hit.get("updated_at")
                    ordinal = (
                        int(datetime.fromisoformat(utime).timestamp())
                        if utime
                        else NO_ORDINAL
                    )
                    yield PartialSourceRow(
                        key=_HackerNewsThreadKey(thread_id=thread_id),
                        data=PartialSourceRowData(ordinal=ordinal),
                    )

list() fetches metadata for all recent HackerNews threads.

  • For each thread:
    • It generates a PartialSourceRow with:
      • key: the thread ID
      • ordinal: the last updated timestamp
  • Purpose: allows CocoIndex to track what threads exist and which have changed without fetching full thread content.

Fetching Full Thread Content

This async method fetches a single HackerNews thread (including its comments) from the API, and wraps the result in a PartialSourceRowData object — the structure CocoIndex uses for row-level ingestion.

    async def get_value(
        self, key: _HackerNewsThreadKey
    ) -> PartialSourceRowData[_HackerNewsThread]:
        """Get a specific HackerNews thread by ID using the items API."""

        # Use HackerNews items API to get full thread with comments
        item_url = f"https://hn.algolia.com/api/v1/items/{key.thread_id}"

        async with self._session.get(item_url) as response:
            response.raise_for_status()
            data = await response.json()

            if not data:
                return PartialSourceRowData(
                    value=NON_EXISTENCE,
                    ordinal=NO_ORDINAL,
                    content_version_fp=None,
                )
            return PartialSourceRowData(
                value=HackerNewsConnector._parse_hackernews_thread(data)
            )
  • get_value() fetches the full content of a specific thread, including comments.
  • Parses the raw JSON into structured Python objects (_HackerNewsThread + _HackerNewsComment).
  • Returns a PartialSourceRowData containing the full thread.

Ordinal Support

Tells CocoIndex that this source provides timestamps (ordinals).

    def provides_ordinal(self) -> bool:
        return True

CocoIndex uses ordinals to incrementally update only changed threads, improving efficiency.

Parsing JSON into Structured Data

This static method takes the raw JSON response from the API and turns it into a normalized _HackerNewsThread object containing:

  • The post (title, text, metadata)
  • All nested comments, flattened into a single list
  • Proper Python datetime objects

It performs recursive traversal of the comment tree.

    @staticmethod
    def _parse_hackernews_thread(data: dict[str, Any]) -> _HackerNewsThread:
        comments: list[_HackerNewsComment] = []

        def _add_comments(parent: dict[str, Any]) -> None:
            children = parent.get("children", None)
            if not children:
                return
            for child in children:
                ctime = child.get("created_at")
                if comment_id := child.get("id", None):
                    comments.append(
                        _HackerNewsComment(
                            id=str(comment_id),
                            author=child.get("author", ""),
                            text=child.get("text", ""),
                            created_at=datetime.fromisoformat(ctime) if ctime else None,
                        )
                    )
                _add_comments(child)

        _add_comments(data)

        ctime = data.get("created_at")
        text = data.get("title", "")
        if more_text := data.get("text", None):
            text += "\n\n" + more_text
        return _HackerNewsThread(
            author=data.get("author"),
            text=text,
            url=data.get("url"),
            created_at=datetime.fromisoformat(ctime) if ctime else None,
            comments=comments,
        )
  • Converts raw HackerNews API response into _HackerNewsThread and _HackerNewsComment.
  • _add_comments() recursively parses nested comments.
  • Combines title + text into the main thread content.
  • Produces a fully structured object ready for indexing.

Putting It All Together in a Flow

Your flow now reads exactly like a React component.

Define the flow and connect source

@cocoindex.flow_def(name="HackerNewsIndex")
def hackernews_flow(
flow_builder: cocoindex.FlowBuilder, data_scope: cocoindex.DataScope
) -> None:

# Add the custom source to the flow
data_scope["threads"] = flow_builder.add_source(
HackerNewsSource(tag="story", max_results=500),
refresh_interval=timedelta(minutes=1),
)

# Create collectors for different types of searchable content
message_index = data_scope.add_collector()


Process each thread and collect structured information

with data_scope["threads"].row() as thread:
# Index the main thread content
message_index.collect(
id=thread["thread_id"],
thread_id=thread["thread_id"],
content_type="thread",
author=thread["author"],
text=thread["text"],
url=thread["url"],
created_at=thread["created_at"],
)

Process each comment of a thread and collect structured information

with thread["comments"].row() as comment:
message_index.collect(
id=comment["id"],
thread_id=thread["thread_id"],
content_type="comment",
author=comment["author"],
text=comment["text"],
created_at=comment["created_at"],
)

Export to database tables

    message_index.export(
        "hn_messages",
        cocoindex.targets.Postgres(),
        primary_key_fields=["id"],
    )

CocoIndex now:

  • polls the HackerNews API
  • tracks changes incrementally
  • flattens nested comments
  • exports to Postgres
  • supports live mode

Your app can now query it as a real-time search index.

Querying & Searching the HackerNews Index

At this point you are done with the index flow. As the next step, you could define query handlers — so you can run queries in CocoInsight. You can use any library or framework of your choice to perform queries. You can read more in the documentation about Query Handler.

@hackernews_flow.query_handler()
def search_text(query: str) -> cocoindex.QueryOutput:
    """Search HackerNews threads by title and content."""
    table_name = cocoindex.utils.get_target_default_name(hackernews_flow, "hn_messages")

    with connection_pool().connection() as conn:
        with conn.cursor() as cur:
            # Simple text search using PostgreSQL's text search capabilities
            cur.execute(
                f"""
                SELECT id, thread_id, author, content_type, text, created_at,
                       ts_rank(to_tsvector('english', text), plainto_tsquery('english', %s)) as rank
                FROM {table_name}
                WHERE to_tsvector('english', text) @@ plainto_tsquery('english', %s)
                ORDER BY rank DESC, created_at DESC
                """,
                (query, query),
            )

            results = []
            for row in cur.fetchall():
                results.append(
                    {
                        "id": row[0],
                        "thread_id": row[1],
                        "author": row[2],
                        "content_type": row[3],
                        "text": row[4],
                        "created_at": row[5].isoformat(),
                    }
                )

    return cocoindex.QueryOutput(results=results)

This code defines a query handler that searches HackerNews threads and comments indexed in CocoIndex. It determines the database table storing the messages, then uses PostgreSQL full-text search (to_tsvector and plainto_tsquery) to find rows matching the query.

Results are ranked by relevance (ts_rank) and creation time, formatted into dictionaries, and returned as a structured cocoindex.QueryOutput. Essentially, it performs a full-text search over the indexed content and delivers ranked, structured results.
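
Because search_text is a plain Python function, you can also call it directly once the flow has been updated at least once (the query string below is just an example):

output = search_text("rust async runtime")
for item in output.results[:3]:
    print(item["author"], item["content_type"], item["text"][:80])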

Running Your HackerNews Custom Source

Once your custom source and flow are ready, running it with CocoIndex is straightforward. You can either update the index on-demand or keep it continuously in sync with HackerNews.

1. Install Dependencies

Make sure you have Python installed and then install your project in editable mode:

pip install -e .

This installs CocoIndex along with all required dependencies, letting you develop and update the connector without reinstalling.

2. Update the Target (On-Demand)

To populate your target (e.g., Postgres) with the latest HackerNews threads:

cocoindex update main
  • Only threads that have changed will be re-processed.
  • Your target remains in sync with the most recent 500 HackerNews threads.
  • Efficient incremental updates save time and compute resources.

Note that each time you run the update command, CocoIndex only re-processes threads that have changed and keeps the target in sync with the most recent 500 threads from HackerNews. You can also run the update command in live mode, which keeps the target in sync with the source continuously:

cocoindex update -L main
  • Runs the flow in live mode, polling HackerNews periodically.
  • CocoIndex automatically handles incremental changes and keeps the target synchronized.
  • Ideal for dashboards, search, or AI pipelines that require real-time data.

3. Troubleshoot & Inspect with CocoInsight

CocoInsight lets you visualize and debug your flow, see the lineage of your data, and understand what’s happening under the hood.

Start the server:

cocoindex server -ci main

Then open the UI in your browser: https://cocoindex.io/cocoinsight

CocoInsight has zero pipeline data retention — it’s safe for debugging and inspecting your flows locally.

Note that this requires the query handler set up in the previous step.

What You Can Build Next

This simple example opens the door to a lot more:

  • Build a trending-topic detector
  • Run LLM summarization pipelines on top of indexed threads
  • Add embeddings + vector search
  • Mirror HN into your internal data warehouse
  • Build a real-time HN dashboard
  • Extend to other news sources (Reddit, Lobsters, etc.)

Because the whole pipeline is declarative and incremental, extending it is straightforward.

Since Custom Sources allow you to wrap any Python logic into an incremental data stream, the best use cases are usually "Hard-to-Reach" data — systems that don't have standard database connectors, have complex nesting, or require heavy pre-processing.

The Knowledge Aggregator for LLM Context

Building a context engine for an AI bot often requires pulling from non-standard documentation sources.

The "Composite" Entity (Data Stitching)

Most companies have user data fragmented across multiple microservices. You can build a Custom Source that acts as a "virtual join" before the data ever hits your index. For example, the source:

  1. Fetches a User ID from an Auth Service (Okta/Auth0).
  2. Uses that ID to fetch billing status from Stripe API.
  3. Uses that ID to fetch usage logs from an Internal Redis.

Instead of managing complex ETL joins downstream, the Custom Source yields a single User360 object. CocoIndex tracks the state of this composite object; if the user upgrades in Stripe or changes their email in Auth0, the index updates automatically.
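
Here is a schematic sketch of such a composite source, following the same custom-source shape as the HackerNews connector above. The auth/billing/usage helper functions are hypothetical stand-ins for your own service clients, and imports of the CocoIndex names (SourceSpec, source_connector, PartialSourceRow, PartialSourceRowData) are elided just as they are in the example above.

import dataclasses
from typing import AsyncIterator, NamedTuple


class _UserKey(NamedTuple):
    user_id: str


@dataclasses.dataclass
class _User360:
    email: str
    billing_status: str
    recent_usage: list[str]


class User360Source(SourceSpec):
    """Declarative config for the composite user source."""
    max_users: int = 1000


@source_connector(spec_cls=User360Source, key_type=_UserKey, value_type=_User360)
class User360Connector:
    def __init__(self, spec: User360Source):
        self._spec = spec

    @staticmethod
    async def create(spec: User360Source) -> "User360Connector":
        return User360Connector(spec)

    async def list(self) -> AsyncIterator[PartialSourceRow[_UserKey, _User360]]:
        # Enumerate user IDs from the auth service (hypothetical helper).
        for user_id in await list_user_ids_from_auth(self._spec.max_users):
            yield PartialSourceRow(key=_UserKey(user_id=user_id), data=PartialSourceRowData())

    async def get_value(self, key: _UserKey) -> PartialSourceRowData[_User360]:
        # Stitch three systems into one object; CocoIndex diffs the result downstream.
        email = await fetch_email_from_auth(key.user_id)        # hypothetical client
        billing = await fetch_billing_from_stripe(key.user_id)  # hypothetical client
        usage = await fetch_usage_from_redis(key.user_id)       # hypothetical client
        return PartialSourceRowData(
            value=_User360(email=email, billing_status=billing, recent_usage=usage)
        )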

The "Legacy Wrapper" (Modernization Layer)

Enterprises often have valuable data locked in systems that are painful to query (SOAP, XML, mainframes). A Custom Source can wrap those interfaces, so you get a modern, queryable SQL interface (via the CocoIndex target) on top of a 20-year-old system without rewriting the legacy system itself.

Public Data Monitor (Competitive Intelligence)

Tracking changes on public websites or APIs that don't offer webhooks.

  • The Source:
    • Competitor Pricing: Scraping e-commerce product pages.
    • Regulatory Feeds: Polling a government RSS feed or FDA drug approval database.
    • Crypto/Stocks: Hitting a CoinGecko or Yahoo Finance API.

The CocoIndex Value: Using the diff capabilities, you can trigger downstream alerts only when a price changes by >5% or a new regulation is posted, rather than spamming your database with identical polling results.

Why This Matters

Custom Sources extend this model to any API — internal, external, legacy, or real-time.

This unlocks a simple but powerful pattern:

If you can fetch it, CocoIndex can index it, diff it, and sync it.

Whether you’re indexing HackerNews or orchestrating dozens of enterprise services, the framework gives you a stable backbone with:

  • persistent state
  • deterministic updates
  • automatic lineage
  • flexible target exports
  • minimal infrastructure overhead

⭐ Try It, Fork It, Star It

If you found this useful, a star on GitHub means a lot — it helps others discover CocoIndex and supports further development.

Extracting Intake Forms with BAML and CocoIndex

· 8 min read
Linghua Jin
CocoIndex Maintainer

Extracting Intake Forms with BAML and CocoIndex

This tutorial shows how to use BAML together with CocoIndex to build a data pipeline that extracts structured patient information from PDF intake forms. The BAML definitions describe the desired output schema and prompt logic, while CocoIndex orchestrates file input, transformation, and incremental indexing.

We’ll walk through setup, defining the BAML schema, generating the Python client, writing the CocoIndex flow, and running the pipeline. Throughout, we follow best practices (e.g. caching heavy steps) and cite documentation for key concepts.

The full project is open sourced here ⭐. To see more examples built with CocoIndex, you can refer to the examples page.

BAML

BAML, created by BoundaryML, is a typed prompt engineering language that makes LLM workflows predictable, testable, and production-safe. Instead of treating prompts as fragile strings, BAML lets developers define clear input parameters, output schemas, and model configurations — transforming prompts into strongly typed functions.

CocoIndex

CocoIndex is a unified data processing engine built for AI-native applications. It lets you define transformations in one declarative workflow — then keeps everything continuously up to date with real-time, incremental processing. Designed for reliability and scale, CocoIndex ensures that every derived artifact (embeddings, metadata, extractions, models) always reflects the latest source data, making it the foundation for fast, consistent RAG, analytics, and automation pipelines.

Flow Overview

Flow Overview

  • Read PDF files from a directory.
  • For each file, call the BAML function to get a structured Patient.
  • Collect results and export to Postgres.

Prerequisites

  1. Install Postgres if you don't have one.

  2. Install dependencies

    pip install -U cocoindex baml-py
  3. Create a .env file. You can copy it from .env.example first:

    cp .env.example .env

    Then edit the file to fill in your GEMINI_API_KEY.

Structured Extraction Component with BAML

Create a baml_src/ directory for your BAML definitions. We’ll define a schema for patient intake data (nested classes) and a function that prompts Gemini to extract those fields from a PDF. Save this as baml_src/patient.baml.

Define Patient Schema

Classes: We defined Pydantic-style classes (Contact, Address, Insurance, etc.) to match the FHIR-inspired patient schema. These become typed output models. Required fields are non-nullable; optional fields use ?.

Schema

class Contact {
  name string
  phone string
  relationship string
}

class Address {
  street string
  city string
  state string
  zip_code string
}

class Pharmacy {
  name string
  phone string
  address Address
}

class Insurance {
  provider string
  policy_number string
  group_number string?
  policyholder_name string
  relationship_to_patient string
}

class Condition {
  name string
  diagnosed bool
}

class Medication {
  name string
  dosage string
}

class Allergy {
  name string
}

class Surgery {
  name string
  date string
}

class Patient {
  name string
  dob string
  gender string
  address Address
  phone string
  email string
  preferred_contact_method string
  emergency_contact Contact
  insurance Insurance?
  reason_for_visit string
  symptoms_duration string
  past_conditions Condition[]
  current_medications Medication[]
  allergies Allergy[]
  surgeries Surgery[]
  occupation string?
  pharmacy Pharmacy?
  consent_given bool
  consent_date string?
}

Define the BAML function to extract patient info from a PDF

function ExtractPatientInfo(intake_form: pdf) -> Patient {
  client Gemini
  prompt #"
    Extract all patient information from the following intake form document.
    Please be thorough and extract all available information accurately.
    {{ _.role("user") }}
    {{ intake_form }}

    Fill in with "N/A" for required fields if the information is not available.

    {{ ctx.output_format }}
  "#
}

We specify client Gemini and a prompt template. The special variable {{ intake_form }} injects the PDF, and {{ ctx.output_format }} tells BAML to expect the structured format defined by the return type. The prompt explicitly asks Gemini to extract all fields, filling “N/A” if missing.

BAML PDF Extraction: Crucial Prompt Role Gotcha

When using BAML to extract structured data (like a Patient record) from PDFs, it is absolutely critical to ensure the PDF content is injected as part of the user message in the prompt. Specifically, you need to include {{ _.role("user") }} before you insert your file data with {{ intake_form }}:

Why role("user") matters?

  • For OpenAI models (e.g., GPT-4, GPT-4o), if the file's content is not presented in the user message, the model won't "see" the PDF at all — your extraction will fail or be empty.
  • For Gemini and Anthropic, it's more forgiving and can sometimes work anyway, which makes this confusing to debug across providers.

We only discovered this after a discussion on the BAML repo and our own investigations. If you skip the explicit role("user"), you might waste hours debugging inconsistent extractions.

Takeaway:
When building extraction flows with BAML, always set the role to "user" before adding file content to your prompt. That makes your workflow robust and portable across LLM providers.

Thanks to Deepu and Prashanth from our discord community for working with us on this issue. You can see a real-world debugging journey in our Discord thread.

Configure the LLM client to use Google’s Gemini model

client<llm> Gemini {
  provider google-ai
  options {
    model gemini-2.5-flash
    api_key env.GEMINI_API_KEY
  }
}

Configure BAML generator

In baml_src folder add generator.baml:

generator python_client {
  output_type python/pydantic
  output_dir "../"
  version "0.213.0"
}

The generator block tells baml-cli to create a Python client with Pydantic models in the parent directory.

When we run baml-cli generate:

This will compile the .baml definitions into a baml_client/ Python package in your project root. It contains:

  • baml_client/types.py with Pydantic classes (Patient, etc.).
  • baml_client/sync_client.py and async_client.py with a callable b object. For example, b.ExtractPatientInfo(pdf) will return a Patient.
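
As a quick sanity check of the generated client, here is a minimal sketch calling the sync b object directly on one PDF (the file path is hypothetical; the Pdf construction mirrors the CocoIndex function shown later):

import base64

import baml_py
from baml_client.sync_client import b
from baml_client.types import Patient

with open("data/patient_forms/sample_form.pdf", "rb") as f:
    pdf = baml_py.Pdf.from_base64(base64.b64encode(f.read()).decode("utf-8"))

patient: Patient = b.ExtractPatientInfo(pdf)
print(patient.name, patient.dob)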

Continuous Data Transformation flow with incremental processing

Next we will define the data transformation flow with CocoIndex. Once you declare the state and transformation logic, CocoIndex takes care of all the state changes for you, from source to target.

CocoIndex Flow

Declare Flow

Declare a Cocoindex flow, connect to the source, add a data collector to collect processed data.

@cocoindex.flow_def(name="PatientIntakeExtractionBaml")
def patient_intake_extraction_flow(
flow_builder: cocoindex.FlowBuilder, data_scope: cocoindex.DataScope
) -> None:
data_scope["documents"] = flow_builder.add_source(
cocoindex.sources.LocalFile(
path=os.path.join("data", "patient_forms"), binary=True
)
)

patients_index = data_scope.add_collector()

This declares the flow, registers a LocalFile source that reads PDF files from data/patient_forms as binary, and adds a collector that will gather the extracted patient data.

Ingesting Data

Define a custom function to use BAML extraction to transform a PDF

@cocoindex.op.function(cache=True, behavior_version=1)
async def extract_patient_info(content: bytes) -> Patient:
    pdf = baml_py.Pdf.from_base64(base64.b64encode(content).decode("utf-8"))
    return await b.ExtractPatientInfo(pdf)
  • The extract_patient_info function is decorated with @cocoindex.op.function(cache=True, behavior_version=1). Setting cache=True causes CocoIndex to cache outputs of this function for incremental runs (so unchanged inputs skip rerunning the LLM). We increase behavior_version (start at 1) so that any prompt or logic changes will force a refresh.
  • Inside the function, we convert bytes to a BAML Pdf (via base64) and then call await b.ExtractPatientInfo(pdf). This returns a Patient dataclass instance (mapped from the BAML output)

Process each document

  1. Transform each doc with BAML.
  2. Collect the structured output.

    with data_scope["documents"].row() as doc:
        doc["patient_info"] = doc["content"].transform(extract_patient_info)

        patients_index.collect(
            filename=doc["filename"],
            patient_info=doc["patient_info"],
        )

Transforming Data

It is common to have heavily nested data; CocoIndex is natively designed to handle heavily nested data structures.

Nested Data

Export to Postgres

    patients_index.export(
        "patients",
        cocoindex.storages.Postgres(),
        primary_key_fields=["filename"],
    )

We export the collected index to Postgres. This will create and maintain a table patients keyed by filename, automatically deleting or updating rows if inputs change. Because CocoIndex tracks data lineage, it handles updates and deletions of source files incrementally.

Running the Pipeline

Generate the BAML client code (required, in case you didn’t do it earlier):

baml-cli generate

This generates the baml_client/ directory with Python code to call your BAML functions.

Update the index:

cocoindex update main

CocoInsight

I used CocoInsight (in free beta) to troubleshoot the index generation and understand the data lineage of the pipeline. It connects to your local CocoIndex server, with zero pipeline data retention.

cocoindex server -ci main

Composable by Default: Use the Best Components for Your Use Case

While CocoIndex provides a rich set of building blocks for building LLM pipelines, it is fundamentally designed as an open system. Developers can bring in their preferred transformation components tailored to their domain — from document parsers to structured extractors like BAML.

This flexibility enables deep composability with other open ecosystems. The synergy between CocoIndex and BAML highlights this philosophy: BAML brings powerful prompt-driven schema extraction, while CocoIndex orchestrates and maintains the flow at scale. There’s no lock-in — developers and enterprises experimenting at the frontier can adapt, extend, and integrate freely.

Summary

By combining BAML and CocoIndex, we get a robust, schema-driven workflow: BAML ensures the prompt-to-schema mapping is correct and type-safe, while CocoIndex handles data ingestion, transformation, and incremental storage. This example extracted patient intake information (names, insurance, medications, etc.) from PDFs, but the pattern applies to any structured data extraction task.

Index PDF elements - text, images with mixed embedding models and metadata

· 7 min read
Linghua Jin
CocoIndex Maintainer

Index PDF elements - text, images with mixed encoders and citations with metadata

PDFs are rich with both text and visual content — from descriptive paragraphs to illustrations and tables. This example builds an end-to-end flow that parses, embeds, and indexes both, with full traceability to the original page.

In this example, we split out both text and images, link them back to page metadata, and enable unified semantic search. We’ll use CocoIndex to define the flow, SentenceTransformers for text embeddings, and CLIP for image embeddings — all stored in Qdrant for retrieval.

Automated invoice processing with AI, Snowflake and CocoIndex - with incremental processing

· 17 min read
Dhilip Subramanian
Data & AI Practitioner, CocoIndex Community Contributor

cover

I recently worked with a clothing manufacturer who wanted to simplify their invoice process. Every day, they receive around 20–22 supplier invoices in PDF format. All these invoices are stored in Azure Blob Storage. The finance team used to open each PDF manually and copy the details into their system. This took a lot of time and effort. On top of that, they already had a backlog of 8,000 old invoices waiting to be processed.

At first, I built a flow using n8n. This solution read the invoices from Azure Blob Storage, used Mistral AI to pull out the fields from each PDF, and then loaded the results into Snowflake. The setup worked fine for a while. But as the number of invoices grew, the workflow started to break. Debugging errors inside a no-code tool like n8n became harder and harder. That’s when I decided to switch to a coding solution.

I came across CocoIndex, an open-source ETL framework designed to transform data for AI, with support for real-time incremental processing. It allowed me to build a pipeline that was both reliable and scalable for this use case.

Fast iterate your indexing strategy - trace back from query to data

· 4 min read
Linghua Jin
CocoIndex Maintainer

cover

We are launching a major feature in both CocoIndex and CocoInsight to help users fast iterate with the indexing strategy, and trace back all the way to the data — to make the transformation experience more seamlessly integrated with the end goal.

We deeply care about making the overall experience seamless. With the new launch, you can define query handlers, so that you can easily run queries in tools like CocoInsight.

Incrementally Transform Structured + Unstructured Data from Postgres with AI

· 7 min read
Linghua Jin
CocoIndex Maintainer

PostgreSQL Product Indexing Flow

CocoIndex is one framework for building incremental data flows across structured and unstructured sources.

In CocoIndex, AI steps, like generating embeddings, are just transforms in the same flow as your other types of transformations (data mappings, calculations, etc.).

Why One Framework for Structured + Unstructured?

  • One mental model: Treat files, APIs, and databases uniformly; AI steps are ordinary ops.
  • Incremental by default: Use an ordinal column to sync only changes; no fragile glue jobs.
  • Consistency: Embeddings are always derived from the exact transformed row state.
  • Operational simplicity: One deployment, one lineage view, fewer moving parts.

This blog introduces the new PostgreSQL source and shows how to take data from PostgreSQL table as source, transform with both AI models and non-AI calculations, and write them into a new PostgreSQL table for semantic + structured search.

Build a Visual Document Index from multiple formats all at once - PDFs, Images, Slides - with ColPali

· 5 min read
Linghua Jin
CocoIndex Maintainer

Colpali

Do you have a messy collection of scanned documents, PDFs, academic papers, presentation slides, and standalone images — all mixed together with charts, tables, and figures — that you want to process into the same vector space for semantic search or to power an AI agent?

In this example, we’ll walk through how to build a visual document indexing pipeline using ColPali for embedding both PDFs and images — and then query the index using natural language.
We’ll skip OCR entirely — ColPali can directly understand document layouts, tables, and figures from images, making it perfect for semantic search across visual-heavy content.

Index Images with ColPali: Multi-Modal Context Engineering

· 7 min read
Linghua Jin
CocoIndex Maintainer

Colpali

We’re excited to announce that CocoIndex now supports native integration with ColPali — enabling multi-vector, patch-level image indexing using cutting-edge multimodal models.

With just a few lines of code, you can now embed and index images with ColPali’s late-interaction architecture, fully integrated into CocoIndex’s composable flow system.

Bring your own building blocks: Export anywhere with Custom Targets

· 8 min read
Linghua Jin
CocoIndex Maintainer

Custom Targets

We’re excited to announce that CocoIndex now officially supports custom targets — giving you the power to export data to any destination, whether it's a local file, cloud storage, a REST API, or your own bespoke system.

This new capability unlocks a whole new level of flexibility for integrating CocoIndex into your pipelines and allows you to bring your own "building blocks" into our flow model.

Indexing Faces for Scalable Visual Search - Build your own Google Photo Search

· 5 min read
Linghua Jin
CocoIndex Maintainer

Face Detection

CocoIndex supports multi-modal processing natively: it can process both text and images with the same programming model and observe them in the same user flow (in CocoInsight).

In this blog, we’ll walk through a comprehensive example of building a scalable face recognition pipeline using CocoIndex. We’ll show how to extract and embed faces from images, structure the data relationally, and export everything into a vector database for real-time querying.

CocoInsight can now visualize identified sections of an image based on the bounding boxes and makes it easier to understand and evaluate AI extractions - seamlessly attaching computed features in the context of unstructured visual data.

Build Real-Time Product Recommendation Engine with LLM and Graph Database

· 8 min read
Linghua Jin
CocoIndex Maintainer

Product Graph

In this blog, we will build a real-time product recommendation engine with an LLM and a graph database. In particular, we will use the LLM to understand the category (taxonomy) of a product and to enumerate complementary products that users are likely to buy together with the current product (e.g., pencil and notebook). We will use the graph to explore relationships between products that can be further used for product recommendations or labeling.

Build text embeddings from Google Drive for RAG

· 9 min read

Text Embedding from Google Drive

In this blog, we will show you how to use CocoIndex to build text embeddings from Google Drive for RAG step by step including how to setup Google Cloud Service Account for Google Drive. CocoIndex is an open source framework to build fresh indexes from your data for AI. It is designed to be easy to use and extend.