Tag: Embeddings.

Generating, storing, and serving text and multimodal embeddings.


Jun 10, 2026

Index Your Codebase for AI Agents with CocoIndex V1

Index a codebase for RAG and AI coding agents with CocoIndex V1 and Tree-sitter: language-aware chunking, embedding, and a live vector index in async Python.

Examples · RAG · Embeddings · AI Agents · Incremental Processing

Feb 9, 2026

Building SEC EDGAR Financial Analytics with CocoIndex and Apache Doris

A multi-source pipeline that ingests SEC filings (TXT, JSON, PDF), scrubs PII, extracts topics, and powers hybrid search with CocoIndex + Apache Doris.

Examples · Structured Extraction · Vector Search · Embeddings · Connectors

Jan 22, 2026

Slides-to-speech: turn presentations into narrated content with CocoIndex

Turn slide decks into a continuously updated multimodal dataset with CocoIndex: extract speaker notes with Gemini Vision, synthesize narration with Piper TTS, and keep LanceDB in sync.

Examples · Multimodal · Embeddings · Structured Extraction · Vector Search

Nov 10, 2025

Adaptive Batching - 5x throughput on your data pipelines

CocoIndex now batches GPU and ML workloads automatically: 5x throughput on text embeddings and AI ops, with zero configuration required.

Feature · Performance · Embeddings · Best Practices

Oct 27, 2025

Index PDF elements: text, images with mixed embedding models and metadata

Extract, embed, and store multimodal PDF elements (text with SentenceTransformers, images with CLIP) for unified semantic search with traceable metadata.

Examples · Feature · Multimodal · Embeddings · Vector Search

Sep 1, 2025

Incrementally Transform Structured + Unstructured Data from Postgres with AI

Build unified, incrementally updated semantic + structured search over PostgreSQL data with CocoIndex: read a table, transform with AI and non-AI ops, and write pgvector embeddings back to Postgres.

Examples · Postgres · Incremental Processing · Embeddings · Vector Search

Aug 20, 2025

Index PDFs, images, and slides together with ColPali: no OCR required

Build a unified visual document index from multiple file formats (including PDFs, images, and slides) using CocoIndex and ColPali. No OCR needed.

Examples · Multimodal · Embeddings · Vector Search · RAG

Aug 12, 2025

Index Images with ColPali: Multi-Modal Context Engineering

CocoIndex now natively integrates ColPali for multi-vector, patch-level image indexing: multi-modal context engineering for visually rich documents and PDFs.

Examples · Feature · Multimodal · Embeddings · Vector Search

Aug 10, 2025

Multi-Dimensional Vector Support in CocoIndex

CocoIndex natively handles typed multi-dimensional vectors, from simple arrays to multi-vector embeddings, unlocking multimodal AI pipelines at scale.

Feature · Embeddings · Vector Search · Multimodal

Jul 24, 2025

Indexing faces for visual search: build your own Google Photo Search

Build a scalable face detection and recognition pipeline with CocoIndex: embed faces, structure for search, and export to a vector DB.

Examples · Tutorial · Multimodal · Embeddings · Vector Search

Jul 9, 2025

Index academic papers and extract metadata for AI agents

How to index academic research papers for AI agents and AI workflows by extracting metadata (e.g., title, authors, abstract) with LLMs and CocoIndex.

Examples · Structured Extraction · Embeddings · RAG · Tutorial

Jul 7, 2025

CocoIndex Changelog 2025-07-07

CocoIndex updates: in-process setup/drop API, the EmbedText building block, major SplitRecursively codebase-indexing improvements, union and NumPy type support, more LLM APIs, and the Kuzu graph target.

Changelog · Embeddings · LLM · Knowledge Graph · Incremental Processing

May 31, 2025

CocoIndex Changelog 2025-05-31

CocoIndex updates: Amazon S3 as a data source, improved query handling, a standalone runtime mode, and more connector and performance improvements.

Changelog · Connectors · Incremental Processing · Embeddings · Vector Search

May 20, 2025

Build image search and query with natural language with vision model CLIP

Index images in real time with CocoIndex and a vision model: generate multimodal embeddings and build a vector index for efficient retrieval.

Examples · Multimodal · Embeddings · Vector Search · Tutorial

May 19, 2025

How to build an index with text embeddings

Build a semantic text index with CocoIndex and text embeddings, then query it with natural language: a beginner's guide to embeddings and vector search.

Examples · Embeddings · Vector Search · RAG · Tutorial

Mar 23, 2025

Build text embeddings from Google Drive for RAG

Step-by-step tutorial to build text embeddings from Google Drive docs with CocoIndex, including service-account setup, and store them in Postgres for semantic search and RAG.

Examples · Embeddings · RAG · Vector Search · Connectors

Mar 18, 2025

Build Real-Time Codebase Indexing for AI Code Generation

Index a codebase for RAG in real time with CocoIndex and Tree-sitter: chunking, embedding, semantic search, and a vector index for efficient retrieval.

Examples · RAG · Embeddings · Vector Search · Tutorial

Feb 20, 2025

Customizable Data Indexing Pipelines

What customizable data indexing pipelines are, and why custom transformation logic matters, explained through clear comparisons and practical CocoIndex examples.

Data Indexing · Insight · RAG · Embeddings · Vector Search

Jan 4, 2025

CocoIndex - A Data Indexing Platform for AI Applications

CocoIndex is a data indexing platform for AI applications, handling ingestion, chunking, embedding, and pipeline management for RAG, semantic search, and knowledge graphs with built-in lineage and observability.

Data Indexing · RAG · Embeddings · Vector Search · Knowledge Graph
