11 docs tagged with "vector-index"

Academic Papers Indexing

Build a real-time academic papers index. Extract metadata, chunk and embed abstracts, and enable semantic and author-based search over academic PDFs.

Bring your own parser as building block with Google Document AI

Use Google Document AI to parse documents, embed the resulting text, and store it in a vector database for semantic search.

Build image search and query with natural language with vision model CLIP

Index images with CocoIndex and the CLIP vision model for efficient image search and natural-language querying.

Image Search App with ColPali and FastAPI

Build an image search index with ColPali and FastAPI.

Index PDF Elements - Unified Text & Image Embedding with Metadata

Extract, embed, and index both text and images from PDFs for advanced multimodal search. Leverage SentenceTransformers and CLIP for unified vector search, complete with metadata linkage, thumbnails, and full traceability.

Index PDFs, Images, Slides without OCR

Build a visual document indexing pipeline using ColPali to index scanned documents, PDFs, academic papers, presentation slides, and standalone images (all mixed together with charts, tables, and figures) into the same vector space.

Photo Search with Face Detection

Extract and embed faces from images, structure the data for visual search, and export it to a vector database for face similarity queries.

Real-time Codebase Indexing

Build a real-time codebase index for retrieval-augmented generation (RAG) using CocoIndex and Tree-sitter. Chunk, embed, and search code with semantic understanding.

Real-time data transformation pipeline with Amazon S3 bucket, SQS and CocoIndex

Build a real-time data transformation pipeline with S3 and CocoIndex.

Simple Vector Index with Text Embedding

Index text with CocoIndex and text embeddings, and query it with natural language.

Transform Data From Structured Source in PostgreSQL

Use a PostgreSQL table as the source, transform the data with both AI models and non-AI data mappings, and write the results into PostgreSQL/PgVector for semantic and structured search.
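
Most of the docs listed here share one underlying pattern: embed content as vectors, store them, and rank stored items by similarity to an embedded query. The sketch below illustrates that pattern in plain Python. It is not CocoIndex code and uses no CocoIndex APIs; the `embed`, `build_index`, and `search` helpers are hypothetical names, and the bag-of-words `embed` function is only a toy stand-in for a real embedding model such as SentenceTransformers or CLIP.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector matches nothing
    return dot / (norm_a * norm_b)


def build_index(docs: dict[str, str]) -> dict[str, Counter]:
    """Embed every document once and keep the vectors keyed by document id."""
    return {doc_id: embed(text) for doc_id, text in docs.items()}


def search(index: dict[str, Counter], query: str, top_k: int = 2) -> list[tuple[str, float]]:
    """Embed the query and return the top_k (doc_id, score) pairs, best first."""
    query_vec = embed(query)
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    index = build_index({
        "images": "index images with clip",
        "code": "chunk and embed code",
        "postgres": "postgres table rows",
    })
    print(search(index, "embed code chunks", top_k=1))
```

A real pipeline would swap the toy `embed` for a learned model and the in-memory dictionary for a vector database, but the query path (embed, score, sort, truncate) keeps the same shape.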