CocoIndex overview

CocoIndex is a real-time data transformation framework for AI with incremental processing: only the data that actually changed is reprocessed, lineage is observable end to end, and the schema is known up front.

Version: v0.3.37
Last reviewed: Jan 25, 2026

As a data framework, CocoIndex is built for high performance and data freshness: when a source changes, derived data is brought up to date by reprocessing only what changed instead of rebuilding everything. Incremental processing is one of its core capabilities.

Incremental Processing
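
To make the idea concrete, here is a minimal sketch of incremental processing in plain Python. It does not use the CocoIndex API and is not a description of how CocoIndex is implemented; the `IncrementalCache` class and `fingerprint` helper are hypothetical names introduced only for illustration. The sketch assumes change detection by content hash: a row is reprocessed only when its hash differs from the one recorded on the previous run.

```python
import hashlib
from typing import Callable, Dict


def fingerprint(text: str) -> str:
    """Stable content hash used to detect whether a source row changed."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class IncrementalCache:
    """Hypothetical helper: remembers, per key, the last input hash and output."""

    def __init__(self, transform: Callable[[str], str]) -> None:
        self.transform = transform
        self.seen: Dict[str, str] = {}     # key -> hash of last processed input
        self.outputs: Dict[str, str] = {}  # key -> derived value
        self.calls = 0                     # how many times transform actually ran

    def update(self, source: Dict[str, str]) -> None:
        # Drop derived values whose source rows no longer exist.
        for key in list(self.outputs):
            if key not in source:
                del self.outputs[key]
                del self.seen[key]
        # Recompute only rows that are new or whose content changed.
        for key, text in source.items():
            fp = fingerprint(text)
            if self.seen.get(key) != fp:
                self.outputs[key] = self.transform(text)
                self.seen[key] = fp
                self.calls += 1


cache = IncrementalCache(str.upper)
cache.update({"a.md": "hello", "b.md": "world"})   # both rows are processed
cache.update({"a.md": "hello", "b.md": "world!"})  # only b.md is reprocessed
print(cache.calls)    # 3 transform calls in total, not 4
print(cache.outputs)
```

The second `update` call skips `a.md` because its content hash is unchanged, which is the property the overview describes: work is proportional to what changed, not to the size of the source.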

Programming Model

CocoIndex follows the Dataflow programming model. Each transformation creates a new field based solely on its input fields, with no hidden state and no value mutation. All data before and after each transformation is observable, and lineage is available out of the box.

The gist of an example data transformation:

# import data from a source
data['content'] = flow_builder.add_source(...)

# transform: each step derives a new value from its input
data['out'] = (
    data['content']
    .transform(...)
    .transform(...)
)

# collect data
collector.collect(...)

# export to db, vector db, graph db ...
collector.export(...)
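
The gist above elides the actual arguments, so the following is a separate, self-contained sketch of the dataflow idea in plain Python rather than CocoIndex code. The names `derive`, `split_words`, and `count_items` are hypothetical. It assumes each field is write-once and is computed only from named input fields, and it records which inputs produced which output so that lineage can be inspected afterwards.

```python
from typing import Any, Callable, Dict, List, Tuple

Row = Dict[str, Any]
# One lineage record: (output field, input fields, transformation name)
Lineage = List[Tuple[str, Tuple[str, ...], str]]


def derive(row: Row, lineage: Lineage, out: str, inputs: Tuple[str, ...],
           fn: Callable[..., Any]) -> Row:
    """Return a new row with `out` computed from `inputs`; `row` is not mutated."""
    if out in row:
        raise ValueError(f"field {out!r} already exists; fields are write-once")
    lineage.append((out, inputs, fn.__name__))
    return {**row, out: fn(*(row[name] for name in inputs))}


def split_words(content: str) -> List[str]:
    return content.split()


def count_items(words: List[str]) -> int:
    return len(words)


lineage: Lineage = []
row0: Row = {"content": "incremental processing keeps data fresh"}
row1 = derive(row0, lineage, "words", ("content",), split_words)
row2 = derive(row1, lineage, "word_count", ("words",), count_items)

print(row2["word_count"])  # 5
print(lineage)             # which field came from which inputs, in order
```

Because every step returns a new row and never overwrites an existing field, the intermediate values (`row0`, `row1`) stay available for inspection, and the `lineage` list answers the question "where did this field come from?" without any extra instrumentation.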