LanceDB target

Write rows to a LanceDB table with an automatic B-Tree primary-key index, optional FTS indexes, and periodic optimize() calls tuned by a transaction threshold.

Language: Python 3.11+
Requires: LanceDB
Version: v0.3.37
Last reviewed: Apr 19, 2026

Exports data to a LanceDB table.

Data Mapping

Here’s how CocoIndex data elements map to LanceDB elements during export:

| CocoIndex Element | LanceDB Element |
|-------------------|-----------------|
| an export target  | a unique table  |
| a collected row   | a row           |
| a field           | a column        |

::::info Installation and import

This target is provided via an optional dependency [lancedb]:

pip install "cocoindex[lancedb]"

To use it, you need to import the submodule cocoindex.targets.lancedb:

import cocoindex.targets.lancedb as coco_lancedb

::::

Spec

The spec coco_lancedb.LanceDB takes the following fields:

  • db_uri (str, required): The LanceDB database location (e.g. ./lancedb_data).
  • table_name (str, required): The name of the table to export the data to.
  • db_options (coco_lancedb.DatabaseOptions, optional): Advanced database options.
    • storage_options (dict[str, Any], optional): Passed through to LanceDB when connecting.
  • num_transactions_before_optimize (int, optional, default: 50): The number of transactions before calling optimize() for the LanceDB table.
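To make the effect of num_transactions_before_optimize concrete, here is a minimal sketch of a transaction-threshold counter. It is an illustration only, not CocoIndex's implementation: the class name OptimizeScheduler and its method are hypothetical, and the optimize callback stands in for the table's optimize() call.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class OptimizeScheduler:
    """Hypothetical helper: runs `optimize` once every `threshold` transactions."""

    optimize: Callable[[], None]
    threshold: int = 50  # mirrors the documented default
    _pending: int = field(default=0, init=False)

    def record_transaction(self) -> bool:
        """Count one committed transaction; return True if optimize() ran."""
        self._pending += 1
        if self._pending < self.threshold:
            return False
        self.optimize()
        self._pending = 0  # start counting toward the next optimize()
        return True


if __name__ == "__main__":
    runs = []
    scheduler = OptimizeScheduler(optimize=lambda: runs.append("optimize"), threshold=3)
    for _ in range(7):
        scheduler.record_transaction()
    print(len(runs))  # 7 transactions with threshold 3 -> 2 optimize calls
```

A lower threshold keeps the table compacted more often at the cost of more optimize() work; a higher one batches that work across more writes.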

Additional notes:

  • Exactly one primary key field is required for LanceDB targets. CocoIndex creates a B-Tree index on this key column.
  • Full-Text Search (FTS) indexes are supported via the fts_indexes parameter. Note that FTS functionality requires LanceDB Enterprise. You can pass any parameters supported by the target’s FTS index creation API (e.g., tokenizer_name for LanceDB). See LanceDB FTS documentation for full parameter details.
Info

LanceDB cannot build a vector index on an empty table (see LanceDB issue #4034). If you want to use vector indexes, run the flow once to populate the target table with data, and then create the vector indexes.
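If you create the vector index yourself through LanceDB's Python API, a guard like the following sketch avoids the empty-table case. The function name create_vector_index_if_populated is hypothetical, and the table argument is assumed to behave like a synchronous LanceDB table that offers count_rows() and create_index(...). The check covers only the empty-table case; some index types may need more rows than that to train.

```python
from typing import Any


def create_vector_index_if_populated(table: Any, column: str, metric: str = "cosine") -> bool:
    """Create a vector index only when the table already holds rows.

    `table` is assumed to be a synchronous LanceDB table (or anything that
    offers `count_rows()` and `create_index(...)` with these keyword names).
    Returns True if index creation was requested, False if it was skipped.
    """
    if table.count_rows() == 0:
        # Empty table: skip, because index creation is expected to fail here.
        return False
    table.create_index(metric=metric, vector_column_name=column)
    return True
```

A typical use would be to call this after the first successful run of the flow, passing the table opened from the same db_uri and the name of the embedding column.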

You can find an end-to-end example here: examples/text_embedding_lancedb.

FTS Index Example

import cocoindex
import cocoindex.targets.lancedb as coco_lancedb

@cocoindex.flow_def(name="DocumentSearchFlow")
def document_search_flow(flow_builder: cocoindex.FlowBuilder, data_scope: cocoindex.DataScope):
    # ... source and transformations ...

    doc_collector = data_scope.add_collector()
    # ... collect document data ...

    doc_collector.export(
        "documents",
        coco_lancedb.LanceDB(
            db_uri="./lancedb_data",
            table_name="documents"
        ),
        primary_key_fields=["id"],
        # Add FTS indexes for full-text search
        fts_indexes=[
            # Basic FTS index with default tokenizer
            cocoindex.FtsIndexDef("content"),
            # FTS index with stemming for better search recall
            cocoindex.FtsIndexDef("description", parameters={"tokenizer_name": "en_stem"}),
            # FTS index with position tracking for phrase searches
            cocoindex.FtsIndexDef("title", parameters={"tokenizer_name": "default", "with_position": True})
        ]
    )

connect_async() helper

We provide a helper to obtain a shared AsyncConnection that is reused across your process and shared with CocoIndex’s writer for strong read-after-write consistency:

from cocoindex.targets import lancedb as coco_lancedb

db = await coco_lancedb.connect_async("./lancedb_data")
table = await db.open_table("TextEmbedding")

Signature:

async def connect_async(
  db_uri: str,
  *,
  db_options: coco_lancedb.DatabaseOptions | None = None,
  read_consistency_interval: datetime.timedelta | None = None
) -> lancedb.AsyncConnection

When db_uri matches that of an earlier call, the helper reuses the same connection instance instead of establishing a new one. This gives strong consistency between your indexing and querying logic when they run in the same process.
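The reuse behavior can be pictured as a per-URI cache. The sketch below is an illustration under assumptions, not the actual implementation of connect_async: the class SharedConnections is hypothetical, it matches URIs by exact string comparison, and the connect callable stands in for whatever opens the real connection.

```python
import asyncio
from typing import Any, Awaitable, Callable


class SharedConnections:
    """Hypothetical per-URI connection cache (illustration only)."""

    def __init__(self, connect: Callable[[str], Awaitable[Any]]) -> None:
        self._connect = connect
        # One task per URI, so concurrent callers await the same connection attempt.
        self._tasks: dict = {}

    async def get(self, db_uri: str) -> Any:
        task = self._tasks.get(db_uri)
        if task is None:
            # First request for this URI: start connecting and remember the task.
            task = asyncio.ensure_future(self._connect(db_uri))
            self._tasks[db_uri] = task
        # Later requests reuse the stored task and receive the same object.
        return await task
```

Because every caller for a given URI receives the same connection object, a reader in the same process works against the same connection as the writer, which is the property the helper relies on for read-after-write consistency.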

Example
