LanceDB
Exports data to a LanceDB table.
Data Mapping
Here's how CocoIndex data elements map to LanceDB elements during export:
CocoIndex Element | LanceDB Element |
---|---|
an export target | a unique table |
a collected row | a row |
a field | a column |
This target is provided via the optional dependency [lancedb]:
pip install "cocoindex[lancedb]"
To use it, import the submodule cocoindex.targets.lancedb:
import cocoindex.targets.lancedb as coco_lancedb
Spec
The spec coco_lancedb.LanceDB takes the following fields:
- db_uri (str, required): The LanceDB database location (e.g. ./lancedb_data).
- table_name (str, required): The name of the table to export the data to.
- db_options (coco_lancedb.DatabaseOptions, optional): Advanced database options.
  - storage_options (dict[str, Any], optional): Passed through to LanceDB when connecting.
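As a rough illustration of how the spec might be passed to an export call, here is a minimal sketch. It assumes a CocoIndex collector object with an export(name, spec, primary_key_fields=...) method; the helper name export_to_lancedb, the target name "text_embedding", and the primary key field "id" are hypothetical and chosen only for this example. The import is deferred into the function so the optional dependency is needed only when the function is called.

```python
from typing import Any


def export_to_lancedb(
    collector: Any,
    db_uri: str = "./lancedb_data",
    table_name: str = "TextEmbedding",
    primary_key: str = "id",  # hypothetical field name; use your own key field
) -> None:
    """Export a collector's rows to a LanceDB table (illustrative sketch)."""
    # Deferred import: requires `pip install "cocoindex[lancedb]"` at call time.
    import cocoindex.targets.lancedb as coco_lancedb

    collector.export(
        "text_embedding",  # hypothetical export target name
        coco_lancedb.LanceDB(db_uri=db_uri, table_name=table_name),
        # LanceDB targets take exactly one primary key field.
        primary_key_fields=[primary_key],
    )
```

Inside a flow definition, this would be called with the collector whose rows should land in the table.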
Additional notes:
- Exactly one primary key field is required for LanceDB targets. We create a B-Tree index on this key column.
- LanceDB cannot build a vector index on an empty table (see LanceDB issue #4034). If you want to use vector indexes, run the flow once to populate the target table with data, then create the vector indexes.
You can find an end-to-end example here: examples/text_embedding_lancedb.
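The two-step approach for vector indexes could look roughly like the sketch below, run after the flow has populated the table. It uses LanceDB's own async API (lancedb.connect_async, open_table, count_rows, create_index with an IvfPq config). The column name "embedding", the cosine distance setting, and the helper name create_vector_index are assumptions for illustration, so check them against your schema and your installed LanceDB version.

```python
async def create_vector_index(
    db_uri: str = "./lancedb_data",
    table_name: str = "TextEmbedding",
    vector_column: str = "embedding",  # hypothetical column name
) -> None:
    """Create a vector index on an already-populated table (illustrative sketch)."""
    # Deferred imports: only needed when the function is actually called.
    import lancedb
    from lancedb.index import IvfPq

    db = await lancedb.connect_async(db_uri)
    table = await db.open_table(table_name)

    # LanceDB cannot build a vector index on an empty table,
    # so fail early with a clear message instead.
    if await table.count_rows() == 0:
        raise RuntimeError("Table is empty; run the flow once before indexing.")

    # Assumed settings: IVF-PQ index with cosine distance.
    await table.create_index(vector_column, config=IvfPq(distance_type="cosine"))
```

Call it once after the first successful run, for example with asyncio.run(create_vector_index()).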
connect_async() helper
We provide a helper to obtain a shared AsyncConnection that is reused across your process and shared with CocoIndex's writer for strong read-after-write consistency:
from cocoindex.targets import lancedb as coco_lancedb
db = await coco_lancedb.connect_async("./lancedb_data")
table = await db.open_table("TextEmbedding")
Signature:
def connect_async(
    db_uri: str,
    *,
    db_options: coco_lancedb.DatabaseOptions | None = None,
    read_consistency_interval: datetime.timedelta | None = None
) -> lancedb.AsyncConnection
When db_uri matches that of an earlier call, the helper reuses the same connection instance instead of establishing a new one.
This gives strong consistency between your indexing and querying logic when they run in the same process.
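To make the reuse behavior concrete, here is a small self-contained sketch of a per-URI connection cache. It is not CocoIndex's actual implementation; the ConnectionCache class and the fake_connect stand-in are hypothetical and only illustrate the idea that repeated calls with the same db_uri return one shared instance.

```python
from __future__ import annotations

import asyncio
from typing import Awaitable, Callable


class ConnectionCache:
    """Illustrative per-URI cache; not CocoIndex's real code."""

    def __init__(self, connect: Callable[[str], Awaitable[object]]) -> None:
        self._connect = connect
        self._connections: dict[str, object] = {}
        self._lock = asyncio.Lock()

    async def get(self, db_uri: str) -> object:
        # The lock prevents two concurrent callers from opening
        # separate connections for the same URI.
        async with self._lock:
            if db_uri not in self._connections:
                self._connections[db_uri] = await self._connect(db_uri)
            return self._connections[db_uri]


async def demo() -> tuple[bool, int]:
    opened: list[str] = []

    async def fake_connect(db_uri: str) -> object:
        # Stand-in for a real connection call; records each open.
        opened.append(db_uri)
        return object()

    cache = ConnectionCache(fake_connect)
    first = await cache.get("./lancedb_data")
    second = await cache.get("./lancedb_data")
    # Same instance both times, and only one connection was opened.
    return first is second, len(opened)
```

Running asyncio.run(demo()) shows that both lookups yield the same object while the underlying connect function is invoked only once, which is why a reader and a writer sharing the cache see the same connection state.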