Qdrant
The qdrant connector provides utilities for writing points to Qdrant vector databases, with support for both single and named vectors, as well as multi-vector configurations.
from cocoindex.connectors import qdrant
This connector requires additional dependencies. Install with:
pip install cocoindex[qdrant]
Connection setup
create_client() creates a Qdrant client connection with optional gRPC support.
def create_client(
    url: str,
    *,
    prefer_grpc: bool = True,
    **kwargs: Any,
) -> QdrantClient
Parameters:
- url: Qdrant server URL (e.g., "http://localhost:6333").
- prefer_grpc: Whether to prefer gRPC over HTTP (default: True).
- **kwargs: Additional arguments passed directly to QdrantClient.
Returns: A Qdrant client instance.
Example:
client = qdrant.create_client("http://localhost:6333")
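Because extra keyword arguments are forwarded to QdrantClient, connection settings can be assembled in one place before the client is created. The following is a minimal sketch, not part of the connector: the helper name and the environment variable names (QDRANT_URL, QDRANT_PREFER_GRPC, QDRANT_API_KEY) are assumptions made for illustration, while api_key is a regular QdrantClient argument.

```python
import os
from typing import Any, Mapping, Optional


def qdrant_client_kwargs(env: Optional[Mapping[str, str]] = None) -> dict[str, Any]:
    """Build keyword arguments for qdrant.create_client() from environment variables.

    The variable names used here are hypothetical and chosen for this sketch.
    """
    env = os.environ if env is None else env
    kwargs: dict[str, Any] = {
        "url": env.get("QDRANT_URL", "http://localhost:6333"),
        # Anything other than an explicit "0"/"false"/"no" keeps gRPC preferred.
        "prefer_grpc": env.get("QDRANT_PREFER_GRPC", "true").strip().lower()
        not in {"0", "false", "no"},
    }
    api_key = env.get("QDRANT_API_KEY")
    if api_key:
        # Not a create_client parameter; it reaches QdrantClient through **kwargs.
        kwargs["api_key"] = api_key
    return kwargs


# Usage (requires cocoindex[qdrant]):
#   client = qdrant.create_client(**qdrant_client_kwargs())
```

Keeping the settings in a plain dict makes them easy to inspect or log (minus the API key) before a connection is opened.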
As target
The qdrant connector provides target state APIs for writing points to collections. CocoIndex tracks what points should exist and automatically handles upserts and deletions.
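To make the target-state idea concrete, the sketch below compares the points recorded by a previous run with the points declared in the current run and derives the upserts and deletions. It is a conceptual illustration only: plan_sync is a hypothetical helper, and CocoIndex's actual tracking mechanism is not described here.

```python
from typing import Any


def plan_sync(
    previous: dict[str, Any],
    declared: dict[str, Any],
) -> tuple[dict[str, Any], list[str]]:
    """Return (points to upsert, point IDs to delete)."""
    # Upsert points that are new or whose content changed.
    upserts = {
        pid: point
        for pid, point in declared.items()
        if pid not in previous or previous[pid] != point
    }
    # Delete points that existed before but are no longer declared.
    deletes = sorted(pid for pid in previous if pid not in declared)
    return upserts, deletes


previous = {"a": {"text": "old"}, "b": {"text": "same"}, "c": {"text": "gone"}}
declared = {"a": {"text": "new"}, "b": {"text": "same"}, "d": {"text": "added"}}
upserts, deletes = plan_sync(previous, declared)
print(sorted(upserts))  # IDs that would be written
print(deletes)          # IDs that would be removed
```

Unchanged points ("b" in this example) appear in neither result, which is why declaring the full desired state on every run does not imply rewriting every point.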
Declaring target states
Setting up a connection
Create a ContextKey[QdrantClient] (with tracked=False) to identify your Qdrant client, then provide it in your lifespan:
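from typing import AsyncIterator
from qdrant_client import QdrantClient
import cocoindex as coco
from cocoindex.connectors import qdrant

QDRANT_URL = "http://localhost:6333"
QDRANT_DB = coco.ContextKey[QdrantClient]("my_vectors", tracked=False)

@coco.lifespan
async def coco_lifespan(builder: coco.EnvironmentBuilder) -> AsyncIterator[None]:
    # Create the client once and register it under the context key
    client = qdrant.create_client(QDRANT_URL)
    builder.provide(QDRANT_DB, client)
    yield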
Collections (parent state)
Declares a collection as a target state. Returns a CollectionTarget for declaring points.
def declare_collection_target(
    db: ContextKey[QdrantClient],
    collection_name: str,
    schema: CollectionSchema,
    *,
    managed_by: Literal["system", "user"] = "system",
) -> CollectionTarget[coco.PendingS]
Parameters:
- db: A ContextKey[QdrantClient] identifying the Qdrant client to use.
- collection_name: Name of the collection.
- schema: Schema definition specifying vector configurations (see Collection Schema).
- managed_by: Whether CocoIndex manages the collection lifecycle ("system") or assumes the collection already exists ("user").
Returns: A pending CollectionTarget. Use the convenience wrapper await qdrant.mount_collection_target(QDRANT_DB, collection_name, schema) to resolve.
Points (child states)
Once a CollectionTarget is resolved, declare points to be upserted using qdrant.PointStruct, which is an alias of qdrant_client.http.models.PointStruct:
def CollectionTarget.declare_point(
    self,
    point: qdrant.PointStruct,
) -> None
Parameters:
- point: A qdrant.PointStruct (alias of qdrant_client.http.models.PointStruct) containing:
  - id: Point ID (str, int, or UUID)
  - vector: Vector data (single vector or dict of named vectors)
  - payload: Optional metadata as a JSON-serializable dict
Collection schema
Define vector configurations for a collection using CollectionSchema. Unlike row-oriented databases, Qdrant uses a point-oriented model where each point has schemaless payload and one or more vectors with predefined dimensions.
class CollectionSchema:
    @classmethod
    async def create(
        cls,
        vectors: QdrantVectorDef | dict[str, QdrantVectorDef],
    ) -> CollectionSchema
Parameters:
- vectors: Either:
  - A single QdrantVectorDef for an unnamed vector
  - A dict mapping vector names to QdrantVectorDef for named vectors
QdrantVectorDef
Specifies vector configuration including dimension, distance metric, and multi-vector settings:
class QdrantVectorDef(NamedTuple):
    schema: VectorSchemaProvider | MultiVectorSchemaProvider
    distance: Literal["cosine", "dot", "euclid"] = "cosine"
    multivector_comparator: Literal["max_sim"] = "max_sim"
Parameters:
- schema: A VectorSchemaProvider or MultiVectorSchemaProvider that defines vector dimensions
- distance: Distance metric for similarity search (default: "cosine")
- multivector_comparator: Comparator for multi-vector fields (only applies to MultiVectorSchemaProvider)
Single (unnamed) vector
For collections with a single unnamed vector:
from cocoindex.ops.sentence_transformers import SentenceTransformerEmbedder
embedder = SentenceTransformerEmbedder("sentence-transformers/all-MiniLM-L6-v2")
schema = await qdrant.CollectionSchema.create(
    vectors=qdrant.QdrantVectorDef(schema=embedder)
)
Points use the vector directly:
point = qdrant.PointStruct(
    id="doc-123",
    vector=embedding.tolist(),  # Single vector
    payload={"text": "...", "metadata": {...}},
)
Named vectors
For collections with multiple named vectors:
from cocoindex.resources.schema import VectorSchema
import numpy as np
schema = await qdrant.CollectionSchema.create(
    vectors={
        "text_embedding": qdrant.QdrantVectorDef(
            schema=VectorSchema(dtype=np.float32, size=384),
            distance="cosine",
        ),
        "image_embedding": qdrant.QdrantVectorDef(
            schema=VectorSchema(dtype=np.float32, size=512),
            distance="dot",
        ),
    }
)
Points use a dict of vectors:
point = qdrant.PointStruct(
    id="doc-123",
    vector={
        "text_embedding": text_vec.tolist(),
        "image_embedding": image_vec.tolist(),
    },
    payload={"text": "...", "metadata": {...}},
)
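Since each named vector has a fixed dimension, it can help to check names and lengths before building a point, so that mismatches surface early. A minimal sketch, assuming the 384/512 sizes from the named-vector schema in this section; check_named_vectors and EXPECTED_SIZES are hypothetical names, not part of the connector.

```python
from typing import Mapping, Sequence

# Expected dimensions, mirroring the named-vector schema shown in this section.
EXPECTED_SIZES = {"text_embedding": 384, "image_embedding": 512}


def check_named_vectors(
    vectors: Mapping[str, Sequence[float]],
    expected_sizes: Mapping[str, int] = EXPECTED_SIZES,
) -> list[str]:
    """Return a list of problems; an empty list means names and sizes match."""
    problems: list[str] = []
    for name, vector in vectors.items():
        if name not in expected_sizes:
            problems.append(f"unknown vector name: {name!r}")
        elif len(vector) != expected_sizes[name]:
            problems.append(
                f"{name!r}: expected {expected_sizes[name]} dimensions, got {len(vector)}"
            )
    return problems


good = {"text_embedding": [0.0] * 384, "image_embedding": [0.0] * 512}
bad = {"text_embedding": [0.0] * 383, "audio": [0.0]}
print(check_named_vectors(good))
print(check_named_vectors(bad))
```

The helper only inspects the vectors that are provided; whether every named vector must be present on every point is left to the caller.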
VectorSchemaProvider
The schema field of QdrantVectorDef accepts a VectorSchemaProvider, a ContextKey, or an explicit VectorSchema to specify the vector dimension and dtype. See Vector Schema for details.
Multi-vector support
For multi-vector configurations (multiple vectors per point stored together):
from cocoindex.resources.schema import MultiVectorSchema, VectorSchema
import numpy as np
schema = await qdrant.CollectionSchema.create(
    vectors=qdrant.QdrantVectorDef(
        schema=MultiVectorSchema(
            vector_schema=VectorSchema(dtype=np.float32, size=384)
        ),
        multivector_comparator="max_sim",
    )
)
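The "max_sim" comparator scores two multi-vectors by taking, for each query vector, its best match among the stored vectors, and summing those best matches. The sketch below shows that computation in plain Python with dot product as the per-pair similarity; it is an illustration of the scoring rule, not the code Qdrant runs, and the function names are made up for this example.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def max_sim(query: Sequence[Vector], document: Sequence[Vector]) -> float:
    """Sum, over query vectors, of the best similarity to any document vector."""
    return sum(max(dot(q, d) for d in document) for q in query)


query = [[1.0, 0.0], [0.0, 1.0]]
document = [[0.5, 0.5], [1.0, 0.0], [0.0, 0.2]]

# First query vector best matches [1.0, 0.0] (similarity 1.0);
# second query vector best matches [0.5, 0.5] (similarity 0.5).
print(max_sim(query, document))  # 1.5
```

This is the late-interaction style of scoring used with token-level embeddings, where a point stores a variable number of vectors of the same dimension.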
Distance metrics
The distance parameter in QdrantVectorDef specifies the similarity metric:
"cosine"— Cosine similarity (default, normalized dot product)"dot"— Dot product similarity"euclid"— Euclidean distance (L2)
Example: single vector
from typing import AsyncIterator
from qdrant_client import QdrantClient
import cocoindex as coco
from cocoindex.connectors import qdrant
from cocoindex.ops.sentence_transformers import SentenceTransformerEmbedder

QDRANT_URL = "http://localhost:6333"
QDRANT_DB = coco.ContextKey[QdrantClient]("main_vectors", tracked=False)
embedder = SentenceTransformerEmbedder("sentence-transformers/all-MiniLM-L6-v2")

@coco.lifespan
async def coco_lifespan(builder: coco.EnvironmentBuilder) -> AsyncIterator[None]:
    client = qdrant.create_client(QDRANT_URL)
    builder.provide(QDRANT_DB, client)
    yield

@coco.fn
async def process_document(
    doc_id: str,
    text: str,
    target: qdrant.CollectionTarget,
) -> None:
    # Embed the text and declare the resulting point on the collection
    embedding = await embedder.embed(text)
    point = qdrant.PointStruct(
        id=doc_id,
        vector=embedding.tolist(),
        payload={"text": text},
    )
    target.declare_point(point)

@coco.fn
async def app_main() -> None:
    # Declare collection target state
    collection = await qdrant.mount_collection_target(
        QDRANT_DB,
        "documents",
        await qdrant.CollectionSchema.create(
            vectors=qdrant.QdrantVectorDef(schema=embedder)
        ),
    )
    # Declare points
    # `documents` is assumed to be an iterable of (doc_id, text) pairs defined elsewhere
    for doc_id, text in documents:
        await coco.mount(
            coco.component_subpath("doc", doc_id),
            process_document,
            doc_id,
            text,
            collection,
        )
Example: named vectors
from cocoindex.resources.schema import VectorSchema
import numpy as np
# `text_embedder` and `documents` are assumed to be defined elsewhere
@coco.fn
async def app_main() -> None:
    collection = await qdrant.mount_collection_target(
        QDRANT_DB,
        "multimodal_docs",
        await qdrant.CollectionSchema.create(
            vectors={
                "text": qdrant.QdrantVectorDef(
                    schema=text_embedder,
                    distance="cosine",
                ),
                "image": qdrant.QdrantVectorDef(
                    schema=VectorSchema(dtype=np.float32, size=512),
                    distance="dot",
                ),
            }
        ),
    )
    # Declare points with named vectors
    for doc in documents:
        point = qdrant.PointStruct(
            id=doc.id,
            vector={
                "text": doc.text_embedding.tolist(),
                "image": doc.image_embedding.tolist(),
            },
            payload={"title": doc.title, "url": doc.url},
        )
        collection.declare_point(point)
Point IDs
Qdrant supports the following point ID types:
- str: String identifiers
- int: Integer identifiers (unsigned 64-bit)
- uuid.UUID: UUID identifiers (converted to string)
All other types are converted to strings automatically.
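Target-state tracking relies on the same source record mapping to the same point ID on every run. Qdrant's own API stores point IDs as unsigned integers or UUIDs, so one option for controlling the mapping explicitly is to derive a deterministic UUID from an application key. The sketch below uses uuid.uuid5 from the standard library; the namespace URL and the helper name are assumptions made for this example, and nothing here is required by the connector.

```python
import uuid

# Hypothetical namespace for this sketch. Any fixed UUID works, as long as it
# never changes between runs.
DOC_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.invalid/docs")


def point_id_for(key: str) -> str:
    """Derive a stable UUID string from an arbitrary application key."""
    return str(uuid.uuid5(DOC_NAMESPACE, key))


first = point_id_for("doc-123")
second = point_id_for("doc-123")
other = point_id_for("doc-124")
print(first == second)  # True: the same key always yields the same ID
print(first == other)   # False: different keys yield different IDs
```

Because the ID is a pure function of the key, re-running a pipeline updates existing points instead of creating duplicates.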
Payloads
Point payloads are schemaless JSON objects. Any JSON-serializable Python data structure can be used:
payload = {
    "text": "Document content",
    "metadata": {
        "author": "Alice",
        "tags": ["machine-learning", "nlp"],
        "published": "2024-01-15",
    },
    "stats": {
        "views": 1500,
        "likes": 42,
    },
}
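Values such as dates or tuples are common in source records but are not directly JSON-serializable or do not round-trip unchanged. A minimal sketch of normalizing a payload before declaring a point is shown below; to_json_safe is a hypothetical helper, and the set of conversions is an assumption that you would adapt to your own data.

```python
import json
from datetime import date, datetime
from typing import Any


def to_json_safe(value: Any) -> Any:
    """Recursively convert common non-JSON values into JSON-friendly ones."""
    if isinstance(value, (datetime, date)):
        return value.isoformat()  # e.g. date(2024, 1, 15) becomes "2024-01-15"
    if isinstance(value, dict):
        return {str(k): to_json_safe(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_json_safe(v) for v in value]
    return value


raw = {
    "text": "Document content",
    "published": date(2024, 1, 15),
    "tags": ("machine-learning", "nlp"),
}
payload = to_json_safe(raw)

# json.dumps raises TypeError if anything is still not serializable.
encoded = json.dumps(payload)
print(encoded)
```

Running json.dumps as a final check catches unsupported values at declaration time rather than when the point is written.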
Vector search
The connector focuses on writing points to Qdrant. For vector search, use the Qdrant client directly:
from qdrant_client.http import models as qdrant_models
# Create a client (or reuse the one created in your lifespan)
client = qdrant.create_client("http://localhost:6333")
# Perform search
# Note: recent qdrant-client releases deprecate search() in favor of query_points()
results = client.search(
    collection_name="documents",
    query_vector=query_embedding.tolist(),
    limit=10,
)
for result in results:
    print(f"Score: {result.score}, ID: {result.id}")
    print(f"Payload: {result.payload}")
For named vectors:
results = client.search(
    collection_name="documents",
    query_vector=("text", query_embedding.tolist()),  # Search using "text" vector
    limit=10,
)