Qdrant

The qdrant connector provides utilities for writing points to Qdrant vector databases, with support for both single and named vectors, as well as multi-vector configurations.

from cocoindex.connectors import qdrant

Dependencies

This connector requires additional dependencies. Install with:

pip install cocoindex[qdrant]

Connection setup

create_client() creates a Qdrant client connection. It prefers gRPC over HTTP by default.

def create_client(
    url: str,
    *,
    prefer_grpc: bool = True,
    **kwargs: Any,
) -> QdrantClient

Parameters:

  • url — Qdrant server URL (e.g., "http://localhost:6333").
  • prefer_grpc — Whether to prefer gRPC over HTTP (default: True).
  • **kwargs — Additional arguments passed directly to QdrantClient.

Returns: A Qdrant client instance.

Example:

client = qdrant.create_client("http://localhost:6333")

As target

The qdrant connector provides target state APIs for writing points to collections. CocoIndex tracks what points should exist and automatically handles upserts and deletions.
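To make "tracks what points should exist" concrete, here is a minimal conceptual sketch of target-state reconciliation in plain Python. It is not CocoIndex's implementation; `plan_sync` and the dict-based state are hypothetical names used only for illustration.

```python
from typing import Any


def plan_sync(
    previous: dict[str, Any],
    desired: dict[str, Any],
) -> tuple[dict[str, Any], set[str]]:
    """Return (points to upsert, IDs to delete) to move `previous` to `desired`."""
    # Upsert anything that is new or whose content changed.
    upserts = {
        point_id: point
        for point_id, point in desired.items()
        if point_id not in previous or previous[point_id] != point
    }
    # Delete anything that was written before but is no longer declared.
    deletes = set(previous) - set(desired)
    return upserts, deletes


# Hypothetical state from the last run vs. what this run declares.
previous = {"a": {"text": "old"}, "b": {"text": "same"}, "c": {"text": "gone"}}
desired = {"a": {"text": "new"}, "b": {"text": "same"}, "d": {"text": "added"}}

upserts, deletes = plan_sync(previous, desired)
print(sorted(upserts))  # ['a', 'd']
print(sorted(deletes))  # ['c']
```

Unchanged points ("b" here) need no write at all, which is the practical benefit of declaring target states instead of issuing upserts by hand.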

Declaring target states

Setting up a connection

Create a ContextKey[QdrantClient] (with tracked=False) to identify your Qdrant client, then provide it in your lifespan:

from typing import AsyncIterator

from qdrant_client import QdrantClient
import cocoindex as coco
from cocoindex.connectors import qdrant

QDRANT_URL = "http://localhost:6333"
QDRANT_DB = coco.ContextKey[QdrantClient]("my_vectors", tracked=False)

@coco.lifespan
async def coco_lifespan(builder: coco.EnvironmentBuilder) -> AsyncIterator[None]:
    client = qdrant.create_client(QDRANT_URL)
    builder.provide(QDRANT_DB, client)
    yield

Collections (parent state)

Declares a collection as a target state. Returns a CollectionTarget for declaring points.

def declare_collection_target(
    db: ContextKey[QdrantClient],
    collection_name: str,
    schema: CollectionSchema,
    *,
    managed_by: Literal["system", "user"] = "system",
) -> CollectionTarget[coco.PendingS]

Parameters:

  • db — A ContextKey[QdrantClient] identifying the Qdrant client to use.
  • collection_name — Name of the collection.
  • schema — Schema definition specifying vector configurations (see Collection Schema).
  • managed_by — Whether CocoIndex manages the collection lifecycle ("system") or assumes it exists ("user").

Returns: A pending CollectionTarget. Use the convenience wrapper await qdrant.mount_collection_target(QDRANT_DB, collection_name, schema) to resolve.

Points (child states)

Once a CollectionTarget is resolved, declare points to be upserted using qdrant.PointStruct, which is an alias of qdrant_client.http.models.PointStruct:

def CollectionTarget.declare_point(
    self,
    point: qdrant.PointStruct,
) -> None

Parameters:

  • point — A qdrant.PointStruct (alias of qdrant_client.http.models.PointStruct) containing:
    • id — Point ID (str, int, or UUID)
    • vector — Vector data (single vector or dict of named vectors)
    • payload — Optional metadata as a JSON-serializable dict

Collection schema

Define vector configurations for a collection using CollectionSchema. Unlike row-oriented databases, Qdrant uses a point-oriented model where each point has schemaless payload and one or more vectors with predefined dimensions.

class CollectionSchema:
    @classmethod
    async def create(
        cls,
        vectors: QdrantVectorDef | dict[str, QdrantVectorDef],
    ) -> CollectionSchema

Parameters:

  • vectors — Either:
    • A single QdrantVectorDef for an unnamed vector
    • A dict mapping vector names to QdrantVectorDef for named vectors

QdrantVectorDef

Specifies vector configuration including dimension, distance metric, and multi-vector settings:

class QdrantVectorDef(NamedTuple):
    schema: VectorSchemaProvider | MultiVectorSchemaProvider
    distance: Literal["cosine", "dot", "euclid"] = "cosine"
    multivector_comparator: Literal["max_sim"] = "max_sim"

Parameters:

  • schema — A VectorSchemaProvider or MultiVectorSchemaProvider that defines vector dimensions
  • distance — Distance metric for similarity search (default: "cosine")
  • multivector_comparator — Comparator for multi-vector fields (only applies to MultiVectorSchemaProvider)

Single (unnamed) vector

For collections with a single unnamed vector:

from cocoindex.ops.sentence_transformers import SentenceTransformerEmbedder

embedder = SentenceTransformerEmbedder("sentence-transformers/all-MiniLM-L6-v2")

schema = await qdrant.CollectionSchema.create(
    vectors=qdrant.QdrantVectorDef(schema=embedder)
)

Points use the vector directly:

point = qdrant.PointStruct(
    id="doc-123",
    vector=embedding.tolist(),  # Single vector
    payload={"text": "...", "metadata": {...}},
)

Named vectors

For collections with multiple named vectors:

from cocoindex.resources.schema import VectorSchema
import numpy as np

schema = await qdrant.CollectionSchema.create(
    vectors={
        "text_embedding": qdrant.QdrantVectorDef(
            schema=VectorSchema(dtype=np.float32, size=384),
            distance="cosine",
        ),
        "image_embedding": qdrant.QdrantVectorDef(
            schema=VectorSchema(dtype=np.float32, size=512),
            distance="dot",
        ),
    }
)

Points use a dict of vectors:

point = qdrant.PointStruct(
    id="doc-123",
    vector={
        "text_embedding": text_vec.tolist(),
        "image_embedding": image_vec.tolist(),
    },
    payload={"text": "...", "metadata": {...}},
)
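Because each named vector has a fixed dimension in the schema, a mismatch between a point's vectors and the declared sizes is an easy mistake to make. The following is a small, library-free sketch of a pre-flight check; `check_named_vectors` and `EXPECTED_SIZES` are hypothetical helpers, not part of the connector.

```python
# Sizes mirror the named-vector schema used in this section.
EXPECTED_SIZES = {"text_embedding": 384, "image_embedding": 512}


def check_named_vectors(
    vectors: dict[str, list[float]],
    expected_sizes: dict[str, int],
) -> list[str]:
    """Return human-readable problems; an empty list means the dict looks consistent."""
    problems = []
    for name in sorted(vectors):
        if name not in expected_sizes:
            # The name was never declared in the schema.
            problems.append(f"unknown vector name {name!r}")
        elif len(vectors[name]) != expected_sizes[name]:
            # The name is declared, but the dimension does not match.
            problems.append(
                f"{name!r} has {len(vectors[name])} dimensions, "
                f"expected {expected_sizes[name]}"
            )
    return problems


good = {"text_embedding": [0.0] * 384, "image_embedding": [0.0] * 512}
bad = {"text_embedding": [0.0] * 512, "audio_embedding": [0.0] * 128}

print(check_named_vectors(good, EXPECTED_SIZES))  # []
for problem in check_named_vectors(bad, EXPECTED_SIZES):
    print(problem)
```

The check deliberately does not require every declared name to be present on every point; it only flags undeclared names and wrong dimensions.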

VectorSchemaProvider

The schema field of QdrantVectorDef accepts a VectorSchemaProvider, a ContextKey, or an explicit VectorSchema to specify the vector dimension and dtype. See Vector Schema for details.

Multi-vector support

For multi-vector configurations (multiple vectors per point stored together):

from cocoindex.resources.schema import MultiVectorSchema, VectorSchema
import numpy as np

schema = await qdrant.CollectionSchema.create(
    vectors=qdrant.QdrantVectorDef(
        schema=MultiVectorSchema(
            vector_schema=VectorSchema(dtype=np.float32, size=384)
        ),
        multivector_comparator="max_sim",
    )
)
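As an illustration of what the "max_sim" comparator computes, the sketch below scores two multi-vectors in plain Python: for each query vector it takes the best similarity against any document vector, then sums those maxima. It assumes dot product as the per-vector similarity, and the `max_sim` function here is a hypothetical stand-in, not a connector API.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Per-vector similarity; dot product is assumed for this sketch."""
    return sum(x * y for x, y in zip(a, b))


def max_sim(query_vectors: Sequence[Vector], doc_vectors: Sequence[Vector]) -> float:
    """Sum, over query vectors, of the best similarity to any document vector."""
    return sum(
        max(dot(q, d) for d in doc_vectors)
        for q in query_vectors
    )


query = [[1.0, 0.0], [0.0, 1.0]]
doc = [[0.5, 0.5], [1.0, 0.0], [0.0, 0.2]]

# First query vector matches [1.0, 0.0] best (1.0);
# second matches [0.5, 0.5] best (0.5).
print(max_sim(query, doc))  # 1.5
```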

Distance metrics

The distance parameter in QdrantVectorDef specifies the similarity metric:

  • "cosine" — Cosine similarity (default, normalized dot product)
  • "dot" — Dot product similarity
  • "euclid" — Euclidean distance (L2)
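To show how the three metrics differ, here is a small plain-Python sketch that computes each one for the same pairs of vectors. The function names are illustrative only; Qdrant computes these internally.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product: larger means more similar; sensitive to vector length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity: dot product of the length-normalized vectors.

    Undefined for zero-length vectors (division by zero).
    """
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def euclid(a: Vector, b: Vector) -> float:
    """Euclidean (L2) distance: smaller means more similar."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


a = [1.0, 0.0]
b = [2.0, 0.0]  # same direction as a, twice the length
c = [0.0, 3.0]  # orthogonal to a

print(cosine(a, b), dot(a, b), euclid(a, b))  # 1.0 2.0 1.0
print(cosine(a, c), dot(a, c))                # 0.0 0.0
```

Note that cosine ignores the length difference between `a` and `b`, while dot product and Euclidean distance both reflect it.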

Example: single vector

from qdrant_client import QdrantClient
import cocoindex as coco
from cocoindex.connectors import qdrant
from cocoindex.ops.sentence_transformers import SentenceTransformerEmbedder
from typing import AsyncIterator

QDRANT_URL = "http://localhost:6333"
QDRANT_DB = coco.ContextKey[QdrantClient]("main_vectors", tracked=False)

embedder = SentenceTransformerEmbedder("sentence-transformers/all-MiniLM-L6-v2")

@coco.lifespan
async def coco_lifespan(builder: coco.EnvironmentBuilder) -> AsyncIterator[None]:
    client = qdrant.create_client(QDRANT_URL)
    builder.provide(QDRANT_DB, client)
    yield

@coco.fn
async def process_document(
    doc_id: str,
    text: str,
    target: qdrant.CollectionTarget,
) -> None:
    embedding = await embedder.embed(text)

    point = qdrant.PointStruct(
        id=doc_id,
        vector=embedding.tolist(),
        payload={"text": text},
    )
    target.declare_point(point)

@coco.fn
async def app_main() -> None:
    # Declare collection target state
    collection = await qdrant.mount_collection_target(
        QDRANT_DB,
        "documents",
        await qdrant.CollectionSchema.create(
            vectors=qdrant.QdrantVectorDef(schema=embedder)
        ),
    )

    # Declare points; `documents` is an iterable of (doc_id, text) pairs defined elsewhere
    for doc_id, text in documents:
        await coco.mount(
            coco.component_subpath("doc", doc_id),
            process_document,
            doc_id,
            text,
            collection,
        )

Example: named vectors

from cocoindex.resources.schema import VectorSchema
import numpy as np

# Reuses QDRANT_DB and the imports from the previous example.
# `text_embedder` and `documents` are assumed to be defined elsewhere.
@coco.fn
async def app_main() -> None:
    collection = await qdrant.mount_collection_target(
        QDRANT_DB,
        "multimodal_docs",
        await qdrant.CollectionSchema.create(
            vectors={
                "text": qdrant.QdrantVectorDef(
                    schema=text_embedder,
                    distance="cosine",
                ),
                "image": qdrant.QdrantVectorDef(
                    schema=VectorSchema(dtype=np.float32, size=512),
                    distance="dot",
                ),
            }
        ),
    )

    # Declare points with named vectors
    for doc in documents:
        point = qdrant.PointStruct(
            id=doc.id,
            vector={
                "text": doc.text_embedding.tolist(),
                "image": doc.image_embedding.tolist(),
            },
            payload={"title": doc.title, "url": doc.url},
        )
        collection.declare_point(point)

Point IDs

Qdrant supports the following point ID types:

  • str — String identifiers
  • int — Integer identifiers (unsigned 64-bit)
  • uuid.UUID — UUID identifiers (converted to string)

All other types are converted to strings automatically.
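If your application keys are arbitrary strings (file paths, URLs, slugs) and you would rather hand the connector UUID-shaped IDs, one common approach is to derive a deterministic UUID from the key. This is a sketch using only the standard library; `stable_point_id` and the namespace URL are hypothetical, and nothing here is required by the connector.

```python
import uuid

# Hypothetical namespace for this application; any fixed UUID works,
# as long as it never changes between runs.
POINT_ID_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.com/my-app/points")


def stable_point_id(key: str) -> str:
    """Derive a deterministic UUID string from an application-level key."""
    # uuid5 hashes (namespace, key), so the same key always yields the same ID.
    return str(uuid.uuid5(POINT_ID_NAMESPACE, key))


first = stable_point_id("doc-123")
second = stable_point_id("doc-123")
other = stable_point_id("doc-124")

print(first == second)  # True
print(first == other)   # False
```

Because the ID depends only on the key, re-running a pipeline over the same document declares the same point again rather than creating a duplicate.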

Payloads

Point payloads are schemaless JSON objects. Any JSON-serializable Python data structure can be used:

payload = {
    "text": "Document content",
    "metadata": {
        "author": "Alice",
        "tags": ["machine-learning", "nlp"],
        "published": "2024-01-15",
    },
    "stats": {
        "views": 1500,
        "likes": 42,
    },
}
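Since payloads must be JSON-serializable, values such as `datetime.date` objects need converting before they go into a point. The sketch below shows one way to check a payload up front with the standard `json` module; `is_json_serializable` is a hypothetical helper, not part of the connector.

```python
import json
from datetime import date
from typing import Any


def is_json_serializable(payload: dict[str, Any]) -> bool:
    """Return True if `payload` can be encoded as JSON without custom encoders."""
    try:
        json.dumps(payload)
    except (TypeError, ValueError):
        # TypeError: unsupported type; ValueError: e.g. circular references.
        return False
    return True


published = date(2024, 1, 15)

raw = {"text": "Document content", "published": published}
fixed = {"text": "Document content", "published": published.isoformat()}

print(is_json_serializable(raw))    # False
print(is_json_serializable(fixed))  # True
```

Converting dates to ISO 8601 strings, as in `fixed`, matches the "published" field shown in the payload example above.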

The connector focuses on writing points to Qdrant. For vector search, use the Qdrant client directly:

from qdrant_client.http import models as qdrant_models

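# Create a client (or reuse the one you registered in your lifespan)
client = qdrant.create_client("http://localhost:6333")

# Perform search; `query_embedding` is the embedded query, computed elsewhere.
# Note: newer qdrant-client releases deprecate `search` in favor of `query_points`.
results = client.search(
    collection_name="documents",
    query_vector=query_embedding.tolist(),
    limit=10,
)

for result in results:
    print(f"Score: {result.score}, ID: {result.id}")
    print(f"Payload: {result.payload}")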

For named vectors:

results = client.search(
    collection_name="documents",
    query_vector=("text", query_embedding.tolist()),  # Search using "text" vector
    limit=10,
)