Value serialization for memoization
How CocoIndex serializes memoized function returns with msgspec — supported types (primitives, collections, dataclasses, Pydantic, NumPy), required type annotations, custom type registration, and troubleshooting tips.
Overview
CocoIndex serializes and caches the return values of memoized functions so that unchanged work can be skipped on subsequent runs. Most Python types work automatically — the key thing to get right is the return type annotation, which tells CocoIndex how to reconstruct your objects:
@coco.fn(memo=True)
async def process_chunk(chunk: Chunk) -> Embedding:  # return type annotation
    return embed(chunk.text)
Without annotations, values may deserialize as basic Python types (dict, list, str, etc.) instead of their original types.
Serialization also applies to memo states and tracking records. If you’re implementing these, add type annotations to:
- The prev_state parameter of __coco_memo_state__: annotate it with the state type you return in MemoStateOutcome(state=...). See Memo state validation.
- The prev_possible_records parameter of reconcile(): annotate it with Collection[YourTrackingRecord]. See Custom Target Connector.
Supported types
The following types all work out of the box — no registration needed:
| Category | Types |
|---|---|
| Primitives | bool, int, float, str, bytes, None |
| Collections | list, tuple, dict, set, frozenset |
| Dataclasses | Any @dataclass (including frozen) |
| NamedTuples | Any NamedTuple subclass |
| Pydantic models | Any pydantic.BaseModel subclass |
| msgspec Structs | Any msgspec.Struct subclass |
| Date/time | datetime.datetime, datetime.date, datetime.time, datetime.timedelta, datetime.timezone |
| Other stdlib | uuid.UUID, complex, pathlib.Path, pathlib.PurePath |
| NumPy | numpy.ndarray, numpy.dtype (when numpy is installed) |
More generally, all types supported by msgspec work automatically. These types also work when nested inside collections or other structured types.
Custom types
If your type isn’t in the list above, register it with @coco.serialize_by_pickle:
import cocoindex as coco
@coco.serialize_by_pickle
class MySpecialType:
    def __init__(self, data):
        self.data = data
For third-party types, call it as a regular function:
import cocoindex as coco
from some_library import SomeType
coco.serialize_by_pickle(SomeType)
Don’t apply @coco.serialize_by_pickle to dataclasses, NamedTuples, or msgspec.Struct subclasses, which are already supported natively. On these types the decorator takes effect only for top-level values; when an instance is nested inside another supported type, the native encoding takes precedence and the decorator has no effect.
If serialization fails because of a problematic field inside a dataclass, register that field’s type with @coco.serialize_by_pickle instead.
Union types
Unions of a custom type with None work fine (MyDataclass | None). However, unions involving multiple custom types or a custom type with other non-None types require tagged msgspec.Struct variants.
For example, this won’t work:
from dataclasses import dataclass
from typing import NamedTuple

@dataclass
class Config:
    value: int

class Settings(NamedTuple):
    config: Config | str  # fails at deserialization
Fix — wrap each variant in a tagged msgspec.Struct. The tag=True parameter embeds a type tag in the serialized data so that the correct variant can be identified during deserialization:
from typing import NamedTuple

import msgspec

class ConfigValue(msgspec.Struct, tag=True):
    value: int

class StringValue(msgspec.Struct, tag=True):
    value: str

class Settings(NamedTuple):
    config: ConfigValue | StringValue  # works: variants are distinguished by tag
Troubleshooting
DeserializationError: Cannot build msgspec Decoder
This typically means an unsupported union type. The error message includes a hint about the cause.
Fix: Restructure the union to use tagged msgspec.Struct variants. See Union types above.
DeserializationError: Failed to deserialize msgspec payload
The type annotation doesn’t match the serialized data. Common causes:
- Missing return type annotation on a memoized function: add -> YourType to the function signature.
- Changed type structure between runs: if you renamed or restructured a dataclass, the cached data won’t match. Rebuild the cache by running app.update(full_reprocess=True) or cocoindex update --full-reprocess.
- Forward reference not resolved: if your type annotation uses a string forward reference, ensure the type is defined before the function is first called.
UnpicklingError: Forbidden global during unpickling
_pickle.UnpicklingError: Forbidden global during unpickling: myapp.models.Summary
CocoIndex restricts which types can be deserialized for security. This error means your type isn’t in the allow-list. Fix by either:
- Converting to a dataclass or NamedTuple (recommended — supported natively, no registration needed)
- Using @coco.serialize_by_pickle to register the type
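The allow-list idea can be sketched with the standard library's pickle.Unpickler.find_class hook. This is not CocoIndex's implementation; ALLOWED, allow, and restricted_loads are hypothetical names, and fractions.Fraction stands in for an unregistered type.

```python
import io
import pickle
from fractions import Fraction

# Hypothetical allow-list of (module, qualified name) pairs.
ALLOWED = set()


def allow(cls):
    """Register a class as safe to unpickle (illustrative helper)."""
    ALLOWED.add((cls.__module__, cls.__qualname__))
    return cls


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called for every global the pickle stream references.
        if (module, name) in ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(
            f"Forbidden global during unpickling: {module}.{name}"
        )


def restricted_loads(data):
    return RestrictedUnpickler(io.BytesIO(data)).load()


payload = pickle.dumps(Fraction(1, 3))

try:
    restricted_loads(payload)
except pickle.UnpicklingError as exc:
    blocked_message = str(exc)
else:
    blocked_message = None

allow(Fraction)  # once registered, the same payload loads
restored = restricted_loads(payload)
```

Rejecting unknown globals matters because unpickling can import and call arbitrary objects, which is why an explicit registration step is required.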
If you see this error after upgrading, previously cached data may reference types that aren’t yet registered. You have two options:
- Add @coco.serialize_by_pickle to the type and re-run.
- If the type is already a dataclass or NamedTuple, add @coco.unpickle_safe to allow reading the old cached data. Once the cache is rebuilt, the decorator can be removed.