Value serialization for memoization

How CocoIndex serializes memoized function returns with msgspec — supported types (primitives, collections, dataclasses, Pydantic, NumPy), required type annotations, custom type registration, and troubleshooting tips.

Version: v1.0.0-alpha48
Last reviewed: Apr 19, 2026

Overview

CocoIndex serializes and caches the return values of memoized functions so that unchanged work can be skipped on subsequent runs. Most Python types work automatically — the key thing to get right is the return type annotation, which tells CocoIndex how to reconstruct your objects:

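python
import cocoindex as coco

# Chunk, Embedding, and embed() are defined elsewhere in your application.
@coco.fn(memo=True)
async def process_chunk(chunk: Chunk) -> Embedding:  # return type annotation
    return embed(chunk.text)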

Without annotations, values may deserialize as basic Python types (dict, list, str, etc.) instead of their original types.
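To see why the annotation matters, here is a minimal sketch that uses the standard library's json module as a stand-in for CocoIndex's msgspec-based serializer. The stand-in is an assumption made for illustration, and the Embedding dataclass and its fields are hypothetical. The point is only that a serialized payload carries field values, not the Python class, so the decoder needs to be told the target type.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Embedding:  # hypothetical return type, for illustration only
    model: str
    vector: list[float]


original = Embedding(model="demo", vector=[0.1, 0.2])

# Serialize: only the field values end up in the payload.
payload = json.dumps(asdict(original))

# Decode without a target type: the result is a plain dict.
untyped = json.loads(payload)

# Decode with the target type known: the original object can be rebuilt.
typed = Embedding(**json.loads(payload))
```

In this sketch, untyped is a dict while typed compares equal to the original object. The return type annotation plays the role of the explicit Embedding(...) call here.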

Advanced: other places where serialization and type annotations matter

Serialization also applies to memo states and tracking records. If you’re implementing these, add type annotations to:

  • __coco_memo_state__ prev_state parameter — annotate with the state type you return in MemoStateOutcome(state=...). See Memo state validation.
  • reconcile() prev_possible_records parameter — annotate with Collection[YourTrackingRecord]. See Custom Target Connector.

Supported types

The following types all work out of the box — no registration needed:

Category          Types
Primitives        bool, int, float, str, bytes, None
Collections       list, tuple, dict, set, frozenset
Dataclasses       Any @dataclass (including frozen)
NamedTuples       Any NamedTuple subclass
Pydantic models   Any pydantic.BaseModel subclass
msgspec Structs   Any msgspec.Struct subclass
Date/time         datetime.datetime, datetime.date, datetime.time, datetime.timedelta, datetime.timezone
Other stdlib      uuid.UUID, complex, pathlib.Path, pathlib.PurePath
NumPy             numpy.ndarray, numpy.dtype (when numpy is installed)

More generally, all types supported by msgspec work automatically. These types also work when nested inside collections or other structured types.

Custom types

If your type isn’t in the list above, register it with @coco.serialize_by_pickle:

python
import cocoindex as coco

@coco.serialize_by_pickle
class MySpecialType:
    def __init__(self, data):
        self.data = data

For third-party types, call it as a regular function:

python
import cocoindex as coco
from some_library import SomeType

coco.serialize_by_pickle(SomeType)

Not for dataclasses, NamedTuples, or msgspec.Struct

Don’t apply @coco.serialize_by_pickle to dataclasses, NamedTuples, or msgspec.Struct subclasses, because these are already supported natively. Even if you apply it, the decorator takes effect only when the value is serialized at the top level; when the type is nested inside another supported type, the native encoding takes precedence and the decorator has no effect.

If serialization fails because of a problematic field inside a dataclass, register that field’s type with @coco.serialize_by_pickle instead.

Union types

Unions of a custom type with None work fine (MyDataclass | None). However, unions involving multiple custom types or a custom type with other non-None types require tagged msgspec.Struct variants.

For example, this won’t work:

python
from dataclasses import dataclass
from typing import NamedTuple

@dataclass
class Config:
    value: int

class Settings(NamedTuple):
    config: Config | str  # fails at deserialization

Fix: wrap each variant in a tagged msgspec.Struct. The tag=True parameter embeds a type tag in the serialized data so that the correct variant can be identified during deserialization:

python
from typing import NamedTuple

import msgspec

class ConfigValue(msgspec.Struct, tag=True):
    value: int

class StringValue(msgspec.Struct, tag=True):
    value: str

class Settings(NamedTuple):
    config: ConfigValue | StringValue  # works: variants are distinguished by tag
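The idea behind tagging can be sketched with the standard library alone. The example below is not how msgspec is implemented; it is a simplified illustration that assumes the tag is stored under a "type" key holding the class name, which matches msgspec's documented default for tag=True. The encode and decode helpers and the VARIANTS registry are hypothetical names.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ConfigValue:
    value: int


@dataclass
class StringValue:
    value: str


# Hypothetical registry mapping each tag to the class it identifies.
VARIANTS = {cls.__name__: cls for cls in (ConfigValue, StringValue)}


def encode(obj) -> str:
    # Store the class name next to the fields so the decoder can pick a variant.
    return json.dumps({"type": type(obj).__name__, **asdict(obj)})


def decode(payload: str):
    fields = json.loads(payload)
    cls = VARIANTS[fields.pop("type")]  # the tag selects the variant
    return cls(**fields)


roundtrip = [decode(encode(v)) for v in (ConfigValue(3), StringValue("3"))]
```

Both variants have a field named value, so without the tag the decoder would have no reliable way to tell which class a payload belongs to. With the tag, each element of roundtrip comes back as the class it started as.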

Troubleshooting

DeserializationError: Cannot build msgspec Decoder

This typically means an unsupported union type. The error message includes a hint about the cause.

Fix: Restructure the union to use tagged msgspec.Struct variants. See Union types above.

DeserializationError: Failed to deserialize msgspec payload

The type annotation doesn’t match the serialized data. Common causes:

  • Missing return type annotation on a memoized function — add -> YourType to the function signature.
  • Changed type structure between runs — if you renamed or restructured a dataclass, the cached data won’t match. Rebuild the cache by running app.update(full_reprocess=True) or cocoindex update --full-reprocess.
  • Forward reference not resolved — if your type annotation uses a string forward reference, ensure the type is defined before the function is first called.
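The "changed type structure" cause can be illustrated with a small standalone sketch. It again uses the standard library's json module as a stand-in for the real serializer, and the Summary dataclass and its fields are hypothetical. A payload written under an older class definition no longer fits once a field has been renamed.

```python
import json
from dataclasses import dataclass

# Payload cached by an earlier run, when the class had a field named `body`.
cached_payload = json.dumps({"title": "Intro", "body": "Hello"})


@dataclass
class Summary:  # hypothetical type, for illustration only
    title: str
    text: str  # renamed from `body` after the cache was written


try:
    Summary(**json.loads(cached_payload))
    error_name = None
except TypeError as exc:
    # The old payload has an unknown field and lacks the new required one.
    error_name = type(exc).__name__
```

The cached data is intact but cannot be mapped onto the new definition, which is why rebuilding the cache is the fix rather than changing the annotation.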

UnpicklingError: Forbidden global during unpickling

_pickle.UnpicklingError: Forbidden global during unpickling: myapp.models.Summary

CocoIndex restricts which types can be deserialized for security. This error means your type isn’t in the allow-list. Fix by either:

  1. Converting to a dataclass or NamedTuple (recommended — supported natively, no registration needed)
  2. Using @coco.serialize_by_pickle to register the type
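For background, an allow-list of this kind can be built on the standard library's pickle.Unpickler by overriding find_class. The sketch below follows that general pattern and is not CocoIndex's actual implementation; the ALLOWED_GLOBALS set, RestrictedUnpickler, and restricted_loads are hypothetical names, and two stdlib classes stand in for an allowed and a non-allowed type.

```python
import io
import pickle
from collections import OrderedDict
from fractions import Fraction

# Hypothetical allow-list of (module, name) pairs that may be unpickled.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream references a class or function.
        if (module, name) not in ALLOWED_GLOBALS:
            raise pickle.UnpicklingError(
                f"Forbidden global during unpickling: {module}.{name}"
            )
        return super().find_class(module, name)


def restricted_loads(payload: bytes):
    return RestrictedUnpickler(io.BytesIO(payload)).load()


# A type on the allow-list round-trips normally.
allowed = restricted_loads(pickle.dumps(OrderedDict(a=1)))

# A type that is not on the allow-list is rejected before it is constructed.
try:
    restricted_loads(pickle.dumps(Fraction(1, 3)))
    rejected = None
except pickle.UnpicklingError as exc:
    rejected = str(exc)
```

Registering a type corresponds, in this sketch, to adding its (module, name) pair to the allow-list before any cached data is read.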

Upgrading from older versions

If you see this error after upgrading, previously cached data may reference types that aren’t yet registered. You have two options:

  • Add @coco.serialize_by_pickle to the type and re-run.
  • If the type is already a dataclass or NamedTuple, add @coco.unpickle_safe to allow reading the old cached data. Once the cache is rebuilt, the decorator can be removed.