Following Up on Pydantic & Polymorphism

February 1, 2026·Ben

In my previous blog post, I discussed polymorphism with Pydantic and the issues it caused. I suggested an elegant solution (at least I think so), but one that was perhaps not the most plug-and-play option.

That said, we can try to simplify things and rely solely on Pydantic’s features to achieve the same goal, at the cost of a few compromises. In particular, we will need to leverage Pydantic’s core schema generation API and use a specific annotation. It’s not ideal, but there is not much choice short of switching back to the previous solution.

Solution

This post is going to be fairly light; I will not dive into the details. I will reuse the class hierarchy described in the previous post (as well as the Python setup), with a small addition to make sure the whole thing works across multiple levels of inheritance.

animals.py
from __future__ import annotations

import abc
from typing import final, override

from pydantic import BaseModel


class Animal(BaseModel, abc.ABC):
    name: str
    age: int

    @abc.abstractmethod
    def speak(self) -> str:
        raise NotImplementedError


@final
class Dog(Animal):
    breed: str

    @override
    def speak(self) -> str:
        return "Woof!"


@final
class Cat(Animal):
    color: str

    @override
    def speak(self) -> str:
        return "Meow!"


class Horse(Animal):
    height: float

    @override
    def speak(self) -> str:
        return "Neigh!"


@final
class TamedHorse(Horse):
    owner_name: str

    @override
    def speak(self) -> str:
        return f"Neigh! I belong to {self.owner_name}."


@final
class Owner(BaseModel):
    name: str

We draw inspiration from what is done for SerializeAsAny to hack together a schema on the fly for the declared classes. The trick here lies in identifying subclasses and generating a union schema dynamically, so we preserve correct serialization and deserialization while respecting polymorphism, and avoid doing anything too ugly with Python’s typing system.

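Note: the type AnyOf[T] = ... syntax in the TYPE_CHECKING block requires Python 3.12+. For Python 3.10-3.11, you can use TypeAlias from typing instead. Keep in mind that the listings also rely on other 3.12 features (typing.override, and the def _unique[T] and class StudFarm[H: Horse] generic syntax), which would need the same kind of adaptation.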
annotations.py
from __future__ import annotations

import dataclasses as dc
from inspect import isabstract
from typing import TYPE_CHECKING, Annotated, Any, ClassVar, TypeVar, final

from pydantic import GetCoreSchemaHandler
from pydantic_core.core_schema import any_schema, union_schema

if TYPE_CHECKING:
    from collections.abc import Iterable, Iterator, Sequence

    from pydantic_core import CoreSchema

    type AnyOf[T] = Annotated[T, ...]
else:

    @final
    @dc.dataclass(slots=True, frozen=True)
    class AnyOf:
        _CACHE: ClassVar[dict[type[Any], Sequence[type[Any]]]] = {}

        __hash__ = object.__hash__

        def __class_getitem__(cls, item: Any) -> Any:
            return Annotated[item, AnyOf()]

        @classmethod
        def _all_subclasses(cls, c: type[Any]) -> Iterator[type[Any]]:
            # Avoid abstract classes and generic classes, focusing on concrete classes
            if not isabstract(c) and not getattr(c, "__parameters__", False):
                yield c

            for sc in c.__subclasses__():
                yield from cls._all_subclasses(sc)

        @staticmethod
        def _unique[T](items: Iterable[T]) -> Iterator[T]:
            seen: set[T] = set()

            for item in items:
                if item not in seen:
                    seen.add(item)

                    yield item

        def __get_pydantic_core_schema__(
            self,
            source_type: Any,
            handler: GetCoreSchemaHandler,
        ) -> CoreSchema:
            if isinstance(source_type, TypeVar):
                # Avoid issues when used on generics
                return any_schema()

            # Cache the hierarchy so we don't have to recompute it every time
            if source_type not in self._CACHE:
                self._CACHE[source_type] = tuple(
                    self._unique(self._all_subclasses(source_type))
                )

            subclasses = self._CACHE[source_type]

            if not subclasses:
                return handler.generate_schema(source_type)

            schemas: list[CoreSchema | tuple[CoreSchema, str]] = [
                handler(c) for c in subclasses
            ]

            return union_schema(schemas, mode="smart")
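
To make the filtering in _all_subclasses easier to follow, here is a stdlib-only sketch of the same idea, without Pydantic. The Shape hierarchy and the concrete_subclasses helper are hypothetical names used for illustration; it is a paraphrase of the logic above, not a replacement for it.

```python
import abc
from inspect import isabstract
from typing import Generic, TypeVar

T = TypeVar("T")


def concrete_subclasses(root: type) -> list[type]:
    """Collect root and its descendants, skipping abstract classes and
    generics that still have free type variables, without duplicates."""
    found: list[type] = []

    def walk(c: type) -> None:
        is_open_generic = bool(getattr(c, "__parameters__", ()))
        if not isabstract(c) and not is_open_generic and c not in found:
            found.append(c)
        for sc in c.__subclasses__():
            walk(sc)

    walk(root)
    return found


class Shape(abc.ABC):
    @abc.abstractmethod
    def area(self) -> float: ...


class Square(Shape):
    def area(self) -> float:
        return 1.0


class Tagged(Shape, Generic[T]):  # __parameters__ == (T,): skipped
    def area(self) -> float:
        return 0.0


class IntTagged(Tagged[int]):  # __parameters__ == (): kept
    pass


class Fancy(Square, IntTagged):  # reachable through two paths
    pass


print(sorted(c.__name__ for c in concrete_subclasses(Shape)))
# ['Fancy', 'IntTagged', 'Square']
```

Fancy is reached once through Square and once through IntTagged, which is the situation the _unique helper guards against.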

The usage is then fairly straightforward. You just need to remember to use the annotation wherever it is needed…

house.py
from __future__ import annotations

from collections.abc import Sequence
from typing import final

from pydantic import BaseModel

from .animals import Animal, Horse, Owner
from .annotations import AnyOf


@final
class House(BaseModel):
    owner: Owner
    animals: Sequence[AnyOf[Animal]]


@final
class StudFarm[H: Horse](BaseModel):
    horses: Sequence[AnyOf[H]]

Serialization and deserialization work as expected. Pydantic manages to find its way.

demo.py
from __future__ import annotations

from .animals import Dog, Cat, TamedHorse, Horse, Owner
from .house import House, StudFarm

if __name__ == "__main__":
    dog = Dog(name="Buddy", age=3, breed="Golden Retriever")
    cat = Cat(name="Whiskers", age=2, color="Tabby")
    owner = Owner(name="Alice")
    horse = TamedHorse(name="Star", age=5, height=15.2, owner_name=owner.name)
    house = House(owner=owner, animals=[dog, cat, horse])
    # TamedHorse is a subclass of Horse: polymorphism works!
    farm = StudFarm[Horse](horses=[horse])

    _ = House.model_validate_json(house.model_dump_json())
    _ = StudFarm.model_validate_json(farm.model_dump_json())
    _ = StudFarm[Horse].model_validate_json(farm.model_dump_json())
    _ = StudFarm[TamedHorse].model_validate_json(farm.model_dump_json())

For reference, the serialized house (pretty-printed) looks like this:

{
  "owner": {
    "name": "Alice"
  },
  "animals": [
    {
      "name": "Buddy",
      "age": 3,
      "breed": "Golden Retriever"
    },
    {
      "name": "Whiskers",
      "age": 2,
      "color": "Tabby"
    },
    {
      "name": "Star",
      "age": 5,
      "height": 15.2,
      "owner_name": "Alice"
    }
  ]
}

Conclusion

That’s it. Not much more to say. Just a small hack to get you unstuck. One of the main drawbacks of this lighter approach is the need to have all classes properly imported before the schema is built; otherwise Python will not be able to discover them dynamically, and the schema generated by Pydantic may end up incomplete. In my opinion, the biggest annoyance is still the requirement to add a specific annotation whenever you want to support this use case. I would still tend to favor the previous solution in a well-established codebase. In both cases, the solutions presented are sensitive to changes in Pydantic’s API, but unfortunately there isn’t much choice.
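
The import caveat can be illustrated without Pydantic. The sketch below uses hypothetical Base, Early, and Late classes and a plain tuple as a stand-in for the _CACHE entry, which is likewise filled the first time a type is seen; treat it as an illustration of the timing issue rather than of Pydantic’s internals.

```python
class Base:
    pass


class Early(Base):
    pass


# Stand-in for what gets cached the first time the hierarchy is inspected.
snapshot = tuple(Base.__subclasses__())


class Late(Base):  # defined (or imported) only after the snapshot
    pass


print([c.__name__ for c in snapshot])               # ['Early']
print([c.__name__ for c in Base.__subclasses__()])  # ['Early', 'Late']
```

A class that is defined or imported after the snapshot never makes it into the stored hierarchy, so importing every module that declares subclasses up front is the safer habit.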

Anyway, just a quick little digression. See you.
