The problem started when I had two classes that needed to talk to each other in both directions.
The following example is made up, but mostly behaves like the original problem.
Let’s say I have a Director and an Actor.
The Director tells the Actor to do_action().
In order to do the action, the Actor needs to get_data() from the Director.
Here’s our director.py:
from .actor import Actor


class Director:
    def __init__(self, data):
        self._data = data
        self._actor = Actor(self)

    def do_action(self, action):
        return self._actor.do_action(action)

    def get_data(self):
        return self._data
To make testing easier, we’re passing the data to Director at construction.
The Director passes itself to Actor during construction: self._actor = Actor(self).
Now, actor.py:
class Actor:
    def __init__(self, director):
        self._director = director

    def do_action(self, action):
        if action == "add":
            data = self._director.get_data()
            return sum(data)
        else:
            raise KeyError(action)
Again, for testing simplicity in this toy example, the “action” in question is “add”.
Now, a test case, test_action.py:
from .director import Director


def test_add():
    d = Director(data=[1, 2, 3])
    assert d.do_action("add") == 6
This test passes just fine. The test creates a Director, then calls do_action("add").
The director passes the work to the actor with self._actor.do_action(action).
The actor retrieves the data from the director with get_data().
Now, this simple example is silly. If that’s all we need to do, we can add the numbers ourselves, or at the very least pass the data in do_action().
However, this is a toy example. A real example might include many retrievals of data or of system state, and we’d like the actor not to keep that knowledge locally.
So what’s the problem? This seems fine.
Let’s add type hints.
We want to use type hints and at least add them to all of the method signatures.
Director is pretty easy.
data is a list of integers, list[int].
action is a str.
do_action() returns an int.
These actually seem kind of restrictive. We may want int|float or something wider like Any. But int will work for the example.
from .actor import Actor


class Director:
    def __init__(self, data: list[int]):
        ...

    def do_action(self, action: str) -> int:
        ...

    def get_data(self) -> list[int]:
        ...
Now Actor is where the problem starts:
from .director import Director


class Actor:
    def __init__(self, director: Director):
        ...

    def do_action(self, action: str) -> int:
        ...
The type for director is Director. But we haven’t imported that yet. So I’ve added the from .director import Director line.
What happens when we run the tests:
$ pytest
=================== test session starts ====================
collected 0 items / 1 error
========================== ERRORS ==========================
_____________ ERROR collecting test_action.py ______________
ImportError while importing test module 'test_action.py'.
Traceback:
...
director.py:1: in <module>
from .actor import Actor
actor.py:1: in <module>
from .director import Director
E ImportError: cannot import name 'Director' from partially initialized module
'with_types.director' (most likely due to a circular import) (director.py)
================= short test summary info ==================
ERROR test_action.py
!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!
===================== 1 error in 0.06s =====================
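Drat! Circular imports. director.py starts by importing actor.py, which in turn tries to import Director from director.py before that class has been defined.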
Now, there are many ways to solve this problem. But today, we’re going to focus on Protocol, and replace what the Actor knows about Director.
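For comparison, one of those other ways is the TYPE_CHECKING guard from the typing module. This is only a sketch of that alternative, not the approach taken in the rest of this post. The import is followed by type checkers but skipped at runtime, and the annotation is written as the string "Director" so it is never evaluated when the module loads.

```python
# actor.py, sketched with the TYPE_CHECKING alternative (not the Protocol approach)
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # TYPE_CHECKING is False at runtime, so this import never executes
    # and director.py can import actor.py without a cycle.
    from .director import Director


class Actor:
    # The quoted annotation is a forward reference; Python does not look it up at runtime.
    def __init__(self, director: "Director"):
        self._director = director

    def do_action(self, action: str) -> int:
        if action == "add":
            data = self._director.get_data()
            return sum(data)
        else:
            raise KeyError(action)
```

This works, but Actor is still typed against the whole Director class, which is more than it needs.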
Solving circular imports with Protocol
We had the constructor of Actor looking like this:
class Actor:
    def __init__(self, director: Director):
        ...
But really, Actor doesn’t need to access every method in Director or even really know what a Director is.
All Actor cares about is that it can call get_data():
class Actor:
    ...

    def do_action(self, action: str) -> int:
        if action == "add":
            data = self._director.get_data()
            ...
Protocol allows us to replace Actor’s view of Director with a stub class and use it as a type. For that matter, Actor doesn’t really care that it gets passed a Director, just something that has a get_data() method that returns the right type, so naming it HasData is fine.
from typing import Protocol


class HasData(Protocol):
    def get_data(self) -> list[int]:
        ...


class Actor:
    def __init__(self, hasData: HasData):
        self._hasData = hasData

    def do_action(self, action: str) -> int:
        if action == "add":
            data = self._hasData.get_data()
            ...
I want to be clear here. The ... in do_action() is because I don’t want to list the whole function.
But the ... in HasData is the actual code. Since it’s a Protocol, there doesn’t need to be any function body, not even a return statement.
Seriously, this is the entire bit of additional code to get this to work:
from typing import Protocol


class HasData(Protocol):
    def get_data(self) -> list[int]:
        ...
Now Actor knows that it’s going to get passed something that has a get_data() method, but it doesn’t know anything else about the type.
No more circular imports.
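A side effect of Actor only knowing about HasData is that it can be exercised without a Director at all. Here’s a small sketch of that. FixedData and the test name are hypothetical, made up for this illustration, and Actor is repeated inline only so the snippet runs on its own.

```python
from typing import Protocol


class HasData(Protocol):
    def get_data(self) -> list[int]:
        ...


class Actor:
    def __init__(self, hasData: HasData):
        self._hasData = hasData

    def do_action(self, action: str) -> int:
        if action == "add":
            data = self._hasData.get_data()
            return sum(data)
        else:
            raise KeyError(action)


class FixedData:
    """Hypothetical stand-in: it never mentions HasData, it just has get_data()."""

    def __init__(self, data: list[int]):
        self._data = data

    def get_data(self) -> list[int]:
        return self._data


def test_actor_add_without_director():
    actor = Actor(FixedData([1, 2, 3]))
    assert actor.do_action("add") == 6
```

FixedData doesn’t inherit from HasData. Having a matching get_data() method is what makes it acceptable to a type checker.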
Note that test_action.py and director.py are unchanged.
The only module that knows about the Protocol is actor.py.
The test passes fine, as does mypy, if we want to run it.
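If you’d like to try the idea without setting up a package, here’s the same arrangement squeezed into one file. Putting everything in one file is my assumption for the sake of a runnable sketch. A single file has no imports to go circular, so it only demonstrates that Director satisfies HasData structurally, not the import fix itself.

```python
from typing import Protocol


class HasData(Protocol):
    def get_data(self) -> list[int]:
        ...


class Actor:
    def __init__(self, hasData: HasData):
        self._hasData = hasData

    def do_action(self, action: str) -> int:
        if action == "add":
            data = self._hasData.get_data()
            return sum(data)
        else:
            raise KeyError(action)


class Director:
    # Director never references HasData; its get_data() signature is what matches.
    def __init__(self, data: list[int]):
        self._data = data
        self._actor = Actor(self)

    def do_action(self, action: str) -> int:
        return self._actor.do_action(action)

    def get_data(self) -> list[int]:
        return self._data


def test_add():
    d = Director(data=[1, 2, 3])
    assert d.do_action("add") == 6
```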
Further reading
I learned about Protocol from this excellent article: Subclassing in Python Redux.