You start out really small, perhaps a Proof of Concept, a small app or data engineering pipeline. Or you want to go full Domain Driven Design, with all the bells and whistles? Sooner or later you will reach the point where you realise: I’ve created a mess, or at least, contributed to it. You change one thing here and ten other things start failing over there. Welcome to the Big Ball of Mud.
And time for one of our core principles:
Global organisation, local chaos.
Your project has undergone multiple epochs of evolutionary growth, by you and developers before you. You added features. Many, many features, but nobody looked at the whole thing. At this stage two things are evident:
- You need to “grow wings”, get into bird’s eye view-mode and draw boundaries, aka define separate modules.
- Now that you know your boundaries, you have to make sure the boundaries are respected.
Point 1 you most likely cannot learn from a blog post, but point 2 is definitely something we can tackle here.
Chaotic code bases are difficult to tame, and once tamed, difficult to keep in that state. As an example, we’ll take one guideline of Domain Driven Design (DDD for short):
the domain model has no dependencies (except the most basic and essential ones)
You can see the domain model as a “module” containing all business logic. Spreading your business logic across the modules or layers of an application is, in 99.9% of cases, a recipe for disaster. Conceptually, a rule to protect our domain module could be written like this:
- for each file in the domain model module
- forbid imports from other modules
- but allow imports from the domain module itself
The test to validate something like this could be written in this way:
from pytest_archon import archrule


def test_domain_model():
    (
        archrule("protect the domain model")
        .match("app.domain_model*")  # (1)
        .should_not_import("app*")  # (2)
        .may_import("app.domain_model*")  # (3)
        .check("app")
    )
The test above is already totally valid pytest-archon code. So what is pytest-archon? It tries to help you with the question:
How can I codify the boundaries by which I develop and extend my application?
So it is a pytest plugin that helps you define (architectural) rules (archon means ruler, but it also sounds a bit like the arch in architecture) for your application. We created it at one of our innovation days at Xebia. Architecture rules are defined in simple Pytest test cases and can run as part of a CI/CD pipeline. It scratches our own itch: as consultants we know right at the start of an assignment that our time will be limited. How can we still ensure that our initiatives and efforts towards code quality stay, even after we leave?
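To make the semantics of such a rule concrete, here is a minimal plain-Python sketch of the idea behind it. It is not how pytest-archon is implemented: the helpers `imported_modules` and `violations`, the input format and the example sources are all made up for illustration, patterns are assumed to be shell-style globs, and only absolute imports are considered.

```python
import ast
from fnmatch import fnmatchcase


def imported_modules(source: str) -> set[str]:
    """Collect the absolute module names imported by a piece of source code."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Simplification: "from a.b import c" is recorded as "a.b" only.
            found.add(node.module)
    return found


def violations(sources, match, forbidden, allowed):
    """Return (module, imported module) pairs that break the rule.

    `sources` maps a module name to its source code (hypothetical input format).
    """
    broken = []
    for module, source in sources.items():
        if not fnmatchcase(module, match):
            continue  # the rule does not apply to this module
        for imported in sorted(imported_modules(source)):
            if fnmatchcase(imported, allowed):
                continue  # explicitly allowed, e.g. the domain model itself
            if fnmatchcase(imported, forbidden):
                broken.append((module, imported))
    return broken


# Hypothetical mini code base: one clean domain file, one with a "quick fix".
sources = {
    "app.domain_model.order": "from app.domain_model.price import Price\nimport decimal\n",
    "app.domain_model.price": "from app.db import session\n",
    "app.db": "import sqlalchemy\n",
}

result = violations(sources, "app.domain_model*", "app*", "app.domain_model*")
print(result)  # [('app.domain_model.price', 'app.db')]
```

The import of `app.db` inside the domain model is reported, while the standard-library import and the import within the domain model itself pass.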
Guard your architecture
Traditionally, Python code bases are not much concerned with architectural questions. Most applications are using an already opinionated framework, such as Django or FastAPI, or don’t have the size to reap the benefits of a clear architecture. Hence, minimal effort is put into architecture. But if you grow, you will reach a tipping point at which you benefit greatly from paying attention to architectural concerns.
Figure 1: (A) A simple architecture for a Python web or CLI app. (B) For larger apps, it makes sense to create a service layer as an abstraction between the web or CLI framework and the domain model and database parts. (C) This pattern makes it possible to grow the app even further into multi-module architectures.
A common approach is the Model-View-Controller design pattern (figure 1A). As a logical next step, you might want to add a service layer that serves as an abstraction layer for your domain model and database parts (figure 1B). Consider the service layer-based setup as a building block of a scaffold for your app (figure 1C). Which architecture you choose and what fits your application is highly context dependent. For pytest-archon it does not matter what you choose, only what to guard: the dependencies (red arrows) and the absence of dependencies (invisible arrows 😉). How does it look in pytest-archon? To make it concrete, imagine you are building an app to book flight tickets, with order, price_calculation and reservation modules.
src
└── flight_ticket
    ├── common
    ├── order
    │   ├── data
    │   └── domain
    ├── price_calculation
    │   ├── data
    │   └── domain
    └── reservation
        ├── data
        └── domain
For architecture A (figure 1), you only need to make sure that
- the domain model does not depend on other modules of the app.
def test_fig1a():
    (
        archrule("fig 1a: domain model has no dependencies")
        .match("flight_ticket.order.domain*")
        .should_not_import("flight_ticket*")
        .may_import("flight_ticket.order.domain*")
        .check("flight_ticket")
    )
For architecture B (figure 1), you need to make sure that
- the controller, CLI or other modules only interact with the service layer
- the domain model does not depend on other modules of the app
def test_fig1b1():
    (
        archrule("fig 1b (1): other modules only uses service level")
        .match("flight_ticket*")
        .exclude("flight_ticket.order.*")
        .may_import("flight_ticket.order")
        .should_not_import("flight_ticket.order.*")
        .check("flight_ticket")
    )
This rule deserves some explanation: the target is the order module. We exclude all sub-modules flight_ticket.order.*, because they need to import each other. Everybody else is allowed to import the main module flight_ticket.order (which contains the API/service), but not any sub-modules flight_ticket.order.*.
The second point is already covered by the rule for architecture A.
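Assuming the patterns behave like shell-style globs (as in Python’s `fnmatch`), the selection logic of this rule can be illustrated without pytest-archon. The module list, including `flight_ticket.cli`, and the helper `import_allowed` are hypothetical and only serve the illustration.

```python
from fnmatch import fnmatchcase

modules = [
    "flight_ticket.cli",  # hypothetical entry point
    "flight_ticket.order",
    "flight_ticket.order.domain",
    "flight_ticket.order.data",
    "flight_ticket.reservation.domain",
]

# Modules the rule applies to: match "flight_ticket*" minus "flight_ticket.order.*".
checked = [
    m
    for m in modules
    if fnmatchcase(m, "flight_ticket*") and not fnmatchcase(m, "flight_ticket.order.*")
]


def import_allowed(imported: str) -> bool:
    """In this sketch, the explicit allowance is checked before the prohibition."""
    if fnmatchcase(imported, "flight_ticket.order"):
        return True  # the service/API module
    return not fnmatchcase(imported, "flight_ticket.order.*")  # its internals


print(checked)
print(import_allowed("flight_ticket.order"), import_allowed("flight_ticket.order.domain"))
```

Under this assumption, `flight_ticket.order.*` does not match `flight_ticket.order` itself, so the package’s own top-level module stays among the checked modules. Whether that matters depends on how the service is exposed there.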
For architecture C (figure 1), you need to make sure that
- for every module: (a) the controller (or CLI) only interacts with the service layer (b) the domain model does not depend on other modules of the app
- modules do not depend on each other
Here we will only sketch the solution; the full implementation is left as an exercise for the reader. The idea is simple: iterate through every module and apply the same architectural rules.
import pytest
from pytest_archon import archrule


# Run the same domain-model rule once per module.
@pytest.mark.parametrize("module", ["order", "price_calculation", "reservation"])
def test_fig1c(module):
    (
        archrule("domain model has no dependencies")
        .match(f"flight_ticket.{module}.domain*")
        .should_not_import("flight_ticket*")
        .may_import(f"flight_ticket.{module}.domain*")
        .check("flight_ticket")
    )
Depending on your app structure, you could tackle the second point (modules do not depend on each other) by either
- making sure that only the controller or app imports a module, or
- selecting a module A and checking whether other modules B, C import module A. Then take the next module B and check whether A or C import B, etc.
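The pairwise variant could be sketched roughly as follows. This is one possible reading, not the authors’ solution: the helper `cross_module_pairs` and the function `assert_modules_are_independent` are hypothetical names, and the rule uses only the `archrule` calls shown earlier in this post. The function is meant to be called from a pytest test case.

```python
from itertools import permutations

MODULES = ["order", "price_calculation", "reservation"]


def cross_module_pairs(modules, package="flight_ticket"):
    """Return (importer pattern, forbidden pattern) for every ordered module pair."""
    return [
        (f"{package}.{importer}*", f"{package}.{imported}*")
        for importer, imported in permutations(modules, 2)
    ]


def assert_modules_are_independent():
    # Imported lazily so the helper above works without pytest-archon installed.
    from pytest_archon import archrule

    for importer, forbidden in cross_module_pairs(MODULES):
        (
            archrule(f"{importer} must not import {forbidden}")
            .match(importer)
            .should_not_import(forbidden)
            .check("flight_ticket")
        )


pairs = cross_module_pairs(MODULES)
print(len(pairs))  # 6 ordered pairs for three modules
```

If modules are allowed to talk to each other through their service layer, the forbidden pattern would have to be narrowed accordingly, for example to the sub-modules only.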
Side note: In case you ask yourself how the hell you should make sure that the database only uses the domain model, but not vice versa, you can get inspiration from the cosmic python book (ORM depends on model and the repository pattern). Depending on your preference, you could also split the repository definition into an interface, which goes into the domain module, and an implementation of that interface, which resides in the data or db module. The domain model would then exclusively use the interface. The app can instantiate an implementation and supply it to the domain model as an argument, effectively decoupling the domain and data layers.
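A minimal sketch of that interface/implementation split, assuming a `typing.Protocol` as the interface. All names (`Order`, `OrderRepository`, `InMemoryOrderRepository`, `add_seats`) are invented for illustration, and everything lives in one file here only to keep the example runnable; in a real app the three parts would sit in the domain module, the data module and the app respectively.

```python
from dataclasses import dataclass
from typing import Protocol


# --- domain module: plain objects plus the repository interface ---
@dataclass
class Order:
    order_id: str
    seats: int


class OrderRepository(Protocol):
    def get(self, order_id: str) -> Order: ...
    def add(self, order: Order) -> None: ...


def add_seats(repo: OrderRepository, order_id: str, extra: int) -> Order:
    """Business logic that only knows the interface, not the database."""
    order = repo.get(order_id)
    order.seats += extra
    repo.add(order)
    return order


# --- data module: one possible implementation (in-memory stand-in for a DB) ---
class InMemoryOrderRepository:
    def __init__(self) -> None:
        self._orders: dict[str, Order] = {}

    def get(self, order_id: str) -> Order:
        return self._orders[order_id]

    def add(self, order: Order) -> None:
        self._orders[order.order_id] = order


# --- app: wires the implementation and the domain logic together ---
repo = InMemoryOrderRepository()
repo.add(Order("A1", seats=2))
updated = add_seats(repo, "A1", extra=1)
print(updated.seats)  # 3
```

With this layout the domain code never imports the data module, so a rule like the one for architecture A keeps passing.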
Architecture has two natural enemies: laziness and architecture astronauts
The rules above are pretty global, on purpose. We just want to define a few rules, the boundaries, in our application. Just enough to keep its architecture clear and avoid surprises.
Additionally, these rules can help when you want to use your app as a library, too. If you think: a command-line interface (CLI) would be nice to have, but I don’t want to start all the web-server machinery just to run a simple command, pytest-archon can help you. You build rules that prevent your CLI from importing web-server-related code.
To return to the title of this section:
please don’t overdo it. pytest-archon is a crash barrier!
Think about pytest-archon as guard rails for (or against?) driving down the cliff. Don’t fall into the trap of trying to nail down every aspect of your code/module structure.
The opposite of too much restriction is laziness: you find a bug in the domain module. Easy to solve, you think: simply reuse a function from the database module. However, the database module contains specific code for database communication and is not supposed to be imported everywhere. The quick hack of importing the database module in the domain module would make the logic dependent on the database. You’ll only need a few of those “quick fixes” for your code base to become a mess.
Conclusion
pytest-archon is a convenient way to write architectural boundaries, simply in Python. No need to learn a special syntax. No YAML files that live out of reach of formatting and linting. You can guide new developers in the right direction and keep your laziness at bay.
pytest-archon can be found in the Python Package Index (PyPI). Sources are on GitHub. If you find an issue, please tell us. If you like it, tell others 😊.
(You can find this post also on the personal blog of Joachim Bargsten)