Hexagonal architecture in Webnode
If you attended the OpenDay in September 2022, you may have seen my lecture on hexagonal architecture. We did not want to teach and explain the theory, but to show how we understand this approach and how we implement it in Webnode. Twenty minutes is not a lot of time, and not everything fits into a PowerPoint presentation. That is why we decided to break everything down a little more in this article. It might help someone understand what hexagonal architecture brings and how it can look in production. Maybe someone will point out that we are doing it wrong. Feel free to leave a comment; we will be happy to learn from you 😊
What is this all about
"Hexagonal architecture"/"Ports and Adapters" (Alistair Cockburn), "Onion architecture" (Jeffrey Palermo) or "Clean architecture" (Robert C. Martin). There are different names from different authors for similar approaches. One of the goals of these architectures is to separate application code and infrastructure.
Changing the database system must not affect our main application logic. The discount on an order is always calculated the same way, regardless of whether it is calculated for display on the web or in a cron job. Whether the order is saved in a MySQL or a PostgreSQL database does not affect how we work with the order in the main code. Changing the name of a database key must not cause changes across the application. These sound like basic rules everyone follows, but in my experience this is not the case.
Most people have probably worked with a legacy application (or maybe even a newer one 😊) where a field loaded from a DB was happily passed through the application and maybe even returned directly as an API result. Any change to the DB is then practically impossible, because it is immediately reflected in the output of the application. Bloated controllers (or presenters) full of business logic, tied to HTTP so tightly that they are untestable. A stray POST variable from which the repository takes the data for SQL. These are all issues that can make changes to core business logic much harder. The same goes for testability and code reusability.
Hexagonal architecture
Alistair Cockburn came up with the term hexagonal architecture (or Ports and Adapters) in 2005. The goals of this architecture include the possibility to control the application in different ways (user, other program, automatic test, ...) and the possibility of developing the application in isolation from the infrastructure on which it will run.
At the core of the entire application is "Application". It represents the most important business logic, the main thing that earns us money. On the "Application" boundary are "Ports". A "port" is an interface that allows access to an application's functionality or provides resources to an application. We can connect various data sources, services or application drivers to these interfaces. "Adapters" are specific technologies that the application uses or ways of controlling the application. It can be a database, an API, an HTTP controller, a CLI command, etc. For more information on the theory of hexagonal architecture, it is best to go directly to the source: https://alistair.cockburn.us/hexagonal-architecture/.
Ports and Adapters can be divided into controlling/primary and controlled/secondary. This view is nicely illustrated and described by Juan Manuel Garrido de Paz (https://jmgarridopaz.github.io/content/hexagonalarchitecture.html). I find it easier to understand because it corresponds with how I am used to thinking about the application. Classes in our applications are structured in this way.
Someone might wonder why it is called hexagonal architecture and why a hexagon is shown in the drawings. It has no deeper meaning. The shape was chosen to have enough space around it to draw the various ports and adapters. Or at least that is one of the explanations given.
Explicit architecture
I briefly described the hexagonal architecture above. The approach we are using builds on the hexa architecture but extends it slightly. The author is Herberto Graça, and he refers to it as "Explicit architecture".
The main thing we took from this approach is a clearer separation of "driving" and "driven" adapters. Let us look at the controller for processing the HTTP request and the repository connected to the database. We could say that both are adapters connected to the infrastructure and both could be in the same place in the application. However, it differs from the application point of view. If I want to know in which ways the application is used (HTTP, CLI, ...), I do not care so much about what resources it uses, what is underneath it. For this reason, and to make it easier for us to divide classes into directories, we separate both types of adapters. You can read more on the author's website: https://herbertograca.com/2017/11/16/explicit-architecture-01-ddd-hexagonal-onion-clean-cqrs-how-i-put-it-all-together/
Domain driven design - how does it relate to hexa architecture?
Most of us have probably come across the term Domain Driven Design. It is an approach to modeling the domain of the application: the most important part of the company's business, the core of what we do. The hexagonal architecture makes it possible to separate the main domain of the application from the technical environment. It opens the way for us to use DDD in our applications. The domain we are modeling fits into the "Application" part. Ports can be written to match the ACL (anticorruption layer) pattern from DDD.
At Webnode, we are learning to think about our applications according to DDD and to model our classes that way. The application part represents the Domain of individual applications. Our "microservices" correspond to the "Bounded contexts" we have. We are trying to implement a "Ubiquitous Language" (my favorite phrase) so that everyone can understand each other. Learning to work with DDD is not easy, and rewriting existing applications is even harder. Hexagonal architecture is an approach to get you started. In our applications and in the rest of this article, the terms Application and Domain are used interchangeably.
What do our applications look like
Enough theory. Let us see some pieces of the application and code that we recently created from scratch. In the new application it was easier to write everything using hexa architecture and DDD. In the examples we will go from "Primary Actors" to "Secondary Actors", from left to right on the diagram. It starts with reception of the request, its processing, the use of the DB, etc. To make everything less complicated (hopefully), I will continue to call the ports and adapters "actors" as these describe our application better.
Primary actors - Driver adapters
"Primary actors" handle how our application is interacted with, they handle the inputs. In our application, these are controllers for processing HTTP API requests, CLI commands and classes that process events. The inputs of these classes depend on the technology (HTTP request, CLI arguments, Kafka event).
The input data is converted to a technology-independent domain object, and a class from the application layer is called with this object. That class uses domain classes to execute the business logic. The output from the application layer is a domain object, which is converted here to the output corresponding to the technology. For example, the entity is converted to an array, which is then passed to the JSON output of the API. The entity itself has no idea how it is formatted, what its output keys are called, etc. This layer handles that. To separate responsibilities and allow better testing, the Controller does not do everything. We like to use transformers to create outputs: a simple class, with a simple test, that creates an array from an entity.
In the "driver" adapters, we also deal with the conversion of exceptions. The application throws domain exceptions that are relevant to the application. Error messages and codes are application-specific, technology-independent. For example, the repository will never throw us a NotFoundException with code 404, which we would subsequently throw in the controller as an output from the application. If the same code were to be used, for example in a CLI command, the 404 code would have no meaning. Therefore, the conversion to error messages/codes is again solved in each adapter separately according to its technology.
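The conversion of exceptions per adapter could be sketched as follows. This is a minimal illustration in Python, not code from the Webnode applications; `DomainNotFound`, `get_domain_detail`, `http_adapter` and `cli_adapter` are hypothetical names, and the use case is a stand-in that always fails so that the conversion is visible.

```python
import sys


class DomainNotFound(Exception):
    """Domain exception: it carries no HTTP status and no CLI exit code."""


def get_domain_detail(name: str) -> dict:
    # Stand-in for a use case; it always fails so the conversion is visible.
    raise DomainNotFound(f"Domain '{name}' does not exist")


def http_adapter(name: str) -> tuple:
    """Driver adapter for HTTP: maps the domain exception to a status code."""
    try:
        return 200, get_domain_detail(name)
    except DomainNotFound as error:
        return 404, {"error": str(error)}


def cli_adapter(name: str) -> int:
    """Driver adapter for the CLI: maps the same exception to an exit code."""
    try:
        print(get_domain_detail(name))
        return 0
    except DomainNotFound as error:
        print(f"error: {error}", file=sys.stderr)
        return 1
```

The same domain exception ends up as status 404 in one adapter and as exit code 1 in the other, while the use case itself knows about neither.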
Application – the main part
Driver ports
"Driver ports" are on the boundary of the "Application" layer. They are classes called from "Driver adapters" (from a controller or a CLI command) or from another "driver" port. They represent the ways we can use the app and what functionality it provides. We call these classes "UseCase", and they live in the "application" namespace. While it may be confusing at first that our "application" namespace is only part of the hexagon's center, it helps us to make the application easier to understand.
A "Driver" port can only use classes from the "application" namespace (another driver port) or from the "domain" namespace. These classes know nothing about the infrastructure or the surrounding world.
Driven ports
This is really the main part of the application, the core of the hexagon. These classes tie everything together. That is why we refer to this part as "domain". It contains entities, domain services, exceptions, etc. Classes in this layer can only use other parts of the domain. Something from another layer must never occur here; the domain is independent. Its own classes, however, appear in all other layers.
The interfaces that are in the "domain" layer correspond to the "driven" ports on the right side of the hexagon. They will be connected to database implementations, APIs, SDKs, etc.
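As a rough illustration of what can live in the "domain" layer, here is a minimal Python sketch with an entity, a domain exception and a driven port. The names `DomainDetail`, `DomainNotFound` and `DomainRepository`, and the `expires_on` field, are assumptions made for this example only.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DomainDetail:
    """Domain entity: data and behaviour, with no DB or HTTP knowledge."""

    name: str
    expires_on: date

    def is_expired(self, today: date) -> bool:
        return self.expires_on < today


class DomainNotFound(Exception):
    """Domain exception, free of HTTP or SQL error codes."""


class DomainRepository(ABC):
    """Driven port: the domain states what it needs, not how it is done."""

    @abstractmethod
    def get_by_name(self, name: str) -> DomainDetail:
        """Return the entity or raise DomainNotFound."""
```

Nothing in this sketch imports anything from outside the standard library or from another layer, which is the property the domain is supposed to keep.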
Secondary actors – Driven adapters
"Secondary actors" are on the right side of the hexagon. We place them in the "infrastructure" namespace in the application. They are specific implementations for DB, API, SDK, queue, etc. and implement the domain interfaces I have mentioned before. This layer only uses other classes from the infrastructure or from the domain. They cannot see the inputs of the application. Care must be taken to ensure that no untranslated exception gets from the infra layer further into the application (common PDOException, GuzzleException or exceptions from packages). In the same way, no foreign object may be returned from the infrastructure layer to application or domain. We want to ensure independence from classes which do not belong to the application.
Who will connect the driven ports and adapters?
From the example and description of our application we can see that everything is separated. Layers are connected only by interfaces, and nowhere in the code do we see what is behind the interface or what kind of technologies are used. What decides this is the Dependency Injection Container (DIC).
The DIC connects a concrete implementation from the infrastructure layer with the domain interface. In theory, it is possible to replace the technology by changing only this one place in the configuration. The rest of the application does not know anything about it.
We do not often replace one database engine with another. What we do more often is, for example, adding caching. In that case, we just create a new infra class with a cache adapter which implements the same domain interface. Subsequently, we change the bind in our DIC, and everything works the same, only now with caching.
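The idea of binding an interface to an implementation in one place can be sketched with a toy container. This Python example is only an assumption about the general mechanism, not the container or configuration used at Webnode; `Container`, `MySqlDomainRepository` and `ApiDomainRepository` are hypothetical, and the repositories return strings instead of running real queries.

```python
from typing import Callable, Dict


class DomainRepository:
    """Driven port (domain interface)."""

    def get_by_name(self, name: str) -> str:
        raise NotImplementedError


class MySqlDomainRepository(DomainRepository):
    def get_by_name(self, name: str) -> str:
        return f"{name} (from MySQL)"  # stand-in for a real query


class ApiDomainRepository(DomainRepository):
    def get_by_name(self, name: str) -> str:
        return f"{name} (from API)"  # stand-in for a real API call


class Container:
    """Toy dependency injection container: maps an interface to a factory."""

    def __init__(self) -> None:
        self._bindings: Dict[type, Callable[[], object]] = {}

    def bind(self, interface: type, factory: Callable[[], object]) -> None:
        self._bindings[interface] = factory

    def get(self, interface: type) -> object:
        return self._bindings[interface]()


container = Container()
container.bind(DomainRepository, MySqlDomainRepository)
before = container.get(DomainRepository).get_by_name("example.com")

# Replacing the technology touches only this one binding.
container.bind(DomainRepository, ApiDomainRepository)
after = container.get(DomainRepository).get_by_name("example.com")
```

Code that asks the container for `DomainRepository` stays the same before and after the rebinding.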
An example of processing a request to get a domain
To get a better idea of how a request travels through our application, let us show an example of code that returns a domain detail through the API.
Driver adapter – Controller
If the input data were more complex, we would create an application (or domain) object that we would pass on. We never pass an HTTP object or field directly. We do not want the app to depend on the keys we get from the API.
Domain exceptions that could be thrown by UseCase will be converted to exceptions that have a meaning in HTTP or directly to error responses. They may contain HTTP error codes here.
The output from the UseCase is processed by the Transformer, which usually creates an array from a domain object. This array is passed to a class which creates the response output for HTTP (usually JSON).
We can use framework classes in the Controller. The rest of the application is framework agnostic, independent. This reduces the chance that a change in the framework will force a change in the domain logic. Any change should be easier and less risky. In contrast we might lose some benefits of the framework in the rest of the code. But we see better benefits in limiting usage of the framework in the code.
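A controller along these lines might look like the sketch below. It is written in Python without any framework and is not the original Webnode code; `DomainController`, `GetDomainDetail`, `DomainDetailTransformer`, the `expires_on` field and the `expiresOn` output key are all hypothetical, and the use case is a dict-backed stand-in.

```python
import json
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DomainDetail:
    name: str
    expires_on: date


class DomainNotFound(Exception):
    """Domain exception thrown by the use case."""


class GetDomainDetail:
    """Stand-in use case backed by a dict, only to keep the sketch runnable."""

    def __init__(self, known: dict) -> None:
        self._known = known

    def execute(self, name: str) -> DomainDetail:
        if name not in self._known:
            raise DomainNotFound(name)
        return self._known[name]


class DomainDetailTransformer:
    def transform(self, detail: DomainDetail) -> dict:
        return {"name": detail.name, "expiresOn": detail.expires_on.isoformat()}


class DomainController:
    """Driver adapter: the only class here that knows about HTTP."""

    def __init__(self, use_case: GetDomainDetail,
                 transformer: DomainDetailTransformer) -> None:
        self._use_case = use_case
        self._transformer = transformer

    def get(self, query: dict) -> tuple:
        # Only a plain value crosses the boundary, never the request itself.
        name = str(query.get("name", ""))
        try:
            detail = self._use_case.execute(name)
        except DomainNotFound:
            # The HTTP status code is decided here, not in the domain.
            return 404, json.dumps({"error": "Domain not found"})
        return 200, json.dumps(self._transformer.transform(detail))


controller = DomainController(
    GetDomainDetail({"example.com": DomainDetail("example.com", date(2025, 1, 31))}),
    DomainDetailTransformer(),
)
```

Calling `controller.get({"name": "example.com"})` returns a status code and a JSON string; a name the use case does not know yields status 404.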
Driver port – UseCase
The UseCase class is pretty simple in this case. This is just a call of the domain interface. Sometimes the UseCase class can be unnecessary, and the domain interface can be called directly from the controller. But UseCase creates a better overview of what the application can do. Everyone in the development team can better see what "capabilities" the application has and what can be used in the "driver" adapters.
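A use case this thin could be sketched as below. The Python code is illustrative only; `GetDomainDetail`, `DomainRepository`, `DomainDetail` and the stub implementation are hypothetical names chosen for this example.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainDetail:
    name: str


class DomainRepository(ABC):
    """Driven port, declared in the domain layer."""

    @abstractmethod
    def get_by_name(self, name: str) -> DomainDetail:
        ...


class GetDomainDetail:
    """Driver port (use case): one capability the application offers."""

    def __init__(self, domains: DomainRepository) -> None:
        self._domains = domains

    def execute(self, name: str) -> DomainDetail:
        # Nothing more than a call to the domain interface in this case.
        return self._domains.get_by_name(name)


class StubDomainRepository(DomainRepository):
    """Throwaway implementation so the sketch can run."""

    def get_by_name(self, name: str) -> DomainDetail:
        return DomainDetail(name)
```

Even though the class only delegates, its name documents a capability of the application that any driver adapter can call.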
Driven adapter – Repository
A repository is a specific implementation of a domain interface. It is used to implement specific infrastructure code (MySQL, Elastic, etc.). It can also hide some legacy implementation, as in the example. This involves more complex calls to different APIs. The main application does not know about this at all. If we manage to replace it with a newer solution in the future, the replacement will simply take place here. The new application is shielded. This is also useful when refactoring to hexa architecture. These interfaces can help us separate new code from the old one.
The important thing here is to convert all exceptions to domain ones. In the same way, we cannot simply return objects and arrays. Both objects and arrays need to be converted to domain ones so that the application is not dependent on an external implementation.
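The translation of foreign exceptions and raw arrays could be sketched like this. The Python example assumes a hypothetical legacy client (`LegacyDomainClient`) with its own exception (`LegacyApiError`) and its own keys (`domain_name`, `expiry`); none of these come from the real Webnode code, and treating every vendor error as "not found" is a deliberate simplification.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DomainDetail:
    name: str
    expires_on: date


class DomainNotFound(Exception):
    """Domain exception."""


class LegacyApiError(Exception):
    """Stand-in for a vendor exception (think PDOException or GuzzleException)."""


class LegacyDomainClient:
    """Hypothetical legacy client that returns raw dicts with its own keys."""

    def __init__(self, rows: dict) -> None:
        self._rows = rows

    def fetch(self, name: str) -> dict:
        if name not in self._rows:
            raise LegacyApiError(f"no record for {name}")
        return self._rows[name]


class LegacyDomainRepository:
    """Driven adapter; in a full application it implements the domain interface."""

    def __init__(self, client: LegacyDomainClient) -> None:
        self._client = client

    def get_by_name(self, name: str) -> DomainDetail:
        try:
            row = self._client.fetch(name)
        except LegacyApiError as error:
            # Simplification: every vendor error is treated as "not found".
            raise DomainNotFound(name) from error
        # Vendor keys and formats stop here; only a domain entity leaves.
        return DomainDetail(
            name=row["domain_name"],
            expires_on=date.fromisoformat(row["expiry"]),
        )
```

Callers of `get_by_name` only ever see `DomainDetail` and `DomainNotFound`, so the legacy client can later be swapped out inside this one class.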
Driver output – Transformer
As mentioned, the transformer takes care of formatting the application's output. It converts the domain entity into the required structure. In this case we create arrays with keys for the JSON output of the API. The domain entity does not know about this; it does not have any "toArray" or similar functions. If some key of the output needs to change, we do not want to make any change to the entity. In this way the entity is shielded from the API output.
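A transformer of this kind can be very small. The following Python sketch uses a hypothetical `DomainDetail` entity and hypothetical output keys (`name`, `expiresOn`) to show where the naming and formatting of the output lives.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DomainDetail:
    """Domain entity: note the absence of any to_array-style method."""

    name: str
    expires_on: date


class DomainDetailTransformer:
    """Driver output: owns the key names and formats of the API response."""

    def transform(self, detail: DomainDetail) -> dict:
        return {
            "name": detail.name,
            # Output key and date format are API concerns, decided only here.
            "expiresOn": detail.expires_on.isoformat(),
        }
```

Renaming `expiresOn` in the API would change this class and its test, and nothing in the entity.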
What does hexagonal architecture bring us? What benefits do we see?
Implementing hexagonal architecture can look complicated. Sometimes it seems like we are creating a lot of code unnecessarily. When the SDK returns to us its "beautiful" objects with data and functionality in the infrastructure layer, why write our own, which will only be able to do part of it? Why catch every exception? Why not keep the format of the entity at the output, or take a field from the API input and process it directly for insertion into the database? I will try to mention at least some of the advantages that we see.
Testability
Writing tests is important to us and we really try to do it, not just say it. The main code of our application, the domain, is completely independent of the infrastructure on which it runs. The domain interfaces are clearly defined contracts which say how the application will behave. Even with mocks or test implementations, we can test the main logic easily.
Implementing new requirements using the TDD (test driven development) is simple. I will start with the logic itself that is required and prepare the interface. I do not care about where the request comes from, which database gives me the data or how the response will be formatted on the output. I focus on the essentials. In the end, I will add the infrastructure implementation and call of my use case and connect everything in DIC.
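Testing against a test implementation of a domain interface might look like the sketch below. It uses Python's standard `unittest` module; `InMemoryDomainRepository`, `GetDomainDetail`, `DomainDetail` and `DomainNotFound` are hypothetical names, and no database or HTTP layer is involved.

```python
import unittest
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainDetail:
    name: str


class DomainNotFound(Exception):
    """Domain exception."""


class InMemoryDomainRepository:
    """Test implementation of the domain interface, no database needed."""

    def __init__(self, details: list) -> None:
        self._by_name = {detail.name: detail for detail in details}

    def get_by_name(self, name: str) -> DomainDetail:
        if name not in self._by_name:
            raise DomainNotFound(name)
        return self._by_name[name]


class GetDomainDetail:
    """Use case under test."""

    def __init__(self, domains) -> None:
        self._domains = domains

    def execute(self, name: str) -> DomainDetail:
        return self._domains.get_by_name(name)


class GetDomainDetailTest(unittest.TestCase):
    def setUp(self) -> None:
        repository = InMemoryDomainRepository([DomainDetail("example.com")])
        self.use_case = GetDomainDetail(repository)

    def test_returns_known_domain(self) -> None:
        self.assertEqual(DomainDetail("example.com"),
                         self.use_case.execute("example.com"))

    def test_unknown_domain_raises_domain_exception(self) -> None:
        with self.assertRaises(DomainNotFound):
            self.use_case.execute("missing.com")
```

The test case can be run with `python -m unittest` once it is saved in a module; the infrastructure implementation can be written later without touching these tests.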
Before I got used to this approach, I thought it might be slow and unnecessarily demanding. But once you run code created this way and everything works on the first try, and the code review attracts fewer comments because the code is clearer, you find that it is worth it.
Substitutability
As I mentioned, replacing an entire database system or framework is not something we do often. Still, it is good to have the application prepared for it.
We came to appreciate our architecture when a new microservice was created for a certain part of our team's domain (the invoices domain). Until then, applications accessed the invoices database directly, but now they had to use an SDK for the new REST API. In applications with hexa architecture, this was done simply by adding a new interface implementation and changing the DIC.
The second case where it comes in handy quite often is when we are adding cache. At the beginning, we might not know which data are worth caching and which are not. But as soon as we find out that a query to the database or another application is demanding and unnecessarily slows down our application, we want to add caching. In such a case, we will again add a new implementation of the domain interface. This will contain an interface for accessing the caching system and the current implementation of the domain interface. The old implementation is just wrapped around by caching and DIC is adjusted. Again, the application knows nothing, the working and production-verified code did not have to change, and we are done.
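Wrapping an existing implementation with caching can be sketched as a decorator over the domain interface. In this Python example a plain dict stands in for the caching system, and `SlowDomainRepository` and `CachedDomainRepository` are hypothetical names; a real cache would also need expiry and invalidation, which are left out here.

```python
from abc import ABC, abstractmethod


class DomainRepository(ABC):
    """Domain interface (driven port)."""

    @abstractmethod
    def get_by_name(self, name: str) -> str:
        ...


class SlowDomainRepository(DomainRepository):
    """Existing, production-verified implementation; it stays untouched."""

    def __init__(self) -> None:
        self.calls = 0

    def get_by_name(self, name: str) -> str:
        self.calls += 1  # counts how often the expensive lookup runs
        return f"detail of {name}"


class CachedDomainRepository(DomainRepository):
    """New implementation of the same interface, wrapping the old one."""

    def __init__(self, inner: DomainRepository, cache: dict) -> None:
        self._inner = inner
        self._cache = cache  # a dict stands in for a real cache client

    def get_by_name(self, name: str) -> str:
        if name not in self._cache:
            self._cache[name] = self._inner.get_by_name(name)
        return self._cache[name]


slow = SlowDomainRepository()
repository = CachedDomainRepository(slow, cache={})
first = repository.get_by_name("example.com")
second = repository.get_by_name("example.com")
```

Both calls return the same value, while `slow.calls` shows that the wrapped implementation ran only once.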
Code structure
Thanks to the hexagonal architecture, we can create a more clearly divided code. Individual layers are in separate namespaces. The main application code is separated from inputs to the application and from accesses to other systems.
If I come to an application as a new developer, I can see more easily the ways in which the application can be called, thanks to the separation of "Driving" adapters (controllers, commands, ...). In the "application" section, I have use cases that help me get a better understanding of what the application can do and what I can use it for. In the "infrastructure" I immediately see what technologies the application is connected to. And the main part of the application is separated into the "domain".
Dividing into individual layers also forces us to write smaller classes, separate logic from controllers, prepare object factories, etc.
Domain
By creating the main application logic separately from the rest, we can think more clearly about the application domain and its modeling. We are shielded from implementation details, terminology imposed on us by infrastructure or foreign packages.
Layers independence
Thanks to the separation of layers, it cannot happen to us that a change in the DB (for example, a new column, a change in the name of a column) has a significant impact on the application or its output. The only change that appears is in the infrastructure layer. It will not affect us elsewhere.
Where do we want to move forward?
Extension to all applications and rewriting of old code
We write new applications and new code using hexa architecture. But we have a lot of applications. In most of the applications that we are actively developing, we already have at least the hexa structure in place, even if it only contains a few classes.
We are also gradually rewriting the old code. The advantage is that there is no need to stop development of new features and rewrite the entire application. It can be gradually improved in iterations. Once we need to modify the legacy code, we discuss whether we want to move it, or how much of it we can move. During the development of new features, we move the old code to the new structure with tests and implement the needed changes.
It is a long process, but it allows us to satisfy the business requirements of customers and our need to improve the code and decrease technical debt.
Improve our knowledge of DDD
We are still learning to think about the domain of our company and our applications. When working on tasks and communicating between teams, we introduce the ubiquitous language so that everyone can understand each other. We are trying to model our domain and processes. We still have a lot to learn.
Do not make silly design mistakes
I think we have improved on this since my lecture. It happens less and less often that we miss an infrastructure exception in the application, or we use a field from the database somewhere else. But we will never be perfect.
We managed to use the deptrac tool in one of the applications. It performs static analysis of the architecture: it watches over us and checks that our layers do not get mixed up. We want to extend it to all applications where we have hexa architecture.
Conclusion
What to say in conclusion? We certainly feel in the team that using the hexagonal architecture is paying off. The code is cleaner, more testable, clearer, and easier to maintain. Introducing new features is faster and we make fewer mistakes.
Maybe we're doing something wrong in our approach, we did not understand something correctly, or we adjusted it to our liking too much. I heard from a few former colleagues that everything that is described in hexagonal architecture is clear, and everyone does it. My experience is that this is not the case and having clearly defined rules pays off.
If anyone has a different opinion, disagrees with our approach, please let us know. We can discuss it in the comments, at a meetup, or at an interview in our company 😊. I like to learn something new that will allow us to work better.
Jan Vacula and Tomáš Vaverka