How architecture enables kick ass teams (1): replication considered harmful?
At Xebia we regularly have discussions regarding Agile Architecture? What is it? What does it take? How should you organise this? Is it technical or organisational? And much more questions… which I won’t be answering today. What I will do today is kick off a blog series covering subjects that are often part of these heated debates. In general what we strive for with Agile Architecture is an architecture that enables the organisation to keep moving fast and without IT be a limiting factor for realising changes. As you read this series you’ll start noticing one theme coming back over and over again: Autonomy. Sometimes we’ll be focussing on the architecture of systems, sometimes on the architecture of the organisation or teams, but autonomy is the overarching theme. And if you’re familiar with Conways Law it should be no surprise that there is a strong correlation between team and system structure. Having a structure of teams that is completely different from your system landscape causes friction. We are convinced that striving for optimal team and system autonomy will lead to an organisation which is able to quickly adapt and respond to changes.
The first subject is replication of data, this is more a systems (landscape) issue and less of an organisational issue and definitely not the only one, more posts will follow.
We all have to deal with situations where:
- consumers of a data retrieval service (e.g. customer account details) require this service to be highly available, or
- compute intensive analysis must be done using the data in a system, or
- data owned by a system must be searched in a way that is not (efficiently) supported by that system
These situations all impact the autonomy of the system owning the data.Is the system able to provide the it’s functionality at the require quality level or do these external requirements lead to negative consequences on quality of the service provided or maintainability? Should these requirements be forced into the system or is another approach more appropriate?
Above examples all could be solved by replicating data into another system which is more suitable for meeting these requirements but … replication of data is considered to be harmful by some. Is it really? Often mentioned reasons not to replicate data are:
- The replicated data will always be less accurate and timely than the original data
True, and is this really a problem for the specific situation you’re dealing with? Sometimes you really need the latest version of a customer record, but in many situations it is no problem is the data is seconds, minutes or even hours old.
- Business logic that interprets the data is implemented twice and needs to be maintained
Yes, and you have to compare the costs of this against the benefits. As long as the benefits outweigh the costs, it is a good choice. You can even consider to provide a library that is used in both systems.
- System X is the authoritative source of the data and should be the only one that exposes it
Agree, and keeping that system as the authoritative source is good practice and does not mean that there could be read only access to the same (replicated) data in other systems.
As you can see it is never a black and white decision, you’ll have to make a balanced decision and include benefits and costs of both alternatives. The gained autonomy and business benefits derived from this can easily outweigh the extra development, hosting and maintenance costs of replicating data.
A few concrete examples from my own experience:
We had a situation where a CRM system instance owned data which was also required in a 24×7 emergency support proces. The data was nicely exposed by a number of data retrieval services. At that organisation the CRM system deployment was such that most components were redundant, but during updates the system as a whole would still be down for several hours. Which was not acceptable given that the data was required in a 24×7 emergency support process. Making the CRM system deployment upgradable without downtime was not possible or would cost $$$$.
In this situation the costs of replicating the CRM system database to another datacenter using standard database features and having the data retrieval services access either that replicated database or the original database (as fall back) was much cheaper than trying to make CRM system itself high available. The replicated database would remain running accessible even when CRM system got upgraded. Yes, we’re bypassing the CRM system business logic for interpreting the data, but for this situation the logic was so simple that the costs of reimplementing and maintaining this in a new lightweight service (separate from CRM system) were neglectable.
Another example is from a telecom provider that uses a chain of fulfilment systems in which it registered all network products sold to its customers (e.g. internet access, telephony, tv). Each product instance depends on instances registered in another system and if you drill down deep enough you’ll reach the physical network hardware ports on which it runs. The systems that registered all products used a relational model which was okay for registration. However, questions like “if this product instance breaks, which customers are impacted” were impossible to answer without overheating CPUs in those systems. By publishing all changes in the registrations to a separate system we could model the whole inventory of services as a network graph and easily do analysis on it without impacting the fulfilment systems. The fact that the data would be a (at most) a few seconds old was absolutely no problem.
And a last example is that sometimes you want to do a full (phonetic) text search through a subset of your domain model. Relational data models quickly get you into an unmaintainable situation. You’re SQL queries will require many tables, lot’s of inefficient “LIKE ‘%gold%’” and developers that have a hard time understanding what a query actually intended to do. Replicating the data to a search engine makes searching far easier and provides more possibilities for searches that are hard to realise in a relational database.
As you can see replication of data can increase autonomy of systems and teams and thereby make your system or system landscape and organisation more agile. I.e. you can realise new functionality faster and get it available for your users quicker because the coupling with other systems or teams is reduced.
In a next blog we’ll discuss another subject that impacts team or system autonomy.