Earlier this week Patrick Debois asked a question on Twitter on "to what extend do you consider things like dns, ca, proxies [...] etc as part of API security".
This led to a short discussion on how to deal with the complexity and scope of a threat model, for which I suggested using the concept of bounded contexts and a Wardley map.
Of course this led to more discussion and a request for a real-life example of how to do that. As I promised to come up with something and it probably wouldn't fit in a (bunch of) tweets, I decided to write a blog post about it.
Here be Dragons!
A warning upfront: combining a DDD approach with threat modeling is mostly 'here be dragons' territory. This approach focuses on how to deal with complex systems in a threat modeling strategy. It will not give you the level of detail that techniques like STRIDE provide, but it will hopefully give you a better overview of where to apply STRIDE without getting lost in the dirty details.
Before we dive into this unfamiliar territory, I would like to provide some background that is useful to understand beforehand:
- if you're not familiar with threat modeling, I recommend starting with the Threat Modeling Manifesto
- if you're not familiar with Wardley Mapping, I recommend taking a look at the introduction on the YouTube channel of HiredThought or reading Simon's book on Medium
- I got a lot of inspiration from a presentation by Susanne Kaiser at a DDD France meetup named "Building adaptive systems with Wardley mapping, DDD, and Team Topologies"
Start with a basic map
Let's start with Patrick's original question: "to what extend do you consider things like dns, ca, proxies [...] etc as part of API security?". It is clear where this question is coming from: development teams usually don't have to think much about networking architecture when developing their applications, but once you start to think about security there are potential issues in many network layers, and many of them have an impact on the application itself. On the other hand, you can't bring the whole world into your design, so where do you draw the boundaries?
So let's just try to make a high level overview of what we're dealing with and make a first map. In this example I've used the following assumptions based on real-life experiences:
- the website, mobile app, and API are in-house developed
- DNS, WAF, proxies, load balancers, firewalls, and TLS handling are provided by commercial or open source packages
- certificate authorities and BGP are consumed as commodity/utility
- management of the networking components is typically done by a networking team (or department), while the application parts are handled by a product focused development team
In this map I also ignore the value axis, as it's difficult to define the user (maybe a 'data packet'?), and a correct OSI-based data flow would probably become messy pretty quickly. However, that doesn't really matter for the information we're looking for.
So placing all this information into our first map will produce something like this:
The first insight this map already provides is on the level of control and responsibilities. The further to the right a component is placed, the less control you have over it; meaning you are limited to configurable parts or service requests if you want to change something. At the same time this automatically means that the more something is placed on the left-hand side, the more you are responsible for making it secure.
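The relationship between position and control can be written down as a small data structure. The following is a minimal sketch rather than a definitive model: the stage names follow the usual Wardley evolution axis, the placement of each component follows the assumptions listed above, and the `owner` labels, the "external provider" owner, and the wording of the control levels are illustrative assumptions of this sketch.

```python
from dataclasses import dataclass

# Evolution axis of a Wardley map, ordered from left to right.
STAGES = ["genesis", "custom", "product", "commodity"]

# Assumed reading of "the further right, the less control you have".
CONTROL = {
    "genesis": "full control",
    "custom": "full control",
    "product": "configuration only",
    "commodity": "service requests only",
}


@dataclass(frozen=True)
class Component:
    name: str
    stage: str  # one of STAGES
    owner: str  # team that manages the component (illustrative labels)


COMPONENTS = [
    Component("website", "custom", "product team"),
    Component("mobile app", "custom", "product team"),
    Component("API", "custom", "product team"),
    Component("DNS", "product", "networking team"),
    Component("WAF", "product", "networking team"),
    Component("proxy", "product", "networking team"),
    Component("load balancer", "product", "networking team"),
    Component("firewall", "product", "networking team"),
    Component("TLS handling", "product", "networking team"),
    Component("certificate authority", "commodity", "external provider"),
    Component("BGP", "commodity", "external provider"),
]


def position(component: Component) -> int:
    """Index on the evolution axis: 0 is the far left of the map."""
    return STAGES.index(component.stage)


def control_level(component: Component) -> str:
    """How much you can change about a component, based on its stage."""
    return CONTROL[component.stage]


def most_responsible_first(components: list[Component]) -> list[Component]:
    """Left-most components first: these are the ones you must secure yourself."""
    return sorted(components, key=position)


if __name__ == "__main__":
    for c in most_responsible_first(COMPONENTS):
        print(f"{c.name:22} {c.stage:10} {c.owner:18} {control_level(c)}")
```

Sorting by position gives a rough priority list: the components at the top are the ones where a detailed threat model is entirely your own job.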
Looking for trouble
Now that we have a map, we can start exploring the landscape. Which areas do we need to explore further, and how should we do that? Looking at the opportunities overview, four items are very relevant for identifying possible threats:
- Reduce Bias
- Use Appropriate Methods
- Dispose of Liability
- Refactor Liability
Let's start with the first one, 'Reduce Bias', which basically means 'validate your assumptions'. Bias is typically introduced at locations where handovers or shared responsibilities exist. In our map, the hot spots would be the lines between elements under the control of different teams. Let's mark these with a triangle. In your threat modeling strategy it would be wise to focus on verifying assumptions between the teams and to make sure there are no items where both teams expect the other team to pick it up (as these will be the things nobody does). Once you've identified all promises and assumptions in both teams, you can use that as input for the more detailed threat model sessions within a team.
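Finding these hot spots is mechanical once the lines of the map are written down. Here is a minimal sketch, assuming the ownership split described earlier; the `LINKS` list is a hypothetical set of dependency lines for illustration only and should be replaced with the lines from your own map.

```python
# Owning team per component, following the assumptions listed earlier.
OWNER = {
    "website": "product team",
    "mobile app": "product team",
    "API": "product team",
    "WAF": "networking team",
    "proxy": "networking team",
    "load balancer": "networking team",
    "DNS": "networking team",
}

# Hypothetical lines between components (not taken from the original map).
LINKS = [
    ("website", "API"),
    ("mobile app", "API"),
    ("mobile app", "DNS"),
    ("API", "WAF"),
    ("WAF", "proxy"),
    ("proxy", "load balancer"),
]


def hot_spots(links, owner):
    """Return the lines that cross a team boundary (the 'triangles')."""
    return [(a, b) for a, b in links if owner[a] != owner[b]]


if __name__ == "__main__":
    for a, b in hot_spots(LINKS, OWNER):
        print(f"verify assumptions: {a} ({OWNER[a]}) <-> {b} ({OWNER[b]})")
```

Each line this returns is a place where the two teams should write down what they promise and what they assume, before any detailed threat modeling starts.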
Identify needs and capabilities
The next opportunity is 'Use Appropriate Methods', which is about verifying the maturity of the solutions you use and identifying less risky approaches. To do this analysis, it's important to understand where value is generated for your business, which expertise and knowledge you have in the organization, and which regulations apply to your company. In our map, the positioning of the API would be an interesting item (I marked it with a star). Does the API really provide value, or is it mostly a burden? Do we have the skills and expertise in-house to mitigate all issues? The outcome of this discussion could be that API hardening (which would be the outcome of a standard STRIDE-based threat model) is not feasible, and that to mitigate risks properly it's better to look for a product or commodity type of solution.
A similar discussion can be held on the lower-level networking components. Configuring, hardening, and maintaining DNS servers, load balancers, and firewalls requires expert knowledge, so you have to make sure this knowledge is present at a sufficient level. If this analysis shows there are insufficient capabilities to secure these components, you either have to get more expertise on board or think about a commodity-based approach.
Here, our Wardley map helped us identify locations where we are limited in our capabilities. If threats are identified in these locations, we will have a hard time fixing them and might be forced to accept the risk.
Taking out the garbage
The last interesting opportunities are 'Dispose of Liability' (do we really need a component?) and 'Refactor Liability' (can we make it simpler?). From a security point of view, this basically means challenging your attack surface. In simpler terms: if it doesn't exist, it can't contain problems. In our map we see 3 components that handle HTTP requests: the API, a proxy, and a WAF. It is worth investigating whether this situation can be simplified without losing valuable functionality. Maybe the proxy can also act as a (more limited, but sufficient) WAF. A real-life example would be using the mod_security module on a web server configured to act as a proxy server.
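The same check can be done on a larger map by grouping components by the kind of traffic they handle. This is a minimal sketch under stated assumptions: the `"http"` tags come from the observation above, while the tags for the load balancer and DNS are illustrative guesses.

```python
from collections import defaultdict

# What each component handles. Only the "http" entries come from the map
# discussed above; the other tags are assumptions for illustration.
HANDLES = {
    "API": {"http"},
    "proxy": {"http"},
    "WAF": {"http"},
    "load balancer": {"tcp"},
    "DNS": {"dns"},
}


def overlap_candidates(handles):
    """Group components by traffic type and keep groups with more than one member."""
    by_traffic = defaultdict(list)
    for component, traffic_types in handles.items():
        for traffic in traffic_types:
            by_traffic[traffic].append(component)
    return {
        traffic: sorted(components)
        for traffic, components in by_traffic.items()
        if len(components) > 1
    }


if __name__ == "__main__":
    for traffic, components in overlap_candidates(HANDLES).items():
        print(f"{traffic}: can {', '.join(components)} be simplified?")
```

An overlap is only a candidate for discussion, not a verdict: whether a component can actually be removed or merged still depends on the functionality you would lose.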
Our first journey
At this point we have created a first map and identified some landmarks, so we can start thinking about the journey. The journey we are trying to detail is 'how can we threat model this complex system as efficiently as possible?'.
Based on our Wardley map it becomes clear that before we deep-dive into threat modeling on a detailed level, we should have the following discussions:
- are the expectations and promises between teams totally clear?
- do we have the capabilities in the company to make this system secure enough, or should we take a look at simpler solutions?
- are we actually using all the components, or should we redesign parts first?
The outcomes of these discussions might be that:
- the role of the API is limited and there is not a lot of value in it
- there is limited knowledge of and capability in managing the WAF and proxy layer
- there is very little customization required in the load balancers and DNS servers
Based on these outcomes we can redraw the map to show a situation where threats and risks will be more manageable:
Where to go next?
As I stated at the top, combining DDD and threat modeling is mostly uncharted territory, so the true powers are yet to be revealed. However, with this simple example I hope to have shown some of the possibilities for using this approach to effectively threat model complex situations. At the very least, I have shown how creating a Wardley map can give you quick insight into where biases and assumptions should be validated (always a great source of incidents!) and how you can challenge the existing architecture from a threat and risk perspective.