Using Team Topologies to discover and improve reliability qualities

Team Topologies is the work of Matthew Skelton and Manuel Pais, and I use it as part of my job. From a sociotechnical perspective, a team-first approach is paramount for any organisation and helps to reduce accidental complexity. As such, I’m often asked “How can we operate in a DevOps way?” or “How can I run a reliable service that delivers value to my customers?”.

Read more →

How to succeed at Progressive Delivery

There is a lot of buzz around the practice of Progressive Delivery lately. Rightfully so, as it’s a great addition to continuous delivery. By gradually exposing new versions to a subset of users, you’re further mitigating risks. As usual with new and shiny things, many of us are eager to try it out, learn about the tools and shift to this way of working. But as is also common, there are reasons to do it, reasons to not do it, and reasons you might not succeed at it. Let me save you some trouble by elaborating on a few of them. 
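The core idea of “gradually exposing new versions to a subset of users” can be sketched in a few lines of Python. This is a minimal illustration, not the API of any particular progressive-delivery tool; the function name and parameters are made up for the example. Hashing the user together with the feature name gives every user a stable bucket, so raising the rollout percentage only ever adds users, never flip-flops them.

```python
import hashlib

def in_rollout(user_id: str, feature: str, percentage: float) -> bool:
    """Deterministically decide whether a user sees the new version.

    Hashing user_id together with the feature name assigns every user a
    stable bucket in [0, 100); users below the threshold get the feature.
    """
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 10000 / 100  # stable bucket in [0, 100)
    return bucket < percentage

# At 100% everyone is in; at 0% nobody is. In between, the same user
# always gets the same answer for the same percentage.
assert in_rollout("user-42", "new-checkout", 100.0)
assert not in_rollout("user-42", "new-checkout", 0.0)
```

Because the bucketing is deterministic, a user who is in the rollout at 10% is still in it at 50%, which keeps the experience consistent while you widen exposure.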

Read more →

What if your suppliers don’t deliver anymore?

Introduction

In the current day and age, technology is becoming part of the core business of many organizations. The software landscape is rapidly increasing in size, and the complexity of the systems grows with it. When applications, services, or even the entire IT landscape become unavailable, this has a severe impact on the continuity of the business.
Many companies rely heavily on third parties to run and support their systems; this ranges from integrations with SaaS services to building upon the services of a hosting party that runs the software and maintains the hardware. Without these parties, it is impossible to run your software; but what if they suddenly cannot deliver anymore?
If integrations with third-party software stop working, your application will lose some functionality. In the worst case, your application stops functioning altogether. The same issue presents itself if your hosting party suddenly cannot deliver anymore: if your software is not running, you are unavailable as well. This blog discusses how to prepare for and handle such issues.


Mapping the current situation

The first thing you need to do is make sure that you have an accurate mapping of your existing software landscape. With an accurate representation of the landscape, you can create an overview of all third-party integrations, making sure you do not miss any. Using this overview, you can determine which integrations are responsible for which part of each of your functionalities. Order these from most important to least important, and you know exactly which third-party software matters most to you, allowing you to prioritize properly.
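The ordering step above can be sketched in a few lines of Python. The inventory below is purely illustrative (the provider and functionality names are made up for the example); the point is that once each integration carries an explicit criticality, prioritization is a simple sort.

```python
from dataclasses import dataclass

@dataclass
class Integration:
    provider: str
    functionality: str
    criticality: int  # 1 = most important, higher numbers = less important

# An illustrative inventory, derived from the landscape mapping.
inventory = [
    Integration("PaymentsCo", "checkout payments", 1),
    Integration("GeoAPI", "address autocomplete", 3),
    Integration("MailSender", "order confirmation mail", 2),
]

# Order from most to least important to know where to focus first.
by_priority = sorted(inventory, key=lambda i: i.criticality)
for i in by_priority:
    print(f"{i.criticality}: {i.provider} -> {i.functionality}")
```

Even a spreadsheet with the same three columns serves the purpose; what matters is that the criticality ranking is written down and agreed upon.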


Ask for a statement about their continuity

If you have not done this already, then now is the best time to do it. Once you have a good overview of your current situation, you can start to contact your third-party providers and ask them about their continuity. They will probably have a statement at the ready, which you can use to determine whether it meets your demands. If they do not have such a statement, that is already something to take note of: why do they not have one, and are they prepared to keep the company going even in tough times? For any future third-party service, ask these questions upfront, before you start using it.
Of course, there is still a chance that the company fails to comply with the statement. If that is the case, you will have to make sure that you are ready to handle the issues that arise because of this. A number of solutions for these issues are described below.


Integrating with third-party software

If your application integrates with third-party services, then you need to check if they have an escrow agreement. An escrow agreement allows you to take over the company’s source code if the company goes bankrupt or something else goes fatally wrong. This allows you to run the software yourself until another solution has been found, preventing you from losing the use of that service.
Of course, this is a temporary solution. You do not have any employees who can maintain this software, and you also had a reason for buying it instead of writing it yourself. The code of these companies is often complicated for you to maintain and expand upon, so if this happens, you immediately have to start thinking about new solutions. Finding a replacement for the software is the most natural option; another is to revise the original buy decision and start writing it yourself. However, just keeping the software as you received it will inevitably end up causing problems, such as security issues.

 
Conclusion

Software solutions tend to rely more and more on external factors that you cannot influence as a customer. Therefore, it is important to be aware of the risk these external factors bring and what happens when they become unavailable. Your company needs to have a clear idea of which preventive actions to take and how to respond when a third party becomes unavailable. This blog discussed a number of preventive measures you can take to make sure the unavailability of third parties will not affect your own business continuity.

Multi-product Scrum teams: how do you deal with that?

Multi-product Scrum teams are, in reality, observed often: one team serving different stakeholders and customer segments, both of which would like the same people to work on their improvements.

In most organizations there tend to be more products than teams. Scaling frameworks offer solutions for coping with one big product and orchestrating value delivery among multiple teams. But how do you deal with many products and only a few Scrum teams?

Read more →

Five quality patterns in Agile development

In this blog series, I’ll discuss five quality patterns in Agile development to deliver the right software with great quality.

For years now, companies have been adopting Agile ways of working, mostly using the Scrum framework to develop software. Scrum is all about working in dedicated teams on small increments of working software, software that can potentially be released every single sprint. I’m sure you agree that this means the software is tested every sprint as well; how else could we release it, right? This blog is about a trend I have noticed at many companies: over time, teams have more and more issues delivering quality software that conforms to business requirements.

Read more →

Monitoring AWS EKS audit logs with Falco

Background

AWS recently announced the possibility to send control plane logs from their managed Kubernetes service (EKS) to CloudWatch. Amongst those logs are the API server audit events, which provide an important security trail regarding interactions with your EKS cluster.

Sysdig Falco is an open-source CNCF project that is specifically designed to monitor the behavior of containers and applications. Besides monitoring container run-time behavior, it can also inspect the Kubernetes audit events for non-compliant interactions based on a predefined set of rules.
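To give an idea of what such a predefined rule looks like, here is a sketch in Falco’s rule format for the Kubernetes audit event source. The rule name and condition are illustrative, assembled from Falco’s documented audit-event fields (`ka.verb`, `ka.target.resource`, `ka.target.namespace`), not copied from the setup described in the post:

```yaml
- rule: Pod Created In Kube Namespace
  desc: Detect a pod created in the kube-system namespace
  condition: >
    ka.verb = create and
    ka.target.resource = pods and
    ka.target.namespace = kube-system
  output: Pod created in kube-system (user=%ka.user.name pod=%ka.target.name)
  priority: WARNING
  source: k8s_audit
```

Rules like this fire on the audit event stream rather than on container syscalls, which is what makes Falco useful for spotting non-compliant API interactions.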

Wouldn’t it be nice if you could automatically monitor your EKS audit events with Falco? In this blog post we will show you how to make this work.

Read more →

Going from a Value Stream Map to Value Stream Optimisation

Read this blog if you already have a Value Stream Map (VSM) and you are wondering how to reap its benefits using a structured process. If you need to know why and how you should make a VSM, please read the article ‘How to create a Value Stream Map’ written by my colleague Michiel Sens. Don’t forget to return here.

1. Read your VSM

So, what does a VSM look like? Depending on the linearity of the process, people display it either as activities per role or as a flow of process steps.

Fig 1a. Process steps as a flow

VSM Linear process

Fig 1b. Activities per role
VSM activities per role

Even better is to combine the two approaches to get an idea of work per role as well as the overall process flow.
Read more →

Build and secure containers to support your CI/CD pipeline

There are two systems in any company that are critical: the payroll system and the CI/CD system. Why, you may ask?
If the payroll system doesn’t work, people will leave the company and the company may face legal problems. The CI/CD system is the gateway to production: if it is down and there is a bug in production, it will affect your business; loss of revenue, loss of customers, loss of money, just to name a few.

Usually, I find these problems regarding the CI/CD tooling:

  • Poor software lifecycle management, with outdated software containing critical vulnerabilities
  • Ancient capabilities in the build agents; in extreme cases, frameworks and tools that are no longer supported by the vendors
  • Drifting agents, meaning teams had to do some sorcery to get the software built
  • Lack of proper isolation between different builds, meaning one build could access another build’s files
  • Long lead times for teams to upgrade or install a new framework
  • Outdated and strict rules mandated by an operations team, usually from people with outdated heuristics on how software should be developed
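Several of these problems, notably the outdated software and the drifting agents, can be mitigated by defining the build agent as code with every version pinned. The following Dockerfile is a hedged sketch under assumed tool choices (the base image tag and Node.js version are illustrative), not a prescription:

```dockerfile
# Illustrative pinned build-agent image: every version is explicit, so
# the agent can be rebuilt identically and upgraded deliberately.
FROM ubuntu:22.04

# Pin tool versions instead of installing "latest".
ENV NODE_VERSION=18.19.0

RUN apt-get update && apt-get install -y --no-install-recommends \
        curl ca-certificates xz-utils \
    && rm -rf /var/lib/apt/lists/*

RUN curl -fsSL "https://nodejs.org/dist/v${NODE_VERSION}/node-v${NODE_VERSION}-linux-x64.tar.xz" \
        | tar -xJ -C /usr/local --strip-components=1
```

Because the image is versioned alongside the pipeline definition, a “drifting agent” becomes a reviewable diff instead of undocumented sorcery, and isolation between builds follows from each build running in a fresh container.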

Read more →

Kubernetes in the cloud: the 6 best options

The container wars are over. Kubernetes has won. The fact that Docker even integrates it in its desktop version says enough. But creating and maintaining a K8S cluster is still hard. You need to know a lot about the internals of Kubernetes, like etcd, overlay networking and more. And you need to be an expert in all the components: ingress, configmaps, pods and so on. So think twice before creating and managing your own cluster. Instead, choose one of the managed Kubernetes services.

Running Kubernetes in the cloud

Until a few months ago, your best (and probably only) option to run a cluster in the cloud was GKE. But things have changed, and there are now a lot of viable alternatives, so I decided to write a blog about them. In my blog I cover Google Kubernetes Engine (GKE), Tectonic by CoreOS, Azure Container Service (AKS), OpenShift by Red Hat and Rancher 2.0. All of them are fully managed and take care of upgrading, scaling and monitoring your cluster. And if you really want to run your own Kubernetes, take a look at the various tools that exist to spin up a cluster. These tools are maturing pretty quickly. Just keep in mind: managing a cluster is harder than just creating one!

Read more in the blog post on Instruqt

Learning by doing

If you want to try out Kubernetes yourself, learn more about it on Instruqt. It offers online courses and tracks for DevOps tools and cloud services. By solving challenges, you will learn new stuff by doing, instead of watching videos or following boring tutorials. Try it out for yourself and create an account on Instruqt. And please let us know what you think; we’d love to get your feedback. And if you are interested in using Instruqt in your company, let’s get a coffee!

Kubernetes on Instruqt

A screenshot of the online course for Kubernetes

 

The eight practices for Containerized Delivery on the Microsoft stack – PRACTICE 8: Environment-as-code pipeline and individual pipeline

This post was originally published as an article in SDN Magazine on October 13th, 2017.

During the past year I supported several clients on their journey toward Containerized Delivery on the Microsoft stack. In this blog series I’d like to share eight practices I learned while practicing Containerized Delivery on the Microsoft stack using Docker, both in Greenfield and in Brownfield situations. In this last blog post of the series I want to talk about the environment-as-code pipeline and individual pipelines.
Read more →