hamin.se

The problem that created DevOps

My guess is that if you are reading this, you are somehow involved in an IT organization, which means you are probably familiar with IT Operations. For many companies, IT Operations is now mostly a cost center. Its goal is to keep applications and infrastructure running, but the point that keeping them running actually brings value to the organization's customers is often lost. This is partly because infrastructure and applications quickly become complex, with bad documentation and hacked-together integrations, built to fit both the organization as it was 10 years ago and the one it will be in the future. The value gets lost because IT Operations focuses on removing bugs, fixing errors and supporting the current systems instead of upgrading them to become better.

The funny thing is that the most intricate and broken systems are often the ones that are most important: HR, finance, BI databases, applications that deliver data to all other systems, and so on. When they break, all hell breaks loose.

When a new system or application gets created, IT Operations is usually destined to take care of it, without any insight into the design or the choice of platforms and infrastructure. The stress of onboarding these types of applications, together with the frustration of the developers and of a management that wants to deliver more value, creates a circle of failure, or a downward spiral (DevOps Handbook, Kim/Humble/Debois/Willis). This makes things slower and more costly, makes the organization risk averse, and creates frustration between teams that become more and more siloed.

Devops is a mindset and way of working that aims to solve this.

The solution called Devops

Utilizing Devops creates high-performing organizations that outperform traditional IT organizations. A Devops organization can deliver code faster and more frequently, with better metrics, and can restore from failure, grow and learn faster.

How? By focusing on minimizing deployment lead time, and on feedback loops, pipelines and experimentation.

How to implement Devops

Devops is not just a new role you create, or even worse, a role you rename. It is also not the place where Developers and Operations physically sit. Enforcement of Agile, Lean and CI/CD within an ITIL or ITSM frame can be Devops but often isn’t. Devops is a combination of ways of working, mindsets and structures. We will walk through some of them below.

Sprints and Pipelines

“With Scrum, a product is built in a series of iterations called sprints that break down big, complex projects into bite-sized pieces,” (Megan Cook, Group Product Manager for Jira Software at Atlassian).

Using sprints and breaking projects down into small pieces makes work much more visible than having one big bag called PROJECT X and putting everything in there. The goal is to make it easier to estimate the effort needed for each piece of functionality, to break up large problems and to make implementations visible to the business. This goes hand in hand with development pipelines. A pipeline is, as the name implies, a line along which work gets transferred. It goes from A to B to C and ends up at Z. Work starts at A as source, gets built in B and tested in C, and when it reaches Z it is deployed. The pipeline can often follow the structure of a sprint, where a task moves through Investigate, Development, Testing and Delivered. Making the work go through a sprint that mirrors the structure of the pipeline gives you clearer visibility into the status of tasks and implementations. As with tasks in sprints, a deployment in a pipeline should also aim to break a problem down into small, bite-sized pieces. This decreases the amount of work that goes into each deployment and reduces effort, failure blast radius and troubleshooting time.

Feedback loops

Much of the waste in releasing software comes from the progress of software through testing and operations. For example, it is common to see
Build and operations teams waiting for documentation or fixes
Testers waiting for “good” builds of the software
Development teams receiving bug reports weeks after the team has moved on to new functionality
Discovering, towards the end of the development process, that the application’s architecture will not support the system’s nonfunctional requirements
(Continuous Delivery: Anatomy of the Deployment Pipeline, Humble/Farley)

Software that takes a long time to get into production and software that is buggy have a common cause: the feedback loop between development and operations is too long. The same is true for team management. Imagine a project with a weekly meeting where you discuss the ups and downs of the project, look into ongoing problems and make sure that the project is following the timeline. Problems are noted with action points for specific people and followed up the following week. Quite typical, no?

Now imagine the same meeting, but instead of having it once a week, it happens once a day. Since the time between meetings is so short, you can follow up on issues and their solutions faster. Because the meetings happen every day, you can keep them short and focus only on the current tasks and blockers. And since the daily meetings are short, larger problems can be solved in a separate meeting with the people actually involved in the problem, so that the others can work on their own tasks. Follow-up is every day. Resolutions happen faster, and fewer issues fall through the cracks. This organizational daily feedback loop imitates the feedback loop of pipelines: fast, transparent, often.

The better your feedback loops, the easier it will be for your organization to work on intricate and complicated applications - errors will be detected and removed early, problems will become visible sooner, and validation of the implementation will be part of the development lifecycle.

Experiment and fail often

Compare two companies. Company X has long development cycles and releases its product once or twice a year. Before releasing, they have a big crunch time, sleepless nights and pizza for days. And when they release, they release a big update. Company Y instead does small releases, maybe once a week or once a day. They focus on incremental changes and roll their application slowly forward, creating value according to a prioritized list. When they release, the changes are sometimes so small that the customers don’t notice them.

When Company X fails, they fail big. It is extremely noticeable. The size of the failure is directly correlated with the size of the release and the time spent on it. When Company Y fails, it is barely noticed. Company X cannot afford to fail, while Y fails as often as possible. Company Y puts the feedback loop to good use: when they fail, they use that feedback to make sure the same failure does not happen again. The effort of avoiding the failures is minimal, since the time spent is insignificant compared to Company X.

Which company do you think will output the better product?

Experimentation and cloud

One of the defining selling points of cloud is that it enables you to experiment. Since there is no lead time before you get access to infrastructure or services, and since you pay only for what you use, you can easily spend less than a euro to test a new enterprise service or application. Most services even come with free trials.

Which of the two companies above do you think is more willing to experiment? If you have short lead times and fast releases, you can afford to experiment, and you can also afford failures when you experiment. This is not something you see in organizations like Company X. They are risk averse and tend to keep using what they have always been using.

Being able to experiment often leads to a willingness to improve the processes teams operate within. When there is no time left for anything but putting out fires, we tend to work around problems instead of solving them. If time gets reserved for experimentation, we can let teams organize around fixing those problems. These fixes can be as simple as changing the time of a meeting so that people can come prepared. What matters is that successful improvements get replicated wherever else the same problems are present. The feedback loop comes into play here once again.

Where to start

Devops needs its advocates both from above and below. Your role is to make it possible to work in a Devops manner in your organization. At the same time, you will need to find the people within the teams who are willing to work in a new way, create new toolchains and spread the gospel.

The first movers should be willing to invest their time and should have applications that can easily be moved to a Devops way of developing. Look for early and easy wins where the team structure and application architecture start simple and can expand later. A big bang is not the way to go.

As important as it is to find the advocates, it is also of great importance to find the vocal opponents. They are invested in the current organization, know the intricate structures of its internal social standings and usually have an unofficial mandate to make their own decisions due to previous agreements with higher management. These people will be the last ones to transform, well after you have built critical mass and a majority. They cannot be included early, due to the risk of sabotage and general resistance. Typical teams and roles include Networking, Security and specialist Data administrators.

Five step stepladder to Devops

Set up a meeting and planning structure

Visibility is the key value here. Every team needs insight into its work backlog and needs to highlight the bottlenecks that block delivery. You achieve this with a higher level of communication, both internally and between teams. Start simple: have daily standup meetings that are as short as possible and focus only on current blockers. Have a bi-weekly planning meeting that is used both as a retrospective and for planning the coming two weeks. Change the cadence of these meetings after six months if needed.

Once the teams start talking about what is blocking their work, you’ll quickly discover opportunities to remove work that is unnecessary.

Define why you deliver

To be able to deliver faster, you need to know why you are delivering. This might differ from team to team, but it is important to define it so that the end goal is clear.

Say for example that your organization is working with business to business products, and has a vision to expand outside of the current region. If this vision isn’t clear for the application teams, then the architecture of the applications will not follow the vision.

Working in a Devops manner makes it possible to work iteratively towards the vision and to break the problem down into smaller parts.

Automation for delivery

Delivery needs to be automated. The end goal should be fully automatic delivery, as often as possible. One way of doing this is to use CI/CD pipelines. Start small, with a simple application, and automate building and deploying it. Then add testing, security testing, monitoring and infrastructure deployments, and try to make each step traceable.

The next step is to automate the automation. Make sure that you can bootstrap your solution with a single command. Use parameters and environment variables to set up multiple environments at once.
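As a rough sketch of what a single-command bootstrap driven by environment variables could look like, consider the script below. All names in it are assumptions made for the example: the variables `DEPLOY_ENVS`, `APP_NAME` and `DEPLOY_REGION`, the per-environment override such as `PROD_REGION`, and the placeholder region names. The script only reports what it would set up; a real one would call your infrastructure tooling at that point.

```python
import os
from typing import Dict, List, Mapping


def load_environments(env: Mapping[str, str]) -> List[Dict[str, str]]:
    """Build one config per environment from (hypothetical) variables."""
    names = [n.strip() for n in env.get("DEPLOY_ENVS", "dev").split(",") if n.strip()]
    app = env.get("APP_NAME", "demo-app")
    default_region = env.get("DEPLOY_REGION", "region-a")
    return [
        {
            "name": name,
            "app": app,
            # Optional per-environment override, e.g. PROD_REGION=region-b
            "region": env.get(f"{name.upper()}_REGION", default_region),
        }
        for name in names
    ]


def bootstrap(env: Mapping[str, str] = os.environ) -> List[str]:
    """Set up every requested environment and return what was set up."""
    done = []
    for cfg in load_environments(env):
        # A real bootstrap would create infrastructure here; this sketch only reports.
        done.append(f"{cfg['app']}-{cfg['name']}@{cfg['region']}")
    return done


if __name__ == "__main__":
    for line in bootstrap():
        print("bootstrapped", line)
```

Saved under a name of your choice, say `bootstrap.py`, it could be run as `DEPLOY_ENVS=dev,test,prod APP_NAME=shop python bootstrap.py` to set up three environments in one go.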

Now move on to parallelization. This can be tricky, and you will probably run into multiple race conditions, so as with everything else, start small: run build jobs in parallel first, then artifact creation, and finally deployments.
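A small first step towards parallel build jobs could look something like the sketch below, which uses Python's standard `concurrent.futures` module. The `build` function and the component names are placeholders. The point of the sketch is that each job touches only its own output, which is the simplest way to stay clear of race conditions.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def build(component: str) -> str:
    """Hypothetical build job: returns the name of the artifact it would produce."""
    # Each job works only on its own output name, so jobs share no state.
    return f"{component}.tar.gz"


def build_all(components: List[str], workers: int = 4) -> Dict[str, str]:
    """Run the build jobs in parallel and map each component to its artifact."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, whichever job finishes first.
        artifacts = list(pool.map(build, components))
    return dict(zip(components, artifacts))


if __name__ == "__main__":
    print(build_all(["api", "web", "worker"]))
```

Once this works for builds, the same pattern can be tried for artifact creation and, last, for deployments, where shared state is harder to avoid.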

The last step is to make sure that you can automate based on monitoring input. Look into what you can monitor and automatically solve. Simple things, such as adding more build servers so that no builds are stalled, are a good start. Then look into automating alerts. Lastly, automatically resolve those alerts.
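The build server example can be sketched as a small decision function: take one monitored value, the number of queued builds, and turn it into a scaling action. The thresholds, parameter names and limits below are assumptions chosen for the illustration, and a real setup would read the queue length from your monitoring system and call your infrastructure tooling instead of returning a string.

```python
import math


def desired_build_servers(
    queued_builds: int,
    builds_per_server: int = 2,
    min_servers: int = 1,
    max_servers: int = 10,
) -> int:
    """How many build servers are wanted for the current queue (assumed limits)."""
    needed = math.ceil(queued_builds / builds_per_server)
    # Clamp so we never scale to zero or beyond the assumed budget.
    return max(min_servers, min(max_servers, needed))


def reconcile(current_servers: int, queued_builds: int) -> str:
    """Compare the monitored state with the desired state and pick an action."""
    target = desired_build_servers(queued_builds)
    if target > current_servers:
        return f"scale up by {target - current_servers}"
    if target < current_servers:
        return f"scale down by {current_servers - target}"
    return "no change"


if __name__ == "__main__":
    print(reconcile(current_servers=1, queued_builds=5))
```

The same compare-and-act shape carries over to alerts: first raise the alert automatically, then attach an automatic action to it.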

Fail and learn

The best way to learn is to fail. If you are afraid of failure you will not be able to learn. The important thing here is to have a group where failing is accepted. The meetings you have should not focus on who, but why. The retrospectives should look into how to avoid this in the future and focus on technical improvements that can be done, not looking for someone to blame.

This can be one of the hardest things to do, but if you manage to keep the discussions going without starting a blame game, you are well on your way to a better Devops culture.

Spread the knowledge

How many companies have you worked with that created an organization but, after the initial push to create it, never looked into evolving it? This is especially true of operational organizations such as developer teams and other IT structures.

To be able to evolve, you need to share with and get input from other parts of the enterprise. Look into introducing non-advocates once your organization is stable, and let them see what you have created.

This can be done by holding workshops, hackathons, simple presentations or my favorite: “Tech and Beer”-type presentations. Keep the discussion flowing, note down the input you get and add tasks based on that input to the backlog if they are relevant to your work.

The Enterprise cloud journey - Devops - Tue, Jan 21, 2020