How to mess up DevOps: working at the wrong level of abstraction
There is a story I’ve seen unfold enough times to find disappointing:
A tech company gets its product off the ground with a small handful of developers and a user-friendly fully hosted Platform-as-a-Service (PaaS) solution like Heroku.
The company’s product is a success. A huge one! The company raises money, they scale the team, they iterate. One thing that doesn’t change is the PaaS. It’s working for them. Maybe not as well as they’d like but well enough to keep up with the roadmap.
At some point, costs get way out of hand. The once $1k/month bill has exploded to $40k/month. On top of this, developers are sick of hacking around arbitrary constraints of the PaaS. They learn of the dramatically better performance they can achieve at lower cost if they take greater ownership of their infrastructure.
They engage consultants. Often dubious ones.
The consultants come in with fantastic promises and build with buzzword-y tools. Projects drag on for months. Maybe years. If anything does see the light of day from the consultants’ efforts, it is so bespoke and complicated that simple tasks require several arcane commands that developers can never keep in their heads.
In most cases, I believe that this scenario is a consequence of fundamentally misunderstanding the function of DevOps, and of working at the wrong level of abstraction.
DevOps is about building user interfaces for developers.
If consultants are spending 90% of their time working out lower-level nuances of infrastructure orchestration then they have failed.
Just as the Ruby on Rails framework freed developers from having to make decisions about mundane implementation details that are universal to all web apps, various DevOps/enterprise PaaS solutions free you from having to architect from scratch basic and universal functionality like access management, log shipping, and exposing environment variables to applications.
These solutions include, but are not limited to:
Somewhere in the middle are also hosted platforms like Google App Engine, which give you a PaaS-like experience (“app” abstractions, CLI, etc.) but at better value and with fewer resource constraints than offerings like Heroku.
Working with any such solution is a vastly different experience from working with low-level infrastructure automation tools like Chef, Ansible, the increasingly popular Terraform, or any bespoke CLIs built around tools of this nature.
I find that an instructive litmus test of whether DevOps is delivering for a team is the ease with which a developer (not an infrastructure specialist) can “fork” (copy) an existing app and deploy the copy. With high-level tools like OpenShift or Cloud66 this should be a relatively intuitive and quick task. With more bespoke tools it may not be something the developer is ever able to achieve by themselves.
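To make the litmus test concrete, here is a minimal sketch of what a "fork" looks like when the platform exposes an app-level abstraction. Everything in it is hypothetical and for illustration only: the `App` class, the `fork` function, and the app names and environment values are assumptions, not the API of OpenShift, Cloud66, or any other real product. It only models the idea that a developer names the copy and overrides the few settings that must differ.

```python
from dataclasses import dataclass, field, replace
from typing import Dict


@dataclass(frozen=True)
class App:
    """Hypothetical app-level abstraction; not any real platform's API."""
    name: str
    image: str                                          # build artifact to run
    env: Dict[str, str] = field(default_factory=dict)   # environment variables
    instances: int = 1


def fork(app: App, new_name: str, **env_overrides: str) -> App:
    """Copy an app's definition under a new name.

    The copy inherits everything from the source app. Only the name and
    the explicitly overridden environment variables differ.
    """
    return replace(app, name=new_name, env={**app.env, **env_overrides})


# Illustrative values only.
production = App(
    name="shop",
    image="shop:1.4.2",
    env={"LOG_LEVEL": "info", "DATABASE_URL": "postgres://prod-db/shop"},
    instances=4,
)

# One call: name the copy and change only what must differ.
staging = fork(production, "shop-staging",
               DATABASE_URL="postgres://staging-db/shop")
```

If the equivalent task on your own tooling means editing several infrastructure definitions by hand, that gap is roughly what the litmus test is measuring.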
Fundamentally, if your developers cannot work comfortably with your DevOps solution, the solution is a liability. I hope that this article has shed light on the critical distinction between infrastructure automation and higher-level PaaS, and that it saves organizations from going too far down the road of bespoke DevOps solutions where they may not be truly necessary or well-suited.