From time to time, I put my writings into a story format. All persons and companies below are fictitious, although they are sometimes inspired by the wonderfully colorful characters I've had the pleasure to work with over the years.
"But, we don't want to use MetricsMongrel1," I told Lucy, the head of engineering, shortly after getting hired as a team lead. After a beat, she replied, "Well, that's what Operations is already using." I knew something was off with that statement, but - back then - I couldn't put my finger on it. Now, I understand the contradiction that threw me off.
I advocate for "you build it; you run it" - so much so that this very blog is sub-named in honor of its influence on me. However, I've worked with many tech leaders who have heard of the idea and agree with it, yet have no real-life experience with it; they don't know what it looks like in practice. Because of this, they often fall back on traditional approaches like "Operations has always monitored the apps in the past..." or "QA has always been in charge of testing..." or even (heaven forbid) "we've always run our own servers..."
I help teams choose technologies that let them be autonomous; that is, they have the ability to build, test, deploy, and run their app. This is the chief reason I am “serverless-first” - it lets the development team do what they need to do without relying on outside groups. But, being autonomous extends beyond just the building and testing. To truly own it, those same developers need to run it, and, for that, they need good tools. Over and over, my highest-performing teams bought into ownership more quickly when they could pick the best tools. Get them involved!
So, what does it mean to say you are going to "run it"? It means you must be responsible for things like SLAs and expected behavior. Essentially, you are in charge of the promised customer experience. It means you must detect problems as they arise and you must correct those problems once they’re identified. To have any hope of pulling this off, you must watch it; you must analyze it. You need to know things like:
What is the state of the system right now?
How do I know it?
What exactly happened when we received that error last night?
How do I know that?
To answer these questions, you need great tooling. But, before we decide on the particular tools, we have to decide on ownership - who is answering the questions above: Operations or Development?
The answer is syllogistic[2]:
(major premise): If you build it, then you run it.
(minor premise): Development built the app.
(conclusion): Therefore, Development should run it (surprise!)
If Development is running the app, they should pick the best tool for themselves. If that happens to be MetricsMongrel, so be it. (As a side note, my favorite serverless instrumentation is Lumigo. Go check it out.) The most important[3] thing is that the team doing the "running" is involved in the decision.
"You build it; you run it" means challenging tradition by asking hard questions about ownership. Understand who should be owning each phase of your delivery cycle. Make sure your tooling is aligned with your ownership model. If you do this right, it will be Development all the way down[4]. Your new boss might not yet believe you, but I do.
[1] Names have been changed to protect the innocent.
[2] A bit rationalistic, but it's all in good fun.
[3] Followed closely by "choose the right team to do the running".
[4] Tip-of-the-hat to https://en.wikipedia.org/wiki/Turtles_all_the_way_down