fdbck lps

Pilot licenses


Contronyms

Last year I remarked that pilot projects are contronyms that become a corporate fait accompli the moment they’re given the name. Now that I’ve got a few more pilots under my belt, I have cause to revisit my thoughts!

Here’s some intuition for the argument. For one thing, pilots often require just as much investment as full-scale projects in many cost centres — you only save on scaling, and your sunk costs don’t know that your pilot is a pilot. Second, the non-pecuniary cost of spinning up a pilot project can be big enough that you might be politically exposed if it’s shuttered too quickly. So you should expect from first principles that many pilots will live on, at worst considered a qualified success.

But you might object that you’ve seen pilot projects fail before, and of course they do! So what’s going on here?

What ends up being load-bearing here is the word “pilot”, not the actual project it was intended to represent. So there’s a decrease in accountability that follows from the contronym observation. Your pilot doesn’t fail because it was a bad idea; it fails because it was a pilot — we just didn’t scope it correctly or fund it properly, &c &c. This decouples the failure from the idea that spawned it, rendering the root cause illegible to the system. And this illegibility can cut both ways. Failure is supposed to generate information that we can use to improve things. As Reinertsen says in The Principles of Product Development Flow:

… most companies do a far better job of communicating their successes than their failures. They carefully document successes, but fail to document failures. This can cause them to repeat the same failures. Repeating the same failures is waste, because it generates no new information. Only new failures generate information.

But Reinertsen’s upside is contingent on running your pilot the right way.

Pilots as hedges

Let’s start with the unhappy path: pilot as hedge. If you believe the contronym theory, then the adverse outcomes are easy to spot because they direct us toward that lower-accountability area. There, the illegibility of failure blurs the distinction between learning that something doesn’t work and declining to try at all. So the curse of the hedged pilot is its inversion of experiment logic: now it ensures that we learn as little as possible! It’s kind of like an anti-trial balloon: instead of leaking an idea to see if it has popular support, we launch a pilot to avoid the tough conversation about its validity. Trial balloons may feel duplicitous for good reason, but at least they help test your Overton window. Pilots-as-hedges do the opposite and never manage to get off the ground.

It would be convenient if we could somehow induce organizations to index on the visibility of results. But if the entire point of a hedged pilot is its illegibility, then that won’t work. The system is following the process for its own sake, not because it represents an objectively good or pro-organizational idea. Perversely, this can actually weaken de facto authority, as it becomes progressively more rational to select for illegibility and hedge your pilots, even as it harms the system. Thinking back to the Reinertsen quote, this all leads us to a liminal space where good experimentation is designed to fail often, but the vehicle we use for experimentation could displace failure entirely!

Hedge conditions

How do we interact within such a system? One way to motivate the difference is with a classic quadrant-style approach where we put optionality on one axis and accountability on the other:

  • low-optionality, low-accountability: these projects need to exist — you’ll always need someone answering the phones — but if you’re running one, it’ll be easy enough to operationalize that you won’t need to pilot anything first.
  • high-optionality, low-accountability: the sweet spot for working-level experimentation. Lessened accountability means there’s little need for a pilot’s political cushioning.
  • high-optionality, high-accountability: this is strategic experimentation, where nobody gives you a roadmap and every choice is critical. Pilots aren’t actually useful here, except as the outcome of whatever founder-mode strategic plan you put in place.
  • low-optionality, high-accountability: these projects are everyone’s nightmare: must-win situations where your hands are tied. These are precisely the conditions that produce the pilot.

If you’re in this last quadrant, pilots are escape routes, but they might take you in multiple directions! The “pilot-as-hedge” moves toward low-optionality/low-accountability by shedding accountability without actually taking on anything new. But there’s also a way to move toward high-optionality/low-accountability.
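The quadrant can be made concrete with a small lookup. This is a minimal sketch; the function name and the short labels are my own paraphrases of the four bullets above, not a formal model:

```python
def quadrant(optionality: str, accountability: str) -> str:
    """Map (optionality, accountability), each 'low' or 'high', to a quadrant label.

    Hypothetical helper: the labels condense the four bullets above.
    """
    table = {
        ("low", "low"): "operational: no pilot needed",
        ("high", "low"): "working-level experimentation: little need for a pilot",
        ("high", "high"): "strategic experimentation: pilots only as an outcome",
        ("low", "high"): "must-win with tied hands: where pilots are born",
    }
    return table[(optionality, accountability)]

# The last quadrant is the one that produces pilots:
print(quadrant("low", "high"))
```

A hedged pilot, in these terms, is a move from `("low", "high")` to `("low", "low")`; an experimental pilot moves to `("high", "low")`.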

Pilots as experiments

The obvious counter to the hedged pilot is the experimental pilot. Implementation details are pretty straightforward if you think about how a typical experiment or RCT is run. Just as we’d preregister a data analysis plan, we preregister our pilot’s evaluation criteria, failure conditions, and stopping rules. Likewise, we might borrow from power analysis by calculating the sample size required to detect whether our pilot had a practically significant effect. I’ve never seen pilots do this! Teams often timebox their pilots based on calendar deadlines, not on whether the scope would produce a detectable and falsifiable signal by then. But it’s an easy fix. Finally, even if an experiment doesn’t generate enough evidence to reject the null hypothesis — let’s ignore the problems with NHST long enough for my point to land — we’ve still learned something. Maybe we can’t publish it in an academic journal, but it’s probably suitable for an internal wiki.
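The power-analysis borrowing can be made concrete. Here’s a minimal sketch using the standard normal-approximation formula for a two-sample comparison of means; the effect size and error rates below are conventional illustrative defaults, not numbers from any particular pilot:

```python
import math
from statistics import NormalDist

def n_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate per-group sample size for a two-sample test of means.

    Normal approximation: n = 2 * ((z_{1-alpha/2} + z_{power}) / d)^2,
    where d is the standardized (Cohen's d) effect size.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for desired power
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

# e.g. detecting a "medium" effect (d = 0.5) at the usual 5% / 80% settings:
print(n_per_group(0.5))  # 63 observations per arm
```

If the pilot can’t plausibly collect that many observations before its calendar deadline, the timebox — not the idea — will determine the verdict, which is exactly the falsifiability failure described above.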

This is all well and good, but if hedged pilots are the equilibrium of what’s essentially a tragedy of the commons, then you won’t get much institutional buy-in for an experimental approach. How do we move past this? One interesting thing about experiments is that they’re often blinded — we deliberately withhold information to reduce bias. What I’m about to suggest is organizational heterodoxy, but what if we siloed pilots similarly?

“Throwing it over the wall” is a known antipattern in DevOps. It’s now considered bad form to hand work over to another team without context, design details, or an attuned sense of why we’re solving a problem this way. But what if the wall is a feature for pilots, not a bug? You literally can’t deploy a pilot if you have to throw it over the wall to someone else! And this forces the kind of experiment-driven conversation I mentioned earlier: naturally you’ll start discussing with the downstream team, who will no doubt want to hammer out evaluation criteria and the like. In Team Topologies terms, you’d probably want to run your pilot within an enabling team because of their timeboxed interaction pattern. And this beats the counterfactual every time; when a stream-aligned team runs a pilot, they also own production, which incentivizes the hedge. This means that you bypass the tragedy of the commons problem entirely, because you don’t need to dip into social capital reserves to implement within your division’s own enabling team.

So yes, pilots are contronyms, but only insofar as your org naturally selects for hedges. Throwing your pilot over the wall can cut through the inertia and make the word mean what it’s supposed to. But otherwise, expect the contronym to win.