The call was simple to state and expensive to get wrong. Our legacy MFA — the older multi-factor methods most of the company logged in through every morning — had, in the shape this incident was taking, quietly become a liability instead of a control. Retire it, and thousands of people would have the ground move under them mid-workday. Keep it, and we would be holding a door we no longer trusted.
That fork landed in my lane. This is the cut underneath the piece I wrote on why executive readiness is not a title — not the leadership lesson this time, but the decision architecture itself: how you retire a compromised control across the workforce, in the middle of a live incident, without taking the business down.
The model is four words: Sequence. Stage. Fallback. Brief. The rest of this is what each one means under fire — and the team that actually carried it.
The room, and the edges of my lane
It was 2022. There was no CIO in the seat. There was, thankfully, an experienced CISO running the security and forensics response — the strategy and the investigation were his, and the call to bring in outside forensics was his, not mine. I was glad every day that he was in that chair.
My lane was the enterprise IT response: everything the business depended on to keep operating, held steady while the security team worked the thing in the dark. Most of that work was shared, cross-functional, and slow. But one decision in that lane was unmistakably mine, and it was fast — on the record, with the C-suite, Legal, and the forensics team in the room. We were going to retire our legacy MFA, and we were going to do it without anyone outside a small circle feeling it.
I made that call in record time. The speed was not bravado; it was the product of knowing exactly where my lane ended and someone else’s began.
Making the call is the easy half
Anyone can demand a system be ripped out. The harder part is architecting how you do it so that a workforce of thousands keeps logging in and doing their jobs while the locks change underneath them — and then standing in front of the room to own the plan as yours.
I designed the cutover around four disciplines. They are not clever. They are the entire difference between a controlled change and a second, self-inflicted incident.
- Sequence — move the riskiest population first, under the most supervision, on a path the next wave inherits already proven. - Stage — move in waves small enough to recover, never one global switch. - Fallback — decide in advance what failure looks like and exactly how to revert, before each wave. - Brief — tell the right people before the change, and give the affected population only what it needs to keep working.
Get those four right under fire and a company-wide control change becomes invisible. Get any one of them wrong and you have caused the outage the attacker could not. And the clock matters here in a way worth naming: a compromised authentication control is not a problem that rewards deliberation, but the same urgency that argues for speed is exactly what tempts you into the global switch, the skipped fallback, the migration nobody briefed. You let the clock raise the stakes without letting it make the decisions.
How to retire legacy MFA mid-breach without downtime
Sequence is the order of operations. The riskiest population moves first — and riskiest means highest-privilege and most-observable, so that if something breaks, the failure is both contained and immediately visible. The largest population moves last, down a path the earlier waves have already proven. You are not just deciding what changes; you are deciding the order.
Stage is the refusal to flip one global switch. You move in waves small enough that if a wave goes wrong, the blast radius is a cohort you can recover — not the whole company locked out at once. Size a wave by what you can fully reverse inside one maintenance window. A staged cutover trades a little speed for the ability to stop, and in an incident, the ability to stop is worth more than the speed.
Fallback is the discipline most cutovers skip and most failed cutovers needed. Before each wave, you decide in advance what “this is going sideways” looks like, and exactly how you put the previous state back. Under fire you do not get to invent the reverse gear at 2 a.m. — it has to already be on the shelf.
Brief is the part that decides whether anyone feels the ground move. It has an artifact too: a pre-written communications kit, a single source-of-truth status channel, and the help desk briefed before wave one — not after the tickets land. Communication, in a cutover, is load-bearing. A perfect technical migration that surprises the workforce still reads, to the business, as an outage.
If you are running a cutover with no reverse gear, no staging, or a help desk that finds out when the users do, you have already failed the plan — you just have not met the bill yet.
Why business continuity is part of the security decision
Here is the position I will sign: in a high-stakes security cutover, business continuity is not the constraint on the security decision — it is part of the security decision.
The reflex under pressure is to treat “keep the business running” as the thing slowing down “make us safe.” Those get framed as opponents: security pulls one way, continuity the other, and the brave move is to choose security and let the chips fall. That framing is wrong, and it is expensive. A containment step that takes the business down has not contained the incident — it has opened a second one, an availability incident, on top of the breach you were already fighting, and now you are running two crises with one team. The discipline is not choosing safety over continuity. It is refusing the premise that you have to.
That is the whole reason the four words matter. Sequence, stage, fallback, and brief are how you make the security decision and the continuity decision the same decision.
The unglamorous half nobody photographs
The cutover was the visible call. The work that made it safe was anything but visible.
I put my team onto more than 2,000 service accounts — the non-human logins, the ones used by software rather than a person, that nobody thinks about until one of them turns out to be the open door. Find every one of them. Find who owns it. Close the gap. More than two thousand small decisions — rotate, revoke, re-scope, or verify — executed alongside the remediation the forensics team handed us, in the quiet after the room emptied, long after the dramatic part of the incident had moved on. That was the real texture of it: not the war-room moment, but a long stretch of methodical, exact work on an ordinary Tuesday that no one would ever photograph.
I want to keep the credit honest. I did not bring the security expertise into the room; I was brought into a room that already had it. The forensics were not mine. The security strategy was not mine. What was mine was the enterprise IT lane: the legacy-MFA decision, the cutover architecture, and a team that did the slow, exact, uncelebrated work after the room had emptied. The readiness that mattered was not pretending to be the specialist. It was running my lane well enough that the specialists could move at speed.
Why it held
The cutover held because the boring parts were finished before they were needed. None of sequence, stage, fallback, or brief was invented mid-incident. The pieces had been assembled in calmer rooms — on prior separations, carve-outs, and migrations — and pulled off the shelf when it finally counted. That is the quiet argument under the whole story: you do not rise to the occasion of a live breach. You fall back on the playbooks you wrote when nothing was on fire.
The payoff was not abstract. The workforce stayed productive, the business kept operating, and the compromised control came out — all in the same stretch of days. The thing that retired that control across the workforce without a single person feeling it was not authority and it was not nerve. It was architecture: decided in advance, sequenced honestly, and credited to everyone who actually carried it.
Common questions
How do you retire legacy MFA during a live breach without business disruption?
Architect the cutover around four disciplines: sequence (move the riskiest, highest-privilege population first under the most supervision, the largest population last on a proven path), stage (move in waves small enough to recover instead of flipping one global switch), fallback (decide in advance what failure looks like and exactly how to revert before each wave), and brief (tell the right people before the change and give the affected population only what it needs to keep working). Done together, these let you remove a compromised authentication control across a workforce of thousands while they keep working.
Why is business continuity part of the security decision, not a constraint on it?
A containment step that takes the business down has not contained the incident — it has opened a second one, an availability incident, on top of the breach. That leaves one team running two crises at once. Designing the control change so it both removes the compromised control and keeps the workforce productive is what actually finishes the job; treating continuity as the enemy of safety just trades one incident for another. The two are not opposing pulls — they are the same decision made well.
Who leads a security incident when there is no CIO in the seat?
Responsibility splits by lane. In a 2022 incident with no CIO in seat, an experienced CISO ran the security and forensics response and owned the call to bring in outside forensics. The enterprise IT lane — keeping everything the business runs on standing, and owning the call to retire legacy MFA and architect the cutover — was a separate, clearly bounded responsibility. The discipline is owning the outcome in your lane, crediting the specialists in theirs, and never slowing the room by pretending to be the specialist.
What makes a cutover plan safe enough to execute under incident pressure?
A reverse gear that already exists. The most common failure is a cutover with no fallback — a bet dressed as a plan. Before each wave you must know what ’this is going sideways’ looks like and exactly how to restore the previous state, decided in advance rather than invented mid-incident. Speed is cheap; the harder discipline is moving only as fast as the architecture lets you move safely, letting the clock raise the stakes without letting it make the decisions.