AI and task fragmentation

A while back I came across the time-horizon model of AI task automation. It was a somewhat reassuring piece for people worried about AI job automation.

Essentially, it pointed out that AI isn’t very good at tasks that have long time horizons. If your job is full of small, self-contained tasks (e.g. IT support tickets), then AI is very good for that, and bad for you. If it’s full of tasks that bleed into one another, requiring a lot of context or input that accumulates over time (e.g. a CEO’s average meeting, or the migration of cloud infrastructure for a big company), then AI can help with minutiae but isn’t close to managing the whole job.

Anyway, here’s a paper that is trying to work out how to close that gap:

Production is a sequence of steps that can be executed (1) manually, (2) augmented with AI, or (3) fully automated within contiguous AI-executed steps called “chains.” Firms optimally bundle steps into tasks and then jobs, trading off specialization gains against coordination costs. We characterize the optimal assignment of humans and AI to steps

(PDF version here).

Essentially, the idea is that if you can work out how to make stuff into chains of discrete chunks, AI is going to be able to do more of the stuff.

I worried recently about the verification problem: if you aren’t certain how good a job AI will do, you need a human to verify it. For mammogram-flagging this is cheap. Is the scan weird? Send it to a human. That’s actually the value of using AI in the process. For working out whether a code change in some intricate software is future-oriented, keeping in mind planned features and version changes, you need a verification process that’s not too dissimilar from writing the code yourself. Costly.

Verification, more broadly, can be expanded to any human requirement: a back and forth with someone else on the output, for example. Wherever this kind of break in the chain exists, AI is going to have trouble taking over the job.

Some jobs seem to cluster into better AI-shapes than others. So if I think about what I do, lecturing has a fabulous cluster. Research, slide creation, example generation. All this stuff clusters into a ‘preparation block’. Even the lecture itself can be AI-ified. NotebookLM generates podcasts that wouldn’t be much worse than me on stage. I can just come in at the end, and check everything looks good.

Tutorials are a different matter. It’s essentially the same activities, all of which can be AI-ified. But in a tutorial there’s a lot of live diagnosis and back-and-forth with students. Even though the collection of tasks is similarly AI-shaped, the way they cluster makes one easy to chain, and one harder.

This is one part reassuring, and one part ominous, then. The worrying statistics about how many people could lose their jobs because of AI (e.g. linear exposure indices like this or this) often measure how many tasks can be AI-ified, but aren’t really sensitive to how tasks cluster into AI-shaped chains. So probably fewer people are at risk of losing their jobs. Equally though, I’m not sure I want my job to be relegated to terminal verifier.

Less reassuring is the stuff about O-ring automation. It’s got some fucking bullshit maths in it. I never understand why people always want to express stuff as maths. But fundamentally it says that people allocate their time across tasks. If you automate some of these, people can allocate that time to other tasks, making the outputs of those tasks better. This is written as though it’s a good thing: the focus this gives humans makes humans more valuable. In jobs where the tasks are multiplicative (one task feeds the next, and the quality of the next depends on the quality of the first), having humans freed up for more quality is a good thing, especially where some of those tasks aren’t AI-shaped.

I’m not so sure. If tasks are properly separated, so that humans aren’t slowing AI down by trying to collaborate with it, there might be incentives here to swap quality for fewer labour hours (i.e. fire people).

The obvious case for this, beyond simple greed, is if production exceeds the demand for that production. If you AI-ify stuff so well that you’re making 50% more things, but consumers only want 10% more things, then you need to do something about that extra 40% of productivity. McKinsey will probably tell you that the answer is firing everyone.
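To put rough numbers on that, here is a minimal sketch of the arithmetic. It assumes output scales linearly with labour hours, which is a simplification, and `labour_needed` is a made-up helper name rather than anything from the papers.

```python
def labour_needed(productivity_gain: float, demand_growth: float) -> float:
    """Fraction of the original labour hours needed to meet the new demand.

    Assumption: output scales linearly with labour hours.
    """
    return (1 + demand_growth) / (1 + productivity_gain)


# The example from the text: 50% more output per hour, 10% more demand.
share = labour_needed(productivity_gain=0.50, demand_growth=0.10)
print(f"labour hours still needed: {share:.1%}")
print(f"labour hours left over:    {1 - share:.1%}")
```

Under that assumption, the gap between 50% and 10% works out to a bit over a quarter of the original labour hours being surplus to demand.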

So, two theses in apparent tension: if automation frees up human-shaped time, you could probably fire people, especially if the process increases productivity beyond demand. But equally, some human-shaped tasks are a fixed bottleneck for automation chains, and this might make humans more valuable.

This latter case is especially true given the fact that AI is error-prone. In chains and O-rings alike, an erroneous output will eventually need triage, and where automation is chained, errors might accumulate on top of earlier errors, which would imply more triage. It actually makes the O-ring thing less valuable, because more of that human-quality-time is going to be spent on verification.
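A minimal sketch of why longer chains imply more triage. It assumes each step fails independently with the same probability, which is the friendly case, and the 95% per-step accuracy is a hypothetical number picked for illustration.

```python
def chain_success(step_accuracy: float, n_steps: int) -> float:
    """Probability that an n-step chain finishes with no error at all.

    Assumption: steps fail independently, each with the same accuracy.
    """
    return step_accuracy ** n_steps


# Hypothetical per-step accuracy of 95%, over chains of growing length.
for n in (1, 5, 10, 20):
    print(f"{n:>2} steps: {chain_success(0.95, n):.1%} chance of a clean run")
```

At 95% per step, a ten-step chain comes through clean only about 60% of the time. If an early error makes later errors more likely, the independence assumption breaks and this simple formula no longer applies.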

So, where tasks cluster nicely, we can chain them. Where those tasks are O-ring tasks, this might be good. I spend less time doing busy-work for my lectures, and spend all that time doing better tutorials. But this assumes that companies see it that way too. And that verification costs don’t balloon because of all the chaining, and the associated problem of correlated error. That probably explains some of the uneven adoption across sectors, and it certainly curtails attempts to use AI as a drop-in replacement.

Note: I never quite worked out what the O-ring thing actually referred to. Like… gasket manufacturing maybe? I don’t know. I spent all my time trying to work out the math. It’s your problem if you care.

Note: I realised I could ask an LLM about the O-ring thing. Here’s what it said. “The O-ring is the Challenger. January 1986, the shuttle came apart because one rubber gasket seal in a solid rocket booster lost its elasticity in the cold and failed. Kremer named the theory after it deliberately — and once you know that, the maths you found so annoying stops being decoration and becomes the whole point. Output as the product of task qualities, ∏ q_s, is just the Challenger written down: it doesn’t matter how good the other ten thousand components were, one part at quality ≈ 0 drags the entire product to ≈ 0. That’s why it has to be maths rather than prose — the only thing the formalism is asserting is “multiply, don’t add,” and prose lets you fudge which one you mean. Additive: a botched task costs you that task. Multiplicative: a botched task costs you everything. The entire O-ring-versus-linear-index fight is that single operator. So the “bullshit maths” is literally your bottleneck intuition, stated so it can’t wriggle out. You already believe it; you just resent the notation.” I do resent the notation, yes.
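The "multiply, don't add" point fits in a few lines. This is only a sketch: the quality scores are invented, and averaging stands in for the linear-index view as an assumption, not as how any particular exposure index is actually computed.

```python
import math


def additive_output(qualities):
    """Linear-index view (assumed here to be a plain average of task qualities)."""
    return sum(qualities) / len(qualities)


def multiplicative_output(qualities):
    """O-ring view: output is the product of task qualities."""
    return math.prod(qualities)


# Invented quality scores between 0 and 1 for a ten-task job.
all_good = [0.99] * 10
one_botched = [0.99] * 9 + [0.01]

for label, qs in (("all good", all_good), ("one botched", one_botched)):
    print(f"{label:>11}: additive={additive_output(qs):.3f}  "
          f"multiplicative={multiplicative_output(qs):.3f}")
```

With one botched task, the average barely moves while the product collapses to under 1% of full output, which is the Challenger point.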

btrmt.
