Successful Prophets and other things

Hello,

Here’s everything since my last little missive to you:

I finally finished Successful Prophets. An article that took essentially five years to write. It’s of no particular consequence to you, probably, but this was the article that got me writing again. Made me realise I could use my articles to think through ideas, and make it exciting to write.

What you will care about is that, much like Mundane Cults, it shows just how vulnerable we all are to being collected into high-control groups like cults. That there’s nothing special about them, and nothing that especially protects you from them. In fact, the way we talk about cult leaders, like the way we talk about cults, actually leaves us more vulnerable.

I also republished the sister article, Folie à deux. This was the idea that started my thinking about cult leaders and successful prophets. Written during the height of the pandemic, I think you might be able to read the subtext. A time of widespread human craziness, and again very comprehensible if you look at what happens when people are isolated with just a couple of people to keep them company.

I think, after reading, the new era of digital gurus makes a great deal more sense, as do the more malevolent online communities surrounding them. If anything, it’s surprising that there aren’t more. It’s just the human machine at work. An article for another day, though. For now, I hope you enjoy these.

In other news, I’m writing a book chapter on making AI useful, so lots of marginalia about that, as I collect my thoughts and scattered notes. Worth a skim—the jagged edge of AI is less and less reassuringly far away. This is good for you in the short term—some clear ideas about how to make AI more productive for you. But bad in the long term, because the clear benefit of humans is in jobs that are more fragmentary, and we are getting better at chaining AI across fragments.

Articles:

Longer reads. Me thinking out loud, harassing an idea into shape. Can be a little difficult to read, because I’m teaching myself something, but they often contain my most important thoughts.

Successful Prophets

Main idea: We think of cults as the product of dangerously charismatic leaders but on examination this narrative falls apart. Really, the most successful prophets are not a person, but the followers, who use the leader as an emblem.

Folie à deux: the madness of two

Main idea: Folie à deux is a striking phenomenon, but poorly understood. It seems to me that it might be just one misleading face of social isolation.

Audio (with edited transcript):

Me explaining an idea to others. Usually tighter, more coherent, and easier to parse than my articles because I’ve already thought it through. Come with edited transcripts, for the readers.

Meditation Isn’t For Everyone

Main idea: Meditation is sold as a special, universal, risk-free good. It’s none of those: it can harm, it isn’t for everyone, and underneath the branding it’s just trained attention—which describes half of what you already do. The only real variable is intention, and observation without action changes nothing.

Sages and Wisdom

Main idea: Empiricism and reflection rest on a third, intuitive way of knowing. The doctor and the guru run on the same authority structure—the only difference is the costume. Pick your sages by what sits underneath the figure, not the coat, or you lose to charlatans.

Marginalia:

My notes on content online and elsewhere. Sometimes me reminding myself of something, sometimes pointing interesting stuff out to you. Often they accumulate into an article. Enjoy.

AI Dark Output. I vacillate about how useful AI is. It’s got classic Malcolm Gladwell Shit/Karstic energy. It feels productive, for many people using it. But there are really very few signs of this productivity in productivity measures. I’ve had two big arguments about this, and both times I’ve come out thinking the opposite thing at the end. Here are some of my other posts on this:

The productivity case for my skepticism: AI isn’t changing anything yet and the slow gains despite apparent capability bursts and the uneven adoption across sectors.
The case for the likelihood that we’ll overestimate the benefits of AI, like we always overestimate tech benefits in the early stages (though, per Amara’s law, we underestimate in the long run).
It only pays where you find the actual use—not a drop-in replacement (and also here), more procrastination than productivity tool.
One of these is quite clearly discrete tasks which are chained together in a useful-for-AI shape.

Anyway. This paper is actually not really worth reading in full, because it gets all excited about its own little dark-matter analogy. What it does indicate is that productivity measures don’t necessarily capture areas where AI could be adding value.

This seems likely, but equally, it might be that getting tasks into a useful-for-AI shape is a hard and mostly unsolved problem.

That is, AI might be adding value we’re not measuring, or AI might not be adding value because any gains we perceive are eaten by the fact that we then need to take the output and do other slow and tedious stuff with it: email people, insert it into Word docs, take it into a meeting. If an AI gives you a productivity boost of 50% on something that’s 1% of your job, you only get about a 0.5% increase in productivity.
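A rough sketch of that arithmetic, in Python. The function name and the two inputs are illustrative assumptions, not anything from the paper. It treats the "boost" as a speed-up on one slice of the job, so the answer comes out a little under the share-times-boost figure of 0.5%, and it shows the ceiling you hit even if the task became instant.

```python
def whole_job_gain(task_share: float, task_speedup: float) -> float:
    """Fractional gain in whole-job output when only one slice of the job gets faster.

    task_share: fraction of working time the task takes today (0.01 means 1%).
    task_speedup: how many times faster the task becomes (1.5 means 50% faster).
    """
    # Time for the untouched work stays the same; the AI-assisted slice shrinks.
    new_total_time = (1 - task_share) + task_share / task_speedup
    return 1 / new_total_time - 1


# Hypothetical numbers matching the example in the text.
modest = whole_job_gain(task_share=0.01, task_speedup=1.5)
ceiling = whole_job_gain(task_share=0.01, task_speedup=float("inf"))

print(f"50% faster on 1% of the job: {modest:.2%} overall")
print(f"instant on 1% of the job:    {ceiling:.2%} overall")
```

Either way the gain is a fraction of a percent, which is the sort of change that would be hard to tell apart from noise.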

Both of these things would produce a change in productivity that’s difficult to distinguish from human noise.

There are a couple of other problems at the top of my mind. One is a verification problem.

A lot of people talk about AI’s jagged edge: it’s (increasingly less) surprisingly good at some stuff and (increasingly less predictably) bad at other stuff.

Verification is the same thing but one level up. AI is getting better across the board, but we need to be able to check whether AI is getting better.

For something like mammography, human analysis is expensive. If an AI can triage which mammograms to read, then you get a huge workload cut. Verifying this is cheap—you can easily analyse how well an AI reads a mammogram for oddities. It’s a fairly mechanical thing.

In contrast, if you want to work out how good AI is at solving difficult or intricate coding tasks, you need someone who understands the code as well as if they’d written it themselves in order to verify that. Regression tests don’t capture future-oriented code structure (code designed for new features or version changes). This kind of verification is doable, but the verification process is really expensive. This is at least part of what’s going on in that infamous study, in which AI use seemed to slow expert coders.

So, if you can’t predict how good an LLM is at a task, you need to factor verification cost into the process. Productivity gains require verification to be cheaper than production. This will improve as we get better at making things AI-shaped: we can verify a bundle of tasks at once, rather than slowing down to collaborate with AI more frequently. But you still have the fixed cost of verification to worry about.
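The "verification has to be cheaper than production" condition can be sketched as a toy cost comparison. Everything here is an assumption for illustration: the function names, the idea that a failed check means a human redoes the task from scratch, and all of the numbers (minutes, failure rates).

```python
def expected_minutes_with_ai(produce: float, prompt: float, verify: float, fail_rate: float) -> float:
    """Expected human minutes per task when AI drafts and a human checks.

    produce: minutes for a human to do the task unaided.
    prompt: minutes spent setting the AI up.
    verify: minutes spent checking the AI's output.
    fail_rate: share of outputs that fail the check and get redone by hand.
    """
    return prompt + verify + fail_rate * produce


def ai_saves_time(produce: float, prompt: float, verify: float, fail_rate: float) -> bool:
    """True when the AI route costs fewer expected human minutes than doing it yourself."""
    return expected_minutes_with_ai(produce, prompt, verify, fail_rate) < produce


# Cheap-to-check work (the mammogram-triage shape): verification is a small slice of production.
print(ai_saves_time(produce=10, prompt=0, verify=1, fail_rate=0.1))

# Expensive-to-check work (the intricate-code shape): checking costs nearly as much as writing.
print(ai_saves_time(produce=10, prompt=1, verify=8, fail_rate=0.3))
```

With these made-up inputs the first case comes out ahead and the second does not, and the only thing that really changed is the cost of the check.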

The last thing I’ll mention here, since I’ll probably want to find it again somewhere, is scope creep. AI helps people do tasks they wouldn’t have otherwise done. Backlogs and deferred projects. Vibe-coding helper tools. If AI boosts productivity outside of the stuff that characterises your productivity, then is it a productivity gain, or is it procrastination? Open question.

So. I guess it’s not just the productivity measures that are making me skeptical.

It occurs to me that my fixation on this is probably a case of my own inability to tolerate incoherence.

Link

–

AI and task fragmentation. A while back I came across the time-horizon model of AI task automation. It was a somewhat reassuring piece for people worried about AI job automation.

Essentially, it pointed out that AI isn’t very good at tasks that have long time horizons. If your job is full of small, self-contained tasks (e.g. IT support tickets), then AI is very good for that, and bad for you. If it’s full of tasks that bleed, requiring a lot of context or being fed into over time (e.g. a CEO’s average meeting, or the migration of cloud infrastructure for a big company), then AI can help with minutiae but isn’t close to managing the whole job.

Anyway, here’s a paper that is trying to work out how to close that gap:

Production is a sequence of steps that can be executed (1) manually, (2) augmented with AI, or (3) fully automated within contiguous AI-executed steps called “chains.” Firms optimally bundle steps into tasks and then jobs, trading off specialization gains against coordination costs. We characterize the optimal assignment of humans and AI to steps

(PDF version here).

Essentially, the idea is that if you can work out how to make stuff into chains of discrete chunks, AI is going to be able to do more of the stuff.

I worried recently about the verification problem: if you aren’t certain how good a job AI will do, you need a human to verify it. For mammogram-flagging this is cheap. Is the scan weird? Send it to a human. That’s actually the value of using AI in the process. For working out whether a code change in some intricate software is future-oriented, keeping in mind planned features and version changes, you need a verification process that’s not too dissimilar from writing the code yourself. Costly.

Verification, more broadly, can be expanded to any human requirement: if there needs to be a back and forth with someone else on the output, for example. Wherever this kind of break in the chain exists, AI is going to have trouble taking over the job.

Some jobs seem to cluster into better AI-shapes than others. So if I think about what I do, lecturing has a fabulous cluster. Research, slide creation, example generation. All this stuff clusters into a ‘preparation block’. Even the lecture itself can be AI-ified. NotebookLM generates podcasts that wouldn’t be much worse than me on stage. I can just come in at the end, and check everything looks good.

Tutorials are a different matter. It’s essentially the same activities, all of which can be AI-ified. But in a tutorial there’s a lot of live diagnosis and back-and-forth with students. Even though the collection of tasks is similarly AI-shaped, the way they cluster makes one easy to chain, and one harder.

This is one part reassuring, and one part ominous, then. The worrying statistics about how many people could lose their jobs because of AI often measure how many tasks can be AI-ified, but aren’t really sensitive to how tasks cluster into AI-shaped chains (i.e. linear exposure indices like this or this). So probably fewer people are at risk of losing their jobs. Equally though, I’m not sure I want my job to be relegated to terminal verifier.

Less reassuring is the stuff about O-ring automation. It’s got some fucking bullshit maths in it. I never understand why people always want to express stuff as maths. But fundamentally it says that people allocate their time across tasks. If you automate some of these, people can allocate that time to other tasks, making the outputs of those tasks better. This is written as though it’s a good thing: the focus this gives humans makes humans more valuable. In jobs where the tasks are multiplicative (one task feeds the next, and the quality of the next depends on the quality of the first), having humans freed up for more quality is a good thing, especially where some of those tasks aren’t AI-shaped.

I’m not so sure. If tasks are properly separated, so that humans aren’t slowing AI down by trying to collaborate with it, it seems like there might be incentives here to swap quality for fewer labour hours (i.e. fire people).

The obvious case for this, beyond simple greed, is if production exceeds the demand for that production. If you AI-ify stuff so well that you’re making 50% more things, but consumers only want 10% more things, then you need to do something about that extra 40% of productivity. McKinsey will probably tell you that the answer is firing everyone.
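To put a number on the "do something about the extra" step, here is a back-of-envelope sketch. It assumes the 50% is a gain in output per labour hour and that the firm only wants to make what it can sell. The function name and that framing are assumptions for illustration, not something from the paper.

```python
def labour_hours_needed(output_per_hour_gain: float, demand_gain: float) -> float:
    """Labour hours needed after automation, as a fraction of hours needed before.

    output_per_hour_gain: 0.50 means each hour now produces 50% more.
    demand_gain: 0.10 means customers want 10% more than before.
    """
    return (1 + demand_gain) / (1 + output_per_hour_gain)


# The figures from the example above: 50% more output per hour, 10% more demand.
remaining = labour_hours_needed(output_per_hour_gain=0.50, demand_gain=0.10)
print(f"hours needed: {remaining:.0%} of before, so {1 - remaining:.0%} of hours are surplus")
```

On those assumptions roughly a quarter of the hours become surplus, which is the bit McKinsey would have opinions about.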

So two apparently tense theses: if automation frees up human-shaped time, you could probably fire people, especially if the process increases productivity beyond demand. But equally, some human-shaped tasks are a fixed bottleneck for automation chains, and this might make humans more valuable.

This latter case is especially true given the fact that AI is error-prone. In chains and O-rings alike, an errored output will eventually need triage, and where automation is chained, it might accumulate errors due to earlier errors, which would imply more triage. It actually makes the O-ring thing less valuable, because more of that human-quality-time is going to be spent on verification.
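A minimal sketch of how triage load could grow with chain length. It assumes every step has the same accuracy and that steps fail independently, which is the generous case, since errors that feed later errors would only make it worse. The 95% per-step accuracy and the function name are made up for illustration.

```python
def clean_run_rate(step_accuracy: float, steps: int) -> float:
    """Chance a chain finishes with no errored step, assuming steps fail independently."""
    return step_accuracy ** steps


# Hypothetical per-step accuracy of 95%, across chains of increasing length.
for steps in (1, 5, 10, 20):
    rate = clean_run_rate(0.95, steps)
    print(f"{steps:>2} steps: {rate:.0%} run clean, {1 - rate:.0%} need triage")
```

Under these assumptions a ten-step chain runs clean only about 60% of the time, so a step that looks reliable on its own still leaves a lot of outputs for a human to sort through.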

So, where tasks cluster nicely, we can chain them. Where those tasks are O-ring tasks, this might be good. I spend less time doing busy-work for my lectures, and spend all that time doing better tutorials. But this assumes that companies see it that way too. And that verification costs don’t balloon because of all the chaining, and the associated problem of correlated error. Probably explains some of the uneven adoption across sectors, and it’s certainly curtailing attempts to use AI as a drop-in replacement.

Note: I never quite worked out what the O-ring thing actually referred to. Like… gasket manufacturing maybe? I don’t know. I spent all my time trying to work out the math. It’s your problem if you care.

Note: I realised I could ask an LLM about the O-ring thing. Here’s what it said. “The O-ring is the Challenger. January 1986, the shuttle came apart because one rubber gasket seal in a solid rocket booster lost its elasticity in the cold and failed. Kremer named the theory after it deliberately — and once you know that, the maths you found so annoying stops being decoration and becomes the whole point. Output as the product of task qualities, ∏ q_s, is just the Challenger written down: it doesn’t matter how good the other ten thousand components were, one part at quality ≈ 0 drags the entire product to ≈ 0. That’s why it has to be maths rather than prose — the only thing the formalism is asserting is “multiply, don’t add,” and prose lets you fudge which one you mean. Additive: a botched task costs you that task. Multiplicative: a botched task costs you everything. The entire O-ring-versus-linear-index fight is that single operator. So the “bullshit maths” is literally your bottleneck intuition, stated so it can’t wriggle out. You already believe it; you just resent the notation.” I do resent the notation, yes.
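The "multiply, don't add" point is easy to see with toy numbers. In this sketch the quality scores are invented, and treating the additive version as a simple average is an assumption made so the two results sit on the same 0 to 1 scale.

```python
import math


def additive_output(qualities: list[float]) -> float:
    """Linear-index view: output is the average of the task qualities."""
    return sum(qualities) / len(qualities)


def multiplicative_output(qualities: list[float]) -> float:
    """O-ring view: output is the product of the task qualities."""
    return math.prod(qualities)


# Ten tasks, each scored from 0 (botched) to 1 (perfect). Scores are made up.
all_good = [0.99] * 10
one_failed_seal = [0.99] * 9 + [0.0]

for label, qualities in (("all good", all_good), ("one failed seal", one_failed_seal)):
    print(f"{label}: additive {additive_output(qualities):.3f}, "
          f"multiplicative {multiplicative_output(qualities):.3f}")
```

In the additive view the botched task costs you a tenth of the output. In the multiplicative view it costs you all of it.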

Link

–

Space skepticism. Essentially, the idea that moving operations to space isn’t going to solve problems we face on earth. Not yet, anyway. Interesting for a few reasons:

How stuff works. From energy to communications to AI datacentres, it talks about what’s required to make them happen.
It’s a good case study in how bad we are at thinking about problems. Asteroid mining, for example, seems superficially sensible, but just red-penning the time taken to do it makes it instantly problematic, not least because by the time the haul got back to earth, the demand might have collapsed.
By examining what wouldn’t work, it also demonstrates what’s likely. Via negativa in action.

Fun.

Link

–

AI productivity. I’m pretty skeptical about AI enhancing productivity (see e.g. AI isn’t changing anything yet and the slow gains despite apparent capability bursts). This paper shows that each year of progress on frontier models cuts task-completion time by ~8%. Extrapolated, you end up with something like a 20% productivity increase over the decade.

Seems positive, but I’m not yet convinced. This paper is about the speed of task completion compounding, not the quality of it. And speed on 30-minute discrete tasks.

He also identifies a puzzle which worries me: when the AI does a task alone, the output quality scales with compute. Experts grade frontier model output at a 6-7 out of 7 for quality. In comparison, humans get ~3.5/7 on the same task when unassisted. Annoyingly, when the task is collaborative (a human-AI back and forth) the quality ends up in the middle—~4.3/7. That is, when humans are monitoring AI, the quality is better than when the human does the task herself, but when they collaborate on the task, it sort of averages out.

He says this is a puzzle, but it seems to actually imply that humans are a bottleneck on AI production for certain kinds of tasks.

That said, as we get better at making tasks AI-shaped, lots of this will be resolved one way or the other.

Link

–

I hope you found something interesting.

You can find links to all my previous missives here.

Warm regards,

Dorian | btrmt.
