The sin, the oil, the cracks, and the crisis
Are we the unpaid labor force of the current AI bubble? Yessiree, Bob!
I was listening to a YouTube video from the Wall Street Journal’s Tech Live conference last week, where OpenAI’s CFO, Sarah Friar, let slip the word “backstop”. Hearing it made me realise that the AI bubble is no longer a metaphor.
In the video, she suggested the U.S. government should “guarantee” the financing for the $1.4 trillion in infrastructure her company needs, a request so outrageous that Sam Altman had to spend the weekend in full-blown panic, clarifying that “taxpayers should not bail out companies”.
This moment felt like a Freudian slip for the entire AI boom.
And I am wondering if it reveals the potential endgame of a gold rush that has grown so fast and so colossally large that its leaders are already, accidentally, flirting with ‘too big to fail’ status.
This entire journey, from a research paper to a gazillion-dollar problem, began with what can only be described as an Original Sin.
The Sin
That sin was mistaking one brilliant, eight-year-old idea, the Transformer, for the only idea.
And that mistake is why everything that follows became inevitable.
Llion Jones, one of the authors of “Attention Is All You Need”, the original 2017 paper that created the Transformer (and the paper I mention every time I write about LLMs), said at a TED AI event last month that he is now “absolutely sick” of the transformer architecture.
According to him, the transformer project at Google was “very organic, bottom up.” Researchers talked over lunch, scrawled on whiteboards, and faced no pressure from management. There were no demands to publish a certain number of papers, no metrics to hit, no quarterly objectives to meet. They had the freedom to explore an idea that might not work, and that freedom led to one of the most significant breakthroughs in modern computing.
Today, despite unprecedented investment and talent flooding into AI, he argues that “this has somehow caused the narrowing of the research that we’re doing.”
He talks about the immense pressure coming from investors demanding returns and researchers scrambling to stand out in an overcrowded field.
Even researchers making millions per year don’t feel empowered to try wild, speculative ideas because of the pressure to prove their worth.
So, while the resources available for AI research have exploded, the range of research directions being explored has actually contracted.
In fact, as of late 2025, AI startups have captured over 50% of all global venture capital. But this capital isn’t funding a diverse ecosystem of ideas; it’s funding a gold rush for one architecture. The transformer architecture.
We are stuck in a local maximum, continuously “putting a carbon-fiber tail fin on a carriage,” as Llion Jones said, because the market has no patience for real exploration.
So, trapped in this monoculture, the industry needed fuel to keep the hype machine running. It turned to the cheapest, most abundant resource available. Us.
The Oil
This bubble, like all bubbles, needs a resource to inflate it. It runs on oil. That oil is us. We are the raw material being mined.
We are all, collectively, running the Turing Test. But this isn’t Alan Turing’s 1950s parlor game of “can it fool a human judge?” That’s a dumb, academic question for these times. The real test is the economic Turing Test, and its only question is, “Can your model prove to investors that you are worth a $500 billion valuation?”
…aka let’s get those DAU/MAUs in.
DAU = Daily Active Users
MAU = Monthly Active Users
There are two versions of this test.
First, there’s the paid test: the explicit, multi-billion dollar “data labeling” and “AI tutor” industry. This is where companies openly pay for human “oil.”
They pay generalists $20 an hour, and, because they want quality, they now pay expert lawyers, doctors, and engineers anywhere from $250 to $1,000 an hour to teach their models not to sound stupid. This is the “boutique”, high-grade crude oil. Maybe olive oil with truffle flavouring.
But the really interesting one, the one that makes the DAU/MAUs possible, is the unpaid test. This is the mass-scale, “free” labor force of 800 million weekly active users.
And I know that most folks think the “snake oil” in AI is the faulty product, or the money, but in fact, the real con is probably the deal we strike by consuming.
We are the unpaid R&D, QA, and content moderation teams for the wealthiest companies on Earth. Every time we use these models (either free or paid versions) we are the ones “Turing-testing” the product into viability.
Every “regenerate” click, every “thumbs down,” every re-phrased prompt is a free data point that tells their model how to be better. (RLHF ftw, we talked about this before)
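To make that concrete, here is a minimal sketch of how one “regenerate” click becomes a preference pair, the raw ingredient of RLHF-style reward-model training. Every name in it is hypothetical; no vendor publishes their actual schema:

```python
# Hypothetical schema: none of these names come from a real vendor pipeline.
from dataclasses import dataclass

@dataclass
class FeedbackEvent:
    prompt: str
    rejected: str  # the draft you thumbs-downed or regenerated away from
    chosen: str    # the draft you kept

def to_preference_pair(event: FeedbackEvent) -> dict:
    """One free, user-donated training example for a reward model."""
    return {
        "prompt": event.prompt,
        "chosen": event.chosen,      # reward model learns to score this higher...
        "rejected": event.rejected,  # ...and this lower
    }

# Every regenerate click yields one of these, at zero marginal cost to the vendor:
pair = to_preference_pair(FeedbackEvent(
    prompt="Summarize this contract in plain English.",
    rejected="The first draft you thumbs-downed...",
    chosen="The regenerated draft you kept...",
))
print(pair["prompt"])
```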
But what we maybe do not realise is that with every interaction, we are meticulously, and freely, teaching a machine how to replace the very faculties we’re using.
We are outsourcing our own cognition, deskilling ourselves from creators into passive, frustrated editors.
We are participating in a massive cognitive transfer, all for the privilege of... what, exactly?
This is the oil. This cognitive transfer is what lines their pockets and keeps investors locked in. This free, mass-scale R&D is what allows LLM companies to claim huge ARRs and what generates the $3.70 ROI for every $1 invested that the industry brags about.
This is the fuel that inflates the bubble. And it’s what leads, inevitably, to the “backstop” moment: the industry, glutted on our consumption and labor, hinting that the public should be the ultimate guarantor of its bets.
The Cracks
And when you place all your bets on a single, sometimes hard-to-control architecture while fueling it with free labor, eventually reality catches up. And when it does, the cracks appear everywhere at once.
And, in the AI bubble, the financial hype collides with technical reality. Quite often.
There are many, many cracks in the bubble, but I will mention a few that are closer to my day-to-day life.
1. The crack in data analysis (my fave lol)
As a data professional, this is where I start to lose my mind. The industry is selling “LLM-powered insights,” a product that makes 0 sense. LLMs are probabilistic parrots, not deterministic calculators. They guess math. They hallucinate.
Text Prediction ≠ Data Analysis
I wrote extensively about this before. LLMs don’t fit distributions and they don’t validate assumptions. They guess plausible text, which might look analytical, but it’s NOT grounded in the math that matters.
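What does “grounded in the math” look like? Here is a tiny, self-contained sketch of the deterministic work an analyst (not a next-token predictor) actually does: testing an assumption before trusting it, then fitting a distribution that matches the data. Plain numpy/scipy, nothing exotic:

```python
import numpy as np
from scipy import stats

# Skewed data: revenue-like numbers are rarely normal.
rng = np.random.default_rng(42)
sample = rng.lognormal(mean=0.0, sigma=1.0, size=500)

# Step 1: validate the normality assumption BEFORE running a t-test.
_, p_value = stats.shapiro(sample)
print(f"Shapiro-Wilk p-value: {p_value:.2e}")  # tiny p => data is not normal

# Step 2: the assumption fails, so fit a distribution that matches the data.
shape, loc, scale = stats.lognorm.fit(sample, floc=0)
print(f"Fitted lognormal: shape={shape:.3f}, scale={scale:.3f}")
```

An LLM can emit text that looks like those steps. It cannot guarantee it performed them.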
On clean academic benchmarks like Spider 1.0, LLMs hit 85-90% accuracy at generating SQL. Pretty impressive, right? That’s the demo effect and what gets sold to executives.
The reality is quite different. For example, on BIRD, a benchmark using realistic databases with messy, real-world characteristics, GPT-4’s execution accuracy drops to 73%. That’s already a 27% failure rate on moderately realistic data.
On Spider 2.0, which tests enterprise-scale databases (the kind your business actually runs on), even OpenAI’s o1-preview achieves only a 23.77% success rate.
Read that again: 76% failure rate on enterprise data.
And this is just SQL generation, the retrieval step before any analysis happens. The LLM hasn’t validated assumptions, checked for statistical significance, or done any actual analytical work. It’s just trying to fetch the right data, and it’s failing 3 out of 4 times.
Look, I use these tools daily. I learned Python with Claude. But I always have colleagues with more experience review my code because I understand the flaws. I can challenge Claude’s output and catch errors because I know enough to test it. That’s the difference between useful and dangerous. The problem isn’t that these tools exist but that we’re deploying them as if they don’t have a 76% enterprise failure rate. We need better guardrails and governance.
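What would a guardrail even look like? Here is one possible sketch, standard library only and deliberately paranoid (SQLite purely for illustration): never execute LLM-generated SQL blindly. Dry-run it with EXPLAIN on a read-only connection and reject anything that isn’t a single SELECT. It won’t catch a wrong-but-valid query, which is exactly why a human still reviews the output:

```python
import sqlite3

def run_llm_sql(db_path: str, generated_sql: str, row_limit: int = 1000) -> list:
    """Execute model-generated SQL behind two cheap guardrails."""
    sql = generated_sql.strip().rstrip(";")

    # Guardrail 1 (crude on purpose): a single SELECT statement, nothing else.
    if ";" in sql or not sql.upper().startswith("SELECT"):
        raise ValueError("Refusing anything that is not a single SELECT statement")

    # Guardrail 2: read-only connection, so even a hallucinated
    # DROP TABLE could not bite.
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        conn.execute(f"EXPLAIN {sql}")  # syntax + schema check without running it
        return conn.execute(sql).fetchmany(row_limit)
    finally:
        conn.close()
```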
But should we really jump from “oh, useful tools” to “replace all data analysts”?
Also, the problem compounds when LLMs are asked to reason about complex domains:
When asked about legal information, even top models hallucinate 6.4% of the time, compared to 0.8% for general knowledge
On advanced mathematical reasoning tasks, leading models score below 25%, with most under 5%
In medical systematic reviews, GPT-4 fabricated citations 28.6% of the time.
So, before investing in “AI-powered data insights” or “autonomous analytics,” make sure the math is mathing. Which benchmark are you measuring against? Because the gap between what works in a demo (90%) and what works with your messy enterprise database (23%) is the gap between a product and vaporware.
2. The crack in vibe coding
This is the one everyone feels and talks about lately, including yours truly. As I wrote in my previous article, vibe coding is the illusion of productivity.
The hype cycle has accelerated to such a terrifying speed, compressing a multi-year boom and bust into a matter of months, that it has created a generation of startups valued not on their revenue or customers, but on the perceived quality of their “vibes”.
And the second you start looking into the numbers, user trends, and market dynamics, you immediately see a clear disconnect: a demo bubble where billions are being poured into solutions that the data suggests nobody was really asking for, and whose moment of peak interest may have already passed.
I am going to do myself a favor and not cover this topic again; I did so here two weeks ago, but I’m taking the chance to mention it in this week’s edition too.
3. The crack in academia
This is the highest-stakes crack. This is where the hype claimed that AI would cure diseases and write perfect legal briefs, but the reality is that it is just a plausible bullshit generator.
People are noticing. A major academic publisher just had to retract over 20 papers that were found to contain hallucinated, AI-generated citations referencing studies that do not exist and never did.
Lawyers have been fined for citing nonexistent case law created by an LLM.
If it weren’t so sad, it would be quite funny.
So we have the Sin (monoculture), the Oil (our free labor), and the Cracks (everywhere). The rational response would be:
Stop. Audit. Build a strategy. Create governance. Test for rigor.
… but NOPE, that ain’t happening.
The Crisis
The industry’s actual response is “Ship it faster.”
This is the Crisis. The true problem is the management. It’s the vibe from the dev floor scaling up to the C-suite. It’s the complete, systemic abdication of professional responsibility.
We have countless detailed, undeniable, specific, fundamental flaws on record, and we know that, while extremely powerful and useful, the transformer architecture is quite brittle, a known liar, and insecure. But in the face of these documented failures, companies are still stampeding. Everyone is so desperate to do AI, so terrified of being left behind, so obsessed with “productivity,” that they are ignoring the evidence.
There is no strategy, no standard procedures, no risk mitigation. There is just a frantic, top down mandate to “integrate AI,” which means “give the vibe-based tool access to our shit and hope for the best.”
And right into this vacuum of process, agents have entered the chat.
The industry’s “fix” for a tool that is known to fail on real-world data, known to be insecure, and known to hallucinate? Make it autonomous and let it run unsupervised.
This is how you set yourself up for failure. It’s the inevitable, banal result of zero governance.
I gave a similar example on LinkedIn 4 months ago: let’s say you have an enthusiastic junior dev build an “autonomous agent” to “optimize” a workflow. It’s a “vibe.” It’s “good enough.” It’s tied to their personal API key.
Six months later, that developer leaves your company.
What happens to the agent?
You’ve created an unmanaged, autonomous process with active credentials. It’s still running. It’s still making decisions based on six-month-old “vibe” logic. The debugging question evolves from “what’s this code doing?” to “what are these orphaned systems doing when we’re not looking?”
This is the ultimate symptom of the crisis. These unmanaged automations are the physical manifestation of an industry that has abdicated all responsibility in its rush to “be first.”
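The boring fix exists, by the way: an inventory. Here is a minimal sketch of the kind of registry that’s missing. Every name in it is hypothetical, not any vendor’s API; register every agent with a named human owner and a managed credential, then audit for agents whose owner has left:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class RegisteredAgent:
    name: str
    owner: str          # a named, accountable human, not "the vibe"
    credential_id: str  # a managed service identity, never a personal API key
    registered_on: date

def audit_orphans(agents: list[RegisteredAgent], active_employees: set[str]) -> list[RegisteredAgent]:
    """Flag every agent still running after its owner left the company."""
    return [agent for agent in agents if agent.owner not in active_employees]

registry = [
    RegisteredAgent("workflow-optimizer", "enthusiastic.junior", "personal-key-oops", date(2025, 5, 1)),
]
for orphan in audit_orphans(registry, active_employees={"someone.else"}):
    print(f"ORPHANED: {orphan.name} (owner {orphan.owner} is gone, credential {orphan.credential_id} still live)")
```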
And if you think this sounds like a paranoid, late-night rant, you’re wrong.
The entire enterprise security industry is panicking and pivoting to sell you a solution to the problem they just helped create.
Exhibit A: On November 10th, Microsoft published a blog post about “Riding the AI Wave” by evolving its entire Entra ID platform to handle “automated identity lifecycle management” for AI agents.
Exhibit B: On the same day, CrowdStrike issued a press release naming itself the leader in “Identity Threat Detection”, specifically for its new ability to secure “AI agent” identities “across the full hybrid identity lifecycle.”
So, the companies that profit from security failures are now selling you “agent identity management” to fix the governance vacuum that should never have existed in the first place?
And history has a way of repeating itself. Thirty years ago, we had the internet. I always go back to this in my writing because “the internet” followed the same playbook.
The rush to get “online” meant shipping products like Windows 95 and early TCP/IP stacks with zero governance. This governance vacuum inevitably created disasters like the ILOVEYOU virus, SQL Slammer, and a global malware pandemic. And that chaos, in turn, created the multi-hundred-billion-dollar cybersecurity industry. An entire generation of companies like McAfee and Symantec was born to sell us back the “process” and “governance” (antivirus, firewalls) that should have been built in from the start.
We’re watching the same movie again. Except this time, the stakes are higher and the systems more complex. And ofc, there is way more money involved.
In the end, the most absurd part of this whole thing is our own complicity.
We are simultaneously the unpaid labor, the frustrated developer, and the hyped consumer. We’re the ones getting drunk on the Kool-Aid while complaining about the taste. This entire spectacle runs on a collective abdication of our own critical thinking. It’s scary.
The vibe isn’t just in the code, it’s in our desperate, almost religious, willingness to believe that this tool (read architecture) will finally be the magic, process-free solution to everything.
We’re building systems nobody understands, on a scale nobody can afford, that are already failing their most basic tasks. And we’re trapped, stuck in a monoculture (the Sin), fueling it with our free labor (the Oil), watching the failures multiply (the Cracks), while management demands we ship faster (the Crisis).
And in the end, we, the public who served as the “Turing Test,” are the ones they’ll ask for the “backstop”, because THEY are too big to fail.
Until next time,
x
Juliana