publication · 17h 26m compressed

AI’s New Cost Curve

7 features in this issue · published

The Brief

AI is shifting scarcity from raw capability to the costs around it: inference budgets, chip data movement, reliable evaluation, durable memory, human supervision, capital substitution for labor, and weak-link institutions. This issue maps where leverage moves once models become useful enough that the hard question is no longer only what they can do, but who can allocate the remaining compute, trust, context, labor, power, and access.

Features

Noam Brown says AI benchmarks need a compute axis

YouTube thumbnail for The field is underestimating inference compute | Noam Brown

Noam Brown's core claim is that reasoning models make intelligence a function of inference budget, not just model weights. If a model can get smarter by spending more tokens, calls, search, or scaffolding at answer time, then a single benchmark score hides the real scarce resource: who can afford, govern, and allocate inference compute. That changes the allocation economy because capability is no longer just prebuilt into a released model; it can be bought dynamically by whoever has access to compute, money, orchestration, and evaluation discipline.

The benchmark leaderboard is becoming a pricing surface. Brown's argument turns every reasoning-heavy eval into a score-versus-cost curve, which means model capability depends on the inference budget buyers, labs, regulators, and attackers are willing to spend.

Safety policy inherits the same problem. If a lab evaluates a model under a low inference cap but users can scaffold much larger budgets around it, release thresholds may understate the effective capability available in the wild.

Compute access is now institutional leverage. Brown's university comments imply that AI talent, research agendas, and even academic status will flow toward organizations that can allocate GPUs per researcher, not just toward places with prestige or publication culture.
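
One way to make the score-versus-cost idea concrete is to record inference spend next to every score and report the frontier instead of a single number. The sketch below is illustrative only: the `EvalRun` type, the model names, and every cost and score are hypothetical, not figures from Brown or from any real benchmark.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalRun:
    model: str        # hypothetical model name
    cost_usd: float   # hypothetical inference spend per task
    score: float      # hypothetical benchmark accuracy in [0, 1]


def cost_frontier(runs):
    """Keep only runs that beat every cheaper (or equally cheap) run on score."""
    frontier = []
    best = float("-inf")
    for run in sorted(runs, key=lambda r: (r.cost_usd, -r.score)):
        if run.score > best:
            frontier.append(run)
            best = run.score
    return frontier


def best_under_budget(runs, budget_usd):
    """Return the highest-scoring run affordable at this budget, or None."""
    affordable = [r for r in runs if r.cost_usd <= budget_usd]
    return max(affordable, key=lambda r: r.score, default=None)


# Made-up data: each model evaluated at a low and a high inference budget.
RUNS = [
    EvalRun("model-a", 0.01, 0.42),
    EvalRun("model-a", 1.00, 0.71),
    EvalRun("model-b", 0.05, 0.55),
    EvalRun("model-b", 2.00, 0.69),
]

frontier = cost_frontier(RUNS)
cheap_winner = best_under_budget(RUNS, 0.10)
rich_winner = best_under_budget(RUNS, 1.50)
```

In this toy data the ranking flips with the budget: `model-b` wins under a $0.10 cap and `model-a` wins under a $1.50 cap, which is the kind of information a single leaderboard score would hide.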

AI chip design is a fight to spend silicon on compute instead of moving data

YouTube thumbnail for Chip design from the bottom up – Reiner Pope

This episode explains AI chips as an allocation problem inside silicon: each square millimeter can be spent on arithmetic, memory, routing, synchronization, flexibility, or programmability. Pope's core claim is that the scarce resource is not just raw transistor count but useful compute per unit of communication. Low precision, Tensor Cores, systolic arrays, scratchpads, clocking choices, FPGAs, GPUs, TPUs, and MatX's hinted architecture all become versions of the same question: how much expensive area and bandwidth should be allocated to moving data versus doing the matrix math that makes AI inference and training valuable?

The winning AI-chip primitive is not just "more FLOPs"; it is more reusable compute per trip through expensive data movement. That is why systolic arrays matter: they amortize register-file and memory bandwidth across many multiply-accumulates.

Low precision is powerful because multiplier area grows roughly with the product of operand widths, while storage and transport scale roughly linearly with width. That makes FP4 and similar formats more than a software compression trick; they change the silicon economics of neural-network math.

The GPU-versus-TPU split is a real allocation trade-off. GPUs buy flexibility and many local pathways through smaller tiled units, while TPUs buy amortization through coarser matrix units; MatX's "splittable systolic array" framing suggests a search for both properties at once.
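
The two scaling arguments above can be checked with back-of-the-envelope arithmetic. The sketch below uses deliberately crude proxies that are assumptions of this example, not Pope's numbers: multiplier cost is counted as one partial-product bit per pair of operand bits (as in a simple integer array multiplier), storage cost as the bit width itself, and a square systolic array as taking `n` new operands at its edge per cycle while performing `n * n` multiply-accumulates.

```python
def multiplier_cost(bits_a, bits_b):
    """Crude area proxy: one partial-product bit per pair of operand bits."""
    return bits_a * bits_b


def storage_cost(bits):
    """Crude proxy for storing or moving one operand: linear in its width."""
    return bits


def macs_per_operand_fed(n):
    """Idealized n x n systolic array with one operand set held in place:
    n new operands enter at the edge each cycle and n * n MACs are performed."""
    return (n * n) / n


# Halving operand width from 8 to 4 bits.
multiplier_saving = multiplier_cost(8, 8) / multiplier_cost(4, 4)
storage_saving = storage_cost(8) / storage_cost(4)

# Reuse per operand fed for a small and a large array.
reuse_small = macs_per_operand_fed(8)
reuse_large = macs_per_operand_fed(128)
```

Under these assumptions, halving the width cuts the multiplier proxy by 4x but storage by only 2x, and a larger array gets proportionally more arithmetic out of each operand it has to fetch. Real floating-point units, accumulators, and array dataflows complicate both ratios.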

Dan Shipper Says AI Automation Creates More Human Allocation Work

YouTube thumbnail for The AI paradox: More automation, more humans, more work | Dan Shipper

Dan Shipper's core argument is that AI does not simply erase work; it reallocates work. When agents make writing, coding, analysis, and operations cheaper, humans do not disappear. They move up a layer into choosing what should be done, maintaining agent systems, reviewing output, deciding what belongs in the product, and making undifferentiated model output into something specific and valuable. That matters to the allocation economy because scarce leverage shifts from raw production capacity to judgment, trust, attention, interface control, model access, token economics, and the human operators who can direct automated capacity toward coherent outcomes.

The scarce role is moving from "do the task" to "allocate and govern the task." Shipper's version of the AI future needs people who can maintain agents, decide what work matters, review higher volumes of output, and keep products coherent as automated capacity rises.

SaaS may gain leverage rather than lose it if agent workspaces become the user's operating system. In Shipper's model, apps still matter, but they need to support humans and user-provided agents collaborating on the same object, which changes margins, UX, rate limits, logs, approvals, and pricing.

PMs and full-stack designers may become more valuable because AI turns taste and problem selection into directly executable capacity. The model commoditizes yesterday's implementation skill, but the human still has to decide what is worth making and why it should look or behave differently from default model output.

AI's growth boom still has to pass through the weak links

YouTube thumbnail for "A.I. and Our Economic Future," Professor Chad Jones

Chad Jones argues that AI can still produce explosive economic growth, but only after it works through the weak links that actually constrain production, trust, safety, coordination, physical deployment, and institutional adoption. The allocation-economy point is that value does not flow simply to whoever has more software or more compute; it flows to the scarce bottleneck left after AI makes other inputs abundant. That means software engineers, managers, equity owners, care workers, regulators, cloud and chip suppliers, and public institutions all face a moving scarcity map, while the risks of breaking fragile systems may arrive before the full productivity upside.

The allocative crux is weak-link scarcity: if AI makes one task abundant, the premium shifts to the next bottleneck instead of automatically delivering economy-wide abundance.

Jones is not dismissing AI acceleration. He says even conservative simulations eventually show explosive growth, but the timeline changes from "sudden takeoff" to a decades-long process of automating successive bottlenecks.

The risk asymmetry matters for operators and policymakers: weak-link systems can take a long time to improve, yet fail quickly if AI helps a bad actor attack software, finance, energy, or biosecurity.
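
A common way to formalize weak links is a CES production function in which tasks are complements. The summary above does not say which model Jones uses in the talk, so treat the sketch below as an assumption-laden illustration: the function name, the choice of `rho = -1`, and the input values are all hypothetical.

```python
def ces_output(inputs, rho):
    """CES aggregate: (mean of x_i ** rho) ** (1 / rho).

    rho < 0 makes tasks complements (weak links bind);
    rho = 1 makes them perfect substitutes (a plain average).
    """
    if rho == 0:
        raise ValueError("rho = 0 is the Cobb-Douglas limit and is not handled here")
    n = len(inputs)
    return (sum(x ** rho for x in inputs) / n) ** (1.0 / rho)


# Two tasks, both at baseline productivity 1.0.
baseline = ces_output([1.0, 1.0], rho=-1.0)

# AI makes task 1 a thousand times more productive; task 2 is untouched.
weak_link = ces_output([1000.0, 1.0], rho=-1.0)

# The same shock if the tasks were perfect substitutes.
substitutes = ces_output([1000.0, 1.0], rho=1.0)
```

With complements, a 1000x improvement in one task lifts output from 1.0 to just under 2.0, and no improvement in that task alone can push it past 2.0. With perfect substitutes the same shock gives 500.5. The premium therefore moves to the task that did not improve.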

AI crossed from benchmark intelligence to usable work

YouTube thumbnail for OpenAI's Yann Dubois: Why AI Progress Suddenly Feels Real

Yann Dubois argues that AI progress feels sudden because frontier models have crossed a reliability threshold: they are no longer just winning math and coding contests, they are starting to perform messy user work with enough dependability to matter. The allocation-economy crux is that scarce advantage moves from raw model intelligence alone toward reliability, evaluators, domain experts, permissions, connectors, memory, harnesses, and product teams that can turn general capability into usable work inside real organizations.

The strategic shift is from benchmark performance to allocated usefulness: once post-training and RL can optimize for real user utility, leverage accrues to whoever can define the task, measure the result, and feed the right domain data and experts into the loop.

The new bottleneck is not just compute; it is also evaluation capacity, verifiable rewards, domain expertise, and the ability to tell when an open-ended answer is actually good.

Startups still have room if they own the last mile: permissions, connectors, workflows, agent harnesses, vertical context, and trust are all scarce operating layers that foundation-model companies are not fully specializing around yet.

Gemini's Memory Bet Is a File System, Not a Personal Model

YouTube thumbnail for Gemini Co-Lead on World Models, RL's Next Domains & Continual Learning

Oriol Vinyals' most concrete message is that the next useful version of continual learning may not be a personal model whose weights constantly change. It may be a shared frontier model that can write, retrieve, and improve a durable external memory system. That matters for the allocation economy because the scarce bottleneck shifts from only training compute to who controls the agent's context, evals, data exhaust, memory substrate, hardware pipeline, and the trust layer around models that act over long horizons.

Durable agent memory may be an infrastructure market before it is a model-weight breakthrough. Vinyals' file-system answer implies leverage accrues to whoever can organize, secure, retrieve, and evaluate personal or enterprise context without forcing every user onto separate model weights.

World models are still more promise than settled capability. Google can show language-controlled video rendering in Omni, but Vinyals repeatedly says the deeper unsolved question is whether visual data can produce transferable concept learning and physics understanding without language labels.

RL is constrained by data generation, not just algorithms. Vinyals' Go comparison makes the next frontier a search for domains where agents can create useful training situations and get reliable feedback without needing human-labeled or perfectly verifiable tasks.
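
To show what "memory as a file system" could mean in the smallest possible form, here is a toy sketch in which notes live in files outside the model and the weights never change. It is not a description of Gemini's design: the `FileMemory` class, its methods, the JSON layout, and the keyword-overlap retrieval are all hypothetical choices made for this example.

```python
import json
import tempfile
from pathlib import Path


class FileMemory:
    """Toy durable memory: one JSON file per note, stored outside the model."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def write(self, key, text):
        # Writing to an existing key overwrites it, which is how a note is "improved".
        # Keys are assumed to be safe file names; a real system would sanitize them.
        note = {"key": key, "text": text}
        (self.root / f"{key}.json").write_text(json.dumps(note))

    def retrieve(self, query, limit=3):
        # Naive retrieval: rank notes by how many query words they share.
        words = set(query.lower().split())
        scored = []
        for path in sorted(self.root.glob("*.json")):
            note = json.loads(path.read_text())
            overlap = len(words & set(note["text"].lower().split()))
            if overlap:
                scored.append((overlap, note["key"], note["text"]))
        scored.sort(key=lambda item: (-item[0], item[1]))
        return [(key, text) for _, key, text in scored[:limit]]


with tempfile.TemporaryDirectory() as tmp:
    memory = FileMemory(tmp)
    memory.write("prefs", "user prefers concise weekly summaries")
    memory.write("project", "quarterly budget review is due friday")
    memory.write("prefs", "user prefers detailed weekly summaries")  # revise the note
    hits = memory.retrieve("weekly summaries")
    misses = memory.retrieve("unrelated")
```

Even in this toy version, the questions the feature raises are visible: who owns the directory, who can read it, and how retrieval quality is evaluated all sit outside the model itself.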

Compute, Taste, and the New Service Economy

YouTube thumbnail for Spotify CEO Joins, Alex Tabarrok Joins, Lots of news from SpaceX, Anthropic, OpenAI, Nvidia and Meta

This TBPN episode is a tour of AI turning abstract software progress into concrete allocation fights: SpaceX is raising public-market capital by selling scarce compute capacity, Anthropic's growth depends on GB200 access, Spotify is using proprietary taste data to allocate tickets and license AI remixes, and Alex Tabarrok explains the macro frame underneath it all. If AI and robots let capital substitute for labor in services, the old service-cost bottleneck weakens, but the new bottlenecks become compute, chips, power, proprietary data, licenses, distribution, and the institutions that decide who gets access.

The best economic line in the episode is Tabarrok's: AI matters if it lets capital replace labor in services. That reframes the AI boom away from chatbots and toward whether compute, robots, and workflow tools can break the labor bottlenecks behind health care, education, repair, and professional work.

SpaceX's IPO narrative shows compute has become a capital-market product. The Anthropic deal makes the story legible: a frontier lab's model demand becomes a multibillion-dollar infrastructure contract, which then supports a giant public-market TAM.

Spotify's Investor Day material is an allocation story disguised as media strategy. "Taste" becomes a rights and routing layer: it decides which fans get scarce tickets, which remixes are legal, which creators get reach, and how AI-created abundance is filtered back into attention.

Supplementary Resources

Retained same-date feed resources that support the issue, linked directly to the original sources.