Every essay, most recent first. Long-form thinking on strategy, applied AI, growth, and the disciplines that inform them — filter by pillar or tag to find your way in.
The binding constraint on autonomous agents isn't intelligence — it's that per-step success probabilities multiply. A 95%-reliable agent finishes a 20-step task 36% of the time. The fix is topology, not IQ.
Answer engines read many sources and emit one synthesized reply. You no longer compete for a rank on a page of links; you compete to be the source the model quotes — and most businesses are still optimizing a channel that is shrinking.
MAMMAL's real contribution is not a benchmark win. It's a bet that molecules, proteins, and gene expression can share one sequence-to-sequence language — and a 458M-parameter generalist that proves the bet pays.
MAMMAL posts state-of-the-art on nine benchmarks, but the result that matters is four potency predictions on drugs it never saw, confirmed by a real assay. Here's why that one experiment outweighs the leaderboard.
A 458M-parameter, open, sequence-only model out-discriminated AlphaFold3 on binder-vs-non-binder in 5 of 7 antibody targets. The lesson isn't "sequence beats structure" — it's what task was actually being scored.
Answer engines retrieve passages and synthesize an answer, so getting cited is a craft: lead each chunk with a self-contained claim, make it survive being torn out of context, and hand the model the cleaner, more attributable fact than your competitors did.
Enterprise agent pilots stall at "impressive demo, never shipped" because teams score final answers while agents operate on trajectories — path-dependent decision sequences where one demo tells you almost nothing.
The limit on agent autonomy isn't capability, it's accountability. Every high-trust role is built around liability, and an AI bears no consequences for being wrong, so a human stays on the hook permanently.
The next discovery layer isn't search or an answer engine, it's the agent's own catalog of callable tools. If a planner can't find and invoke your capability, you don't exist in the workflows leaving the human web.
An agent's planner picks tools by reading a name, a description, and an input schema, then betting on the best fit. Winning that bet is a craft, and it lives in the contract, not the marketing.
Companies deploy AI like installing software. The right model is introducing an organism into an ecosystem, and selection pressure predicts the failure modes the ROI math can't see.
A discount books this month's revenue by permanently repricing every future transaction downward. You trade durable willingness-to-pay for a volume bump at a punishing exchange rate.
The consequential shift isn't agents running your errands, it's agents transacting with other agents. That needs identity, binding commitment, and settlement primitives the web never built, and it opens an adversarial surface it has never faced.
Today's agents are amnesiacs that re-solve your problem from scratch every session. The next advance isn't a smarter model but persistent, structured memory, and the accumulated record of working with you is where the real moat forms.
The price of a fixed unit of model intelligence is falling roughly 10x a year, and that single curve quietly invalidates the pricing model most AI companies are built on. Build on what the curve can't touch.
Product-market fit is a lagging, luck-contaminated indicator you can only read after the bets are placed. Founder-market fit — a specific, unfair edge in information, access, or lived problem-knowledge — is the leading one.
As agents act on our behalf, the binding constraint stops being capability and becomes trust: whether an agent serves your interest, resists hijacking, and is who it claims to be. The winners will compete on verifiable trust primitives, not raw IQ.
The scaling hypothesis is the most successful empirical regularity in the history of machine learning and an explanation of nothing. The industry has bet its capital structure on a line it cannot explain continuing straight.
Agent capability is bounded by the action space and feedback you expose, not the model's raw IQ. Most "our agent isn't smart enough" complaints are misdiagnosed environment-design problems.
Most durable production value comes from small, specialized models doing bounded jobs under deliberate orchestration. That's not a budget compromise; it's often the more robust and defensible design.
LLMs model the correlational structure of their training data with astonishing fidelity, but correlation is not causation and fluency is not truth. Knowing where that ceiling sits tells you what to trust them for and what the next paradigm must add.
Venture capital buys variance, not excellence. A fund lives on rare outliers, so a steady, cash-generative business is a failure to the fund even when it is generational wealth to you.
Science is not hypothesis generation, which is cheap and always was. It is the disciplined killing of hypotheses against reality, plus the taste to pick which are worth testing — and neither is a text problem.
Every model that ranks "what drives outcome Y" hands you a correlation, but you spend money on causes. The gap between the two is where data-driven companies quietly bleed, and more data makes it worse.
AI drug discovery keeps slipping because biology's labels are scarce, confounded, and often non-reproducible. You can't learn a reliable function from unreliable data; more compute just delivers the wrong answer faster.
"Will agents replace this job?" has a false premise in its grammar. The unit of automation is the task, not the job, and that reframe predicts which roles compress and which expand.
From inside a working lab: agents compress every part of science where a check is fast and cheap, and stall wherever the answer is gated by a wet-lab experiment that takes weeks. Difficulty was never the dividing line.
A clinical AI that is right 95% of the time is more dangerous, in one specific way, than one right 70% of the time: high reliability switches off the human vigilance the whole safety case depends on, and deskilling means the backstop never forms.
Most "dead" growth loops are working loops judged on the wrong clock. A control-systems view of why operators kill compounding loops at day 20 and overfeed vanity loops that quietly go negative.
Clinical AI's real future isn't a diagnosis-in-a-box. It's an agent that generates the full hypothesis space and proposes the cheapest discriminating test, while the physician stays the control layer that owns the priors and the cost of being wrong.
Per-seat licensing for a probabilistic system makes the buyer eat the reliability risk while the vendor gets paid whether it works or not. Outcome-based contracting is the only frame that puts accuracy back on the party who controls it.
Every task an agent takes over spins off new supervisory work: someone must bound it, review it, own its errors, and reconcile it with everyone else's. That load lands on middle management, and the span-of-control math breaks.
A pivot is a selection decision made under emotional pressure, and most founders answer it backwards: they keep the product they built and throw away the validated learning that was the only asset worth carrying.
Enterprises are re-running the RPA hype cycle with agents, and the thing that killed RPA — brittle integrations, dirty data, undocumented exceptions — is exactly what kills agents. The binding constraint is data legibility, not model quality.
LLMs are confident, fluent pattern-matchers that will always produce a plausible answer, right or wrong. Medicine built a discipline for reasoning safely around exactly that kind of mind: the differential diagnosis.
Agent pilots automate the clean 80% of cases and the business case dies on the messy 20%, because the exception tail holds most of the real cost — and it's exactly what a pilot curates away.
Your onboarding funnel measures signup completion. Retention is predicted by first-value delivery — a product event that fires after the funnel ends, so the dashboard is structurally blind to the moment that actually matters.
Your price is a filter that decides who walks in the door before it touches revenue — and the cheapest customers usually arrive with the worst version of the problem you solve.
The modal startup death isn't too few opportunities. It's too many pursued at once, none finished — and the cell solved this a billion years ago with a mechanism startups lack: programmed death.
"We have network effects" is the most over-claimed moat in startup strategy. Most so-called network effects saturate, cluster, and leak — and advantage is a metabolism you run, not an asset you possess.
Most growth spikes companies celebrate and slumps they panic over are regression to the mean — statistical gravity, not signal. Mistaking it for causation rewards noise and punishes sense.
Trust in a skeptical market is bought with signals that are expensive to fake — and "efficiency" is how you delete the exact thing that made them work.
Every clever prompt trick is a bet against the next model release, and you will lose it. The skill that appreciates is specifying the problem: goal, real constraints, acceptance test, and the cost of being wrong.
Your database schema is a frozen set of assumptions about what your business is. Once thousands of features depend on them, they constrain strategy far more than your language or framework ever will.