By day I operate on spines. Outside the OR, I build and run autonomous AI systems.
Last night one of those systems ran 19 machine learning experiments without manual intervention: training models, evaluating results, adjusting parameters, and queuing the next batch. When I woke up, I had a ranked summary of the results and recommended next steps on my phone. From midnight to seven, the loop ran without anyone touching it.
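The overnight loop above can be sketched as a simple train, evaluate, rank cycle. This is a minimal illustration, not my actual system: `train_and_score` is a hypothetical stub standing in for a real training job, and the parameter grid is invented. In practice the next batch would be chosen adaptively from earlier results rather than enumerated up front.

```python
import heapq
import itertools
import random

def train_and_score(params):
    """Stand-in for a real training run; returns a validation score.
    A real system would launch a job here and wait for its metrics."""
    random.seed(hash(frozenset(params.items())) % (2**32))
    return random.random()

def overnight_loop(search_space, budget=19):
    """Run up to `budget` experiments and keep a ranked leaderboard."""
    leaderboard = []  # min-heap of (-score, run_id, params)
    candidates = [dict(zip(search_space, vals))
                  for vals in itertools.product(*search_space.values())]
    for run_id, params in enumerate(candidates[:budget]):
        score = train_and_score(params)
        heapq.heappush(leaderboard, (-score, run_id, params))
    # Top three configurations, best first: the "ranked summary"
    # waiting on the phone in the morning.
    return [(-neg, params) for neg, _, params in heapq.nsmallest(3, leaderboard)]

if __name__ == "__main__":
    space = {"lr": [1e-4, 1e-3, 1e-2], "depth": [2, 4, 8], "dropout": [0.0, 0.2]}
    for score, params in overnight_loop(space):
        print(f"{score:.3f}  {params}")
```

The `run_id` tiebreaker in the heap tuples keeps the heap from ever comparing two parameter dicts directly, which Python cannot order.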
A few months ago, a separate workflow screened 116 papers overnight and surfaced a connection in the neuroscience literature that I had not seen discussed explicitly before. That does not mean the machine “did science.” It means the cost of triage, synthesis, and iteration has dropped fast enough that many professionals still have not updated their intuitions.
Consider what happened last week. Anthropic’s Claude found a heap buffer overflow in the Linux kernel’s NFS daemon, a vulnerability that had been hiding in the code since 2003, before Git itself existed. The model mapped out a complete attack flowchart in ninety minutes. The researcher running the demonstration said he had never himself found a kernel vulnerability; the model did. Hundreds of previously unknown vulnerabilities in open-source software have been discovered this way. Meanwhile, Anthropic’s upcoming model, Mythos, has been described internally as “far ahead of any other AI model in cyber capabilities.” Cybersecurity stocks fell on the news. Longstanding security assumptions, built on the idea that finding vulnerabilities takes human expertise and time, are weakening.
This creates an uncomfortable tradeoff. Many useful agentic systems become far more capable when granted broader permissions: system-level access, tool permissions, network reach. The tension is real: tighter controls reduce an agent’s usefulness, while broader access increases the blast radius of a compromise. Palo Alto Networks’ Unit 42 calls a compromised AI agent a “supercharged insider threat.” In recent industry surveys, most organizations reported risky agent behavior such as oversharing data or overreaching permissions. And yet the productivity gains are large enough that the companies and individuals who manage this tradeoff intelligently will move faster than those who avoid it entirely.
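One way to manage that tradeoff is deny-by-default tool gating: the agent gets exactly the capabilities a task needs, and every decision is logged. The sketch below is a toy policy layer of my own invention, not any vendor’s API; the tool and host names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class ToolPolicy:
    """Least-privilege policy for an agent: deny by default,
    grant narrow capabilities explicitly, log every decision."""
    allowed_tools: set = field(default_factory=set)
    allowed_hosts: set = field(default_factory=set)
    audit_log: list = field(default_factory=list)

    def check(self, tool, target=None):
        ok = tool in self.allowed_tools and (
            target is None or target in self.allowed_hosts)
        # The audit trail is what turns an incident into a
        # reconstructable event instead of a mystery.
        self.audit_log.append((tool, target, "allow" if ok else "deny"))
        return ok

policy = ToolPolicy(allowed_tools={"read_file", "http_get"},
                    allowed_hosts={"api.internal.example"})
policy.check("http_get", "api.internal.example")  # granted
policy.check("shell_exec")                        # denied: never granted
```

Widening `allowed_tools` makes the agent more useful and the blast radius larger, which is the whole tension in one line of configuration.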
The same acceleration is straining science. At NeurIPS 2025, over a hundred hallucinated citations slipped past peer review — fabricated authors, nonexistent journals, URLs leading nowhere. At ICLR 2026, twenty-one percent of manuscript reviews were themselves AI-generated. Submissions to major conferences have nearly doubled in two years. Peer review was designed for a world where producing scientific text was expensive and slow. The system is increasingly strained by text that mimics good science convincingly enough to consume reviewer attention. Reviewers often cannot tell the difference reliably at scale, and there are not enough of them anyway. We need systems that evaluate thinking, not fluency. Process, not output. We are not there yet, and the gap is widening.
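Some of that triage can itself be automated. A few offline heuristics catch the crudest fabrications before a human reviewer spends attention on them; the sketch below is my own illustration, and a real pipeline would also resolve each DOI against a registry such as Crossref rather than trusting its shape.

```python
import re

# A DOI is "10.", a 4-9 digit registrant code, "/", then a suffix.
DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")

def triage_reference(ref):
    """Return a list of red flags for one reference dict.
    Offline pattern checks only; these catch crude fabrications,
    not a plausible-looking citation to a paper that does not exist."""
    flags = []
    doi = ref.get("doi", "")
    if doi and not DOI_RE.match(doi):
        flags.append("malformed DOI")
    if not ref.get("authors"):
        flags.append("no authors listed")
    year = ref.get("year")
    if year is not None and not (1900 <= year <= 2026):
        flags.append("implausible year")
    if not doi and not ref.get("url"):
        flags.append("nothing to verify: no DOI or URL")
    return flags
```

The limitation is the point: fluent fakes pass shape checks, which is why evaluation has to move from surface form to verifiable process.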
The disruption extends well beyond academia. Microsoft’s AI chief recently argued that much white-collar work could be automated within eighteen months. Anthropic’s CEO warns that half of all entry-level office jobs are at risk. Harvard Business Review reports that companies are already laying people off based on AI’s potential, not its current performance. The World Economic Forum projects 92 million jobs displaced by 2030. The exact timeline is uncertain. The directional pressure is not.
Small teams often adapt faster because they have fewer approval layers and can tolerate more operational risk. A solo developer can now prototype in a day many internal business tools that would previously have taken weeks. Most of what I use daily is custom software I built for myself. I do not buy off-the-shelf clinical research software or business management platforms; I prompt them into existence. The bottleneck has shifted from technical skill to problem definition. That shift looks durable.
This is where domain expertise becomes more valuable, not less. The barrier to building useful software has dropped sharply. But human domain knowledge — the kind that takes decades of training and thousands of patient encounters to develop — cannot be acquired quickly through prompting alone. It is the information layer that still provides real differentiation. The surgeon who understands spinal biomechanics and also knows how to direct an AI agent will produce work that neither the surgeon alone nor the AI alone could match. The same applies in law, in finance, in engineering, in any field with deep tacit knowledge.
The infrastructure for the next phase is already being built. Coinbase’s x402 protocol embeds stablecoin payments directly into HTTP requests, backed by Cloudflare, Circle, and AWS. Google’s payment protocol has attracted sixty partner organizations. The next generation of agents is being designed to handle transactions as well as research, writing, and code — although real deployment is still early. The plumbing for machine-to-machine micro-payments is going in now; what gets built on top of it remains to be seen.
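The general pattern behind such protocols is old: HTTP has reserved status code 402, Payment Required, since the 1990s. The toy round trip below illustrates the call, pay, retry shape only; the field names and header are my assumptions for illustration, not the actual x402 wire format, and a real client would sign an on-chain stablecoin transfer instead of encoding a claim.

```python
import base64
import json

def server(request):
    """Toy paid endpoint: demand payment via 402, then serve."""
    if "X-PAYMENT" not in request.get("headers", {}):
        # Machine-readable payment requirements (illustrative schema).
        return {"status": 402,
                "body": {"amount": "0.01", "asset": "USDC",
                         "pay_to": "0xMERCHANT"}}
    return {"status": 200, "body": {"data": "premium result"}}

def client():
    """Agent-side loop: call, pay on 402, retry once."""
    resp = server({"headers": {}})
    if resp["status"] == 402:
        req = resp["body"]
        # Stand-in for a signed payment; a real agent would settle
        # the transfer and attach a verifiable proof here.
        proof = base64.b64encode(json.dumps(
            {"paid": req["amount"], "to": req["pay_to"]}).encode()).decode()
        resp = server({"headers": {"X-PAYMENT": proof}})
    return resp
```

What matters for agents is that both halves are machine-readable: no checkout page, no human in the loop, just a priced request and an automatic retry.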
Recent model releases have changed what is practical to automate, often enough to invalidate workflows designed just months earlier. The model I am running today is meaningfully more capable than what I had six months ago, and the one coming next will be another step change. If you are planning your career, your company, or your research program around what AI could do last year, you are already behind.
I am not writing this to alarm anyone. I am writing it because as a surgeon and a coder, I know that you cannot fix a problem you refuse to see. Whether you are excited about AI or afraid of it, the trajectory is the same. Those who learn to direct these systems — who combine deep domain knowledge with the ability to orchestrate autonomous agents — will have a significant advantage. The adjustment cost rises the longer you postpone learning how to work with them.
If your job involves processing information, making decisions from data, or producing written output, how are you preparing for a world where a machine can do all three while you sleep?