Partial Autonomy: Mastering AI Tech in its Age of Weirdness
Practical Strategies to Harness AI Despite Its Quirks
AI is weird.
Among the words to describe it that keep getting thrown around, this is one I keep coming back to.
Because it is. It’s unlike other software we’re used to. It feels magical, in that you don’t know what you’re going to get, but this also means inconsistent.
It can always surprise me with its capabilities, but this also means it lets me down without sufficient baselines and guardrails.
It’s constantly evolving, and like the mainframes of old we access the most advanced versions remotely, like sharing time on a mainframe.
It’s got absolutely incredible capabilities to assist us in all manner of ways, and yet it can be frustratingly hard to know when and exactly how. And how to act on it with consistency.
In today’s Substack I dive into this weirdness using a very specific source as framework: a talk by former OpenAI founding member and former Director of AI at Tesla Andrej Karpathy.
Drawing on his June keynote at the AI Startup School in San Francisco, I explore his topic of partial autonomy, grounded in my own experiences at PTP.
AGI vs Practical AI Agent Applications
The stated goal of the largest AI producers (like OpenAI and Google Deepmind) is developing AGI or reaching human-level intelligence.
Meta, in an effort to catch up, has recently shifted its aim a step further, targeting superintelligence, or ASI.
Timelines on when we’ll get there have generally come down, and even some highly respected AI doomers (like Eliezer Yudkowsky) have become increasingly vocal about wanting to stop AI development altogether out of a fear of survival.
(For more on this and his influence over companies like OpenAI, you can check out September’s A.I.’s Prophet of Doom Wants to Shut It All Down from the New York Times’s The Shift.)
But still, with GPT-5 looking a lot like GPT-4o, AGI talk, at least for businesses, has started to die down.
If something like AGI is coming anytime soon (beyond hype and speculation), I share the opinion that it is going to require additional approaches and far more awareness on the part of AI systems. In other words, entire stages of development that have yet to come to light.
In the talk I referenced above, Karpathy drew comparisons to self-driving cars, which, a decade ago, looked quite good in tests. He thought we'd have them widely in use on the streets in a couple of years.
And yet just as sometimes tech can take a surprisingly dramatic leap forward (that “ChatGPT moment”), it can also hang up for long periods of time, caused by unexpected real-world challenges.
Driving assist is one great example of this, because resources have been poured into making cars self-driving, and yet all manner of complexities continue to drag on the process. (Not the least of which is weather.) No matter how perfectly AI can drive cars in a vacuum, or in a world of other AI driving cars, on our streets it continues to face unexpected challenges.
Karpathy suggests, and I agree, that 2025 is not the year of AI agents. It is instead, the beginning of a decade of AI agents.
In other words, it's the time when we get our minds around AI weirdness and build products that tap into what AI can do right now to transform our lives and workplaces.
Andrej Karpathy AI Insights: Welcome to Software 3.0
Karpathy describes his belief that software hadn’t changed significantly for 70 years. But now he’s seen it change twice in only a few years.
In his view, Software 1.0 is the code that we have written for decades. It’s the way we program computers, going back to punch cards.
Software 2.0 came with advent of the earliest neural networks, where tuning was achieved through datasets and weights. This, he argued, functioned more like a decision tree, and he points to examples like the earliest image recognition systems that came about in the 2010s.
Software 3.0, the newest shift, refers to using natural language for application logic. Now we program LLMs with our words, voices, and images.
And while you may not agree with this breakdown, Karpathy, who also coined the term “vibe coding” back in February, is 100% correct that we are seeing a new era of natural language programming rising before us.
In the meantime, we have varying systems built with different goals and approaches in mind, all interacting within our businesses.
And just as codebases are constantly rewritten to take advantage of newer, more efficient, or accessible languages, so, too, will our software of today have to be adapted for best case use by AI.
Partial Autonomy AI: Managing AI’s Weirdness
If we’re being realistic here and can agree that business application of AGI is still many years off at a minimum, what do we make of current systems?
They cannot think as we do and certainly can’t feel. They have no consciousness, at least not like living creatures, and yet they do have something that looks like, or pretends to be, psychology.
They are, as Karpathy aptly describes them, “stochastic simulations of people.”
He points to films like Rain Man to get at LLM’s encyclopedic knowledge but lack of real-world capacity, and Memento and 50 First Dates for their lack of continuity, memory, and ability to learn and grow.
Today’s AI systems can find patterns humans have missed for decades in vast swaths of data, and yet still miscount the number of r’s in strawberry.
And unlike all of our coworkers, they do not go home and sleep and process what they’ve learning but instead are so often forced to start fresh.
They also have significant security limitations that come from a lack of human awareness and assessment. AI systems don’t differentiate the who or why of most requests, which is why prompt injections are so successful at getting them to leak information.
And yet, they also have encyclopedic knowledge. LLMs have read more than any human possibly could and can draw from across it with superhuman ability and speed in critical ways.
So rather than wait and dream of the day that AI is like us, how do we meet them halfway, and seize on their incredible abilities to empower us, while being smart and efficient with their shortcomings?
We do this with partial autonomy.
Emerging AI Startups in 2025 Are Thriving on Partial Autonomy
AI-driven software development, and maybe AI productivity tools in general, come in only a few varieties.
Some aim to do all the work for you. For software, this is akin to vibe coding. You tell it what you want and let it work, and take the results as finished. Check it through, and if it’s got issues, have it redone.
This is close to full autonomy, though it still requires you directing it and providing the requirements. (True full autonomy would be the AI deciding what it needs to vibe code as well as doing the actual coding, and the checking.)
And while this approach can yield startlingly impressive results, especially for weekend projects or in cases where the final product doesn’t need to be secure or efficient, it’s nowhere near the level needed to power enterprise software.
Software assistance, wherein AI provides us autocomplete, checking, education, rough drafts, initial code reviews, and more, is a far more realistic model for business use today.
This is partially autonomous, and it can be very successful.
We see partial autonomy with AI thriving in all sorts of markets, from customer service to sales to recruiting to medicine to engineering.
At its best, partial autonomy acts as an extension of the human being to take on the repetitive, mindless asks, to jump start our work, to back us up, and double check us.
And as a business model, we see startups that are thriving by taking this model to heart.
Cursor is one example profiled by Karpathy, and I won’t repeat his details here, suffice to say that as a platform it is effective at assisting users with accessing LLMs in a far more fluid manner. And it does this by allowing regular code editing, all the way to vibe coding, in a kind of autonomy slider.
Perplexity is another fine example. While the company has come under fire from sourcing and crediting at times, their product is one of the best examples of AI search on the market, and available to use effectively for free.
To test it, you only need open ChatGPT or Claude and Perplexity side by side and run the same queries. Perplexity uses OpenAI, Anthropic, and Meta models to power its information, and yet it provides some GUI in a comfortable interface, handles the model selection, calls, and integration for you, and still often comes up with superior results for quick research and faster, too.
And with AI agents, partial autonomy goes beyond this, too. It brings in simple, repetitive steps to gradually slide the level of autonomy we entrust to AIs when they are ready for it.
AI in Tech Stacks or AI Eats Our Tech Stacks?
Many AI companies now promise systems that do everything well and can handle it all. And yet we’re nowhere near this, realistically.
I believe many of the pilots we see failing for enterprises and startups alike come from this excessive enthusiasm, or optimism to get AI to handle far too much, and ignoring the ways it can transform business by doing less.
By entrusting too much autonomy to agents, they can waste time and money overlooking what today’s AI systems thrive at.
Writers, like coders, will tell you correctly that AI is not yet ready to create finished products. But what it’s already doing (and so many are using and even taking for granted), is providing assistance at research, outlining, rough drafts, autocomplete, and debugging or troubleshooting, day after day.
In some ways, the LLMs function like a computer itself, or its operating system. But what is still needed are the applications or software that run on this, and the GUI we use for interacting.
Karpathy describes a time from his experience at Tesla where they found their neural networks were increasingly being needed to do things they’d written in C++ code (he calls this Software 1.0 and 2.0).
The C++ code was continually being “eaten through,” or remade, in the form that was useful to the neural network.
To this end, context is critical and managing it may be like controlling a system’s memory. Where once we wrote code to manipulate a machine to gather and process data, increasingly the LLM can do many of these steps for us.
And yet we are still sharing these systems via the cloud. This poses security risks, but it also slows the process down.
Today’s AI systems need human-checking and they also need fine-tuning. Those parts, in my experience, are non-negotiable.
This makes it essential that we go against the grain of the “do-everything” approach to AI, and instead work with smaller, faster iterations to ensure we’re checking the AI and helping it to improve as quickly and effectively as possible.
This is far easier done in tiny bites than trying to digest entire meals, or stacks of meals, all at once.
Anyone who’s written much software via an LLM has seen the system vanish into a rabbit hole, where suddenly it’s not realistically assessing itself, getting much accomplished, or even really appearing to listen to your requests anymore.
And the best recourse in such situations is to just take what you need and start fresh. Or take advantage of the AI’s very limited memory, using partial autonomy.
This is a great example of our need for quicker checks for the human-in-the-loop. As Karpathy says, it is in our best interest, in the workplace at least, to keep LLMs on the tightest leash possible.
Conclusion: AI Workflow Automation and AI Agents for Daily Tasks
It’s not always exciting to say you want your LLM to help with the easiest possible asks. And yet, there is incredible value in that many are not yet realizing.
While AGI is tantalizing, it is also an incredibly complex, moving target with a highly fuzzy definition. And while it may benefit the largest tech companies to push all their chips in on its development, for the rest of us it’s far less important in the here and now than the question of how AI agents, practically, can assist in our daily tasks.
To this end, I see the near-term future of AI at work being about mastering safe, fast, affordable, and consistent use of AIs to slowly rewire what we do without bogging us down in frustrating complexities and rabbit holes.