For those looking for my normal political commentary, this isn’t it. I wrote this a few months ago and have nowhere else to put it. I’ve been involved with AI at the margins for a long time. It’s just awesome in what it can do, but can it truly replace a person? I do not think so. Here’s why:
Abstract
Billions of dollars are being bet on the arrival of artificial general intelligence soon. People smarter than me are confident it’s coming. But every time I look at what AGI would actually require, I run into the same set of problems. They’re not just hard. They seem to be at war with each other.
AGI, as people describe it, needs to do three things: understand and justify what it knows, reflect on its own reasoning, and operate across any domain. These sound reasonable together. But try to build something that does all three, and you hit a wall. Satisfy one requirement, and you undermine another. The more I look at it, the more AGI seems less like a hard engineering problem and more like a package of demands that can’t be met simultaneously.
This essay works through why. Not to claim machines can’t be powerful or useful—they obviously can. But the specific thing called AGI, the thing that understands and knows and justifies, may be asking for a triangle with four corners.
Introduction
I’ve been watching AI hype cycles for a while now. In the late 1980s, I worked at Gold Hill on Golden Common Lisp and GoldWorks, an expert system shell. Expert systems were going to change everything. Companies were betting big. The future was obvious.
Then came the AI winter. The technology hit walls nobody had anticipated. The money dried up. As the company contracted, I ended up as one of the last caretakers—eventually head of R&D, though by then the job was less about expanding the technology than refining what remained of an outmoded paradigm. I watched that hype cycle collapse from inside, with my hands on the equipment.
The current moment feels familiar. Startups raise billions for AGI infrastructure. Job postings announce platforms “for when AGI is achieved.” Not if. When. The confidence is remarkable.
Maybe they’re right. The people making these bets are smart, and the technology is genuinely impressive. I use these tools every day. They’re useful in ways that would have seemed like magic in 1989.
But I keep running into problems. Not new problems—old ones, from the early days of AI research. Problems that got set aside rather than solved. And when I push on them, they seem to connect in a way that makes me doubt AGI is coherent as a goal.
I’m not a philosopher. I don’t have the vocabulary for some of this. But I’ve been thinking about these issues for decades, and I want to lay out what I keep bumping into.
What AGI Would Have to Be
First, some clarity on what we’re talking about.
When people say AGI, they sometimes mean general capability across domains. But that’s nearly here already. Combine a few specialized systems with a generalist coordinator and you get broad capability. Nobody would call that AGI. Nobody’s funding platforms for “when ensemble systems arrive.”
The real excitement, the thing driving the valuations and the apocalyptic warnings, is something bigger: a system that genuinely understands. That knows things, not just predicts them. That can justify its reasoning, catch its own errors, and operate reliably across situations it’s never seen before.
That’s what I mean by AGI in this essay. Not just capability, but understanding.
For a system to genuinely understand, it seems like it needs to do at least three things:
First, it needs epistemic authority. It has to be the kind of thing that can know, not just output. There’s a difference between a broken clock showing the right time and someone who knows what time it is. The clock correlates with truth twice a day. The person has justification. AGI needs to be in the second category, or it’s just a very sophisticated broken clock.
And I don’t mean reliable within some narrow domain—we have that already. AGI claims to be the kind of thing that can know across arbitrary situations, the way humans can walk into an unfamiliar field and start figuring things out. That’s the ambition. Anything less is just narrow AI with good marketing.
Second, it needs self-reflection. It has to model its own reasoning, catch errors, know what it knows and what it doesn’t. Otherwise, it’s just running patterns without any understanding of what it’s doing.
You could build systems that do surface-level error checking without deep meta-reasoning. But that’s not what AGI promises. The whole point is a system that can question its own foundations when they fail—and that’s where the problems start.
Third, it needs to handle novelty. Not just perform well on variations of training data, but actually cope with genuinely new situations—figure out what matters, what’s relevant, what concepts apply.
These seem like reasonable requirements. Taken one by one, none of them is obviously impossible. But when you try to engineer a single system that satisfies all three at once, the tensions start to show. That is where the triangle appears.
The Triangle
Here’s what I keep running into.
Epistemic authority—being a genuine knower—requires some kind of consistency. You can’t claim to know things if you contradict yourself all the time. An inconsistent system can “prove” anything, which means its outputs are worthless as knowledge claims. So if you want AGI to actually know things, it needs to maintain some coherent standards.
But coherent standards bring you into the territory of formal systems. And formal systems run into Gödel’s incompleteness theorems. I’ll get into this more below, but the short version: any system powerful enough to reason about itself and consistent enough to be reliable will have blind spots. There will be truths about itself it cannot prove. It cannot fully verify its own reliability.
One response: relax consistency. Let the system tolerate contradictions, revise itself, and operate more loosely. This might help with Gödel. But it kills epistemic authority. An inconsistent system can’t ground knowledge claims. It might still be useful, but it’s not a knower in any meaningful sense. It’s just an engine that produces outputs.
Another response: don’t worry about self-reflection. Just build a capable system without demanding it understand itself. But then you’ve given up on a key part of what AGI was supposed to be. A system that can’t reflect on its own reasoning can’t catch its own errors, can’t know what it knows, can’t be trusted to understand rather than just perform.
And underneath all of this is a deeper problem: figuring out what’s relevant. Any situation involves an infinite number of potentially applicable facts. How does a system decide what matters? This is called the frame problem, and it’s been haunting AI since the 1960s. I’ll get into this too, but here’s the punch line: solving it seems to require already having the understanding you’re trying to build.
So the requirements keep colliding. Epistemic authority demands consistency, and consistency walks you straight into Gödel: the system acquires blind spots about itself that it cannot close. Relax consistency and you're no longer a knower. Drop self-reflection and you've given up the understanding AGI was supposed to deliver. Recast AGI as an open-ended, always-revising process and the halting problem says it can never tell whether it's converging. And beneath all of these sits the frame problem: handling novelty requires determining relevance, determining relevance requires meaning, and meaning is the thing you were trying to build in the first place.
Each escape from one corner leads into another. I'll work through them one at a time.
To see why the first side of this triangle causes trouble, it helps to look more closely at what kind of thing a reasoning machine actually is.
The Gödel Corner
Let me unpack Gödel, because it’s easy to misuse. But to understand why it matters for AGI, I need to start earlier—with what computers actually are.
At the turn of the twentieth century, Bertrand Russell and Alfred North Whitehead tried to put all of mathematics on a foundation of pure logic. Their Principia Mathematica was an attempt to show that mathematical truth could be derived from logical rules alone—complete, consistent, certain.
In 1931, Kurt Gödel proved this was impossible. Any formal system powerful enough to express basic arithmetic and consistent enough to be reliable contains true statements it cannot prove. If it’s consistent, it’s incomplete. The dream of a complete logical foundation for mathematics was over.
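For readers who want the schematic version, here is the standard textbook form of the two theorems. The notation is the usual one from mathematical logic, assumed here rather than taken from anything in this essay:

```latex
% Gödel's incompleteness theorems, schematically, for any consistent,
% effectively axiomatized theory T that can express basic arithmetic.
% (Standard logic notation assumed; requires amsmath and amssymb.)
\begin{align*}
  &\text{First theorem:}
      && \text{there is a sentence } G_T \text{ such that }
         T \nvdash G_T \ \text{ and } \ T \nvdash \lnot G_T \\
  &\text{Second theorem:}
      && T \nvdash \mathrm{Con}(T)
         \quad \text{(the theory cannot prove its own consistency)}
\end{align*}
```

The second theorem is the one that matters most later in this essay: it is the formal version of the claim that a consistent system cannot fully verify its own reliability.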
Here’s what matters for AGI: computers inherit the logicist project. Formal consistency is the basis for their reasoning.
At bottom, every computer is Boolean logic—AND, OR, NOT gates doing arithmetic. That’s all a computer is. The software, no matter how sophisticated, runs on top of these logical foundations. Neural networks, probabilistic models, and deep learning—all of it compiles down to logic gates doing formal operations.
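To make "logic gates doing arithmetic" concrete, here is a minimal sketch in Python. The function names (AND, OR, NOT, XOR, half_adder) are mine, invented for illustration; real hardware does this in silicon, but the structure is the same.

```python
# A toy sketch: even one-bit addition is nothing but Boolean gates.
# The names below (AND, OR, NOT, XOR, half_adder) are illustrative only.

def NOT(a): return 1 - a
def AND(a, b): return a & b
def OR(a, b): return a | b

def XOR(a, b):
    # XOR built from AND, OR, and NOT alone
    return AND(OR(a, b), NOT(AND(a, b)))

def half_adder(a, b):
    """Add two one-bit numbers; return (sum_bit, carry_bit)."""
    return XOR(a, b), AND(a, b)

# 1 + 1 = binary 10: sum bit 0, carry bit 1
assert half_adder(1, 1) == (0, 1)
assert half_adder(1, 0) == (1, 0)
```

Stack enough of these and you get multiplication, matrix math, and everything a neural network does during training and inference.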
This means anything a computer does is, at the deepest level, a formal system powerful enough to express arithmetic. The very thing Gödel proved was incomplete.
Now, Gödel was talking about proof systems—mathematics, logic. A neural network isn’t trying to prove theorems. It’s matching patterns. So you might think Gödel doesn’t apply.
For many purposes, that’s right. A system that predicts the next word doesn’t need to worry about incompleteness. It’s not making knowledge claims.
But AGI is supposed to be different. It’s supposed to know things. Justify its reasoning. Be a reliable epistemic agent. The moment you want that, you need consistency, discipline—standards for what counts as justified belief, for how to revise when you’re wrong, for what follows from what.
Any such standards, if they’re powerful enough to handle AGI-level reasoning, constitute a formal system. You can’t build a reasoning machine without building a formal system. And because computers are Boolean logic at the bottom, any formal system running on them inherits the same foundations that Gödel proved incomplete.
There’s no escaping this by making the system “more flexible” or “less formal.” The flexibility is implemented in formal logic. The computer underneath is still doing arithmetic. The foundations don’t change.
A clarification matters here. Gödel’s theorem doesn’t forbid computation. It forbids completeness. A consistent formal system can still compute, still reason, still do useful work. What it cannot do is prove all truths expressible in its own language—including, crucially, truths about its own reliability. Most philosophers today see Gödel’s theorems not as prohibitions on machine reasoning, but as structural markers of limits on self-reference. The question is whether those limits matter for AGI. I think they do.
Philosophers have argued about this for decades. Lucas and Penrose claimed Gödel shows human minds transcend computation because we can see the truth of Gödel sentences that formal systems can’t prove about themselves.
Critics pushed back hard. LaForte, Hayes, and Ford pointed out that the argument assumes humans are consistent. But we’re not. We contradict ourselves constantly. If Gödel only limits consistent systems, and humans aren’t consistent, then maybe the theorem doesn’t show anything special about us.
Here’s where it gets interesting for AGI. That rebuttal is correct. Humans aren’t consistent. And that might be exactly how we escape Gödel’s limits—by being the kind of messy, inconsistent thinkers that formal constraints don’t bind.
But AI can’t use that escape hatch.
First, because an inconsistent AI loses the epistemic authority AGI requires. You can’t trust the outputs of a system that contradicts itself. If you want AGI to be a knower, it needs consistency. And consistency means Gödel.
Second, and more fundamentally, any “inconsistency” in an AI would have to be implemented in the formal system. It would be inconsistent according to rules, running on logic gates, grounded in the same Boolean foundations. That’s not the kind of constitutive inconsistency humans have. That’s a formal system simulating inconsistency—which is still a formal system.
Humans might escape Gödel because we were never formal systems to begin with. Our inconsistency isn’t implemented in logic. It might go all the way down.
AI can’t make that move. The computer is a formal system all the way down. There’s no deeper level where it escapes. The logicist foundations are the floor.
The rebuttal intended to rescue AGI actually makes things worse. It shows there’s an asymmetry. Humans can be inconsistent and still know things. Machines, built on formal logic, need consistency to be trustworthy. And consistency means limits they cannot escape—not because they haven’t tried the right architecture, but because the hardware itself is Gödel’s territory.
Gödel’s results mark one boundary: what a consistent, self-reflective formal system can know about itself. But AGI also has to live in a world, not just in a proof calculus. Once you ask how a system decides what matters in that world, you run into a different class of problems.
The Frame Problem Corner
The frame problem sounds technical, but it’s actually pretty simple.
Imagine a robot trying to fetch you a cup of coffee. It needs to know what’s relevant. The cup’s location matters. The cup’s weight matters if it’s full. The color of the walls doesn’t matter. The political situation in France doesn’t matter. The cup’s manufacturing history doesn’t matter.
Humans make these relevance judgments instantly, without thinking. We just see what matters. But how? Nobody knows. And for a formal system, this is catastrophic.
In principle, any fact could be relevant to any situation. Every action could change every other fact. To be sure you haven’t missed something, you’d have to check everything. The computational space explodes.
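A toy sketch makes the growth rate concrete. The five "facts" below are just the ones from the coffee example; the point is the doubling, not the particular list.

```python
# Toy sketch of the combinatorial blow-up behind the frame problem:
# if any subset of known facts might be the relevant one, the number of
# candidate "relevance sets" is 2**n and doubles with each new fact.
from itertools import combinations

facts = [
    "cup location", "cup fullness", "wall color",
    "politics in France", "cup manufacturing history",
]

candidate_relevance_sets = [
    set(subset)
    for r in range(len(facts) + 1)
    for subset in combinations(facts, r)
]

print(len(candidate_relevance_sets))   # 32, i.e. 2**5
# With 50 facts the count is 2**50 (about 10**15); with 300, more sets
# than atoms in the observable universe. Exhaustive checking is hopeless.
```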
Early AI researchers assumed they’d solve this with clever heuristics. Six decades later, no general solution exists.
Modern AI sidesteps the problem through massive pattern matching. Large language models learn what’s typically relevant by absorbing vast amounts of human text—text produced by minds that already solved the relevance problem. The model doesn’t determine relevance. It approximates what humans have determined to be relevant in similar situations.
This works remarkably well within the distribution of training data. It fails, sometimes spectacularly, when genuinely novel situations arise.
Here’s the deeper issue. To determine what’s relevant, you need to understand what the situation is about, what your goals are, what concepts apply. But understanding is what you were trying to build. You can’t solve the frame problem without already having the understanding that solving it was supposed to produce.
Meaning requires relevance. Relevance requires meaning. It’s circular.
And when you try to determine relevance for self-reflection—figuring out what’s relevant about yourself—you hit another wall. What’s relevant to determining what’s relevant? That’s a relevant question too. It recurses. And there’s no guaranteed stopping point. The frame problem, applied to itself, becomes a kind of halting problem: there’s no way to know when you’ve checked enough, no termination condition that doesn’t require solving the problem you’re trying to solve.
Humans somehow live inside this circle without resolving it. We don’t figure out what’s relevant by applying rules. Things just matter to us, directly, before reasoning kicks in. A loud noise matters before you decide it matters. Pain matters before you think about it.
Consider what happens when you touch a hot stove. You don’t first sense temperature data, then consult a rule about tissue damage, then compute an appropriate response, then execute a withdrawal motion. The burning matters immediately, in the experience itself, before any reasoning occurs. The urgency is not computed. It’s there from the start. Or think about catching a ball someone throws to you unexpectedly—you’re already moving before you’ve consciously registered the situation. The relevance of the ball’s trajectory to your hands was never decided. It was given.
This is the difference I've started calling implementing versus inhabiting. A system that implements follows rules specified independently of anything that matters to the system; all the mattering was put there by the designers. A system that inhabits encounters a world where things already matter. When you inhabit a situation, the mattering comes first. When you implement a response, the rules come first, and the mattering, if it's there at all, gets added by external interpreters.
Building an “implementing system” is straightforward. Building an “inhabiting system” requires specifying what matters. But specifying what matters is implementing, not inhabiting. You can’t implement your way into inhabitation.
At this point, a natural response is: “Sure, pure logic-based systems and pure pattern-matchers each have limits. But what if we combine them?” That is the promise of hybrid or neuro-symbolic approaches.
What About Hybrid Approaches?
Some researchers hope to get around these problems by combining neural networks with symbolic reasoning—so-called neuro-symbolic or hybrid AI. The idea is to get the best of both worlds: neural networks handle pattern recognition and learning from data, while symbolic systems handle logical reasoning and explicit knowledge.
Gary Marcus, one of the most prominent advocates for this approach, argues that hybrid architectures are “necessary for robust intelligence, but not sufficient.” He’s right on both counts, and the second part is the problem.
Hybrid systems are real and useful. Google’s AlphaGeometry combines neural networks with symbolic proof systems to solve geometry problems. IBM and others have built systems that integrate learning with logical rules. These work better than either approach alone on certain tasks.
But they inherit the limitations of both components. The neural part still approximates relevance from training data—it doesn't originate new frames. The symbolic part still runs into the frame problem in open-ended situations: Which rules apply? When do you stop searching? How do you know your knowledge base is complete enough? And the hybrid has to decide when to hand off from one component to the other, a decision that is itself a relevance judgment.
As one recent survey put it, integrating neural and symbolic approaches remains “a fundamental and open challenge.” The combination helps with specific tasks but doesn’t dissolve the underlying circularity. You’ve distributed the problem across two systems. You haven’t solved it.
The early AI approaches I worked on—expert systems, knowledge-based reasoning—were symbolic systems that hit the frame problem head-on. The current neural approaches sidestep it through massive pattern matching. Hybrid systems try to do both. None of them escapes the fundamental issue: something has to determine relevance, and determining relevance requires meaning, and meaning is what you were trying to build.
The Halting Problem Closes the Escape
There’s one more piece, and it’s Turing’s.
In 1936, Alan Turing proved you can’t write a program that determines, for all possible programs and inputs, whether that program will ever halt or run forever. There’s no general solution to “will this process finish?”
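The core of the argument is short enough to gesture at in code. This is only a sketch of the diagonal construction, using a hypothetical halts oracle that, by Turing's result, cannot actually be written:

```python
# A sketch of Turing's diagonal argument (illustrative only; `halts` is a
# hypothetical oracle that cannot actually exist).

def halts(program, argument):
    """Hypothetical: returns True iff program(argument) eventually halts."""
    raise NotImplementedError("Turing's point: no such general procedure exists.")

def paradox(program):
    # Do the opposite of whatever the oracle predicts about running
    # `program` on itself.
    if halts(program, program):
        while True:      # oracle says "halts" -> loop forever
            pass
    return               # oracle says "loops forever" -> halt immediately

# Now ask about paradox(paradox). If halts(paradox, paradox) returns True,
# paradox loops forever; if it returns False, paradox halts. Either answer
# contradicts itself, so no correct general `halts` can be written.
```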
Why does this matter for AGI?
Some people respond to the problems above by redefining AGI. Forget completed systems. Think of AGI as an open-ended process—always learning, always revising, never finished. A trajectory, not a destination.
But a trajectory has its own problems. If the system is always revising itself, can it know when revision should stop? Can it know if its changes are progress or just drift? Can it know if it’s converging toward understanding or wandering in circles?
The halting problem says no. A system can’t, in general, determine whether its own processes will terminate or converge.
So open-ended AGI can’t know if it’s getting anywhere. And without that, there’s no stable identity over time. What makes version 47 of the system the same knower as version 1? If it’s abandoned previous standards, adopted new criteria, changed what it counts as understanding, in what sense is it the same agent?
This is a version of the Ship of Theseus puzzle, but worse. With the ship, at least there’s a continuous physical object being gradually modified. With a self-revising AI that changes its own criteria for what counts as knowledge, understanding, or even improvement, there’s no stable frame in which to even ask whether it’s the same system. The very standards by which you’d judge continuity are themselves subject to revision. It’s not just that the planks are being replaced—the concept of “plank” keeps changing.
The problem isn’t just that we can’t verify such a system. The problem is that there’s no “it” to verify. It’s not a system at all. It’s a sequence of states with no principled continuity.
Calling that “AGI” is smuggling in human intuitions about identity and persistence while abandoning everything that made AGI a coherent goal.
The Triangle Closes
Let me pull this together.
AGI, as people actually mean it, requires epistemic authority, self-reflection, and the ability to handle novelty. These seem like reasonable requirements. But they fight each other.
Epistemic authority requires consistency. Consistency exposes you to Gödel. Your system has blind spots about itself. It can’t fully verify its own reliability.
One escape: relax consistency. But then you lose epistemic authority. An inconsistent system isn’t a knower. It’s just an output generator.
Another escape: give up on self-reflection. But then you’ve abandoned a core part of what AGI was supposed to be. You’ve got capability without understanding.
Another escape: treat AGI as an open-ended process. But the halting problem means such a process can’t know if it’s converging. It has no stable identity. It’s not something that could have achieved AGI.
Underneath all of these: the frame problem. To handle novelty, you need to determine relevance. To determine relevance, you need meaning. Meaning is what you were trying to build. The dependency runs the wrong way.
Every corner of the triangle leads to another corner. The requirements aren’t just hard to satisfy together. They may be impossible to satisfy together.
Not because any single problem is insoluble. Because solving some forecloses others.
If that diagnosis is correct, AGI, as usually advertised, is not just distant; it is mis-specified. Still, it is important to be clear about what would actually count as evidence that this picture is wrong.
What Would Change My Mind
I could be wrong about all of this. I’m a software engineer, not a philosopher. I’ve been thinking about these problems for a long time, but smarter people than me are confident AGI is coming.
Three things would make me reconsider.
First, a demonstrated system that originates new relevance frames without borrowing from human semantics. Not interpolation from training data. Not the retrieval of human-generated concepts. Genuine origination in a domain with no prior human framing.
Second, an explanation of how “this should matter” can arise from “this correlates” without smuggling in meaning. A formal account of normativity that doesn’t presuppose what it’s trying to explain.
Third, a system that revises its own interpretive framework without human input to determine what counts as improvement. A machine that can tell, on its own, whether its changes are progress or drift.
None of these exist today. That doesn’t prove they’re impossible. But it shifts the burden. People claiming AGI is imminent should explain how these gaps close, not just assume that scale will handle it.
Conclusion
I want to be clear about what I’m claiming and what I’m not.
I’m not claiming AI is useless. It’s obviously useful. I use it every day. The current tools are more capable than anything I imagined when I was working on expert systems in the 80s.
I’m not claiming machines can’t be intelligent in some sense. Narrow AI is real and powerful.
I’m claiming that the specific package called AGI—a system that genuinely understands, knows, justifies, reflects on itself, and handles arbitrary novelty—may be internally inconsistent. Not hard to achieve. Incoherent as a goal.
The requirements fight each other. Satisfying some undermines others. Every escape route leads back into a wall.
What we can build, and in fact are already building, looks more like what I want to call artificial general participation. These are systems that plug into human practices of understanding rather than replacing them. They help extend and coordinate human meaning-making, but they do not originate it from nowhere.
In practice, artificial general participation might look like:
Systems that help humans explore complex domains by surfacing patterns, analogies, and candidate hypotheses—while humans supply the standards for what counts as insight, relevance, or progress.
Tools that stabilize and share local know-how across communities, capturing tacit practices, constraints, and norms—without pretending that the tool itself has those practices or norms in the way practitioners do.
Agents embedded in institutional workflows—science, law, medicine, engineering—that can suggest options, flag inconsistencies, and simulate consequences, but whose proposals are always interpreted and normatively assessed by humans.
In all of these cases, the system participates in understanding without possessing understanding in the way AGI discourse suggests. It contributes to epistemic processes, but it does not itself stand as an autonomous epistemic agent with unconditional authority. Its “mattering” is always downstream of what matters to the humans and institutions that design, deploy, and oversee it.
Taking artificial general participation seriously would also change how we think about governance and trust. Instead of asking “Can we trust this system as if it were a human knower?”, we would ask:
Who is accountable when its outputs are taken up?
What human purposes and values is this system embedded in?
How are its limitations represented to users, and how easy is it to override or contest its suggestions?
The focus shifts from conjuring a disembodied super-knower to designing socio-technical arrangements where limited, fallible tools can still help humans reason better together. That is not as flashy as AGI, but it is much closer to what current systems can actually do—and to where serious work on alignment, safety, and institutional design needs to happen.
What would meaningful artificial participation look like in practice? How do you govern, align, and trust systems that extend human understanding without possessing it? Those are real questions, and they deserve serious attention. But they require first admitting what we’re actually building—and what we’re not.
The last time I watched an AI hype cycle, everyone was sure expert systems would change everything. They were wrong, not because the technology was bad, but because the goals outran what the technology could actually deliver.
I see the same pattern now. Remarkable technology. Genuine usefulness. And confidence about where it’s heading that isn’t grounded in any account of how the foundational problems get solved.
Maybe I’m wrong. Maybe the people betting billions know something I don’t.
But the current discourse has a familiar ring. And I’ve learned to pay attention when smart people promise things that keep running into the same walls.
Notes
- McCarthy, J. and Hayes, P.J. (1969). “Some Philosophical Problems from the Standpoint of Artificial Intelligence.” Machine Intelligence 4: 463-502.
- Dennett, D.C. (1984). “Cognitive Wheels: The Frame Problem of AI.” In C. Hookway (ed.), Minds, Machines and Evolution. Cambridge University Press.
- Whitehead, A.N. and Russell, B. (1910-1913). Principia Mathematica. Cambridge University Press.
- Gödel, K. (1931). “Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I.” Monatshefte für Mathematik und Physik 38: 173-198. For accessible treatment: Nagel, E. and Newman, J.R. (1958). Gödel’s Proof. New York University Press.
- Lucas, J.R. (1961). “Minds, Machines and Gödel.” Philosophy 36(137): 112-127.
- LaForte, G., Hayes, P.J., and Ford, K.M. (1998). “Why Gödel’s Theorem Cannot Refute Computationalism.” Artificial Intelligence 104(1-2): 265-286.
- Penrose, R. (1989). The Emperor’s New Mind. Oxford University Press.
- Brooks, R. (1991). “Intelligence Without Representation.” Artificial Intelligence 47: 139-159.
- Clark, A. (2013). “Whatever next? Predictive brains, situated agents, and the future of cognitive science.” Behavioral and Brain Sciences 36(3): 181-204.
- Turing, A.M. (1936). “On Computable Numbers, with an Application to the Entscheidungsproblem.” Proceedings of the London Mathematical Society 42: 230-265.
- Dreyfus, H.L. (1972). What Computers Can’t Do. MIT Press.
- Marcus, G. (2020). “The Next Decade in AI: Four Steps Towards Robust Artificial Intelligence.” arXiv:2002.06177. See also Marcus, G. and Davis, E. (2019). Rebooting AI: Building Artificial Intelligence We Can Trust. Pantheon.
- Garcez, A. and Lamb, L.C. (2023). “Neurosymbolic AI: The 3rd Wave.” Artificial Intelligence Review 56: 12387-12406. See also: Krishnan, A. et al. (2024). “Towards Cognitive AI Systems: A Survey and Prospective on Neuro-Symbolic AI.” arXiv:2401.01040.
- Varela, F.J., Thompson, E., and Rosch, E. (1991). The Embodied Mind: Cognitive Science and Human Experience. MIT Press.
- Elish, M.C. (2019). “Moral Crumple Zones: Cautionary Tales in Human-Robot Interaction.” Engaging Science, Technology, and Society 5: 40-60.
- Shanahan, M. (2015). The Technological Singularity. MIT Press.