Michael Rizvi-Martel

Agentic Coding: A New Abstraction Layer in the Programming Stack

Or How I Learned to Stop Worrying About AGI and Love Agents

February 2026

TLDR: I give some thoughts on how AI has reshaped my coding setup in the years since ChatGPT's release. I argue that agents can be thought of as a new layer of abstraction in the programming stack, and speculate (with some uncertainty) that future software developers will "code" requirements in natural language. (I am not sure what form this will take, or what the "IDEs of the future" will look like.) I close with a historical look at the mistrust that greeted earlier layers of abstraction in the stack, arguing that the resistance to coding agents resembles that of other major paradigm shifts in programming.

Agents and Me

Agents have taken the world by storm. I think it is fair to say that no one in the programming world has been left indifferent by their advent. Although these systems have not made their way into every workplace and onto every developer's computer yet, they have their fair share of early adopters. Across conversations with AI researcher and developer friends, I have seen a real range of opinions. Some PhD colleagues flat out refuse to use them out of mistrust: they feel that understanding the internals of these models makes them less trusting, like aviation engineers afraid to board a plane because they know every possible thing that could go wrong. Others, like myself at this point, have leaned fully into coding agents, for better or for worse. I have been using Claude Code with the --dangerously-skip-permissions flag active all the time (meaning it no longer asks me for permission to run commands) and have fully bought into the "skills" and CLAUDE.md development philosophy. Customizing my Claude Code setup reminds me of how I would customize my IDE back when I started grad school.

To be quite honest, agentic coding has made me love building again. The barrier to entry for forking a codebase or starting a new one from scratch feels much lower. It is hard to say how much of a productivity gain coding with agents has brought me, but I am certain it has saved me a lot of time otherwise lost to "blank page syndrome": wondering how to structure a codebase, or simply procrastinating for lack of motivation. Agentic systems might not always be right (though, being honest, they are right a lot of the time), but starting from a first draft, or being spared the pain of reading through someone else's code to start a new project, is definitely freeing for me. Going back to this IDE vs. CLAUDE.md comparison, I noticed more broadly that, throughout my budding academic career, I can chart my software development paradigm shifts into four broad phases:

  1. Coding + Stack Overflow: from undergrad to early ChatGPT days, I would code manually and go to Stack Overflow to find functions or one-liners to solve specific problems. I would typically organize my code (e.g., a main or a complicated object/function) by first writing the headers and writing comments in natural language explaining what each subpart of the code would do. Then I would either code each "comment" or go look for a Stack Overflow post or such to fill it in.
  2. Coding + ChatGPT: The first "AI revolution" for me was the shift from Stack Overflow to ChatGPT. When ChatGPT got good enough at writing Python code, I started using it the same way I would use Stack Overflow. The rest of the development remained the same. Through this phase, I started using ChatGPT to generate more and more ambitious bits of code, giving it more and more freedom (from one or two lines of code to writing full functions/methods for an object).
  3. Coding + Copilot: At first this was similar to my ChatGPT use: writing comments and having Copilot autocomplete the missing bits according to the spec. The magic here was that I could write the comments and have Copilot directly add the lines of code in the IDE! At this phase I started dabbling with agents in the Copilot setup. I was not convinced, though; at that point Copilot's agentic mode moved too fast for me and made a fair number of mistakes. I felt that I needed to give more supervision to the AI I was letting code.
  4. Agents (Claude Code in my case): I finally decided to let the barbarians in during the summer of 2025. A friend of mine at ACL said "You have to try Claude Code, man, it's really good" in response to my skepticism concerning agents. I did, and I have not looked back. Depending on the nature of the project, I sometimes don't even look at the code anymore. The more I use it, the more I realize how good it is. I am not the best programmer, but I am not a terrible one either. In any case, Claude is definitely a better programmer than I am.

Agents as a New Layer of Abstraction

Going through these shifts in my software development philosophy got me thinking about the nature of software development. Which is, namely, about developing software. As a black box, it is something like "requirements in → software out." Throughout the history of programming, we have built new layers of abstraction to make this pipeline easier. At the current point of my ridiculous vibe coding follies, I think we are on the cusp of a new one, except this time the layer of abstraction is natural language itself. Andrej Karpathy tweeted a couple of years back that "The hottest new programming language is English." This could not be truer today. I code in English now. Claude handles the implementation. But this does not mean I code mindlessly. Through all the tokens I have burned, I noticed a principle which I dub "information in, information out": I can abstract away the implementation (how a tensor operation is done, which library call to use), but I have to be precise about design decisions. What loss function? What positional encodings? Batch-first or not? When I leave these unspecified, the model gets them wrong; and how could it not? Especially in research, there is often no objectively right answer to these questions.

To make this concrete, here is a rough example from my actual workflow. Say I want to train a small transformer on a sequence copying task (the kind of toy task you would use to study length generalization). If I just tell Claude Code "write me a transformer that learns to copy sequences," I will get something — but it will make a bunch of design choices that I probably disagree with. It might use learned positional encodings when I wanted NoPE or RoPE. It might set up the causal mask wrong for an encoder-decoder vs. a decoder-only setup. It might default to cross-entropy when the task structure calls for something else. The code will run, but it won't be my experiment. What I actually need to specify is closer to: "decoder-only transformer, RoPE embeddings, 4 layers, 256 hidden dim, trained on sequence copying where input is [tokens] [SEP] and target is the same tokens again, use teacher forcing, cross-entropy loss only on the target portion of the sequence, train lengths 1–20 and evaluate on lengths 21–40." Once I give it that, it gets it right almost every time. The implementation details — how to build the RoPE rotation matrices, how to set up the data loader, how to mask the loss — those it handles fine on its own. The point is: the things I still need to specify are design-level decisions, not implementation-level ones. The abstraction eats the implementation but not the design. Circling back to my pre-ChatGPT workflow in undergrad, I don't need to fill in the code after the comments; the comments are the program.
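To show what "the comments are the program" can look like on the data side, here is a minimal sketch of the copy-task spec pinned down as explicit code. Everything here is a hypothetical illustration (the names `CopyTaskSpec` and `make_copy_example` are mine, not from any real codebase); the architecture itself stays abstracted away, but the design decisions (separator, loss masked to the target portion, train/eval length split) are made unambiguous:

```python
import random
from dataclasses import dataclass

@dataclass
class CopyTaskSpec:
    # Design-level decisions from the prompt, made explicit.
    n_layers: int = 4
    hidden_dim: int = 256
    pos_encoding: str = "rope"    # vs. "learned" or "nope": a design choice
    train_lengths: tuple = (1, 20)
    eval_lengths: tuple = (21, 40)
    sep_token: int = 0            # reserved separator id
    vocab_size: int = 10          # content tokens are 1..vocab_size

def make_copy_example(spec: CopyTaskSpec, length: int, rng: random.Random):
    """Build one decoder-only training sequence and its loss mask.

    The full sequence is [tokens] [SEP] [tokens]; the mask is 1 only on
    the second copy, so cross-entropy is computed on the target portion.
    """
    tokens = [rng.randint(1, spec.vocab_size) for _ in range(length)]
    seq = tokens + [spec.sep_token] + tokens
    loss_mask = [0] * (length + 1) + [1] * length
    return seq, loss_mask

rng = random.Random(0)
seq, mask = make_copy_example(CopyTaskSpec(), 5, rng)
assert len(seq) == len(mask) == 11
assert seq[:5] == seq[6:]    # the part after SEP copies the prefix
assert sum(mask) == 5        # loss only on the target portion
```

Handing the agent something this precise (or the natural-language equivalent of it) is what makes the difference between "it runs" and "it's my experiment."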

There is, however, one fundamental difference between this abstraction layer and every one that came before it: it is inherently stochastic. When a C compiler compiles your code, it produces the same assembly every time for the same input. When the JVM runs your bytecode, the behavior is deterministic. When you write a Python script, it does the same thing on every run (modulo random seeds and such). Agents don't work this way. Ask Claude Code the same thing twice and you might get different implementations. Sometimes meaningfully different. This is, I think, the core reason the trust problem with agents feels different from the trust problem with compilers or garbage collectors. With a compiler, once you've verified it works, you can trust it on new inputs because the mapping from source to output is fixed. With an agent, every invocation is in some sense a new roll of the dice.

That being said, no two human programmers would implement exactly the same program, even given the same spec. In the case of humans, as long as the code is efficient and passes the required tests, we typically don't have a problem with this. This leads me to believe that the future of coding is going to be much more test-driven. In my own experience, as long as I scrutinize the outputs of Claude's code closely enough, the agent eventually finds and fixes the bug. For research, this means looking at plots or data and following up with Claude: "Isn't it weird that the model is training so slowly?" "How come metric X is so low when metric Y is quite high?" For frontend development, where we can literally see the results, I figure this output-driven development is also feasible. For production-level backend code, where bugs can be silent and the stakes are higher, adoption will likely lag. But if the history of programming is any guide, that is exactly how every new abstraction starts.
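As a sketch of what test-driven agent work could look like for the copy task above: instead of reading the agent's code, I would hand it interface-level tests that encode the spec and iterate until they pass. Everything here is hypothetical; `copy_model` is a trivially correct stand-in (it just echoes its input) so the sketch runs, and in practice it would be whatever the agent builds:

```python
def copy_model(tokens, sep_token=0):
    """Stand-in for the agent's trained model. A correct copy model,
    by definition, returns its input tokens unchanged; this placeholder
    does exactly that so the tests below can run."""
    return list(tokens)

def test_copies_in_distribution():
    # lengths the model would see during training (1-20)
    for tokens in ([1], [3, 1, 4], list(range(1, 21))):
        assert copy_model(tokens) == tokens

def test_length_generalization():
    # the actual research question: lengths 21-40 never seen in training
    long_seq = list(range(1, 31))
    assert copy_model(long_seq) == long_seq

test_copies_in_distribution()
test_length_generalization()
```

The point is that the tests pin down behavior, not implementation: any of the agent's many possible programs is acceptable as long as it sits behind this interface and passes.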

A Historical Outlook

As is the case with most paradigm shifts, there is resistance to change. I have seen this myself (see intro), and I figure most of this mistrust is not unfounded. An interesting way to see that we are in a paradigm shift is to look back, which I will try to do in this section. Previous major shifts in programming followed a similar resistance-to-change timeline. When compilers were introduced, for instance, many senior programmers refused to use them, insisting that they could optimize instructions better than the machine. They weren't totally wrong: early compilers did produce worse code than a skilled assembly programmer could write by hand. John Backus, the creator of Fortran, described the programmers of the 1950s as "members of a priesthood guarding skills and mysteries far too complex for ordinary people" who met the idea of compilers with "considerable hostility and derision." Even von Neumann himself, the father of the stored-program computer, reportedly dismissed the idea: "Why would you want more than machine language?" Grace Hopper built the first compiler (A-0) in 1952, and nobody would touch it for three years. She was told computers could only do arithmetic. She had a working compiler, and people still didn't believe it. In a way, there is something deeply human about this. It was not only a matter of trust ("can the machine really do what I do?") but also of identity ("if the machine can do what I do, what am I?"). There is a famous Usenet story from 1983 called "The Story of Mel," about a programmer who refused to use even an optimizing assembler because, as he put it, "you never know where it's going to put things." The thing is, Mel's hand-tuned code did run faster. The resistance was not irrational; it was empirically grounded, at the time. But by 1958, over half of all code running on IBM machines was already being generated by Fortran. Today, compiling by hand would be unthinkable. Certain software engineering programs have even stopped teaching assembly altogether!

The compiler story is probably the most directly analogous to agents, but the pattern repeats at every rung of the abstraction ladder. Object-oriented programming in the '80s and '90s is another good example. The core idea of OOP was to bundle data and the functions that operate on it into "objects," organized into inheritance hierarchies. The promise was that this would make code more modular and reusable. The criticism, broadly, was that this bundling created more problems than it solved. Inheritance hierarchies tend to get deep and tangled: you change something in a parent class and break five child classes you forgot about. State gets hidden inside objects, making it hard to reason about what a program is actually doing. And the reusability promise often didn't pan out in practice. Joe Armstrong, the creator of Erlang, put it memorably in an interview for Coders at Work (2009): "You wanted a banana but what you got was a gorilla holding the banana and the entire jungle." Armstrong also wrote an essay called "Why OO Sucks," in which he noted that at the time "criticising OOP was rather like swearing in church"; there was real social pressure to go along with the paradigm. Alexander Stepanov, the creator of C++'s Standard Template Library, called OOP "almost as much of a hoax as Artificial Intelligence." His argument was that you should start with algorithms, not with class hierarchies, and that OOP gets the decomposition backwards. Torvalds called C++ "a horrible language" and deliberately wrote Git in C, in part to keep C++ programmers away from the project. Dijkstra quipped that "object-oriented programming is an exceptionally bad idea which could only have originated in California." But OOP won anyway. Not because the critics were wrong about the problems (inheritance hierarchies are a mess), but because the abstraction made it easier for large teams to build large systems.

Looking at all of these transitions together, there is a pattern that I think is hard to ignore. First: "it can't be done" (compilers can't match hand-coded assembly). Then: "it's too slow" (Java is 20x slower, GC pauses are unacceptable). Then: "real programmers don't use it" (real programmers use Fortran, real programmers manage their own memory). Then, after the new abstraction wins anyway: "they ruined it" (see: Agile Manifesto signatories writing essays titled "Agile is Dead" a decade later). I figure we are somewhere between stages two and three with coding agents right now. The objections are partially valid (agents do hallucinate, they do sometimes write subtly wrong code), just as early compilers did produce worse assembly and early Java was genuinely slow. But if history is any indication, the trajectory seems fairly clear. In a sense, every layer of abstraction we ever built did the same thing: we distilled what we understood about programming, taking repetitive tasks and building software to do them for us. LLMs are no different. By training on large amounts of text and code, they compressed a lot of boilerplate and knowledge into a new abstraction layer. The insights might be fuzzier and probabilistic, but at the end of the day coding is creative, and when humans code there is some amount of randomness too. Throughout my coding journey, I have been told many times by programmer colleagues and friends that there is a "right way" to code. They all had a different conception of what that meant. In any case, if you are looking, you can find me vibe coding with my buddy Claude.