It’s interesting how almost every model released in 2025 has specifically targeted coding. That focus has clearly been paying off: these coding models are getting really good now.
I still contend that this sort of task is uniquely positioned to show off LLMs. The idea that they’ll turn into agents that can do real-world tasks remains a fantasy. Despite how impressive this is, the companies behind these models are losing money and have no real path to profitability.
Look up Ed Zitron’s newsletter and podcast for more info on why the industry is a bubble. I’m genuinely impressed with this specific example, but our economy is gonna suffer when the bubble bursts.
I’m curious. What would it take to change your mind? I’d like to check in with you in two years to see what you think then.
A demo that was open to the public (as in, not stage-managed) where people could have the “agents” perform complex tasks without failing on a regular basis. Large language models are notoriously bad at anything they haven’t been trained to do. They’re worlds away from being able to interpret a new situation and “figure it out.”