Oh no!
Anyway…
I’ve been hearing about the imminent crash for the last two years. New money keeps getting injected into the system. The bubble can’t deflate while both the public and private sector have an unlimited lung capacity to keep puffing into it. FFS, bitcoin is on a tear right now, just because Trump won the election.
This bullshit isn’t going away. It’s only going to get forced down our throats harder and harder, until we swallow or choke on it.
With the right level of Government support, bubbles can seemingly go on for literal decades. Case in point, Australian housing since the late 90s has been on an uninterrupted tear (yes, even in ‘08 and ‘20).
But eventually, bubbles either deflate or pop, because eventually governments and investors will get tired of propping it up. It might take decades, but I think it’s inevitable.
The hype should go the other way. Instead of bigger and bigger models that do more and more - have smaller models that are just as effective. Get them onto personal computers; get them onto phones; get them onto Arduino minis that cost $20 - and then have those models be as good as the big LLMs and Image gen programs.
Well, you see, that’s the really hard part of LLMs. Getting good results is a direct function of the size of the model: the bigger the model, the more effective it can be at its task. However, there’s something called the compute-efficient frontier (there’s a technical but neatly explained video about it). Basically, you can’t make a model more effective at its computations beyond that boundary for any given size. The only way to make a model better is to make it larger (which is what most mega corps have been doing) or to radically change the algorithms and methods underlying the model. But the latter has been proving extraordinarily hard, mostly because to understand what is going on inside the model you need to think in rather abstract and esoteric mathematical principles that bend your mind backwards.

You can compress an already trained model to run on smaller hardware, but to train one you still need the humongously large datasets and power-hungry processing. This is compounded by the fact that larger and larger models are ever more expensive while providing rapidly diminishing returns. Oh, and we are quickly running out of quality usable data, so shoveling in more data after a certain point actually starts to produce worse results, unless you dedicate thousands of hours of human labor to producing, collecting and cleaning new data. That’s all before you even have to address data poisoning, where previously LLM-generated data is fed back in to train a model and it is very hard to prevent the model from devolving into incoherence after a couple of generations.
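To make the diminishing-returns point concrete, here’s a toy sketch in Python of a Chinchilla-style parametric loss, loss(N, D) = E + A/N^α + B/D^β. The constants and exponents below are purely illustrative (not fitted values), and the real frontier depends on architecture and data quality:

```python
# Toy illustration of the scaling-law argument above.
# loss(N, D) = E + A / N**alpha + B / D**beta  (Chinchilla-style form)
# The constants are made up for demonstration, not fitted values.

E, A, B = 1.7, 400.0, 410.0      # illustrative irreducible loss + coefficients
alpha, beta = 0.34, 0.28         # illustrative exponents

def loss(n_params: float, n_tokens: float) -> float:
    """Approximate training loss for a model with n_params parameters
    trained on n_tokens tokens."""
    return E + A / n_params**alpha + B / n_tokens**beta

n_tokens = 1e12                  # hold the dataset size fixed
for n_params in (1e9, 1e10, 1e11, 1e12):
    print(f"{n_params:.0e} params -> loss ~ {loss(n_params, n_tokens):.3f}")
```

Each 10x jump in parameters buys a smaller and smaller drop in loss, and the fixed data term is exactly where running out of quality tokens starts to bite.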
this is learning completely the wrong lesson. it has been well-known for a long time and very well demonstrated that smaller models trained on better-curated data can outperform larger ones trained using brute force “scaling”. this idea that “bigger is better” needs to die, quickly, or else we’re headed towards not only an AI winter but an even worse climate catastrophe as the energy requirements of AI inference on huge models obliterate progress on decarbonization overall.
That would be innovation, which I’m convinced no company can do anymore.
It feels like every time I learn about one of our “modern innovations,” it turns out it was already thought up and written down in a book in the 1950s, and just wasn’t possible at the time due to some limitation in memory, precision, or some other metric. All we did was five decades of marginal improvement to get there, while not innovating much at all.
Are you talking about something specific?
Huh?
Smartphone improvements hit a rubber wall a few years ago (disregarding folding screens, which make up a small market share, the rate of improvement slowed down drastically), and the industry is doing fine. It’s not growing like it used to, but that just means people are keeping their smartphones for longer periods of time, not that people stopped using them.
Even if AI were to completely freeze right now, people would continue using it.
Why are people reacting like AI is going to get dropped?
Because in some eyes, infinite rapid growth is the only measure of success.
People are dumping billions of dollars into it, mostly into power, but it cannot turn a profit.
So the companies who, for example, revived a nuclear power facility in order to feed their machines for ever-diminishing returns in output quality are going to shut everything down at massive losses, with countless hours of human work and lifespan thrown down the drain.
This will have quite a large economic impact as many newly created jobs go up in smoke and businesses that structured themselves around the assumption of continued availability of high-end AI have to reorganize or go out of business.
Search up the Dot Com Bubble.
People differentiate AI (the technology) from AI (the product being peddled by big corporations) without making that nuance clear (or they mean just LLMs, or they aren’t even aware the technology has grassroots adoption outside of those big corporations). It will take time, and the bubble bursting might very well be a good thing for the technology into the future. If something is only known for its capitalistic exploits, it’ll continue to be seen unfavorably even after it’s proven its value to those who care to look at it with an open mind. I read it mostly as people rejoicing over those big corporations getting shafted for their greedy practices.
the bubble bursting might very well be a good thing for the technology into the future
I absolutely agree. It worked wonders for the Internet (the dotcom boom in the 90s), and I imagine we’ll see the same w/ AI sometime in the next 10 years or so. I do believe we’re seeing a bubble here, and we’re also seeing a significant shift in how we interact w/ technology, but it’s neither as massive nor as useless as proponents and opponents claim.
I’m excited for the future, but not as excited for the transition period.
Hope?
Because novelty is all it has. As soon as it stops improving in a way that makes people say “oh that’s neat”, it has to stand on the practical merits of its capabilities, which is, well, not much.
I’m so baffled by this take. “Create a terraform module that implements two S3 buckets with cross-region bidirectional replication. Include standard module files like linting rules and enable precommit.” Could I write that? Yes. But does this provide an outstanding stub to start from? Also yes.
And beyond programming, it is having a positive impact on science and medicine too. I mean, anybody who doesn’t see any merit has their head in the sand. That of course must be balanced with not falling for the hype, but the merits are very real.
The merits are real. I do understand the deep mistrust people have for tech companies, but there’s far too much throwing out of the baby with the bath water.
As a solo developer, LLMs are a game-changer. They’ve allowed me to make amazing progress on some of my own projects that I’ve been stuck on for ages.
But it’s not just technical subjects that benefit from LLMs. ChatGPT has been a great travel guide for me. I uploaded a pic of some architecture in Berlin and it went into the history of it. I asked it about some damage to an old church in Spain - it turned out to be from the Spanish Civil War, where revolutionaries had been mowed down by Franco’s firing squads.
Just today, I was getting help from an LLM for an email to a Portuguese removals company. I sent my message in English with a Portuguese translation, but the guy just replied back with a single sentence in broken English:
“Yes a can , need tho mow m3 you need delivery after e gif the price”
The first bit is pretty obviously “Yes I can” but I couldn’t really be sure what he was trying to say with the rest of it. So I asked ChatGPT who responded:
It seems he’s saying he can handle the delivery but needs to know the total volume (in cubic meters) of your items before he can provide a price. Here’s how I’d interpret it:
“Yes, I can [do the delivery]. I need to know the [volume] in m³ for delivery, and then I’ll give you the price.”
Thanks to LLMs, I’m able to accomplish so many things that would have previously taken multiple internet searches and way more effort.
Okay now justify the cost it took to create the tool.
Welcome to the top of the sigmoid curve.
If you were wondering what 1999 felt like WRT the internet, well, here we are. The Matrix was still fresh in everyone’s mind and a lot of online tech innovation kinda plateaued, followed by some “market adjustments.”
I think it’s more likely a compound sigmoid (don’t Google that). LLMs are composed of distinct technologies working together. As we’ve reached the inflection point of the scaling for one, we’ve pivoted implementations to get back on track. Notably, context windows are no longer an issue. But the most recent pivot came just this week, allowing for a huge jump in performance. There are more promising stepping stones coming into view. Is the exponential curve just a series of sigmoids stacked too close together? In any case, the article’s correct - just adding more compute to the same exact implementation hasn’t enabled scaling exponentially.
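To illustrate that “compound sigmoid” idea, here’s a purely toy sketch (not a claim about any real benchmark curve): a sum of logistic curves with staggered midpoints keeps accelerating, looking roughly exponential, until the last one saturates.

```python
import math

def sigmoid(x: float, midpoint: float, scale: float = 1.0) -> float:
    """Standard logistic curve centred on `midpoint`."""
    return 1.0 / (1.0 + math.exp(-(x - midpoint) / scale))

def compound(x: float, midpoints=(2.0, 4.0, 6.0, 8.0)) -> float:
    # Each pivot (a new technique stacked on the old ones) adds another S-curve.
    return sum(sigmoid(x, m) for m in midpoints)

for t in range(11):
    y = compound(t)
    print(f"t={t:2d}  capability={y:4.2f}  " + "#" * round(y * 5))
```

While the midpoints are close together the total keeps climbing; once no new sigmoid kicks in, it flattens out, which is exactly the wall the article describes.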
I hope it all burns.
I think I’ve heard about enough of experts predicting the future lately.
I work with people who work in this field. Everyone knows this, but there’s also an increased effort in improvements all across the stack, not just the final LLM. I personally suspect the current generation of LLMs is at its peak, but with each breakthrough the technology will climb again.
Put differently, I still suspect LLMs will be at least twice as good in 10 years.
Marcus is right, incremental improvements in AIs like ChatGPT will not lead to AGI and were never on that course to begin with. What LLMs do is fundamentally not “intelligence”, they just imitate human response based on existing human-generated content. This can produce usable results, but not because the LLM has any understanding of the question. Since the current AI surge is based almost entirely on LLMs, the delusion that the industry will soon achieve AGI is doomed to fall apart - but not until a lot of smart speculators have gotten in and out and made a pile of money.
Fingers crossed.
Oh nice, another Gary Marcus “AI hitting a wall post.”
Like his “Deep Learning Is Hitting a Wall” post on March 10th, 2022.
Indeed, not much has changed in the world of deep learning between spring 2022 and now.
No new model releases.
No leaps beyond what was expected.
\s
Gary Marcus is like a reverse Cassandra.
Consistently wrong, and yet regularly listened to, amplified, and believed.
It’s been 5 minutes since the new thing did a new thing. Is it the end?
As I use copilot to write software, I have a hard time seeing how it’ll get better than it already is. The fundamental problem of all machine learning is that the training data has to be good enough to solve the problem. So the problems I run into make sense, like:
1. Copilot can’t read my mind and figure out what I’m trying to do.
2. I’m working on an uncommon problem where the typical solutions don’t work.
3. Copilot is unable to tell when it doesn’t “know” the answer, because of course it’s just simulating communication and doesn’t really know anything.
2 and 3 could be alleviated with more and better data or engineering changes, but probably not solved completely - and obviously AI developers already started by training the models on the most useful data and strategies they think work best. 1 seems fundamentally unsolvable.
I think there could be some more advances in finding more and better use cases, but I’m a pessimist when it comes to any serious advances in the underlying technology.
Not copilot, but I run into a fourth problem:
4. The LLM gets hung up on insisting that a newer feature of the language I’m using is wrong and keeps focusing on “fixing” it, even though it has access to the newest correct specifications where the feature is explicitly defined and explained.

Copilot can’t read my mind and figure out what I’m trying to do.
Try writing comments
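For what it’s worth, a minimal sketch of what that looks like in practice (the function name and behaviour here are invented purely for illustration): writing the intent comment and signature first gives the completion model something concrete to latch onto.

```python
from datetime import date

# Intent comment written first, so the completion model has context:
# return the ISO week number for a "YYYY-MM-DD" string, raising ValueError
# on malformed input.
def iso_week_number(date_str: str) -> int:
    # A Copilot-style tool will typically fill in a body like this one:
    return date.fromisoformat(date_str).isocalendar()[1]

print(iso_week_number("2024-11-25"))  # -> 48
```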
So you use other people’s open source code without crediting the authors or respecting their license conditions? Good for you, parasite.
Very frequently, yes. As well as closed source code and intellectual property of all kinds. Anyone who tells you otherwise is a liar.
Ah, I guess I’ll have to question why I am lying to myself then. Don’t be a douchebag. Don’t use open source without respecting copyrights & licenses. The authors are already providing their work for free. Don’t shit on that legacy.
Ahh right, so when I use copilot to autocomplete the creation of more tests in exactly the same style of the tests I manually created with my own conscious thought, you’re saying that it’s really just copying what someone else wrote? If you really believe that, then you clearly don’t understand how LLMs work.
It would appear I know the mechanisms of LLMs better than you do, and my point is not so weak that I would have to fabricate a strawman, claim that it is what you said, and then proceed to argue against the strawman.
Using LLMs trained on other people’s source code is parasitic behaviour and violates copyrights and licenses.
Look, I recognize that it’s possible for LLMs to produce code that is literally someone else’s copyrighted code. However, the way I use copilot is almost exclusively to autocomplete my thoughts. Like, I write enough code until it guesses what I was about to write next. If that happens to be open source code that someone else has written, then it is complete coincidence that I thought of writing that code. Not all thoughts are original.
Further, holding me at fault for LLM vendors who may be breaking copyright law is like trying to make the case that I’m at fault for murder because I drive a car, when it’s the car manufacturers who lobby in ways that get more people killed.
Not all thoughts are original.
Agreed, and I am also 100% opposed to SW patents. No matter what I wrote, if someone came up with the same idea on their own, and finds out about my implementation later, I absolutely do not expect them to credit me. In the use case you describe, I do not see a problem of using other people’s work in a license breaking way. I do however see a waste of time - you have to triple check everything an LLM spits out - and energy (ref: MS trying to buy / restart a nuclear reactor to power their LLM hardware).
Further, holding me at fault for LLM vendors who may be breaking copyright law is like trying to make the case that I’m at fault for murder because I drive a car, when it’s the car manufacturers who lobby in ways that get more people killed.
If you drive a car on “autopilot” and get someone killed, you are absolutely at fault for murder. Not in the legal sense, because fuck capitalism, but absolutely in the moral sense. Also, there’s legal precedent in a different example: https://www.findlaw.com/legalblogs/criminal-defense/can-you-get-arrested-for-buying-stolen-goods/
If you unknowingly buy stolen (fenced) goods and it’s found out, you will have to return them to the rightful owner without getting your money back - money you would then have to try to get back from the vendor.
In the case of license agreements, you would still be a participant in a license violation - and as for whether a piece of code would be recognizable enough to matter, just think about the following thought experiment:
Assume someone trained the LLM on some source code Disney uses for whatever. Your code gets autocompleted with that and you publish it, and Disney finds out about it. Do you honestly think that the evil motherfuckers at Disney would stop at anything short of having your head served on a silver platter?
Programmers don’t have the luxury of using inferior toolsets.
That statement is as dumb as it is nonsensical.
Yay
This is why you’re seeing news articles from Sam Altman saying that AGI will blow past us without any societal impact. He’s trying to lessen the blow of the bubble bursting for AI/ML.
Short the AI stocks before they crash!
The market can remain irrational longer than you can remain solvent.
A. Gary Shilling