This was exactly my experience. I freaked myself out last year and decided the best thing was to dive headfirst into it to figure out how it worked and what its capabilities are.
And it has a lot of them. It can do a lot, and it’s impressive tech. I’ve coded several projects and built my own models. But it’s far from perfect. There are so, so many pitfalls that startups and tech evangelists just happily ignore, and most of these problems can’t be solved easily, if at all. It’s not intelligent; it’s a very advanced and unique prediction machine. The funny thing to me is that it’s still basically machine learning, the same tech that we’ve had since the mid 2000s, it’s just we have fancier hardware now. Big tech wants everyone to believe it’s brand new… and it is… kind of. But not really, either.
Have you coded with Claude Sonnet 3.5 yet? It is mind-blowingly better than Opus 3, which was already noticeably better than anything OpenAI has put out so far. GPT-4 was nice to code with, but this is on a whole other level. I can’t imagine what Opus 3.5 will be able to do.
The issue with Sonnet 3.5, in my limited testing, is that even with explicit, specific, and direct prompting, it can’t perform anywhere near human ability, and it will often make very stupid mistakes. I developed a program which essentially lets an AI program, rewrite, and test a game, but Sonnet will consistently take lazy routes, use incorrect syntax, and call the same function over and over again for no reason. If you can program the game yourself, it’s a quick way to prototype, but unless you know how to properly format JSON and fix strange artefacts, it’s just not there yet.
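For what it’s worth, the two failure modes mentioned above (malformed JSON and the same function being called over and over) can be caught mechanically before they ever reach the game. Below is a minimal sketch, assuming the model replies with a JSON object shaped like {"function": ..., "args": ...}. That schema and the names parse_tool_call and RepeatGuard are hypothetical, made up for illustration, and not taken from the program described above.

```python
import json
from collections import deque


def parse_tool_call(raw: str):
    """Return (name, args) if raw is a well-formed call object, else None.

    The {"function": str, "args": dict} shape is an assumed schema.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None  # the model produced invalid JSON
    if not isinstance(data, dict):
        return None
    name = data.get("function")
    args = data.get("args", {})
    if not isinstance(name, str) or not isinstance(args, dict):
        return None
    return name, args


class RepeatGuard:
    """Flags a call once it has been made `limit` times in a row."""

    def __init__(self, limit: int = 3):
        self.limit = limit
        self.recent = deque(maxlen=limit)  # only the last `limit` calls

    def is_looping(self, name: str, args: dict) -> bool:
        # Serialize args with sorted keys so equal dicts compare equal.
        key = (name, json.dumps(args, sort_keys=True))
        self.recent.append(key)
        return len(self.recent) == self.limit and len(set(self.recent)) == 1
```

When parse_tool_call returns None or is_looping returns True, the calling loop could re-prompt the model or stop the run instead of executing the call. What to do at that point is a design choice this sketch leaves open.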
Which is why, as an engineer, I can either fiddle with a prompt for half an hour… or just write the damn method myself. For juniors it’s an easy button, but for seniors who know how to write these algorithms it’s usually just easier to write it up. It does make for some nice starter code, though, and it gets the boilerplate out of the way.
Yeah. It’s really interesting because juniors and hobbyists are the ones getting used to interacting with it. Since it is rapidly improving, it won’t be long until it outpaces the grunt-work ability of seniors, and the new seniors will be the ones willing and able to use it. Programming is shifting away from being able to write tedious code and toward being able to come up with ideas and convey them clearly to an LLM. There’s going to be a real leveling of the playing field when even the best seniors won’t have any use for most of their grunt-work coding skills. The jump up from Opus 3 to Sonnet 3.5 is absolutely insane, and Opus 3.5 should be here before too long.
That’s really interesting. For Android Studio it’s been absolutely crushing it for me. It’s taken some getting used to, but I’ve had it build an app with about 60 files. I’m no master programmer, but I’ve been a hobbyist for a couple of decades. What it’s done for me in the last 5 days would have taken me 2 months, easy, and there are lots of extra touches that I probably wouldn’t have taken the time to do if it weren’t as simple as loading in a few files and telling it what I want.
Usually when I work on something like this, my todo list grows much faster than my ability to actually put it together, but with this project I’m quickly running out of features I can even imagine. I’ve not had any of the issues with it running in circles like I would often get with GPT-4.
The funny thing to me is that it’s still basically machine learning, the same tech that we’ve had since the mid 2000s, it’s just we have fancier hardware now.
It’ll take a few spectacular failures and bankruptcies before people figure out AI isn’t quite what’s being sold to them, I feel.
So much of the modern Microsoft/ChatGPT project is effectively brute-forcing intelligence from accumulated raw data. That’s why they need phenomenal amounts of electricity, processing power, and physical space to make the project work.
There are other - arguably better, but definitely more sophisticated - approaches to developing genetic algorithms and machine learning techniques. If any of them prove out, they have the potential to render a great deal of Microsoft’s original investment worthless by doing what Microsoft is doing far faster and more efficiently than the Sam Altman “Give me all the electricity and money to hit the AI problem with a very big hammer” solution.
It takes a lot of energy to train the models in the first place, but very little once you have them. I run Mixture of Agents (MoA) on my laptop, and it outperforms anything OpenAI has released on pretty much every benchmark, maybe even every benchmark. I run it quite a bit and have noticed no change in my electricity bill. I imagine inference on GPT-4 must also be very efficient; if not, they should just switch to serving people open-source LLMs run through MoA.
Are you saying you have a local agent that is better than anything OpenAI has released? Where did this agent come from? Did you make it from scratch? How are you not worth billions if you can outperform them on “every benchmark”?
My dude, no, I’m not the creator, settle down. Mixture of Agents is free and open for anyone to use. Here is a demo of it by Matthew Berman. It isn’t hard to set up.
https://youtu.be/aoikSxHXBYw
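For anyone wondering what Mixture of Agents actually does, the general idea is that several models each propose an answer, and another model then writes the final answer with those proposals in front of it. Here is a rough sketch of that flow, not the code from the demo. The Model type stands in for any "prompt in, text out" LLM call, and the names with_references and mixture_of_agents, as well as the prompt wording, are assumptions made for illustration.

```python
from typing import Callable, List

# Stand-in for any LLM call: takes a prompt, returns text (an assumption).
Model = Callable[[str], str]


def with_references(prompt: str, references: List[str]) -> str:
    """Append earlier answers to the prompt so the next model can refine them."""
    if not references:
        return prompt
    numbered = "\n".join(f"{i + 1}. {r}" for i, r in enumerate(references))
    return f"{prompt}\n\nResponses from other models:\n{numbered}"


def mixture_of_agents(
    prompt: str, layers: List[List[Model]], aggregator: Model
) -> str:
    references: List[str] = []
    for proposers in layers:
        # Every proposer in a layer sees the original prompt plus the
        # previous layer's answers (none for the first layer).
        layer_prompt = with_references(prompt, references)
        references = [model(layer_prompt) for model in proposers]
    # The aggregator writes the final answer from the last layer's proposals.
    return aggregator(with_references(prompt, references))
```

With real models, each entry in layers would wrap a call to a locally hosted or remote LLM. Note that every layer adds a full round of inference per proposer, so a single question costs several model calls rather than one.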
Believe it or not, OpenAI is no longer making the best models. Claude Sonnet 3.5 is considerably better than OpenAI’s best models.