Least shocking news ever. This has clearly been in the works for a while. Not that it’ll matter at this point, given that the notion of OpenAI making any profit is kind of a pipe dream right now.
This is mostly just a play to get investors to sink more money into covering their absolutely insane cash burn for another year.
They might not make a profit, but Altman will be able to extract a lot of wealth from a 7% stake in a company valued in the billions of dollars. Even if he doesn’t sell any of it, he can use it as collateral for loans and effectively turn it into cash.
Oh, absolutely. Altman is going to plunder this sinking ship for everything it’s worth, and then bail into a CTO position somewhere else. All the C suite at OpenAI will win big no matter what, everyone else there will get fucked.
Lmao, this has been a scam from the start.
This was precisely what I thought the moment I heard the news a couple days ago.
ClosedAI
Aaaaand
Pop goes the AI bubble.
The last stage of capitalism for tech is usually an IPO of some sort, which is what this will lead to.
There will be other cool shit obviously with integrations and tools that will hopefully trickle down to open source models but the writing is on the wall. This is a cash out and enshittify move.
The best news out of it is that we will start to see less and less “our company is AI and we shoved AI into said thing”, as the companies late to the game will keep shooting their shot until OpenAI has completely dominated the market and investors stop caring.
AI peaked a while ago IMO. The nail in the coffin for me was Microsoft making deals for nuclear power plants to power their data centers for ML and AI. It’s great they’re using nuclear power since it’s at least a clean source of energy, but it’s also extremely telling of the limitations and power requirements of these language models. Without some kind of power reduction breakthrough, AI will continue to stall while these companies think of new ways to sell snake oil and gimmicks.
I was just thinking to myself as I got mad at my Google Home speaker for now sucking that they probably did it on purpose because the electrical and processing requirements were too high to keep it at the levels from, say, 2019. They had to cut off half of the assistant’s brain to stop draining money.
Huh, never thought of it that way.
I hate him so much bros
It’s a good thing that real open source models are getting good enough to compete with or exceed OpenAI.
Can you recommend some models to try?
First, a caveat/warning: you’ll need a beefy GPU to run larger models, but there are some smaller models that perform pretty well.
Adding a medium amount of extra information for you or anyone else that might want to get into running models locally
Tools
- Ollama - great app for downloading/managing/running models locally
- OpenWebUI - A web app that provides a UI like the ChatGPT web app, but can use local models
- continue.dev - A VS Code extension that can use ollama to give a github copilot-like AI assistant running against a local model (can also connect to Anthropic Claude, etc…)
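If you’d rather script against a local model than use one of the UIs above, here’s a minimal sketch of talking to Ollama’s local HTTP API from Python. It assumes Ollama is running on its default port (11434) and that you’ve already pulled a model; the model tag in the usage note and the helper names `build_request` and `ask` are just my choices for illustration.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port (11434).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to the local server and return the generated text."""
    request = build_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    # With "stream": False the reply is a single JSON object whose
    # "response" field holds the full completion.
    return body["response"]
```

With the server up and a model pulled (for example `ollama pull llama3.1:8b`), something like `print(ask("llama3.1:8b", "Explain quantization in one sentence."))` should print the model’s answer. It only uses the standard library, so there’s nothing extra to install.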
Models
If you look at https://ollama.com/library?sort=featured you can browse the available models.
Model size is measured by parameter count. Generally, higher-parameter models are better (more “smart”, more accurate), but it’s very challenging/slow to run anything over 25b parameters on consumer GPUs. I tend to find 8-13b parameter models are a sort of sweet spot. The 1-4b parameter models are meant more for really low-power devices; they’ll give you OK results for simple requests and summarizing, but they’re not going to wow you.
If you look at the ‘tags’ for the models listed below, you’ll see things like 8b-instruct-q8_0 or 8b-instruct-q4_0. The q part refers to quantization, or shrinking/compressing a model, and the number after that is roughly how aggressively it was compressed. Note the size of each tag and how the size reduces as the quantization gets more aggressive (smaller numbers). You can roughly think of this size number as “how much video RAM do I need to run this model”. For me, I try to aim for q8 models, or fp16 if they can run in my GPU. I wouldn’t try to use anything below q4 quantization; there seems to be a lot of quality loss below q4. Models can run partially or even fully on a CPU, but that’s much slower. Ollama doesn’t yet support the NPUs found in new laptops/processors, but work is happening there.
- Llama 3.1 - The 8b instruct model is pretty good, decent speed and good quality. This is a good “default” model to use
- Llama 3.2 - This model was just released yesterday. I’m only seeing the 1b and 3b models right now. They’ve changed the 8b model to 11b, and I’m assuming the 11b model is going to be my new go-to when it’s available.
- Deepseek Coder v2 - A great coding assistant model
- Command-r - This is a more niche model, mainly useful for RAG. It’s only available in a 35b parameter model, so not all that feasible to run locally
- Mistral small - A really good model, in the ballpark of Llama. I haven’t had quite as much luck with this as with Llama, but it is good. I just saw that a new version was released 8 days ago, so I’ll need to check it out again
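To make the “tag size is roughly how much video RAM you need” rule of thumb concrete, here’s a back-of-the-envelope sketch that reads the parameter count and precision out of a tag and multiplies them. The bits-per-weight table holds nominal values I’m assuming for illustration, and `parse_tag` and `estimate_gb` are made-up helper names. Real files are somewhat larger than this, and actual VRAM use is higher again once you add the context cache and runtime overhead, so treat the result as a lower bound.

```python
import re

# Assumption: nominal bits per weight for each precision. Good enough for a
# rough estimate, not the exact on-disk figures.
BITS_PER_WEIGHT = {"fp16": 16, "q8": 8, "q6": 6, "q5": 5, "q4": 4, "q3": 3, "q2": 2}


def parse_tag(tag: str) -> tuple[float, str]:
    """Split a tag like '8b-instruct-q8_0' into (billions of params, precision)."""
    size = re.search(r"(\d+(?:\.\d+)?)b", tag)
    precision = re.search(r"(fp16|q\d)", tag)
    if size is None or precision is None:
        raise ValueError(f"unrecognized tag: {tag!r}")
    return float(size.group(1)), precision.group(1)


def estimate_gb(tag: str) -> float:
    """Rough weight size in GB: billions of params * bits per weight / 8."""
    billions, precision = parse_tag(tag)
    return billions * BITS_PER_WEIGHT[precision] / 8


for tag in ("8b-instruct-fp16", "8b-instruct-q8_0", "8b-instruct-q4_0"):
    print(f"{tag}: ~{estimate_gb(tag):.1f} GB of weights")
```

For an 8b model this gives about 16 GB at fp16, 8 GB at q8, and 4 GB at q4, which lines up with why q8 is a comfortable target on a single consumer GPU while fp16 often isn’t.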
A really nice summary. Very useful. Thanks!
Y’all hear the scary music yet?