I think the future likely lies in more task-specific, targeted models. I don’t have the research handy, but small, targeted LLMs can outperform massive LLMs at a tiny fraction of the compute cost to both train and run, and they can run on much more modest hardware to boot.
Like, an LLM that is targeted only at:
teaching writing and reading skills
teaching English writing to English Language Learners
writing business emails and documents
writing/editing only resumes and cover letters
summarizing text
summarizing fiction texts
writing & analyzing poetry
analyzing poetry only (not even writing poetry)
a counselor
an ADHD counselor
a depression counselor
The more specific the model’s scope, the smaller the LLM can be while still doing the targeted task(s) “well”.
Yeah, I agree. Small models are the way. You can also use LoRA/QLoRA adapters to “fine-tune” the same big base model for specific tasks and swap the use case in real time. This is what Apple does with Apple Intelligence. You can outperform a big general LLM with an SLM if you have a nice, specific use case and some data (which you can synthesise in some cases).
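For concreteness, here’s a rough sketch of that adapter-swapping idea using Hugging Face’s peft library. The base model ID and the adapter paths are placeholders, and this assumes you’ve already trained a LoRA adapter per task; the point is just that one set of base weights stays loaded while the small per-task adapters get switched at runtime.

```python
# Sketch: one small base model, several task-specific LoRA adapters,
# swapped at runtime. Model ID and adapter paths are hypothetical.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-3.2-1B"  # placeholder: any small base model
tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id)

# Loading the first adapter wraps the base model in a PeftModel...
model = PeftModel.from_pretrained(base, "adapters/business-email", adapter_name="email")
# ...then register more adapters; only the active one affects outputs.
model.load_adapter("adapters/resume-editing", adapter_name="resume")
model.load_adapter("adapters/poetry-analysis", adapter_name="poetry")

# Swap the use case in real time, with no reload of the base weights.
model.set_adapter("resume")
inputs = tokenizer("Tighten this resume bullet: ...", return_tensors="pt")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=64)[0]))
```

Since each LoRA adapter is just a pair of small low-rank matrices per layer, the per-task artifacts are megabytes rather than gigabytes, which is what makes keeping a library of them and hot-swapping cheap.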