I’ve been playing around with AI a lot lately for work. A neat trick companies like OpenAI have pushed onto the scene is the ability for a large language model to “answer questions” over a dataset of files. This is done by building a RAG (retrieval-augmented generation) agent. It’s neat, but I’ve come to two conclusions after about a year of screwing around.
- It’s pretty good with words - asking it to summarize multiple documents, for example. But it’s still pretty terrible at data. As an example, scanning through an Excel log export or CSV file and asking it to perform a calculation: “based on this badge data, how many people are in the building right now, and who are they?” It would be super helpful to get answers to those types of questions, but I haven’t found any tool or combination of models that can do it accurately even most of the time. I suspect this is exactly what happened to Spotify Wrapped this year - instead of doing the data analysis, they tried to have an LLM/RAG agent do it, and it’s hallucinating.
- These models can be run locally, and just about as fast. Yeah, it takes some nerd power to set them up right now, but it’s only a matter of time before it’s as simple as installing a program. I can’t imagine how companies like OpenAI are going to survive.
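For what it’s worth, the badge-data question above is the kind of thing plain code answers deterministically in a few lines - which is why the LLM’s failure is so frustrating. A minimal Python sketch, assuming a hypothetical log of (person, direction, timestamp) events; the names and format are made up, but any real badge export would reduce to the same logic:

```python
from datetime import datetime

# Hypothetical badge log: (person, direction, timestamp).
# A real export would come from a CSV, but the logic is identical.
events = [
    ("alice", "in",  "2024-12-05 08:02"),
    ("bob",   "in",  "2024-12-05 08:15"),
    ("alice", "out", "2024-12-05 12:01"),
    ("carol", "in",  "2024-12-05 12:30"),
    ("alice", "in",  "2024-12-05 13:05"),
]

def currently_inside(events):
    # Sort by timestamp, then keep each person's most recent direction.
    last_seen = {}
    for person, direction, ts in sorted(
        events, key=lambda e: datetime.strptime(e[2], "%Y-%m-%d %H:%M")
    ):
        last_seen[person] = direction
    # Whoever last badged "in" is still in the building.
    return sorted(p for p, d in last_seen.items() if d == "in")

print(currently_inside(events))  # ['alice', 'bob', 'carol']
```

The practical takeaway for me: let the LLM do the words (summarize, explain) and have it call out to real code for the arithmetic, rather than asking it to compute over rows directly.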
Physical media held the least amount of info (you probably weren’t going to find much). Software like Encarta was cool and had lots of info. But back in the dinosaur days, libraries were where it was at. It was common to ask an adult a question and get either “I don’t know” or some BS you believed was true (but wasn’t).
If you really wanted to know, you’d ask the librarian at school or at your town’s public library, and they’d help you find a book on that topic. Libraries were magical places - even for the people who were too cool to admit it.