Whilst I’ve mostly avoided LLMs so far, it seems like that should actually work a bit. LLMs are imitating us, and if you warn a human to be extra careful they will (usually) try to be more careful, so an LLM should have internalised that behaviour. That doesn’t mean they’ll be much more accurate, though. Maybe they’d be less likely to output humanlike mistakes on purpose? It wouldn’t help much with the LLM-like mistakes that they’re making all on their own, though.
You are absolutely correct and 10 seconds of Google searching will show that this is the case.
You get a small boost by asking the model to be careful or telling it that it’s an expert in the subject matter. The “thinking” models can even chain together post-review steps.