Does AI actually help students learn? A recent experiment in a high school provides a cautionary tale.
Researchers at the University of Pennsylvania found that Turkish high school students who had access to ChatGPT while doing practice math problems did worse on a math test compared with students who didn’t have access to ChatGPT. Those with ChatGPT solved 48 percent more of the practice problems correctly, but they ultimately scored 17 percent worse on a test of the topic that the students were learning.
A third group of students had access to a revised version of ChatGPT that functioned more like a tutor. This chatbot was programmed to provide hints without directly divulging the answer. The students who used it did spectacularly better on the practice problems, solving 127 percent more of them correctly compared with students who did their practice work without any high-tech aids. But on a test afterwards, these AI-tutored students did no better. Students who just did their practice problems the old-fashioned way, on their own, matched their test scores.
Traditional instruction gave the same result as a bleeding-edge ChatGPT tutorial bot. Imagine what would happen if a tiny fraction of the billions spent to develop this technology went into funding improved traditional instruction.
Better-paid teachers, better resources, studies geared at optimizing traditional instruction, etc.
“Move fast and break things” was always a stupid goal. Turbocharging it with all this money is killing the tried-and-true options that actually produce results, while straining the power grid and worsening global warming.
Investing in actual education infrastructure won’t get VC techbros their yachts, though.
Imagine all the money spent on war would be invested into education 🫣what a beautiful world we would live in.
And cracking open a book didn’t demolish the environment. Weird.
Traditional instruction gave the same result as a bleeding-edge ChatGPT tutorial bot.
Interesting way of looking at it. I disagree with your conclusion about the study, though.
It seems like the AI tool would be helpful for things like assignments rather than tests. I think it’s intellectually dishonest to ignore the gains in some environments because it doesn’t have gains in others.
You’re also comparing a young technology to methods that have been adapted over hundreds of thousands of years. Was the first automobile entirely superior to every horse?
I get that some people just hate AI because it’s AI. For the people interested in nuance, I think this study is interesting. I think other studies will seek to build on it.
The point of assignments is to help study for your test.
Homework is forced study. If you’re just handed the answers, you will do shit on the test.
The point of assignments is to help study for your test.
To me, “assignment” is more of a project. Not rote practice. Applying knowledge to a bit of a longer term, multi-part project.
The education system is primarily about controlling bodies and minds. So any actual education is counter-productive.
LLMs/GPT, and other forms of the AI boogeyman, are all just a tool we can use to augment education when it makes sense. Just like the introduction of calculators or the internet, AI isn’t going to be the easy button, nor is it going to steal all teachers’ jobs. These tools need to be studied, trained for, and applied purposely in order to be most effective.
EDIT: Downvoters, I’d appreciate some engagement on why you disagree.
I don’t even know if this is ChatGPT’s fault. This would be the same outcome if someone just gave them the answers to a study packet. Yes, they’ll have the answers because someone (or something) gave them to them, but they won’t know how to get those answers without being taught. Surprise: for kids to learn, they need to be taught. Shocker.
I’ve found chatGPT to be a great learning aid. You just don’t use it to jump straight to the answers, you use it to explore the gaps and edges of what you know or understand. Add context and details, not final answers.
The study shows that once you remove the LLM though, the benefit disappears. If you rely on an LLM to help break things down or add context and details, you don’t learn those skills on your own.
I used it to learn some coding, but without using it again, I couldn’t replicate my own code. It’s a struggle, but I don’t think using it as a teaching aid is a good idea yet, maybe ever.
I wouldn’t say this matches my experience. I’ve used LLMs to improve my understanding of a topic I’m already skilled in, and I’m just looking to understand something nuanced. Being able to interrogate on a very specific question that I can appreciate the answer to is really useful and definitely sticks with me beyond the chat.
Not limited to kids.
Kids who take shortcuts and don’t learn suck at recalling knowledge they never had…
The only reason we’re trying to somehow compromise and allow or even incorporate cheating software into student education is because the tech-bros and singularity cultists have been hyping this technology like it’s the new, unstoppable force of nature that is going to wash over all things and bring about the new Golden Age of humanity as none of us have to work ever again.
Meanwhile, 80% of AI startups sink and something like 75% of the “new techs” like AI drive-thru orders and AI phone support go to call centers in India and the Philippines. The only thing we seem to have gotten is the absolute rotting destruction of all content on the internet and children growing up thinking it’s normal to consume this watered-down, plagiarized, worthless content.
deleted by creator
I took German in high school and cheated by inventing my own runic script. I would draw elaborate fantasy/sci-fi drawings on the covers of my notebooks with the German verb conjugations and whatnot written all over monoliths or knight’s armor or dueling spaceships, using my own script instead of regular characters, and then have these notebooks sitting on my desk while taking the tests. I got 100% on every test and now the only German I can speak is the bullshit I remember Nightcrawler from the X-Men saying. Unglaublich!
I just wrote really small on a paper in my glasses case, or hid data in the depths of my TI-86.
We love Nightcrawler in this house.
Actually, if you read the article: ChatGPT is horrible at math. A modified version, where ChatGPT was fed the correct answers along with the problems, didn’t make the kids stupider, but it didn’t make them any better either, because they mostly just asked it for the answers.
At work we gave a 16/17-year-old work experience over the summer. He was using ChatGPT without understanding the code it was outputting.
In his last week he asked why he was writing a print statement, something like:
print (f"message {thing} ")
I’m afraid to ask, but what’s wrong with that line? In the right context that’s fine to do, no?
There is nothing wrong with it. He just didn’t know what it meant after using it for a little over a month.
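For context (standard Python behavior, not something from the thread): the f prefix makes Python evaluate the expression inside the braces and splice the result into the string. A minimal sketch, with thing standing in for whatever variable he was printing:

# f-strings (Python 3.6+) evaluate the expression inside {} at runtime
thing = 42
print(f"message {thing}")  # prints: message 42
# roughly equivalent to the older forms:
print("message {}".format(thing))
print("message " + str(thing))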
Like any tool, it depends how you use it. I have been learning a lot of math recently and have been chatting with AI to increase my understanding of the concepts. There are times when the textbook shows some steps that I don’t understand why they’re happening and I’ve questioned AI about it. Sometimes it takes a few tries of asking until you figure out the right question to ask to get the right answer you need, but that process of thinking helps you along the way anyways by crystallizing in your brain what exactly it is that you don’t understand.
I have found it to be a very helpful tool in my educational path. However, I am learning things because I want to understand them, not because I have to pass a test, and that determination in me to want to understand is a big difference. Just getting hints to help you solve the problem might not really help in the long run, but if you’re actually curious about what you’re learning and focus on getting a deeper understanding of why and how something works rather than just getting the right answer, it can be a very useful tool.
Why are you so confident that the things you are learning from AI are correct? Are you just using it to gather other sources to review by hand or are you trying to have conversations with the AI?
We’ve all seen AI get the correct answer while the “show your work” part is nonsense, or vice versa. How do you verify what AI outputs to you?
You check its work. I used it to calculate efficiency in a factory game and went through and made corrections to inconsistencies I spotted. Always check its work.
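As a hypothetical example of what checking its work can look like (all numbers below are made up for illustration): if the chatbot claims some number of machines will saturate a belt, the arithmetic is simple enough to redo yourself.

# made-up factory-game numbers, just to illustrate redoing the math yourself
items_per_craft = 1
craft_time_s = 0.5       # seconds per craft at speed 1.0
crafting_speed = 0.75    # machine speed multiplier
belt_rate = 15.0         # items per second the belt can move

per_machine = items_per_craft * crafting_speed / craft_time_s  # 1.5 items/s
machines_needed = belt_rate / per_machine                      # 10.0
print(machines_needed)   # if the chatbot said 8, one of you slipped up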
Exactly. It’s a helpful tool but it needs to be used responsibly. Writing it off completely is as bad a take as blindly accepting everything it spits out.
I use it for explaining stuff when studying for uni and I do it like this: If I don’t understand e.g. a definition, I ask an LLM to explain it, read the original definition again and see if it makes sense.
This is an informal approach, but if the definition is sufficiently complex, false answers are unlikely to lead to an understanding. Not impossible ofc, so always be wary.
For context: I’m studying computer science, so lots of math and theoretical computer science.
I mean, why are you confident the work in textbooks is correct? Both have been proven unreliable, though I will admit LLMs are much more so.
The way you verify in this instance is actually going through the work yourself after you’ve been shown sources. They are explicitly not saying they take 1+1=3 as law, but instead asking how that was reached and working off that explanation to see if it makes sense and learn more.
Math is likely the best subject for this, too. You have undeniable truths in math: a statement is true, or it’s false. There are no (meaningful) opinions on how addition works other than the correct one.
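If you want to see how literal that is, proof assistants make it mechanical (a minimal Lean 4 sketch, just for illustration):

-- Lean 4: a claim about addition either type-checks or it doesn't
example : 1 + 1 = 2 := rfl
-- example : 1 + 1 = 3 := rfl   -- the compiler rejects this line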
The problem with this style of verification is that there is no authoritative source. Neither the AI nor yourself is capable of verifying for accuracy. The AI also has no expectation of being accurate or revised.
I don’t see how this is any better than running google searches on reddit or other message boards looking for relevant discussions and basing your knowledge on those.
If AI were enabling something new, that might be worth it, but allowing someone to find slightly less/more shitty message board posts 10% more efficiently isn’t worth what’s happening. There are countries that are capable of regulating a field as it fills out; why can’t America? We banned TikTok in under a month, didn’t we?
I personally use its answers as a jumping-off point to do my own research, or I ask it for sources directly about things and check those out. I frequently use LLMs for learning about topics, but definitely don’t take anything they say at face value.
For a personal example, I use ChatGPT as my personal Japanese tutor. I use it to discuss and break down the nuances of various words or sayings, the names of certain conjugation forms, etc., and it is absolutely not 100% correct, but I can now take the names of things it gives me in native Japanese that I never would have known and look them up using other resources. Either it’s correct and I find confirming information, or it’s wrong and I can research further independently or ask it follow-up questions. It’s certainly not as good as a human native speaker, but for $20 a month, and as someone who enjoys doing their own research, I fucking love it.
I’m not at all confident in the answers directly. I’ve gotten plenty of wrong answers from AI and I’ve gotten plenty of correct answers. If anything, it’s just more practice for critical thinking skills: separating what is true from what isn’t.
When it comes to math though, it’s pretty straightforward, I’m just looking for context on some steps in the problems, maybe reminders of things I learned years ago and have forgotten, that sort of thing. As I said, I’m interested in actually understanding the stuff that I’m learning because I am using it for the things I’m working on so I’m mainly reading through textbooks and using AI as well as other sources online to round out my understanding of the concepts. If I’m getting the right answers and the things I am doing are working, it’s a good indicator I’m on the right path.
It’s not like I’m doing cutting-edge physics or medical research where mistakes could cost lives.
It’s sort of like saying poppy production is pretty negative overall, but if a smart, critical person uses opiates sparingly and cautiously, they could be of great benefit to that person.
That’s all well and good, but AI is not being developed to help critical thinkers research slightly more easily; it’s being created to reduce the amount of money companies spend on humans.
Until regulations are in place to guide the development of the technology in useful ways, I don’t know that any of it should be permitted. What’s the rush, anyway?
Well, I’m definitely not pushing for more AI, and I like to try to stay nuanced on the topic. Like I mentioned in my first comment, I have found it to be a very helpful tool, but used in other ways it could do more harm than good. I’m not involved in making or pushing AI, but as long as it is an available tool, I’m going to make use of it in the most responsible way I can and talk about how I use it. I can’t control what other people do, but maybe I can help some people who are only using it to get answer hints, like in the article, to find more useful ways of using it.
When it comes to regulation, yeah, I’m all for that. It’s a sad reality that regulation always lags behind and generally doesn’t get implemented until there’s some sort of problem that scares the people in power, who are mostly too old to understand what’s happening anyway.
And as to what’s the rush: I would say a combination of curiosity and good intentions mixed with the worst of capitalism, the carrot of financial gain for success and the stick of financial ruin for failure, and I don’t have a clue what percent of the pie each part makes up. I’m not saying it’s a good situation, but it’s the way things go, and I don’t think anyone alive could stop it. Once something is out of the bag, there ain’t any putting it back.
Basically I’m with you that it will be used for things that make life worse for people, and that sucks; it would be great if that were not the case. But that doesn’t change the fact that I can’t do anything about it, and meanwhile it can still be a useful tool, so I’m going to use it the best that I can regardless of how others use it, because there’s really nothing I can do except keep pushing forward the best I can, just like anyone else.
It might just be the difference in perspective. I agree with your assessment of how things are, but not of how they will be in the future. There are countries that are more responsible in their research, so I know it’s possible. It’s all politics, and I don’t believe in giving up on social change just yet.
Sometimes it leads me wildly astray when I do that, like a really bad tutor… but it is good if you want a refresher and can spot the bullshit on the side. It is good for spotting things that you didn’t know before, which you can fact-check afterwards.
…but maybe other review papers and textbooks are still better…
Yea, this highlights a fundamental tension I think: sometimes, perhaps oftentimes, the point of doing something is the doing itself, not the result.
Tech is hyper focused on removing the “doing” and reproducing the result. Now that it’s trying to put itself into the “thinking” part of human work, this tension is making itself unavoidable.
I think we can all take it as a given that we don’t want to hand total control to machines, simply because of accountability issues. Which means we want a human “in the loop” to ensure things stay sensible. But the ability of that human to keep things sensible requires skills, experience and insight. And all of the focus our education system now has on grades and certificates has led us astray into thinking that practice and experience don’t mean that much. In a way the labour market and employers are relevant here in their insistence on experience (to the point of absurdity sometimes).
Bottom line is that we humans are doing machines, and we learn through practice and experience, in ways I suspect are much closer to building intuitions. Being stuck on a problem, being confused and getting things wrong are all part of this experience. Making it easier to get the right answer is not making education better. LLMs likely have no good role to play in education and I wouldn’t be surprised if banning them outright in what may become a harshly fought battle isn’t too far away.
All that being said, I also think LLMs raise questions about what it is we’re doing with our education and tests and whether the simple response to their existence is to conclude that anything an LLM can easily do well isn’t worth assessing. Of course, as I’ve said above, that’s likely manifestly rubbish … building up an intelligent and capable human likely requires getting them to do things an LLM could easily do. But the question still stands I think about whether we need to also find a way to focus more on the less mechanical parts of human intelligence and education.
LLMs likely have no good role to play in education and I wouldn’t be surprised if banning them outright in what may become a harshly fought battle isn’t too far away.
While I agree that LLMs have no place in education, you’re not going to be able to do more than just ban them in class unfortunately. Students will be able to use them at home, and the alleged “LLM detection” applications are no better than throwing a dart at the wall. You may catch a couple students, but you’re going to falsely accuse many more. The only surefire way to catch them is them being stupid and not bothering to edit what they turn in.
Yea I know, which is why I said it may become a harsh battle. Not being in education, it really seems like a difficult situation. My broader point about the harsh battle was that if it becomes well known that LLMs are bad for a child’s development, then there’ll be a good amount of anxiety from parents etc.
deleted by creator
This! Don’t blame the tech, blame the grown-ups not able to teach the young how to use tech!
The study is still valuable; this is a math class, not a technology class, so understanding its impact is important.
Yea, I did not read that the prompt-engineered ChatGPT class did better than the non-ChatGPT class 😄 But I guess that proves my point as well, because if the students in the group with normal ChatGPT were taught how to prompt it so that it answers in a more teacher-like style, I bet they would have similar results to the students with the prompt-engineered ChatGPT.
If you actually read the article you will see that they tested both allowing the students to ask for answers from the LLM, and then limiting the students to just ask for guidance from the LLM. In the first case the students did significantly worse than their peers that didn’t use the LLM. In the second one they performed the same as students who didn’t use it. So, if the results of this study can be replicated, this shows that LLMs are at best useless for learning and most likely harmful. Most students are not going to limit their use of LLMs for guidance.
You AI shills are just ridiculous, you defend this technology without even bothering to read the points under discussion. Or maybe you read an LLM generated summary? Hahahaha. In any case, do better man.
Obviously no one’s going to learn anything if all they do is blatantly ask it for the answers and the writing.
You should try reading the article instead of just the headline.
deleted by creator
TLDR: ChatGPT is terrible at math and most students just ask it for the answer. Giving students the ability to ask something that doesn’t know math for the answer makes them less capable. An enhanced chatbot that was pre-fed with questions and correct answers didn’t screw up the learning process in the same fashion, but it also didn’t help them perform any better on the test, because again they just asked it to spoon-feed them the answer.
references
ChatGPT’s errors also may have been a contributing factor. The chatbot only answered the math problems correctly half of the time. Its arithmetic computations were wrong 8 percent of the time, but the bigger problem was that its step-by-step approach for how to solve a problem was wrong 42 percent of the time.
The tutoring version of ChatGPT was directly fed the correct solutions and these errors were minimized.
The researchers believe the problem is that students are using the chatbot as a “crutch.” When they analyzed the questions that students typed into ChatGPT, students often simply asked for the answer.
There is a part here that sounds interesting:
The students who used it did spectacularly better on the practice problems, solving 127 percent more of them correctly compared with students who did their practice work without any high-tech aids. But on a test afterwards, these AI-tutored students did no better.
Do you think these students who used ChatGPT can do the exercises “the old-fashioned way”? For me it was a nightmare trying to solve a calculus problem with just the trash books that don’t explain a damn thing. I had to go to different resources, Wolfram, YouTube, but what happened when there was a problem that wasn’t well explained in any resource? I hate OpenAI, I want to punch Altman in the face. But this doesn’t mean we have to bait this hard in the title.
This isn’t a new issue. Wolfram Alpha has been around for 15 years and can easily handle high school level math problems.
Except Wolfram Alpha is able to correctly explain step-by-step solutions, which was an aid in my education.
Only old farts still use Wolfram
What do young idiots use?
ChatGPT apparently lol
I can’t remember, but my dad said before he retired he would just pirate Wolfram because he was too old to bother learning whatever they were using. He spent 25 years in academia teaching graduate chem-e before moving to the private sector. He very briefly worked with one of the Wolfram founders at UIUC.
Edit: I’m thinking of Mathematica, he didn’t want to mess with learning python.
Where did you think you were?
Taking too many shortcuts doesn’t help anyone learn anything.
ChatGPT lies which is kind of an issue in education.
As far as seeing the answer, I learned a significant amount of math by looking at the answer for a type of question and working backwards. That’s not the issue as long as you’re honestly trying to understand the process.
Maybe, if the system taught more of HOW to think and not WHAT. Basically more critical thinking/deduction.
This same kinda topic came up back when I was in middle/high school, when search engines became widespread.
However, LLMs shouldn’t be trusted for factual anything, same as Joe Blow’s blog on some random subject. Did they forget to teach cross-referencing too? I’m sounding too bitter and old, so I’ll stop.
However, LLMs shouldn’t be trusted for factual anything, same as Joe Blow’s blog on some random subject.
Podcasts are 100% reliable tho
I’ve found AI helpful in asking for it to explain stuff. Why is the problem solved like this, why did you use this and not that, could you put it in simpler terms and so on. Much like you might ask a teacher.
I think this works great if the student is interested in the subject, but if you’re just trying to work through a bunch of problems so you can stop working through a bunch of problems, it ain’t gonna help you.
I have personally learned so much from LLMs (although you can’t really take anything at face value and have to look things up independently, but it gives you a great starting place), but it comes from a genuine interest in the questions I’m asking and things I dig at.
I have personally learned so much from LLMs
No offense, but that’s what the article is also highlighting: students, even the good ones, believe they learned. Once it’s time to pass a test designed to evaluate whether they actually did, the results are not that positive.
I mean…
Do you want to have a conversation with me in Japanese? 😅
At the end of the day, I feel like it’s how you use the tool. “if you’re just trying to work through a bunch of problems so you can stop working through a bunch of problems, it ain’t gonna help you.” How do you think a bunch of kids using this are going to be using it when it comes to school work that they’re required to finish, but not likely actually interested in?
Yep. My first interaction with GPT pro lasted 36 hours and I nearly changed my religion.
AI is the best thing to come to learning, ever. If you are a curious person, this is bigger than Gutenberg, IMO.
That sounds like a manic episode
Youdontsay.png