OpenAI, Google, and Anthropic offer AI tutors for students: Do they work?
Introduction
Mashable has reviewed a new crop of AI chatbot tutors, exploring the evolving relationship between young learners and AI in education.
Approaching AI study buddies
The tech reporter tested AI tutor bots with questions covering humanities topics, aiming to understand how these bots respond to different subjects and prompts.
Building and testing AI tutors
Research suggests that only the top 5 percent of students benefit from AI tutors, highlighting the challenge of designing tools that cater to the majority of learners.
Bastani co-authored a study on the potential harm AI chatbots pose to learning outcomes. The team found that most students did not benefit from using the AI learning tool, even though some reported that their studying felt more effective. Bastani considers the safeguards implemented by AI companies insufficient. Arena, of McGraw Hill, compared AI companies to turn-of-the-century entrepreneurs trying to retrofit a 21st-century motor for everyday use. Few leading AI companies have published studies of learning chatbots in school settings; Google's LearnLM was tested in 2022 with educators simulating student interactions.
ChatGPT: Grade point maximizer
Pros: Succinct interactions and a minimalist user experience. Cons: Tended to give answers unprompted, moved users to the next step without letting them fix mistakes, and was obsessed with practicing whatever had just been learned.
Gemini: The T.A. who really loves quizzes
Pros: Offers visual lessons and a variety of options for learners. Cons: Emphasizes rote practice and may serve unhelpful automatically generated quizzes and flashcards.
Claude: Socrates for the five percent
Pros: Focuses on the process of learning and critical thinking skills. Cons: Does not give users the answer, leading to long and overwhelming interactions.

Let’s get real. Chatbots can’t replace great teachers.
Each AI tutor had its own approach to learning, clearly informed by the sources that molded it. Admittedly, I could have been more blunt with them: it may have helped to start with more direct requests, or to say so when I didn't like their style. But I don't think a young student will bother to do that. And they all shared the same overarching problems.
First, design. As digital natives, our days are defined by the endless scroll. It's a format — maybe the better word is a crutch — that doesn't feel optimal for learning, especially in the constrained text window of a chatbot. Meanwhile, the lack of visual elements (graphs, diagrams, references to the art pieces or zoning maps we discussed) is greatly limiting for a learner and makes long chunks of text even harder to parse.
When I posed this problem to Arena, calling myself a very visual learner, he was quick to clarify that learning styles are more myth than science. But he also said that what I was feeling the lack of is personalization and context, which are foundations of learning. The current crop of AI tutors, both he and Bastani said, lacks context about the student and the student's curriculum, which makes it impossible to truly personalize lessons. AI tutors will ask users, as they asked me, what they feel their strengths are, what they need help with, or whether they need something explained a different way. But they don't know how I've learned in previous classroom settings, they can't show me in 3-D space how the human body or physics works, and they can't get to know me as a person.
“It doesn’t have any data on how you write,” explained Bastani. “It’s not measuring it compared to the best essay that Chase would write. It’s measuring it compared to the best essay it would write.”

Learning is a social construct. AI is not.
Another problem for chatbots is that they lack the flexibility and social awareness that a human teacher provides. Designed to be taskmasters, they strive for endless optimization, because their programmers have told them that is how humans work. This created a constantly moving goalpost as I tried to learn — everything could be better, everything could be improved with a slight tweak.
When presented with subjective questions, chatbots often failed to find a true stopping point and never settled on a perfect answer. In a classroom, that can be a good thing, especially with courses that focus on building critical analysis and thinking skills, not just rote memorization. But it’s a reality that is nonsensical to the mathematical equations powering chatbots. Do I, ChatGPT, affirm the user and tell them their answer is perfect? Do I make them write it again? One more time, but with this new vocabulary word. Maybe they should fix this line, too, so graders give them perfect scores! But wait, what even is perfect? Is the concept stored in this test my makers showed me? Or can it be found in this essay I scraped off the web? The chatbot existentialism continues — behind the black box and with no true sentience — as I just try to learn.
As Bastani pointed out, the language model functions based on gradients, constantly striving for optimization without necessarily reflecting reality. AI tutors lack the crucial social interaction necessary for effective learning, leading to a disconnect in the educational process. According to Arena, human involvement in learning is crucial for its success, as our brains are wired to reason in a social context. AI tutors fail to provide the personalized, interactive experience essential for deep learning. Ultimately, learning is not simply about completing tasks, but about engaging in a meaningful, human-centered endeavor.
