Teaching AI Models To Learn Like Humans
You don’t have to be a keen industry watcher to know that this year’s breakout star in the tech world is ChatGPT. The groundbreaking AI chatbot from OpenAI is supremely capable in many ways — taking mere minutes (sometimes seconds) to summarise reports, write school essays, draft travel itineraries, debug code, provide career advice…the impressive list goes on.
“Right now in machine learning literature, ChatGPT and other big models can do a lot of amazing things,” says Zhang Mengmi, a computational neuroscientist at Singapore’s Agency for Science, Technology and Research (A*STAR). “But there’s a big problem: they can only learn from the data you feed them.”
[caption id="attachment_284698" align="aligncenter" width="751"] ChatGPT is one of the most powerful AI tools available today — capable of drafting essays, summarising reports, and much more. (Image credit: Shantanu Kumar)[/caption]
To complicate matters, AI models are prone to catastrophic forgetting, she says, “meaning they can only recognise or learn the current knowledge you are teaching, while potentially forgetting about everything you’ve taught them before.”
Being unable to recall past information is far from desirable — especially if AI is meant to function effectively in the real world where “the environment is dynamically evolving and the data stream is always increasing incrementally over time,” says Zhang.
Take for example the case of a robot restocking shelves in a supermarket. “You want the robot to be able to adapt quickly and recognise new items, but at the same time you don’t want it to forget about older items on the shelf, otherwise that’s going to cause an issue,” says Zhang.
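The failure mode Zhang describes is easy to reproduce. Below is a minimal, hypothetical sketch (not her lab's code): a tiny logistic-regression classifier is trained on one task, then trained on a second, conflicting task with no replay of old data — and its accuracy on the first task collapses.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_task(flip):
    """Toy binary task: the label depends on the sign of the first feature.
    Flipping the labels creates a second task that conflicts with the first."""
    X = rng.normal(size=(200, 2))
    y = (X[:, 0] > 0).astype(float)
    return X, (1 - y) if flip else y

def train(w, X, y, lr=0.5, epochs=100):
    """Plain logistic-regression gradient descent on a single task."""
    for _ in range(epochs):
        p = 1 / (1 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)
    return w

def accuracy(w, X, y):
    return float((((X @ w) > 0) == (y > 0.5)).mean())

w = np.zeros(2)
XA, yA = make_task(flip=False)   # "old items on the shelf"
XB, yB = make_task(flip=True)    # "new items" that conflict with the old

w = train(w, XA, yA)
acc_before = accuracy(w, XA, yA)  # high: task A has been learnt

w = train(w, XB, yB)              # sequential training, no replay
acc_after = accuracy(w, XA, yA)   # collapses: task A is forgotten
```

Because the new task overwrites the very weights that encoded the old one, the model "forgets" task A almost entirely — the same dynamic, writ small, that plagues much larger networks.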
Similarly, you would want an AI model you train to recognise that Joe Biden is the new U.S. president, while remembering that Donald Trump preceded him.
To drive home her point about how catastrophic forgetting can negatively affect AI prediction performance, Zhang cites another, more recent, scenario: Covid-19’s tendency to rapidly evolve, with new variants popping up around the globe every couple of months. “You definitely want your AI model to be robust and be able to accurately diagnose patients with the correct variant,” she explains. “It’s critically important because time is precious and you don’t have to retrain your robots from scratch every single time.”
Human Brains As Inspiration
Zhang, who heads the DeepNeuroCognition lab at A*STAR, is working on ways to reduce such forgetfulness. In April 2022, she embarked on a three-year project funded by AI Singapore that is aimed at designing a continuous learning framework for AI algorithms. “The goal is to try and help models recognise objects or object classes incrementally over time, in a memory-efficient manner without repeated visits of the old data,” she explains.
The secret to solving this catastrophic forgetting problem, Zhang believes, lies in human brains. People are able to thrive in constantly evolving environments because they are able to continually extract knowledge from their surroundings, while retaining previously learnt skills and experiences and transferring them to new tasks at hand. Today’s artificial neural networks come nowhere close to emulating this ability.
“Our biological brains are so low-powered — we eat only three meals a day — and yet we can do so many amazing things,” says Zhang. “There must be something unique about us that we haven’t discovered yet which is also going to be relatively important for robots.”
[caption id="attachment_284699" align="aligncenter" width="731"] Zhang and other researchers believe they can solve the catastrophic forgetting problem of AI models by turning to human brains for inspiration. (Image credit: Alexandra Koch)[/caption]
While studying engineering as an undergraduate at the National University of Singapore, Zhang quickly became attracted to robotics. In her final year, she worked on a project involving drones. She recalls: “I realised those drones were so dumb in terms of their brains. You have to control them to fly anywhere, they don’t have their own perceptual system and can’t make their own smart decisions.”
And so she switched tracks upon graduation to study computer vision for her PhD. Halfway through, she came to another realisation: to excel, she also needed to study biological brains, and bring together knowledge from psychology, neuroscience, and cognitive science.
“I hope to take some of that knowledge and incorporate them into AI to help build smarter systems,” Zhang explains. “Basically, I’m interested in developing artificial brains.”
A Win-Win Scenario
As a first step, Zhang and her collaborators conducted studies to determine how humans are able to continuously learn to recognise new objects. Specifically, they were interested in how knowledge transfer was affected when learning progressed incrementally from easier to harder data — similar to how children learn mathematics, starting with addition and subtraction before moving on to multiplication and division.
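The easy-to-hard progression described above can be sketched mechanically. In this illustrative (and entirely hypothetical) example, a sample's difficulty is proxied by its distance from the class boundary, and training data is split into curriculum stages from easiest to hardest:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy dataset: label depends on the sign of the first feature.
X = rng.normal(size=(100, 2))
y = (X[:, 0] > 0).astype(int)

# Hypothetical difficulty score: samples near the class boundary
# (first feature close to zero) are harder to classify.
difficulty = -np.abs(X[:, 0])

order = np.argsort(difficulty)      # easiest samples first
stages = np.array_split(order, 4)   # four curriculum stages

# A model would now train on stages[0] first, then add stages[1],
# and so on — addition and subtraction before multiplication and division.
```

The point is only the ordering mechanism: real curricula would use a learned or empirically measured difficulty score rather than this toy proxy.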
The researchers conducted a series of tests on close to 250 volunteers. Each participant was shown images of toy-vehicle-like objects drawn from five families, which differed in the shapes of their central bodies as well as in the colours and types of their protrusions. Participants were then shown novel images and asked to identify which family each new object belonged to.
From this, Zhang and her collaborators devised an algorithm — the Curriculum Designer — that can automatically design curricula and assess their quality. For instance, instead of exhaustively testing all 120 possible curricula on learners, the Curriculum Designer could rank the top 30. Of these, an average of three were found to be effective for teaching not only machines, but humans too.
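The number 120 corresponds to the 5! orderings of five object families. A toy sketch of the idea — with a made-up scoring function standing in for the Curriculum Designer's learned quality measure — looks like this:

```python
from itertools import permutations

families = ["A", "B", "C", "D", "E"]
all_curricula = list(permutations(families))  # 5! = 120 possible orderings

def transfer_score(curriculum):
    """Hypothetical proxy for curriculum quality: here, rewarding orderings
    close to alphabetical. The real designer would score predicted
    learning transfer instead."""
    return -sum(abs(ord(f) - (ord("A") + i)) for i, f in enumerate(curriculum))

# Rank all curricula and keep the top 30, as in the study,
# rather than testing all 120 on learners.
ranked = sorted(all_curricula, key=transfer_score, reverse=True)
top_30 = ranked[:30]
```

Everything specific here (the scoring function, the family labels) is invented for illustration; only the combinatorics — 120 candidates pruned to a shortlist of 30 — mirrors the study.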
The researchers then tested how compositionality — the notion of replaying only certain parts of an image or video, rather than its entirety — affects continuous learning in AI. “Our findings were surprising,” says Zhang. “It seems that compositionality plays a much more important role than I imagined.”
One would think that the more training data, the merrier, she explains. “But it turns out to be the opposite.” As a result, she and her collaborators devised an algorithm, which they call Compositional Replay Using Memory Blocks or CRUMB, that uses ‘memory blocks’ to reconstruct new stimuli, enabling replay of specific memories during later tasks — a process she describes as being similar to how “crumbs gathered together can form a loaf of bread”.
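The 'crumbs' metaphor can be made concrete with a small sketch. This is not CRUMB itself, only an illustrative vector-quantisation analogue: a shared bank of 'memory blocks' stands in for stored features, and each new feature vector is saved as the index of its nearest block, so that replay reassembles an approximation from the blocks rather than storing raw data:

```python
import numpy as np

rng = np.random.default_rng(2)

# A small shared bank of 'memory blocks' (one per row).
n_blocks, dim = 16, 8
bank = rng.normal(size=(n_blocks, dim))

def compress(feature_map):
    """Store each feature vector as the index of its nearest memory block —
    integer codes, far smaller than the raw floating-point features."""
    dists = np.linalg.norm(feature_map[:, None, :] - bank[None, :, :], axis=-1)
    return dists.argmin(axis=1)

def replay(codes):
    """Reassemble an approximate feature map from stored block indices,
    like crumbs gathered back into a loaf."""
    return bank[codes]

features = rng.normal(size=(32, dim))  # e.g. feature vectors from one frame
codes = compress(features)             # compact memory of the stimulus
approx = replay(codes)                 # approximate replay during later tasks
```

Storing indices instead of raw features is what makes replay cheap: the memory footprint of the codes is a small fraction of the original features, echoing the efficiency gains reported for CRUMB.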
[caption id="attachment_284700" align="aligncenter" width="800"] One of the continuous learning algorithms Zhang and her collaborators have devised involves using parts of an image or video (rather than the entire thing) to train an AI model, similar to how crumbs can be combined to form an entire loaf of bread. (Image credit: Simon Bleasdale)[/caption]
The approach uses only 4% of the memory and 20% of the runtime when learning from a video stream, making it far more efficient. “It’s a bit like a student studying for an exam,” Zhang says. “He has his pile of lecture notes but he’s just going to look through the highlighted parts, rather than reading every single word or sentence.”
With two years remaining in her AI Singapore project, Zhang now wants to extend the compositionality learning studies to humans. “One of the things I want to do is study both human and AI learning,” she says.
Zhang imagines a win-win scenario: “My hope is that as human behaviour helps develop smarter AI models capable of effective continuous learning, these models will also come up with new strategies to enhance human learning abilities.”