ETH Zurich researchers reveal new insights into compositional learning in AI through modular hypernetworks

Written by Adarsh Shankar Jha

From a young age, people display an incredible ability to recombine their knowledge and skills in new ways. A child can effortlessly combine running, jumping, and throwing to invent new games. A mathematician can flexibly recombine basic mathematical operations to solve complex problems. This talent for compositional reasoning – constructing new solutions from primitive building blocks – has proven a formidable challenge for artificial intelligence.

However, a multi-institutional team of researchers may have cracked the code. In a study to be presented at ICLR 2024, scientists from ETH Zurich, Google, and Imperial College London reveal new theoretical and empirical insights into how modular neural network architectures called hypernetworks can discover and harness the hidden compositional structure underlying complex tasks.

Current state-of-the-art AI models like GPT-3 are remarkable, but they are also incredibly data-hungry. They require huge training datasets to acquire new skills because they lack the ability to flexibly combine their knowledge to solve problems outside their training regime. Compositionality, by contrast, is a defining characteristic of human intelligence: it allows our brains to rapidly build complex representations from simpler elements, enabling efficient acquisition and generalization of new knowledge. Bringing this compositional reasoning capability to artificial intelligence is considered a holy grail of the field, as it could lead to more flexible, data-efficient systems that generalize their skills far beyond their training data.

The researchers hypothesize that hypernetworks may hold the key to unlocking compositional AI. Hypernetworks are neural networks that generate the weights of another neural network, here through modular combinations of parameters. Unlike conventional “monolithic” architectures, hypernetworks can learn distinct skill modules and flexibly combine them by linearly combining parameters in weight space.
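To make this concrete, here is a minimal sketch of a linear hypernetwork in PyTorch. This is a hedged toy example, not the authors’ implementation; the class name, module count, and dimensions are our illustrative assumptions. A per-task code `z` mixes a bank of parameter “modules” into the weights of a downstream task network.

```python
# A minimal sketch (not the paper's code) of a linear hypernetwork:
# a task code z selects a linear combination of parameter "modules",
# and the combined parameters become the weights of a task network.
import torch
import torch.nn as nn

class LinearHypernetwork(nn.Module):
    def __init__(self, num_modules=4, in_dim=8, out_dim=8):
        super().__init__()
        # One weight matrix per module: the "skill" primitives.
        self.module_bank = nn.Parameter(
            torch.randn(num_modules, out_dim, in_dim) * 0.1
        )

    def forward(self, z, x):
        # z: (num_modules,) mixing coefficients for this task.
        # The task network's weights are a linear combination of modules.
        W = torch.einsum("m,moi->oi", z, self.module_bank)
        return x @ W.T

hnet = LinearHypernetwork()
z = torch.tensor([1.0, 0.0, 1.0, 0.0])  # e.g. "use modules 1 and 3"
x = torch.randn(2, 8)
print(hnet(z, x).shape)  # torch.Size([2, 8])
```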

Think of each module as a specialist in a specific skill. Hypernetworks act as modular architects, assembling custom teams of these experts to address each new challenge that arises. The key question is: under what conditions can a hypernetwork learner recover the ground-truth expert modules and composition rules simply by observing the results of their collective efforts?

Through a theoretical analysis that leverages the teacher-student framework, the researchers derived surprising new insights. They proved that, under certain conditions on the training data, a hypernetwork learner can provably identify the ground-truth modules and their compositions – up to a linear transformation – from a modular teacher hypernetwork. The critical conditions, illustrated in the sketch after this list, are:

  • Compositional support: every module must be used at least once during training, even if only in combination with others.
  • Connected support: no module may appear only in isolation – modules must co-occur with others across the training tasks.
  • No over-parameterization: the student’s capacity must not greatly exceed the teacher’s, or it can simply memorize each training task independently.
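Here is a minimal, hedged sketch (in PyTorch; all names, dimensions, and module codes are our illustrative assumptions, not the paper’s code) of a linear teacher-student experiment: a frozen teacher hypernetwork with ground-truth modules labels data for a handful of overlapping module combinations, and a student hypernetwork is trained to match it. Unlike the paper’s setting, the module code `z` is handed to the student directly rather than inferred from demonstrations.

```python
# Hedged teacher-student sketch: a frozen teacher with ground-truth
# parameter modules generates targets; a student with the same number
# of modules (no over-parameterization) learns from a few overlapping
# module combinations (compositional + connected support).
import torch

torch.manual_seed(0)
M, IN, OUT = 4, 8, 8
teacher = torch.randn(M, OUT, IN)                      # frozen ground truth
student = torch.randn(M, OUT, IN, requires_grad=True)  # learner

# Training codes: overlap pairwise (connected) and cover all modules.
train_codes = torch.tensor([
    [1., 1., 0., 0.],
    [0., 1., 1., 0.],
    [0., 0., 1., 1.],
])

opt = torch.optim.Adam([student], lr=1e-2)
for step in range(2000):
    opt.zero_grad()
    loss = 0.0
    for z in train_codes:
        x = torch.randn(32, IN)
        W_t = torch.einsum("m,moi->oi", z, teacher)
        W_s = torch.einsum("m,moi->oi", z, student)
        loss = loss + ((x @ W_s.T - x @ W_t.T) ** 2).mean()
    loss.backward()
    opt.step()

# Test compositional generalization on an unseen module combination.
z_new = torch.tensor([1., 0., 0., 1.])
x = torch.randn(32, IN)
err = ((x @ torch.einsum("m,moi->oi", z_new, student).T
        - x @ torch.einsum("m,moi->oi", z_new, teacher).T) ** 2).mean()
print(f"unseen-combination error: {err.item():.4f}")
```

In this toy setup the unseen code [1, 0, 0, 1] lies in the linear span of the three training codes, so the trained student should match the teacher on it despite never having seen that combination.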

Remarkably, despite the exponentially many possible module combinations, the researchers showed that demonstrating only a linear number of combinations is sufficient for the student to achieve compositional generalization to any unseen combination.
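A back-of-the-envelope comparison (our numbers, purely illustrative) shows why this matters:

```latex
% With m modules that can each be switched on or off, there are 2^m
% possible combinations, yet on the order of m training combinations
% suffice for the student.
\[
\underbrace{2^{m}}_{\text{possible combinations}}
\quad \text{vs.} \quad
\underbrace{\mathcal{O}(m)}_{\text{training combinations}},
\qquad \text{e.g. } m = 20:\; 2^{20} \approx 10^{6} \;\text{vs.}\; \sim 20.
\]
```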

The researchers went beyond theory with a series of meta-learning experiments demonstrating the ability of hypernetworks to discover compositional structure in different settings – from synthetic modular task distributions to scenarios involving modular preferences and compositional goals.

In one experiment, they pitted hypernetworks against conventional meta-learning architectures such as ANIL and MAML in a simulated world where an agent had to navigate mazes, perform actions on colored objects, and maximize its modular “preferences.” While ANIL and MAML faltered when extrapolating to unseen preference combinations, hypernetworks flexibly generalized their behavior with high accuracy.

Remarkably, the researchers observed cases where the hypernetworks could linearly decode the ground-truth module activations from their learned representations, demonstrating their ability to extract the underlying modular structure from sparse task demonstrations.
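This kind of decoding analysis can be mimicked with a simple linear probe. Below is a hedged NumPy sketch on synthetic stand-in data (not the paper’s data or code): fit a least-squares map from learned task embeddings to ground-truth module activations, then evaluate the fit on held-out tasks.

```python
# Hedged linear-probe sketch: if learned embeddings linearly encode
# the true module activations, a least-squares decoder fit on some
# tasks should predict the activations of held-out tasks well.
import numpy as np

rng = np.random.default_rng(0)
n_tasks, emb_dim, n_modules = 200, 16, 4
Z_true = rng.integers(0, 2, size=(n_tasks, n_modules)).astype(float)
# Stand-in for learned embeddings: an unknown linear mix plus noise.
A = rng.normal(size=(n_modules, emb_dim))
E = Z_true @ A + 0.05 * rng.normal(size=(n_tasks, emb_dim))

train, test = slice(0, 150), slice(150, None)
W, *_ = np.linalg.lstsq(E[train], Z_true[train], rcond=None)
ss_res = ((E[test] @ W - Z_true[test]) ** 2).sum()
ss_tot = ((Z_true[test] - Z_true[test].mean(0)) ** 2).sum()
print(f"held-out R^2 of linear decoder: {1 - ss_res / ss_tot:.3f}")
```

A high held-out R² (as here, by construction) is the signature the researchers report: the modular structure is linearly recoverable from the learned representations.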

While these results are promising, challenges remain. Over-parameterization proved a key obstacle: given too many redundant modules, hypernetworks simply memorized individual tasks. Scaling compositional reasoning will likely require carefully balanced architectures. Still, this work lifts the veil on the path to compositional AI. With deeper insights into inductive biases, learning dynamics, and architectural design principles, researchers can pave the way toward AI systems with more human-like cognition – efficiently recombining skills to broadly generalize their capabilities.


Check out the Paper. All credit for this research goes to the researchers of this project. Also, don’t forget to follow us on Twitter. Join our Telegram Channel, Discord Channel, and LinkedIn Group.

If you like our work, you will love our newsletter.

Don’t forget to join our 39k+ ML SubReddit.


Vibhanshu Patidar

Vibhanshu Patidar is a Consulting Intern at MarktechPost. He is currently pursuing a B.S. at the Indian Institute of Technology (IIT) Kanpur. He is a robotics and machine learning enthusiast with a talent for unraveling the complexities of algorithms that bridge theory and practical applications.

