Researchers at Google AI present a machine learning-based approach to teach strong LLMs how to better reason with graph information

Written By Adarsh Shankar Jha

Imagine everything in your immediate vicinity, from your friends and family to the utensils in your kitchen and the parts on your bike. Each of them is related in some way. In computer science, the word “graph” describes the relationships between entities. Nodes are the objects in a graph, while edges are the links between them that show their relationship. The very fabric of the Internet is a vast network of interconnected web pages, and the information that search engines rely on is also structured like a graph.
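To make the node-and-edge vocabulary concrete, here is a minimal sketch in Python of a small social graph stored as an adjacency list (the people and relationships are made up for illustration):

```python
# A tiny social graph: nodes are people, edges are "knows" relationships.
# Stored as an adjacency list: each node maps to the set of its neighbors.
graph = {
    "Alice": {"Bob", "Carol"},
    "Bob": {"Alice"},
    "Carol": {"Alice", "Dave"},
    "Dave": {"Carol"},
}

# An edge exists if one node appears in the other's neighbor set.
print("Bob" in graph["Alice"])   # True: Alice and Bob are directly connected
print("Dave" in graph["Alice"])  # False: no direct edge between them
```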

A new Google study aims to teach powerful LLMs to reason better with graph information. The motivation is twofold: graphs are ubiquitous, and LLM technology is advancing rapidly. While LLMs are usually trained on plain text, graphs are a more effective way of organizing information. The goal is to test different approaches, find the most effective ones, and gain real-world insight. Converting graphs into text that LLMs can understand is a complex task; the difficulty is rooted in the structure of graphs themselves, with many nodes tied together by intricate webs of edges. This research focuses on methods for converting graphs into a language that LLMs can understand.
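To illustrate what “converting a graph into language” can look like, here is a minimal sketch of one possible graph-to-text encoding; the wording is a simplified stand-in for the templates the study compares, not the paper’s exact prompts:

```python
def encode_graph_as_text(edges):
    """Render an edge list as plain English that an LLM can read.

    A simplified, illustrative encoding -- not the exact templates
    used in the Google study.
    """
    nodes = sorted({n for edge in edges for n in edge})
    lines = [f"This graph has {len(nodes)} nodes: {', '.join(map(str, nodes))}."]
    for u, v in edges:
        lines.append(f"Node {u} is connected to node {v}.")
    return "\n".join(lines)

# A triangle graph: three nodes, three edges, one cycle.
print(encode_graph_as_text([(0, 1), (1, 2), (2, 0)]))
```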

The researchers first created a benchmark called GraphQA to rigorously determine the best method for graph-to-text translation. They do not rely on a single graph type to build an exhaustive, realistic LLM test. Instead, they use a variety of graphs to guarantee a wide range of connectivity, since certain types of graphs make these problems easier or harder to solve. In this way, GraphQA can reveal biases in how an LLM analyzes graphs, and the test becomes more representative of the real settings LLMs may encounter.

GraphQA deals with elementary graph operations such as verifying the existence of an edge, counting the number of edges or nodes, determining which nodes are connected to a given node, and detecting cycles in a graph. Despite their apparent simplicity, these tasks require familiarity with the relationships between nodes and edges. To teach models how to evaluate graphs effectively, GraphQA covers a wide range of tasks, from finding patterns to creating new connections. More advanced reasoning on graphs, such as discovering communities or identifying salient nodes, relies on these fundamental operations. Additionally, GraphQA includes random graph generation through several algorithms, such as the Erdős-Rényi model, scale-free networks, the Barabási-Albert model, and the stochastic block model, as well as simpler graph structures such as paths, complete graphs, and star graphs, offering a diverse collection of data for training.
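As a rough sketch of how such benchmark instances can be produced, the snippet below uses the networkx library’s standard generators for the graph families named above; the task checks are our own illustration, not GraphQA’s actual code:

```python
import networkx as nx

# Random graph generators of the kinds GraphQA draws from.
er = nx.erdos_renyi_graph(n=10, p=0.3, seed=0)       # Erdős-Rényi
ba = nx.barabasi_albert_graph(n=10, m=2, seed=0)     # Barabási-Albert (scale-free)
sbm = nx.stochastic_block_model([5, 5], [[0.7, 0.1], [0.1, 0.7]], seed=0)

# Simpler deterministic structures also included in the benchmark.
path, complete, star = nx.path_graph(10), nx.complete_graph(10), nx.star_graph(9)

# Elementary graph tasks of the kind GraphQA asks about.
g = er
print("edge (0, 1) exists:", g.has_edge(0, 1))
print("node count:", g.number_of_nodes())
print("edge count:", g.number_of_edges())
print("neighbors of node 0:", sorted(g.neighbors(0)))
print("contains a cycle:", len(nx.cycle_basis(g)) > 0)
```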

The team investigated various approaches to converting graphs into text that LLMs can process. They conducted three key experiments: one to assess the performance of LLMs on graph tasks, and two to study how LLM size and graph shape affect performance. All of their experiments were conducted on GraphQA.

They evaluated the performance of pre-trained LLMs on graph tasks such as cycle detection, node degree estimation, and link recognition. The findings showed that much depends on encoding: there is a strong relationship between how a graph is represented as text and LLM performance. Overall, the “incident” encoding performed extremely well across the board.
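To see how much the textual representation of the same graph can vary, here is a sketch of two contrasting encoding styles; the names and templates are simplified approximations, not the study’s exact encodings:

```python
from collections import defaultdict

edges = [(0, 1), (1, 2), (2, 0)]

def edge_list_encoding(edges):
    # One entry per edge: terse, structure left implicit.
    return " ".join(f"({u}, {v})" for u, v in edges)

def incident_style_encoding(edges):
    # Group each node with everything incident to it -- an approximation
    # of the "incident" style that performed well in the study.
    nbrs = defaultdict(set)
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    return " ".join(
        f"Node {n} is connected to nodes {', '.join(map(str, sorted(ns)))}."
        for n, ns in sorted(nbrs.items())
    )

print(edge_list_encoding(edges))       # (0, 1) (1, 2) (2, 0)
print(incident_style_encoding(edges))  # Node 0 is connected to nodes 1, 2. ...
```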

The team conducted this experiment to determine whether LLM performance improves with LLM size (number of parameters). To do this, they ran the same graph tasks on four different PaLM 2 sizes: XXS, XS, S, and L. The findings are summarized here:

  • When it came to graph reasoning tasks, larger models often performed better. The additional parameters appeared to allow them to learn more complex patterns.
  • Interestingly, the edge-exists task, which involves determining whether two nodes in a graph are related, was less affected by size.
  • When it came to the problem of cycle checking (determining whether a graph contains a cycle), not even the largest LLM could reliably beat a simple baseline solution (see the sketch after this list). This shows that LLMs still have room to improve on certain graph tasks.
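For context on what a simple baseline can mean here, one hedged reading is a majority-class guesser: if most generated graphs contain a cycle, always answering “yes” already scores well. The sketch below is our own illustration of that idea, not the paper’s baseline code:

```python
import networkx as nx

# Build a small evaluation set of random graphs and label each one.
graphs = [nx.erdos_renyi_graph(n=10, p=0.3, seed=s) for s in range(100)]
labels = [len(nx.cycle_basis(g)) > 0 for g in graphs]  # True if a cycle exists

# Majority baseline: always answer with the most frequent label.
majority = max(set(labels), key=labels.count)
accuracy = sum(label == majority for label in labels) / len(labels)
print(f"Majority baseline answers {majority!r} and scores {accuracy:.0%}")
```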

The researchers also investigated whether an LLM’s problem-solving ability on a given graph is affected by its “shape”, that is, how its nodes are connected. The study shows that graph structure significantly affects LLM performance. For example, in a test for the existence of cycles, LLMs performed admirably on densely connected graphs (where cycles are abundant) but poorly on path graphs (where cycles never appear). Interestingly, offering a few mixed examples helped the models adapt: for cycle checks, the researchers included both examples containing cycles and cycle-free examples as few-shot examples in the prompt.
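A sketch of what such a few-shot prompt might look like, mixing cycle-containing and cycle-free examples; the helper names and wording are our own illustration, not the paper’s actual prompt templates:

```python
def build_cycle_prompt(encode, examples, query_edges):
    """Assemble a few-shot prompt mixing cycle and no-cycle examples.

    `encode` turns an edge list into text; `examples` pairs edge lists
    with "Yes"/"No" answers. All names here are illustrative.
    """
    parts = [
        f"Graph: {encode(edges)}\nDoes this graph contain a cycle? {answer}"
        for edges, answer in examples
    ]
    parts.append(f"Graph: {encode(query_edges)}\nDoes this graph contain a cycle?")
    return "\n\n".join(parts)

few_shots = [
    ([(0, 1), (1, 2), (2, 0)], "Yes"),  # a triangle: contains a cycle
    ([(0, 1), (1, 2), (2, 3)], "No"),   # a path: cycle-free
]
encode = lambda edges: "; ".join(f"{u} -- {v}" for u, v in edges)
print(build_cycle_prompt(encode, few_shots, [(0, 1), (1, 2)]))
```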

Findings from this research shed light on best practices for preparing graphs for LLMs. With the right encoding methods, an LLM’s accuracy on graph problems can improve by anywhere from 5% to over 60%. The researchers hope that their new benchmark, GraphQA, will encourage further studies in this area.


Check out the Paper and Blog. All credit for this research goes to the researchers of this project. Also, don’t forget to follow us on Twitter. Join our Telegram Channel, Discord Channel, and LinkedIn Group.

If you like our work, you will love our newsletter.

Don’t forget to join our 38k+ ML SubReddit.



Dhanshree Shenwai is a Computer Science Engineer with solid experience in FinTech companies covering Finance, Cards & Payments, and Banking, and a strong interest in AI applications. She is enthusiastic about exploring new technologies and developments in today’s evolving world that make everyone’s life easier.

