AI’s Ability to Solve Language-Based Puzzles Explored by NYU Researchers
Researchers at the NYU Tandon School of Engineering have investigated whether modern natural language processing (NLP) techniques can effectively solve language-based puzzles. The team, which includes Julian Togelius, Assistant Professor of Computer Science and Engineering and Director of the Game Innovation Lab at NYU Tandon, focused on two families of AI methods: large language models (LLMs) and sentence embedding techniques.
To test these methods, the team used GPT-3.5 and GPT-4, powerful large language models developed by OpenAI. They also employed sentence embedding models, including BERT, RoBERTa, MPNet, and MiniLM, which encode the meaning of text as numerical vectors. These embedding models, however, lack the broad language understanding and generation abilities of the LLMs.
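To give a sense of how embedding-based methods can approach a grouping puzzle, here is a minimal, illustrative sketch (not the study's actual code): it uses tiny hand-made vectors in place of real model embeddings, and groups the words whose vectors point in the most similar direction, measured by cosine similarity.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy 3-d "embeddings" standing in for output from a model like MiniLM:
# two semantic clusters (fish vs. string instruments).
embeddings = {
    "bass":   [0.9, 0.1, 0.0],
    "trout":  [0.8, 0.2, 0.1],
    "violin": [0.1, 0.9, 0.0],
    "cello":  [0.0, 0.8, 0.2],
}

def closest_pair(emb):
    """Return the pair of words with the highest cosine similarity."""
    words = list(emb)
    scored = ((cosine(emb[a], emb[b]), a, b)
              for i, a in enumerate(words) for b in words[i + 1:])
    _, a, b = max(scored)
    return {a, b}

print(closest_pair(embeddings))  # the two most semantically similar words
```

With real embeddings, the same idea scales up: score all candidate groups of four by their mutual similarity and propose the tightest cluster as a guess.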
While all of the AI models displayed some ability to tackle the Connections puzzles, a daily New York Times word game in which players must sort sixteen words into four themed groups of four, the challenge proved very difficult. Nonetheless, certain models, notably GPT-4, clearly outperformed the others.
One interesting finding was that the models' performance closely tracked the puzzles' difficulty tiers, mirroring the game's own categorization from "simple" to "challenging." By examining where LLMs struggle with Connections, researchers can gain insight into the limits of their semantic processing of natural language.
The researchers also found that a piecemeal approach significantly improved GPT-4's puzzle-solving ability, raising its accuracy above 39%. This highlights the effectiveness of chain-of-thought prompting in eliciting structured reasoning, a phenomenon identified in prior research.
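A piecemeal, chain-of-thought approach might be set up as follows. This is an assumed sketch, not the paper's exact prompt: the hypothetical `build_cot_prompt` function asks the model to reason step by step and commit to only one group at a time, rather than solving the whole board at once.

```python
def build_cot_prompt(words):
    """Build a chain-of-thought prompt for one step of a Connections board.

    `words` is the list of words still unsolved. The prompt asks the model
    to reason aloud, then commit to a single group of four.
    """
    board = ", ".join(words)
    return (
        f"Here are the remaining words from a Connections puzzle: {board}.\n"
        "Think step by step. First, list the candidate themes you notice.\n"
        "Then pick the ONE group of four words you are most confident about,\n"
        "explain the connection, and output it on a final line as:\n"
        "ANSWER: w1, w2, w3, w4"
    )

# Toy usage with a partial board; a real run would send this to an LLM API
# and parse the ANSWER line, then re-prompt with the remaining twelve words.
prompt = build_cot_prompt(["bass", "trout", "violin", "cello"])
print(prompt.splitlines()[0])
```

Solving group by group shrinks the search space after each correct guess, which is one plausible reason the piecemeal strategy helps.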
To conduct their study, the team used an online archive of 250 Connections puzzles spanning June 12, 2023 to February 16, 2024. With this investigation, the researchers aim to push the boundaries of AI capabilities and shed light on the potential of language models for complex tasks.