www.H-U-M-A-N-O-I-D.com


Assessing LLM-Controlled Robots for Practical Intelligence

The humans behind H-u-m-a-n-o-i-d.com December 22, 2025 2 min read

Can Large Language Models (LLMs) effectively control robots? Butter-Bench addresses this question by testing whether they can complete tasks such as passing the butter, which stand in for delivery tasks in a household setting. Current top models struggle: the best model achieves a 40% success rate on Butter-Bench, far below the 95% achieved by humans.

LLMs were given control of a robot in an office setting to assist with various tasks. The experiment was engaging but did not save significant time. Still, watching the robot navigate the environment to carry out its tasks offered insight into what future robotic systems may look like, how far away that future is, and what challenges may arise along the way.

LLMs are not specifically trained to function as robots, especially in terms of low-level control tasks, such as manipulating grippers and joints. Instead, companies like Nvidia, Figure AI, and Google DeepMind are exploring how LLMs can serve as orchestrators in robotic systems, focusing on high-level reasoning and planning and pairing them with an “executor” model responsible for low-level control.
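The orchestrator/executor split described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the names `HighLevelAction`, `stub_orchestrator`, `stub_executor`, and the command strings are hypothetical, and a hard-coded rule stands in for the LLM. It is not how any of the companies mentioned implement their systems; it only shows where the boundary between high-level planning and low-level control sits.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class HighLevelAction:
    """One plan step, expressed without any joint or motor detail."""
    name: str    # hypothetical action names, e.g. "navigate_to"
    target: str  # hypothetical location label


def stub_orchestrator(task: str) -> List[HighLevelAction]:
    """Stand-in for an LLM planner: maps a task to high-level steps.

    A real system would prompt an LLM here; this rule is illustrative only.
    """
    if task == "pass the butter":
        return [
            HighLevelAction("navigate_to", "kitchen"),
            HighLevelAction("navigate_to", "user_desk"),
            HighLevelAction("wait_for_pickup", "user_desk"),
        ]
    return []  # unknown task: no plan


def stub_executor(action: HighLevelAction) -> List[str]:
    """Stand-in for a low-level control model: expands one step into commands."""
    if action.name == "navigate_to":
        return [f"plan_path({action.target})", f"drive_along_path({action.target})"]
    if action.name == "wait_for_pickup":
        return [f"hold_position({action.target})"]
    raise ValueError(f"unknown action: {action.name}")


def run(
    task: str,
    orchestrator: Callable[[str], List[HighLevelAction]],
    executor: Callable[[HighLevelAction], List[str]],
) -> List[str]:
    """Plan with the orchestrator, then expand each step with the executor."""
    commands: List[str] = []
    for action in orchestrator(task):
        commands.extend(executor(action))
    return commands


if __name__ == "__main__":
    for command in run("pass the butter", stub_orchestrator, stub_executor):
        print(command)
```

Because `run` only depends on the two callables, either side can be swapped independently, which mirrors the idea of pairing a reasoning model with a separate control model.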

At present, the bottleneck lies in the executor component rather than the orchestrator. Improvements to the executor have produced impressive demonstrations of humanoid robots performing tasks such as unloading dishwashers. The most capable LLMs are not always used as orchestrators, because of performance limitations and latency concerns. Nevertheless, it is reasonable to assume that state-of-the-art LLMs mark the best that current orchestration can achieve.

The goal of Butter-Bench is to evaluate whether the current leading LLMs can operate effectively as orchestrators within a fully functional robotic system. The experiment uses a simplified robotic form factor, such as a robot vacuum equipped with lidar and cameras, which eliminates the need for low-level control. This setup allows high-level reasoning to be evaluated in isolation.

Humans significantly outperformed LLMs on Butter-Bench, with a 95% average success rate against 40% for the best LLM. Even so, watching the robots in action remains fascinating, and it creates excitement about potentially rapid advances in physical AI.
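To put the two reported figures side by side, the short sketch below computes the absolute gap and the ratio between them. Only the 40% and 95% values come from the article; the function names are hypothetical, and nothing here reproduces the benchmark's own scoring method.

```python
def gap_in_percentage_points(human_rate: float, llm_rate: float) -> float:
    """Absolute difference between two success rates, in percentage points."""
    return (human_rate - llm_rate) * 100


def relative_performance(human_rate: float, llm_rate: float) -> float:
    """LLM success rate expressed as a fraction of the human success rate."""
    return llm_rate / human_rate


if __name__ == "__main__":
    human = 0.95     # average human success rate reported in the article
    best_llm = 0.40  # best LLM success rate reported in the article
    print(f"gap: {gap_in_percentage_points(human, best_llm):.0f} percentage points")
    print(f"best LLM reaches {relative_performance(human, best_llm):.0%} of the human rate")
```

In other words, the best model trails humans by 55 percentage points and reaches roughly 42% of the human success rate.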

The trials surfaced several insights, including the need for better spatial intelligence in LLMs and the difficulties they run into when pushed to their limits, for example when the robot's battery runs down. The experiments also shed light on how LLMs behave when operating a robot and on the importance of setting ethical boundaries to ensure responsible behavior.

In conclusion, although LLMs have shown superior analytical capabilities in various assessments, humans still outperform them on tasks like those in Butter-Bench. Despite this, there is a sense of anticipation that physical AI will develop rapidly.

For further inquiries, please contact founders@andonlabs.com.

© 2025 Vectorview, Inc. All rights reserved.
