The Humanoid Robot Hype: A Capital-Fueled Tech Bubble?
As humanoid robots, the public face of embodied intelligence, repeatedly make headlines in 2024, skepticism is growing. Investment firms are pouring capital into the sector, driving startup valuations into the billions. Yet the robots' actual performance often falls short, raising the question of whether embodied intelligence is simply a carefully constructed Ponzi scheme orchestrated by capital.
The Reality vs. The Hype
The robots' sluggish movements, and the extensive rehearsal needed for simple tasks like picking up a coffee cup, stand in stark contrast to the hype. Hospital-purchased "smart" nursing assistants end up needing care themselves. This gap between investor enthusiasm and practical capability has fueled doubts about the technology's maturity and viability.
- Capital is pouring in, while reality lags.
- Past attempts faced similar challenges.
A History of Unfulfilled Promises
The concept of embodied intelligence, encompassing humanoid robots, robot dogs, autonomous driving, and drones, isn't new. Past attempts have struggled to live up to expectations. The Honda ASIMO, released in 2000, represented the pinnacle of biomimicry at the time.
- ASIMO: Despite advanced capabilities, including running, climbing stairs, and interacting with humans, its core behaviors relied on meticulously hand-programmed code, and each unit cost millions of dollars. The project was terminated in 2018.
- Boston Dynamics: Known for impressive feats like robotic backflips, the company nonetheless faced setbacks, changes in ownership, and declining valuations. The military found the combustion-engine-powered "BigDog" too noisy for combat.
A New Cycle Begins?
In 2021, Elon Musk announced his entry into the robotics field, sparking a new wave of interest in humanoid robots. The release of ChatGPT in late 2022 further accelerated the trend, and numerous humanoid robot companies emerged, particularly in China.
- 2023 Momentum: Robots showcased impressive advancements, such as dancing, playing basketball, running marathons, and performing acrobatics.
- Question: Is this time different?
Unpacking Embodied Intelligence
Assessing the reality of embodied intelligence requires a deep dive into its history, current state, core technology, and platforms, along with how our understanding of intelligence itself has evolved and what the field's future impact might be. The underlying questions: can robots integrate into daily life, and can machines truly mimic human intelligence?
The Fundamental Challenge: Control
The most basic understanding of robots revolves around the concept of control: how to control machines made of steel to behave like humans.
Early Attempts at Automata
- Da Vinci's Robot Warrior (1495): A design for a water-powered robot capable of sitting, standing, and waving its arms.
- Swiss Clockwork Writer (1774): A complex automaton with over 6,000 parts that could write, blink, and even dip its pen in ink.
These early attempts, while impressive, were essentially automated machines rather than true robots, as they lacked the ability to sense their environment or make autonomous decisions. The advent of modern computing was necessary for robots to exhibit any real intelligence.
The Turing Test and Its Limitations
In 1950, Alan Turing proposed the Turing test, suggesting that machines could eventually compete with humans in all purely intellectual fields. He considered equipping machines with the best sensory organs and teaching them to understand and speak English. While machines have since surpassed humans in areas like chess and even achieved a degree of language understanding and object recognition, true robots remain elusive.
The Three Branches of Embodied Intelligence
Before the 21st century, scientists believed a robot needed a full understanding of its environment in order to act in it. Embodied intelligence was therefore split into three components:
- Environmental Perception: The robot's ability to "see" and understand its surroundings.
- Decision-Making: The robot's capacity to process information and make choices.
- Action: The robot's ability to physically execute its decisions.
The Challenges of Environmental Perception
In 1973, Ichiro Kato's team at Waseda University completed WABOT-1, the first full-size humanoid robot. It used cameras for eyes, microphones for hearing, and touch sensors in its hands to explore its environment, and it acted using hydraulics. Its movement, however, was extremely slow: rebuilding the environmental model for each step of roughly 10 centimeters took about 45 seconds.
- LiDAR vs. Visual Perception: Modern robots rely on either LiDAR (Light Detection and Ranging) or visual perception to sense their surroundings. LiDAR sweeps laser beams to map the environment, while visual perception relies on cameras.
- Traditional Machine Vision: Early visual systems took a "reductionist" approach, using mathematics to detect edges, segment images by light and shadow, and reverse-engineer depth and surfaces into a 3D understanding of the scene.
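As an illustration of that reductionist pipeline, the sketch below runs OpenCV's Canny edge detector on a synthetic image and boxes the resulting contours. It is only a toy stand-in for the "edges first, geometry later" mindset, not any particular robot's perception stack; the image and thresholds are made up.

```python
import numpy as np
import cv2  # pip install opencv-python

# Synthetic 200x200 grayscale scene: a bright square "object" on a dark floor.
scene = np.zeros((200, 200), dtype=np.uint8)
scene[60:140, 60:140] = 200

# Step 1 of the classic approach: find intensity edges.
edges = cv2.Canny(scene, threshold1=50, threshold2=150)

# Step 2: segment the edge map into contours (OpenCV 4 returns contours, hierarchy).
contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# Step 3: summarize each region with a bounding box -- a crude stand-in for the
# hand-built geometric reasoning that early vision systems layered on top.
for c in contours:
    x, y, w, h = cv2.boundingRect(c)
    print(f"object candidate at x={x}, y={y}, size={w}x{h}")
```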
The Complicated Problem of Movement and Balance
A robot's ability to move depends on driving each of its joints to the right position.
- Forward Kinematics: Calculating where the end of a robotic arm ends up from the angle of each joint.
- Inverse Kinematics: Determining the angle each joint must take to place the end of the arm at a specific point in space, which requires working out every rotation needed to reach the goal.
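The distinction is easiest to see on a planar two-link arm. The sketch below, with arbitrarily chosen link lengths, computes forward kinematics from the joint angles and the textbook closed-form inverse kinematics for a reachable target; it is a minimal illustration, not a production solver.

```python
import numpy as np

L1, L2 = 1.0, 0.8  # arbitrary link lengths in meters

def forward_kinematics(theta1, theta2):
    """End-effector (x, y) of a planar 2-link arm, given its joint angles."""
    x = L1 * np.cos(theta1) + L2 * np.cos(theta1 + theta2)
    y = L1 * np.sin(theta1) + L2 * np.sin(theta1 + theta2)
    return x, y

def inverse_kinematics(x, y):
    """Textbook closed-form solution (one of the two elbow branches) for a reachable (x, y)."""
    c2 = (x**2 + y**2 - L1**2 - L2**2) / (2 * L1 * L2)
    theta2 = np.arccos(np.clip(c2, -1.0, 1.0))            # elbow angle
    k1, k2 = L1 + L2 * np.cos(theta2), L2 * np.sin(theta2)
    theta1 = np.arctan2(y, x) - np.arctan2(k2, k1)        # shoulder angle
    return theta1, theta2

t1, t2 = inverse_kinematics(1.2, 0.9)
print(forward_kinematics(t1, t2))  # prints approximately (1.2, 0.9)
```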
Solving the Inverse Kinematics Problem
The core idea behind solving inverse kinematics is decoupling: computing each joint angle individually. Multiple valid solutions often exist, which makes choosing the right one difficult, and robots with many degrees of freedom complicate the calculation further. The key to taming this complexity is to break the problem into smaller, simpler pieces.
- Paden-Kahan Subproblems: In 1985, Bradley Evan Paden showed that certain geometric subproblems recur frequently in robot-arm kinematics. By identifying them and using them as subroutines, common inverse-kinematics problems become easier to solve, and the approach can be applied iteratively. These subproblems include:
  - Single-axis alignment
  - Dual-axis collaboration
  - Planar constraints
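For concreteness, here is a minimal sketch of what is usually called the first subproblem: finding the rotation about a single fixed axis that carries a point onto a target point. The function name and the toy check are illustrative only; the formulation follows the standard geometric recipe with NumPy.

```python
import numpy as np

def subproblem_one(p, q, r, omega):
    """Find the angle theta that rotates point p about the axis through r
    with unit direction omega so that it reaches point q.
    Assumes p and q actually lie on a common circle around that axis."""
    u = p - r
    v = q - r
    # Project both points onto the plane perpendicular to the axis.
    u_p = u - omega * np.dot(omega, u)
    v_p = v - omega * np.dot(omega, v)
    # Signed angle between the two projections, measured around omega.
    return np.arctan2(np.dot(omega, np.cross(u_p, v_p)), np.dot(u_p, v_p))

# Toy check: rotating (1, 0, 0) about the z-axis by 90 degrees reaches (0, 1, 0).
theta = subproblem_one(np.array([1.0, 0.0, 0.0]),
                       np.array([0.0, 1.0, 0.0]),
                       np.zeros(3),
                       np.array([0.0, 0.0, 1.0]))
print(np.degrees(theta))  # ~90.0
```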
From Theory to Practice
While the Paden-Kahan subproblems offer an elegant theoretical solution, real-world applications run into sensor and actuator error, design complexity, and algorithmic limitations. Industrial robots operating in controlled environments have used these algorithms successfully; humanoid robots, by contrast, must adjust dynamically to their surroundings. To do so, they iterate their joint angles toward the desired goal, a computationally intensive approach that requires continual recalculation.
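A rough sketch of that iterative style, on the same toy two-link arm as above: each cycle recomputes the position error and nudges the joint angles along the Jacobian transpose until the end effector is close enough. Real humanoid controllers use far more sophisticated solvers; this only shows why the recalculation never stops.

```python
import numpy as np

L1, L2 = 1.0, 0.8  # same toy arm as before

def fk(theta):
    t1, t2 = theta
    return np.array([L1 * np.cos(t1) + L2 * np.cos(t1 + t2),
                     L1 * np.sin(t1) + L2 * np.sin(t1 + t2)])

def jacobian(theta):
    t1, t2 = theta
    return np.array([[-L1 * np.sin(t1) - L2 * np.sin(t1 + t2), -L2 * np.sin(t1 + t2)],
                     [ L1 * np.cos(t1) + L2 * np.cos(t1 + t2),  L2 * np.cos(t1 + t2)]])

target = np.array([1.2, 0.9])
theta = np.array([0.3, 0.3])            # initial guess for the joint angles
for step in range(500):                 # iterate: recompute error, nudge angles
    error = target - fk(theta)
    if np.linalg.norm(error) < 1e-4:
        break
    theta += 0.5 * jacobian(theta).T @ error   # simple Jacobian-transpose update

print(step, fk(theta))  # reaches the target after at most a few hundred iterations
```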
Adding Dynamics
Calculating positions, directions, and angles is necessary for movement, but it says nothing about velocity, momentum, or acceleration. The force needed to support the robot, the speed required to keep it from falling, and the power behind each movement must also be computed.
- Dynamic Calculation: Movement calculations that also account for friction, inertia, gravity, and mass.
- Balance: Robots must continually calculate their center of gravity to remain stable.
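As a heavily simplified illustration of that bookkeeping, the sketch below lumps a robot into a few made-up point masses, computes the combined center of mass, and checks whether its ground projection stays inside the foot's support region. Real controllers track dynamic quantities such as the zero-moment point, which this static toy check ignores.

```python
import numpy as np

# Toy model: a few point masses (kg) and their (x, z) positions in meters,
# with x measured horizontally along the direction the robot might tip.
masses = np.array([12.0, 20.0, 8.0])       # legs, torso, arms (made-up values)
positions = np.array([[0.02, 0.45],        # legs
                      [0.05, 1.00],        # torso
                      [0.10, 1.20]])       # arms reaching forward

# Combined center of mass = mass-weighted average of the positions.
com = (masses[:, None] * positions).sum(axis=0) / masses.sum()

# Static balance check: the center of mass's ground projection (its x coordinate)
# must stay inside the foot's support interval.
support_min, support_max = -0.05, 0.12     # back/front edge of the foot (m)
balanced = support_min <= com[0] <= support_max
print(f"CoM at x = {com[0]:.3f} m, statically balanced: {balanced}")
```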
Diverging Paths: AI vs. Robotics
While AI and robotics were initially intertwined, their paths diverged. AI has embraced machine learning and neural networks. Robotics has remained stuck in the mud.
- Robotics as Pre-Programmed Instruction: Real-world robots lack the sophistication and intuition seen in science fiction. These robots perform actions through pre-programmed instructions.
A Shift in Perspective: Learning from Experience
In 2004, Pieter Abbeel joined AI researcher Andrew Ng's group at Stanford as a graduate student. Abbeel introduced reinforcement learning into robot motion control.
- Reinforcement Learning: Robots are placed in a simulated world and are rewarded when they perform in the desired way.
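A minimal sketch of that reward loop, using a made-up one-dimensional "walk to the goal" world and tabular Q-learning rather than any real robot simulator; it only shows the basic mechanic of acting, receiving a reward, and updating.

```python
import random

N_STATES, GOAL = 6, 5            # positions 0..5 on a line; reaching 5 earns reward
ACTIONS = [-1, +1]               # step left or step right
Q = [[0.0, 0.0] for _ in range(N_STATES)]    # learned value of each action in each state

alpha, gamma, epsilon = 0.5, 0.9, 0.1        # learning rate, discount, exploration rate

for episode in range(500):
    state = 0
    while state != GOAL:
        # Mostly act greedily; explore randomly now and then (and when values are tied).
        if random.random() < epsilon or Q[state][0] == Q[state][1]:
            a = random.randrange(2)
        else:
            a = 0 if Q[state][0] > Q[state][1] else 1
        next_state = min(max(state + ACTIONS[a], 0), N_STATES - 1)
        reward = 1.0 if next_state == GOAL else 0.0   # rewarded only for the desired behavior
        # Q-learning update: nudge the estimate toward reward + discounted future value.
        Q[state][a] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][a])
        state = next_state

print("learned policy:", ["right" if q[1] >= q[0] else "left" for q in Q[:GOAL]])
```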
Abbeel used reinforcement learning to teach a robot arm to fold towels. AI then exploded after Alex Krizhevsky's AlexNet won the ImageNet competition in 2012, while robotics remained stagnant.
The Promise of the Future
In 2016, OpenAI released OpenAI Gym, an open-source reinforcement learning toolkit. Soon after, a graduate of Shanghai University founded a company called YuShu Technology (Unitree Robotics). This new wave of technology could lead to breakthroughs in the field of robotics.
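For a sense of what using Gym looks like, the sketch below runs a random policy on the classic CartPole balancing task. It uses the original env.reset()/env.step() interface from the 2016-era releases; newer Gym versions and the Gymnasium fork return slightly different tuples.

```python
import gym  # pip install gym

env = gym.make("CartPole-v1")      # classic balancing task bundled with Gym
observation = env.reset()

total_reward, done = 0.0, False
while not done:
    action = env.action_space.sample()               # random policy; an RL agent would choose here
    observation, reward, done, info = env.step(action)
    total_reward += reward                           # the reward signal a learner would maximize

env.close()
print("episode return with a random policy:", total_reward)
```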
Looking Ahead
The humanoid robot industry still faces many challenges and unanswered questions. A future video will discuss these problems.