Humanoid robots, also known as anthropomorphic robots, have a human-like appearance together with human-like perception, decision-making, behavior, and interaction capabilities. Their sensory systems, control systems, and decision-making methods are modeled on humans so that they ultimately exhibit "human-like behavior."
• Humanoid robots involve engineering and control science, integrating research achievements from fields such as electronics, mechanics, automation control, and computer science. Humanoid functions cannot be achieved simply by purchasing and assembling components.
• Humanoid robots are classified by height into large humanoid robots and medium-small humanoid robots.
Research on humanoid robots began in Japan and has now entered a stage of highly dynamic motion. The development history of humanoid robots can be divided into three stages, each marked by a milestone robot:
• First stage: The early development stage, represented by the humanoid robots from Waseda University;
• Second stage: The system integration stage, represented by Honda's humanoid robots;
• Third stage: The highly dynamic motion stage, represented by Boston Dynamics' humanoid robots.
Japan was the first to initiate research on humanoid robots, achieving bipedal walking.
• In 1971, Professor Kato from Waseda University introduced the hydraulic system-based bipedal robots WL-3 and WL-5, achieving a walking stride of 15 cm and a cycle of 45 seconds for static walking.
• The subsequently designed motor-driven WL-9R and WL-10DR achieved dynamic walking through ankle joint torque control, shortening the single-step cycle to 1.3 seconds.
• In 2006, Professor Atsuo Takanishi from Kato's lab introduced the humanoid robot WABIAN-2R (with 41 degrees of freedom), which achieved a walking speed of 1.8 km/h and adapted to various ground conditions.
HONDA's Asimo represented the most advanced technology level at the time.
• In 1996, Japan's HONDA developed the first humanoid robot P1, followed by P2, which could walk on ordinary roads, and later P3.
• On November 12, 2000, the most representative motor-controlled bipedal robot Asimo was released, standing 120 cm tall, weighing 52 kg, and walking at a speed of 0 to 1.6 km/h.
• The third-generation ASIMO robot was released in 2011, with a walking speed of up to 9 km/h, capable of climbing stairs, kicking a soccer ball with one leg, and jumping on one leg, with a walking stride that can be continuously adjusted, achieving 57 degrees of freedom, making it suitable for service robot applications in fixed environments.
Cassie embodies a new drive design, enriching the drive technology route.
• In 1997, researchers including Grizzle from the University of Michigan developed the underactuated bipedal robot RABBIT, which can achieve dynamic walking without feet.
• Building on RABBIT, a series of underactuated walking robots including MABEL, MARLO, and ATRIAS were developed, achieving three-dimensional underactuated walking.
• In 2017, the robot Cassie was released, priced at about $70,000, with its drive motors positioned high and springs added to the legs, achieving efficient gait while being able to stand still.
• In 2022, Digit was launched based on Cassie, featuring robust walking and running gaits, with the ability to climb stairs and autonomous navigation capabilities, applicable for package handling.
Source: CNKI, Zhejiang University. Figure: the underactuated bipedal robot RABBIT, developed in 1997 by Grizzle and others from the University of Michigan.
The HRP series robots can achieve stable walking and collaborate with humans.
• In 1998, the National Institute of Advanced Industrial Science and Technology in Japan began leading the HRP series project, aimed at developing "humanoid robot systems that can coordinate and coexist with humans in human work and living environments, capable of completing complex tasks."
• HRP-2 and HRP-3 can walk stably and perform various dexterous movements (such as Japanese dance), collaborate with humans to lift objects, overcome obstacles, pick up objects from the ground, protect themselves when falling, and stand up again.
Source: Company website, Zhejiang University, CITIC Construction Investment. Figure: HRP series bipedal robots launched by Japan's National Institute of Advanced Industrial Science and Technology.
Atlas uses a self-designed hydraulic drive system, with the world's leading mobility capabilities.
• Boston Dynamics developed the hydraulic-driven quadruped robot BigDog prototype with funding from the Defense Advanced Research Projects Agency (DARPA).
• In October 2009, Boston Dynamics released PETMAN, a humanoid robot designed for testing experimental protective clothing for the U.S. military. It features strong self-balancing capabilities and motion performance, and can adjust its gait and maintain balance in response to external disturbances.
• Since its release in 2013, Atlas has undergone three major iterations, with its all-terrain adaptability representing the current highest level.
IIT launched WALK-MAN, influential in Europe.
• IIT launched the WALK-MAN firefighting robot, incorporating force control to form a torque-controlled hand joint, sacrificing some rigidity of the robot.
• In 2008, IIT manufactured the open-source humanoid robot iCub for research on perceptual learning and human-robot interaction, featuring excellent human-robot interaction capabilities. It is designed to resemble the size of a three-and-a-half-year-old child, standing 1 meter tall, with 53 degrees of freedom, capable of walking and balancing on one leg.
• In 2012, the bipedal robot COMAN was developed, with SEA drives used for all joints in the forward plane.
Swiss research institutions applied passive flexibility to further enhance jumping and terrain adaptability.
• In 2011, the Robot Systems Laboratory at the Institute of Robotics and Intelligent Systems at ETH Zurich developed the single-leg robot ScarlETH based on SEA joints, utilizing the robot's passive flexibility to achieve high-energy-efficient jumping and terrain adaptability.
• Building on this, the motor-driven quadruped robots StarlETH and ANYmal were developed.
HUBO won first place in the DRC competition, promoting research and development in Asia.
• The bipedal robot HUBO from KAIST (Korea Advanced Institute of Science and Technology) won first place in the 2015 DRC competition with its hybrid movement method of wheels and feet.
• With the help of Rainbow Robotics, HUBO2 became the world's first commercial humanoid robot platform. It was purchased by leading research institutions (MIT, Google, etc.) as a research platform.
• The "HUBO2" robot walks at a speed of 1.4 km/h with straight knees and can run at a speed of 3.6 km/h.
The University of Tokyo launched a new version of Schaft, reducing costs and energy consumption.
• In 2013, the humanoid robot team Schaft, acquired by Google, won the championship in the DRC 2013 competition. It stands 1480 mm tall, weighs 95 kg, and has functions such as walking and climbing stairs.
• In 2016, a new low-cost, low-energy humanoid robot was released, capable of carrying 66 kg.
Small humanoid robot development is in full swing, enriching and expanding application scenarios.
• France's Aldebaran Robotics launched its representative NAO robot, with sales exceeding 10,000 units. The company has consistently pursued a commercialization path, which sets it apart from Boston Dynamics and Honda's Asimo, and was later acquired by Japan's SoftBank. The Pepper and Romeo robots were launched subsequently.
• Among small bipedal robots under 50 cm in height, the Darwin-OP robot from South Korea's Robotis is quite famous for its stable walking and color recognition.
• South Korea's Hitec launched Robonova-1, while China's Leju (Shenzhen) Robotics launched the "Aelos" robot.
Research on humanoid robots in China started relatively late and has been led mainly by universities and research institutions.
• Tsinghua University, Zhejiang University, Shanghai Jiao Tong University, Beijing Institute of Technology, and the Chinese Academy of Sciences have successively conducted research on humanoid robots.
• The National University of Defense Technology started early, developing the "Pioneer" in 2000 and the Blackman in 2003, which stands 1.55 m tall, weighs 63.5 kg, has 36 degrees of freedom, and can achieve a maximum walking speed of 1 km/h, with in-depth research on robot turning and walking on uneven surfaces.
• In 2002, Tsinghua developed the THBIP-I robot, which is 1.7 m tall and weighs 130 kg, capable of stable walking and climbing stairs.
• In 2002, Beijing Institute of Technology launched the "BHR-1," achieving independent walking without external cables for the first time; in 2005, BHR-2 broke through technologies for stable walking and complex motion planning.
An average adult typically has 206 bones and nearly 230 joints, forming 244 degrees of freedom controlled by 630 muscles.
• If the human body is to be accurately modeled, the work will be extremely complex. Hanavan proposed simplifying the human body model, usually dividing it into 15 parts corresponding to the head, chest, upper arms, forearms, hands, thighs, calves, and feet.
• Humanoid robots are highly flexible, strongly nonlinear dynamic systems. Their dynamics and kinematics are typically analyzed by combining multi-body dynamics modeling with numerical simulation.
• In robot motion analysis, both dynamics analysis and kinematics analysis are included, with kinematics divided into forward kinematics and inverse kinematics.
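The split between forward and inverse kinematics can be illustrated with a minimal sketch. It assumes a planar two-link leg (thigh and calf) with rotary hip and knee joints; the link lengths, the function names, and the choice of the non-negative knee-angle solution are illustrative assumptions rather than the model of any specific robot.

```python
import math


def forward_kinematics(l1, l2, q1, q2):
    """Forward kinematics: joint angles (rad) -> end point (x, y) of a planar two-link chain."""
    x = l1 * math.cos(q1) + l2 * math.cos(q1 + q2)
    y = l1 * math.sin(q1) + l2 * math.sin(q1 + q2)
    return x, y


def inverse_kinematics(l1, l2, x, y):
    """Inverse kinematics: end point (x, y) -> one joint solution with q2 in [0, pi].

    Raises ValueError when the target lies outside the reachable workspace.
    """
    # Law of cosines gives the cosine of the second joint angle.
    c2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if abs(c2) > 1.0 + 1e-9:
        raise ValueError("target is outside the reachable workspace")
    c2 = max(-1.0, min(1.0, c2))  # clamp small numerical overshoot
    q2 = math.acos(c2)
    q1 = math.atan2(y, x) - math.atan2(l2 * math.sin(q2), l1 + l2 * math.cos(q2))
    return q1, q2


# Round trip with hypothetical 0.4 m links: angles -> position -> angles.
x, y = forward_kinematics(0.4, 0.4, 0.3, 0.8)
q1, q2 = inverse_kinematics(0.4, 0.4, x, y)
```

Forward kinematics has a single answer, while inverse kinematics generally has several (here the mirrored knee configuration is the other one), which is one reason the inverse problem is treated separately.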
Humanoid robots not only possess some human-like shapes, such as upper limbs and heads, but should also have human-like lower limb structures and bipedal walking capabilities.
• In the design process of bionic mechanisms, the degrees of freedom composition is determined based on target specifications, and the types and numbers of joints are decided. Structurally, they are usually composed of multiple single-degree-of-freedom rotary joints.
• Sensors are typically used to simulate human perception of the environment, such as machine vision, pressure sensors, touch sensors, directional microphones, sonar rangefinders, etc.
• The NAO robot has a total of 25 drive motors, 2 cameras, 9 touch sensors, 4 directional microphones, 8 pressure sensors, 2 sets of infrared receivers and generators, and sonar rangefinders.
Joint drive route one: Hydraulic drive delivers high force and strong burst power.
• Advantages: High output power, no need for a reducer, high force, strong burst power, and high capacity to withstand mechanical shocks and damage.
• Disadvantages: Hydraulic systems are prone to oil leaks, large in size, noisy, and have high power consumption, requiring a hydraulic source.
Joint drive route two: Motor drive is the most traditional, with a simple structure and wide application.
• Advantages: Simple structure, precise position servo.
• Disadvantages: Poor torque servo, high transmission loss, and less burst power than hydraulic drives.
Joint drive route two: Motor drive + compliant elastic elements enables cyclic energy storage and release.
• Advantages: High torque precision, passive flexibility, capable of energy storage cycles.
• Disadvantages: Poor position servo, limited response bandwidth.
Joint drive route two: Direct motor drive scheme achieves high position accuracy and fast response.
• Advantages: High torque precision, high position accuracy, fast response.
• Disadvantages: Motors need to be customized, and motor size is large.
Joint drive route three: Pneumatic drive is lightweight and low-cost, but control precision is not high.
• Advantages: Pneumatic artificial muscles are lightweight, low-cost, easy to maintain, and have a larger power-to-volume ratio and power-to-weight ratio compared to cylinders.
• Disadvantages: Control precision is not high, work efficiency is low, and work speed stability is poor.
Each of the three driving methods has its characteristics; motor drive is the most traditional, while hydraulic drive is the most expensive.
• Motor drive is the most traditional of the three, is advancing rapidly, and is widely applied worldwide. Hydraulic drive provides the best robot motion performance but is technically challenging, with hydraulic valves that are difficult to build and extremely high system costs. Pneumatic drive falls between hydraulic drive and direct motor drive in performance and is currently applied relatively little.
Balance control directly affects walking performance, and companies usually develop core control algorithms independently.
• The core issues of robot state estimation include: sensor selection and layout, sensor data calibration, modeling of the robot body, and multi-sensor data fusion.
• In the design choices of controllers, control strategies are usually selected based on the robot's own state and model, followed by executing control commands. The design of the controller is the most critical part of robot design.
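As a hedged illustration of multi-sensor data fusion in state estimation, the sketch below uses a complementary filter to estimate torso pitch from a gyroscope and an accelerometer. This is a common textbook technique chosen for illustration, not the method of any company mentioned here; the axis conventions, the blending weight `alpha`, and the function name are assumptions.

```python
import math


def complementary_filter(pitch_prev, gyro_rate, accel_x, accel_z, dt, alpha=0.98):
    """Blend two pitch estimates (rad) into one.

    pitch_prev : previous pitch estimate (rad)
    gyro_rate  : measured pitch rate (rad/s)
    accel_x/z  : accelerometer readings along the assumed forward and up axes
    dt         : time step (s)
    alpha      : weight given to the gyro path (assumed value)
    """
    # Integrating the gyro is smooth in the short term but drifts over time.
    pitch_gyro = pitch_prev + gyro_rate * dt
    # The gravity direction from the accelerometer is noisy but does not drift.
    pitch_accel = math.atan2(accel_x, accel_z)
    return alpha * pitch_gyro + (1.0 - alpha) * pitch_accel


# Stationary robot tilted by 0.1 rad: the estimate converges toward 0.1.
g = 9.81
pitch = 0.0
for _ in range(500):
    pitch = complementary_filter(pitch, 0.0, g * math.sin(0.1), g * math.cos(0.1), 0.01)
```

A real estimator would also need the calibration and body-model steps listed above; this sketch covers only the fusion step.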
To achieve good human-robot interaction performance, algorithms, AI technologies, and sensors are essential.
• In motion planning and interaction design based on environmental perception, a good understanding and cognition of the environment are required, calculating feasible areas, reasonably selecting contact points (such as bipedal, dual-hand, or using hands and feet), as well as selecting step lengths and optimizing models.
Humanoid robot batteries: Estimating the basic parameters of battery packs from limited performance indicators.
• Boston Dynamics' Atlas robot has a maximum power of 5 kW and an overall weight of 80 kg. We estimate that its onboard 48 V lithium-ion battery pack weighs 5-10 kg, with a mass energy density of 200-250 Wh/kg and a volume energy density of about 500 Wh/L. On that basis, the pack has a discharge rate of 2C-5C, a volume of 2-5 L, a mass power density of 0.5-1 kW/kg, and a volume power density of 1-2.5 kW/L.
• Based on the performance ranges of mass energy density and mass power density, we estimate that the battery pack used by Atlas is similar to high-performance power battery packs.
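The arithmetic behind the Atlas estimate can be reproduced with a short sketch. The function name and the pairing of the endpoints (5 kg with 200 Wh/kg, 10 kg with 250 Wh/kg) are assumptions made here to recover the quoted ranges; the input figures themselves come from the estimate above.

```python
def estimate_pack(peak_power_kw, mass_kg, wh_per_kg, wh_per_l):
    """Derive basic battery pack parameters from a few performance indicators."""
    energy_kwh = mass_kg * wh_per_kg / 1000.0   # stored energy
    volume_l = energy_kwh * 1000.0 / wh_per_l   # pack volume
    return {
        "energy_kwh": energy_kwh,
        "volume_l": volume_l,
        "c_rate": peak_power_kw / energy_kwh,   # peak discharge rate
        "kw_per_kg": peak_power_kw / mass_kg,   # mass power density
        "kw_per_l": peak_power_kw / volume_l,   # volume power density
    }


# Lower and upper ends of the estimated ranges for a 5 kW peak load.
light_pack = estimate_pack(5.0, 5.0, 200.0, 500.0)
heavy_pack = estimate_pack(5.0, 10.0, 250.0, 500.0)
```

Under these assumptions the light pack comes out at 1 kWh, 2 L, 5C, 1 kW/kg, and 2.5 kW/L, and the heavy pack at 2.5 kWh, 5 L, 2C, 0.5 kW/kg, and 1 kW/L, matching the ranges quoted for Atlas.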
Latest developments in power batteries: CTP3.0 "Qilin Battery" is on the horizon.
• According to CATL's official website, the "Qilin Battery" using CTP3.0 technology can achieve a mass energy density of 255 Wh/kg (ternary) or 160 Wh/kg (lithium iron phosphate), a volume utilization rate of 72%, 4C fast charging, a 5-minute hot start, and no thermal propagation, among other performance indicators.
Looking ahead: What are the material demand directions for humanoid robot batteries?
• It can be seen that humanoid robots do not have high requirements for discharge rates and cycle life, but they have high requirements for mass and volume energy density, and there are potential requirements for fast charging capabilities.
• Therefore, batteries and battery materials with high energy density, preferably with fast charging capabilities, are the demand direction for humanoid robot batteries.
• High nickel/mid-nickel high-voltage ternary cathodes belonging to layered oxide cathodes are currently the preferred choice, and lithium-rich manganese-based cathodes may also occupy a place in the future.
• Lithium supplementation in lithium battery material systems involves introducing high lithium content substances into the battery material system, allowing these high lithium content substances to effectively release lithium ions and electrons to compensate for the loss of active lithium.
• Whether applied to the anode or the cathode, pre-lithiation can improve the actual energy density of the battery: lithium is still consumed, but the supplemented lithium compensates for the loss, so active material capacity is no longer left unused.
If solid electrolytes can be made lightweight, thin, strong, and highly stable, it will significantly enhance battery energy density.
• Humanoid robot batteries have relatively low requirements for cycle life but may have high safety requirements, making them a potential high-quality application scenario for high-energy-density solid-state batteries.
Components and Materials Exclusive to Humanoid Robots
High-burst-power motors, high-performance chips, precision reducers, high-precision sensors, long-lasting batteries, and other core components will build a more stable and high-performance hardware system for humanoid robots.
Artificial intelligence empowers humanoid robot design.
AI for Design of Humanoid Robots
Based on artificial intelligence technologies such as neural networks, graph grammar, and evolutionary algorithms, humanoid robot modules such as legs, arms, and trunks can be automatically constructed according to scene and task requirements, achieving coordinated optimization of form and control.
Motion Intelligence of Humanoid Robots
• Walking on Complex Terrains: Humanoid robots are expected to adapt to complex terrains and narrow environments built for humans, such as slopes, steps, and thresholds, achieving stable, adaptive, and anti-interference walking.
• Cooperative Operation of Dual-arm: In the case of unstable lower body, humanoid robots are expected to complete high-performance operation tasks with collaborative dual arms using human tools and equipment.
• Compensation for Hardware with Software: When the hardware performance of humanoid robots is subpar and the sensory information is lacking, this technology systematically seeks and fully utilizes environmental and information constraints to compensate for the performance of hardware, achieving high-level task execution.
Multimodal Large Model for Humanoid Robots
• By integrating multimodal information such as voice, images, text, sensor signals, and 3D point clouds, humanoid robots will have stronger multimodal understanding, generation, and association capabilities for cognitive and decision-making planning, enhancing their generalization ability in complex scene tasks.
Large-Scale Dataset for Humanoid Robots
• Based on simulation synthesis or data collection from physical robots, large-scale, standardized humanoid robot datasets will be constructed, benefiting the design, simulation training, and algorithm transfer capabilities of humanoid robots.
Embodied Intelligence for Humanoid Robots
• Embodied intelligence is a high-quality, high-performance intelligent system capable of making rapid and precise responses under high variability; it is neither a simple computer simulation in a virtual environment nor a purely physical electromechanical system, but is closely related to humanoid robot systems.
Humanoid Robots Inspired by Human Anatomy and Neural Mechanisms
• Unlike most existing methods in humanoid robot research that simulate human functions from the outside in, this approach simulates the human musculoskeletal system and neural mechanisms from the inside out, exploring the essential mechanisms by which humans achieve high dexterity, high compliance, and high intelligent behavior. As a new avenue for humanoid robot research, it is expected to build a more efficient and stable system closer to humans.
Open Source Community for Humanoid Robots
• This community will gather experts and scholars in the field of humanoid robots globally, promoting technical discussions, information exchange, and multi-party cooperation, facilitating deep integration and collaborative development of the upstream and downstream of the industrial chain.
‘Manufactory’ of Humanoid Robots
• This will connect software environments for body design, control, and intelligent algorithm development, built on analytical technologies and large models, so that high-quality, intelligent humanoid robot systems can be rapidly designed and manufactured to order according to performance requirements. Software-hardware consistency will support verification of the hardware system and the development of new components.
Applications of Humanoid Robots
Humanoid robots possess versatility and intelligence, seamlessly using human tools, ensuring that their application scenarios continue to expand and deepen, profoundly transforming human production and lifestyle, leading society into a new stage of intelligent development, and bringing disruptive changes to various industries.
In the industrial sector, they will widely participate in dangerous work production processes, greatly improving production efficiency and safety; in special fields, they will become an important force for scientific exploration, disaster relief, and security inspections in extreme environments; in the livelihood sector, they will fully integrate into people's lives, from providing domestic services to participating in medical assistance, becoming an indispensable presence.
The development history of humanoid robots: When dreams come true, commercialization is imminent.
Multimodal large models endow robots with generalization capabilities, and the dawn of embodied intelligence is emerging.
◼ General large models bring revolutionary potential for embodied intelligence. The hardware of humanoid robots determines the flexibility of movement; its components are mostly migrated from applications in other industries, and cost issues can be addressed through large-scale production across the industrial chain. Software algorithms act as the "brain" of the robot, determining the upper limit of its applications, and are the main bottleneck for commercialization. Previously, robots relied on fixed program settings to perform tasks, so it was difficult to obtain algorithms that applied across scenarios, which limited practical applications. In recent years, general large models such as LLM, VLM, and VNM have endowed the robot body with powerful generalization capabilities, allowing robots to adapt to more complex scenarios and allowing non-professionals to use them without programming, which accelerates the commercialization of humanoid robots. "Embodied intelligent" robots no longer mechanically complete single tasks; they can autonomously plan, decide, act, and execute based on the perceived task and environment, with capabilities such as language interaction, intelligent decision-making, autonomous learning, and multimodal perception.
1.3 Tesla leads the way, and tech giants accelerate entry to drive industrial innovation.
◼ Tech giants are accelerating their entry to drive industrial innovation. 1) Tesla: On September 30, 2022, Tesla launched the humanoid robot prototype Optimus, and in 2023, Musk stated that Tesla's long-term value would come from AI and robotics; 2) OpenAI: In March 2023, OpenAI invested in Norwegian humanoid robot company 1X Technologies; in May 2024, OpenAI announced it had restarted its robotics team two months prior; 3) Samsung: In January 2023, Samsung invested 59 billion KRW in South Korean robot manufacturer Rainbow Robotics; 4) NVIDIA: In May 2023, Jensen Huang stated that the next wave of AI would be embodied intelligence; in February 2024, NVIDIA established a research department for general embodied intelligent agents; in March 2024, NVIDIA released the humanoid robot large model Project GR00T; in June 2024, Huang emphasized that "the next wave of AI is physical AI, and the era of robots has arrived"; 5) Figure AI: Established in 2022, Figure AI received a total of $675 million in investments from tech companies including NVIDIA, Microsoft, OpenAI, and Intel in February 2024.
1.3 Tesla Optimus progresses beyond expectations, and the industry begins a new round of "arms race."
◼ Tesla Optimus is rapidly iterating, leading a new wave of technological revolution. Musk proposed the humanoid robot concept Tesla Bot at the 2021 AI DAY, then began rapid development and iteration. In February 2022, the development platform was completed, and in October 2022, the prototype Optimus was officially launched at AI DAY, capable of simple actions such as walking, carrying, and watering. In December 2023, Optimus-Gen2 was launched, significantly evolved compared to the first generation, with improved perception, brain, and control capabilities. Tesla's humanoid robot can form a complete industrial closed loop, and commercialization is worth looking forward to: Optimus reuses autonomous driving-related technologies, rapidly evolving from a concept machine to an intelligent and flexible robot. The production and sales of Tesla cars also provide preliminary scenarios for the commercialization of humanoid robots, and the advantages of the industrial chain offer possibilities for cost reduction, with a long-term mass production price target of $20,000 per unit.
Humanoid robots will first land in factories and will be applied in commercial services and family companionship in the future.
◼ Humanoid robots will gradually move from factories to homes, transitioning from B2B to B2C. From the strategic planning of mainstream robot manufacturers, humanoid robots will first be applied in the industrial manufacturing sector, and after accumulating maturity, will expand into commercial services, family companionship, and other scenarios. This is mainly because factory manufacturing scenarios are relatively simple, and the demand for machines to replace humans is more urgent, while commercial and family scenarios are complex, requiring high software and hardware standards for humanoid robots.
◼ The "Guiding Opinions on the Innovative Development of Humanoid Robots" points out three major demonstration scenarios: special services, manufacturing, and people's livelihood, outlining a deep integration with the real economy by 2027. The application of humanoid robots in China will proceed in two steps: the first phase aims for initial applications in special services, manufacturing, and people's livelihood by 2025; the second phase aims for accelerated large-scale development of the industry by 2027, with richer application scenarios and related products deeply integrated into the real economy, becoming an important new engine for economic growth, and humanoid robots are expected to deeply integrate into daily life.
Disassembly of Tesla's humanoid robot: 14 rotary joints + 14 linear joints + 12 hand joints
Disassembly of Tesla's humanoid robot: rotary joints, linear joints, and hand joints.
◼ Rotary joints: Mainly composed of "actuator + torque sensor + encoder + frameless torque motor + harmonic reducer + bearing + mechanical clutch," similar to collaborative robot joint modules. Data is transmitted to the actuator through input sensors, controlling the motor, and the harmonic reducer amplifies the output torque, with output sensors providing position feedback and optimizing algorithms.
◼ Linear joints: Mainly composed of "actuator + torque sensor + encoder + frameless torque motor + screw + bearing," where the actuator drives the frameless torque motor to rotate, converting rotational motion into linear motion via the screw.
◼ Hand joints: Mainly composed of "actuator + encoder + sensor + hollow cup motor + planetary gearbox + worm gear," featuring adaptive capabilities and non-reversible drive capabilities, capable of bearing 20 pounds, using tools, and accurately grasping parts.
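The conversion of rotational motion into linear motion in a linear joint can be sketched with the standard ideal screw relations: thrust F = 2π·η·T / lead and linear speed v = (rpm / 60)·lead. The torque, lead, efficiency, and speed values below are hypothetical illustration values, not Tesla specifications, and the function names are assumptions.

```python
import math


def screw_thrust_n(motor_torque_nm, lead_m, efficiency=0.9):
    """Axial force (N) produced by a screw driven with the given torque.

    lead_m is the axial travel per screw revolution; efficiency lumps
    friction losses into a single assumed factor.
    """
    return 2.0 * math.pi * efficiency * motor_torque_nm / lead_m


def screw_speed_m_s(motor_rpm, lead_m):
    """Linear speed (m/s) of the nut for a given screw rotation speed."""
    return motor_rpm / 60.0 * lead_m


# Hypothetical joint: 2 N·m motor torque, 2 mm lead, 80 % efficiency, 3000 rpm.
thrust = screw_thrust_n(2.0, 0.002, efficiency=0.8)   # about 5.0 kN
speed = screw_speed_m_s(3000.0, 0.002)                # 0.1 m/s
```

The two relations show the trade-off a designer faces: a smaller lead raises thrust for the same motor torque but lowers linear speed in the same proportion.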
Estimated single-unit cost of humanoid robots and potential supplier overview (taking Tesla Bot and related domestic components as examples)
Estimated cost distribution of various links/components of Tesla's humanoid robot (based on domestic component prices)
Frameless torque motors: High efficiency, compact structure, easy maintenance, used for humanoid robot linear and rotary joints.
◼ A frameless torque motor is a special type of permanent magnet brushless synchronous motor. It has no shaft, bearings, casing, feedback device, or end caps, and consists only of a stator and a rotor. The rotor is a rotating steel ring assembly carrying permanent magnets, mounted directly on the machine shaft. The stator is the outer component, made of toothed steel laminations and copper windings that generate the electromagnetic force, and is fitted tightly into the machine casing.
◼ Frameless torque motors have the advantages of high efficiency, compact structure, and easy maintenance. 1) High efficiency: Integrating the motor directly into the rotating shaft assembly reduces overall system inertia, which lowers the torque required for acceleration and deceleration, improves motion control and settling time, increases system bandwidth, and enhances machine efficiency; 2) Compact structure: Higher torque density reduces footprint and weight; 3) Easy maintenance: There are fewer mechanical components and no wear-prone parts that need servicing.
Precision reducers include RV reducers, harmonic reducers, and planetary reducers. Reducers are transmission components made up of multiple gears, using gear meshing to change motor speed, torque, and load capacity, and can also achieve precise control. There are many types and models of reducers, which can be divided into general transmission reducers and precision reducers based on control precision. General transmission reducers have low control precision and meet the basic power transmission needs of mechanical equipment. Precision reducers have small backlash, high precision, and long service life, and are more reliable and stable; they are applied in high-end fields such as robots and CNC machine tools, and specifically include RV reducers, harmonic reducers, and planetary reducers.
◼ Humanoid robot rotary joints will use harmonic reducers, while hand joints or some low-precision body joints may use planetary reducers. RV reducers are larger in size and have limited applications in humanoid robots. Harmonic reducers are small, have a large reduction ratio, and high precision, and will be used for humanoid robot body rotary joints; planetary reducers are small, lightweight, have high transmission efficiency, long lifespan, but lower precision than harmonic reducers, and will be used for humanoid robot hand joints or body joints with lower precision requirements.
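How a reducer trades speed for torque can be shown with a minimal sketch. It assumes an ideal reducer with a single lumped efficiency, and a harmonic reducer in the common configuration where the circular spline is fixed and the flexspline is the output, so the ratio follows from the tooth counts. All tooth counts, speeds, torques, and the efficiency value are hypothetical.

```python
def harmonic_ratio(flexspline_teeth, circular_spline_teeth):
    """Reduction ratio of a harmonic reducer (circular spline fixed, flexspline output)."""
    return flexspline_teeth / (circular_spline_teeth - flexspline_teeth)


def reducer_output(motor_speed_rpm, motor_torque_nm, ratio, efficiency=0.8):
    """Output speed (rpm) and torque (N·m) after a reducer with the given ratio."""
    output_speed_rpm = motor_speed_rpm / ratio
    output_torque_nm = motor_torque_nm * ratio * efficiency
    return output_speed_rpm, output_torque_nm


# Hypothetical joint: 200-tooth flexspline, 202-tooth circular spline.
ratio = harmonic_ratio(200, 202)                             # 100:1
speed_rpm, torque_nm = reducer_output(3000.0, 0.5, ratio)    # about 30 rpm, 40 N·m
```

A two-tooth difference yields a 100:1 ratio in a single stage, which illustrates why harmonic reducers combine small size with a large reduction ratio.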
Tesla Optimus has 14 linear actuators of three types distributed in the arms and legs, with output/weight ratios of 500N/0.36kg, 3900N/0.93kg, and 8000N/2.20kg. They are located in the upper arms (2×1), forearms (2×2), thighs (2×2), and calves (2×2).
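The actuator figures can be checked with a few lines of arithmetic. The sketch assumes the parenthesised figures mean "limbs × actuators per limb" (2×1, 2×2, 2×2, 2×2); the labels `type_a` to `type_c` are hypothetical names for the three actuator types.

```python
# (peak force in N, mass in kg) for the three linear actuator types.
actuator_types = {
    "type_a": (500.0, 0.36),
    "type_b": (3900.0, 0.93),
    "type_c": (8000.0, 2.20),
}

# Force delivered per kilogram of actuator mass (N/kg).
force_per_kg = {name: force / mass for name, (force, mass) in actuator_types.items()}

# Assumed reading of the layout: (number of limbs, actuators per limb).
layout = {
    "upper arms": (2, 1),
    "forearms": (2, 2),
    "thighs": (2, 2),
    "calves": (2, 2),
}
total_linear_actuators = sum(limbs * per_limb for limbs, per_limb in layout.values())
```

Under this reading the counts sum to 14, consistent with the stated total, and the force-to-weight figures come out at roughly 1,389, 4,194, and 3,636 N/kg for the three types.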
◼ The cost of screws is currently high, but there is potential for future reduction. Linear actuators consist of "actuator + frameless torque motor + screw + torque sensor + encoder + bearing," where the screw is an important component. According to our estimates, the current cost of screws accounts for about 23.4% of the cost of Tesla's humanoid robot, with the final cost share expected to be 13.9%. In terms of types, screws used in humanoid robots can be divided into trapezoidal screws and roller screws, with trapezoidal screws used for forearms and roller screws used for higher load-bearing requirements in upper arms, thighs, and calves.
Compared to ball screws, roller screws have higher load capacity, longer lifespan, larger speed and acceleration, and smaller lead, making them more suitable for humanoid robots. Screws are transmission accessories that convert rotational motion into linear motion, and can be divided into sliding screws, rolling screws, and hydrostatic screws based on friction characteristics, with rolling screws further divided into ball screws and planetary roller screws. The distinction lies in that the load transfer unit of planetary roller screws is a threaded roller, which is a typical line contact; while the load transfer unit of ball screws is a ball, which is a point contact. Compared to ball screws, planetary roller screws have more contact points, thus being able to withstand higher static and dynamic loads, with static loads three times that of ball screws and lifespans fifteen times that of ball screws; they also have stronger rigidity and impact resistance, allowing for greater speed and acceleration; and a wider range of pitch designs, with smaller leads.
Screws: Standard roller screws are suitable for high load, high-speed scenarios and are widely used.
Planetary roller screws can be divided into five categories based on their structural composition and the relative motion relationships of components: standard, reverse, circulating, bearing ring, and differential types. Standard roller screws are suitable for harsh environments, high loads, and high speeds, primarily applied in precision machine tools, robots, and military equipment, and are currently the main application type.
Screws: High manufacturing precision through cutting processes, including turning, milling, grinding, and other core processes.
The core components of roller screws, including screws, rollers, and nuts, are precision threaded parts with small pitches, and the processing steps are generally consistent. Traditional processing methods can be divided into two main categories: cutting and rolling.
✓ Cutting: Using the center holes at both ends as the machining reference, the part goes through heat treatment, turning, grinding, and other operations, 10 to 20 steps or more in total. Manufacturing precision can reach P1 level, which supports both positioning and transmission functions.
✓ Rolling: Using forming rolling molds to induce plastic deformation in the workpiece to obtain threads, with high automation in the mold opening process, low cost after batch production, high efficiency, but lower manufacturing precision, generally around P7 level, only achieving transmission functions.
The rough processing of roller screws has diverse technical routes, while grinding remains essential in the finishing process. The cutting process for roller screws can be roughly divided into the following steps: rough cutting, preparatory heat treatment (annealing), rough processing, final heat treatment (quenching), finishing, and assembly inspection. Rough processing can follow three process routes, turning, milling, and grinding, used individually or in combination, while finishing is done by grinding. New techniques such as "turning instead of grinding" and "whirlwind milling" can in theory replace grinding and improve processing efficiency, but they are still maturing, so finishing still requires grinding technology and grinding machines.
Dexterous Hands: Hollow Cup Motors/Brushless Synchronous Motors are the core power sources.
Dexterous hand motors mainly use hollow cup motors or brushless synchronous motors with slots. Micro-special motors have characteristics such as small size, high power density, and low noise, making them more suitable for the compact space and load capacity requirements of humanoid robot dexterous hands compared to traditional motors. Hollow cup motors and brushless synchronous motors with slots are currently the mainstream solutions for dexterous hands.
4 Dexterous Hands: Hollow Cup Motors - Core barriers lie in coil design, winding, and equipment.
The three core barriers of hollow cup motors are coil design, coil winding, and automation equipment. The rotor of the brushless hollow cup motor consists of an annular magnetic steel, a rotating shaft, and its fixing components, while the stator is made of annular silicon steel sheets and hollow cup coils bonded together, with the core process being the design and manufacturing of the coils. Common winding methods for hollow cup motors include straight winding, saddle winding, and inclined winding, with winding methods divided into manual winding, semi-automated (winding type), and one-time automated winding. Foreign countries mainly use one-time winding forming production technology, with a high degree of automation, capable of processing wire diameters of 0.08-0.2 mm for motors below 400W; while domestic production mainly uses winding type, relying on manual labor, with low production efficiency and limited wire diameter, and one-time forming winding equipment needs breakthroughs.
◼ The hollow cup market is steadily growing, and humanoid robots open new spaces. Hollow cup motors are mainly applied in high-precision, high-speed response, and compact efficient scenarios, such as aerospace, instrumentation, industrial robots, and medical fields. According to QYResearch data, the global hollow cup motor market size was approximately $810 million in 2023, expected to grow to $1.19 billion by 2028, with a CAGR of 8% from 2023 to 2028. According to MarketResearch data, in 2021, the market size of hollow cup motors in China and Europe accounted for 34.8% and 25.85%, respectively.
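The quoted growth rate can be checked against the two market-size figures. This is a small sketch of the compound annual growth rate formula, using only the numbers cited above.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Market size in millions of USD: 2023 actual and 2028 forecast, as cited.
rate = cagr(start_value=810.0, end_value=1190.0, years=5)
print(f"Implied CAGR 2023-2028: {rate:.1%}")
```

The implied rate rounds to 8.0%, consistent with the CAGR stated in the text.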
Sensors are the medium through which robots perceive the world and can be divided into internal and external sensors. Sensors convert the physical quantities perceived by robots regarding internal and external environments into electrical outputs. Depending on the detection objects, they can be divided into internal sensors and external sensors. Internal sensors are used to measure the robot's own state, such as position, speed, and acceleration; external sensors are used to measure the external environment related to the robot's operations, such as vision, hearing, touch, and smell.
Chart: Robot sensor schematic.
Sensor classification and main functions:
Internal sensors
- Photoelectric encoder: measures motor angle and rotation speed; used for odometry.
- Inertial measurement unit: measures the posture of mobile robots.
- Accelerometer: measures acceleration.
External sensors
- Vision sensors: recognize objects and support navigation and mapping tasks; include cameras, LiDAR, infrared sensors, etc.
- Auditory sensors: receive sound signals to recognize and understand language; include microphones and speakers.
- Tactile sensors: perceive contact force and contact area between the robot and external objects; include force sensors and pressure sensors.
- Olfactory sensors: sense odor information in the surrounding environment; used for environmental monitoring, hygiene inspections, etc.
Chart: Robot sensors can be divided into internal and external sensors.
Torque sensors are important components for robotic arms to perceive force. Torque sensors, also known as torque transducers, can detect torsional forces on various rotating or non-rotating mechanical components, converting physical changes in torque into precise electrical signals, with advantages such as high precision, fast frequency response, good reliability, and long lifespan. Torque sensors are one of the key components of robotic arms, providing real-time force and torque information to assist robotic arms in completing precise and intelligent operational tasks.
◼ In humanoid robots, six-dimensional torque sensors are mainly used in wrists and ankles where compliance control is required. Torque sensors are classified by the number of dimensions they measure; one-dimensional, three-dimensional, and six-dimensional sensors are the most common. Six-dimensional force/torque sensors accurately measure the three force components along the X, Y, and Z axes and the three torque components Mx, My, and Mz about those axes. In humanoid robots, six-dimensional torque sensors may be used in wrists and ankles where compliance control is required, while other body joints will use one-dimensional joint torque sensors.
Source: ATI, Tesla AI Day, Kunwei Technology, AVIC Securities Research Institute.
4.1 Force
The development and production of six-dimensional force/torque sensors are challenging, but cost reduction is expected to continue after scaling. Compared to one-dimensional force sensors, multi-dimensional force/torque sensors must keep the response to each measured force component monotonic and consistent. They must also handle inter-dimensional interference (crosstalk) caused by structural machining and process errors, dynamic and static calibration, and the decoupling algorithms and circuit implementations needed for vector calculation. These demands place high requirements on equipment and materials, so research and manufacturing are far more difficult than for one-dimensional force sensors. The main raw materials for strain-type force sensors include metals, chips, and strain gauges. For example, in 2023, direct materials accounted for 74% of the cost of Keli Sensors' main products. Six-dimensional force/torque sensors require several times as many strain gauges as one-dimensional force sensors and are much harder to produce, so they cost far more than one-dimensional torque sensors. According to Baidu's procurement data, the unit price of the ATI FC-NANO17 six-dimensional force/torque sensor is 20,000 yuan. We believe that with the improvement of domestic strain gauge and related industrial chain research and production capabilities, as well as the opening of downstream demand, there is significant room for cost reduction of six-dimensional force/torque sensors.
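One common way to handle inter-dimensional interference is linear static decoupling: a calibration matrix maps the raw channel signals to the six force and torque components. The sketch below illustrates that idea only; the matrix values and signal readings are invented for illustration and do not come from any real sensor.

```python
AXES = ("Fx", "Fy", "Fz", "Mx", "My", "Mz")


def decouple(calibration, signals):
    """Apply wrench = C * s for a 6xN calibration matrix C and N raw signals s."""
    if len(calibration) != len(AXES):
        raise ValueError("calibration matrix must have six rows")
    if any(len(row) != len(signals) for row in calibration):
        raise ValueError("each calibration row must match the number of signals")
    return {
        axis: sum(c * s for c, s in zip(row, signals))
        for axis, row in zip(AXES, calibration)
    }


# Hypothetical calibration: gain 2.0 on each channel, plus one off-diagonal
# term that removes the leakage of the Fz channel into the Fx reading.
C = [[2.0 if i == j else 0.0 for j in range(6)] for i in range(6)]
C[0][2] = -0.1

raw = [1.0, 0.0, 10.0, 0.0, 0.0, 0.0]  # hypothetical raw channel signals
wrench = decouple(C, raw)
print(wrench)
```

Without the off-diagonal term, the Fx output would read 2.0 instead of 1.0 under this load; in practice the matrix is typically fitted from calibration loads, which is part of the calibration effort described above.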
Encoders are high-precision sensors used for detecting rotational positions, with a total encoder value of approximately 8,550 yuan per humanoid robot. An encoder is a sensor used for motion control. It uses optical, electromagnetic, capacitive, or inductive principles to detect the mechanical position of an object and its changes, converts this information into electrical signals that can be transmitted and stored, and feeds them back to the motion control devices. Encoders are applied in the rotary joints (14×2), linear joints (14×1), and hand joints (12×1) of Tesla's humanoid robot, with a total value of approximately 8,550 yuan.
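The total encoder value can be turned into an implied average unit price. The sketch below reads the joint counts as 14 rotary joints with 2 encoders each, 14 linear joints with 1 each, and 12 hand joints with 1 each; that reading of the figures is an assumption, and the resulting unit price is a derived average, not a number stated in the source.

```python
# (number of joints, encoders per joint); the split is an assumed reading.
ENCODER_LAYOUT = {
    "rotary joints": (14, 2),
    "linear joints": (14, 1),
    "hand joints": (12, 1),
}
TOTAL_VALUE_YUAN = 8550  # total encoder value per robot, as cited

encoder_counts = {name: joints * per for name, (joints, per) in ENCODER_LAYOUT.items()}
total_encoders = sum(encoder_counts.values())
average_unit_price = TOTAL_VALUE_YUAN / total_encoders

print(f"{total_encoders} encoders, about {average_unit_price:.0f} yuan each on average")
```

Under this reading the robot carries 54 encoders at an average of roughly 158 yuan each.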
◼ Encoders can be divided into optical, magnetic, and capacitive types based on their working principles. 1) Optical encoders have high precision, good stability, and strong anti-interference capabilities, suitable for high-precision and high-speed measurements, but are relatively expensive and easily affected by the environment; 2) Magnetic encoders use magnetic code discs instead of grooved optical code discs, making them more durable, resistant to vibration and shock, suitable for measurements in harsh environments, but with relatively lower resolution and precision; 3) Capacitive encoders have high reliability, high precision, and long lifespan, suitable for battery-powered applications.
Tesla's Optimus pure vision solution reuses underlying technologies from autonomous driving, with the core being massive data, self-developed chips, and algorithm training. Tesla's Optimus pure vision solution is equipped with the same FSD computer and Autopilot-related neural network technology as Tesla cars, but the actual application scenarios are more refined than those for cars, requiring more data accumulation and algorithm training. In the progress video of the humanoid robot released by Tesla in September 2023, it was shown that Optimus can accurately determine object positions and eliminate interference using only vision and joint position encoders. The "end-to-end" neural network runs locally, outputting commands directly from visual input images without needing to connect to the internet or manual operation, successfully reusing the logic of autonomous driving in robots. Tesla's pure vision solution can accurately perceive depth, speed, and acceleration information, significantly reducing hardware costs compared to the usual LiDAR fusion solutions, while "algorithms + computing power + data" build a high competitive barrier.
Tactile sensors are important components for robots to interact with the external environment, giving robots a sense of touch. Touch is a form of perception through the skin that humans use to sense the external environment. Robot tactile sensors primarily perceive physical quantities such as temperature, humidity, pressure, and vibration when in contact with the external environment, as well as the softness or hardness of target materials, object shapes, and sizes, enabling precise positioning of objects and execution of various operational tasks.
Applications of sensors.
- Specific scheme comparison of dexterous hands: Dexterous hand = fingers (drive + transmission + sensors) * degrees of freedom + shell.
With the continuous advancement of industrial automation and artificial intelligence technologies, robots are gradually transforming from single repetitive task executors to intelligent agents capable of performing complex and variable tasks. In this transformation process, dexterous hands, as important tools for robots to interact with the external environment, are becoming increasingly significant. The design inspiration for dexterous hands comes from the complex structure and functions of human hands, enabling robots to perform diverse tasks such as grasping, manipulating, and even sensing, greatly expanding the application range and operational capabilities of robots.
The composition of dexterous hands is the foundation for achieving their multifunctionality. A typical dexterous hand system usually consists of several key components:
(1) Drive system: Responsible for providing power to enable fingers to perform various movements. The drive system includes motors, pneumatic, and hydraulic types.
(2) Transmission system: Converts the power generated by the drive system into the movement of finger joints. The transmission system includes screws, gears, linkages, ropes, and tendons.
(3) Sensor system: Includes tactile, force, and position sensors, used to perceive the contact state and force between the hand and external objects, as well as the position and motion state of the hand itself.
(4) Control system: Precisely controls the drive and transmission systems through algorithms and software to achieve predetermined hand movements and task execution.
This article will analyze the technical schemes, future development directions, competitive landscape, and value of components such as drives, transmissions, and sensors in Tesla's dexterous hand patents.
1.1 Technical route analysis of dexterous hands: Electric drive + composite transmission + force and tactile sensing as the leading direction.
1.1.1 Number of degrees of freedom: There is a trend of increasing degrees of freedom.
The human hand has a total of 24 degrees of freedom. According to "Robot Dexterous Hands - Modeling, Planning, and Simulation," the 24 degrees of freedom of the human hand include 5 degrees of freedom for the thumb, 4 degrees of freedom for each of the other four fingers, and an additional 3 degrees of freedom for wrist abduction, wrist flexion, and palm curvature.
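The breakdown can be tallied directly, which also gives a quick way to express how much of the human hand a given dexterous hand covers. This is a small sketch; the `coverage` helper is introduced here for illustration.

```python
# Degrees of freedom of the human hand, per the breakdown in the text.
HUMAN_HAND_DOF = {
    "thumb": 5,
    "index": 4,
    "middle": 4,
    "ring": 4,
    "little": 4,
    "wrist and palm": 3,  # wrist abduction, wrist flexion, palm curvature
}

total_dof = sum(HUMAN_HAND_DOF.values())


def coverage(robot_hand_dof: int) -> float:
    """Fraction of human-hand degrees of freedom a robot hand reproduces."""
    return robot_hand_dof / total_dof


print(total_dof, f"{coverage(6):.0%}", f"{coverage(11):.0%}")
```

A 6-degree-of-freedom hand covers a quarter of the 24 human degrees of freedom, and an 11-degree-of-freedom hand a little under half.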
The more degrees of freedom, the greater the design difficulty. One of the challenges is how to place numerous actuators to make the dexterous hand's size close to that of a human hand. Currently, the known dexterous hand with the most degrees of freedom is the Shadow Hand, which has 24 degrees of freedom. The first generation of Tesla's humanoid robot has 6 degrees of freedom in one hand, while the second generation has 11 degrees of freedom, overall moving towards higher degrees of freedom. Since 2014, at least four dexterous hands have achieved 21 degrees of freedom, with tendon, linkage, and gear transmission methods being used.
Comprehensive comparisons show that motor drive is the most suitable method for mass production of dexterous hands. This is mainly due to advancements in motor design, processing technology, and electronics, which can provide small-sized, high-output micro motors for dexterous hands. Additionally, the ease of obtaining and storing electrical energy provides a foundation for motor applications. Possible motors include hollow cup motors and brushless synchronous motors.
The hollow cup motor scheme is highly efficient and suitable for battery-powered dexterous hands that require long-term operation. Hollow cup motors use ironless rotors, eliminating energy losses caused by eddy currents formed by iron cores, thus achieving higher efficiency, smaller rotational inertia, and easier control. According to "Research Progress on Hollow Cup Micro Motors and Coils," hollow cup motors mainly have the following characteristics: (1) Energy-saving characteristics: The energy conversion efficiency is very high, with maximum efficiency generally exceeding 65%, and some products can reach over 90% (iron core motors generally do not exceed 75%); (2) Control characteristics: Rapid start and stop, extremely fast response, with mechanical time constants less than 28 milliseconds, and some products can achieve under 10 milliseconds (iron core motors generally exceed 100 milliseconds); (3) Fluctuation characteristics: Very reliable operational stability, with minimal speed fluctuations. As a micro motor, the speed fluctuation of hollow cup motors can easily be controlled within 2%. Therefore, hollow cup motors are particularly suitable for battery-powered applications requiring long-term operation, such as bionic hands, humanoid robots, and handheld electric tools.
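The practical meaning of the mechanical time constant can be sketched with a first-order speed response, in which the motor reaches about 63.2% of its final speed after one time constant. The first-order model is a simplifying assumption made here, and the three time constants are the thresholds quoted in the text rather than data for specific motors.

```python
import math


def speed_fraction(t_s: float, tau_s: float) -> float:
    """Fraction of final speed reached at time t for a first-order response."""
    return 1.0 - math.exp(-t_s / tau_s)


def time_to_fraction(fraction: float, tau_s: float) -> float:
    """Time needed to reach a given fraction (0 < fraction < 1) of final speed."""
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    return -tau_s * math.log(1.0 - fraction)


# Time constants in seconds: best hollow cup, typical hollow cup, iron core.
for label, tau in (("10 ms", 0.010), ("28 ms", 0.028), ("100 ms", 0.100)):
    t95 = time_to_fraction(0.95, tau)
    print(f"tau = {label}: 95% of final speed after {t95 * 1000:.0f} ms")
```

Under this model, reaching 95% of final speed takes about three time constants, so roughly 30 ms for a 10 ms motor versus about 300 ms for a 100 ms iron core motor.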
The main challenges for hollow cup motors lie in winding design, dynamic balance design, and capital investment. As a result, new entrants with shallow technical accumulation find it difficult to meet the high efficiency requirements of the robotics field.
(1) Winding design: The winding needs to ensure high density and consistent arrangement of coils, enabling the product to have high power and torque density. The diversity of winding forms directly affects production yield, but most technologies are patented by foreign companies, further increasing the difficulty for domestic companies to break through.
(2) Dynamic balance design: Rotor dynamic balance is an extremely important process in motor production, directly affecting whether the motor's noise and vibration performance meet standards. Differences in rotor dynamic balance are caused by different companies using different magnetic materials, leading to uneven mass distribution of the rotor.
(3) Capital investment: The equipment prices in the automated production lines for motors are relatively high, with single winding equipment costing over a million yuan, requiring customized development from equipment manufacturers, placing high capital demands on hollow cup motor manufacturers.
The brushless synchronous motor scheme with slots is a feasible way to reduce costs. Motors that can be used in finger parts can be divided into brushless synchronous motors with slots and brushless motors without slots:
- With slots: Most brushless DC motors adopt a slot design, with coils wound in the slots on the stator;
- Without slots: Hollow cup motors belong to the category of motors without slots; in slotless motors, there are no slot structures on the stator, and coils are separately wound and fixed directly on the surface or inside of the stator.
Because of their small diameter and minimal torque fluctuations, hollow cup motors currently dominate in robotics. Brushless synchronous motors with slots deliver more power than hollow cup motors, but their iron cores cause torque fluctuations, and therefore larger fluctuations in speed and torque, and prevent operation at high speeds. Hollow cup motors can reach high speeds at small diameters, with the load borne mainly by the palm structure. The larger diameter of brushless synchronous motors with slots means that, in the short term, they can be installed only in the thumb, which has a higher spatial tolerance; they will not be placed in the other fingers, which need to be smaller and lighter.
The future development direction of motors revolves around cost reduction and efficiency enhancement, mainly achieved by reducing size and weight, such as harmonic magnetic field motors and integrated technology motors.
- Harmonic magnetic field motors: Achieve reduced size and increased power density by changing the internal design structure of the motor;
- Integrated technology motors: Achieve reduced size and increased power density by integrating reducers and other products.
1.1.3 Transmission methods: Investment in screws is becoming the development trend for hand transmission.
Transmission methods mainly include tendons, screws, gears, and linkages. Early dexterous hands used gears and linkages as transmission mechanisms, but these have gradually been eliminated because of their size and mass and their lack of flexibility in movement. Tendon transmission, which mimics the tendon structure of animals, is currently widely used in dexterous hands. According to various companies' mid-year reports for 2024, listed companies are focusing on the research and development of hand screws, and investment in screws is becoming the development trend for transmission.
Tendon transmission uses ropes to simulate the tendon structure of human hands, allowing large actuators to be positioned away from the execution mechanism, reducing the load and inertia at the end, and increasing grasping speed, with flexible arrangements, making it suitable for transmission scenarios that require many degrees of freedom in confined spaces.
Linkage transmission uses multiple linkages in a mixed series-parallel form to transmit motion and torque. The motion and power of the fingers are transmitted by rigid linkages, capable of grasping large objects with a compact structural design, enabling enveloping grasping. The downside is that it is difficult to control over long distances, prone to ejection, and has limited grasping space.
Ball screw transmission: According to "Design of Control Systems for Space Five-Finger Dexterous Hands," the motor and ball screw are placed externally in the arm, with the motor driving the ball screw through a reducer. The rotational motion of the motor shaft is converted into translational motion of the screw nut, which pulls a tendon whose other end is attached to the finger bones, causing the finger joints to rotate around the joint axis and the finger to bend. According to Tesla's public information, Tesla will subsequently install the drive device in the arm rather than inside the fingers.
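The chain from motor rotation to finger bending can be sketched as simple kinematics: reducer, then screw, then tendon wrapped around the joint. The sketch assumes an inextensible tendon acting at a constant effective radius about the joint axis; the function name and all numeric values are hypothetical and not taken from the cited design.

```python
import math


def joint_angle_deg(
    motor_revs: float,
    gear_ratio: float,
    lead_mm: float,
    pulley_radius_mm: float,
) -> float:
    """Joint rotation produced by a motor -> reducer -> screw -> tendon chain.

    Assumes an inextensible tendon acting at a constant radius about the joint.
    """
    screw_revs = motor_revs / gear_ratio        # reducer slows the screw
    nut_travel_mm = screw_revs * lead_mm        # screw converts rotation to translation
    angle_rad = nut_travel_mm / pulley_radius_mm  # tendon pull = arc length
    return math.degrees(angle_rad)


# Hypothetical values: 20 motor turns, 4:1 reducer, 1 mm lead, 5 mm joint radius.
angle = joint_angle_deg(motor_revs=20, gear_ratio=4, lead_mm=1.0, pulley_radius_mm=5.0)
print(f"Joint rotates by about {angle:.1f} degrees")
```

With these numbers the nut travels 5 mm and the joint rotates by one radian, about 57.3 degrees.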
The transmission device of dexterous hands can generally be divided into three levels: (1) The first level: Located on the motor side, mainly consisting of reducers, serving to adjust precision; (2) The second level: The most important, responsible for action execution; (3) The third level: Connecting the driver and the end of the joint, mainly consisting of tendons and linkages. From market cases, the first-level transmission mainly uses belts, the second-level transmission mainly uses screws or bevel gears, and the third-level transmission generally uses tendons and linkages.
Comprehensive comparisons show that each scheme has its strengths, but due to the high load-bearing requirements in factory labor scenarios, screws may become the mainstream transmission scheme in factory scenarios.
1.2 Disassembly of Tesla Gen1 patents: Actuators and gearboxes are core components.
According to Tesla's hand patents, the hand uses approximately 14 core components. Ranked by value, actuators, gearboxes, and Hall effect sensors are the highest:
- Actuators: Hollow cup motors (we estimate a domestic production cost of 1,000 yuan/unit after mass production, totaling 13×1000=13,000 yuan) or brushless synchronous motors with slots (estimated domestic production cost of 160 yuan/unit after mass production, totaling 160×13=2,080 yuan).
- Planetary gearboxes: 304a-304f (gearbox = 1 gear + 1 worm), we estimate the value of a single degree of freedom to be about 100 yuan, totaling 1,300 yuan for 13 degrees of freedom.
- Finger joint components: Proximal 402, distal 408, 420, for the casing, we estimate the value to be small.
- Ends of the finger joints: 410, 412, for the casing, we estimate the value to be small.
- Axles (including pins, shafts, etc.): 406, 414, we estimate the value to be small.
- Cables: 416, 418, 512, we estimate the value to be small.
- Channel structures: 424, 426, we estimate the value to be small.
- Torsion springs: distal 436, proximal 434, we estimate the value to be small.
- Spring brackets, pins, we estimate the value to be small.
- Others: Fingernails, tendons, automatic tensioners, manual tensioners, flange bearings, we estimate the value to be small.
- Pipes: 514, 516, we estimate the value to be small.
- Worm gears: 704 (including pulley 706), already calculated in the gearbox.
- Gears: 702, already calculated in the gearbox.
- Hall effect sensors: Composed of sensor 802 and magnetic field source 804 (the processor 114 determines the position or rotation angle of the finger joint components through the magnetic field measured by the Hall effect sensor), with a mature industrial chain and small value.
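The value estimates in the list reduce to a short calculation comparing the two motor options. The sketch uses the per-unit estimates and the count of 13 given above; combining motor and gearbox value into one total per hand is a simplification made here, and all other components are ignored because the text rates their value as small.

```python
UNITS_PER_HAND = 13  # motors and gearboxes counted per degree of freedom, as cited

# Estimated unit values in yuan after mass production, from the list above.
UNIT_VALUE_YUAN = {
    "hollow cup motor": 1000,
    "slotted brushless motor": 160,
    "gearbox": 100,
}


def hand_value(motor_type: str) -> int:
    """Estimated value of motors plus gearboxes for one hand, in yuan."""
    per_dof = UNIT_VALUE_YUAN[motor_type] + UNIT_VALUE_YUAN["gearbox"]
    return UNITS_PER_HAND * per_dof


hollow_cup_total = hand_value("hollow cup motor")
slotted_total = hand_value("slotted brushless motor")
print(hollow_cup_total, slotted_total, hollow_cup_total - slotted_total)
```

On these estimates, switching from hollow cup motors to brushless synchronous motors with slots lowers the motor-plus-gearbox value per hand from 14,300 yuan to 3,380 yuan.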
According to Tesla's GEN1 dexterous hand patent, the basic working mechanism of the fingers is achieved through the collaborative action of the cable drive system and actuators. When the actuator is activated, it pulls the cable, which is guided through the channel structure of the finger, driving the joints of the finger to rotate around the pivot. The cable maintains appropriate tension through the gearbox, ensuring smooth joint movement. Additionally, the torsion spring provides extra rebound force, allowing the fingers to naturally return to their initial state after completing operations. This cable-driven design not only reduces complex mechanical components but also enhances the flexibility and durability of the fingers. Through precise positioning by Hall effect sensors, the fingers can adjust their movements in real-time, achieving highly refined operational tasks.
The steps for disassembling Tesla's dexterous hand patent are as follows:
The first step, according to Tesla's dexterous hand patent, shows that its single hand has 5 fingers, each containing two joints (proximal joint 206 + distal joint 208), with each finger secured to the palm by fasteners. This step adds finger joint components.
The second step, according to Tesla's dexterous hand patent, shows that its single hand has 6 degrees of freedom, thus containing 6 actuators and gearboxes. The actuators are mainly placed in the palm (due to the increase in degrees of freedom in the third generation, the palm's capacity is insufficient, so the actuators are loaded into the larger-capacity arm). This step adds actuators and gearboxes, with 6 for each hand.
The third step, according to Tesla's dexterous hand patent, shows that each finger has 2 pivot structures and 2 torsion springs. The two pivots are located at the proximal (406) and distal (414) ends of the finger, typically made of pins or bearings, allowing the finger to rotate freely within a specific angular range. The distal torsion spring (436) and proximal torsion spring (434) are located at the distal and proximal joints of the finger, providing additional stability and helping the finger provide feedback force when returning to its initial state. This step adds pivot structures and torsion springs.
The fourth step, according to Tesla's dexterous hand patent, shows that each finger has 2 cables and 2 channel structures, with one end of the cable connected to the actuator and the other end routed through complex channel structures (424, 426) inside the finger. When the actuator moves, the cable can bend the dexterous hand. The channel structure provides a guiding path for the cable's movement, ensuring that the cable can move freely without tangling when the finger bends. This step adds cables and channel structures.
The fifth step, according to Tesla's dexterous hand patent, shows that the dexterous hand has 6 gearboxes, with each actuator controlling the movement of the cable through a gearbox. The gearbox typically consists of a worm and worm gear, with the pulley in the gearbox connected to the cable, ensuring that the tension of the cable remains within a stable range. This step adds gearboxes (including gears, worm gears, worms, and pulleys).
The sixth step, according to Tesla's dexterous hand patent, shows that each finger is also equipped with 1 Hall effect sensor to monitor the rotation angle and position of each joint of the finger. The Hall effect sensor is connected to the processor, and when the finger rotates, it determines the exact position of the finger by measuring changes in the magnetic field, providing real-time feedback. This step adds Hall effect sensors.
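The six steps above can be tallied into a rough parts count for one hand. The sketch simply multiplies the per-finger quantities by five fingers and adds the per-hand items, as described in the steps; it is a bookkeeping aid, not a reconstruction of the patent's full bill of materials.

```python
FINGERS = 5

# Quantities per finger, taken from steps one, three, four, and six.
PER_FINGER = {
    "finger joint components": 2,  # proximal + distal
    "pivots": 2,
    "torsion springs": 2,
    "cables": 2,
    "channel structures": 2,
    "Hall effect sensors": 1,
}

# Quantities per hand, taken from steps two and five.
PER_HAND = {"actuators": 6, "gearboxes": 6}


def hand_bom() -> dict:
    """Return part counts for one hand as {part name: quantity}."""
    bom = {part: qty * FINGERS for part, qty in PER_FINGER.items()}
    bom.update(PER_HAND)
    return bom


bom = hand_bom()
for part, qty in bom.items():
    print(f"{part}: {qty}")
print("total parts counted:", sum(bom.values()))
```

The count comes to 67 parts across these eight categories, with actuators and gearboxes making up only 12 of them despite carrying most of the value.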
1.3 The core of cost reduction for dexterous hands lies in hollow cup motors and screws.
According to Tesla's public information, the main changes in GEN3 dexterous hands compared to GEN2 are: (1) The number of degrees of freedom in the hand has increased from 11 to 22, and we estimate that the corresponding number of motors will increase from 6 to 13-17; (2) The actuators are now loaded in the wrist.
Tactile Sensors: Resistive and capacitive types are commonly used, with large arrays and flexibility being the main developments.
Capacitive, resistive, and piezoelectric types are common tactile sensors. Tactile sensors can be mainly divided into capacitive, resistive, piezoelectric, magnetic-sensitive, and fiber-optic types based on their principles. Among these, resistive sensors are suitable for monitoring constant pressure changes, capacitive sensors have simple structures and are widely used in wearable and healthcare devices, and piezoelectric sensors are suitable for detecting frequently changing pressure scenarios.
◼ Large arrays, flexibility, multifunctionality, multi-dimensionality, and self-powering are important development trends for tactile sensors. Large arrays: The larger the contact area of tactile sensors with an object's surface, the more information can be obtained. Arrayed, high-density tactile sensors can capture tactile information from different positions and times. Flexibility: Flexible sensors can cover irregular and uneven surfaces, making them easy to carry and install. Multifunctionality: Multifunctional tactile sensors can simultaneously measure various parameters such as pressure, tension, temperature, and surface roughness.