On the Ground in China’s Humanoid Robotics Moment
China’s Humanoid Robots Are Moving Fast. Paul Triolo Explains Why.
Things That Caught Our Attention
Since we last published, Xiaomi, Tencent, Alibaba, Kimi, and the highly anticipated DeepSeek v4 have all released new models. The pace is dizzying, and it speaks to the depth of the talent bench. Both the Xiaomi and Tencent teams are only about six months old under their new leadership, and the fact that they pushed out models this quickly, and ones well-reviewed by international peers and users, is a testament to how many people in China now genuinely know what they are doing in this space, chip-constrained or not. And the DeepSeek shock keeps coming. After announcing that they would run inference on Huawei chips in the second half of this year and cut prices accordingly, they have already cut prices, making the model 97% cheaper than OpenAI’s GPT-5.5. Talk about involution!
The $2Bn Meta acquisition of Manus will be unwound at the request of the Chinese government. This outcome was outside our initial expectations but became increasingly likely as the investigation dragged on. The unwinding is likely meant to set precedent, signaling to companies that the obligation to notify and seek permission from Chinese regulators in cross-border transactions should be taken more seriously. Manus is not a strategically critical company in China’s AI plans, but the deal was high-profile enough, arguably more so than it should have been, that closing the “loophole” it represented required making an example of someone. The loophole in question: “de-China-fying” one’s origins by relocating headquarters to Singapore and hiring new employees abroad. As AI becomes increasingly wrapped up in national security and geopolitics, it is unfortunate that the company ended up as collateral damage.
A Note
We’ve been working on a few new reports that are on their way to you, with a particular focus on robotics.
Over the past two weeks, we’ve taken two groups to China, about 30 investors, founders, operators, and executives from 10 different countries. As usual, the agenda covered EVs and AI, but the real center of gravity this time was robotics.
We’re also preparing a piece that we’ll share in the next few days to complement what you’ll read below. A meaningful portion of what we saw and heard was off the record, so there are limits to what we can attribute directly. Still, stepping back, there are several dynamics unfolding in Chinese robotics that are not immediately intuitive to Western observers. A lot of it comes down to how China builds industrial capacity: structured, iterative, and deeply integrated across the supply chain in ways that compound over time.
In the meantime, please enjoy this piece from Paul Triolo. He joined us on the second trip, and he is someone we respect a great deal. He’s generously agreed to let us share his work here. If you find it useful, we encourage you to subscribe to his Substack.
We’ll have more for our paid subscribers in the coming weeks as we continue to synthesize what we’ve been seeing on the ground.
Separately, for any of the companies mentioned, you can explore our robotics tracker at robotics.techbuzzchina.com. We’ll be formally launching it alongside a dedicated report. This has been a joint effort with our data partners at IT Juzi and the Shenzhen Robotics Association, who also hosted the Fair Plus convention last week in Shenzhen, with a strong focus on humanoids.
As always, thank you for your support.
Rui
A Deep (but Partial) Dive into China’s Humanoid Robotics and Embodied AI Sector
Highly dynamic, rapidly developing, and well-funded sector still searching for scalable use cases
Over the past year there has been considerable attention paid in Western media and elsewhere to the rise of China’s humanoid robotics sector. The issue is complicated by the emphasis in Chinese government programs on the concept of embodied AI, including in the most recent 15th Five Year Plan.1 I have wanted to write about this issue for some time, but felt I needed to first directly experience some of the issues related to robotics and embodied AI in China. This is just a preliminary attempt to baseline the sector and draw some linkages to the broader issue of US-China competition in AI. What follows captures a few of the many discussions I had in China about this complex and fast-changing sector. Much detail has been left out of what were for the most part off-the-record discussions.
After an intense five-day deep-dive trip to China, with stops in Shanghai, Liuzhou, and Shenzhen, attending the Shenzhen AI and Robotics Fair featuring 400 companies, visiting many Chinese robotics companies focused primarily on humanoid robotics, and lengthy discussions with Chinese companies engaged in developing embodied AI, I now feel at least minimally qualified to discuss some of the issues and challenges that these technologies pose both for China’s development and for US-China tech competition.
The perception that the US is falling behind China in a race for dominance in robotics, and particularly in the humanoid robotics sector, has also been much discussed within US Congress and the Trump administration. There is considerable fear of Chinese robots coming to the US and dominating the US robotics sector—again, particularly humanoid robots. As I have written previously, humanoid robots fall under the category of connected devices. The debate in the US around how to handle Chinese robots, in particular humanoid robots, has been conducted at a very low level, as I noted here. Let’s now dive into these issues in more depth, based on firsthand experience in China and also stepping back to look at the bigger picture around China, robotics, and embodied AI.
All photo credits: author
Humanoid Robotics: Early Days, but Industry Ecosystem Moving Fast






During my swing through the humanoid robotics sector in China in late March, at one level it appeared to me that the sector, which has received considerable media attention recently, is situated in a position similar to the early days of EVs and connected vehicles. Indeed, there are many similarities between the physical components, systems, and supply chains of EVs and humanoid robots, such as batteries, sensors, actuators, magnets, and control systems designed for movement through space. The overlap between supply chains for these two important sectors (the EV sector is more than a decade older) is one of the principal reasons for the early success of humanoid robots in China.
Because of the nature of the human body, companies trying to develop humanoid capabilities are stuck in a rather limited range of sizes for their humanoid robots. Most of the robot makers we talked to make two versions: a smaller version with a low center of gravity that can do more complicated movements such as dancing, and a larger version, the size of a more mature human, that can be used for broader applications in a factory setting, having more reach and being more versatile than the smaller versions. Anything smaller than the half-size humanoid robot would appear to look like a toy, while anything bigger than an average human would be both intimidating and difficult to stabilize because of the center of gravity issue. One of the things that was apparent during my tour through many humanoid robot companies was the close attention paid to the outer appearance of the robots and how intimidating they were or were not, depending on the end-use case.
Despite many similarities, there are also critical differences between the supply chains for humanoid robots and EVs, dictated by the space within which humanoid robots move and the end-use cases. This is perhaps most salient for bipedal humanoid robots. While there was no real doubt about the end-use cases for EVs and autonomous vehicles, we are in the very early stages of determining where the best use cases will be for humanoid robots. This became clear over the course of the week as we discussed these issues with a variety of leading Chinese humanoid robot companies. Currently each company is targeting different use-case buckets for initial sales of its humanoid robots, with models that either ship with some movements programmed in or are primed for programming a limited set of movements, as progress continues on the much more difficult “big brain” problem. These buckets include research, education, tour guides, security, and factory automation, and Chinese humanoid AI companies are working hard to develop specific humanoid robots to address each of them, since each requires a different type of humanoid robot and, critically, different robotic arms, joints, and hands. While there is overlap in the functions robots will require, particularly bipedal humanoid robots, the degree of overlap depends on the use case: for example, industrial robots deployed in a factory setting versus non-industrial humanoid robots that might function as a tour guide, security guard, or sentry. There is much commonality in the basic functions of the physical humanoid robot body, but a lot of nuance and complexity among companies, and evolving business models amid fierce competition.
Here is a more detailed breakdown of use cases. Each comes with different requirements at the mechanical, small-, and big-brain levels.
Automotive/EV and electronics manufacturing — strongest near-term fit; China’s core advantage is factory integration.
Logistics, warehousing, loading/unloading, and material handling — huge labor-replacement value, but technically harder than demos suggest.
Inspection, maintenance, hazardous-environment work — power, chemicals, mines, emergency response; politically attractive and easier to justify.
Commercial service/reception/retail/tour-guide roles — already being deployed, but often more showcase than productivity.
Eldercare and household assistance — strategically important because of demographics, but probably slower to scale than policy rhetoric implies.
We saw numerous cases of companies selling dozens or even hundreds of humanoids to research organizations, including government-backed labs designed to collect shareable training data. In other cases, companies were selling humanoids to factories owned or operated by their own investors. These somewhat closed-loop commercial relationships reveal an industry still searching for scalable, profitable use cases. The home/companionship use case in China remains strategically compelling—given demographics and policy signaling—but is still better understood as a frontier R&D domain rather than a near-term commercial vertical. Unlike factories or warehouses, domestic environments are unstructured, safety-critical, and socially embedded, which imposes a much higher bar on both hardware and software.
On the hardware side, achieving human-safe physical interaction requires advances in compliant actuation, high-resolution tactile sensing (“electronic skin”), and ultra-fast reflexive control loops to manage contact forces, slip detection, and unexpected collisions—particularly for tasks involving frail elderly users where skin sensitivity, pressure thresholds, and injury risk are non-negotiable. We saw a lot of effort going into trying to solve some of these challenges, but multiple breakthroughs will be required before viable use cases can be seen in this area.
Current humanoids also still struggle with dexterous manipulation of soft or deformable objects (clothing, bedding, food), reliable grasping across varied geometries, and sustained operation without failure. On the software side, robust perception in cluttered, dynamic home settings—combined with long-horizon task planning and naturalistic human–robot interaction—remains brittle outside heavily curated demos that are a staple of AI and robotics exhibitions. Layer on privacy constraints, continuous learning requirements, and the need for near-zero failure rates, and the result is a use case where technical readiness in China lags policy ambition. This is why, despite visible Chinese prototypes targeting household chores and companionship, the more bankable deployments remain in structured industrial and commercial settings for the foreseeable horizon.
During our tour of leading Chinese humanoid robot companies in late March—including AgiBot, Unitree, Fourier, UBTech, and EngineAI—and many other key players such as robotic hands developers and big brain platforms, we were struck by the many different types of requirements at play that can be placed on a human-like robotic hand in terms of carrying/bearing loads, degrees of freedom for particular tasks, other areas of flexibility, and things like torque required to deliver a punch or rapid movement. We can already clearly see significant specialization within the Chinese humanoid robotics sector in some of these functions, particularly around hands, feet, and joints of the body that are key to the functioning of humanoid robots and distinguish them from things like electric vehicles. While electric vehicles are used on well-defined roads and landscapes, for example, humanoid robots, depending on the use case, must be able to navigate a much less regimented and regular landscape to function properly.
One key distinction that became clear during the week is that the movements of individual limbs and hands, or combinations of both, are governed at a much lower level in the humanoid robotic AI stack. This is the so-called “small brain.” The robots can be trained to execute these movements in a variety of ways, all of which are evolving from simple human-guided movements through capture of video and trajectory motion via much more sophisticated tools. Once captured or trained, these types of movements can be executed autonomously or remotely through the cloud or a local controller. The elaborate dancing sequences seen in Chinese New Year videos, for example, featuring dozens or hundreds of robots all executing the same movements in unison, are executing locally on the robot but are initiated remotely via the cloud or a local controller. At this level, various robotics companies distinguish themselves via the design of the joints, particularly the hip joints and the feet, for smoothness of motion and degrees of freedom that allow more complicated movements.
Of the companies we visited, for example, EngineAI—a company that has just completed a Series B funding round of $200 million2 and is pursuing development of robots for use as tour guides, security guards, and eventually industrial applications—featured the most smoothly functioning robots, with clearly superior joint movements that allow very elaborate dancing, martial arts, and other complex movements. EngineAI designs its own joints and actuators, outsourcing the manufacturing of these products and then assembling the joints into its own robots. Its intellectual property or secret sauce is tied up in the design and iteration of the joints and the way it captures video and movements to program the overall robot through a sequence of movements. At EngineAI, we discussed a convergence toward hybridized motion-generation pipelines that blend traditional optical motion capture with increasingly scalable video-based inference. Rather than relying exclusively on expensive motion capture rigs, for example, EngineAI is using models that extract trajectory data directly from video inputs, dramatically accelerating data acquisition and reducing dependence on controlled capture environments. The workflow is modular and toolchain-driven: video → trajectory inference model → 3D reconstruction (often via tools like Blender) → skeletal retargeting to robot kinematics → format conversion → downstream model training or execution. Critically, the firm emphasizes that the underlying training logic is consistent across data sources; the differentiator is speed and scale of data ingestion, not the paradigm itself. This aligns with a broader industry shift toward “model-of-models” architectures, where foundation models, VLM/VLA stacks, and world models interact, with performance ultimately governed by data quality, training paradigm, and reward-weight tuning. 
The implication is a move toward faster iteration cycles and synthetic/weakly supervised data pipelines—key for closing the gap between demo-grade motion and deployable, task-specific robot behavior.
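The staged workflow described above can be pictured as a chain of interchangeable functions. What follows is a minimal Python sketch under stated assumptions, not EngineAI’s actual toolchain: every function name, joint name, joint limit, and data value is a hypothetical stand-in, and the trajectory-inference and 3D-reconstruction stages are stubs rather than real models.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple


@dataclass
class Trajectory:
    """Hypothetical container: joint angles (radians) for each frame."""
    joint_names: List[str]
    frames: List[Dict[str, float]]


def infer_trajectory(video_frames: Sequence[object]) -> Trajectory:
    """Stub for a video-to-trajectory model: emits one fake pose per frame."""
    frames = [{"human_hip": 0.1 * i, "human_knee": 0.2 * i}
              for i, _ in enumerate(video_frames)]
    return Trajectory(["human_hip", "human_knee"], frames)


def reconstruct_3d(traj: Trajectory) -> Trajectory:
    """Stub for 3D reconstruction; a real stage would lift poses into 3D."""
    return traj


def retarget(traj: Trajectory,
             joint_map: Dict[str, str],
             limits: Dict[str, Tuple[float, float]]) -> Trajectory:
    """Map human joints onto robot joints and clamp to the robot's limits."""
    out_frames = []
    for frame in traj.frames:
        robot_frame = {}
        for human_joint, robot_joint in joint_map.items():
            low, high = limits[robot_joint]
            robot_frame[robot_joint] = min(max(frame[human_joint], low), high)
        out_frames.append(robot_frame)
    return Trajectory(list(joint_map.values()), out_frames)


def to_training_rows(traj: Trajectory) -> List[List[float]]:
    """Format conversion: flatten each frame into an ordered list of angles."""
    return [[frame[name] for name in traj.joint_names] for frame in traj.frames]


def run_pipeline(data: object, stages: Sequence[Callable]) -> object:
    """Feed the output of each stage into the next one."""
    for stage in stages:
        data = stage(data)
    return data


# Hypothetical robot description used only for this illustration.
JOINT_MAP = {"human_hip": "robot_hip_pitch", "human_knee": "robot_knee_pitch"}
LIMITS = {"robot_hip_pitch": (-0.5, 0.25), "robot_knee_pitch": (0.0, 0.5)}

rows = run_pipeline(
    ["frame0", "frame1", "frame2", "frame3"],
    [infer_trajectory, reconstruct_3d,
     lambda t: retarget(t, JOINT_MAP, LIMITS), to_training_rows],
)
```

Because each stage only agrees on the shape of the data it passes along, swapping a motion-capture source for a video-based one would change the first stage and nothing downstream, which is one way to read the firm’s point that the differentiator is the speed and scale of ingestion rather than the paradigm.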
The company does not design the hands, and can use hands from other companies depending on the application. Indeed the robotic hand sector is an almost completely separate supply chain and sector from that of humanoid robot bodies, one that is more closely linked to the “small brain” but that also requires close coupling to the big brain, which allows the robot to function and have awareness in a broader environment. EngineAI, for example, appears to treat hands as an important but not yet fully solved layer of the stack: locomotion and balance are presented as comparatively mature, while manipulation remains a relative weakness and a major focus for industrial deployment. The company says it has its own hands but also integrates third-party hands when appropriate, using AI-interface adaptation to make external end-effectors work with its robots. This flexible approach reflects the broader bottleneck: human hands have many more degrees of freedom than most robotic hands, while current actuator energy density and packaging constraints limit how many joints and motors can be integrated into a humanlike form factor. So EngineAI’s near-term strategy is pragmatic: optimize locomotion and full-body control in-house, while treating hands as a modular manipulation subsystem that can be upgraded or swapped as better dexterous hardware becomes available.
Humanoid robots are best understood as hierarchical control systems that separate high-level cognition from low-level execution. The “big brain” handles task planning, reasoning, and global motion—deciding what to do and generating coarse trajectories for the body—e.g., walking, reaching, positioning. These operations run at relatively low frequencies and rely on abstract representations of the environment and task goals. However, the big brain does not directly control joints or actuators in real time; instead, it delegates execution to intermediate controllers—model predictive control or whole-body control that ensure stability, balance, and coordination under physical constraints. For example, we pushed and kicked EngineAI’s small robot from all sides and it was very good at maintaining stability, adjusting its feet and leg positions quite rapidly.
In fact, fine manipulation—especially hand and finger movements—is governed by “small brains” operating at much higher frequencies with tightly closed feedback loops. These local controllers handle contact-rich dynamics such as grip force, slip detection, and object reorientation using dense tactile and proprioceptive data—proprioceptive refers to sensing the internal state of a body—specifically the position, motion, and forces of its limbs and joints—without relying on external vision or touch of the environment. Here, autonomy shifts decisively downward: once the big brain issues a command like “grasp the cup,” the detailed execution is handled almost entirely at the edge. The result is a division of labor in which the system is globally directed but locally autonomous, with precision and responsiveness concentrated in the small brains and generalization and planning residing in the big brain.
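The division of labor between a slow planner and a fast reflex loop can be illustrated with a toy simulation. This is a sketch under loudly stated assumptions: the 1 Hz and 100 Hz rates, the proportional gain, and the force targets are invented for illustration and do not come from any company mentioned here, and the intermediate layer (model predictive or whole-body control) is left out for brevity.

```python
# Illustrative only: rates, gain, and force values are assumptions,
# not figures reported by any robot maker.

def plan_grasp(step: int) -> float:
    """'Big brain' stand-in: choose a target grip force (N) per plan step."""
    return 2.0 if step == 0 else 5.0  # light touch first, then a firm grasp


def simulate(duration_s: float = 2.0, planner_hz: int = 1, reflex_hz: int = 100):
    """Run a slow planner and a fast reflex loop on one shared clock."""
    ticks_per_plan = reflex_hz // planner_hz
    total_ticks = int(duration_s * reflex_hz)
    target_force = 0.0   # written by the slow planner
    grip_force = 0.0     # tracked by the fast local controller
    plan_updates = 0
    reflex_updates = 0
    for tick in range(total_ticks):
        if tick % ticks_per_plan == 0:
            # Low-frequency decision: what force should the hand aim for?
            target_force = plan_grasp(tick // ticks_per_plan)
            plan_updates += 1
        # High-frequency closed loop: proportional correction on every tick.
        error = target_force - grip_force
        grip_force += 0.2 * error
        reflex_updates += 1
    return plan_updates, reflex_updates, grip_force


plans, reflexes, final_force = simulate()
```

With the default arguments the planner runs twice while the reflex loop runs 200 times, and the grip force settles on the latest target without the planner ever touching an actuator directly. That is the "globally directed, locally autonomous" pattern in miniature.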
Companies in the hand part of the humanoid robotics sector, such as Agilink and Linkerbot, distinguish themselves via a variety of methods, including degrees of freedom, load bearing for the fingers, and the ability to do complex movements with strength. Another key factor is torque and torque sensors, which are a fast-growing part of the humanoid robotic hand sector. Torque and torque sensors will be required for humanoid robots that will do more delicate and complex interactions with humans, and also in more complicated factory and industrial environments where they need to be able to sense and get feedback from the things they are touching. At the fair, they noted that some joints will be equipped with as many as 28 torque sensors, which will be primarily used to measure torque in a single direction. The main force that matters is the one that the robot can sense, and only when these functions are well developed will it be possible to ensure safety in human-robot interaction, for example.





Taking the six-axis sensor as an example, it mainly detects the six directions, that is, XYZ as well as the rotational torque around XYZ. At the end of the robot, such as at the wrist or ankle, the sensor can perceive multi-dimensional force directions and, together with the hand or other parts, work to sense force collectively.—Senior robotics company official
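To make the six quantities concrete: a six-axis sensor reports three forces (along X, Y, Z) and three torques (about X, Y, Z), often grouped into a single "wrench." The sketch below is a generic textbook illustration in Python, not any vendor’s interface. The class, the function names, and the numbers are assumptions chosen for the example, and it idealizes the sensor as reading the exact torque produced by a point force at a known offset.

```python
import math
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Wrench:
    """Hypothetical six-axis reading: forces in N, torques in N*m."""
    fx: float
    fy: float
    fz: float
    tx: float
    ty: float
    tz: float


def cross(a: Vec3, b: Vec3) -> Vec3:
    """Standard 3D cross product a x b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def wrench_from_contact(force: Vec3, lever_arm: Vec3) -> Wrench:
    """Ideal reading for a point force applied at lever_arm from the sensor.

    Torque follows from torque = lever_arm x force.
    """
    torque = cross(lever_arm, force)
    return Wrench(*force, *torque)


def force_magnitude(w: Wrench) -> float:
    """Size of the net force, ignoring its direction."""
    return math.sqrt(w.fx ** 2 + w.fy ** 2 + w.fz ** 2)


def within_contact_limit(w: Wrench, max_force_n: float) -> bool:
    """Simple safety check against an assumed force ceiling."""
    return force_magnitude(w) <= max_force_n


# A 10 N downward push applied 0.1 m out along X from a wrist sensor.
reading = wrench_from_contact((0.0, 0.0, -10.0), (0.1, 0.0, 0.0))
```

In this example the push shows up as a force along Z and a torque about Y, which is why a wrist- or ankle-mounted sensor can infer where and how hard the limb is being loaded rather than just whether it is in contact.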
Finally, there is an acute awareness of the many tradeoffs involved in the interaction between the mechanical body and the small brain of a humanoid robot. For example, the size of the battery on a small (and even on a larger) robot is limited, and complicated maneuvers and actions can quickly use up battery power. Currently much of the focus in the development of the mechanical parts of the humanoid robot is on driving down the energy use of components such as actuators and sensors, and on reducing the size and increasing the energy density of the battery pack. Depending on the application, though, humanoid robots could still be tethered in a factory setting, given their range of motion and what exactly they are doing. But the humanoid robots we saw were largely being designed for autonomous operation, and some could even recharge themselves easily by swapping out their own battery packs. Again, the energy use of the small brain in humanoid robots is highly dependent on the application and the period needed before recharging. For higher-level intelligence, much of the actual compute capacity and energy use will be off the robot and in the cloud, much as some AI applications run inference on your phone when necessary and in the cloud when tasks are more complicated. A very sophisticated large language model capability, for example for verbal communication, will likely not be run locally on the robot, but many of these decisions will depend on the overall use case and on other issues such as data security.
Already though, highly capable small brain robots with the most advanced joints, such as EngineAI models including the full-size T800, equipped with robust big brain speech capacity, have a retail market. EngineAI is clearly transitioning from a prototyping phase to early-scale commercialization, for example, with a stated near-term target of ~5,000 units annually against per-line capacity exceeding 10,000 units, implying deliberate underutilization as they stabilize quality and cost curves. The production model remains hybrid: core IP—especially joints, motion control systems, and key mechanical subsystems—is designed in-house, while selected components—such as servo motors—are outsourced for fabrication and then integrated internally, with ~200 workers on assembly lines and ~75% of staff in engineering roles. Model differentiation appears less about radically distinct architectures and more about configuration tiers and use-case targeting: larger platforms—such as the TA/T800-class systems—are priced roughly $40K–$80K and emphasize power, stability, and performance, with higher weight and torque output—while smaller variants—RMB 88K entry-level—target education and light-duty deployment. The company frames “stage one” —locomotion, balance, performance demos—as largely solved, with current roadmap priorities shifting toward manipulation, cost reduction, and scenario-specific solutions—especially industrial deployments supported by government-backed “robot data centers” used to generate task-specific training data at scale.
The Evolution of the Big Brain, and Embodied AI
The big brain, operating in the cloud and providing situational awareness and interactive interface, is where a lot of effort is currently being devoted, and this is where some key aspects of embodied AI and robotics meet. I was struck by how much innovation is happening in this part of the stack, around things like capturing motion and modelling the complex factors governing the motion of the human body and its interaction with complex physical environments. (More on the “big brain problem” in future posts.)
The big brain is likely to include multiple different types of AI models. Clearly large language models will be used for robot communication, and we were impressed by the number of humanoid robots that already had quite flexible and excellent ability to interact and converse with humans using large language models in the cloud. At the same time, those robots were also limited in their ability to perceive what was going on around them, for example to count the number of people in the room, because they (at least the ones we were able to interact with) were not yet equipped with more advanced multimodal models such as a world model or a full array of sensors and detailed ability to manipulate and interact in a generalized context with other humans.


We spoke with some of the leading companies in China that are working on the big brain problem, including SenseTime and Tencent Robotics. Chinese AI model developers trying to build the “big brain” for humanoid robots face a harder problem than training another chatbot: the model has to fuse language, vision, depth, force, tactile, proprioceptive, and temporal memory into a system that can reason, plan, and act safely in messy physical environments. The robots we interacted with from AgiBot and EngineAI had excellent voice features and delivered quite complex answers to our questions. The core challenges, though, are data scarcity (especially high-quality robot trajectory and manipulation data); cross-embodiment generalization across different robot bodies and hands; sim-to-real gaps; low-latency inference on constrained hardware (we heard a lot about the latency problem); and robust long-horizon planning with recovery when specific actions fail. The hardest bottleneck is closing the loop between perception and action: a humanoid must not just “understand” a scene, but translate that understanding into precise locomotion, dexterous manipulation via different types of hands, balance, collision avoidance, and task sequencing. Chinese firms also face compute and software-stack constraints, which we discussed at length, especially around GPU access, robotics middleware, Vision-Language-Action (VLA) model training3, and edge deployment, making the “embodied AI race” less about one giant foundation model and more about integrating perception, planning, memory, control, and embodied data pipelines into a reliable platform.
Each “big brain” company is coming at the above challenges from a different position, bringing different types of strengths. SenseTime, for example, has considerable experience with computer vision. SenseTime’s most relevant experience is not humanoid hardware per se, but the software stack that maps onto the humanoid “big brain,” i.e., perception, multimodal understanding, navigation, world modeling, and task-level interaction. Its Wu Neng embodied intelligence platform is explicitly pitched for humanoid agents and quadrupeds, drawing on SenseFoundry visual AI for environmental recognition and segmentation, SenseAuto-style navigation for path planning and obstacle avoidance, and SenseNova for natural-language interaction, memory, contextual awareness, and expressive communication. We discussed the SenseNova NEO architecture, which is innovative because it moves away from the standard “vision encoder + projector + language model” stack and treats multimodality as native architecture, not a bolt-on extension of an LLM. SenseTime claims that NEO uses native patch embedding, 3D rotary position encoding, and native multi-head attention to fuse vision and language more deeply inside the model, rather than merely aligning image tokens to a language model after the fact. That matters for robotics because a humanoid “big brain” needs spatial understanding, object relations, scene detail, and language-grounded reasoning in the same representational space.4
The more direct robotics play is ACE Robotics, led by SenseTime co-founder Wang Xiaogang. The company’s stated focus is building the robot “brain”: models, navigation, and operation capabilities, with a human-centric data strategy that collects vision, touch, force, and behavioral data from people rather than relying only on robot-body-specific datasets. That is directly applicable to the big brain problem because humanoids need world models that understand physical interaction, not just image recognition. SenseTime also claims transferable advantages from autonomous driving: large-scale data flywheels, safety/data-quality systems, and navigation/perception modules that can move from fixed cameras or vehicles onto mobile robots.
Tencent Robotics’ TAIROS is positioned as a full-stack embodied AI platform that attempts to solve a core fragmentation problem in robotics: the lack of a unified architecture spanning perception, reasoning, and action across heterogeneous robot forms. The system integrates three tightly coupled modules—Embodied Perception, Embodied Planning, and Perception-Action—built on top of LLMs, VLMs, and VLA models, and organized under a Sensorimotor Latent Action Policies (SLAP)-derived architecture5 that fuses sensing and action for real-time responsiveness while reserving deliberative planning for complex tasks. Technically, its differentiation lies in 1) a hierarchical scene-graph memory that continuously fuses multi-modal inputs into structured, queryable representations; 2) an LLM-based planning layer using CoT, Monte Carlo Tree Search (MCTS), and tool-calling for long-horizon task decomposition; and 3) a dual execution stack combining VLA-based manipulation and RL-based locomotion. Crucially, Tencent emphasizes cross-embodiment generalization—deployment across humanoids, quadrupeds, and industrial arms—and modular accessibility via APIs/SDKs, enabling both end-to-end agent deployment and standalone service invocation. The platform has already been validated across industrial and domestic scenarios and integrated with multiple third-party robot OEMs, signaling an early push toward ecosystem standardization rather than vertically integrated hardware dominance.
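For readers unfamiliar with the term, a hierarchical scene-graph memory is essentially a tree of places and objects that gets updated as new sensor readings arrive and can then be queried by a planner. The Python sketch below is a generic illustration of that idea only. It is not TAIROS code, the class and method names are hypothetical, and the "fusion" here is reduced to merging attributes and timestamps.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Sequence


@dataclass
class SceneNode:
    """Hypothetical node: a room, a piece of furniture, or an object."""
    name: str
    kind: str
    attributes: Dict[str, object] = field(default_factory=dict)
    children: Dict[str, "SceneNode"] = field(default_factory=dict)
    last_seen: float = 0.0


class SceneGraph:
    """Toy hierarchical memory: a tree addressed by paths like room/table/cup."""

    def __init__(self) -> None:
        self.root = SceneNode("world", "root")

    def observe(self, path: Sequence[str], kind: str,
                timestamp: float, **attributes: object) -> SceneNode:
        """Fuse one observation: create missing nodes, then merge attributes."""
        node = self.root
        for name in path:
            node = node.children.setdefault(name, SceneNode(name, "unknown"))
            node.last_seen = max(node.last_seen, timestamp)
        node.kind = kind
        node.attributes.update(attributes)
        return node

    def find(self, kind: Optional[str] = None, **attributes: object) -> List[str]:
        """Return paths of nodes matching the kind and every given attribute."""
        matches: List[str] = []

        def walk(node: SceneNode, path: List[str]) -> None:
            for child in node.children.values():
                child_path = path + [child.name]
                same_kind = kind is None or child.kind == kind
                same_attrs = all(child.attributes.get(k) == v
                                 for k, v in attributes.items())
                if same_kind and same_attrs:
                    matches.append("/".join(child_path))
                walk(child, child_path)

        walk(self.root, [])
        return matches


graph = SceneGraph()
graph.observe(["kitchen"], "room", 1.0)
graph.observe(["kitchen", "table"], "furniture", 1.0)
# A camera reports the cup; a later force reading adds its weight.
graph.observe(["kitchen", "table", "cup"], "object", 2.0,
              color="red", graspable=True)
cup = graph.observe(["kitchen", "table", "cup"], "object", 3.0, weight_kg=0.3)
graspable = graph.find(kind="object", graspable=True)
```

In a setup like this, a language-model planner would not need raw camera frames to decompose "bring me the red cup" into steps; it could query the structured memory for graspable objects and where they sit. Whether TAIROS works this way internally is not something we can confirm.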
From a business model perspective, TAIROS aligns closely with Tencent’s broader strategy of platformization and cloud-mediated service layers rather than direct hardware leadership. The modular API/SDK design mirrors Tencent’s playbook in gaming, social, and cloud—creating a middleware layer that aggregates developers, data, and services—while enabling monetization through cloud inference, simulation environments, and developer tooling. In this sense, TAIROS could function as an “Android layer for robotics,” particularly if Tencent leverages its existing strengths in WeChat ecosystems, cloud infrastructure, and AI services to drive developer adoption and data flywheels. Compared to its competitors, this contrasts with Huawei’s more vertically integrated stack—Ascend + CANN + hardware systems—and Alibaba’s model-centric cloud approach—Qwen + enterprise AI—positioning Tencent in a coordination role across what are clearly now fragmented robotics OEMs. If successful, TAIROS could anchor a multi-sided platform linking robot manufacturers, application developers, and enterprise users—particularly in service robotics and smart environments—while also reinforcing Tencent’s cloud and AI inference businesses. The key execution risk is whether Tencent can achieve sufficient standardization and developer lock-in before competing ecosystems—especially those tied to proprietary hardware stacks—consolidate the market. We did not have a lengthy discussion on this issue, but will in the future, as things are clearly in the early stages here. Companies such as Tencent, other hyperscalers, industry players, and investors are trying to determine the level of resources like scarce compute to devote to the big brain issue absent major use cases and clearer revenue streams, and in the face of massive compute demand from other parts of their businesses.
Looking Ahead: Compute, World Models, Revenue Generation
It is clear that the sector is very competitive and there is significant pressure to generate revenue and get investment from outside sources. Government support is available for promising companies, but industrial planners are not choosing winners, given all the uncertainties. One senior executive at a leading Chinese humanoid robot company told us that, unlike the sector in the US, where investors are willing to wait three to four years before getting a return on investment through some type of equity event, in China there is considerable pressure from investors to go public as quickly as possible, given that the sector, for now, is red hot. This is likely going to be the case over the next year for some of the companies specialized in certain key parts of the supply chain, such as actuators or hands, where they may have a perishable IP advantage over their rivals and have already been able to raise considerable funding through several rounds of investors. Significant investment in the sector appears to be coming from places like the Middle East, while US investors remain conflicted because of concerns about how the US government will treat investment in the humanoid robotics sector, which is not specifically AI but AI-adjacent.
We asked many of the Chinese humanoid robot companies what they thought about competition from US makers such as Tesla. All noted that Tesla had still not produced Optimus 3 for public scrutiny and were skeptical about Tesla’s ability to ramp up high-volume production of Optimus in the coming year. EngineAI’s view of Tesla/Optimus appears skeptical but watchful: they acknowledge Tesla’s prominence but note that they have not yet seen a convincing real-world robot demonstration comparable to what Chinese firms are commercializing, and they frame Musk’s mass-production timelines as still largely promissory, with meaningful scale perhaps not visible until around 2028. The contrast they draw is between the US model of longer research cycles before commercialization and China’s faster, investor-pressured push from prototype to market, where companies like EngineAI are expected to commercialize, gather field data, iterate quickly, and even pursue IPO paths on compressed timelines.
In fact, all US humanoid robotics companies still depend heavily on Chinese companies for key parts of their supply chain if they are to reach high-volume production. US developers remain structurally dependent on Chinese supply chains for the core “body stack” components that determine cost, scalability, and reliability at production volumes. The most critical is rare-earth permanent magnets (surprise, surprise), which underpin the high-torque, compact electric motors used in nearly every joint; China’s dominance in NdFeB magnet production, and its ability to impose export licensing, has created a direct chokepoint for scaling humanoid deployments. Closely linked are integrated actuator/joint modules, where Chinese suppliers are rapidly industrializing motor–gearbox–sensor assemblies at price points Western firms struggle to match, making them central to any viable path to sub-$20k–$30k humanoids.
All of this is reinforced by dependence on precision reducers—harmonic and rotary vector (RV) gears—and planetary roller screws6, both essential for smooth, high-load motion and still heavily sourced from China at scale. Batteries and power electronics form another constraint, as mobile humanoids require dense, cost-efficient energy systems tied to China’s dominant lithium-ion ecosystem. Finally, robot-specific sensors—force/torque sensors, encoders, and increasingly lidar/vision modules—round out the stack, with Chinese firms offering competitive performance and rapid iteration. The net effect is that while US firms may have some limited advantages in terms of the AI “big brain,” Chinese firms, including many we spoke to, retain disproportionate leverage over the electromechanical substrate that ultimately determines whether humanoid robots can be produced affordably and at scale.
The six Chinese Optimus suppliers most often cited as having moved into Tesla qualification or validation, for example, are Sanhua Intelligent Controls and Ningbo Tuopu for actuator/joint modules, Green Harmonic for harmonic reducers, Wuzhou New Spring and/or Xinjian/Seenpin-type transmission suppliers for planetary roller screws, Shuanghuan Transmission for precision reducers, and Keli/Coliy Sensor for six-axis force/torque sensing. The list remains somewhat fluid because Tesla is still iterating Optimus hardware and has not publicly confirmed a definitive Tier-1 roster, but the pattern is clear: Tesla is leaning heavily on China’s EV-to-robotics supply chain for the “body” of Optimus—actuation, transmission, precision machining, sensors, and fast cost-down manufacturing. That dependence is even sharper for rare-earth permanent magnets, because Optimus actuators use compact high-torque motors that require NdFeB magnets, often with heavy rare-earth inputs such as dysprosium or terbium for heat resistance. Elon Musk acknowledged in April 2025 that Optimus production was affected by China’s rare-earth magnet export licensing regime, and the controls covered both rare-earth materials and finished magnets, requiring MOFCOM licenses. In other words, even if Tesla can qualify multiple Chinese mechanical suppliers—or move some final assembly to the US, Mexico, or Thailand—the magnet bottleneck gives Beijing a potential choke point over Optimus scale-up unless Tesla can redesign motors, secure licensed civilian-use supply, or develop non-Chinese magnet sources at comparable cost and quality.
In the other direction of potential geopolitical pressure on the sector in China, NVIDIA’s robotics edge platforms—especially Jetson Orin modules and developer kits—remain widely available in China and are already embedded in parts of the Chinese robotics stack, including humanoid and autonomous-machine prototypes that need onboard perception, sensor fusion, motion planning, and local AI inference rather than data-center-scale training. A number of the companies we spoke with use this platform. NVIDIA classifies Jetson modules and developer kits under ECCN 5A992.c, a lower-control encryption category, and NVIDIA’s own materials list China among supported regions for newer Jetson products; these systems are therefore not currently caught by the main China advanced-computing GPU controls aimed at H100/H200/Blackwell-class accelerators. But this could become politically exposed: as humanoid robots move from demos to industrial deployment and potential dual-use autonomy, Orin/Jetson-class edge AI hardware may draw more scrutiny as a “robotics enabler,” particularly if there is an attempt to expand beyond controlling advanced AI training compute to controlling embodied-AI compute, autonomy stacks, and high-performance edge inference.
Finally, the concern in the US Congress is that China is turning humanoid robotics into a strategic industry faster than the United States by combining state support, low-cost manufacturing, dense component supply chains, and embodied-AI deployment, with attention now on firms such as Unitree, AgiBot, and UBTech. Lawmakers increasingly frame the issue not just as commercial competition but as a security problem: Chinese humanoids could collect sensitive data, be remotely updated or controlled, and enter US institutions through procurement channels. Recent bipartisan proposals would bar federal use of Chinese humanoid robots, while hearings have emphasized that the United States has overinvested in AI software while underinvesting in robotics hardware (actuators, sensors, batteries, rare-earth magnets, and so on) and in manufacturing scale. The bottom-line fear is that China could dominate the “physical AI” stack the way it already leads in drones and parts of EV supply chains. Given that, as our discussions showed, we are very early in the development and use-case cycle, these fears seem at best premature and at worst a vehicle for generating fear around China-origin technology. Arguably, like EVs and batteries, humanoid robotics should be an area where US companies collaborate with globally leading Chinese firms, as Tesla is already doing. The sector is not yet fully caught up in US-China competition: Unitree robots are already on the ground in the US and there are no rules yet that would prevent imports. Encouraging Chinese investment in this sector in the US should be a “no-brainer” during trade talks, but even more than with Chinese EVs, having Chinese humanoid robots like the T800 roaming the streets of US cities will require some effort.
The author wishes to thank Tech Buzz China, Rui Ma, and the team for arranging such a great series of visits and deep discussions with leading Chinese humanoid robotics companies. Tech Buzz China maintains an excellent database of all the companies in China that are part of the expanding supply chain for humanoid robots here.
From Tech Buzz China: Fair+ 2026, which we attended last week in Shenzhen, sits at the intersection of industrial automation and the humanoid robotics boom. The 34 humanoid robot makers in our database exhibiting at the Fair have raised more than $6 billion in cumulative capital, a total dominated by UBTech, Unitree, X Square, LimX Dynamics, and the Zhejiang Humanoid Robot Innovation Center. The supporting cast of motor, reducer, sensor, and end-effector suppliers represents another 100+ exhibitors, spanning everything from public-company giants (Kinco, Leader Harmonic, Zhaowei, Keli, Ampron) to sub-$10M seed-stage specialists.
The following exhibitors are foreign giants, major Chinese public companies, or specialized startups.
In China’s 15th Five-Year Plan framing, embodied AI is treated less as a narrow robotics category than as a strategic “future industry” and new growth engine, placed alongside quantum technology, brain-computer interfaces, 6G, and fusion. The core idea is to fuse AI models with physical systems—humanoid robots, dexterous hands, mobile platforms, industrial robots, and smart vehicles—and use them to upgrade manufacturing, logistics, services, and eventually household applications. Beijing’s emphasis is on full-chain industrial cultivation: sensors, actuators, servo drives, materials, control systems, edge chips, software stacks, and application scenarios. In this sense, embodied AI becomes both a showcase for “AI+” and a lever for broader supply-chain localization and industrial upgrading. We saw evidence of this framing throughout the week.
EngineAI’s Series B should be read as a state-backed, industrially anchored financing round, not simply a venture bet on humanoid robotics hype. The participation of Henan Investment Group–affiliated funds points to provincial-level industrial-policy support and likely incentives around local manufacturing, supply-chain buildout, and robotics cluster formation, while Luxshare-affiliated capital gives EngineAI a potentially valuable bridge into high-volume electronics manufacturing, precision assembly, component sourcing, and downstream industrial customers. That combination suggests EngineAI is being positioned less as a pure “robot foundation model” company and more as a hardware-systems integrator with a credible path to scaling production and deploying humanoid robots in factories, logistics, inspection, and other labor-intensive settings. The round therefore signals a broader Chinese playbook for embodied AI: marry local-government capital, manufacturing champions, and fast-moving robotics startups to accelerate commercialization before the underlying “big brain” problem is fully solved.
Vision-Language-Action (VLA) models are the critical but still immature bridge between cognition and control in humanoid systems. In the Chinese context, the challenge is not simply training a large VLA model, but building one that can generalize across tasks, environments, and robot embodiments while maintaining real-time reliability. VLA systems must learn mappings from high-dimensional visual and linguistic inputs to continuous control outputs—gripper trajectories, joint torques, locomotion adjustments—often with sparse, noisy, and expensive-to-collect training data. Unlike pure LLM or VLM scaling, VLA performance is bottlenecked by embodied data: teleoperation datasets, multi-view sensor streams, and action trajectories, which remain limited in China despite aggressive simulation efforts. This forces heavy reliance on synthetic data, sim-to-real transfer, and data augmentation pipelines, all of which introduce brittleness when deployed in unconstrained environments.
Technically, VLA models must solve several tightly coupled problems: 3D spatial grounding (understanding object geometry and relative positioning), temporal coherence (maintaining task state over long horizons), and cross-modal alignment (linking language instructions to visual affordances and feasible actions). Architectures like diffusion policies, transformer-based action tokenization, and flow-matching models improve performance, but they are compute-intensive and difficult to optimize for low-latency edge deployment, especially on domestically available hardware stacks. For Chinese developers, this is compounded by weaker software ecosystems compared to CUDA-centric tooling, making training, debugging, and deploying large-scale VLA models more complex. As a result, most current systems, like Tencent’s TAIROS, still decouple planning (LLM) from execution (VLA + RL), reflecting that a fully unified “end-to-end” embodied foundation model remains out of reach in the near term.
The practical claim is that NEO improves the efficiency-performance tradeoff: SenseTime says it can reach strong visual perception with about 390 million image-text pairs, roughly one-tenth the data volume of comparable models, while supporting arbitrary-resolution and long-image inputs and extending toward video and embodied intelligence. For humanoids, the key relevance is that this architecture is better suited to spatial reasoning, video/world modeling, and task-level embodied interaction than older modular VLM designs, while smaller 2B/9B open models point toward edge or robot-side deployment.
A SLAP-derived architecture refers to a robotic learning and control framework built on the principles of Sensorimotor Latent Action Policies (SLAP)—a paradigm in which a model learns to map sensory inputs (vision, proprioception, etc.) into a latent action space that encodes reusable, temporally extended behaviors rather than issuing raw, low-level motor commands.
In practice, a SLAP-derived system has three defining characteristics. First, it uses a latent action representation—a compressed, learned space of “skills” or motion primitives (e.g., grasp, push, reorient) that sit between high-level intent and joint-level control. Second, it is sensorimotor and multimodal, fusing visual and proprioceptive inputs directly into the policy so that actions are conditioned on real-time state, not just abstract goals. Third, it is typically hierarchical, where a higher-level policy (the “big brain”) selects or modulates latent actions, while lower-level controllers (the “small brains”) execute them through fast feedback loops.
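The three characteristics above can be illustrated with a toy control loop. This is a minimal sketch under stated assumptions, not Tencent’s implementation or any published SLAP code: the `Observation` fields, the three skill names, the single-joint state, and the proportional gain are all hypothetical stand-ins chosen to keep the example small.

```python
from dataclasses import dataclass

# Hypothetical latent "skill" library: each latent action is reduced here to a
# target position for a single joint. A real system would learn this space.
SKILL_TARGETS = {"search": 0.0, "grasp": 0.5, "lift": 1.0}


@dataclass
class Observation:
    object_visible: bool   # stand-in for a visual feature
    gripper_closed: bool   # stand-in for proprioceptive state
    joint_pos: float       # one joint position, for simplicity


def high_level_policy(obs: Observation) -> str:
    """'Big brain': choose a latent skill from fused visual + proprioceptive state."""
    if not obs.object_visible:
        return "search"
    if not obs.gripper_closed:
        return "grasp"
    return "lift"


def low_level_controller(skill: str, joint_pos: float,
                         gain: float = 0.5, steps: int = 20) -> float:
    """'Small brain': fast proportional feedback loop toward the skill's target."""
    target = SKILL_TARGETS[skill]
    for _ in range(steps):
        joint_pos += gain * (target - joint_pos)
    return joint_pos


def control_step(obs: Observation) -> tuple[str, float]:
    """One hierarchical step: select a latent action, then execute it."""
    skill = high_level_policy(obs)
    return skill, low_level_controller(skill, obs.joint_pos)


if __name__ == "__main__":
    obs = Observation(object_visible=True, gripper_closed=False, joint_pos=0.0)
    skill, pos = control_step(obs)
    print(skill, round(pos, 3))
```

The point of the split is that the high-level policy never emits joint commands; it only names a skill, and the low-level loop handles convergence.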
The result is a system that is more data-efficient and generalizable than end-to-end torque prediction: instead of relearning control for every task, the robot composes and adapts a library of latent skills. This makes SLAP-derived architectures particularly relevant for humanoids, where the combinatorial space of tasks and contact dynamics makes direct low-level policy learning brittle and inefficient.
The simplest way to frame it: harmonic gears are the precision compact rotary solution; RV gears are the heavy-duty rotary solution; planetary roller screws are the high-force linear-actuation solution. In humanoids, the arms and wrists often favor harmonic reducers because they need compactness and fine positioning; hips, waist, and high-load rotary joints may favor RV/cycloidal reducers because stiffness and shock loading matter more; and leg architectures increasingly use planetary roller screws inside electric linear actuators because they can deliver hydraulic-like force density without hydraulics.
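The rule of thumb for transmissions can be written down as a simple lookup. This is only an illustrative sketch of the framing given in the text: the joint labels and the function name `suggest_transmission` are hypothetical, and real designs vary by vendor and robot generation.

```python
# Rule-of-thumb mapping from joint location to transmission type, following the
# framing in the text. Joint labels are illustrative, not an industry standard.
TRANSMISSION_BY_JOINT = {
    "arm": "harmonic reducer",              # compact, fine positioning
    "wrist": "harmonic reducer",
    "hip": "RV/cycloidal reducer",          # stiffness and shock loading
    "waist": "RV/cycloidal reducer",
    "leg_linear": "planetary roller screw", # high-force linear actuation
}


def suggest_transmission(joint: str) -> str:
    """Return the transmission type the rule of thumb favors for a joint."""
    try:
        return TRANSMISSION_BY_JOINT[joint]
    except KeyError:
        raise ValueError(f"no rule of thumb for joint {joint!r}") from None


if __name__ == "__main__":
    for joint in ("wrist", "hip", "leg_linear"):
        print(joint, "->", suggest_transmission(joint))
```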
For China, the important point is that humanoid robotics is exposing where the domestic industrial base is strongest and weakest. China is already relatively strong in motors, bearings, harmonic reducers, casting/machining, batteries, sensors, and low-cost assembly, which is why Chinese humanoid firms can push aggressive price points. Chinese robotics firms benefit from dense component supply chains and government-backed industrial support. But the highest-end planetary roller screws, precision bearings, encoders, torque sensors, and long-life reducers remain areas where foreign suppliers or foreign process know-how still matter. That is why Tesla Optimus supply-chain discussions keep coming back to Chinese actuator and transmission suppliers: the “body” of the robot is increasingly an advanced manufacturing problem as much as an AI problem.








