Are thought and movement truly different?
A recent article published in OhmyNews examines the current state of humanoid robot technology and argues that human physical labor is, in essence, difficult for machines to replace. Its main grounds are the sophistication of sensorimotor skills centered on the hand, the depth of experience that humans have accumulated over a long period of time, and the difficulty of fully transferring learning from virtual environments into the real world. Such claims resonate with intuitive human experience and easily lead to a skeptical judgment about robotic technology. This perspective is closely connected to what is commonly known as Moravec’s paradox: the idea that, for machines, cognitive operations such as calculation and reasoning are relatively easy, whereas bodily skills that combine sensation and movement are extremely difficult. The article also presents the so-called ‘data gap’ argument, according to which robots cannot easily secure sufficient data in the real world.
However, such reasoning presupposes the separation of body and cognition and does not sufficiently reflect recent research in neuroscience and artificial intelligence. First, it is necessary to reconsider the assumption that sensorimotor skills are inherently more difficult than cognitive operations. The traditional skeptical view has held that, no matter how advanced a robot’s computational abilities may be, there are fundamental limits to how precisely it can act in the physical world. Yet recently developed models suggest that this distinction may stem from earlier AI architectures that separated perception, language, and action by design. In structures that integrate visual information, linguistic context, and physical action within a single system, action does not simply appear as the result of a command but functions as part of the process of understanding. Under such conditions, the system does not first think and then move; rather, it continuously forms an understanding of the world in the very process of moving. In this kind of structure, it becomes difficult to distinguish ‘thought’ and ‘movement’ as separate stages. Accordingly, the conventional assumption that sensorimotor skills belong to a fundamentally different domain from cognition can no longer be maintained as it stands.
Next, the claim that machines lack the vast experience humans have accumulated over a long time also needs to be reconsidered from the same perspective. The skeptical position holds that it is impossible for robots to catch up, in a short period, with the quantity and depth of experience that humans have built through evolution and learning. However, recent research shows that experience does not necessarily have to be accumulated in the same way. Robots can be trained in virtual environments that model various physical conditions such as gravity, friction, and variations in objects, and in these environments they can accumulate vast amounts of experience within a very short time. Moreover, by integrating data collected by different robots operating in different environments, an approach is emerging in which a robot can generalize to actions it never directly learned in a given situation by drawing on experience gathered elsewhere. This approach goes beyond the conventional view that understands experience solely in terms of ‘long-term accumulation.’
One is reminded of a scene in which an excavator operator with decades of experience said, ‘The sense in my hands, built up through years and years of time, can never be imitated by an AI excavator.’ His words contain the pride and conviction that arise from long labor. At the same time, however, they may also express a kind of desperation: only by believing this can human labor still be seen as necessary, and only then can his own livelihood be secured. Perhaps for him, artificial intelligence and automation are not future technologies that have yet to arrive, but rather an unfamiliar ‘new time’ that may overturn the order of time in which he has long lived, and thus something perceived as a fearful darkness.
The so-called ‘sim-to-real gap,’ the problem that learning in virtual environments does not transfer well to the real world, can also be reconsidered in a similar way. In traditional views, the complexity and uncertainty of the real world have been regarded as major limitations on robotic learning. Recently, however, methods have been developed that repeatedly train systems while varying physical conditions, an approach often called domain randomization. The aim of such methods is not to learn actions tailored to a single environment, but to form more robust structures capable of responding to a variety of changing situations. As technologies that combine not only visual information but also tactile and force feedback are added, the precision of physical interaction is also gradually improving. This approach treats the uncertainty of reality not as a mere obstacle but as part of learning, thereby suggesting a direction different from that of traditional skepticism.
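For readers who want a concrete picture, the idea of training while varying physical conditions can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not the method of any particular robotics system: the sliding-block task, the parameter ranges, and every name here (`Physics`, `sample_physics`, `slide_distance`, `mean_error`, `train`) are hypothetical and chosen for clarity.

```python
import random
from dataclasses import dataclass


@dataclass
class Physics:
    gravity: float   # m/s^2
    friction: float  # kinetic friction coefficient (dimensionless)


def sample_physics(rng: random.Random) -> Physics:
    """Draw one randomized environment; the ranges are illustrative assumptions."""
    return Physics(gravity=rng.uniform(9.0, 10.5), friction=rng.uniform(0.2, 0.8))


def slide_distance(push_speed: float, physics: Physics) -> float:
    """Distance a block slides after release at push_speed: v^2 / (2 * mu * g)."""
    return push_speed ** 2 / (2.0 * physics.friction * physics.gravity)


def mean_error(push_speed: float, envs: list, target: float) -> float:
    """Average absolute miss distance over a batch of environments."""
    return sum(abs(slide_distance(push_speed, e) - target) for e in envs) / len(envs)


def train(envs: list, target: float, rng: random.Random,
          steps: int = 500, start: float = 1.0) -> float:
    """Hill-climbing random search: keep a perturbation only if it lowers the error."""
    best, best_err = start, mean_error(start, envs, target)
    for _ in range(steps):
        candidate = max(0.0, best + rng.gauss(0.0, 0.1))
        err = mean_error(candidate, envs, target)
        if err < best_err:
            best, best_err = candidate, err
    return best


# Train one push speed against many randomized environments at once,
# rather than tuning it to a single fixed gravity and friction.
rng = random.Random(0)
train_envs = [sample_physics(rng) for _ in range(200)]
push_speed = train(train_envs, target=1.0, rng=rng)
```

The point of the sketch is the training loop's input: the single learned parameter is judged against a whole batch of differing physical conditions, so what it settles on is a compromise that holds up across them rather than a fit to one environment.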
These technological changes align well with findings from neuroscience that illuminate the structure of human cognition. Recent studies show that human thinking does not occur with sensation and movement separated from it, but is formed within tightly interconnected networks. As tasks become more complex, cognitive processing areas and motor-related areas are activated simultaneously, and as tasks become more difficult, the connectivity between them becomes even stronger. This suggests that movement is not merely a stage that executes the result of thought, but is itself part of the thinking process. Therefore, treating human bodily abilities as an independent domain separated from cognition does not adequately reflect how humans actually function.
In the end, the skeptical judgment presented in the OhmyNews article is grounded in an intuitive understanding of human experience and bodily ability, but its central premise, that cognition and the body are separate, can no longer be regarded as self-evident in light of recent scientific research. Three developments point in the same direction: artificial intelligence research that is moving toward integrated structures of vision, language, and action; environments in which data are generated and transferred; and neuroscientific understandings of cognition in which sensation and movement are intertwined. That direction is the need to reconstruct the existing framework that divides ‘thinking’ and ‘acting’ into separate domains. From this perspective, the claim that human physical labor cannot, in principle, be replaced by machines is, despite its strong intuitive appeal, difficult to regard as having sufficient theoretical grounding.
Yu DaeChil (Thomas Philosophia Schola Ockham Institute & Happy Workers)
