Introduction
As part of a cooperation project with the Deutsches Elektronen-Synchrotron (DESY), a safe, learning-enabled control framework is to be developed for an autonomous mobile robot that will be used for radiation measurement and maintenance tasks in particle accelerator tunnels. The robot must recognize obstacles on its own, avoid collisions with them, and carry out its tasks autonomously. Conventional robotics approaches work with predefined trajectories, which restrict them to an isolated work area and a controlled environment free of obstacles [1]. The environment of particle accelerator tunnels is considered semi-dynamic: although it is a self-contained environment with a clear radius of action, obstacles such as cables or highly sensitive equipment can still occur in the vicinity of the robot [2]. For the robot system to navigate and operate safely, it must be able to adapt to unforeseen situations such as the appearance of obstacles and avoid any collisions with systems and equipment [2]. A promising approach for developing such a robotic system in a semi-dynamic environment with unforeseen obstacles is the so-called synchronized predictive digital twin. It combines the safe exploration of environments in simulation, without causing damage in the real environment, with the predictive and adaptive nature of machine learning algorithms (here in particular reinforcement learning). Before such a synchronized predictive digital twin can be developed, however, the challenges of synchronized digital twins must first be understood.
This blog post presents an overview of the key technical and computational challenges in the development and operation of synchronized predictive digital twins. It then introduces a conceptual framework for a synchronized predictive digital twin that allows real-time behaviour adaptation to unforeseen obstacles while addressing these challenges as well as possible.
Synchronized Digital Twins
Before diving into challenges, it is important to understand what constitutes a synchronized digital twin. Following Baidya et al. [7], a synchronized digital twin can be defined as “a ‘physical entity’ consisting of objects, processes, interacting ambience and exogenous conditions, which are digitally reproduced in a counterpart ‘digital entity’, and a bidirectional information flow between the physical and digital entity ensures the state and control information exchanges between them, supporting synchronous or asynchronous behavioral influence on each other.”
Challenges
Now that we know the characteristics of synchronized digital twins, we can take a closer look at their key challenges. To compile a comprehensive list of challenges, we selected five recent survey and review papers, found via Google Scholar with the following search query:
(“Bidirectional” OR “Synchronized”) AND “Digital Twin” AND “Robotics” AND “Challenges” AND (“Survey” OR “Review”)
We filtered the results by publication date (publications from 2022 onwards) in order to create a list of the most recent and thus still relevant challenges. Due to the time frame of this project, the first five publications were selected for further analysis [3-7]. The challenges identified from the selected publications were divided into the following five categories:
- Technical and Computational,
- Security and Reliability,
- Economic and Organizational Barriers,
- Network and Infrastructure,
- Ethical and Social Concerns.
Because the framework is still at an early stage of development, and to keep this post readable, we consider only the category of technical and computational challenges here.
Technical and Computational Challenges
We were able to identify a total of six distinct technical challenges, which we will describe here in more detail.
Synchronization Accuracy and Delay
One of the key challenges of synchronized digital twins is the real-time synchronization between the physical robot system and its digital twin. The framework must process continuous sensor data streams (such as streams of the joint angles of a robot arm) at high frequency. This requires communication protocols that are optimized for speed and can buffer data efficiently to minimize delays. For a robotic arm, joint angle sensors may need to transmit data at 1000 Hz or higher to capture rapid movements accurately. The system should be able to process this high-volume data stream [7].
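To make the buffering idea concrete, here is a minimal sketch of a bounded buffer for high-rate sensor samples. The class name, the `(timestamp, value)` tuple layout and the capacity are illustrative assumptions of ours, not part of any cited framework:

```python
from collections import deque


class SensorBuffer:
    """Bounded buffer for high-rate samples, e.g. joint angles at ~1000 Hz.

    Old samples are evicted automatically once capacity is reached, so the
    consumer always sees the most recent window and memory stays bounded.
    """

    def __init__(self, capacity=1000):
        self._buf = deque(maxlen=capacity)  # oldest entries dropped automatically

    def push(self, timestamp, value):
        self._buf.append((timestamp, value))

    def latest(self):
        """Return the newest sample, or None if the buffer is empty."""
        return self._buf[-1] if self._buf else None

    def window(self, seconds):
        """Return all buffered samples within `seconds` of the newest one."""
        if not self._buf:
            return []
        cutoff = self._buf[-1][0] - seconds
        return [s for s in self._buf if s[0] >= cutoff]
```

A real implementation would sit behind the communication protocol and feed the twin's state estimator; this sketch only shows the bounded-window principle.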
Data Reliability and Consistency
Keeping the data between the physical system and the digital twin consistent calls for a synchronization logic that coordinates state changes between the two. Inconsistencies can be caused by network issues (e.g. Wi-Fi interference causing packet loss), sensor failures (e.g. a failing temperature sensor on a welding robot) or computing latencies (e.g. complex calculations in the digital twin, such as physics simulations, that introduce lag). The architecture must therefore integrate mechanisms for conflict resolution and plausibility checks to ensure a reliable reflection of the current system status. In addition, continuous validation and filtering of the sensor data is necessary to provide simulations and predictive models with the most accurate input possible [3].
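A plausibility check of the kind described could be as simple as the following sketch for a single joint-angle reading; the value range and rate limit are made-up example numbers, not values from any of the surveyed systems:

```python
def plausible(prev, curr, dt, value_range=(-3.14, 3.14), max_rate=5.0):
    """Hypothetical plausibility check for a joint-angle sample.

    Rejects readings outside the physical range, or readings that change
    faster than `max_rate` rad/s between samples, which would indicate a
    sensor glitch or a dropped/duplicated packet rather than real motion.
    """
    lo, hi = value_range
    if not (lo <= curr <= hi):
        return False  # physically impossible value
    if prev is not None and dt > 0 and abs(curr - prev) / dt > max_rate:
        return False  # implausibly fast change between consecutive samples
    return True
```

Rejected samples would then be replaced by a prediction (e.g. from the twin's own model) rather than blindly propagated into the simulation.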
Standardization and Modularity
Another key challenge is the missing standardization and modularity of digital twin systems. Modular software design principles, standardized interfaces and interoperable communication protocols (e.g. common APIs defined through ROS interfaces that let different robot types, such as articulated arms or autonomous ground vehicles, interact with the digital twin system) would allow individual robot systems to be seamlessly integrated into the larger networked ecosystems common in industrial settings. This not only promotes flexibility and expandability, but also facilitates cross-manufacturer use and interaction with external digital twins or cloud platforms [3].
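To illustrate what such a common API could look like, here is a hypothetical sketch of a minimal interface that a twin system might define and that each robot type would implement. The class and method names are our own invention, not an existing ROS API:

```python
from abc import ABC, abstractmethod


class TwinConnectedRobot(ABC):
    """Hypothetical common interface so different robot types (articulated
    arms, ground vehicles, ...) plug into the twin system uniformly."""

    @abstractmethod
    def read_state(self) -> dict:
        """Return the current state as a flat dict of named values."""

    @abstractmethod
    def apply_command(self, command: dict) -> None:
        """Apply a control command produced by the digital twin."""


class ArticulatedArm(TwinConnectedRobot):
    """Example implementation for a six-joint arm."""

    def __init__(self):
        self._joints = [0.0] * 6

    def read_state(self):
        return {f"joint_{i}": a for i, a in enumerate(self._joints)}

    def apply_command(self, command):
        # The command format ({"joint_targets": {index: angle}}) is an assumption.
        for i, a in command.get("joint_targets", {}).items():
            self._joints[int(i)] = a
```

In practice, each method would map onto ROS topics or services, so the twin core never depends on a specific robot vendor.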
Resource Management
Digital twins with bidirectional synchronization and complex simulation models can require immense computing resources. The architecture should therefore implement adaptive resource management that switches between cloud, edge and local computing approaches depending on the application scenario (e.g. less critical sensors of a mobile robot, such as ambient light, might be processed on-board, while complex path planning calculations are offloaded to a cloud server). Distributing the computational load and prioritizing computationally intensive processes can help achieve an optimal balance between reaction speed and processing effort. In addition, efficient use and storage of training data is essential to minimize the training effort of learning agents (e.g. saving only processed object detection results instead of raw video feeds from a robot’s camera to reduce storage requirements) [4].
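A very simple placement rule for such local-versus-cloud decisions could be sketched as follows; the throughput and latency numbers are arbitrary placeholders, and a real scheduler would of course measure them at runtime:

```python
def place_task(estimated_flops, deadline_s,
               local_flops_per_s=1e9, cloud_flops_per_s=1e12,
               network_rtt_s=0.05):
    """Hypothetical placement rule: run locally if the deadline can be met
    on-board; otherwise offload to the cloud if network round-trip plus
    remote compute still fits; otherwise report that the deadline is missed."""
    local_time = estimated_flops / local_flops_per_s
    if local_time <= deadline_s:
        return "local"
    cloud_time = network_rtt_s + estimated_flops / cloud_flops_per_s
    if cloud_time <= deadline_s:
        return "cloud"
    return "reject"  # neither placement meets the deadline
```

Under these example numbers, a tiny filtering task stays on-board while a heavy path-planning job is offloaded, which matches the ambient-light versus path-planning split described above.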
Safety and Error Handling
Mechanisms for error detection and handling are required to ensure continuous operability. The architecture must be able to detect and independently compensate for synchronization interruptions (e.g. re-synchronization after a loss of network connectivity), network problems, and hardware and software errors without jeopardizing the integrity of the system. Precise detection of collisions or potential danger zones is particularly important in safety-critical applications in which robots interact with humans or sensitive hardware. The system should autonomously activate safety mechanisms, for example by switching to safety modes in real-time or taking automated evasive maneuvers [5].
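One common pattern for detecting synchronization interruptions is a heartbeat watchdog, sketched below. The class, the timeout value and the string return codes are illustrative assumptions:

```python
class SyncWatchdog:
    """Hypothetical watchdog: if no twin update (heartbeat) arrives within
    `timeout_s`, the twin is assumed out of sync with the robot and a safe
    stop is requested instead of acting on stale state."""

    def __init__(self, timeout_s=0.1):
        self.timeout_s = timeout_s
        self._last_update = None

    def heartbeat(self, now):
        """Record that a synchronization message arrived at time `now`."""
        self._last_update = now

    def check(self, now):
        """Return 'ok' while heartbeats are fresh, 'safe_stop' otherwise."""
        if self._last_update is None or now - self._last_update > self.timeout_s:
            return "safe_stop"
        return "ok"
```

The controller would call `check()` in every control cycle and degrade to a safe mode the moment it returns `"safe_stop"`.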
Bridging Simulation and Reality
Another key challenge with digital twins is the sim-to-real gap, the discrepancy between the simulated and real-world behavior of the robot. These differences are caused by physical interactions that cannot be modeled exactly (e.g. a simulated robotic gripper might not accurately model the deformation of soft materials, leading to differences in grasping behavior), differences in lighting (e.g. reflective surfaces can cause agents to perform differently in simulation versus reality), material deviations or unpredictable environmental factors. To close this gap, domain adaptation or domain randomization techniques can be integrated into the digital twin model. In addition, models that incorporate real sensor data into the simulation at runtime can improve the match between the digital and physical systems [6].
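The core of domain randomization is easy to sketch: before each simulated episode, the nominal physical parameters are perturbed so the learned policy cannot overfit to one exact model. The parameter names and the ±10% spread below are illustrative assumptions:

```python
import random


def randomize_physics(base_params, spread=0.1, rng=None):
    """Hypothetical domain-randomization step: perturb each physical
    parameter (mass, friction, ...) by up to +/-`spread` of its nominal
    value, to be applied once per simulated training episode."""
    rng = rng or random.Random()
    return {k: v * (1.0 + rng.uniform(-spread, spread))
            for k, v in base_params.items()}
```

A policy trained across many such perturbed models tends to treat the real robot as just one more sample from the randomized distribution.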
Related Work
Next, we analyzed similar approaches in the field of synchronized predictive digital twins with respect to the previously identified challenges, in order to find out which strategies they use to address them.
Matulis et al. [8]
The approach of Matulis et al. [8] focuses on using synchronized digital twins for control of a robotic arm with reinforcement learning, primarily leveraging Unity as the simulation environment. The authors mention the connection between the virtual and the physical robotic arm environment, but do not address the challenges of real-time synchronization. They only mention “the delay that existed between the Arduino physical twin controller and Unity communication”, but do not discuss how to address this issue. They describe the use of the “academy tool […] ml-agents”, which aims to “track and observe the environment” and send “data to TensorFlow via an external communication system”. It can be assumed that this tool implements sufficient consistency mechanisms. However, there is no detailed description of how these mechanisms are implemented or how the system deals with sensor failures or network delays. The paper focuses on a specific use case and does not mention any general modular design principles or standardized interfaces. However, their framework uses inherently modular components such as TensorFlow for learning and Unity for the creation of the virtual environment, which indicates at least a minimal amount of modularity. In addition, a statement of intent is made to explore “collaborative communication” with multiple instances of their system, which would presumably require a decent level of modularity and thus indicates efforts to modularize their framework. The authors mention the use of Unity as a simulation environment and TensorFlow to train their model, which indicates a high computational load. A declaration of gratitude to NVIDIA for “providing a Titan Xp series” graphics card also indicates resource-intensive training. However, they do not explain how resources are managed or how load distribution is optimized.
There is no detailed mention of efficient use and storage of training data; only the use of an encoder to simplify and denoise sensor data suggests minimal considerations regarding reduction of computational load. The authors mention mechanisms for error detection: they implement a curriculum learning paradigm that includes stages in which a collision detection mechanism is used, but they do not discuss it in more detail. Also worth mentioning is the reward function used for training, which incorporates physical robot limits such as the maximum angles of the robotic arm. They also mention a mechanism to check whether an action performed in simulation is viable in the real world; however, they dropped this approach due to a drastic increase in training time. The authors do not mention any domain adaptation or domain randomization techniques to improve the match between the digital and physical system. However, they discuss the use of parallelized training runs in simulation environments and how this could “improve stability through fault tolerance”, which indicates that they may consider domain randomization techniques in the future.
Kousi et al. [9]
The approach of Kousi et al. [9] focuses on the use of digital twins for flexible manufacturing, using real-time updates to optimize production lines. The authors mention that their approach “was deployed towards integrating and real time updating the virtual world” based on real-time sensor data and resource data. The approach builds on ROS as a communication layer, which indicates an effort to reduce latency and enhance synchronization capabilities, but no specific details on protocols or data buffering are given. They describe fusing the sensor data of their robot to create a virtual simulation environment of a factory floor. However, they do not explain in detail how data consistency is maintained between the physical robot, the digital twin and the factory floor obstacles (virtual and physical). They also do not elaborate on how the system deals with potential inconsistencies, sensor failures or network delays. The introduction of a “Unified Resource Data Model”, which consists of relevant mobile object data, and a subcomponent called “Resource status monitoring”, used for “real-time monitoring the status and location of each mobile resource”, does however suggest some kind of inherent inconsistency-prevention mechanism. The approach introduces different managers (Resource Manager, Sensor Manager, Layout Manager) that “support[] the registration of multiple resources”. Further, the use of ROS interfaces simplifies the addition of custom resources. Explicit reference is made to interoperability with other systems or standards, as they state that “quick integration with existing robotic applications” is a focus. They validated their implementation on a single PC with Ubuntu and ROS Indigo, which indicates low computational resource needs. A discussion of adaptive resource management, load balancing between cloud, edge and local computing approaches, or efficient use of training data is missing.
The authors do not mention any mechanisms for error detection or handling in the event of synchronization interruptions, network problems or hardware/software errors. However, they address the avoidance of collisions or potential danger zones through the use of an “occupancy grid using OctoMap library”. They mention the goal of enabling “online re-planning”. However, neither the learning procedure nor specific domain adaptation or randomization techniques are discussed for reducing the sim-to-real gap.
Mo et al. [10]
The Terra framework of Mo et al. [10] is designed for robot navigation in dynamic environments, integrating multiview multimodal sensing, real-time synchronization and adaptive decision-making. The authors emphasize the importance of a real-time information feedback loop between the physical system and the digital twin. The use of multi-view intelligent perception and mapping to capture the state of the robot and the environment on the fly indicates a focus on low latency. Actively updating the action policy and sending it back to the agent in real time, as well as calculating individual delays between all hardware components, indicates some form of synchronization and latency management, although a deeper discussion of this is missing. The paper describes a “comprehensive DT representation” that includes both a realistic virtual simulation of the physical environment with independently modeled noise and real-time status monitoring of the deployed agent. They mention the communication of states via Wi-Fi and state that “a closed information connection is established”. This connection remains active until the robot’s destination is reached. A deeper discussion of how to handle data inconsistencies due to network issues or other problems is missing. The authors state that their approach follows a modular design principle, although again without a deeper discussion. The framework itself seems heavily tailored to the specific use case, even though they mention that they use ROS in their communication layer, which allows extensions for other robot hardware and platforms. The authors mention that the robot used has minimal computing power and sensor capabilities, but that its navigation can be “handled by networked external sensors and using cloud computing”. Although there is no detailed discussion of adaptive resource management, load balancing or efficient use of training data, the authors claim flexibility to shift loads between systems.
They focus on “avoiding collisions and achieving tasks in challenging environments”, which implicitly addresses the need for error detection and handling. However, no specific mechanisms for detecting synchronization breakdowns, network problems or hardware/software failures are mentioned. They measure the deviation from the robot’s optimal navigation path, which they intend to keep minimal. In case of too large a deviation, “the robot will either collide with the obstacles […] or fall off from the edge of the scene”. This indicates that a collision detection mechanism is missing. The authors explicitly aim to close the gap between simulation and reality by implementing both real-to-sim and sim-to-real information flows. They mention the use of environmental noise (lighting, texture and movement noise) in the digital twin representation, which helps to make the simulation more realistic. The framework actively updates the action policy based on real-time data, which enables adaptation to the real world.
All of these results should be taken with a grain of salt, as we were unable to inspect the implementations of the frameworks and some assessments are based on assumptions.
We have summarized our results in the following table:
| Challenges | Matulis et al. [8] | Kousi et al. [9] | Mo et al. [10] | Our Approach |
|---|---|---|---|---|
| Synchronization Accuracy and Delay | - | ✓ | ✓ | ✓ |
| Data Reliability and Consistency | ✓ | ✓ | ✓ | ✓ |
| Standardization and Modularity | ✓ | ✓✓ | ✓ | ✓ |
| Resource Management | ✓ | - | ✓✓ | - |
| Safety and Error Handling | ✓ | - | - | ✓✓ |
| Bridging Simulation and Reality | - | - | ✓✓ | ✓✓ |
Proposed Framework
Our framework uses a simulated digital twin as an active companion that collaborates with a real robot. This twin runs in a GPU-supported simulation environment (constructed in NVIDIA IsaacSim) and can be instantiated multiple times. This means that several copies (with identical or varying parameters) of it can be created. This not only enables training runs where dozens of samples are collected at once, but also efficient randomization of parameter values (mass, friction, actuator properties, etc.) or states and confronts the agent with a variety of similar but unique situations. This is a common strategy to reduce the sim-to-real gap and make the strategies more robust. The digital twin serves as a training, validation and prediction environment in which the network can act and obtain an estimate of what will happen in the real world as a result of its behavior.
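The parallel instantiation with randomized parameters described above could be sketched as follows. The parameter names, spread and seed are illustrative; in the actual framework this would happen inside NVIDIA IsaacSim rather than in plain Python:

```python
import random


def spawn_twin_instances(n, nominal, spread=0.2, seed=42):
    """Hypothetical sketch: build n parameter sets for parallel twin copies.

    The first copy is identical to the nominal model of the physical robot;
    the remaining n-1 copies perturb each parameter (mass, friction, ...)
    by up to +/-`spread`, so training sees many similar but unique models."""
    rng = random.Random(seed)
    instances = [dict(nominal)]  # one faithful copy of the physical system
    for _ in range(n - 1):
        instances.append({k: v * (1.0 + rng.uniform(-spread, spread))
                          for k, v in nominal.items()})
    return instances
```

Each parameter set would then be handed to one simulation instance, and all instances collect training samples in parallel.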
Workflow
In contrast to traditional simulation-based pre-training, our framework accompanies the robot beyond the initial training process and into real execution. Dangerous or undesirable states are tracked, and when they occur, targeted simulated training runs are initiated to adapt the behavior to less critical states. As soon as critical states are detected in the real world, the state is recorded using the sensory data and then transferred to the digital twin. From there, multiple copies of this state are created and randomized variations around it are added. Parallel training is then initiated, allowing efficient exploration explicitly around this critical state. Once this state can be handled sufficiently well, another simulated validation run is performed against the real state to obtain a prediction of whether the adapted strategy will now handle the situation better. Once the strategy has been validated in simulation, it runs on the robot (with parallel state matching). This workflow is illustrated below.
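The loop just described can be condensed into a short control sketch. The function and callable names (`is_critical`, `train_on`, `validate`, `deploy`) are placeholders for the framework's actual components, not real APIs:

```python
def critical_state_workflow(state, is_critical, train_on, validate, deploy,
                            n_copies=16):
    """Hypothetical sketch of the described loop: when a critical real-world
    state is detected, clone it into several twin copies, retrain around it,
    validate the result in simulation, and only then redeploy to the robot."""
    if not is_critical(state):
        return "continue"            # keep executing the current strategy
    # Clone the recorded state for parallel, randomized training instances.
    copies = [dict(state, copy_id=i) for i in range(n_copies)]
    policy = train_on(copies)        # parallel training around the critical state
    if validate(policy, state):      # simulated validation against the real state
        deploy(policy)               # run on the robot, with state matching
        return "redeployed"
    return "retrain"                 # validation failed; another round is needed
```

The string return values stand in for the framework's actual state transitions between real execution, simulated training and validation.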
At the beginning of the workflow, the framework requires some form of pre-trained strategy. It does not need to be very capable, but a very unstable and unsuitable strategy will lead to frequent switching back and forth between the real robot and the simulation: the weaker the strategy, the more likely it is to maneuver the robot into states that require very complex recovery strategies. Since the twin already runs in a simulator that enables efficient training, it can be used directly for extensive pre-training before the described workflow begins.
Let’s summarize to what extent our proposed framework addresses the identified challenges of synchronized digital twins:
- Synchronization Accuracy and Delay In our framework, the digital twin is synchronized with the physical robot in real-time. To achieve this, we use ROS2 for communication, which builds on Data Distribution Service (DDS) middleware known for low-latency data transmission. However, we cannot yet fully assess these capabilities, as a real-world evaluation is still missing.
- Data Reliability and Consistency To keep the position of the digital twin and the physical robot consistent, we use a method that reduces the drift between the two systems (this flow is illustrated below), keeping their states consistent.
- Standardization and Modularity The presented framework was developed according to modular software design principles to enable a more general use. It uses NVIDIA IsaacSim to provide a simulation environment and utilizes the widely used Robot Operating System (ROS) in its second version for communication between hardware and the framework. The use of ROS2 enables easy expansion with new robotics hardware due to their standardized interface design and thus contributes to the modularity of the system. However, a fully modularized framework implementation is still pending.
- Resource Management The framework does not currently use adaptive resource management, which switches between cloud, edge and local processing. An implementation of this is being considered.
- Safety and Error Handling We provide various safety layers for error detection and handling, each of which can implement different safety checks. One example of this is collision detection based on a torque threshold. If such critical states are detected, execution in the real environment is stopped immediately.
- Bridging Simulation and Reality The framework implements domain randomization during the learning process as a technique to address the sim-to-real gap. Additionally, training with observations from the real-world allows the robot to adapt to the environment specifics, which mitigates the sim-to-real gap.
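The torque-threshold collision check mentioned in the safety item above could look like the following minimal sketch; the joint names, limits and dict-based interface are illustrative assumptions, not the framework's actual implementation:

```python
def torque_safety_check(torques, limits):
    """Hypothetical safety-layer check: if any joint torque exceeds its
    configured limit, a collision is assumed and real-world execution
    must be stopped immediately."""
    for joint, tau in torques.items():
        # Joints without a configured limit are treated as unconstrained.
        if abs(tau) > limits.get(joint, float("inf")):
            return False  # critical state detected: trigger safe stop
    return True  # all torques within limits: execution may continue
```

In the framework, a check like this would be one of several stacked safety layers evaluated in every control cycle.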
Discussion and Outlook
The presented results provide a guideline for the development and deployment of synchronized predictive digital twins of autonomous mobile robots in semi-dynamic environments. Although we have made a first attempt to analyze related approaches with respect to the identified challenges, a deeper and more comprehensive analysis of the frameworks (including the one presented here) is needed. Furthermore, some of the identified challenges can only be reasonably evaluated once the presented framework is fully implemented and functional.
In the next phase, the implementation of the framework is to be completed so that its capabilities with regard to the unresolved challenges can be evaluated. Subsequently, the implementation will be evaluated with regard to its adaptation behavior in a real scenario with unforeseen obstacles. However, due to the risk of collisions in a real particle accelerator environment, the first step will be to test the system on test benches at DESY.
References
1. Grau, A., Indri, M., Bello, L. L., & Sauter, T. (2020). Robots in industry: The past, present, and future of a growing collaboration with humans. IEEE Industrial Electronics Magazine, 15(1), 50-61.
2. Seppänen, A., Vepsäläinen, J., Ojala, R., & Tammi, K. (2022). Comparison of Semi-autonomous Mobile Robot Control Strategies in Presence of Large Delay Fluctuation. Journal of Intelligent & Robotic Systems, 106(1), 28.
3. Mazumder, A., Sahed, M. F., Tasneem, Z., Das, P., Badal, F. R., Ali, M. F., … & Islam, M. R. (2023). Towards next generation digital twin in robotics: Trends, scopes, challenges, and future. Heliyon, 9(2).
4. Mihai, S., Yaqoob, M., Hung, D. V., Davis, W., Towakel, P., Raza, M., … & Nguyen, H. X. (2022). Digital twins: A survey on enabling technologies, challenges, trends and future prospects. IEEE Communications Surveys & Tutorials, 24(4), 2255-2291.
5. Ramasubramanian, A. K., Mathew, R., Kelly, M., Hargaden, V., & Papakostas, N. (2022). Digital twin for human–robot collaboration in manufacturing: Review and outlook. Applied Sciences, 12(10), 4811.
6. Zafar, M. H., Langås, E. F., & Sanfilippo, F. (2024). Exploring the synergies between collaborative robotics, digital twins, augmentation, and industry 5.0 for smart manufacturing: A state-of-the-art review. Robotics and Computer-Integrated Manufacturing, 89, 102769.
7. Baidya, S., Das, S. K., Uddin, M. H., Kosek, C., & Summers, C. (2022). Digital twin in safety-critical robotics applications: Opportunities and challenges. In 2022 IEEE International Performance, Computing, and Communications Conference (IPCCC) (pp. 101-107). IEEE.
8. Matulis, M., & Harvey, C. (2021). A robot arm digital twin utilising reinforcement learning. Computers & Graphics, 95, 106-114.
9. Kousi, N., Gkournelos, C., Aivaliotis, S., Giannoulis, C., Michalos, G., & Makris, S. (2019). Digital twin for adaptation of robots’ behavior in flexible robotic assembly lines. Procedia Manufacturing, 28, 121-126.
10. Mo, Y., Ma, S., Gong, H., Chen, Z., Zhang, J., & Tao, D. (2021). Terra: A smart and sensible digital twin framework for robust robot deployment in challenging environments. IEEE Internet of Things Journal, 8(18), 14039-14050.