Hello Leaders!
Thank you for the requests and your patience in waiting for updates on our Replay Reliability initiative. It is due time to inform you of where we are with it and where we are going. Replay consistency is of the utmost importance to us; we want to not only make it correct but keep it that way for the future.
The Problem
We use a “deterministic client re-simulation” approach for replays in DomiNations. This means that a replay holds only the bare necessities to set up the attack, then feeds the recorded user inputs to the simulation engine (example: a tank was dropped here, at this moment). Your device works from those inputs to recreate the original attack (the tank will move here and attack this target). If your device does not simulate those inputs exactly as they originally played out, the results diverge from the original attack. The earlier in the simulation this happens, and the larger the divergence, the more dramatic the difference in the replay’s outcome (the ‘Butterfly Effect’ from chaos theory illustrates this phenomenon, albeit with discrepancies in the inputs rather than the simulation, but the final outcomes are similar).
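To make the idea concrete, here is a minimal sketch (hypothetical toy code, not DomiNations’ engine): a replay is just the starting conditions plus timestamped inputs, playback re-runs the same deterministic simulation, and even a tiny simulation difference compounds frame by frame:

```python
# Illustrative toy: a replay stores only the inputs; playback re-simulates them.
from dataclasses import dataclass

@dataclass
class Input:
    frame: int   # when the player acted
    unit: str    # what was dropped
    x: float     # where it was dropped

def simulate(inputs, frames, speed=1.0):
    """Deterministically advance unit positions frame by frame."""
    units = {}                                  # unit name -> x position
    pending = {i.frame: i for i in inputs}
    for frame in range(frames):
        if frame in pending:                    # apply the recorded input
            inp = pending[frame]
            units[inp.unit] = inp.x
        for name in units:                      # every unit marches toward x = 0
            units[name] = max(0.0, units[name] - speed)
    return units

inputs = [Input(frame=3, unit="tank", x=100.0)]

original = simulate(inputs, frames=50)
replay   = simulate(inputs, frames=50)                 # same inputs, same engine
drifted  = simulate(inputs, frames=50, speed=1.0001)   # tiny simulation change

assert original == replay    # deterministic: identical outcome
assert original != drifted   # a tiny divergence compounds over many frames
```

The `speed=1.0001` run stands in for any small client-side simulation discrepancy: the per-frame error is invisible, but after enough frames the unit is in a measurably different place.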
One initial question/suggestion we often see is “Why not just save the state of the original attack as it progresses into the replay?”. The answer is simply “for efficiency”. We are a global game, with many users playing on cellular connections, capped data, or paying for data usage. In addition, the smaller the file, the faster we can download it to your device and begin the playback. Storing vast amounts of data for each replay would make for an unbearable experience for many of our players.
DomiNations has a built-in replay validation mechanism. It knows when a replay differs and, in some cases, can tell us roughly where things went wrong. This was very useful in the early days of the game: with this information, we could often deduce the cause of the error and fix it rapidly. After a decade of live operation, however, complexity has caught up with us, and this system no longer suffices to pinpoint the exact causes of replay ‘drift’. The combination of variables and possibilities is just too great. It began to take hours, and sometimes days, to track down a single issue; eventually the complexity overcame our ability to keep up, and replay success rates began to drop despite scheduled efforts to find and fix the causes.
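One common way to get this kind of coarse “roughly where” signal is to embed periodic state checksums in the replay and compare them during playback. The sketch below is a hypothetical illustration of that general technique, not our actual validation code; the checkpoint interval and state fields are invented:

```python
# Hypothetical coarse validation: checksum the state every N frames; the first
# mismatched checksum brackets the window where the simulation drifted.
import hashlib

CHECKPOINT_EVERY = 100  # frames between checksums (illustrative value)

def state_checksum(state):
    """Hash a canonical serialization of the simulation state."""
    blob = repr(sorted(state.items())).encode()
    return hashlib.sha256(blob).hexdigest()

def validate(recorded, replayed):
    """Return the frame where the mismatched window ends, or None if clean."""
    for i, (a, b) in enumerate(zip(recorded, replayed)):
        if a != b:
            return i * CHECKPOINT_EVERY  # drift began somewhere in this window
    return None

recorded = [state_checksum({"tank_x": x}) for x in (100, 53, 6)]
replayed = [state_checksum({"tank_x": x}) for x in (100, 53, 7)]  # late drift

print(validate(recorded, replayed))  # -> 200: drift is somewhere in frames 100..200
```

This localizes a failure only to a window of frames, which is exactly the limitation described above: with ten years of content, a hundred-frame window still leaves far too many suspects.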
Our Current State
Replay accuracy has dipped so low that replays have become unreliable tools for attack assessment, and can do more harm than good when used to review both offensive and defensive strategies. We recognized this some time ago and took multiple actions to strengthen the existing replay validation system. Sadly, after considerable investment, it became clear that salvaging the older method was not producing the efficiency and results that we needed.
We instead took a step back, a deep breath, and rewrote the entire mechanism. This work is finally complete enough that we have started using it to track down issues. We have a new internal toolchain and system that can tell us the exact frame in which a replay begins to diverge, why it is diverging, and often points us to the root cause. The fixes can still take time to implement, and ultimately must be internally tested and run through our Quality Assurance (QA) department before making it out to you. In addition, since an early divergence will ‘mask’ a later one, one fix often just leads to the next. It doesn’t make sense to release a replay fix to you if we know you’ll just hit the next one in line! We started from the early ages and worked our way through the game, fixing issues as we found them. We are happy to say that as of right now, a large portion of our internal replays run with no detectable issues! The work isn’t done; there is still a large suite of test cases to run to identify anything and everything we may have missed.
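In spirit, frame-exact divergence tracking means recording the full per-frame state on both sides (feasible internally, far too heavy to ship to players) and reporting the first frame and value that differ. A hypothetical sketch, with invented field names, of what that comparison looks like:

```python
# Hypothetical frame-exact comparison: find the first frame and field where
# the replayed state departs from the original attack's recorded state.
def first_divergence(original, replay):
    """Return (frame, field, original_value, replay_value), or None if clean."""
    for frame, (a, b) in enumerate(zip(original, replay)):
        for field in a:
            if a[field] != b.get(field):
                return frame, field, a[field], b.get(field)
    return None

original = [{"tank_hp": 100, "tank_x": 50},
            {"tank_hp": 100, "tank_x": 49},
            {"tank_hp": 95,  "tank_x": 49}]
replay   = [{"tank_hp": 100, "tank_x": 50},
            {"tank_hp": 100, "tank_x": 49},
            {"tank_hp": 96,  "tank_x": 49}]  # e.g. damage rounded differently

print(first_divergence(original, replay))
# -> (2, 'tank_hp', 95, 96): the exact frame and value that drifted
```

This also shows why one fix leads to the next: once frame 2 diverges, every later frame differs as a consequence, so any later, independent bug is masked until the first one is fixed and the comparison is run again.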
We are about to submit code into QA with an estimated 40+ bug fixes for replays across all ages and units in the game. The hope is that you will see these fixes in our next major release (Update 12.18) and begin to notice a difference in replay quality. We will continue searching and fixing additional issues as we find them. The goal is to have a marked improvement in replay quality every major release until we hit 100% accuracy.
The Future
Where do we go from here? Our first step is to reach a point where we cannot find any more replay issues. We are hopeful that this goal is in sight, but our work will not be complete even then. Our next step, already under development, is to package this replay validation framework into our server cloud, so that we can send replays to the system for immediate analysis. We hope to use this capability for a variety of services, including:
- Automated replay testing of all new code as it runs through our development and QA stages. We want to identify any new sources of replay failure before they ever make it out to you, much as we already do with other sections of the game code.
- Automated policing of attacks. If we know replays are correct, we can instantly validate key, prominent, and hopefully at some point ALL attacks, and take immediate action, even mid-war, if we find anyone trying to alter outcomes of battles.
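The policing idea reduces to a simple check: if replays are trustworthy, the server can re-run a submitted attack through the authoritative simulation and compare the result against what the client claimed. This is a hypothetical sketch of that principle only; the function names and the stand-in simulation are invented, not our service’s API:

```python
# Hypothetical server-side audit: re-simulate the attack with trusted code and
# compare the outcome to what the client reported.
def audit_attack(replay_inputs, claimed_outcome, simulate):
    """Flag the attack if the trusted re-simulation disagrees with the client."""
    trusted = simulate(replay_inputs)
    return "ok" if trusted == claimed_outcome else "flag_for_review"

# Stand-in simulation: the "outcome" is just total damage from the inputs.
def simulate(inputs):
    return {"damage": sum(i["damage"] for i in inputs)}

inputs = [{"damage": 40}, {"damage": 25}]
print(audit_attack(inputs, {"damage": 65}, simulate))   # -> ok
print(audit_attack(inputs, {"damage": 100}, simulate))  # -> flag_for_review
```

Because the check is automatic and cheap once replays are deterministic, it could in principle run on every attack as it happens, which is what makes mid-war enforcement possible.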
We would like to thank our community both for your patience with this issue and for your vigilance in demanding that we improve.