Technical solution to eliminate desync in single-player sessions

"
genericacc wrote:
"
Mysterial wrote:

Also, some of you arguing TCP/UDP is hilarious. Games use UDP; otherwise they'd stall for seconds at a time when packets are lost or misordered. The protocol has nothing to do with routers or NAT traversal; in all but pathological cases router issues are about hosting, not clients.


Well, let's count the myriad ways in which this post is wrong.

1. There exist games that use TCP and games that use UDP.
2. You don't stall for "seconds at a time" with UDP any more than you do with TCP, because it's possible to implement reliable transmission and packet ordering on top of UDP too.
3. You don't always need packet ordering for games because certain events (e.g. skill use) don't depend on ordering much.
4. You don't always need reliable transmission for games. Trivially, if you send the entire game state, you only need the latest one; resending an old state is generally a waste of time.

There are probably a few more, but I think that's enough.
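For the curious, point 2 is easy to illustrate. Here's a minimal Python sketch (all names made up) of ordered delivery on top of an unreliable channel: each packet carries a sequence number, and the receiver holds gaps until the missing packet shows up (or is re-requested).

```python
class ReorderBuffer:
    """Delivers datagrams in sequence order, buffering any that arrive early.

    This is the core of 'ordered delivery over UDP': packets carry sequence
    numbers, and the receiver holds out-of-order arrivals until the gap fills.
    """

    def __init__(self):
        self.next_seq = 0   # next sequence number we expect to deliver
        self.pending = {}   # seq -> payload, for packets that arrived early

    def receive(self, seq, payload):
        """Accept one datagram; return the list of payloads now deliverable in order."""
        if seq < self.next_seq:
            return []       # duplicate or stale packet: drop it
        self.pending[seq] = payload
        delivered = []
        while self.next_seq in self.pending:
            delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return delivered
```

So `receive(1, "b")` delivers nothing yet, and the late `receive(0, "a")` then releases both in order. Reliability (resending) is a separate layer: track unacked sequence numbers and retransmit on a timer.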


You apparently read my post backwards.

1. Games that use TCP are games that can't tolerate packet loss for other reasons, e.g. lockstep RTSes.
2. You stall with TCP because that is in fact the specification of TCP; you don't get data until the underlying network stack guarantees it is the exact data that was sent. The reason almost all games use UDP is that that delay is unacceptable while the effects of minor packet loss are irrelevant.
3. Hence why games use UDP. Again, try reading my post again.
4. ++

"
Mysterial wrote:

2. You stall with TCP because that is in fact the specification of TCP; you don't get data until the underlying network stack guarantees it is the exact data that was sent. The reason almost all games use UDP is that that delay is unacceptable while the effects of minor packet loss are irrelevant.


TCP SACK has been implemented for ages. Even with regular ACK schemes you don't stall for seconds unless you're using tin cans and string to transfer data.

If you mean checksums, UDP is checksummed too.

Also yeah, sorry, I read that backwards.
"
ScrotieMcB wrote:
@RogueMage:
If you're going to stream it with UDP, why even bother with a TCP fallback mechanism?

1. You need a fallback mechanism to handle cases where the UDP stream is interrupted or the client diverges too far from the server to correct.
2. The TCP-based resync mechanism is already working and tested and there's no compelling reason to replace it with a different fallback.

"
A resync doesn't have to be formally requested to be a resync. However, it isn't desync prevention because the UDP information is always at least a round-trip-ping time late.

No, it's not a round-trip; it's simply delayed by a single ping, as I explained above. In practice, the amount of latency isn't critical so long as the client can estimate the ping accurately enough to avoid instability.
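Estimating the ping accurately enough is a solved problem; here's a rough sketch (function and parameter names are my own, not anything from the game) of the exponentially weighted moving average that TCP uses for its smoothed RTT estimate:

```python
def make_ping_estimator(alpha=0.125):
    """Exponentially weighted moving average of ping samples, similar in
    spirit to TCP's smoothed RTT (SRTT) estimate.  Feed the returned
    function each new ping sample; it returns the smoothed estimate."""
    state = {"srtt": None}

    def update(sample_ms):
        if state["srtt"] is None:
            state["srtt"] = sample_ms               # first sample seeds the estimate
        else:
            state["srtt"] += alpha * (sample_ms - state["srtt"])
        return state["srtt"]

    return update
```

A low `alpha` means a single lag spike barely moves the estimate, which is exactly the stability property needed here.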

"
However, the idea that you can use data from the past to make predictions on the future is... well, it doesn't work so well. Let's work out an example using your system.

0ms: User commands character to move forward. Character immediately starts moving forward on the client.

1500ms: Character on client reaches aggro range of enemies. They do nothing because no UDP update from them yet.

It appears that you don't quite understand how this type of feedback works. The client does not wait for any such UDP update before proceeding with its simulation. The client is not dependent on the UDP updates for any vital information at all; it is fully capable of working as an open-loop simulator, just as it does today.

"
1675ms: UDP update with information regarding monster movement reaches the client... The client sees from the timestamp that the server sent this information at 1575ms. Now what does the client do at 1675ms? Does it instantly teleport monsters to the predicted position on the server?

No, there are no instant teleports, that would be a type of resync. What happens instead:

1. Client uses latency estimate to correlate server's UDP position data to client's positions 1575ms ago.

2. Client predicts server's current simulation progress based on the server's reported UDP position data of 1575ms prior.

3. Client calculates discrepancies between its current positions and server's predicted current positions.

4. Client calculates trajectory corrections (angle and velocity) scaled to reduce the calculated discrepancies.

5. Client continues its own simulation with player and enemy trajectories corrected to more closely converge towards the server's.

Note that this discrepancy-reduction loop does not wait for any particular event to occur on either the client or the server. It operates continuously to provide server position data that enables the client to incrementally reduce discrepancies before they accumulate to the point where a resync is required.
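To make steps 1-5 concrete, here's a toy Python version of one tick of that loop (all names and the `gain` constant are illustrative, not a claim about any real implementation):

```python
def correct_position(client_pos, server_pos, server_vel, latency_s, gain=0.1):
    """One tick of the discrepancy-reduction loop described above.

    server_pos/server_vel describe the entity as the server saw it one
    latency interval ago; we dead-reckon that forward to predict the
    server's *current* position, then blend the client's position a
    fraction (`gain`) of the way toward it rather than teleporting.
    """
    # Step 2: predict where the server's simulation has the entity now.
    predicted = tuple(p + v * latency_s for p, v in zip(server_pos, server_vel))
    # Steps 3-5: close a fraction of the discrepancy per tick (no teleport).
    return tuple(c + gain * (p - c) for c, p in zip(client_pos, predicted))
```

Run every tick, this converges the client toward the server's trajectory exponentially: a 1-unit discrepancy with `gain=0.1` shrinks to ~0.35 units after 10 ticks, with no visible snapping.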
Last edited by RogueMage#7621 on Nov 24, 2013, 12:17:02 AM
"
Rhys wrote:
Every time I come by to post, I find that this thread has grown by another 30 pages, and I then spend all my time reading it (and the linked articles, which are fascinating) and end up running out of time for actually posting... And so the cycle continues...

So here are my current thoughts.

Floating-point determinism
It is possible to perform floating-point arithmetic in a deterministic manner (which is absolutely REQUIRED for qwaves' proposal), but it is very difficult. Even developers who have made it work call it "a constant struggle" to maintain.

On a side note, I am actually quite interested in implementing this for the random level generation, if possible. Every day, a small-but-significant number of players get booted to the login screen with the message "the client's terrain generation is out-of-sync with the server". They will always get this kick+message when they try to re-enter the bad instance, and the only option they have is to either ctrl-click to force a new instance (which not everyone knows about, sadly) or to wait for it to time out naturally. This is a very annoying bug, which is caused by floating-point inconsistencies, and one I would love to stamp out for good.

Movement and combat
In our current system, movement is intrinsically linked to combat. If you click a monster to attack, you first move into range, then begin the attack. For melee, the range is small, so you must get quite close to the monster. For ranged attacks, such as spells and bow attacks, this distance is greater. Several skills involve movement as a part of their function, such as Shield Charge and Leap Slam. Some skills, like Heavy Strike, cause Knockback, which changes a monster's position. Combat also affects monster AI, such as how some monsters flee when set on fire, or hit by skills linked to the Chance to Flee support gem. Also, some monsters, like archers, try to stay at a certain range from their target.

So, I don't think it is possible to use a deterministic system to govern only movement or only combat. The two systems are just too interconnected to separate like that.

qwaves' suggested system

Client state hash?
After rereading the first post and the code sample (as of time of writing), I noticed something. There is no mention of a state hash.

Under the system as presented, the server never validates the client's state. It doesn't check if the client is in sync: it just assumes it is and makes sure the client isn't doing anything "illegal". This means that if the client DOES get desynced, the server won't actually notice until the client is SO desynced that it tries something the server doesn't like. A "state hash", sent with every snapshot, would fix this.

A state hash is just a hash of the client's entities and RNG data, which the server can also calculate and compare to check if the client is desynced or not.
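A sketch of what that could look like, assuming a made-up entity layout (the real game would hash its own canonical serialization):

```python
import hashlib
import struct

def state_hash(entities, rng_state):
    """Hash the parts of the simulation that must match between client and
    server: entity ids/positions plus the RNG state.  The (id, x, y) field
    layout here is illustrative only.
    """
    h = hashlib.sha256()
    for ent_id, x, y in sorted(entities):       # canonical ordering matters!
        h.update(struct.pack("<Qff", ent_id, x, y))
    h.update(struct.pack("<Q", rng_state))
    return h.hexdigest()[:16]                   # a truncated digest suffices
```

The key detail is the canonical ordering: both sides must serialize entities identically or two in-sync simulations will hash differently. A 16-hex-character digest per snapshot is only 8 bytes of bandwidth.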


But the real problem is the issue of singleplayer. We simply aren't going to rewrite half the game engine just to try to improve singleplayer. If we were going to go that far, we would want to make sure it worked for multiplayer, too. Also, there's no way we'd have both systems (old and new) running in parallel, falling back to the old one at times. That would be a bottomless rabbithole of doom.

Now, I'm not saying the system can't be adapted to work for multiple clients. However, doing so breaks some of its fundamental assumptions:
State hash
Even though I just suggested this, it doesn't quite work in multiplayer. Because each player receives information about the other players that is delayed by latency, a client's state hash would rarely be valid. How to fix this? Well, first we need to decide who's "in charge".

Client authority
When there are multiple clients active, who has authority over the correct state of the game world? You could designate one client to be "host" who has authority and the other clients must obey the host. This is quite bad for PvP, though, since it creates an uneven playing field.

Better to make the server the authority, which is also more secure. (cheating is still possible, though)

Snapshot validation
Of course, with multiple clients sending in data at different times and latencies, the proposed system of "simulate and validate each snapshot" falls apart. One client may send in a perfectly valid snapshot, and then next frame another client with worse latency sends in a valid snapshot that invalidates the first client's action.

So the server must be able to accept "snapshots" from clients, knowing that they may be invalidated sometime in the future. This requires the server to be running the master simulation in parallel to the clients, and to be able to backtrack in time, mix in the client actions (which may or may not fail now), update the clients, then simulate back to the present. The clients, then, must ALSO be able to backtrack, apply new actions from other players, then simulate back up to the present state.

If we are allowing the client to effectively perform actions "in the past", how do we know how far back to allow actions? The server can measure the latency to each client, which gives an estimate for how far back a legitimate action could have been started, but what about lag spikes? The server has to be lenient enough to handle average network instability (though this allows cheating) but it also has to resync a client that lags too much, since it can't accept the action.

The server effectively has a rolling time-frame of, say, 3 seconds into the past, where anything that happened earlier than that has "officially" happened and cannot be altered. Anything within the 3-second window can potentially be undone.
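The bookkeeping for that rolling window is simple enough to sketch (class and names are mine, purely illustrative):

```python
from collections import deque

WINDOW_S = 3.0  # the rolling "can still be undone" window described above

class RollbackBuffer:
    """Keeps recent state snapshots so late client actions can be merged in.

    Actions older than WINDOW_S are rejected (that client must be resynced
    instead); newer ones roll back to the snapshot at or before their
    timestamp, after which the caller re-simulates forward to the present.
    """
    def __init__(self):
        self.snapshots = deque()   # (timestamp, state), oldest first

    def record(self, now, state):
        self.snapshots.append((now, state))
        while self.snapshots and self.snapshots[0][0] < now - WINDOW_S:
            self.snapshots.popleft()   # beyond the window: officially happened

    def rollback_for(self, now, action_time):
        """Return (timestamp, state) to roll back to, or None if too old."""
        if action_time < now - WINDOW_S:
            return None                # outside the window: force a resync
        candidates = [(t, s) for t, s in self.snapshots if t <= action_time]
        return candidates[-1] if candidates else None
```

Memory cost is one snapshot per tick over 3 seconds; the expensive part in a real engine is the re-simulation forward, not the buffer itself.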

Client state hash, again
So the state hash trick actually does work in multiplayer, as long as the state being hashed is the state from (in this example) 3 seconds ago, a state which cannot be changed. This makes the client quite similar to the server, in that it has a rolling 3-seconds-ago window.

There might be other concerns, or problems I haven't thought of. It is late, and I am tired, after all. And I don't know how well this would hold up with 6 players all together spamming dozens of skills all at the same time...

WHAT DOES IT ALL MEAN?!?!?!
qwave has suggested a radical change to our core game systems. The concept is not impossible, I think, but it isn't really feasible. It's simply too big of a change. It also has some security concerns. But it was fun to think about and discuss.


I disagree about it not being feasible.

First off, since the person is solo, you can generate the entire loot table when the person enters a zone based on, I think, 3-5 variables, which is less work than item rolls take presently.

Second off, these maps could be distributed to the client in much the same way that distributed computing projects distribute their work units. The problem is that you'd need very esoteric encryption for these work units or else you'd have those security concerns you were talking about.

Thirdly, if players really wanted this implemented then I don't think they would like the client security needed to prevent abuse by things like speed hacks and map hacks.

Fourthly and finally, the method and means to achieve a product with seamless gameplay between online and offline mode would be better spent on a new product, as people that think this game should be playable offline and solo are quite possibly overdosing on all of the paste they ate.
IGN : Reamus
"
Andromansis wrote:

I disagree about it not being feasible.

First off, since the person is solo, you can generate the entire loot table when the person enters a zone based on, I think, 3-5 variables, which is less work than item rolls take presently.


Why would you generate the entire loot table when the map's created? You generate it lazily when you need to actually roll for drops.

"
Second off, these maps could be distributed to the client in much the same way that distributed computing projects distribute their work units. The problem is that you'd need very esoteric encryption for these work units or else you'd have those security concerns you were talking about.


It has to be able to be decrypted or it's useless. Let me introduce you to ReadProcessMemory() or CreateRemoteThread().

Also, most games like this just send the seed value needed to generate the map instead of the entire map itself.
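That seed trick is trivial to demonstrate. A toy sketch (parameters made up; note the thread's earlier caveat that real cross-platform determinism is harder than this, especially with floating point):

```python
import random

def generate_map(seed, width=8, height=8, wall_chance=0.3):
    """Deterministically generate a tile grid from a seed.

    Because the RNG is seeded, the server only needs to send the seed;
    the client regenerates an identical map locally.  "#" = wall, "." = floor.
    """
    rng = random.Random(seed)   # private RNG: doesn't touch global random state
    return [["#" if rng.random() < wall_chance else "."
             for _ in range(width)]
            for _ in range(height)]
```

The same seed always yields the same grid, so "distributing the map" costs a handful of bytes rather than the map itself, and there's nothing to encrypt because the client was always going to see the layout anyway.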

"
Thirdly, if players really wanted this implemented then I don't think they would like the client security needed to prevent abuse by things like speed hacks and map hacks.


Speed hacks are relatively easy to limit: if player movement exceeds a certain bubble within a certain timeframe, you disconnect them. In practice nobody will care in this game because you can always use a Faster Attacks Leap Slam if you need to go somewhere.
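The "bubble" check is a one-liner on the server. A sketch, with the cap and tolerance numbers purely illustrative:

```python
import math

MAX_SPEED = 12.0   # units/second; illustrative cap above any legitimate skill

def plausible_move(old_pos, new_pos, dt, tolerance=1.5):
    """Server-side sanity check for the movement 'bubble' described above:
    flag or reject any reported move faster than the cap allows, with some
    tolerance for network jitter and lag spikes.
    """
    dist = math.dist(old_pos, new_pos)
    return dist <= MAX_SPEED * dt * tolerance
```

The tolerance factor is the usual trade-off mentioned elsewhere in the thread: too tight and lag spikes get innocent players flagged, too loose and mild speed hacks slip under it.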

"
Fourthly and finally, the method and means to achieve a product with seamless gameplay between online and offline mode would be better spent on a new product, as people that think this game should be playable offline and solo are quite possibly overdosing on all of the paste they ate.


Yes, seamless offline and online gameplay has never happened before. What's a Diablo 2? (Mind, open bnet was hack-ridden)
*disclaimer: I only read some of this massive post and I'm not a programmer.

Ok, so what I have seen, specifically from the devs and the programmer gamers here, is more or less a proposal from gamers on how to fix the issue being rebutted by the developers as to why it won't work or isn't worth it.


How about a layman's suggestion?

Do whatever every other game out there is doing that doesn't have epic desync.

This system you have now might be epic at stopping cheaters, but it's making all your non-cheating players suffer... and when it comes down to it... I'm not seeing some hacker in PvP decimating me... I'm still getting gold-farmer bot spam... I don't care if someone I'll never see in game botted to level cap and farmed awesome stuff.

I just want to play the game without fighting the fact that at no point do my game client and the server agree on exactly where the hell I am and where the hell mobs are in game.

I can't think of a single other ARPG, or MMORPG for that matter... even online FPSes... that has this issue (yeah, I'm sure there is desync, but it's not affecting gameplay on my end).


Regardless, it's becoming a running joke about this game... it's the number one issue with the game... as it affects everyone (even if you have fanboys who refuse to admit it), and it creates a worse gaming experience than knowing some twit is out there hacking his critical strikes or whatever.


Make fixing this a priority. New content doesn't mean much when the core of the game is busted.

It's the only thing stopping your game from being a fantastic game in the eyes of people who are not total fanboys.
just for try, for see and for know
"
Regardless, it's becoming a running joke about this game... it's the number one issue with the game... as it affects everyone (even if you have fanboys who refuse to admit it), and it creates a worse gaming experience than knowing some twit is out there hacking his critical strikes or whatever.


Yep, that's the point I've been trying to make the whole time. Fixing desync is way more important than the side effect of hackers potentially being able to crit more, etc. Online games were never meant to be 'fair'. People play 18 hours per day and RMT; how are you going to compete with them? You can't, so why cry over spilt milk?

Top priority should be an enjoyable game experience where you don't randomly die due to desync. Second priority should be to ban/detect hacks/hackers on the server-side.

As long as dupes, speed hacks, god mode, and other things that can disrupt the economy aren't possible, the game experience will not change.
Last edited by qwave#5074 on Nov 24, 2013, 1:43:44 AM
"
RogueMage wrote:
"
ScrotieMcB wrote:
@RogueMage:
If you're going to stream it with UDP, why even bother with a TCP fallback mechanism?

1. You need a fallback mechanism to handle cases where the UDP stream is interrupted or the client diverges too far from the server to correct.
No, you wouldn't. Stream interrupted? Restart stream. Problem solved.
"
RogueMage wrote:
"
A resync doesn't have to be formally requested to be a resync. However, it isn't desync prevention because the UDP information is always at least a round-trip-ping time late.
No it's not a round-trip
Yes, it is. Remember that the client simulation as a whole is a one-way-ping ahead of the server; thus, whenever the client simulation depends on the server to send it something, it's a round-trip ping behind.
"
RogueMage wrote:
The client does not wait for any such UDP update before proceeding with its simulation. The client is not dependent on the UDP updates for any vital information at all
You are literally arguing against your own suggestion, saying that the client is not dependent on it and that it is not vital.
"
RogueMage wrote:
1. Client uses latency estimate to correlate server's UDP position data to client's positions 1575ms ago.
Either "1575ms ago" is a typo or you're crazy. There are far too many possibilities for divergence for the client to predict that far in advance.
"
RogueMage wrote:
Note that this discrepancy-reduction loop does not wait for any particular event to occur on either the client or the server.
It fucking should wait for something: some degree of user input. The player isn't an AI; it's something with the power to make prediction-shattering decisions in real time, which is something you seem to want to ignore.
When Stephen Colbert was killed by HYDRA's Project Insight in 2014, the comedy world lost a hero. Since his life model decoy isn't up to the task, please do not mistake my performance as political discussion. I'm just doing what Steve would have wanted.
Last edited by ScrotieMcB#2697 on Nov 24, 2013, 2:07:57 AM
I thought about this some more.

Player:

I still think at least the position of the player should be resynced every 500ms.

That would result in additional bandwidth usage of maybe 32 bytes per second.
If there are actually X, Y and Z coordinates, it would be something like 48 bytes per second.
That won't stop the mobs from desyncing, but it will prevent you yourself from desyncing.
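Those figures check out if you assume 8-byte coordinates, which seems to be what the numbers imply. The arithmetic as a tiny sketch (payload only; UDP/IP headers would add roughly 28 bytes per packet on top):

```python
COORD_BYTES = 8        # assumed 8-byte (double) coordinates per axis
UPDATES_PER_SEC = 2    # one position resync every 500 ms

def resync_bandwidth(num_axes):
    """Payload bytes/second for the per-player position resync proposed above."""
    return COORD_BYTES * num_axes * UPDATES_PER_SEC

# X/Y only -> 32 B/s; X/Y/Z -> 48 B/s, matching the figures in the post.
```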

Mobs:

When it comes to mobs, I could also imagine a kind of request system.
Instead of resyncing everything on screen, it could just resync on request, like:

Resync the moused-over mob immediately.

Maybe include a timer for this so the moused-over mob doesn't get resynced nonstop.
Once every 2-3 seconds should be enough.

Resync mobs that should be in range of an AoE attack according to the client.

If the client says "this mob should be in range of an AoE attack", resync it.
If the client can't tell something like this, the server could still do it automatically, just less frequently, and only if attacks are involved and the mob is in their range.
Again with a timer of 2-3 seconds so it doesn't happen nonstop.
This should happen regardless of whether the mob is successfully hit or not.

Nonstop resync for special situations every 2-3 sec, but only for certain mobs.

Here I'm thinking about situations like boss fights.

For example, Vaal.
In the case of a party of 6 people, the Vaal fight would result in something like 224 bytes every 2-3 seconds, or, in the case of X, Y and Z coordinates, something like 336 bytes every 2-3 seconds.
This treatment should be done for each unique mob.

Another example: the Dominus fight.
I never actually counted them, but there are like 5-6 unique mobs during the fight.
So again: in the case of a party of 6 people, this would result in a bandwidth usage of something like 384 bytes every 2-3 seconds, or for X/Y/Z something like 576 bytes every 2-3 seconds.

Conclusion:

Even if this isn't the fail-proof system everybody wants to have, it would at least be a nice temporary fix.
Most of the dangerous situations should be covered.

The three possibilities for resyncing mobs could also share a timer, so a mob doesn't get resynced because of an AoE attack when it was just resynced because of a mouse-over.
Or, in special situations like boss fights where mobs are resynced nonstop, the mouse-over and attack resyncs could be deactivated completely.

What do you think?

This should very well be possible with a modern server.
Last edited by grasmann#3903 on Nov 24, 2013, 5:10:57 AM
In general, UDP is a far superior choice for things like updating the positions of monsters and your own character, and that is typically how games use UDP. TCP is obviously completely reliable (assuming the physical connection still exists), but it has a ridiculous overhead; anyone who has studied uni-level networking and had a look at the TCP standard would understand this.

This is pretty irrelevant here, though, because current PoE, if I am not mistaken, doesn't actually stream the positions of monsters; it does frequent checks (which is not the same thing) to see if the client and/or server is in sync. UDP would make sense if PoE mechanics were mainly server-side, but they aren't.
