Technical solution to eliminate desync in single-player sessions
" How am I supposed to play then? Your client needs to see game state to, well, show the game. There is no "block all but client", it's either "block all" or "allow client+all". " Processing. Overhead. Encrypt everything and your lag/game speed just got cut to 1/4th. The reason why encryption works is not because it's impossible to get original data based on encrypted data, it's because it takes too damn long to reverse it. Thus, it therefore comes that encrypting something must be computationally long enough so that attempts at reversing it occur in a non-significant amount of time. If your encryption takes only 1 nanosecond, and your entire keyspace is only a million keys, you can reverse encrypted data in 1 second. " Alone, it's not worth hacking. Combined with "crafting is done on the client, because I can trust the client", it's gamebreaking. It's like this. I have a 6Link. I have my hack that freezes packets. Step 1: Use Fusing. Step 2: Freeze "Use Fusing" packet, send "Standing still and doing nothing packet" Step 3: Simulate Result of using Fusing. Step 4: If Fusing Results in 6 Link, Send "Use Fusing" packet, otherwise keep sending "Standing Still and Doing Nothing" Packet. Step 5: Repeat Step 1 Until You get 6Link from 1 Fusing Step 6: ??? Step 7: PROFIT " Checksum algorithm is known. What's to prevent me from forging correct checksum for my forged packets? " Packet sniffer on my router. " Virtual Machine. " After decryption, it has to be stored somewhere so that the computer can read the decrypted content. If it's stored somewhere on local, I can use something else to read it. In memory as an isolated process in the kernel? Virtualize the whole OS, use a debugger to analyze the virtual machine. Encrypted data cannot stay encrypted forever, otherwise how will the local client read and understand it? Last edited by Sachiru#1510 on Nov 23, 2013, 4:56:15 AM
" I know that and agree to it, my point in saying it was that this was somewhat incorrect. " Clearly some sort of mob AI logic is dictated by the server and sent to the client. Not a full seed/state, hence the "somewhat", but some part of it is. |
"Because people don't understand how real-world computer systems are secured. #1 is physical security, by which I mean totally outside of the computer. Put your computers in a building which has human-being-based access control during the day, and locked doors and surveillance at night. The more important the system, the more locked doors to put it behind. Simply don't allow unauthorized persons to even touch the computer, because once they do, things are almost inevitable. Most organizational hackers defeat security measures on the physical level, trumping all other forms of security you can enact. Hackers are rarely internet dwellers and much more likely to be costume artists, using a variety of stolen uniforms to appear as FedEx people, air conditioning repair, etc. and using that appearance to gain access to areas of buildings where they are normally denied. Once one steals a computer away from a physical location, or has a long period of undetected physical access to a system, there's nothing that can really be done; the only question is "how long?" not "if?" I can wipe your Windows password with a USB stick in a matter of seconds, assuming I have access; it's as simple as booting off the USB instead of your hard drive, then following a nice little password-cracking GUI ("hardcore" hackers call users of such conveniences "script kiddies," but fuck it, it is convenient). A BIOS password is a little trickier, but I just need to find the right way to reset the BIOS to default settings — this is a matter of applying the right voltages to the right pins. Encryption can help on the whole "how long?" story, but it only helps if the key is not stored on the machine. The most common way to do this is to require the key at boot (typed manually!) and then the key is stored in memory. 
This means, however, that the security only works if the key is properly wiped from memory, which means shutting the machine down, and ideally additional wiping of the key on top of that (it's a very difficult process, but old RAM memory can be extracted, even after power is off, unless thoroughly overwritten to jumble the data). In practice, the inconvenience of retyping the key often leads users to either leave such systems in an insecure state (by which I mean: left on all the time), or they write the key down someplace on paper and don't secure it very well. Users hate security measures, see them as a waste of time, and treat them with contempt; this works to a hacker's advantage. Networks are a lot trickier; hiding behind the right configurations of firewalls and intrusion detection systems can make remote hacking virtually impossible. There are still vulnerabilities to watch out for, like code injection, but these can be worked around. The point being, if someone has physical admin access — as every recreational user has with the computers they own — they have all the keys to malicious behavior they'd ever need. Perhaps one way to look at it is not so much "never trust the client" but instead "never trust anything which is either on the other side of your firewall, or which you do not have physical possession of yourself." When Stephen Colbert was killed by HYDRA's Project Insight in 2014, the comedy world lost a hero. Since his life model decoy isn't up to the task, please do not mistake my performance as political discussion. I'm just doing what Steve would have wanted. Last edited by ScrotieMcB#2697 on Nov 23, 2013, 5:14:27 AM
Every time I come by to post, I find that this thread has grown by another 30 pages, and I then spend all my time reading it (and the linked articles, which are fascinating) and end up running out of time for actually posting... And so the cycle continues...
So here are my current thoughts.
Floating-point determinism
It is possible to perform floating-point arithmetic in a deterministic manner (which is absolutely REQUIRED for qwave's proposal), but it is very difficult. Even developers who have made it work call it "a constant struggle" to maintain.
On a side note, I am actually quite interested in implementing this for the random level generation, if possible. Every day, a small-but-significant number of players get booted to the login screen with the message "the client's terrain generation is out-of-sync with the server". They will always get this kick+message when they try to re-enter the bad instance, and the only option they have is to either ctrl-click to force a new instance (which not everyone knows about, sadly) or to wait for it to time out naturally. This is a very annoying bug, which is caused by floating-point inconsistencies, and one I would love to stamp out for good.
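A small Python sketch of the kind of inconsistency described above. It is not PoE's terrain code: `places_wall` and `WALL_THRESHOLD` are hypothetical, and the reordering is done by hand here, standing in for what different compilers or CPUs might do on their own. The two sums are mathematically identical, yet they differ in the last bit, and that is enough to flip a terrain decision.

```python
# Two mathematically identical sums, evaluated in a different order.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

WALL_THRESHOLD = 0.6  # hypothetical terrain rule: place a wall above this value

def places_wall(noise: float) -> bool:
    """Toy terrain decision driven by a floating-point value."""
    return noise > WALL_THRESHOLD

print(left_to_right == right_to_left)                          # False
print(places_wall(left_to_right), places_wall(right_to_left))  # True False
```

If the client and the server disagree on even one such decision, the generated levels no longer match, which is consistent with the "out-of-sync" kick described above.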
Movement and combat
In our current system, movement is intrinsically linked to combat. If you click a monster to attack, you first move into range, then begin the attack. For melee, the range is small, so you must get quite close to the monster. For ranged attacks, such as spells and bow attacks, this distance is greater. Several skills involve movement as a part of their function, such as Shield Charge and Leap Slam. Some skills, like Heavy Strike, cause Knockback, which changes a monster's position. Combat also affects monster AI, such as how some monsters flee when set on fire, or hit by skills linked to the Chance to Flee support gem. Also, some monsters, like archers, try to stay at a certain range from their target.
So, I don't think it is possible to use a deterministic system to govern only movement or only combat. The two systems are just too interconnected to separate like that.
qwave's suggested system
Client state hash?
After rereading the first post and the code sample (as of time of writing), I noticed something. There is no mention of a state hash.
Under the system as presented, the server never validates the client's state. It doesn't check if the client is in sync: it just assumes it is and makes sure the client isn't doing anything "illegal". This means that if the client DOES get desynced, the server won't actually notice until the client is SO desynced that it tries something the server doesn't like. A "state hash", sent with every snapshot, would fix this. A state hash is just a hash of the client's entities and RNG data, which the server can also calculate and compare to check if the client is desynced or not.

But the real problem is the issue of singleplayer. We simply aren't going to rewrite half the game engine just to try to improve singleplayer. If we were going to go that far, we would want to make sure it worked for multiplayer, too. Also, there's no way we'd have both systems (old and new) running in parallel, falling back to the old one at times. That would be a bottomless rabbithole of doom.

Now, I'm not saying the system can't be adapted to work for multiple clients. However, doing so breaks some of its fundamental assumptions:
State hash
Even though I just suggested this, it doesn't quite work in multiplayer. Because each player receives information about the other players that is delayed by latency, a client's state hash would rarely be valid. How to fix this? Well, first we need to decide who's "in charge".
Client authority
When there are multiple clients active, who has authority over the correct state of the game world? You could designate one client to be "host" who has authority and the other clients must obey the host. This is quite bad for PvP, though, since it creates an uneven playing field.
Better to make the server the authority, which is also more secure. (cheating is still possible, though)
Snapshot validation
Of course, with multiple clients sending in data at different times and latencies, the proposed system of "simulate and validate each snapshot" falls apart. One client may send in a perfectly valid snapshot, and then next frame another client with worse latency sends in a valid snapshot that invalidates the first client's action.
So the server must be able to accept "snapshots" from clients, knowing that they may be invalidated sometime in the future. This requires the server to be running the master simulation in parallel to the clients, and to be able to backtrack in time, mix in the client actions (which may or may not fail, now) and update the clients, then simulate back to the present. The clients, then, must ALSO be able to backtrack, apply new actions from other players, then simulate back up to the present state.

If we are allowing the client to effectively perform actions "in the past", how do we know how far back to allow actions? The server can measure the latency to each client, which gives an estimate for how far back a legitimate action could have been started, but what about lag spikes? The server has to be lenient enough to handle average network instability (though this allows cheating) but it also has to resync a client that lags too much, since it can't accept the action.

The server effectively has a rolling time-frame of, say, 3 seconds into the past, where anything that has happened earlier than that has "officially" happened and cannot be altered. Anything within the 3 second window can potentially be undone.
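The rolling window described above can be sketched as a toy Python model. Everything here is an assumption made for illustration: the game state is a single integer, time is counted in ticks rather than seconds, and `Server`, `submit`, `advance` and `WINDOW_TICKS` are hypothetical names. It shows only the bookkeeping: late actions inside the window trigger a resimulation, and actions older than the window are refused.

```python
from dataclasses import dataclass, field

WINDOW_TICKS = 3  # assumption: stands in for the "3 seconds" in the post

def apply(state: int, action: tuple) -> int:
    """Toy deterministic simulation step; the order of actions matters."""
    kind, value = action
    return state + value if kind == "add" else state * value

@dataclass
class Server:
    now: int = 0           # current tick
    final_tick: int = 0    # ticks before this are official and immutable
    final_state: int = 1   # state at the start of final_tick
    pending: dict = field(default_factory=dict)  # tick -> actions, still undoable

    def submit(self, tick: int, action: tuple) -> bool:
        """Accept an action stamped in the past, if it is inside the window."""
        if tick < self.final_tick or tick > self.now:
            return False   # too old (or from the future): reject, resync client
        self.pending.setdefault(tick, []).append(action)
        return True

    def present_state(self) -> int:
        """Backtrack to the finalized state and resimulate up to the present."""
        state = self.final_state
        for tick in range(self.final_tick, self.now + 1):
            for action in self.pending.get(tick, []):
                state = apply(state, action)
        return state

    def advance(self) -> None:
        """Move time forward; whatever leaves the window becomes official."""
        self.now += 1
        while self.now - self.final_tick > WINDOW_TICKS:
            for action in self.pending.pop(self.final_tick, []):
                self.final_state = apply(self.final_state, action)
            self.final_tick += 1

server = Server()
server.advance()                          # now = 1
on_time = server.submit(1, ("add", 5))
server.advance()
server.advance()                          # now = 3
before_late = server.present_state()
late = server.submit(0, ("mul", 2))       # arrives late, but inside the window
after_late = server.present_state()
server.advance()                          # now = 4, tick 0 becomes official
too_late = server.submit(0, ("add", 100))
print(on_time, late, too_late, before_late, after_late)
```

The late action changes the present state retroactively, which is exactly the behaviour every client would also have to reproduce. A real implementation would keep periodic snapshots instead of resimulating from the oldest tick every time.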
Client state hash, again
So the state hash trick actually does work in multiplayer, as long as the state being hashed is the state from (in this example) 3 seconds ago, a state which cannot be changed. This makes the client quite similar to the server, in that it has a rolling 3-seconds-ago window.
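A minimal sketch of such a state hash in Python. The choice of SHA-256, the JSON encoding, the integer coordinates and the entity layout are all assumptions for the example, not anything the post specifies. The one point that matters is that both sides hash the same finalized state (the one from the far edge of the window) in the same canonical form.

```python
import hashlib
import json

def state_hash(entities: dict, rng_state: int) -> str:
    """Hash entity data plus RNG state in a canonical (sorted-key) encoding."""
    canonical = json.dumps({"entities": entities, "rng": rng_state}, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Hypothetical finalized state: name -> [x, y, hp], integers only so that
# the encoding cannot differ between machines.
server_view = {"player": [10, 4, 250], "zombie_1": [12, 4, 40]}
client_view = {"zombie_1": [12, 4, 40], "player": [10, 4, 250]}    # same data
desynced_view = {"player": [11, 4, 250], "zombie_1": [12, 4, 40]}  # drifted by 1

in_sync = state_hash(server_view, 1234) == state_hash(client_view, 1234)
drifted = state_hash(server_view, 1234) == state_hash(desynced_view, 1234)
print(in_sync, drifted)  # True False
```

A mismatch only tells the server that the client has drifted; it would still need to send a corrective snapshot to bring the client back.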
There might be other concerns, or problems I haven't thought of. It is late, and I am tired, after all. And I don't know how well this would hold up with 6 players all together spamming dozens of skills all at the same time...
WHAT DOES IT ALL MEAN?!?!?!
qwave has suggested a radical change to our core game systems. The concept is not impossible, I think, but it isn't really feasible. It's simply too big of a change. It also has some security concerns. But it was fun to think about and discuss.
Code warrior
@rhys
Floating-point determinism:
Get rid of all floating-point usage in the core logic of PoE. Use fixed-point math: just overload the standard arithmetic operators (I think you do not need a lot of precision), and implement your own algorithms for irrational functions. For sin/cos I suggest using a table of values with linear interpolation.

Movement and combat:
The main problem is actually that you cannot do something half deterministic and half not. Either it is 101% deterministic or it is not. The non-deterministic part will overwhelm the 'deterministic' one.

" Please think about it for PoE_2.0. Thanks, HG
Roma timezone (Italy)
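The suggestion above (overloaded operators plus a sine table with linear interpolation) could look roughly like this in Python. It is a sketch under assumptions: 16 fractional bits, a 256-entry table, and angles measured in turns are arbitrary choices, and `Fixed` and `fixed_sin` are hypothetical names. The table is generated with floating point here for brevity; a real implementation would ship it as precomputed integer constants so that every machine uses identical values.

```python
import math
from dataclasses import dataclass

SCALE = 1 << 16   # assumed precision: 16 fractional bits
TABLE_SIZE = 256  # assumed table resolution over one full turn

@dataclass(frozen=True)
class Fixed:
    raw: int  # the represented value is raw / SCALE

    @classmethod
    def from_ratio(cls, num: int, den: int) -> "Fixed":
        return cls(num * SCALE // den)

    def __add__(self, other: "Fixed") -> "Fixed":
        return Fixed(self.raw + other.raw)

    def __sub__(self, other: "Fixed") -> "Fixed":
        return Fixed(self.raw - other.raw)

    def __mul__(self, other: "Fixed") -> "Fixed":
        return Fixed(self.raw * other.raw // SCALE)

    def __truediv__(self, other: "Fixed") -> "Fixed":
        return Fixed(self.raw * SCALE // other.raw)

# One extra entry so that interpolation never reads past the end.
SIN_TABLE = [round(math.sin(2 * math.pi * i / TABLE_SIZE) * SCALE)
             for i in range(TABLE_SIZE + 1)]

def fixed_sin(turns: Fixed) -> Fixed:
    """Sine of an angle given in turns (1 turn = 360 degrees), integer-only."""
    position = (turns.raw % SCALE) * TABLE_SIZE  # table position, scaled by SCALE
    index, fraction = divmod(position, SCALE)
    low, high = SIN_TABLE[index], SIN_TABLE[index + 1]
    return Fixed(low + (high - low) * fraction // SCALE)

half = Fixed.from_ratio(1, 2)
two = Fixed.from_ratio(2, 1)
print((two * half).raw == SCALE)                    # True: 2 * 0.5 == 1
print(fixed_sin(Fixed.from_ratio(1, 4)).raw == SCALE)  # True: sin(90 degrees) == 1
```

Every operation after table construction is integer arithmetic, so the results do not depend on the FPU, compiler flags or rounding mode.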
" Pardon my post, rhys, but this tastes of too much compromise for too little gain. The rolling windows of validation can and will be a big issue for multiplayer PVP. Because validation occurs only for past events, potentially one can create a hack that simulates a disconnection on near-death. Because the validation does not occur at realtime, this means that you either compromise and mark all disconnects as deaths/defeats, making people with laggy or unstable connections mad, or mark disconnects as draws, making "death-negating-hack-for-PVP" possible. Additionally, assuming that the way the state hash algorithm is computed becomes known, since the server validates the state hash of a single client with regards to the states of other clients, malicious code can be written that acts as if your "pseudo-valid" state hash invalidates OTHER clients, causing THEM to be desynced or booted off the server. This would be the equivalent of a "SCREW YOU, KICK THEM ALL OFF THE INSTANCE" button. I do not need to state the danger of including something this volatile in the current system, that should be self-obvious. Also: " I agree. It's a cardinal rule in cryptography not to use floating-point numbers as crypto keys, simply because of this inconsistency. Last edited by Sachiru#1510 on Nov 23, 2013, 5:57:46 AM
" The thing about using fixed-point math is this question: "How many digits of accuracy shall we use?" Under your fixed point math system, you can have 7/3 evaluate to either 2 2.3 2.33 2.333 and so on and so forth. The first consideration regarding this is, again, rounding. If you're using 7/3 as a calculation of distance, where the result is the area of effect of something, if you're using 1 digit of accuracy you either round it to 2 or 3. Increasing 2 by 12% (as in a Notable node that increases area of effect by 12%) would do jack shit because it still gets rounded down to 2. Assume that you use 2.3, or two digits of accuracy. You solve the issue of the notable node, increasing the 2.3 to 2.6, however what about the small AoE nodes (the 4% increased AoE ones)? That still gets rounded down. As you increase accuracy, however, the same rounding errors add up. For skills that deal damage, adding in a mix of More, Less, Increased and Reduced X damage would soon start to result in rounding errors that would confuse and most likely annoy players. They'd start to wonder why Increased by X, Less by Y would result quite differently from two Increased by (x/2), Less by Y effects, despite the math being the same. The second area of consideration is overflow handling. Say you have something that does 32767 damage, and is increased by 33%. What's your new damage? Under fixed point math, you need to consider bounds and overflow values, otherwise your 999 damage increased by 33% might start to become -32767 damage, and you start healing monsters with your blows. 
This introduces computational complexity (as in your code has to do more stuff to handle the overflow), programming complexity (Very few computer languages include built-in support for fixed point values, because for most applications, binary or decimal floating-point representations are usually simpler to use and accurate enough, so you need to add in your own custom fixed-point handling code), and resource usage (you're essentially preventing yourself from using faster FPU-based instructions, thus making your code unnecessarily slower). The third area of consideration is architectural differences. Note that the same source code compiles differently across devices, and is run through different architectures, architectures which may have differing performance goals in mind. Code that runs fixed-point math speedily on an Intel Core i7, 4th Generation, might run sluggishly on an Intel Core i7, 2nd Generation processor, due to differences in pipeline and microcode. You might get deterministic performance, true, but would that be okay to GGG if it meant that a significant portion of their player base would be playing at half speed? Or that you get inherently faster synchronization just because you're using a specific video card that can handle the specific kind of calculations that GGG uses quickly? As always, determinism in state generation to achieve synchronization over an unstable transfer medium is like using a potato peeler to tune your car: It can with a lot of finagling work, but it's not the practical or even proper solution to things. |
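The rounding and overflow concerns above can be checked with a few lines of Python. The sketch assumes truncating integer arithmetic and a signed 16-bit storage type (the post does not name one, but 32767 is the 16-bit maximum); `increased` and `to_int16` are hypothetical helpers written for this example.

```python
def increased(value: int, percent: int) -> int:
    """Apply 'increased by percent', truncating the result to a whole number."""
    return value * (100 + percent) // 100

def to_int16(value: int) -> int:
    """Wrap an integer the way a signed 16-bit variable would."""
    return (value + 0x8000) % 0x10000 - 0x8000

# Radius stored in whole units: both nodes are lost to truncation.
print(increased(2, 12), increased(2, 4))    # 2 2

# Radius stored in tenths (23 means 2.3): the 12% node registers, the 4% one does not.
print(increased(23, 12), increased(23, 4))  # 25 23

# 32767 damage increased by 33% no longer fits in 16 bits and wraps negative.
print(to_int16(increased(32767, 33)))       # -21956
```

Wider integers and a finer scale push both problems further away, at the cost of having to pick and enforce those limits everywhere.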
" Inaccurate. This thing is about ANY non integer calculation, even floating. " Actually computer language does not support a lot of things. That why you need to write code. Do you really think that a skilled programmer cannot write their own functions to do multiplication/subtractions? It is teached in almost every elementary course of informatic. Concerning speed isssue, i'm 100% sure it is not a client issue. Maybe the servers could have some issues due to handle 1000+ instance. But Rhys said that it want to implement this only for map generation, i do not think that it is an heavy overload. If he cannot solve his problem playing with compiler flags, fixed point math is the only way to do that. PS: actually another try could be split each floating point operation as binary operations: i.e a=(b+c)*d;//wrong code temp=b+c; a=temp*d;//right code + you have also to try to play with compiler option. Roma timezone (Italy)
" The considerations about digits of accuracy used in fixed point math, as I stated, are moot for floating point math. " Note how I said "Built-in" support. Yes, languages can support anything, but few languages have fixed point math baked into them from the beginning. And as every programmer knows, the moment you try to customize something instead of using something baked in, errors will inevitably come up, and weird behavior results. The lack of codified standards for fixed point mathematics in compilers and languages means that Programmer A can specify that his fixed point math engine X does F when faced with N, whereas Programmer B can specify that his fixed point math engine Y does G when faced with the same N, and have both of them be correct. Since I doubt that PoE is developed by a single developer who does not have multiple-personality disorder, we can assume that even with a common engine used by the developers, inconsistencies that can be gamebreaking under the right circumstances can and will occur, because, well, human. Also, note that Rhys stated that FP determinism generated problems for map generation. This is a thread about desync. We can infer from Rhys' statement that they have and are considering using deterministic seeds, generated from FP math, to initialize, or start from the same state, when generating maps. He does not say that these deterministic seeds will help them REMAIN in the same state, just START from it. Since maps do not change after generation, there is no entropy involved in their continued maintenance, unlike game state, thus only starting from synchronized status is needed for them. Last edited by Sachiru#1510 on Nov 23, 2013, 7:08:35 AM
We're 120 pages into this, and still 90% of the 'experts' here are not understanding the core principles behind securing client/server communications.
1. Client hacks mean nothing, they cannot compromise the game in any way.
2. Packet spoofing means nothing, it cannot compromise the game in any way.
3. The only thing that matters as far as security goes is how the server processes the packet, performs the simulation, and persists the new state.

Some of you are so obsessed with 'dont trust the client!!' that you have forgotten the client does not have read/write access to the PoE database. As long as the server authoritatively processes each action, no amount of client/packet hacking can ever compromise the game in any way.

--> My proposal only enables the client to perform the same simulation as the server <--
--> This allows the client to function without the latency of full round-trips <--

This does not at all imply that the client has read/write access to the database, nor does it imply that ANY of the client's actions are not authoritatively validated by the server as is currently done in every online game.

Diablo 3 performs all of the attack calculations on the client-side. This is why the game feels responsive and does not suffer from constant desync. Try 'hacking' it all you want, Diablo 3 is secure despite having a smart game client. Just because the client can perform combat calculations does not imply that the server does not provide a secure layer to validate input/output.
Last edited by qwave#5074 on Nov 23, 2013, 7:45:08 AM
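A rough Python sketch of the idea argued above: the client runs the same simulation for responsiveness, while the server recomputes everything itself and never stores what the client claims. This is not qwave's code sample or Diablo 3's design; `resolve_attack`, `AuthoritativeServer` and the shared seed are assumptions made for the illustration.

```python
import random

def resolve_attack(monster_hp: int, seed: int) -> int:
    """Shared deterministic rule: the damage roll depends only on the seed."""
    damage = random.Random(seed).randint(10, 20)
    return max(0, monster_hp - damage)

class AuthoritativeServer:
    def __init__(self, monster_hp: int, seed: int):
        self.monster_hp = monster_hp
        self.seed = seed

    def handle_attack(self, claimed_hp: int) -> tuple:
        """Recompute the outcome; the claim is only used to detect desync."""
        self.monster_hp = resolve_attack(self.monster_hp, self.seed)
        self.seed += 1
        return self.monster_hp, claimed_hp == self.monster_hp

# Honest client: predicts locally, shows the result at once, server agrees.
honest_server = AuthoritativeServer(monster_hp=100, seed=7)
predicted = resolve_attack(100, 7)
honest_hp, honest_in_sync = honest_server.handle_attack(predicted)

# Hacked client: claims an instant kill; the stored state is unaffected.
cheat_server = AuthoritativeServer(monster_hp=100, seed=7)
cheat_hp, cheat_in_sync = cheat_server.handle_attack(0)

print(honest_in_sync, cheat_in_sync, honest_hp == cheat_hp)  # True False True
```

Note that this only covers tampering with results. It does not address the earlier objection in this thread that a client which can simulate outcomes ahead of time may choose which actions to send.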