Possibly the goofiest MCOC bug so far
DNA3000
Member, Guardian › Posts: 19,841
I'm the first to say, programming is tricky, programming games is not easy, programming game engines is complicated. But then there are times the game just does goofy s---tuff, and the only reasonable explanation is drugs.
Here is what, as far as I can tell, is a perfectly ordinary alliance, with perhaps a slightly unconventional name.
Why am I pointing them out? Because this is how the arena leaderboard lists someone from this alliance:
If it isn't obvious:
(Yes, I have a pretty good idea what's happening here. Why it is happening in a text field that is supposed to be five characters long is where the drugs come in. Also, yes, I placed 418. That's somewhat less amusing)
Comments
I suspect improper code reuse, but it could also just be a mistake. Instead of validating the string, someone decided to parse it. That does almost the same thing, right up until the code mischaracterizes the data type being parsed.
Knowing what is likely happening on a technical level makes this no less goofy. In fact, in my opinion it makes it even more goofy. Why do that to a presumably five character text field? There's nothing to parse, so there's no reason anything should be cast to anything else. With one possible exception that, if it is happening, is horrible. So horrible I'm not even going to voice it, although the more security-minded might guess what I'm thinking. It's an unfortunately all-too-common design error.
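For the non-programmers, here's roughly the difference, sketched in Swift (the function names here are hypothetical, not anything from the game's actual code):

```swift
// Validation just checks the rules and keeps the tag as text.
func validateTag(_ tag: String) -> String? {
    guard (1...5).contains(tag.count) else { return nil }
    return tag                      // "2.800" stays the five characters "2.800"
}

// Parsing tries to infer a type from the contents, which is where "2.800"
// silently stops being text.
func parseTag(_ tag: String) -> Any {
    if let number = Double(tag) {   // "2.800" happens to look like a number
        return number               // so it comes back as the float closest to 2.8
    }
    return tag                      // "ABCDE" would still come back as text
}
```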
https://stackoverflow.com/questions/68122865/string-ecoding-returns-wrong-values-33-48-becomes-33-47999999999999488
https://github.com/apple/swift-corelibs-foundation/issues/4255
https://forums.swift.org/t/floating-point-precision-issue-when-decoding-json-value/54690
The [2.800] tag is for another alliance.
What DNA is pointing out is how the [2.800] value got converted into [2.79999999999999998] in the Arena Leaderboard, which I suspect is an encoding issue.
Not that I am suggesting anyone name their alliance drop table or anything...
I ended up doing a little too much
Since this has gone from being a joke to being a programming course, might as well go all in. For the benefit of anyone still reading who might not understand what's going on here and what the discussion is about: the issue is how computers represent numbers, and really how they represent anything at all. I'm going to quote the issue in the first linked StackOverflow article as an exemplar to explain it. The question in the article states:
"Here in myObjectValues contains a Decimal value of 33.48. If i try to encode this mydata to string, the value returned is 33.47999999999999488."
However, one of the respondents correctly points out that the person asking the question is wrong. myObjectValues does *not* contain a Decimal value of 33.48, because the explicit data type of myObjectValues is double. In other words, it is a 64-bit floating point number. Doubles are essentially binary notation numbers (with some exponent stuff not important here). How do you represent 33.48 *exactly* in binary? You can't, any more than you can represent 1/3 in decimal precisely. When you set a double precision variable in Swift to "33.48", the compiler has to round that number off to the closest possible value you can shove into a 64 bit float. And that value is, in Swift, apparently 33.47999999999999488. So when the code tries to convert this numeric value into a string, the string gets filled with the best possible representation of the actual value stored, which is "33.47999999999999488".
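You can see the same effect in a couple of lines of Swift. This is a minimal sketch of the idea, not the exact code path in the StackOverflow question (which goes through Decimal and JSONEncoder, so the trailing digits differ slightly):

```swift
import Foundation

let value: Double = 33.48
print(value)                           // "33.48", the shortest string that round-trips
print(String(format: "%.17f", value))  // more digits reveal the stored value isn't exactly 33.48
```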
So of course that's likely to be what's happening here. Some part of the game code took the alliance tag, converted it to a number, then converted that number back into a string, and during the round trip the 2.800 got turned into 2.799999... and then to the expanded string. The question is why? Why convert the alliance tag into a number in the first place? It is just a bunch of characters. Most alliance tags can't be converted into a number. If the game tried to do that constantly, most of our alliance tags would show up broken in the leaderboard. So that's not what's happening.
What's more likely to be happening is the alliance tag is being read by a parser. A parser is what we typically call a bit of code that is used to read a bunch of data with a (possibly) unknown structure, and try to infer that structure from the contents of the data itself. So "ABCDE" gets converted into a string with the contents "ABCDE" in it, but "2.800" gets converted into a number with the value 2.8 within it. The code is guessing you meant the data to be a string of characters in the first case and a number in the second case. It is what Excel does by default when you type into an empty cell. Excel tries to figure out if you typed a number, or a word, or a date, or whatever.
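As a rough illustration, here is Foundation's JSONSerialization standing in for whatever parser the game actually uses (the real code path is unknown; this just demonstrates the same guessing behavior):

```swift
import Foundation

// Two five-character "tags" fed through a type-inferring parser.
for fragment in ["\"ABCDE\"", "2.800"] {
    let data = fragment.data(using: .utf8)!
    let parsed = try! JSONSerialization.jsonObject(with: data, options: [.allowFragments])
    print(type(of: parsed), parsed)
}
// The first comes back as a string, the second as a number (2.8),
// even though both started life as five characters of text.
```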
But that's overkill for the alliance tag. The tag is always meant to be a sequence of one to five unicode characters. That tag should never be interpreted as anything other than a unicode string. So any code that reads it should just read the data and interpret that as a sequence of unicode characters and nothing else. By asking "I wonder what this might be" you get the results above.
So why go through all the trouble to do more work than necessary? Best guess, someone decided that rather than write whatever code was necessary to handle the alliance tag, they would grab some code that already did what they needed and reuse it. But whatever code they reused did more than what they needed: it not only could read and write and otherwise pass around unicode text, it could also parse the data looking for data other than text and handle it appropriately. This overkill then gets exposed when someone creates an alliance tag that precisely matches one of the things the parser is looking for.
To summarize, the "2.800" in the alliance tag is not a number. 2.800 is not a number in this context. It is a sequence of bytes representing unicode 2, unicode decimal point, unicode 0, unicode 0, unicode 0. But the code used to read that data decided that in fact "unicode 2, unicode decimal point, unicode 0, unicode 0, unicode 0" should be interpreted as the 64-bit float whose mantissa is 0110011001100110011001100110011001100110011001100110 and whose exponent is 10000000000. And then some other bit of code took that value and converted it to text for display, and ended up with 2.799999...
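You can watch that round trip in a few lines of Swift (again, a sketch of the idea, not the game's actual code):

```swift
import Foundation

let tag = "2.800"                     // five characters of text
let misread = Double(tag)!            // the parser's (mistaken) numeric interpretation
print(String(misread.bitPattern, radix: 2))
// prints the exponent bits 10000000000 followed by the fraction bits 0110...0110
// quoted above (the leading sign bit is 0, so it gets dropped from the printout)
print(misread)                        // "2.8", the shortest string that round-trips
print(String(format: "%.20f", misread))
// prints the long 2.7999999999999998... expansion, the same kind of thing
// that showed up on the leaderboard
```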
Maybe that would have broken the camel's back, though.
In decimal, we have standard scientific notation for big and small numbers. We express a number as a single digit plus decimals, multiplied by ten raised to some exponent. So the way to express nine million two hundred thousand is 9.2 x 10^6, or 9.2 E 6. That is 9.2 x 10^6 = 9.2 x 1,000,000 = 9,200,000.
In binary we do the same thing. All numbers are expressed as a single digit plus decimals. There are only two digits in binary, so all binary numbers are expressed as 1.xxxxx. Because the first digit is always 1 in this format, we don't actually have to store it: it is implied.
So the mantissa above, 0110011001100110011001100110011001100110011001100110, should be interpreted as 1.0110011001100110011001100110011001100110011001100110. Positions to the right of the decimal point are fractional powers of two, just like in decimal they are fractional powers of ten. 9.2 is nine plus 2/10. 9.234 is nine plus 2/10 plus 3/100 plus 4/1000.
1.0110011001100110011001100110011001100110011001100110 is one plus 0/2 plus 1/4 plus 1/8 plus 0/16 plus 0/32 plus 1/64, etc. If we focus on just the fractional part we get:
0/2 = 0
0/2 + 1/4 = 0.25
0/2 + 1/4 + 1/8 = 0.375
0/2 + 1/4 + 1/8 + 0/16 + 0/32 + 1/64 = 0.390625
0/2 + 1/4 + 1/8 + 0/16 + 0/32 + 1/64 + 1/128 = 0.3984375
0/2 + 1/4 + 1/8 + 0/16 + 0/32 + 1/64 + 1/128 + 0/256 + 0/512 + 1/1024 = 0.3994140625
basically getting closer and closer to 0.3999999999...
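A quick loop shows the same convergence, adding one 0110 group at a time (a throwaway check, nothing to do with the game):

```swift
// Each "0110" group in 0.4's binary expansion contributes 0.375 at 1/16th the
// scale of the previous group, so the partial sums creep toward 0.4.
var sum = 0.0
var scale = 1.0
for _ in 0..<13 {          // 13 four-bit groups fill the 52-bit fraction field
    sum += 0.375 * scale
    scale /= 16
    print(sum)             // 0.375, 0.3984375, 0.39990234375, ...
}
```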
Incidentally, 0.4's binary expansion, if you look carefully, is actually repeating, just as 1/3 is a repeating decimal in decimal notation. In decimal, 1/3 = 0.3 repeating. In binary, 0.4 is 0.0110 repeating. Which means in binary, 0.4 is 0.375 + 0.375/16 + 0.375/256 + 0.375/4096 ...
The actual binary floating point representation of 2.8 is thus essentially 1.4 E1: we drop the leading one (because it is implied) and store 0.4 as best we can, plus an exponent of 1, which means multiply everything by 2 (just as E1 in decimal would multiply everything by ten).
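Swift will actually show you that decomposition directly, if you want to see the "1.4 E1" form for yourself (a small sketch, nothing game specific):

```swift
let x = 2.8
print(x.exponent)      // 1, the power of two (the binary "E1")
print(x.significand)   // prints 1.4, though the stored value is a hair under 1.4,
                       // which is exactly where the 2.7999... comes from
```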
I’ll just add the following.
(excerpt from prior DNA explanation…)
myObjectValues does *not* contain a Decimal value of 33.48, because the explicit data type of myObjectValues is double. In other words, it is a 64-bit floating point number. Doubles are essentially binary notation numbers (with some exponent stuff not important here). How do you represent 33.48 *exactly* in binary? You can't, any more than you can represent 1/3 in decimal precisely.
So, @Grub88 basically wanted the (“with some exponent stuff not important here”) stuff, lol.
And (DNA can correct me if I'm wrong), but I think referring to the value type as "double" is not technically saying it is of the "Double" data type, but rather that the storage space used is the same as what would be taken up by a "Double" data type. That being, a floating point number uses 64 bits, just like a Double would, although a F.P. number treats its 64 bits very differently than what a "Double" would.
Basically (without parsing the longer explanation in too much depth), those 64 bits are split between some number of bits used for the "Mantissa" portion and some number of bits used for the Exponent portion. (It's been too long since I've dealt with these, but you can easily look that up to see the exact breakdown of those 64 bits for a F.P. number.)
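For what it's worth, Swift will report that split directly; the standard 64-bit (IEEE 754 double) layout is 1 sign bit, 11 exponent bits, and 52 explicit significand bits:

```swift
print(Double.exponentBitCount)      // 11
print(Double.significandBitCount)   // 52
// plus 1 sign bit = 64 bits total
```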
Most of this stuff was standardized at a time when 32-bit processors were mainstream, and therefore 32 bits was the standard size of a binary "word." 64 bit quantities were referred to as "double." Double (binary) was 64 bit integers, and double float was 64 bit floating point representations. Increasingly, things have moved to 64 bit words as the native word size on modern CPUs, but the legacy lingo is still that 32 bit is single and 64 bit is double most of the time. However, there are platforms and languages where 64 bit quantities are the norm, and 64 bits isn't "double." In fact, the standard tries to move us away from that for this reason: we should be calling them fp32 and fp64 to be specific and platform independent. But old people like me are probably going to be calling those single and double floats until we die.
I would be remiss if at this point I didn't mention the x86 long double float, which is an 80 bit precision pseudo standard implemented in x86, which basically means almost everywhere. The idea was that since 64 bit processors can handle 64 bit quantities natively, a floating point number with a 64 bit mantissa plus a sign bit and a 15 bit exponent would have more precision (64 bits instead of 52) without costing any more calculation time (kinda sorta, this gets into CPU architecture details, a completely different side track entirely).
Typing and representation is one of those things most programmers try not to think about too much, and try really hard to write code that doesn't need to know about, but when it rears its ugly head it can make life all kinds of interesting if you are not careful.
Ergo, concordantly.
Vis-à-vis.