-
Notifications
You must be signed in to change notification settings - Fork 7.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix #71152: mt_rand() returns the different values from original mt19937ar.c #1681
Conversation
Comparing this with Richard J. Wagner's code there's obviously a typo, so thanks for pointing that out. However, fixing the bug would introduce a BC break. Not sure, what to do. |
Conflicts: ext/standard/rand.c
In my opinion, breaking backward compatibility in this case is not a large issue, since commit a930751 have changed the behavior of the function and no one have complained about that until now. |
I agree, it's weird people would rely on what the random number self is... |
Unfortunately, This was reverted by: a0724d3 , you may want to reopen this. sorry & thanks |
Sorry I was unaware this had a PR with some discussion already when I reverted. So here's a few things to add to the discussion
|
I think the main argument here is that Mersenne Twister, or more precisely, MT19337, is a well-known PRNG with well understood properties. Of particular interest is the high period of 2^19337 - 1, and the high level of equidistribution. The paper, having been published in 1998, has not only seen peer review, but also withstood the test of time since then. For reference, it's here: http://ww3.ticaret.edu.tr/mckasapbasi/files/2013/10/p3-matsumoto.pdf Now the current implementation might or might have not the same properties. I don't know, you don't know, and even if somebody with a sufficient clue sits down and finds out, who knows if that person gets it right without sufficient peer review? What this boils down to is that existing code has been written, assuming the documented properties of the code. You might have been breaking that code all along and don't even know about it. Making the change makes sure the algorithm is implemented as documented, and thus has the properties promised by the documentation and literature. If it were my call, I'd introduce a set of legacy functions implementing the old behavior, to support users who absolutely have to rely on the old code. But mt_rand() really should work as advertised. |
@Andreas23 I disagree with making legacy functions, there is already too many functions for random numbers anyway, however, the typo still needs to be fixed. I think we should just treat it like any other BC break, push it to the master (which is 7.1) and document the change. |
@lt As I noted in the bug ticket, this typo possibly makes a bad effect to randomness quality. I think that mt_rand() should return numbers generated by "Mersenne Twister" and this fix should be merged eventually. |
@Andreas23 @kusano After some quick and dirty analysis the entropy doesn't seem to be affected. @Andreas23 I'd be against legacy functions in core, the implementation is trivial and can be implemented in userland if required. @kusano If we're going to change the output of this function, there's a few more things we should look at to make sure we get it all done at once. Firstly, opinions on switching to mt19937-64 (at least on 64 bit platforms - including your fix as-is for 32 bit)? As it stands there is no good way to obtain 64 bit pseudo-random without resorting to the cryptographic random functions - this feels unnecessarily heavy handed. (binary concat of mt_rand outputs is another way, but you actually require 3 calls to it to get the full 63 bits as we're returning 31 bit values) Which brings us onto the current scaling mechanism, which is pretty horrific. Lastly, whatever changes are made, a note in UGRADING is a good idea. |
I'd call it broken, see PR #1416. |
Only just got around to catching up on internals, I see last night @nikic proposed a few things too. If we're going to fix this "typo", we should fix all of the loose ends at the same time and get as much benefit out of the BC break as possible. I can get behind that. |
@lt I want mt19937-64 with a different name like C++ to keep the same results in 32 bit and 64 bit platform. |
I'd prefer consistency between 32-bit and 64-bit. Since fixing our MT implementation means breakage, we might as well go 64-bit on both platforms while we're at it. Modern compilers can emulate 64-bit when targeting 32-bit. |
Just a quick note, since 5.2.1 we can change the results at wish as the same seed won't produce the same sequence anymore. There was a need or maybe still has using rand () to generate same sets of "random" values like maps for games and similar predictable series. |
Well, yes, but code might have relied on the post-5.2.1 behaviour. It's pretty unlikely, though. I'm probably overly cautious. |
Wondering how it could rely on it as it is supposed to be not (easily)
|
You would rely on it by assuming that, given a specific seed, it would always give the same sequence of numbers. You don't care about predictability so much as reproducibility. |
If it helps, here is an example: We use a In this case a change in |
On Feb 21, 2016 7:38 PM, "Andrea Faulds" [email protected] wrote:
Right and it does not, nor it should since 5.2.1, gives the same result or And while it is a different topic, predictability is quite important (if it
|
No, it gives the same sequence, just not the same ones < 5.2.1 produced:
|
On Feb 22, 2016 2:49 AM, "Andrea Faulds" [email protected] wrote:
Oh, that's bad and the doc is wrong too: http://php.net/manual/en/function.mt-srand.php See the note.
|
On Feb 22, 2016 11:50 AM, "Pierre Joye" [email protected] wrote:
In any case, what is the appealing reason to change the implementation? If it is making it "more secure", then I cannot agree on changing it now :)
|
@pierrejoye If And then I will submit a feature request for a seedable PRNG. |
On Mon, Feb 22, 2016 at 11:58 AM, Jesse Schalken [email protected]
So yes, we should change that as there are known use cases of using same Pierre |
@pierrejoye The documentation states:
Also the name of the function hints at that. However, the current implementation doesn't implement the Mersenne Twister algorithm. If we don't change the implementation, at least the documentation has to be fixed. |
I think we should fix the function. Mersenne Twister is a tried-and-tested algorithm, our buggy one which just happens to work isn't. I just think we shouldn't change it in 7.0.x or 5.6.x. |
I think we should fix this function in 7.0.x or 5.6.x. Because PHP minor / major version up is ambigous that I saw language specification change in minor version up. |
No description provided.