While looking for something else, and given some recent posts concerning that darn RNG 🙂, I ran across this topic. It's not too old, so I thought I'd dump some of my experience in here. Let me know if I'm preaching to the choir 🙂 Feel free to totally ignore this.
In my previous life I did a lot of computer modelling of complex systems, mostly disk storage arrays and their performance characteristics. The modelling was validated using experimentation and statistical analysis of the observational results. RNGs were used heavily. RNGs are not as "random" as people think (or expect), and that can be a good thing if utilized correctly. For example, while you do want to introduce an element of randomness in access patterns, you want that "randomness" to be repeatable so that you can re-do an experiment against a set of changes, like new array microcode, and observe any changes in behavior, good or bad.
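Here's a minimal sketch of what "repeatable randomness" looks like, using Python's standard `random` module. The function name, the seed values and the block count are made up for illustration; they are not from any real array test harness.

```python
import random

def access_pattern(seed, count=5, blocks=1000):
    """Return a repeatable list of pseudo-random block numbers (hypothetical helper)."""
    rng = random.Random(seed)  # private generator, independent of the global one
    return [rng.randrange(blocks) for _ in range(count)]

# Same seed before and after a change (e.g. new microcode) gives the same pattern,
# so any difference in measured behavior comes from the change, not from the RNG.
baseline = access_pattern(seed=42)
after_change = access_pattern(seed=42)

print(baseline == after_change)  # True: identical sequences from identical seeds
```

Changing the seed gives a different, but equally repeatable, pattern.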
RNGs come in two flavors (well, the two most popular; there are more): "Uniform Distribution" and "Normal Distribution". In a normal distribution the probability of any given number is greatest around the mean and decreases rapidly towards either end of the "range" of values, whereas in a uniform distribution every number in the range is as likely as any other (see image below).
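To make the difference concrete, here's a rough sketch that draws from both kinds of distribution and counts how many values land in each fifth of a 0 to 100 range. The range, mean, standard deviation and seed are arbitrary choices for the example.

```python
import random

rng = random.Random(1)   # fixed seed so the run is repeatable
N = 100_000

uniform = [rng.uniform(0, 100) for _ in range(N)]  # flat across 0..100
normal = [rng.gauss(50, 15) for _ in range(N)]     # bunched around a mean of 50

def share_in(values, lo, hi):
    """Fraction of values that fall in the half-open interval [lo, hi)."""
    return sum(lo <= v < hi for v in values) / len(values)

for lo in range(0, 100, 20):
    hi = lo + 20
    print(f"{lo:3d}-{hi:3d}  uniform {share_in(uniform, lo, hi):.3f}"
          f"  normal {share_in(normal, lo, hi):.3f}")
```

The uniform column should come out close to 0.2 in every row, while the normal column peaks in the middle row and tails off towards both ends.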
For any simulation work I always used a uniform distribution RNG so that, over time, all possible numbers in the range are generated. For a "Train Drop System", you could simply "bucketize" the number (work out where it falls in the total range, expressed as a percentage of the range) and then match that against the published drop rates for each type of container. You don't need to remember what types were previously dropped in order to apply some "correction"; with a uniform distribution the observed rates settle toward the published ones on their own as the number of draws grows. What you do need to do is maintain a separate "seed path" (initial and modified seeds) for each container, so that the results of one type of container don't affect what a different type of container generates next.
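The "bucketize" step can be sketched like this. The rarity names and drop rates below are invented for the example (I don't know the game's real tables), and `bucketize` is just my name for the helper.

```python
import random
from collections import Counter

# Hypothetical published drop rates for one container type; they sum to 1.0.
DROP_RATES = [("common", 0.70), ("rare", 0.20), ("epic", 0.08), ("legendary", 0.02)]

def bucketize(u, rates=DROP_RATES):
    """Map a uniform number u in [0, 1) to a rarity via cumulative drop rates."""
    cumulative = 0.0
    for name, rate in rates:
        cumulative += rate
        if u < cumulative:
            return name
    return rates[-1][0]  # guard against floating-point rounding at the top end

rng = random.Random(2024)  # this container's own seeded generator
DRAWS = 100_000
counts = Counter(bucketize(rng.random()) for _ in range(DRAWS))

for name, rate in DROP_RATES:
    print(f"{name:10s} published {rate:.2f}  observed {counts[name] / DRAWS:.3f}")
```

Nothing here looks at previous drops, yet over a large number of draws the observed column should sit close to the published one.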
In fact, if you are going to publish "drop rates" for anything (for example, types of parts from the "free parts" box), it would be best to maintain totally separate "seed paths" for each thing. Folks often use a single path for any and all uses of an RNG. That may be fine if you are just trying to determine how many Nails will be needed for a building upgrade, or how many barrels of crude oil you'll get when tapping on a whistle in your station, but it's not fine when trying to conform to a published "drop rate".
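Here's a small sketch of separate "seed paths", assuming one generator object per thing that has a published rate. The container names and seed numbers are placeholders I made up.

```python
import random

# Hypothetical names and seeds: one independent generator per "seed path".
SEEDS = {"train_container": 101, "free_parts_box": 202}
streams = {name: random.Random(seed) for name, seed in SEEDS.items()}

# Run A: only the train container is ever opened.
solo = [random.Random(SEEDS["train_container"]).random()]
solo_rng = random.Random(SEEDS["train_container"])
solo = [solo_rng.random() for _ in range(3)]

# Run B: the parts box is opened between train draws, but on its own stream.
interleaved = []
for _ in range(3):
    interleaved.append(streams["train_container"].random())
    streams["free_parts_box"].random()  # does not disturb the train stream

# Run C: both uses share one stream, so parts-box draws eat train values.
shared = random.Random(SEEDS["train_container"])
shared_interleaved = []
for _ in range(3):
    shared_interleaved.append(shared.random())
    shared.random()  # parts box consumes the next value from the same stream

print(solo == interleaved)         # True: the train sequence is unaffected
print(solo == shared_interleaved)  # the shared stream gives a different sequence
```

With a decent generator a shared stream should still land near the same long-run rates; what the separate paths buy you is that each container's sequence can be replayed and checked on its own, regardless of what else the player did in between.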
Still, it's no guarantee that some sequences of a "Uniform Distribution RNG" won't give 5 Common Trains in a row (random is as random does), but it should even out over time. And, of course, once it does decide to give you that legendary engine, having it be biased toward one that you don't already have is another issue entirely 🙂
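To put a rough number on "5 Common Trains in a row": if each draw is independent and a Common comes up with probability p, the chance that any particular 5 draws are all Common is p to the 5th power. The sketch below uses a made-up 70% Common rate, not a real published figure, and checks the arithmetic against a quick simulation.

```python
import random

P_COMMON = 0.70  # hypothetical drop rate for a Common train
STREAK = 5

# Exact chance that 5 independent draws in a row are all Common.
exact = P_COMMON ** STREAK

# Simulated check: repeat the 5-draw experiment many times.
rng = random.Random(7)
TRIALS = 100_000
hits = sum(
    all(rng.random() < P_COMMON for _ in range(STREAK))
    for _ in range(TRIALS)
)
simulated = hits / TRIALS

print(f"exact: {exact:.4f}  simulated: {simulated:.4f}")
```

At that assumed rate a run of five Commons is about a 1 in 6 event for any given five draws, so seeing one is not evidence of a broken RNG.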
Well, this turned out to be longer than I expected, thanks for letting me dump this here (assuming it doesn't get deleted by a mod 🙂 ). And I won't mind if many decide "tl;dr;dc" 🙂