Solution - Build SH*T with Julia

11.2 Solution

We start by defining a few global variables required by LCG as well as a a way to set our seed to a desired value.

import Dates as Dt

a = 1664525
c = 1013904223
m = 2^32
seed = round(Int, Dt.now() |> Dt.datetime2unix)

function setSeed!(newSeed)::Nothing
    global seed = newSeed
    return nothing
end

The values were chosen based on this table from Wikipedia.

Time to translate the LCG algorithm to Julia’s code.

function getRandFromLCG()::Int
    newSeed::Int = (a * seed + c) % m
    setSeed!(newSeed)
    return newSeed
end

Note: In general a function should not rely on nor change global variables. Instead it should only depend on the parameters that were send to it as its arguments. Relying on global variables although convenient and tempting could lead to bugs that are hard to pinpoint and eliminate. That’s why in the rest of this book I will try to avoid this style with the exception of global constant variables seen, e.g. in Section 8.2.

Let’s see how it works.

# your numbers will likely differ from mine
[getRandFromLCG() for _ in 1:6]

[2395901413, 2765252096, 1707221343, 48070450, 238965993, 372185396]

A small victory. In result we got some integers that are very hard to predict for a human brain alone.

Still, the numbers are unwieldy and look quite odd. How can we transform them into a function that returns Float64 from the range [0-1)? The key is the modulo operator (% equivalent to rem function) in getRandFromLCG. All it is is just a reminder after a division. It has an interesting property, the remainder of i divided by m is always in the range 0 to m-1, see the example below.

[i % 3 for i in 1:7]

[1, 2, 0, 1, 2, 0, 1]

If so then all we have to do is to divide our seed (that takes values from 0 to m-1) by m to get the desired Float64 value.

function getRand()::Flt
    return getRandFromLCG() / m
end

# your numbers will likely differ from mine
[getRand() for _ in 1:3]

[0.5896293527912349, 0.0345228030346334, 0.3047891960013658]

Much better, we reached the first checkpoint. OK, now how to convert it to a random integer generator. Let’s try one step at a time. The getRand() returns a Float64 in the range [0-1). So, if we were to multiply 1 (the upper limit) by let’s say 5 we would get 5, and 0 (the lower limit) times 5 would get us zero. Hence, the following code (floor returns the nearest integer smaller than or equal to a Float64).

function getRand(upToExcl::Int)::Int
    @assert 0 < upToExcl "upToExcl must be greater than 0"
    return floor(getRand() * upToExcl)
end

Time for a test. If we did our job right we should get a sequence of random values each equally likely to occur (getCounts was developed and explained here).

function getCounts(v::Vec{T})::Dict{T,Int} where T
    counts::Dict{T,Int} = Dict()
    for elt in v
        counts[elt] = get(counts, elt, 0) + 1
    end
    return counts
end

setSeed!(1111) # for reproducibility
[getRand(3) for _ in 1:100_000] |> getCounts

Dict{Int64, Int64} with 3 entries:
  0 => 33421
  2 => 33054
  1 => 33525

Yep, roughly equal counts. One more swing with a different range.

setSeed!(1111) # for reproducibility
[getRand(5) for _ in 1:100_000] |> getCounts

Dict{Int64, Int64} with 5 entries:
  0 => 20046
  4 => 19821
  2 => 20111
  3 => 19882
  1 => 20140

Ladies and gentlemen, we got it. Now, let’s tweak it a bit so that we can get an integer in the desired range.

function getRand(minIncl::Int, maxIncl::Int)::Int
    @assert 0 < minIncl < maxIncl "must get: 0 < minIncl < maxIncl"
    return minIncl + getRand(maxIncl-minIncl+1)
end

function getRand(n::Int, minIncl::Int, maxIncl::Int)::Vec{Int}
    @assert 0 < n "n must be greater than 0"
    return [getRand(minIncl, maxIncl) for _ in 1:n]
end

For that we just generated an Int from 0 up to one more than the span of our range (maxIncl-minIncl+1) and added minIncl so that the range starts at that value and not at zero. Let’s check it out.

setSeed!(2222)
getRand(100_000, 1, 4) |> getCounts

Dict{Int64, Int64} with 4 entries:
  4 => 24969
  2 => 24635
  3 => 25162
  1 => 25234

Looks good, roughly equal counts was what we were looking for. One more time, just to be sure.

setSeed!(2222)
getRand(100_000, 3, 7) |> getCounts

Dict{Int64, Int64} with 5 entries:
  5 => 19944
  4 => 19722
  6 => 20224
  7 => 19874
  3 => 20236

Another checkpoint reached.

As for the Box-Mueller transform there is a small problem. The Wikipedia’s page already contains the algorithm implementation in Julia. So for a change we will take the JavaScript version and translate it to Julia.

function getRandn()::Tuple{Flt, Flt}
    theta::Flt = 2 * pi * getRand()
    R::Flt = sqrt(-2 * log(getRand()))
    x::Flt = R * cos(theta)
    y::Flt = R * sin(theta)
    return (x, y)
end

And voila, the only difference is that instead of a vector we return a tuple with each element being a value from the normal distribution with the mean = 0 and the standard deviation = 1. However, we would prefer a function that returns any number of normally distributed values (and not only two in Tuple{Flt, Flt}), hence the following code.

function flatten(randnNums::Vec{Tuple{Flt, Flt}})::Vec{Flt}
    len::Int = length(randnNums) * 2
    result::Vec{Flt} = Vec{Flt}(undef, len)
    i::Int = 1
    for (a, b) in randnNums
        result[i] = a
        result[i+1] = b
        i += 2
    end
    return result
end

function getRandn(n::Int)::Vec{Flt}
    @assert n > 0 "n must be greater than 0"
    roughlyHalf::Int = cld(n, 2)
    return flatten([getRandn() for _ in 1:roughlyHalf])[1:n]
end

getRandn(n::Int) generates a vector of tuples using comprehensions. The length of the vector is determined using cld, which divides n by 2, and returns the smallest integer equal to or greater than the result of that division. The vector of tuples (in this case Vec{Tuple{Flt, Flt}}) is flattened before being returned from getRandn(n::Int). That’s why we get a regular vector of floats (Vec{Flt}). Notice, that if n is an odd number then we return all but the last element ([1:n]) of the vector (since flatten returns a vector of even length). Anyway, let’s see how we did.

import Statistics as St

# test: mean ≈ 0.0, std ≈ 1.0
setSeed!(401)
x = getRandn(100)

St.mean(x), St.std(x)

(0.050968498407633414, 1.0266071355197548)

I would say we did a pretty good job.

Time for the last step, let’s transform getRandn to a function that provides a normal distribution with a specified mean and standard deviation. That’s fairly simple. Since the ‘original’ deviation is 1, then if we multiply the numbers by let’s say 16, then we will get a distribution with the mean 0 and standard deviation equal to 16. So how do we make a mean equal, let’s say 100? It’s easy as well, we just add 100 to every number from a distribution.

function getRandn(n::Int, mean::Flt, std::Flt)::Vec{Flt}
    return mean .+ std .* getRandn(n)
end

# test: mean ≈ 100.0, std ≈ 16.0
setSeed!(401)
x = getRandn(100, 100.0, 16.0)

St.mean(x), St.std(x)

(100.81549597452215, 16.425714168316077)

That seems to be it, we reached the end of this task. Together we developed a simplified library for random numbers generation and it wasn’t all that bad, was it? Still, for practical reasons I would recommend to use the baptized in fire Random.jl.

11.1 Problem ← → 12 Birthday

CC BY-NC-SA 4.0 Bartlomiej Lukaszuk