Solution - Build SH*T with Julia

26.2 Solution

For our first attempt we’ll go with something simple like the bubble sort algorithm.

# sorts in ascending order
function bs(v::Vec{A})::Vec{A} where A<:Union{Flt, Int}
    result::Vec{A} = copy(v)
    swapped::Bool = true
    while swapped
        swapped = false
        for i in eachindex(result)[2:end]
            if result[i-1] > result[i]
                result[i-1], result[i] = result[i], result[i-1]
                swapped = true
            end
        end
    end
    return result
end

Here, we implemented the algorithm for a vector that contains the elements of type A. In this case it is just a sub-type (<:) of Flt (an alias for Float64) or Int (Union{Flt, Int}). We started by copying the original vector to result and setting a variable named swapped. We will stop our sorting algorithm once there was no swap during the previous traversal of the whole vector (swapped in while is false but may be reset to true in the for loop). We search through the vector starting from its second element (i in eachindex(result)[2:end]). Every time we compare the two nearby elements (result[i-1] vs. result[i]). If the elements aren’t in a desired order (if result[i-1] > result[i]) we just swap them with each other (result[i-1], result[i] = result[i], result[i-1]) and set the swapped flag to true. Once we finish we return the result.

Let’s see how we did.

[47, 15, 23, 99, 4] |> bs,
[47.3, 23.2, 47.2] |> bs

([4, 15, 23, 47, 99], [23.2, 47.2, 47.3])

If you want to better visualize how the algorithm works, you may place println or @show constructs in a few places inside the function, e.g.

function bs(v::Vec{A})::Vec{A} where A<:Union{Flt, Int}
    result::Vec{A} = copy(v)
    swapped::Bool = true
    i::Int = 0
    while swapped
        i += 1
        @show i
        swapped = false
        for i in eachindex(result)[2:end]
            if result[i-1] > result[i]
                result[i-1], result[i] = result[i], result[i-1]
                swapped = true
            end
            @show result, swapped
        end
    end
    return result
end

Which prints:

[0.75, 0.25, 0.5] |> bs

# printout
i = 1
(result, swapped) = ([0.25, 0.75, 0.5], true)
(result, swapped) = ([0.25, 0.5, 0.75], true)
i = 2
(result, swapped) = ([0.25, 0.5, 0.75], false)
(result, swapped) = ([0.25, 0.5, 0.75], false)

For our next try we will go with the quicksort algorithm. Here, I’ll try to implement it from memory based on the algorithm I once saw in “Learn You a Haskell for Great Good!” by Miran Lipovača. Of course, I’ll try to adjust it for Julia syntax.

# sorts in ascending order
function qs(v::Vec{Int})::Vec{Int}
    if isempty(v)
        return []
    else
        head, tail... = v
        # or: head::Int, tail::Vec{Int}... = v
        smallerElts::Vec{Int} = filter((<)(head), tail)
        greaterEqElts::Vec{Int} = filter((>=)(head), tail)
        return [qs(smallerElts); head; qs(greaterEqElts)]
    end
end

To do so we choose a so called pivot element (head) that for simplicity is always the first element in the vector (v). Next, we take the remaining part of the vector (tail). The head, tail... = v is a destructuring assignment that puts the first elt of v to head variable and all the remaining elements to tail (if v is made of one element, then tail is []). Anyway, we separate tail elements into the ones that are smaller (smallerElts) and greater than or equal to (greaterEqElts) our pivot element (head). The above is done with filter and partial function application. Once we got it we use recursion (e.g. qs(unsorted_smaller_elts) in return) and vector concatenation ([vector_or_elt; vector_or_elt; vector_or_elt]).

Let’s see how it works.

[47, 15, 23, 99, 4] |> qs

[4, 15, 23, 47, 99]

Looks good. OK, in case the above was too great of a squeeze, let’s try to make it a bit less functional (Haskell is a purely functional programming language) and a bit more imperative.

function qs(v::Vec{Int})::Vec{Int}
    if isempty(v)
        return []
    else
        firstElt::Int = v[1]
        otherElts::Vec{Int} = v[2:end]
        smallerElts::Vec{Int} = filter(elt -> elt < firstElt, otherElts)
        greaterEqElts::Vec{Int} = filter(elt -> elt >= firstElt, otherElts)
        return [qs(smallerElts); firstElt; qs(greaterEqElts)]
    end
end

Once again, we choose a so called pivot element (firstElt) that for simplicity is always the first element of a vector. Next, we take the remaining elements (otherElts) and separate them into the elements that are smaller (smallerElts) and greater than or equal to (greaterEqElts) our pivot element (firstElt). The above is done with filter and an anonymous function. Once we got it we use recursion (e.g. qs(unsorted_smaller_elts) in return) and vector concatenation ([vector_or_elt; vector_or_elt; vector_or_elt]).

Just a small test to make sure it still works as intended.

[47, 15, 23, 99, 4] |> qs

[4, 15, 23, 47, 99]

Flawless victory, but we are still unable to sort the numbers in alphabetical order. Let’s fix that by modifying our qs.

# by - fn(A) -> B; is applied to every elt of v before sorting
# lt - fn(B, B) -> Bool; compares pairs of elts after 'by' was applied
function qs(v::Vec{A},
            by::Function=identity,
            lt::Function=<)::Vec{A} where A
    if isempty(v)
        return []
    else
        firstElt::A = v[1]
        otherElts::Vec{A} = v[2:end]
        smallerElts::Vec{A} = filter(
            elt -> lt(by(elt), by(firstElt)),
            otherElts
        )
        greaterEqElts::Vec{A} = filter(
            elt -> !lt(by(elt), by(firstElt)),
            otherElts
        )
        return [
            qs(smallerElts, by, lt);
            firstElt;
            qs(greaterEqElts, by, lt)
        ]
    end
end

Here, we added two more positional arguments to qs: 1) by which is a function applied to every element of the vector (v) before the sorting; 2) lt - a comparator function (lt - less than, !lt - negation of lt) that compares our pivot element (firstElt) with all the other elements (otherElts). The comparison occurs after by was applied to all the elements of v (by(elt) and by(firstElt) in filter). The names by and lt were inspired by the names of the keyword arguments in Base.sort. The default function for by is identity (returns its argument unchanged) and the default function for lt is Base.:<.

Let’s see how we did:

[0.75, 0.25, 0.5] |> qs,
['c', 'b', 'a', 'd'] |> qs

([0.25, 0.5, 0.75], ['a', 'b', 'c', 'd'])

And now for the alphabetical sort (getEngNumeral was a solution for the problem in Section 25.1. Here, we passed it without by= because in qs it is a positional argument).

qs(collect(1:10), getEngNumeral)

[8, 5, 4, 9, 1, 7, 6, 10, 3, 2]

The above should work similar to the built-in sort (here we use by=getEngNumeral because that’s how you pass keyword arguments to a function).

qs([0.75, 0.25, 0.5]) == sort([0.75, 0.25, 0.5]),
qs(['c', 'b', 'a', 'd']) == sort(['c', 'b', 'a', 'd']),
qs(collect(1:10), getEngNumeral) == sort(1:10, by=getEngNumeral)

(true, true, true)

26.1 Problem ← → 27 Regex

CC BY-NC-SA 4.0 Bartlomiej Lukaszuk