flyinghyrax.net

Extended Polymerization - Divide and Conquer

This series is about Advent of Code 2021 Day 14 - "Extended Polymerization" - beginning with a recursive solution and developing that into an iterative one based on dynamic programming.

For brevity (ha), I'm going to omit most things about parsing input or the final max/min calculation needed for a full Advent of Code solution, and and try to focus on the core algorithms.

This post is part of a series: "Divide-and-conquer, Memoization, and Dynamic Programming"
  1. Divide and Conquer
  2. Memoization
  3. Dynamic Programming
  4. Benchmarking & Analysis (in progress)

Problem Statement

"Extended Polymerization" involves repeatedly expanding a string based on character insertion rules. You are given a string, and a set of rules for how to insert new characters based on pairs of characters in the current string. For example:

The rules are interpreted like this:

  1. Look at each pair of characters in the starting string; the pairs overlap so the second character of a pair is the first character of the next
  2. Look for a rule matching the pair of characters
  3. Insert the character on the right side of the rule in-between the characters in the pair

So for the string "QW", we have the rule "QW → "W". We insert "W" in the middle of the pair and get "QWW".

Each time we expand the string, all the pairs get evaluated at the same time, so inserting new characters doesn't effect the other pairs until we've evaluated all of them.

For "QWEQ", we would evaluate these pairs:

[QW]EQ
Q[WE]Q
QW[EQ]

And insert 3 new characters, one for each pair:

Q[W]W[Q]E[Q]Q
=> QWWQEQQ

It's important how the pairs of characters overlap. We do not do this:

QW -> QWW
WE -> WQE
EQ -> EQQ
QWW + WQE + EQQ
=> QWWWQEEQQ

Instead, the characters of adjacent pairs overlap:

diagram showing the expansion of "QWE" - the middle character "W" is shared by the 2 pairs "QW" and "WE", so only one copy of it is included when expanding the string once

Inserting a new character for all the pairs in the string is one "step" or "iteration" of string expansion. The string can be expanded multiple times (multiple "steps"), growing each time.

First Solution

In F#, one possible implementation looks like this:

let expandOnce (rules : Map<char * char, char>) (current : char list) : char list =
        
    let rec loop (input : char list) (output : char list) : char list =
        match input with
        // base case -  we've processed the entire string
        | [] -> List.rev output
        // last character - just append it to the result
        | [c] -> loop [] (c :: output)
        // each pair - find the char to insert and add it to the output
        | c :: (d :: rest) ->
            let insert = Map.find (c, d) rules
            let output' = insert :: c :: output
            let input' = d :: rest
            loop input' output'
    
    loop current []

To expand the string multiple times, we just apply the above repeatedly:

let expandTimes (count : int) (rules : Map<char * char, char>) (template : char list) =

    let rec loop (remainingCount : int) (resultAccumulator : char list) =
        if remainingCount = 0 then
            resultAccumulator
        else
            loop (remainingCount - 1) (expandOnce rules resultAccumulator)
    
    loop count template

With a few helper functions, we can try it out in FSI:

Helper Functions
module String =
    let toCharList (str : string) : char list =
        str.ToCharArray() |> Array.toList

    let ofCharList (cs : char list) : string =
        cs
        |> List.toArray
        |> fun cs' -> new System.String(cs')
        |> string


let sampleRules =
    Map.ofList [
        ('Q', 'Q'), 'E'
        ('Q', 'W'), 'W'
        ('Q', 'E'), 'Q'
        ('W', 'Q'), 'W'
        ('W', 'W'), 'E'
        ('W', 'E'), 'Q'
        ('E', 'Q'), 'Q'
        ('E', 'W'), 'E'
        ('E', 'E'), 'W'
    ]

let sampleString = "QWE"

let doExpansion n =
    sampleString
    |> String.toCharList
    |> StringExpand.expandTimes n sampleRules
    |> String.ofCharList

let countExpansion n =
    doExpansion n
    |> String.length
PS C:\...\advent of code\2021> dotnet fsi
...

> #load "ExtendedPolymerization.fsx";;
[Loading C:\...\ExtendedPolymerization.fsx]
...

> open ExtendedPolymerization;;
> doExpansion 0;;
val it: string = "QWE"

> doExpansion 1;;
val it: string = "QWWQE"

> doExpansion 2;;
val it: string = "QWWEWWQQE"

This solution works fine for a small number of iterations, well enough that it is sufficient for part 1 of the Advent of Code problem (10 steps of expansion). Things get more interesting when trying larger iterations. FSI can show the execution times for different steps:

> #time "on";;

--> Timing now on

> countExpansion 10;;
Real: 00:00:00.001, CPU: 00:00:00.000, GC gen0: 0, gen1: 0, gen2: 0
val it: int = 2049

> countExpansion 15;;
Real: 00:00:00.039, CPU: 00:00:00.046, GC gen0: 4, gen1: 2, gen2: 0
val it: int = 65537

> countExpansion 20;;
Real: 00:00:01.771, CPU: 00:00:01.750, GC gen0: 121, gen1: 30, gen2: 6
val it: int = 2097153

> countExpansion 25;;
Real: 00:00:54.481, CPU: 00:00:54.218, GC gen0: 3739, gen1: 1273, gen2: 19
val it: int = 67108865

10 iterations creates a string about 2,000 characters long and takes 1 millisecond or less. 15 iterations creates a string of about 65,000 characters, and still only takes 39 milliseconds. But at 20 iterations the calculation already takes over a second, and processing 25 iterations takes almost an entire minute - and invokes the garbage collector over 3,000 times.

You can see that when you increase the number of iterations linearly - an additional +5 iterations each time - the size of the string and the execution time increase much more quickly.

Step Count Exec. Time String Size
10 1 2,049
+5 (15) +38 +63,488
+5 (20) +1,733 +2,031,616
+5 (25) +52,710 +65,011,712

Analyzing Performance

There are several ways to think about why this solution scales so poorly. One way is to consider how much the string grows on each step. A string 10 characters long will contain 9 overlapping pairs - in general, a string \(N\) characters long will have \(N - 1\) pairs. One new character is inserted for each pair. So each time we expand the string once, the result would contain \(n + (n - 1) = 2n - 1\) characters, or about 2X growth.

Then the expansion is repeated this for however many steps. And every step the string (almost) doubles:

for step in 1 to s
n = n * 2

i.e. \(n\) gets multiplied by 2 \(s\) times. We can state that like this:

\[ n * 2 * 2 * 2 * 2 * 2 ... \\ n * 2^s \]

There is an exponent! The length of the string will increase exponentially with respect to the number of iterations. This isn't exact, because we changed \(2n - 1\) to just \(2n\) in order to get an approximation - but the general rate of growth (exponential vs. linear) still holds.

To relate this result to execution time, consider that for each round of expansion we have to iterate over every character in the current string. So if the length of the string grows exponentially with the number of steps, so will our execution time.

Trying to use this solution for part 2 is infeasible - if nothing else, the final string will be at least 1,099,511,627,776 characters long. At 1 byte per character[^1], that's a terabyte for a single string. Besides the space requirements, when we calculated \(n_0=3\) and \(s=10\), it took about 1ms. If we approximate the number of operations as \(2^s\), we can work out a rough "time per operation" based on our measurements from FSI, and estimate that executing 40 iterations would take about 12 days!

Math supporting these approximations is in the #Notes section at the end of the post.

Counting Characters

Note how the advent of code question doesn't actually ask for the value of the expanded string. Instead, it asks specifically for the number of times each character will occur. We've estimated that the complete string for part 2 would be over a trillion characters, and we can assume that advent of code problems should be solvable on consumer PC hardware. It should be possible to count the characters without allocating such huge string values.

Our earlier solution operated "breadth first" - it expanded the entire string once, then expanded the entire string again, and so on. Instead, perhaps we can focus on expanding a single pair. Every time we expand a pair, we get 2 new ones:

[QW] -> QWW
[QW]W, Q[WW]

We have now split our original focus (expanding a single pair) into two instances of the same problem. Instead of asking: "how many of each character will be in 'QW' after expanding it 10 times?", we can equivalently ask: "how many will be after expanding 'QW' 9 times, plus how many will there be after expanding 'WW' 9 times?"

If we call our solution function (that we haven't implemented yet) countChars, we can phrase the recursive relationship something like this:

countChars("QW", 10) = countChars("QW", 9) + countChars("WW", 9)

We can apply the same expansion repeatedly:

countChars("QW", 9) = countChars("QW", 8) + countChars("WW", 8)
countChars("WW", 9) = countChars("WE", 8) + countChars("EW", 8)

countChars("QW", 8) = countChars("QW", 7) + countChars("WW", 7)
countChars("WW", 8) = countChars("WE", 7) + countChars("EW", 7)
countChars("WE", 8) = countChars("WQ", 7) + countChars("QE", 7)
countChars("EW", 8) = countChars("EE", 7) + countChars("EW", 7)

...

As the number of expansions we request goes down each time, we will eventually arrive at a call like this:

countChars("EW", 1) = countChars("EE, 0) + countChars("EW", 0)

How many times does each character appear in the string "EE", after expanding it zero times? We can answer that for every possible pair of characters:

Pair # of 'Q's # of 'W's # of 'E's
QQ 2 0 0
QE 1 0 1
WQ 1 1 0
WW 0 2 0
WE 0 1 1
...

This is our recursive base case. We can answer the question without any further recursion, and if we can figure out how to combine the results from 2 answers, then we will be able to work backwards from the base case to build up answers for larger numbers of iterations.

Divide and Conquer

This general strategy is called "divide and conquer". Split the problem into smaller problems, solve the smaller problems, then combine the results of the smaller problems to get a full answer.

When we combine the results of our subproblems (adding 2 counts together), we have to be careful to recall what we're actually analyzing - pairs of characters in a string, where those pairs all overlap. Say we wanted to count the number of 'E's for the following:

count("EW", 1) = count("EE", 0) + count("EW", 0)

If we simply add the number of E's in each sub-pair, we will get a count that is too high. The second E in "EE" and the first E in "EW" are the same character in the expanded string, and we would double-count it! Instead, we have to subtract 1 if the shared character is the character we're counting. So for the above example we must subtract 1 E:

count("EW", 1) = count("EE", 0) + count("EW", 0) - E

And in this case, the shared character is Q, so we must subtract one of those instead:

count("WE", 1) = count("WQ", 0) + count("QE", 0) - Q

Second Implementation

At each step, we need to store a mapping of characters to their number of occurrences. We also need to perform some operations on those mappings, like adding two of them together or subtracting from a particular character's count. I implemented this with some helper functions like the following:

open System.Collections.Generic

/// Type alias so we can omit the second type parameter to Dictionary
type Counter<'T when 'T: comparison> = Dictionary<'T, int64>

(* Signatures only, implementations omitted *)
module Counter =
    /// Create and return an empty dictionary
    let empty () : Counter<'T>

    /// Add one to the count for a specific element
    let incr (key : 'T) (counter : Counter<'T>) : Counter<'T>

    /// Subtract one from the count for a specific element
    let decr (key : 'T) (counter : Counter<'T>) : Counter<'T>

    /// Combine counts from 2 maps. The resulting map contains the 
    /// union of the two arguments, with the counts of keys that occur
    /// in both arguments added together.
    let add (left : Counter<'T>) (right : Counter<'T>) : Counter<'T>

With these functions available, we can implement a recursive algorithm directly from the divide-and-conquer strategy we created.

type Pair = char * char
type Rules = Map<Pair, char>

/// This function calculates how many times each character occurs after
// expanding the given `pair` of characters `iterations` number of times.
let rec countForPair (rules : Rules) (pair : Pair) (iterations : int) =
    let leftChar, rightChar = pair
    
    if iterations = 0 then
        (*  BASE CASE 
            The only characters that we can initially count are 
            the ones in the pair itself
        *)
        Counter.empty ()
        |> Counter.incr leftChar
        |> Counter.incr rightChar
    
    else
        (*  RECURSIVE CASE 
            Look up the character to insert inside the pair we have.
            Construct 2 new pairs from this, and recur with those.
            Add the character occurrences together, then subtract one from
            the count for the shared character so it isn't double-counted.
        *)
        let sharedChar = Map.find pair rules
        let leftSubResult = countForPair rules (leftChar, sharedChar) (iterations - 1)
        let rightSubResult = countForPair rules (sharedChar, rightChar) (iterations - 1)

        Counter.add leftSubResult rightSubResult
        |> Counter.decr sharedChar


/// This function iterates over the starting string we're given and 
/// counts character occurrences for each pair of characters, and
/// combines the results into a single counter containing the grand
/// total for each character.
let countAllCharacters (rules : Rules) (template : char list) (iterations : int) =

    /// This helper function handles the overlap between pairs of characters
    /// in the starting string; for every pair of characters except the
    /// first, the first character in the pair is the same as the second
    /// character of the pair before it, so we subtract one for each overlap.
    let countWithOverlap (index : int, pair : Pair) =
        let pairResult = countForPair rules pair iterations
        if index = 0 then
            pairResult
        else
            Counter.decr (fst pair) result
    
    template
    |> List.windowed 2
    |> List.map (fun xs -> xs[0], xs[1]) // convert 2-element arrays into tuples
    |> List.indexed
    |> List.map countWithOverlap
    |> List.fold Counter.add (Counter.empty ())

And with a few changes to our helper functions we can compare this to our original solution:

Helper Functions
let doExpansion n =
    sampleString
    |> String.toCharList
    |> StringExpand.expandTimes n sampleRules

let countExpansionLength n =
    doExpansion n
    |> List.length

let countExpansion n =
    doExpansion n
    |> List.countBy id

let countRecursive n =
    sampleString
    |> String.toCharList
    |> fun cs -> 
        Recursive.countAllCharacters sampleRules cs n
    |> Seq.map (fun kv -> kv.Key, kv.Value)
    |> Seq.toList
> countExpansion 10;;
Real: 00:00:00.002, CPU: 00:00:00.000, GC gen0: 1, gen1: 1, gen2: 0
val it: (char * int) list = [('Q', 915); ('W', 356); ('E', 778)]

> countRecursive 10;;
Real: 00:00:00.002, CPU: 00:00:00.000, GC gen0: 0, gen1: 0, gen2: 0
val it: (char * int64) list = [('E', 778L); ('Q', 915L); ('W', 356L)]


> countExpansion 15;;
Real: 00:00:00.034, CPU: 00:00:00.046, GC gen0: 3, gen1: 1, gen2: 0
val it: (char * int) list = [('Q', 34187); ('W', 7474); ('E', 23876)]

> countRecursive 15;;
Real: 00:00:00.058, CPU: 00:00:00.078, GC gen0: 10, gen1: 1, gen2: 0
val it: (char * int64) list = [('E', 23876L); ('Q', 34187L); ('W', 7474L)]


> countExpansion 20;;
Real: 00:00:01.438, CPU: 00:00:01.421, GC gen0: 116, gen1: 27, gen2: 0
val it: (char * int) list = [('Q', 1198035); ('W', 157306); ('E', 741812)]

> countRecursive 20;;
Real: 00:00:01.684, CPU: 00:00:01.671, GC gen0: 313, gen1: 1, gen2: 0
val it: (char * int64) list =
  [('E', 741812L); ('Q', 1198035L); ('W', 157306L)]


> countExpansion 23;;
Real: 00:00:13.803, CPU: 00:00:13.671, GC gen0: 936, gen1: 229, gen2: 10
val it: (char * int) list = [('Q', 9939939); ('W', 978794); ('E', 5858484)]

> countRecursive 23;;
Real: 00:00:13.487, CPU: 00:00:13.468, GC gen0: 2501, gen1: 1, gen2: 0
val it: (char * int64) list =
  [('E', 5858484L); ('Q', 9939939L); ('W', 978794L)]

The good news is that our recursive implementation is producing the same results as our original. The bad news is that it takes the same amount of time to execute! (And invokes the garbage collector more - but we'll handle that later.) Although we can now get answers without building huge strings, both our solutions so far execute the same number of operations, and their runtime scales the same way.

Next, we will test several strategies for reducing this runtime.

This post is part of a series: "Divide-and-conquer, Memoization, and Dynamic Programming"
  1. Divide and Conquer
  2. Memoization
  3. Dynamic Programming
  4. Benchmarking & Analysis (in progress)

Notes

Math for exponential string length

Deriving a closed form from a recurrence relation

To get an exact answer, we can construct a recurrence relation for the size of the final string, find a closed form, and check the closed form against the results from our code.

Recurrence:

\[ \begin{align} n_s &= 2(n_{s-1}) - 1 \\ n_s &= 2(2(n_{s-2})-1) - 1 \\ n_s &= 2(2(2(n_{s-3}) - 1) - 1) - 1 \\ \end{align} \]

Concrete example for \(s=3\):

\[ \begin{align} n_3 &= 2(2(2(n_0)-1)-1)-1 \\ n_3 &= 2(2(2n_0-1)-1)-1 \\ n_3 &= 2(2*2*n_0 - 2*1 - 1) - 1 \\ n_3 &= 2*2*2*n_0 - 2*2*1 - 2*1) - 1 \\ n_3 &= n_0*2^3 - 2^2 - 2^1 - 2^0 \\ \end{align} \]

Use the pattern to guess a closed form:

\[ \begin{align} n_s &= (n_0*2^s) - (2^{s-1}+2^{s-2}+\dots+2^0) \\ n_s &= (n_0*2^s) - (\sum_{i=0}^{s-1}2^i) \\ n_s &= (n_0*2^s) - \frac{1-2^s}{1-2} \\ n_s &= (n_0*2^s) - (\frac{1-2^s}{-1}) \\ n_s &= (n_0*2^s) - (-1+2^s) \\ n_s &= n_0*2^s - 2^s + 1 \\ n_s &= (n_0 - 1) * 2^s + 1 \end{align} \]

Now we can check our work! When testing, we used the starting string "QWE", so \(n_0=3\).

\[ \begin{align} n_{10} &= 2 * 2^{10} + 1 = 2,049 \\ n_{15} &= 2 * 2^{15} + 1 = 65,537 \\ n_{20} &= 2 * 2^{20} + 1 = 2,097,153 \\ \end{align} \]

These values exactly match the strings lengths our code produced for the same values of \(s\), so we conclude that the closed form we derived is correct.

Math for naive execution time

Estimate for "About 12 days"

\[ \begin{align} 2^{10}op = 1000{\mu}s \rightarrow 1,000{\mu}s / 1,024 op &\approx 0.98{\mu}s/op \\ 2^{15}op = 39,000{\mu}s \rightarrow 39,000{\mu}s / 32,768 op &\approx 1.19{\mu}s/op \\ 2^{20}op = 1,771,000{\mu}s \rightarrow 1,771,000{\mu}s / 1,048,576 op &\approx 1.69{\mu}s/op \end{align} \]

At about 1 microsecond per operation, \(s^{40}\) gives us \(2^{40} {\mu}s\).
Converting that to more meaningful units:

\[ \frac{1,099,511,627,776{\mu}s}{1} * \frac{1ms}{1,000{\mu}s} * \frac{1s}{1,000ms} * \frac{1 hour}{3,600 s} * \frac{1 day}{24 hour} = \\ \frac{1,099,511,627,776}{1,000*1,000*3,600*24} days = \\ \frac{1,099,511,627,776}{86,400,000,000} \approx 12.73 days \]