Lispy Euler: Problem 014: Longest Collatz Sequence

From Project Euler:

The following iterative sequence is defined for the set of positive integers:

n → n/2 (n is even)
n → 3n + 1 (n is odd)

Using the rule above and starting with 13, we generate the following sequence:

13 → 40 → 20 → 10 → 5 → 16 → 8 → 4 → 2 → 1
It can be seen that this sequence (starting at 13 and finishing at 1) contains 10 terms. Although it has not been proved yet (Collatz Problem), it is thought that all starting numbers finish at 1.

Which starting number, under one million, produces the longest chain?

NOTE: Once the chain starts the terms are allowed to go above one million.
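
Before solving anything, it helps to see the rule as code. This is a minimal sketch that builds the chain as a lazy sequence. The names collatz-step and collatz-seq are made up for illustration and are not used in the solutions that follow.

```clojure
;; One application of the rule: halve even numbers, map odd n to 3n + 1.
(defn collatz-step [n]
  (if (even? n)
    (quot n 2)
    (+ (* 3 n) 1)))

;; The chain from n down to 1. take-while stops just before the first 1,
;; so the final 1 is appended explicitly.
(defn collatz-seq [n]
  (concat (take-while #(not= 1 %) (iterate collatz-step n))
          [1]))

(collatz-seq 13)          ;=> (13 40 20 10 5 16 8 4 2 1)
(count (collatz-seq 13))  ;=> 10
```

This matches the ten-term chain in the problem statement.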

The brute-force approach is simple: loop through every starting number below the limit, calculate the length of its chain, and keep track of the longest one seen so far.

(defn get-next-val [n]
  (if (even? n)
    (quot n 2)
    (+ (* 3 n) 1)))

(defn get-chain-length [n]
  (loop [_n n
         total 1]
    (if (== _n 1)
      total
      (recur (get-next-val _n)
             (+ 1 total)))))

(defn find-longest-chain-brute-force [max-start-value]
  (loop [current 2
         best 1
         best-length 1]
    (if (== current max-start-value)
      best
      (let [current-length (get-chain-length current)]
        (recur (+ 1 current)
               (if (> current-length best-length) current best)
               (max current-length best-length))))))

This doesn't actually take much time (I discovered the time function recently):

$ lein run
Processing...
837799
"Elapsed time: 17328.997568 msecs"

17 seconds isn't too bad! But we can do better. Many of these chains flow into one another, so we end up repeating a lot of calculations. For example, once the chain starting at 13 reaches 40, the rest of it is exactly the chain starting at 40. This property is called overlapping sub-problems, and it leads us to a technique called dynamic programming, which in this case simply means caching the length of each chain and reading it back from the cache when we need it.

Here's an implementation:

(defn populate-lengths [root lengths]
  (let [next-val (get-next-val root)
        new-lengths (if (not (contains? lengths next-val))
                      (populate-lengths next-val lengths)
                      lengths)]
    (assoc new-lengths root (+ 1 (get new-lengths next-val)))))

(defn find-longest-chain-dp [max-start-value]
  (loop [current 2
         lengths {1 1}
         best 1
         best-length 1]
    (if (== current max-start-value)
      best
      (let [new-lengths (populate-lengths current lengths)
            current-length (get new-lengths current)]
        (recur (+ 1 current)
               new-lengths
               (if (> current-length best-length) current best)
               (max current-length best-length))))))

This runs much faster:

$ lein run
Processing...
837799
"Elapsed time: 6036.937932 msecs"

At roughly 6 seconds versus 17, that is nearly three times faster, and it should scale better than the brute-force approach because each chain length is computed only once.
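
To see what the cache actually holds, here is a small sketch that calls populate-lengths once, starting from the seed map {1 1}. The two definitions are repeated from above only so the snippet runs on its own. The name cache is made up for this example.

```clojure
;; Repeated from the post so this snippet is self-contained.
(defn get-next-val [n]
  (if (even? n)
    (quot n 2)
    (+ (* 3 n) 1)))

(defn populate-lengths [root lengths]
  (let [next-val (get-next-val root)
        new-lengths (if (not (contains? lengths next-val))
                      (populate-lengths next-val lengths)
                      lengths)]
    (assoc new-lengths root (+ 1 (get new-lengths next-val)))))

;; One call for 13 fills in every number on its chain.
(def cache (populate-lengths 13 {1 1}))

(into (sorted-map) cache)
;=> {1 1, 2 2, 4 3, 5 6, 8 4, 10 7, 13 10, 16 5, 20 8, 40 9}

;; 26 halves to 13, which is already cached, so only one entry is added.
(get (populate-lengths 26 cache) 26)    ;=> 11
(count (populate-lengths 26 cache))     ;=> 11
```

Every later starting number that lands on a cached value stops there instead of walking the rest of the chain again.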

There is an option to use Clojure's memoize function to wrap get-chain-length. This unfortunately gets you into some annoyances around variable bindings, so I just ended up sticking to the dynamic programming approach.
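
For the curious, here is a rough sketch of what a memoize version could look like. It is not the code I ended up using, and the names chain-length* and memo-chain-length are made up for illustration. The catch is that get-chain-length as written uses loop/recur, so wrapping it would only cache whole-chain results. To share intermediate lengths, the function has to be recursive and has to call the memoized var rather than itself, which is why the declare is needed.

```clojure
(defn get-next-val [n]
  (if (even? n)
    (quot n 2)
    (+ (* 3 n) 1)))

;; Forward declaration so the recursive call below resolves to the
;; memoized var that is defined afterwards.
(declare memo-chain-length)

;; Plain recursive definition: 1 has length 1, and every other number is
;; one term longer than the chain of its successor.
(defn chain-length* [n]
  (if (== n 1)
    1
    (+ 1 (memo-chain-length (get-next-val n)))))

;; memoize caches results keyed by argument, and the cache is never evicted.
(def memo-chain-length (memoize chain-length*))

(memo-chain-length 13)                           ;=> 10
(apply max-key memo-chain-length (range 1 14))   ;=> 9
```

Two caveats under these assumptions: the recursion is not tail recursive, so very long uncached chains use stack space, and the memoize cache keeps every intermediate value for the life of the program.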
