Showing posts with label primes. Show all posts
Showing posts with label primes. Show all posts

Problem 010: Summation of primes

From Project Euler:
The sum of the primes below 10 is 2 + 3 + 5 + 7 = 17.

Find the sum of all the primes below two million.
My initial approach to this was to do something using the Sieve of Eratosthenes, and my solution was something like this:
(defn prime-sieve-sum [n]
  (loop [nums (vec (concat [2] (range 3 n 2)))
         total 0]
    (if (empty? nums)
      total
      (let [current (first nums)]
        (recur (vec (remove #(== 0 (mod % current)) (rest nums)))
               (+ current total))))))
This keeps a vector of all the possible primes, and filters out any multiples as it goes along. The downside is that this was very slow: it took about 10 minutes to run on my machine with $n$ at 2 million.

My guess is that since it is constantly creating new vectors, there is a lot of memory copying going on. A better approach might be to keep a large vector of bools and set them to false as each multiple is encountered. This doesn't jive well with Clojure's immutable vectors, but you can fall back to Java's ArrayList if you want to do this approach.

Instead I just decided to brute force it. Using the is-prime? function from Problem 003, I just scanned across:

(defn prime-sum [n]
  (loop [current 3
         total 2]
    (if (>= current n)
      total
      (recur (+ 2 current)
             (+ total (if (is-prime? current)
                        current
                        0))))))
This gets the job done much more quickly (about 5s overhead for lein/JVM):
$ time lein run
Running...
142913828922

real 0m21.508s
user 0m22.936s
sys 0m0.530s

Problem 007: 10001st prime

From Project Euler:

By listing the first six prime numbers: 2, 3, 5, 7, 11, and 13, we can see that the 6th prime is 13.

What is the 10 001st prime number?

My initial approach to this was to use a Sieve of Eratosthenes to efficiently find all the prime numbers up to a given limit, however that ended up spending a lot of processing in manipulating lists. Turns out there are faster ways to do it.

I also discovered that Clojure does not have tail-call recursion, you need to use the loop and recur constructs to avoid getting stack overflows.

There are two optimizations here. The first is that we'll skip the first prime 2, since it's obviously not the 10001st prime and it is an anomaly that doesn't let us use the next optimization, which is that all primes greater than 3 have the form $6k \pm 1$. Instead of iterating through each number or even each odd number, we'll iterate through all numbers of that form.

Here's the code:
(defn nth-prime [n]
  (loop [i (- n 2)
         k 1
         s -1]
    (let [p (+ (* 6 k) s)
          next-s (if (= s 1) -1 1)
          next-k (if (= s 1) (+ k 1) k)]
      (cond
        (not (is-prime? p)) (recur i next-k next-s)
        (= i 0) p
        :else (recur (- i 1) next-k next-s)))))
Running this code is actually surprisingly slow:
rob@alien ~/code/euler $ time clj 7.clj 
104743

real 0m16.076s
user 0m17.930s
sys 0m0.349s
I'll have to do some profiling to figure out why this is the case. When translating the code as-is to Python it takes 0.176s, and Python isn't exactly known for speed.

Problem 005: Smallest Multiple

From Project Euler:
2520 is the smallest number that can be divided by each of the numbers from 1 to 10 without any remainder.

What is the smallest positive number that is evenly divisible by all of the numbers from 1 to 20?
This is a pretty straight-forward problem: we're just trying to find the least-common multiple (LCM) of a set of numbers. To find the LCM, it is just the product of the two numbers $a$ and $b$ divided by their greatest common divisor (GCD). If we think for a second about why this works: any number is just a product of prime factors. A prime number is just a product of itself and 1, composite numbers are a more complex sequence: 12 is a product of 2 * 2 * 3, 8 is a product of 2 * 2 * 2. When you multiply the two numbers together, you're a making a long sequence of prime factors that will result in the product of $a$ and $b$.

This product however is not the least common multiple, because there could be factors in common between $a$ and $b$: both 8 and 12 have a common factor of 2 * 2, if we simply multiplied them together we'd get 96 when the LCM is actually 24. We should divide by the GCD (2 * 2) to filter out these common factors. This gives us the correct result: 96 / 4 = 24.

Here's the code, which uses the Euclidean algorithm to find the GCD of two numbers:
(defn gcd [a b]
  (cond 
    (< a b) (gcd b a)
    (== 0 b) a
    :else (gcd b (mod a b))))

(defn lcm [a b]
  (if (or (== a 0) (== b 0))
    0
    (quot (* a b) (gcd a b))))
Then we can just do a reduce to get the LCM of the range:
(println (reduce lcm (range 1 21)))
This runs pretty fast (note that on my machine the JVM takes about 1.5s to start up):
$ time clj 5.clj
232792560

real 0m1.585s
user 0m3.136s
sys 0m0.151s
An alternative approach is to use something like the Sieve of Eratosthenes (SoT). This is an algorithm for finding prime numbers, but we can use the same idea here. The way the SoT works is you take a list of numbers from 2 to $N$. For each number $i$, you loop through the rest of the list, removing any multiple of $i$. After the first pass we'll have removed all numbers divisible by 2, after the second pass all numbers divisible by 3, etc. What we'll be left with is just a list of prime numbers. We'll apply the same idea here, but instead of removing the number if it is divisible, we'll divide out our current factor.
(defn lcm-sieve [xs]
  (if (empty? xs) ()
    (let [head (first xs)
          tail (rest xs)]
      (cons head
            (lcm-sieve (map #(if (== 0 (mod % head))
                               (quot % head)
                               %)
                              tail))))))
This algorithm efficiently gives us the list of prime factors for the LCM:
user=> (lcm-sieve (range 1 11))
(2 3 2 5 1 7 2 3 1)
These can be combined using reduce:
(println (->> (range 1 21)
              (lcm-sieve)
              (reduce *)))
This gives the correct result:
rob@alien ~/code/euler $ time clj 5.clj
232792560

real 0m1.549s
user 0m3.014s
sys 0m0.143s
Both methods seem quite fast, I hit integer overflow on my machine without either of them taking more than half a second. The reason I dove into the second one is that the SoT algorithm is the basis for a lot of my solutions in the future, so it's good to see a version of it now.

Problem 003: Largest Prime Factor

From Project Euler:
The prime factors of 13195 are 5, 7, 13 and 29.

What is the largest prime factor of the number 600851475143 ?
This is the first problem with primes. There are some good ways to find primes, however for this simple problem we'll use a simple solution. First, let's define a function that tells us if a number $n$ is prime. The way it works is it will test every number $i$ from 2 to $\sqrt{n}$ to see if $n$ is divisible by $i$. If it is, then $n$ is not prime. We only need to go up to $\sqrt{n}$ because all non-prime numbers will have at least one factor between 2 and their square root.
(defn is-prime? [n]
  (->> (range 2 (int (Math/ceil (Math/sqrt n))))
       (some #(== (mod n %) 0))
       (not)))
From here, we can now filter through and find divisors of 600851475143, and figure out which is the maximum.
(def N 600851475143)

(println
  (->> (range 2 (int (Math/ceil (Math/sqrt N))))
       (filter #(== (mod N %) 0))
       (filter is-prime?)
       (apply max)
       ))
This gives us the correct result (note that on my machine the JVM takes about 1.5s to start up):
$ time clj 3.clj
6857

real 0m1.651s
user 0m3.232s
sys 0m0.138s
There are more efficient ways to find primes: one good one to use is a prime sieve. We don't really need that for this problem since the number is fairly small, but for more advanced problems later on we'll want to use that approach.