This writeup discusses few most important concepts in number theory that every programmer should ideally know. It is neither an introductory tutorial, nor any specific algorithms are discussed here. Rather, this writeup is intended to act as a reference. External references (mostly from Wikipedia and Wolfram) have been provided at many places for further details.
The entire formalization of arithmetic is based on five fundamental axioms, called Peano axioms, which define properties of natural numbers. These axioms are -
(i) 0 is a natural number
(ii) Every natural number has a successor, which is also a natural number
(iii) 0 is not the successor of any natural number
(iv) Different natural numbers have different successors
(v) If a set contains the number 0 and it also contains the successor of every number in S, then S contains every natural number.
The fifth axiom is also popularly known as "principal of mathematical induction"
Being extremely basic, we would rarely need them directly, unless we want to prove every theorem in arithmetic from the first principles. But being the building blocks of arithmetic, these axioms are worth knowing.
As the name rightly says, this theorem lies at the heart of all the concepts in number theory. The fundamental theorem of arithmetic states that any integer greater than 1 can be written as a product of prime numbers in a unique way (up to the ordering of prime factors in the product). For example, 18 = 2 X 32, 1755 = 33 X 5 X 13. This theorem plays very important role in almost every number theoretic algorithm, like finding prime factors, finding GCD, finding sum of divisors of a number etc to mention a few. Proving this theorem is easy - in fact, it emerges out as a corollary to the Euclid's first theorem (discussed below).
The division algorithm states that given two integers a,b (b != 0) there exist two unique integers q and r such that
a = bq + r, 0 <= r < b
q is typically called quotient, whereas r is called remainder. If r = 0, we say that b divides a, and denote it as b|a.
The two important theorems, called "Euclid's first theorem (Or Euclid's lemma)" and "Euclid's second theorem (usually simply referred as "Euclid's theorem") are as follows:
First theorem: p | ab => p|a or p | b. A direct consequence of this is the fundamental theorem of arithmetic.
Second theorem : There are infinitely many primes. There are many simple proofs for this.
While it is true that there are infinitely many primes, it should also be remembered that there are arbitarily large gap between prime numbers. In other words, it is always possible to get a sequence of n consecutive composite numbers, given n.
The most common algorithm for finding the greatest common divisor of two numbers is the Euclid's algorithm. This is an extremely efficient algorithm, as the number of steps required in this algorithm is at most 5 times the number of digits of the smaller number. GCD is typically denoted using round brackets - (a,b) denotes the gcd of a and b. Similarly LCM is denoted using square brackets - [a,b] denotes the lcm of a and b.
The numbers a and b are called coprimes iff (a,b) = 1 , i.e. iff [a,b] = ab.
If gcd(a,b) = d then (a/d, b/d) = 1.
GCD and LCM are related by a very simple equation: (a,b) * [a,b] = ab. This gives a very fast way to calculate LCM of two numbers.
The bezout's identity states that if d = (a,b) then there always exist integers x and y such that ax+by = d. (Of course, the theory of linear diophantine equations assures existance of infinitely many solutions, if one exists). It is also worth noting that k=d is the smallest positive integer for which ax+by = k has a solution with integral x and y.
Given a,b, Finding x and y, such that ax+by = d is done by extended Euclid's algorithm, which can be implemented in recursive as well as iterative styles.
The most commonly used algorithm for the integer factorization is the Sieve of Eratosthenes. It is sufficient to scan primes upto sqrt(N) while factorizing N. Also, if we need to factorize all numbers between 1 to N, this task can be done using a single run of this algorithm - For every integer k between 1 to N, we can maintain a single pair - the smallest prime that divides k, and its highest power , say (p,a). The remaining prime factors of k are then same as that of k/(pa).
The equations of the form ax ≡ b (mod n) where (x is an unknown integer ) are called linear congruences. Such a congruence will have a solution if and only if there exists an integer x such that n | (ax-b), i.e. ax -b = ny for some integer y, or in other words ax + n(-y) = b. We already know from Bezout's identity that a linear diophantine equation like this will have a solution only if gcd of (a,n), say d, divides b. In such a case, let b = dd', a = da', n = dn', so we have:
da'x + dn'(-y) = dd' where gcd(a',n') = 1
Cancelling d throughout,
a'x + n'(-y) = d'.
since gcd of (a', n') = 1, now we can use Extended Euclid's algorithm to find the solution for a'x + n'(-y) = 1. and then multiply this solution by d' to get a solution for a'x + n'(-y) = d'.
Note that if ax ≡ b (mod n) has a solution, then there is a unique solution mod (n/d). If this solution is denoted by x0, then there are exactly d solutions mod n, given by x0+ kn/d where 0 <= k < d. The tutorial on quadratic equations discusses this in details.
Typical problems of the form "Find a number which when divided by 2 leaves remainder 1, when divided by 3 leaves remainder 2, when divided by 7 leaves remainder 5" etc can be reformulated into a system of linear congruences and then can be solved using Chinese Remainder theorem. For example, the above problem can be expressed as a system of three linear congruences: "x ≡ 1 (mod 2), x ≡ 2 mod(3), x ≡ 5 mod (7)".
In general, a system of linear congruences:
x ≡ a1 (mod n1)
x ≡ a2 (mod n2)
x ≡ a3 (mod n3)
x ≡ ak (mod nk)
where (ni,nj) = 1 for every ni != nj has a unique solution modulo n where n = n1n2n3...nk.
Let ci= n/ni for every i. Let di be the solution for the congruence cix = 1 (mod ni) such that 0 <= di < ni. (This solution can be found out using Extended Euclid's algorithm). Then the common solution to the above system of linear equations is given by
c = a1c1d1 + a2c2d2 + ... + akckdk
A direct corollary of the Chinese Remainder theorem is as follows: Let n = p1a1 * p2a2 * .... * pkak be the prime factorization of n. Then, for any integers a and b, we have a = b (mod n) iff a = b (mod piai ) for each i.
The generalization of the Chinese Remainder Theorem, which discusses the case when the ni's are not necessarily pairwise coprime is as follows - The system of linear congruences
x ≡ a1 (mod n1)
x ≡ a2 (mod n2)
x ≡ a3 (mod n3)
x ≡ ak (mod nk)
has a solution iff gcd(ni,nj) divides (ai-aj) for every i != j. In such a case, there is a unique solution mod n, where n is the least common multiple of n1,n2...nk
Given q and n, if the quation x2 ≡ q (mod n) has a solution, then q is called quadratic residue modulo n. If this equation doesnot have a solution, then q is called "quadratic non residue" modulo n. For example, x2 ≡ 9 (mod 15) has a solution x = 12, hence 9 is a quadratic residue modulo 15. On the other hand, the equation x2 ≡ 11 (mod 15) has no solution, hence 11 is a quadratic non-residue modulo 15. In simpler terms, an integer q is a quadratic residue modulo n if a square can take the form (nk+q) for some positive integer n.
Finding whether a quadratic congruence having prime number modulus has a solution or not is somewhat easy: x2 ≡ a (mod p) has a solution only if a(p-1)/2 = 1 (mod p). In such a case, the Shank-Tonelli algorithm can be used to get the solution.