This writeup discusses few most important concepts in number theory that every programmer should ideally know. It is neither an introductory tutorial, nor any specific algorithms are discussed here. Rather, this writeup is intended to act as a reference. External references (mostly from Wikipedia and Wolfram) have been provided at many places for further details.

这篇文章讨论了数论中每个程序员都应该知道的几个重要概念。本文的内容既不是对数论的入门介绍,也不是针对数论中任何特定算法的讨论,而只是想要做为数论的一篇参考。如果读者想要获取关于数论的更多细节,文中也提供了一些外部的参考文献(大多数来自于 Wikipedia 和 Wolfram )。

0. The Peano axioms

The entire formalization of arithmetic is based on five fundamental axioms, called Peano axioms, which define properties of natural numbers. These axioms are - 

(i) 0 is a natural number
(ii) Every natural number has a successor, which is also a natural number
(iii) 0 is not the successor of any natural number
(iv) Different natural numbers have different successors
(v) If a set contains the number 0 and it also contains the successor of every number in S, then S contains every natural number. 

The fifth axiom is also popularly known as "principal of mathematical induction"

Being extremely basic, we would rarely need them directly, unless we want to prove every theorem in arithmetic from the first principles. But being the building blocks of arithmetic, these axioms are worth knowing.

Further Readingwikipediawolfram

0. 皮亚诺公理

整个算术规则都是建立在 5 个基本公理基础之上的,这 5 个基本公理被称为皮亚诺公理。皮亚诺公理定义了自然数所具有的特性,具体如下:











1. Fundamental Theorem of Arithmetic and the Division Algorithm

As the name rightly says, this theorem lies at the heart of all the concepts in number theory. The fundamental theorem of arithmetic states that any integer greater than 1 can be written as a product of prime numbers in a unique way (up to the ordering of prime factors in the product). For example, 18 = 2 X 32, 1755 = 33 X 5 X 13. This theorem plays very important role in almost every number theoretic algorithm, like finding prime factors, finding GCD, finding sum of divisors of a number etc to mention a few. Proving this theorem is easy - in fact, it emerges out as a corollary to the Euclid's first theorem (discussed below).

The division algorithm states that given two integers a,b (b != 0) there exist two unique integers q and r such that

a = bq + r, 0 <= r < b

q is typically called quotient, whereas r is called remainder. If r = 0, we say that b divides a, and denote it as b|a.

Further Readingwikipediawalfram , divisibility , division algorithm

1. 算术基本定理和除法运算法则

正如这个定理的名称所言,算术基本定理是数论中所有概念的核心。算术基本定理含义如下:任何一个大于1的整数都可以以某种特定的方式写成质数的乘积的形式(这种特定方式取决于乘积中质数的顺序)。例如,18 = 2 * 9, 1755 = 33 *5 * 13. 这个定理在几乎所有的数论运算法则中都扮演着十分重要的角色,例如求一个数的质数因子、最大公约数、除数的和等等。想要证明这个定理其实很简单,实际上它是欧几里得第一个定理的一个推论(下面小节会讨论到)。


a = bq + r, 0 <= r < b

通常我们把q称为商,而把r称为余数。如果r = 0,那么我就说b整除a,并且表示为:b | a.






2. Euclid's Theorems

The two important theorems, called "Euclid's first theorem (Or Euclid's lemma)" and "Euclid's second theorem (usually simply referred as "Euclid's theorem") are as follows:

First theorem: p | ab => p|a or p | b. A direct consequence of this is the fundamental theorem of arithmetic. 
Second theorem : There are infinitely many primes. There are many simple proofs for this.

While it is true that there are infinitely many primes, it should also be remembered that there are arbitarily large gap between prime numbers. In other words, it is always possible to get a sequence of n consecutive composite numbers, given n.

Further Reading: Euclid's TheoremEuclid's Lemmawalfram

2. 欧几里得定理


第一定理:p|ab => p|a or p|b。该定理的直接结论就是算术基本定理。



延伸阅读:Euclid's TheoremEuclid's Lemmawalfram

3. GCD, LCM, Bezout's identity

The most common algorithm for finding the greatest common divisor of two numbers is the Euclid's algorithm. This is an extremely efficient algorithm, as the number of steps required in this algorithm is at most 5 times the number of digits of the smaller number. GCD is typically denoted using round brackets - (a,b) denotes the gcd of a and b. Similarly LCM is denoted using square brackets - [a,b] denotes the lcm of a and b.

The numbers a and b are called coprimes iff (a,b) = 1 , i.e. iff [a,b] = ab.

If gcd(a,b) = d then (a/d, b/d) = 1.

GCD and LCM are related by a very simple equation: (a,b) * [a,b] = ab. This gives a very fast way to calculate LCM of two numbers.

The bezout's identity states that if d = (a,b) then there always exist integers x and y such that ax+by = d. (Of course, the theory of linear diophantine equations assures existance of infinitely many solutions, if one exists). It is also worth noting that k=d is the smallest positive integer for which ax+by = k has a solution with integral x and y.

Given a,b, Finding x and y, such that ax+by = d is done by extended Euclid's algorithm, which can be implemented in recursive as well as iterative styles.

Further ReadingGCDBezout's IdentityEuclid's AlgorithmExtended Euclid's Algorithm

3. 最大公约数、最小公倍数和贝祖定理

欧几里得算法是求两个数的最大公约数最常用的算法,而且也是一个很高效的算法,因为使用欧几里得算法求解两个数的最大公约数的算法步骤最多不会超过这两个数中较小的那个数的5倍。最大公约数通常使用圆括号表示—— (a,b) 表示a和b的最大公约数。类似地,最小公倍数通常使用方括号表示—— [a,b] 表示a和b的最小公倍数。

如果 (a,b) = 1,即 [a,b] = ab,此时我们称a和b互质。

如果 (a,b) = d,那么 (a/d,b/d) = 1。

最大公约数和最小公倍数之间的关系可以由一个非常简单的等式来表示:(a,b) * [a,b] = ab. 该等式为我们提供了一种快速计算两个数的最小公倍数的方法。

贝祖定理是说,如果 d = (a,b) 那么一定存在整数 x 和整数 y 满足 ax + by = d. (当然,如果存在的话,那么线性双变量方程的理论保证了无穷多解的存在性)。同样值得注意的是,k = d 是满足 ax + by = k 有一个关于 x 和 y 的解的最小正整数。 

指定 a 和 b,我们可以通过递归或迭代的方式实现扩展的欧几里得算法来求解满足等式 ax + by = d 的 x 和 y。


GCDBezout's IdentityEuclid's AlgorithmExtended Euclid's Algorithm

4. Integer Factorization

The most commonly used algorithm for the integer factorization is the Sieve of Eratosthenes. It is sufficient to scan primes upto sqrt(N) while factorizing N. Also, if we need to factorize all numbers between 1 to N, this task can be done using a single run of this algorithm - For every integer k between 1 to N, we can maintain a single pair - the smallest prime that divides k, and its highest power , say (p,a). The remaining prime factors of k are then same as that of k/(pa).

Further Readingwikipediainteractive animation

4. 整数因式分解

整数因子分解的最常用的算法是 Eratosthenes 筛选法。在分解N时,将质数扫描到 sqrt(N)就足够了。另外,如果我们需要对 1 到 N 之间的所有数字进行因式分解,则可以使用该算法的单次运行来完成此任务 - 对于 1 到 N 之间的每个整数 k ,我们可以保持一对映射——整除 k 的最小质数、最大倍数,(p,a)。k 的剩余质因子与 k/(pa) 的相似。

延伸阅读:wikipedia、 interactive animation

5. Linear Congruence Equations

The equations of the form ax ≡ b (mod n) where (x is an unknown integer ) are called linear congruences. Such a congruence will have a solution if and only if there exists an integer x such that n | (ax-b), i.e. ax -b = ny for some integer y, or in other words ax + n(-y) = b. We already know from Bezout's identity that a linear diophantine equation like this will have a solution only if gcd of (a,n), say d, divides b. In such a case, let b = dd', a = da', n = dn', so we have:
da'x + dn'(-y) = dd' where gcd(a',n') = 1
Cancelling d throughout, 
a'x + n'(-y) = d'.
since gcd of (a', n') = 1, now we can use Extended Euclid's algorithm to find the solution for a'x + n'(-y) = 1. and then multiply this solution by d' to get a solution for a'x + n'(-y) = d'.

Note that if ax ≡ b (mod n) has a solution, then there is a unique solution mod (n/d). If this solution is denoted by x0, then there are exactly d solutions mod n, given by x0+ kn/d where 0 <= k < d. The tutorial on quadratic equations discusses this in details.

Further ReadingLinear Congruence TheoremSolving Linear Congruences

5. 线性同余方程组

形如ax≡b (mod n)的方程式(x是未知数)称为线性同余。当且仅当存在整数x使得n | (ax-b)成立时,这样的方程组将有一个解,即ax -b = ny,y是整数,换句话说,ax + n(-y)= b。我们已经从Bezout的等式中得知,像这样的线性不定方程将只有在(a,n)的gcd(假设该值为d)整除b时才有解。在这种情况下,让b = dd',a = da',n = dn',所以我们有:
da'x + dn'( - y)= dd',其中gcd(a',n')= 1
a'x + n'( - y)= d'。
由于gcd(a',n') = 1,现在我们可以使用扩展的欧几里德的算法来找到a'x + n'( - y)= 1的解,然后将该解乘以d'得到对于a'x + n'( - y)= d'的解。

注意,如果ax≡b(mod n)有一个解,则mod(n / d)有且仅有一个解。如果这个解是用x0表示的,那么mod n将恰好有d个解,由x0 + kn/d给出,其中0<= k<d。在关于二次方程的教程中详细讨论了这一点。

延伸阅读:Linear Congruence TheoremSolving Linear Congruences

6. Chinese Remainder Theorem

Typical problems of the form "Find a number which when divided by 2 leaves remainder 1, when divided by 3 leaves remainder 2, when divided by 7 leaves remainder 5" etc can be reformulated into a system of linear congruences and then can be solved using Chinese Remainder theorem. For example, the above problem can be expressed as a system of three linear congruences: "x ≡ 1 (mod 2), x ≡ 2 mod(3), x ≡ 5 mod (7)".

In general, a system of linear congruences:
x ≡ a1 (mod n1)
x ≡ a2 (mod n2)
x ≡ a3 (mod n3)
x ≡ ak (mod nk)

where (ni,nj) = 1 for every ni != nj has a unique solution modulo n where n = n1n2n3...nk.

Let ci= n/ni for every i. Let di be the solution for the congruence cix = 1 (mod ni) such that 0 <= di < ni. (This solution can be found out using Extended Euclid's algorithm). Then the common solution to the above system of linear equations is given by 
c = a1c1d1 + a2c2d2 + ... + akckdk


典型的问题形式是“寻找一个数,除以2余1,除以3余2,除以7余5”其余各项可以被推广为一元线性同余方程组之后可以使用中国剩余定理来解决。举个例子,下面的问题可以被表示为三个线性同余式:“x ≡ 1 (mod 2), x ≡ 2 mod(3), x ≡ 5 mod (7)”

x ≡ a1 (mod n1)
x ≡ a2 (mod n2)
x ≡ a3 (mod n3)
x ≡ ak (mod nk)


对于任意的i,当0 <= di < ni,令ci=n/ni,令di为同余式cix=1(mod ni)的解(这个解法可以在利用扩展的欧几里德算法)。上面的线性方程组的通解可以给出为:

c = a1c1d1 + a2c2d2 + ... + akckdk

A direct corollary of the Chinese Remainder theorem is as follows: Let n = p1a1 * p2a2 * .... * pkak be the prime factorization of n. Then, for any integers a and b, we have a = b (mod n) iff a = b (mod piai ) for each i. 

The generalization of the Chinese Remainder Theorem, which discusses the case when the ni's are not necessarily pairwise coprime is as follows - The system of linear congruences 

x ≡ a1 (mod n1)
x ≡ a2 (mod n2)
x ≡ a3 (mod n3)
x ≡ ak (mod nk)

has a solution iff gcd(ni,nj) divides (ai-aj) for every i != j. In such a case, there is a unique solution mod n, where n is the least common multiple of n1,n2...nk

Further Reading: chinese Remainder theoremSolving Linear CongruencesApplet

中国剩余定理的直接推论如下:假设 n = p1a1 * p2a2 * .... * pkak 为 n 的素因子分解。 那么,对于任何整数 a 和 b,我们对于每个 i 都有 a = b (mod n) iff a = b (mod piai ) 。

讨论一下 Ni 的不一定都是两两互质的中国剩余定理的推广,如下 - 线性同余系统

x≡a1(mod n1)
x≡a2(mod n2)
x≡a3(mod n3)
x≡ak(mod nk)

有解,当对于每个 i != j 都有 iff gcd(ni,nj) 除 (ai-aj) ,且存在唯一解 mod n,其中 n 是 n1,n2 ... nk 的最小公倍数


7. Quadratic Congruences

Given q and n, if the quation x2 ≡ q (mod n) has a solution, then q is called quadratic residue modulo n. If this equation doesnot have a solution, then q is called "quadratic non residue" modulo n. For example, x2 ≡ 9 (mod 15) has a solution x = 12, hence 9 is a quadratic residue modulo 15. On the other hand, the equation x2 ≡ 11 (mod 15) has no solution, hence 11 is a quadratic non-residue modulo 15. In simpler terms, an integer q is a quadratic residue modulo n if a square can take the form (nk+q) for some positive integer n. 

Finding whether a quadratic congruence having prime number modulus has a solution or not is somewhat easy: x2 ≡ a (mod p) has a solution only if a(p-1)/2 = 1 (mod p). In such a case, the Shank-Tonelli algorithm can be used to get the solution. 

Further Reading: Quadratic Residue , Quadratic ReciprocitysimulationShank-Tonelli Algorithmtutorial for E4

7. 二次方一致性

给定 q 和 n,如果等式 x2≡q(mod n) 具有解,则 q 称为二次残的模 n。如果该方程不具有解,则q被称为“二次非残差”。例如,x2≡9(mod 15)具有解 x = 12,因此 9 是模 15 的二次余数。另一方面,等式 x2≡11(mod 15)没有解,因此 11 是二次非残差,为了简单起见,如果一个正方形可以取一些正整数 n 的形式(nk + q),则整数 q 是模 n 的二次余数。

发现具有质数模的二次一致性是否具有一个解,是有些容易的:x2≡a(mod p)只有在(p-1)/ 2 = 1(mod p)时才具有解。 在这种情况下,可以使用 Shank-Tonelli 算法来获得解决方案。

延伸阅读: 二次残差二次互反性模拟Shank-Tonelli算法E4手册