Abstract. A modification, due to Peter Montgomery,
of Pomerance's Quadratic Sieve for factoring large integers is discussed
along with its implementation. Using it, allows factorization with over
an order of magnitude less sieving than the basic algorithm. It enables
one to factor numbers in the 60-digit range in about a day, using a
large minicomputer. The algorithm has features which make it well
adapted to parallel implementation.
本文讨论了 Peter Montgomery 对 Pomerance
的二次筛选法所做的改进,这是一种用于分解大整数的算法,以及其实现方法。使用这种改进的算法,可以在比基本算法少一个数量级的筛选次数下完成因数分解。利用这种方法,可以在大约一天的时间内,使用一台大型小型计算机分解60位数字的整数。该算法具有一些特性,使其非常适合并行实现。
介绍
Introduction. The basic quadratic sieve algorithm
has origins which date back to M. Kraitchik, but was first explicitly
stated and analyzed by C. Pomerance. The only two reported
implementations of this algorithm were done by J. Gerver at Rutgers and
J. Davis and D. Holdridge at Sandia National Laboratories. The latter
used a Cray-1 to factor some numbers in the 60-70 digit range from the
'Cunningham Project'. They also used a Cray XMP to factor the then 'MOST
WANTED' number in
9.5 hours. While some of the arguments that follow have been detailed by
Pomerance elsewhere, we present ours in a manner which is more oriented
towards implementation of the algorithm. The basic algorithm depends on
constructing a solution to the following equation, where is the number you wish to factor If
and , then
and are proper factors of . This version of the quadratic sieve
generates a set of quadratic residues of using the following single polynomial:
It follows immediately that if a prime , then for all . Thus, the values of the
polynomial may be factored with a sieve, once one solves . This may be solved by
any one of a number of available algorithms. The potential divisors
of are exactly those primes for which
the Legendre symbol and
the unit -1 is needed to hold the sign.
基本的二次筛选算法起源于 M. Kraitchik,但首次由
C. Pomerance
明确提出并进行分析。这个算法仅有的两次报告实现是由 J. Gerver 在
Rutgers大学 和 J. Davis 与 D. Holdridge 在Sandia National Laboratories
完成的。后者使用 Cray-1超级计算机 在 Cunningham Project
中分解了一些 60-70 位数字范围的数。他们还使用 Cray XMP超级计算机
分解了当时被称为 'MOST WANTED' 的数 ,耗时9.5小时。尽管 Pomerance
在其他地方已经详细描述了以下一些论证,我们在这里以更倾向于算法实现的方式呈现。基本算法依赖于构建以下方程的解,其中
是你希望分解的数。
Select a factor base ¡,for some appropriate value of , and for the sign.
Solve the quadratic equation , for all . There will be two roots rx and r2 for each p
Initialize a sieve array to zero over the interval for some appropriate .
For all add the
value of to the sieve
array at locations and
The value of will be
approximately over , so compare each sieve location
with .Fully
factored residues will have their corresponding sieve value close to
this value. For these, construct the exact factorization via division.
Factorizations are found so infrequently that the time to do this
division is negligible. One need not check all the primes in the factor
base in doing this division. If
is the location in the sieve array, one need only compute . Only if equals one of the two roots do we go
ahead and do the multi-precise division; Let be the
corresponding vector of exponents with .
Collect factorizations.
One then finds a set of residues whose product is a square via Gaussian
elimination over on the
matrix formed by reducing all of the . This creates a linear dependency on the exponents, and the product
of the vectors in that dependency forms a square. It is then trivial to
construct an instance of congruence.
The chief difficulty with this approach is that one must obtain
approximately as many fully factored residues as there are primes in the
factor base. In order to obtain enough factorizations, M must be very
large, and the residues grow linearly in size with .
A way around this problem was formally suggested by Peter Montgomery:
Simply use many polynomials to generate the residues and sieve each
polynomial over a much smaller interval. Utilizing multiple polynomials
enables one to keep the sieve interval small, and hence makes the
residues easier to factor. We have implemented this approach and found
that it is quite effective. It allows one to find enough factored
residues using less than one-tenth the total sieve length in the single
polynomial version. We also show that the cost of changing polynomials
is small.
In fact, the implementation of Sandia was done in two stages. Their
first version used a single polynomial. A later version used multiple
polynomials, but in a disguised rather than explicit form. This latter
version created implicit polynomials from subsequences of the main sieve
which were divisible by a large prime q lying outside the factor base.
They called their method "special q". Montgomery's suggested polynomials
are somewhat better, however, because they yield residues which are
smaller on average and are less clumsy to implement.
In Section 2 we shall discuss how to select the coefficients of the
polynomials. Section 3 will show how to compute those coefficients
efficiently. Section 4 will discuss the basic steps of the algorithm.
Section 5 will discuss selection of the algorithm's input parameters.
Section 6 will present some numerical results. In Section 7 we discuss
its parallel implementation and finally, in Section 8, we compare our
algorithm with other methods.
Peter Montgomery
正式提出了一个解决这个问题的方法:简单地使用多个多项式来生成 ,并在更小的区间内对每个多项式进行筛选。使用多个多项式可以保持筛选区间较小,从而使
更容易分解。我们已经实现了这种方法,并发现它非常有效。它允许我们使用不到单多项式版本总筛选长度的十分之一来找到足够的分解。我们还展示了更换多项式的成本很小。
实际上,Sandia
的实现分为两个阶段。他们的第一版使用了单个多项式。后来的版本使用了多个多项式,但是以一种隐式而非显式的形式。这个后来的版本从主筛选的子序列中创建了隐式多项式,这些子序列可以被因子基之外的大素数
整除。他们称他们的方法为
special q。然而,Montgomery
建议的多项式更好一些,因为它们产生的
平均来说更小,而且在实现上更简洁。
系数选择
Selection of Coefficients. Select . In order to make generate quadratic residues, it is
required that . Since
this latter expression is congruent to , it means that if one must premultiply it by a
small constant k, so .
This is also a good thing to do, in general, because it usually allows
one to find a factor base which is much richer in small primes. We will
later discuss a function that may be used to evaluate a multiplier. A
suggestion by Pomerance allows one to do away with requiring : Simply take as the middle coefficient rather than
. However, we have not yet
implemented this suggestion. It would be advantageous to keep the value
of small, in some appropriate
sense, over . There are
several obvious ways of doing this, e.g.
Minimize over
Minimize
Minimize
subject to and
.
It is easily seen that and
are essentially equivalent.
The length of the base of the parabola is and its area will be directly
proportional to its height. Relaxing the integer constraints and solving
each of the above Lagrange multiplier problems, one finds that they all
yield essentially the same result. The exact answer for is where The only difference among the results of the different
minimization problems is that the constants and change very slightly. ranges from to and ranges from to with . The maximum value of
over is , a factor of improvement over (2) and the
'special q' polynomials at Sandia. follows immediately from symmetry considerations, but and give similar results for sind because the constraint is very binding on the
shape of . The simplest way to
see this is to realize that at the roots its slope will be ± fkÑ. We
would really like to flatten the parabola, but the constraint on the
discriminant means that the curve must have a certain 'steepness'. Thus,
we can do little more than translate the parabola up and down.
3. Computation of Coefficients. A simple method for
selecting , , and comes from a method for finding modular
square roots quickly. To satisfy one must have Let and . It is
desirable that be prime because
if a prime in the factor base divides , then has only one root and the probability that
over drops from to . It is sufficient for practical
purposes that be only a probable
prime. Alternatively, one may select to be the product of primes not in the
factor base, but its factorization must be known to solve . Our implementation took to be a probable prime. To find the
coefficients, compute
Since ,Then Let One now has and Since B must be odd, if it is even subtract it from A.
The value of
is easily obtained since has already been computed. One also has It is not necessary, in practice, to actually compute since the value of is not really needed. It may be used,
however, as a check on the other computations. Compute and save the
value of for later.
This will enable us to quickly compute when we find a factorization. As a
result of the way was chosen, one
also has Some care must be taken: If x lies between the real roots of
, then is negative, and one must subtract
its value from .
The cost of finding the coefficients is dominated by the probable
prime and residue tests on , the
computation of and and . Nevertheless, the
total arithmetic to be done is small.
Finally, the roots of , are since is
invariant.
Most of the cost of changing polynomials occurs in computing . The cost of computing is dominated by the computation of
, which must be done
for all primes in the factor base. Even with an efficient algorithm for
doing this, such as the extended Euclidean algorithm, one must typically
do this thousands of times when changing polynomials. On a SUN-3/75, for
a 60-digit number with a factor base of primes, it takes seconds to compute the coefficients
and seconds to compute all the
roots.