Details, Explanation and Meaning About Quadratic sieve

Quadratic sieve Guide, Meaning , Facts, Information and Description

The quadratic sieve algorithm (QS) is a modern integer factorization algorithm and, in practice, the second fastest method known. It is a general-purpose factorization algorithm, meaning that its running time depends solely on the size of the integer to be factored, and not on special structure or properties.

Table of contents
1 Basic aim
2 How QS optimizes finding congruences
3 Multiple polynomials
4 Example
5 Factoring records

Basic aim

The algorithm attempts to set up a congruence of squares modulo n (the integer to be factorized), which often leads to a factorization of n. The algorithm works in two phases: the data collection phase, where it collects information that may lead to a congruence of squares; and the data processing phase, where it puts all the data it has collected into a matrix and solves it to obtain a congruence of squares. This phase requires large amounts of memory and cannot be parallelized so it is usually performed on a supercomputer when the number to be factored has more than 100 digits.

The naïve approach to finding a congruence of squares is to pick a random number, square it, and hope the least non-negative remainder modulo n is a perfect square. This approach finds a congruence of squares only rarely for large n, but when it does find one, more often than not, the factorization is complete.

The quadratic sieve is a modification of Dixon's factorization method.

How QS optimizes finding congruences

The quadratic sieve attempts to find pairs of integers x and y(x) (where y(x) is a function of x) satisfying a much weaker condition than x2y2 (mod n). It selects a smoothness bound B, and attempts to find x such that the least absolute remainder of x2 mod n (y(x)) is B-smooth. Such a pair of integers is called a relation. The set of primes less than or equal to B is called the factor base.

The quadratic sieve speeds up the process of finding relations by taking x close to the square root of n. This ensures that y(x) will be smaller, and thus have a greater chance of being B-smooth.

This implies that y is on the order of 2x[√n]. However, it also implies that y grows linearly with the square of |x|.

We can also increase the chance of smoothness by simply increasing the size of the factor base, but that means we have to collect more relations. We must have at least as many full relations as we have primes in the factor base.

Partial relations and cycles

Even if we find a relation where y(x) is not B-smooth, we may be able to merge two of these partial relations to form a full one. This is only possible if the two y 's are products of the same prime(s) outside the factor base. For example, if B = 7 and n = 91 we have the partial relations:

We can multiply these together:

and multiply both sides by (11−1)2 modulo 91. 11−1 = 58, so:

and we have a full relation. Such a full relation (obtained by combining partial relations) is called a cycle. Sometimes, forming a cycle from two partial relations leads straight to a congruence of squares, but rarely.

Checking smoothness by sieving

There are several ways to check for smoothness of the ys. The most obvious is by trial division, although this increases the running time for the data collection phase. Another method that has some acceptance is the elliptic curve method. However, in practice, a process called sieving is used.

Thus by solving y(x) ≡ 0 (mod p) for p, we find a whole sequence of ys which are divisible by p. (This is where the quadratic sieve gets its name -- y is a quadratic polynomial in x, and the sieving process works like the Sieve of Eratosthenes.) Repeating this process for other values of p allows us to quickly find likely B-smooth values, without having to test for smoothness every time. When large factor bases are used (see Factoring Records below), say 500000 primes, testing for smoothness every time takes too long to be practical.

When the data processing phase begins, however, it is necessary to factorize the y-values collected, since we need to know the exponents associated with each prime of in the factor base. This is not as time-consuming as testing for smoothness by factorizing, since we already know which primes a y-value is divisible by, from the sieving.

Multiple polynomials

In practice, many different polynomials are used for y, since with only one polynomial, we will not be able to collect enough (x, y) pairs that are smooth over the factor base. The polynomials must all have a similar form to the original y(x) = x2n:

This approach (called MPQS, Multiple Polynomial Quadratic Sieve) is ideally suited for parallelization, since each processor involved in the factorization can be given n, the factor base and a collection of polynomials, and it will have no need to communicate with the central processor until it is finished with its polynomials.

Example

Here is an example. Let n = 55 and B = 5. Since n is small (very small), we only need one polynomial, and no sieving.

Data collection

Now, we let x equal ...−2, −1, 0, 1, 2... and check if the resulting y is 5-smooth. Keep in mind that −1 is included in the factor base.

We have to collect, at the very least, one more full relation than there are primes in the factor base to guarantee the presence of a congruence of squares. We now have 4 full relations, with 4 primes in the factor base. In fact, it is a standard result from
linear algebra that we need t + 1 independent full relations to guarantee a linear dependence (a relationship that will yield a congruence of squares), where t is the size of the factor base. When t is large is nearly impossible for the relations being found to be independent, so we compute more full relations in order to overcome this problem.

Notice that the y(6) and y(9), though they both contain large primes, contain different large primes and so cannot be combined into a full relation. We can enter the exponent vectors (row vectors which list the exponents each prime is raised to) in a matrix. Each column represents one of the primes in our factor base; from left to right: −1, 2, 3, 5. We list only the "good" pairs, i.e., the ones that are 5-smooth.

Data processing

Using linear algebra, we can find rows that, when added together, produce a row vector with all even entries. In this case, the matrix is small enough that we can simply find such rows by inspection. In fact, the rows corresponding to x = 4 and x = 8 are already congruences of squares, but we will demonstrate combining relations to get a congruence nonetheless. Adding rows x = 5, 7, and 10 will give the all-even vector (2 2 4 2). We multiply those relations together and get:

Thus we have our congruence of squares. Now, gcd(35 − 9, 55) = 1, and gcd(35 + 9, 55) = 11. If we had not canceled 10 in the second line, we would have gcd(350 − 9) = 5 and gcd(350 + 90, 55) = 55. Either approach yields a non-trivial factor of 55.

This demonstration should also serve to show that the quadratic sieve is only appropriate when n is large. For a number as small as 55, this algorithm is overkill. Trial division could have found a factor with 3 divisions.

Factoring records

Until the discovery of the number field sieve (NFS), QS was the asymptotically-fastest known general-purpose factoring algorithm. Now, Lenstra elliptic curve factorization has the same asymptotic running time as QS (in the case where n has exactly two prime factors of equal size), but in practice, QS is faster since it uses single-precision operations instead of the multi-precision operations used by the elliptic curve method.

On April 2, 1994, the factorization of RSA-129 was completed using QS. It was a 129-digit number, the product of two large primes, one of 64 digits and the other of 65. The factor base for this factorization contained 524339 primes. The data collection phase took 5000 mips-years, done in distributed fashion over the Internet. The data collected totaled 2GB. The data processing phase took 45 hours on Bellcore's MasPar (massively parallel) supercomputer. This was the largest published factorization by a general-purpose algorithm, until NFS was used to factor RSA-130, completed April 10, 1996. All RSA numbers factored since then have been factored using NFS.


This is an Article on Quadratic sieve. Page Contains Information, Facts Details or Explanation Guide About Quadratic sieve


Google
 
Web www.E-paranoids.com

Search Anything