#! /usr/local/bin/perl

$limit = $ARGV[0] || ~ 0;

$t = times;

# Calculate a vector of skips, so that I never check multiples of the first n
# primes.  For each small prime calculate the skip list based on the skip list
# of the next samller prime.  Compare each candidate only to the odd multiples
# of the current prime.
@skip = $max = 2;
print "2\n";
for $prime ( 3, 5, 7, 11, 13, 17, 19 ) {
    print "$prime\n";
    $max *= $prime;
    my $prev = $candidate = 0;
    my $i = $prime;
    my $prime2 = $prime << 1;
    my $nextoddmultiple = $prime + $prime2;
    my $max = $max + $i;
    $max = $limit if $max > $limit;
    my @preskip = @skip;
    @skip = ();
  SERIES: while() {
	for( @preskip ) {
	    $i += $_;
	    $nextoddmultiple += $prime2 while $i > $nextoddmultiple;
	    if( $i < $nextoddmultiple ) {
		$candidate ||= $i;
		push @skip, $i - $prev if $prev;
		$prev = $i;
	    }
	    last SERIES if $i > $max;
	}
    }
}

print "$candidate\n";
@primes = $candidate;

$endnormal = -1;
$end = $primesmin = $primesi = $delayskip = 0;
$nextprimeproduct = $max = $candidate * $candidate;
# Dataset for predictable non primes:
# 0: the prime this concerns (constant)
# 1: this prime multiplied by the next prime giving an as yet unreached product
# 2: the index of that next prime
# 3: this prime**3, limit where this starts failing
@primefactors = [$candidate, $nextprimeproduct, 0, $candidate * $nextprimeproduct];

print STDERR 'step 1: ', -$t + ($t = times), "s\n";

while() {
    for( @skip ) {
	$candidate += $_;
	exit if $candidate > $limit;
	$delayskip += $_;
	if( $candidate < $nextprimeproduct ) {
	    $div = 0;
	    for( @primes[0..$endnormal] ) {
		if ( ($_->[1] += $delayskip) >= $_->[0] ) {
		    $_->[1] -= $_->[0] until $_->[1] < $_->[0];
		    $div = 1 unless $_->[1];
		}
	    }
	    unless( $div ) {
		print "$candidate\n";
		push @primes, $candidate;
	    }
	    $delayskip = 0;
	    next;
	}

	# remove first element, multiply it with next prime
	$x = shift @primefactors;
	$x1 = $$x[1] = $$x[0] * $primes[++$$x[2]];
	if( $x1 > $max ) {
	    # multiplication surpassed next unused prime to the square, so push an element for that
	    $end++;
	    $max = $primes[$end] * $primes[$end];
	    push @primefactors, [$primes[$end], $max, $end, $primes[$end] * $max];
	}

	if( $x1 >=0){# $$x[3] ) {
	    # We reached prime**3, so eliminate prime, activate it for normal sieve
	    $endnormal++;
	    $primes[$endnormal] = [$primes[$endnormal], ($candidate - $delayskip) % $primes[$endnormal]]
	} else {
	    if( $x1 >= $$x[3] ) {
		# We multiplied past prime**3, so reinsert that, instead of $x1
		$x1 = $$x[1] = $$x[3];
		$$x[3] = 0;
	    }
	    # reinsert element in order, treating @primefactors as a B-tree
	    $a = 0;
	    $z = @primefactors;
	    $i = $z >> 1;
	    while() {
		if( $x1 < ${$primefactors[$i]}[1] ) {
		    $z = $i;
		} else {
		    $a = ++$i;
		}
		last if $a == $z;
		$i = ($a + $z) >> 1;
	    }
	    splice @primefactors, $i, 0, $x;
	}
	$nextprimeproduct = ${$primefactors[0]}[1];
    }
}

END { print STDERR 'step 2: ', times - $t, "s\n" }

__END__

=head1 Fast Prime Sieve

The famous Sieve of Eratosthenes has a few drawbacks: it is O(n**2), barring
the same values again and again.  It requires space for as many numbers as you
want to analyse, and can not be used to continue if you want more numbers.

=head2 Inverting Eratosthenes

That's why I invert it: I look at candidates one by one, checking if they are
zero modulo any of the primes up to its square root, i.e. divisible by it.
Only otherwise have I found a new prime.  To do this I must store each
encountered prime.  To save calculating a modulus at each step, I add a
counter to each, where modulus is usually one comparison, or two and a
subtraction.

=head2 Reducing Candidates

When looking at this further, it becomes evident that every second candidate
is even.  So I skip them, by adding two at each step.  Then every other
remaining candidate is divisible by three.  I can skip those too, by
alternately adding two and four.  When driving this further, it becomes more
complicated, the remaining multiples of five to skip are 25, 35, 55, 65...

When calculating these series of skips, they become exponentially long,
before they can be repeated, even for a small number of prime multiples I want
to skip.  For primes up to 17 the skip list is 92159 entries long, up to 19,
already 1658879.  But it means looking at far less candidates, and doing a few
less divisibility tests for each candidate.  So it's definitely worth it, as
timing tests show!

=head2 Predictable Non-Primes

Since I never test a candidate divisible by the first few primes, I never need
to modulus test against those.  So the next non-primes are predictable: 23*23,
23*29, 23*31... and 29*29, 29*31...  By calculating the next of each series
and ordering them all, I must only look for the smallest of them to know it's
not a prime.  Then I multiply that by the next prime and reinsert it into the
prediction list.

This alas fails, when I pass 23**3, 29**3, 31**3...  So at each of those
stages I take the prime out of the prediction list into the modulus check list.
Result is that modulus checking starts not at the square of a prime, but only
at the power of 3.  And checks go not up to the square root of each candidate,
but only up to the 3rd root.

With Perl 5.8.5 on a 1.5GHz Pentium it takes 19s to calculate all primes up to
1e6, 490s up to 1e7 and s up to 1e8.

=head1 AUTHOR

Daniel Pfeiffer <occitan@esperanto.org>

=begin CPAN

=head1 README

B<Fast Prime Sieve>
bounded only by memory with Math::BigInt::Lite
Optimizations:
B< · >never test product of first I<n> primes
B< · >quick test products in I<p>**2 .. I<p>**3
B< · >modulus by addition up to 3rd root

=pod SCRIPT CATEGORIES

Educational/ComputerScience
Search
