#! /usr/local/bin/perl

$limit = $ARGV[0] || -1^0;

$t = times;
# Calculate a vector of skips, so that I never check multiples of the first n
# primes.  Since all elements are small numbers (for 19 there are skips from 2
# to 34), we might store them in a bit group vector to save space.
@skip = $max = 2;
for $prime ( 3, 5, 7, 11, 13, 17, 19 ) {
    $max *= $prime;
    my $prev = $candidate = 0;
    my $i = $prime;
    my $nextoddmultiple = $prime * 3;
    my $max = $max + $i;
    $max = $limit if $max > $limit;
    my @preskip = @skip;
    @skip = ();
  SERIES: while() {
	for $skip ( @preskip ) {
	    $i += $skip;
	    while ( $i > $nextoddmultiple ) {
		#print "<$nextoddmultiple|$i+$skip|$prev>\n";
		$nextoddmultiple += $prime << 1;
	    }
	    if ( $i < $nextoddmultiple ) {
		$candidate ||= $i;
		push @skip, $i - $prev if $prev;
		#print "[$nextoddmultiple|$i+$skip|$prev]@skip\n";
		$prev = $i;
	    }
	    last SERIES if $i > $max;
	}
    }
}
#print ">$i<\n";exit;
@primes = $candidate;
#print "$_	$count{$_}\n" for keys %count;exit;
#print ":$candidate:$#skip:\n";exit;
#print "$candidate:@skip\n"; exit;

#use Math::BigInt;
#$candidate = new Math::BigInt 5;

$endnormal = -1;
$end = $primesmin = $primesi = $delayskip = 0;
$nextprimeproduct = $max = $candidate * $candidate;
# 0: the prime this concerns (constant)
# 1: this prime multiplied by the next prime giving an as yet unreached product
# 2: the index of that next prime
# 3: this prime**3, limit where this starts failing
@primefactors = [$candidate, $nextprimeproduct, 0, $candidate * $nextprimeproduct];

print STDERR 'step 1: ', -$t + ($t = times), "s\n";
#print">$nextprimeproduct = $candidate ** 2\n";

while() {
    for $skip ( @skip ) {
	$candidate += $skip;
	exit if $candidate > $limit;
	if( $candidate < $nextprimeproduct ) {
	    $skip += $delayskip;
	    $div = 0;
	    for( @primes[0..$endnormal] ) {
		if ( ($_->[1] += $skip) >= $_->[0] ) {
		    $_->[1] -= $_->[0] until $_->[1] < $_->[0];
		    $div = 1 unless $_->[1];
		}
	    }
	    unless( $div ) {
		print $candidate, "\n";
		push @primes, $candidate;
	    }
	    $delayskip = 0;
	    next;
	}

	# We didn't increment primes modulo counter, so remember $skip for later
	$delayskip += $skip;

	# remove first element, multiply it with next prime
	$x = shift @primefactors;
	$x1 = $$x[1] = $$x[0] * $primes[++$$x[2]];
	if( $x1 > $max ) {
	    # multiplication surpassed next unused prime to the square, so push an element for that
	    $end++;
	    $max = $primes[$end] * $primes[$end];
	    push @primefactors, [$primes[$end], $max, $end, $primes[$end] * $max];
	}

	if( ! $$x[3] ) {
	    # We reached prime**3, so eliminate prime, activate it for normal sieve
	    $endnormal++;
	    $primes[$endnormal] = [$primes[$endnormal], ($candidate - $delayskip) % $primes[$endnormal]]
	} else {
	    if( $x1 >= $$x[3] ) {
		# We multiplied past prime**3, so reinsert that, instead of $x1
		$x1 = $$x[1] = $$x[3];
		$$x[3] = 0;
	    }
	    # reinsert element in order, treating this as a B-tree
	    $a = 0;
	    $z = @primefactors;
	    $i = $z >> 1;
	    while() {
#	    print "{$primefactors[$i]}:$a:$i:$z:\n";
		if( $x1 < ${$primefactors[$i]}[1] ) {
		    $z = $i;
		} else {
		    $a = ++$i;
		}
#	    print "<$a:$i:$z>\n";
		last if $a == $z;
		$i = ($a + $z) >> 1;
	    }
	    splice @primefactors, $i, 0, $x;
	}
	$nextprimeproduct = ${$primefactors[0]}[1];
#print ">$$_[1] = $$_[0] * $primes[$$_[2]]\n" for @primefactors;
    }
}

END { print STDERR 'step 2: ', times - $t, "s\n"; }

__END__

=head1 Fast Prime Sieve

The famous Sieve of Eratosthenes has a few drawbacks: it is O(n**2), barring
the same values again and again.  It requires space for as many numbers as you
want to analyse, and can not be used to continue if you want more numbers.

=head2 Inverting Eratosthenes

That's why I invert it: I look at candidates one by one, checking if they are
zero modulo any of the primes up to its square root, i.e. divisible by it.
Only otherwise have I found a new prime.  To do this I must store each
encountered prime.  To save calculating a modulus at each step, I add a
counter to each, where modulus is usually one comparison, or two and a
subtraction.

=head2 Reducing Candidates

When looking at this further, it becomes evident that every second candidate
is even.  So I skip them, by adding two at each step.  Then every other
remaining candidate is divisible by three.  I can skip those too, by
alternately adding two and four.  When driving this further, it becomes more
complicated, the remaining multiples of five to skip are 25, 35, 55, 65...

When calculating these series of skips, they become exponentiallyy long,
before they can be repeated, even for a small number of prime multiples I want
to skip.  For primes up to 17 the skip list is 92159 entries long, up to 19,
already 1658879.  But it means looking at far less candidates, and doing a few
less divisibility tests for each candidate.  So it's definitely worth it, as
timing tests show!

=head2 Another Reduction

Since I never test a candidate divisible by the first few primes, I must
never modulus test against those.  The next non-primes are predictable: 23*23,
23*29, 23*31... and 29*29, 29*31...  By calculating the next of each series
and ordering them all, I must only look for the smallest of them to know it's
not a prime.  Then I multiply that by the next prime and reinsert it into the
prediction list.

This alas fails, when I pass 23**3, 29**3, 31**3...  So at each of those
stages I take the prime out of the prediction list into the modulus check list.
Result is that modulus checking starts not at the square of a prime, but only
at the power of 3.  And checks go not up to the square root of each candidate,
but only up to the 3rd root.

=head1 AUTHOR

Daniel Pfeiffer <occitan@esperanto.org>

=begin CPAN

=head1 README

B<Fast Prime Sieve>
B< · >can go on up to maxint (32 or 64 bit)
B< · >could continue into Math::BigInt
B< · >3 major optimizations
B< · >self unpacking
B< · >embeddable unpacker
B< · >Emacs mode

=pod SCRIPT CATEGORIES

Educational/ComputerScience
Search
