572 views (last 30 days)

Show older comments

John D'Errico
on 22 Sep 2019

Edited: John D'Errico
on 2 Apr 2020

There have now been multiple answers to this homework, all of which are fairly equivalent. (The response by Shubham Sangle has a lot of good points.) Let me offer a few ideas.

A good code to solve for the next prime would try to avoid calling isprime as much as possible, because calls to isprime are expensive. Of course, this depends on how large a number you are looking at

isprime(10000000000037)

ans =

logical

1

timeit(@() isprime(10000000000037))

ans =

0.009439529724

10000000000037 is close to as large a number you can use isprime on, working in double precision. If a single test for primality costs roughly 0.01 seconds (at least on my computer, your mileage may vary) then how long will it take to find the next prime after some number?

As it turns out, 10000000000037 is the next prime that exceeds 1e13. After that, we see primes at 10000000000051, 10000000000099, 10000000000129, 10000000000183, 10000000000259...

So out as far as 1e13, the spaces between primes seem to be running around 30 to 70, sometimes more, sometims less. That means, if you wanted to find the next prime after one of those primes, you might need to test perhaps as many as 70 numbers, or more. (However, you might get unlucky, if you find yourself at the beginning of a rather large hole between primes.)

If a single test by isprime takes 0.01 seconds out there, then your code for nextprime might take as much as one second of CPU time. That is not too bad, as long as one second is acceptable to you. Even so, there are some tricks you might employ.

The idea I will suggest here relies on the concept of k-rough numbers. Thus a rough number is one that has no small prime divisors. A k-rough number is one that has no prime divisors less than k. Some definitions will allow k as the smallest divisor, whereas I won't lose any sleep over the distinction.

A test in advance for small divisors is not a bad thing. For example, you can rule out all even numbers in your search. And you can rule out all multiples of 3, 5, 7, etc. That is, suppose I picked an arbitary large integer. What is the probability that it happens to be divisible by some small primes? (I call this pre-test a partial sieve.) Thus, 50% of the time, some large number will be divisible by 2. But 2/3 of the time (67%), it will be divisible by EITHER 2, OR 3. That means we could exclude 67% of the numbers, merely by knowing in advance if they were divisuble by EITHER 2, or by 3.

100*(1 - prod(1 - 1./[2 3]))

ans =

66.6666666666667

100*(1 - prod(1 - 1./[2 3 5 7 11 13 17 19]))

ans =

82.8975977582789

100*(1 - prod(1 - 1./primes(100)))

ans =

87.9682709525065

100*(1 - prod(1 - 1./primes(1000)))

ans =

91.9034736493157

So, if we excluded all numbers that are divisible by any small prime below 1000, then we could reduce the number of calls to isprime by almost 92%. We can carry this pretty far of course. So, if we pretest for divisibility by all primes below 1e8, the exclusion rate goes down to only roughly 97%.

100*(1 - prod(1 - 1./primes(1e8)))

ans =

96.9520278389414

That is important when a single call to isprime costs 30 minutes of CPU time. (This only happens if you are looking at numbers with many thousands of decimal digits though, using perhaps the sym version of isprime.)

As such, a decent version of nextprime might look like this:

function nextp = nextprime(N)

% solve for the next prime that comes immediately after N

% insure that N is itself an integer.

if N == ceil(N)

% then N is an integer already. We need to start looking at the NEXT

% prime number though, so we need to start our search at N+1.

N = N + 1;

else

% N was not an integer at all. So ceil will suffice to increase N

N = ceil(N);

end

% is N even? If so, then since we need ony test odd numbers, since

% even numers are not prime, then we can increment it to an odd number.

% however, just in case N was so small that the next prime would have

% been 2, special case that.

if N < 2

nextp = 2;

return

elseif rem(N,2) == 0

% we only need to look at odd numbers after this.

N = N + 1;

end

% Set up a partial sieve. Here, I'll set the sieve to check for divisibility

% by primes that do not exceed 1000. We don't need to check for divisibility

% by 2 though, since we will just step around all even numbers anyway, and

% we have already excluded the case where N was less than 2. So we never

% need to worry about the only even prime at this point.

partialsieve = primes(1000);

partialsieve(1) = [];

% finally, this is just a while loop. The test in the while becomes a no-op,

% because we break out when a prime gets hit. Use a while instead of a for

% because we have no idea how long the search will go. It does take a very long

% time before the intervening distance between two primes gets very large, so

% the while loop will terminate before long.

while true

if ~all(rem(N,partialsieve))

% no need to test this candidate

N = N + 2;

else

if isprime(N)

nextp = N;

% just break out of the loop. No need to even reset flag.

break

else

N = N + 2;

end

end

end

end

This code is actually pretty efficient, allowing us to exclude tests with isprime for 92% of the numbers we might otherwise try. In fact, it looks like it had to make only one call to isprime in this example run:

timeit(@() nextprime(1e13))

ans =

0.009934246724

I can make that claim, since that is roughly the time that MATLAB took for ONE call to isprime out there. And in this next test, there should have been only 5 calls to isprime.

timeit(@() nextprime(10000000000183))

ans =

0.047855878724

The nextprime code above is efficient, since it is MUCH faster to test for divisibility by even a rather lengthy list of integers using the rem test, then it it to make one call to isprime out there.

How long will the search for the next prime require? Luckily, the primes are scattered randomly, but it takes a seriously long time before the inter-prime stride becomes terribly large. At the same time, it can be shown this stride can be arbitrarily large if you go out far enough.

As you can see below, it generally will not take that many tests before the next prime is found, but it will help if you can avoid some of those tests with careful code.

>> max(diff(primes(1e6)))

ans =

114

>> max(diff(primes(1e7)))

ans =

154

>> max(diff(primes(1e8)))

ans =

220

>> max(diff(primes(1e9)))

ans =

282

Will this all be significant for large numbers? Yes, very much so. If you plot the maximum interprime stride, as a function of the size of the numbers we are looking at, you will find that on a plot with logged x axis, the curve should be roughly linear.

semilogx(10.^[6 7 8 9],[114 154 220 282],'-o')

grid on

I'll just fit a linear polynomial to it, as:

interprimepoly = polyfit([6 7 8 9],[114 154 220 282],1)

interprimepoly =

57 -235

This will be adequate for an extrapolation, since I don't care if it is only wildly approximate.

polyval(interprimepoly,[13, 100000])

ans =

506 5699765

As such, out in the vicinity of 1e13, we may possibly be pushed to test several hundred numbers to see if they are prime. If we can intelligently exclude perhaps 90% of those tests, then we can cut the time required by up to 90%.

Further out, looking at numbers with tens or hundreds of thousands of decimal digits, the importance of avoiding millions of calls to isprime will be quite significant.

Shubham Sangle
on 10 Jul 2019

This is link of answer to similar question https://stackoverflow.com/questions/2468412/finding-a-prime-number-after-a-given-number

Answer mentioned in above link mentions:

Some other methods have been suggested and I think that they are good, but it really depends on how much you want to have to store or compute on the spot. For instance if you are looking for the next prime after a very large number, then using the Sieve of Eratosthenes might not be so great because of the number of bits you would need to store.

Alternatively, you could check all odd integers between (and including) 3 and sqrt(N) on every number odd number N greater than the input number until you find the correct number. Of course you can stop checking when you find it is composite.

If you want a different method, then I would suggest using the Miller-Rabin primality test on all odd numbers above the input number (assuming the input is > 1) until a prime is found. If you follow the list, located at the bottom of the page, of numbers a to check for the given ranges, you can significantly cut down on the number of as you need to check. Of course, you might want to check at least a few of the smaller primes (3,5,7,11 for instance) before checking with Miller-Rabin.

John D'Errico
on 10 May 2020

For a little expansion on the ideas mentioned here...

A rough number is a number that is divisible by no small primes. A k-rough number is a number that is divisible by no prime less than k. That allows us to define the extent that a number is known to be rough. (Some definiitions would have that inequality be less than, others would say less than or equal to. The difference is not really important here.)

Rough-ness is useful because it can help you to exclude numbers from testing them using isprime, because isprime is somewhat computationally intensive. Excluding all even numbers from a test essentially allows you to exclude half the numbers from testing to see if they are prime. But then you might also like to exclude all numbers divisible by 3, then 5, 7, 11, etc.

A simple test to identify 42-rough numbers is easy to implement using gcd.

prod(primes(41))

ans =

304250263527210

log2(prod(primes(41)))

ans =

48.1122518409569

So we see the product of all primes no larger than 41 is small enough that it fits perfectly into the dynamic range of a double precision integer. And if we could exclude all numbers that have at least one prime factor no larger than 41, we would manage to exclude roughly 85% of all integers.

1 - prod(1 - 1./primes(41))

ans =

0.854906329597798

For example, is the number 1e9+1 a 41-rough number? GCD will tell us.

gcd(prod(primes(41)),1e9+1)

ans =

19019

No. It has at least one small prime factor, here we see that to be 3. It may have multiple factors of 3, but we really don't care, since we know now that 1e9+1 cannot possibly be prime.

In fact, if we consider the next 10 integers that exceed 1e9, how many are 41 rough?

gcd(prod(primes(41)),1e9 + (1:10))

ans =

19019 6 23 82 15 2 1 42 1 170

The idea is to look only for numbers that have a gcd of exactly 1 with our product of small prime, since that implies the number is not divisible by any of them.

So we immediately know that of the integers following 1e9, only 1e9+7 and 1e9+9 can possibly be prime, because only they are 41 rough.

isprime(1e9+[7 9])

ans =

1×2 logical array

1 1

In fact, they are both prime numbers. If we wanted to find more primes exceeding 1e9, we could just do:

kr41 = find(gcd(prod(primes(41)),1e9 + (1:50)) == 1)

kr41 =

7 9 13 19 21 31 33 37

isprime(1e9 + kr41)

ans =

1×8 logical array

1 1 0 0 1 0 1 0

So only 8 explicit calls to isprime were needed, and half of those tests were successful in identifying a prime number.

BHARAT JAWLE
on 4 Feb 2021

Edited: BHARAT JAWLE
on 4 Feb 2021

function k=next_prime(n)

k=n+1;

while factor(k)~= k

k=k+1;

end

Rik
on 4 Feb 2021

Rose Potter
on 10 Apr 2019

I was working on that exercise too and I tried

function next_prime(n)

x=n

while isprime(x) = 0

x = n+1

end

however, Matlab returns "The expression to the left of the equals sign is not a valid target for an assignment."

Stephen Cobeldick
on 10 Apr 2019

>> next_prime(5) % wrong

ans = 5

>> next_prime(7) % wrong

ans = 7

>> next_prime(8) % infinite loop until I press ctrl+c

>>

Your function either returns an incorrect value or goes into an infinite loop. You really need to test your code and read the comments in this thread.

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!