-2

Let $g(n),f(n)$ be functions of $n \in \mathbb{N}$ such that $g(n)=(n−1)^\frac{1}{n−1}$ and $f(n)=\frac{a(g(n)^n)+(g(n)+(\frac{b}{n}))^n}{2}$. Define $P(n)=(f(n))\log_e (f(n))$. $P(n)$ gives the $n$th Prime Number.

$a=0.520372$ , $b=0.544973$

This works best for numbers like $100,200,500,1000,10000...$ etc.

Klangen
  • 5,075

1 Answers1

1

Cipolla (1902), as shown in Dusart (1999) page 12 and other places. To term m = 2:

$P_n \approx n \Big(\log n + \log\log n - 1 + \frac{\log\log n - 2}{\log n} - \frac{\log\log n \log\log n - 6\log\log n + 11}{2\log n\log n}\Big)$

Empirically we can add in a third order approximation to get slightly better results. This is a small factor varying from 10 to -0.025 times $(\log\log n / \log n)^3n$ depending on the value of n.

While more work to implement, we can do better with $R^{-1}(n)$ --- the inverse of Riemann's R function.

$\begin{align} n & = 30,000\\ p_{n} & = 350,377\\ P(n) & = 351,097 \ \ (+720) \\ C(n) & = 349,883 \ \ (-494)\\ C^*(n) & = 350,613 \ \ (+236)\\ R^{-1}(n) & = 350,517 \ \ (+140)\\ \end{align}$

$\begin{align} n & = 20,000,000\\ p_{n} & = 373,587,883\\ P(n) & = 379,994,996 \ \ (+6407113) \\ C(n) & = 373,571,901 \ \ (-15982)\\ C^*(n) & = 373,571,901 \ \ (-15982)\\ R^{-1}(n) & = 373,578,699 \ \ (-9184)\\ \end{align}$

$\begin{align} n & = 2,000,000,000\\ p_{n} & = 47,055,833,459\\ P(n) & = 48,337,784,789\ \ (+1281951330)\\ C(n) & = 47,056,149,503 \ \ (+316044)\\ C^*(n) & = 47,056,003,062 \ \ (+169603)\\ R^{-1}(n) & = 47,055,819,481 \ \ (-13978)\\ \end{align}$

The approximation you give doesn't do great even at values like 30,000. It does very poorly when we get into semi-large values.

If you want something specifically tuned for very small values (e.g. under 10k), then I suppose we'd want to look at some other methods. On the other hand, we could just use a table, an interpolated sparse table, or just sieve them since this is a trivial size.

The paper "The n-th Prime Asymptotically" (2012) gives a bit more history for this approximation, then goes on to construct approximations to ${\rm li}^{-1}(n)$. As programmers we can just use a binary search on ${\rm li}()$. This will give a very good approximation for $p_n$ of course. $R^{-1}(n)$ is better however, just like $R(n)$ is better than ${\rm li}(n)$ for approximating the prime count.

DanaJ
  • 2,999
  • Try $f(n)=1.1202*(g(n)^n)$ – Ashish Goel Jul 12 '16 at 17:28
  • That does worse than your original under 20k, but better once over. It's only a very small improvement for large values such as the 20M and 2000M examples. Based on MSE for various ranges, $C$ is better once past 70k, and $R^{-1}$ is much better than the others at pretty much every size. – DanaJ Jul 12 '16 at 18:21
  • The original value I gave is not very different from the second one. For any $nth$ value, you have to multiply $g(n)^n$ with a value close to $1.1$ like$1.10,1.11,1.12,1.13,1.14,1.15$ and you'll get a value very close to the $nth$ prime number with a difference of less than 10. – Ashish Goel Jul 12 '16 at 19:22
  • Orig: $f(n)=\frac{a(g(n)^n)+(g(n)+(\frac{b}{n}))^n}{2} (a=0.520372, b=0.544973)$ New: $f(n) = 1.1202 g(n)^n$. I think you need to be more explicit about what your goal is. Is there a particular range that you care about? Regardless of parameters, your method cannot work well for asymptotic approximation. It looks like you're testing with tiny inputs. Finding the best method for, say, 2 to 100k, is very different than finding something that works well at all values. The latter has to take the way primes behave into account, the former does not. – DanaJ Jul 12 '16 at 19:28
  • In the original formula, it is just an average of two formulas that is $a(g(n)^n$ & $(g(n)+b/n)^n$. Both of these equations give a good approximation with $a$~$1.1$ and $b$~$0.11$. The value of $a$ & $b$ I gave was just a try to make it good for all the values, which it did for some range. Also, I tried to find a formula which can be calculated by hand. If you compare this with a $nlog(n)$, you'll see that this gives far better approximation. And I'm only talking about the known numbers. Also, I only have a phone calculator that's why couldn't calculate this for a very big range. – Ashish Goel Jul 12 '16 at 20:42