Nice primer — as you already noted, small changes (long vs. int) can have a big effect on such microbenchmarks. Different JS runtimes may also behave quite differently. Lessons to take from this:

- ALWAYS measure in an environment that resembles what your users will actually run
- Native calls from the JS runtime have a per-call cost. So make as few calls as possible, and keep the cost of bridging data across the boundary as small as possible. The longer the computation on the native side takes, the less the call overhead matters.
- Integrating native modules makes the whole system more complicated and potentially more fragile. I would think twice before doing that for only a 50% speedup. There is often MUCH more to gain from choosing a different algorithm (here Baillie-PSW, plus perhaps a lookup table for smaller primes; there are only 168 primes below 1000)
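To illustrate the lookup-table point: a minimal sketch (names are mine, not from the original post) that precomputes the primes below 1000 with a sieve of Eratosthenes once, so small inputs are answered by a Set lookup with no native call at all:

```javascript
// Sieve of Eratosthenes: returns all primes below `limit`.
function sievePrimes(limit) {
  const isComposite = new Uint8Array(limit);
  const primes = [];
  for (let n = 2; n < limit; n++) {
    if (!isComposite[n]) {
      primes.push(n);
      // Mark all multiples of n starting at n*n as composite.
      for (let m = n * n; m < limit; m += n) isComposite[m] = 1;
    }
  }
  return primes;
}

const primes = sievePrimes(1000);
const primeSet = new Set(primes);
console.log(primes.length);      // 168, as noted above
console.log(primeSet.has(997));  // true
```

The table costs a few kilobytes and is built once at startup; every primality check below the limit then becomes a constant-time hash lookup.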

I know — calculating primes was just an example — but for nearly anything that gets complicated enough to benefit from a native implementation, there is a good chance that an even bigger win lies in choosing better algorithms (and common complexity-reducing techniques like memoization).
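As a sketch of the memoization idea (a generic wrapper of my own, not anything from the original post): cache the result of an expensive pure function so repeated calls with the same argument skip the computation entirely.

```javascript
// Wrap a single-argument pure function with a result cache.
function memoize(fn) {
  const cache = new Map();
  return (x) => {
    if (!cache.has(x)) cache.set(x, fn(x));
    return cache.get(x);
  };
}

// Naive trial division, standing in for an "expensive" computation.
function isPrimeSlow(n) {
  if (n < 2) return false;
  for (let d = 2; d * d <= n; d++) {
    if (n % d === 0) return false;
  }
  return true;
}

const isPrime = memoize(isPrimeSlow);
isPrime(1000003); // computed on the first call
isPrime(1000003); // served from the cache
```

Like the native-call advice above, this attacks the cost per operation: the second lookup is a Map hit instead of a full trial-division loop.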