Ok, this was fun. Its been a while since I dusted off good old rzf … ok, its been 12-ish days … but I really have been wanting to try recoding it in javascript.

As you might (or might not) remember, I asked questions (a very long time ago) about quality of generated code from a few different C compilers (and eventually the same code in Fortran). I rewote inner loops to hand optimize the compilation, and then recoded as SSE2. Finally, in a late night fit of sleeplessness, I rewrote the SSE2 as AVX to see what happens. I can’t exactly draw valid conclusions of these tests, but they were fun exercises.

Ok, now you know part of the history. Also, I’ve been playing more and more with javascript … no … node.js … for a set of projects. So I wanted to see how hard the port was from C to node. Well, not terribly hard. Remember, node.js uses the V8 engine as a compiler, and if you look at the Julia Language website with their scaled benchmark results, Javascript isn’t half bad relative to C.

And its actually not so bad to program in.

But back to the performance.

I rewrote rzf.c to rzf.js. If you want to run this you’ll need to run

npm install posix-getopt npm install printf

and yes, I was being lazy with the conversion, printf almost works the way C’s does. I can clean up the code some what, but, this is a quick and dirty test, so don’t yell at me over the out=printf(…);console.log(out); sillyness.

Running the original rzf.c compiled with gcc

landman@pegasus:~/work/benchmarking/rzftest$ make -f Makefile.rzf-gcc gcc -O3 -I. -Bstatic -c rzf.c gcc -O3 -Bstatic -o rzf-gcc.exe rzf.o -lm gcc -dA -dp -S -O3 -I. -Bstatic -c rzf.c -o rzf-gcc.s

I got this:

landman@pegasus:~/work/benchmarking/rzftest$ ./rzf-gcc.exe -n 2 -l 100000000 D: checking arguments: N_args=5 D: arg[0] = ./rzf-gcc.exe D: arg[1] = -n D: N found to be = 2 D: should be 2 D: arg[2] = 2 D: arg[3] = -l D: infinity found to be = 2 D: should be 100000000 D: arg[4] = 100000000 D: running on machine = pegasus zeta(2) = 1.644934056848226 pi = 3.141592644040496 error in pi = 0.000000009549297 relative error in pi = 0.000000003039636 Milestone 0 to 1: time = 0.000s Milestone 1 to 2: time = 1.329s

Thats 1.329s to execute the core loop. Same thing in node.js

landman@pegasus:~/work/benchmarking/rzftest$ /opt/scalable/bin/node rzf.js -n 2 -l 100000000 n option found. Argument is: 2 l option found. Argument is: 100000000 D: running on machine = pegasus zeta(2) = 1.644934056848226 pi = 3.141592644040496 error in pi = 0.000000009549297 relative error in pi = 0.000000003039636 Milestone 0 to 1: time = 0.007000s Milestone 1 to 2: time = 3.856000s

Thats about a factor 2.9 slower. Of course, comparing this to hand optimized sse2 from a few years ago …

landman@pegasus:~/work/benchmarking/rzftest$ ./rzf-sse2.exe -n 2 -l 100000000 D: checking arguments: N_args=5 D: arg[0] = ./rzf-sse2.exe D: arg[1] = -n D: N found to be = 2 D: should be 2 D: arg[2] = 2 D: arg[3] = -l D: infinity found to be = 2 D: should be 100000000 D: arg[4] = 100000000 D: running on machine = pegasus D: start_index = 100000000 D: end_index = 2 D: unroll = 2 D: inf-1 = 99999999 zeta(2) = 1.644934056848226 pi = 3.141592644040497 error in pi = 0.000000009549296 relative error in pi = 0.000000003039635 Milestone 0 to 1: time = 0.000s Milestone 1 to 2: time = 0.372s

which is about 10.4x faster than node.js. Then again, I don’t expect node to vectorize, or efficiently optimize its math libraries (inlining calls, etc.).

Still, this is quite respectable.

Compare this to a perl port of this code …

landman@pegasus:~/work/benchmarking/rzftest$ /opt/scalable/bin/perl rzf.pl -n 2 -l 100000000 D: running on machine = cpu_name zeta(2) = 1.644934056848226 pi = 3.141592644040496 error in pi = 0.000000009549297 relative error in pi = 0.000000003039636 Milestone 0 to 1: time = 0.002s Milestone 1 to 2: time = 15.797s

which is about 4x slower than the node.js code.

rzf.js:

var i,j,k,milestone,n,rc; var NMAX=5000000, MMAX=10; var array,total,p_sum,sum,delta_t,pi, pi_exact, t_now; var inf=0 ; var name_length=0; var cpu_name,c; var caliper = new Array(10); var mod_getopt = require('posix-getopt'); var parser, option; var os = require("os"); var printf = require("printf"); var out; milestone = 0; n = 0; sum = 0.0; pi_exact = 4.0*Math.atan(1.0); caliper[milestone] = new Date().getTime(); /* Riemann zeta function of integer argument (c.f. http://mathworld.wolfram.com/RiemannZetaFunction.html ) zeta(n) = sum[k=1;k< =inf;k++] (1.0/pow(k,n)) inf ---- \ 1 zeta(n) = > - / n ---- k k=1 this code will compute zeta(2) in order to calculate pi. pi*pi/6 = zeta(2), so pi = sqrt(6*zeta(2)) to run this code, type ./rzf.exe -l INFINITY -n n where INFINITY is an integer value of how many terms you would like to take for your sum, and n is the argument to the Reimann zeta function. If you use 2 for n, then it will calculate pi for you as well. */ var opt; parser = new mod_getopt.BasicParser('n:l:', process.argv); while ((opt = parser.getopt()) !== undefined) { switch (opt.option) { case 'n': console.log("n option found. Argument is: " + opt.optarg); n = opt.optarg; break; case 'l': console.log("l option found. Argument is: " + opt.optarg); inf = opt.optarg; break; } } sum=0; cpu_name = os.hostname(); out = printf("D: running on machine = %s\n",cpu_name); console.log(out); milestone++; caliper[milestone] = (new Date).getTime(); for(k=inf-1;k>=1;k--) { sum += 1.0/Math.pow(k,n); } milestone++; caliper[milestone] = new Date().getTime(); out = printf("zeta(%i) = %.15f \n",n,sum); console.log(out); if (n == 2) { pi = Math.sqrt(sum*6.0); out = printf("pi = %.15f \n",pi); console.log(out); out = printf("error in pi = %.15f \n",pi_exact-pi); console.log(out); out = printf("relative error in pi = %.15f \n",(pi_exact-pi)/pi_exact); console.log(out); } /* now report the milestone time differences */ for (i=0;i< =(milestone-1);i++) { delta_t = (caliper[i+1] -caliper[i])/1000; out = printf("Milestone %i to %i: time = %fs\n",i,i+1,delta_t); console.log(out); }

and for completeness, the rzf.pl

use Time::HiRes qw( usleep ualarm gettimeofday tv_interval nanosleep clock_gettime clock_getres clock_nanosleep clock stat ); use Getopt::Long; use Math::Trig; my ($i,$j,$k,$milestone,$n,$rc); my $NMAX=5000000; my $MMAX=10; my (@array,$total,$p_sum,$sum,$delta_t,$pi, $pi_exact,@caliper); use constant true => (1==1); use constant false => (1==0); my $inf=0 ; my $c; $milestone = 0; $n = 0; $sum = 0.0; $pi_exact = 4.0*atan(1.0); $caliper[$milestone] = [gettimeofday]; # # Riemann zeta function of myeger argument # (c.f. http://mathworld.wolfram.com/RiemannZetaFunction.html ) # # zeta(n) = sum[k=1;k< =inf;k++] (1.0/pow(k,n)) # # inf # ---- # \ 1 # zeta(n) = > - # / n # ---- k # k=1 # # # this code will compute zeta(2) in order to calculate # pi. pi*pi/6 = zeta(2), so pi = sqrt(6*zeta(2)) # # to run this code, type # # ./rzf.exe -l INFINITY -n n # # where INFINITY is an myeger value of how many terms you would # like to take for your sum, and n is the argument to the Reimann # zeta function. If you use 2 for n, then it will calculate pi # for you as well. # # */ GetOptions ("l=i" =>\$inf, "n=i" => \$n); $sum=0.0; chomp($cpu_name = `hostname`); printf("D: running on machine = %s\n",cpu_name); $milestone++; $caliper[$milestone] = [gettimeofday]; for($k=$inf-1;$k>=1;$k--) { $sum += 1.0/$k**$n; } $milestone++; $caliper[$milestone] = [gettimeofday]; printf("zeta(%i) = %-.15f \n",$n,$sum); if ($n == 2) { $pi = sqrt($sum*6.0); printf("pi = %-.15f \n",$pi); printf("error in pi = %-.15f \n",$pi_exact-$pi); printf("relative error in pi = %-.15f \n",($pi_exact-$pi)/$pi_exact); } #/* now report the milestone time differences */ for ($i=0;$i< =($milestone-1);$i++) { $delta_t = tv_interval($caliper[$i] , $caliper[$i+1] ); printf("Milestone %i to %i: time = %-.3fs\n",$i,$i+1,$delta_t); }