# A wake up call about language choices for certain use cases

I am a somewhat unapologetic Perl user. I write lots of code in Perl, simply because I find it doesn’t get in my way of expressing concepts, and it’s really quite fast. I’ve used many languages over the decades, and none is perfect, but Perl has become something of my go-to language (similar to C in years past).
I forget the exact context in which I found this page, but find it I did. On it, the author gives a simple multi-precision arithmetic example in Java, Go, and Python. As a performance test it leaves some things to be desired (I’ve got another post in the works on a benchmarking failure I found online; give me some time to get that done), but it was interesting to me in that it was a very simple test with which one could explore the expressiveness of a language, together with a rough feel for how well integrated GMP or similar packages were.
I pulled the Python source down and ran it. Similar times to what they had. Very compact source. This, honestly, surprised me.
OK, I wanted to convert this to Perl to see if we could be within an order of magnitude of the other codes. Python isn’t particularly fast by itself, but it has a good set of internals that make interacting with and using external libraries trivial. So how well GMP was integrated into Python would be relevant to its performance.
Likewise with Perl.
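That claim about Python is easy to demonstrate: the standard library ships a usable FFI. A minimal sketch (not from the original post, and assuming a Unix-like system where the C math library can be located) calling C’s `sqrt` directly through `ctypes`:

```python
import ctypes
import ctypes.util

# Locate and load the system C math library.
libm = ctypes.CDLL(ctypes.util.find_library("m"))

# Declare the signature so ctypes marshals doubles correctly.
libm.sqrt.restype = ctypes.c_double
libm.sqrt.argtypes = [ctypes.c_double]

print(libm.sqrt(2.0))  # calls the C library's sqrt directly
```

No extension module, no build step: load, declare the signature, call. This ease of binding is a large part of why wrappers like Python’s GMP bindings can be both thin and fast.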
First the Python code. Using the 3.4.2 revision, I ran the following code:

```python
# 157 bit n = pq with p ~= 78 bits
n = 273966616513101251352941655302036077733021013991
i = 496968652506233112158689  # prime minus 10e6
while True:
    if n % i == 0:
        print(i)
        break
    i += 2
```
```
landman@lightning:~/work/benchmarking/factor$ time /opt/scalable/bin/python3 factors.py
496968652506233122158689

real	0m1.894s
user	0m1.880s
sys	0m0.012s
```

Then I rewrote this in Perl, using Math::BigInt and the GMP libs. This is the Scalable Informatics toolchain, BTW, for Python and Perl.

```perl
#!/opt/scalable/bin/perl
use strict;
use Math::BigInt only => 'GMP';

# 157 bit n = pq with p ~= 78 bits
my $n = Math::BigInt->new('273966616513101251352941655302036077733021013991');
my $i = Math::BigInt->new('496968652506233112158689'); # prime minus 10e6
my $two = Math::BigInt->new('2');

$i += $two while ($n % $i);

printf "$i\n";
```

```
landman@lightning:~/work/benchmarking/factor$ time ./factors.pl
496968652506233122158689

real	2m12.028s
user	2m11.817s
sys	0m0.004s
```


Ok … wow. I wasn’t expecting that.
I tried the Go version as well, again with our 1.4 build from the Scalable Informatics toolchain.

```
landman@lightning:~/work/benchmarking/factor$ time /opt/scalable/go/bin/go run factors.go
496968652506233122158689

real	0m4.288s
user	0m4.338s
sys	0m0.112s
```

Ummm … ok … I am having a hard time understanding this. OK, let me rewrite this as a simple GMP C program. This should be the fastest of them all.

```c
#include <stdio.h>
#include <stdlib.h>
#include <gmp.h>

int main(void)
{
    mpz_t n, i, two, m;
    mpz_init(m);
    mpz_init_set_str(n, "273966616513101251352941655302036077733021013991", 10);
    mpz_init_set_str(i, "496968652506233112158689", 10);
    mpz_init_set_str(two, "2", 10);

    while (1) {
        mpz_mod(m, n, i);
        if (mpz_sgn(m) == 0) { break; }
        mpz_add(i, i, two);
    }
    gmp_printf("%Zd\n", i);

    mpz_clear(n);
    mpz_clear(i);
    mpz_clear(two);
    mpz_clear(m);
    return 0;
}
```

then compile and run

```
landman@lightning:~/work/benchmarking/factor$ gcc -O3 -o factor.exe factor.c -lm -lgmp
```
```
landman@lightning:~/work/benchmarking/factor$ time ./factor.exe
496968652506233122158689

real	0m0.334s
user	0m0.330s
sys	0m0.004s
```

and finally, since I am enjoying Julia so much

```julia
n    = BigInt(273966616513101251352941655302036077733021013991)
i    = BigInt(496968652506233112158689)  # prime minus 10e6
two  = BigInt(2)
zero = BigInt(0)

while (mod(n,i) > 0)
    i += two
end
println(i)
```

```
landman@lightning:~/work/benchmarking/factor$ time /opt/scalable/bin/julia factors.jl
496968652506233122158689

real	0m3.433s
user	0m3.467s
sys	0m0.733s
```


Ok.
So I profiled the Perl code, and it spent something like 90% of its time initializing new temporaries. That is, the GMP library usage in Perl is horrible. And it’s horrible because there does not seem to be an understanding that one should avoid instantiating objects deep in inner loops, and that copies should be strenuously avoided in high performance code.
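The effect is easy to mimic from pure Python (a toy sketch, not the original Perl profile): wrap plain integers in a class whose overloaded operators return a fresh object on every call, roughly what operator-overloading bignum bindings do, and compare it against bare ints on a shortened version of the trial-division loop (starting 10^4 below the factor instead of 10^7, so it runs quickly).

```python
import time

class Wrapped:
    """Toy bignum wrapper: every operation allocates a new temporary,
    mimicking naive operator-overloading bignum bindings."""
    __slots__ = ("v",)
    def __init__(self, v):
        self.v = v
    def __mod__(self, other):
        return Wrapped(self.v % other.v)  # fresh temporary per call
    def __add__(self, other):
        return Wrapped(self.v + other.v)  # fresh temporary per call

N = 273966616513101251352941655302036077733021013991
START = 496968652506233122158689 - 10**4  # shortened search range

def plain(n, i):
    while n % i:          # bare ints: no wrapper churn
        i += 2
    return i

def wrapped(n, i):
    n, i, two = Wrapped(n), Wrapped(i), Wrapped(2)
    while (n % i).v:      # every % and + allocates a Wrapped temporary
        i = i + two
    return i.v

t0 = time.perf_counter(); f1 = plain(N, START);   t1 = time.perf_counter()
f2 = wrapped(N, START);                           t2 = time.perf_counter()
assert f1 == f2 == 496968652506233122158689
print(f"plain ints     : {t1 - t0:.4f}s")
print(f"wrapped objects: {t2 - t1:.4f}s")
```

The per-iteration object construction shows up directly in the timings; in a compiled binding layer that also has to marshal every temporary across the FFI boundary, the overhead is far worse.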
What is blowing me away is how fast Python is relative to C. It’s within a factor of 10. I expected Julia to be easily within that, and am quite surprised that it is slower than Python: a JIT losing to a purely interpreted engine.
Between this and the other bits I’ve run into (playing games with a Mandelbrot set generator in a few languages), where Perl is maddeningly slow or, even worse, wrong in its answers (and the code is too bloody simple to be wrong; I’ve stepped through it by hand, and I still can’t catch the bug) … I think I need to take a long hard look at where I should be going with my next set of languages.
I think Python is back in (/sigh, it’s Fortran all over again with the damn indentation), along with Julia, q, and a few others. This was a wake up call for me. Perl does great for what it does. It’s just not great at hooking into other code, and its optimizer sucks rocks.

### 6 thoughts on “A wake up call about language choices for certain use cases”

1. The poor performance of BigInts in Julia is due to the high turnover of temporary objects and the subsequent garbage collection. It’s a known problem and is actively being worked on: https://github.com/JuliaLang/julia/issues/4176

• @Matt cool. Julia is one of my favorite languages to play with. I noticed some GC issues when I put the code in a @time []; construct: about 10% of the time was spent on GC.
I am still hoping for a static compiler, as I’d love to be able to write/deliver applications built on Julia.