High Performance: Prepared for Tomorrow's Hardware
Once the memory problem is solved, we can start reaching for truly high performance. This is mainly a matter of utilizing all available computing resources efficiently. Traditionally, such computations target a single CPU. But today there is much more to deal with: a machine may have several CPUs, each with multiple cores, each core with its own SIMD vector extensions, and a vast number of computational tasks can potentially run in parallel on GPUs or other accelerator units...
Today, the ILNumerics Computing Engine already supports multiple cores, transparently to the user. This is one of the reasons for the higher speed in the previous example, as you can see from these performance indicators. The red line indicates the percentage of processor utilization (higher is better):
The green line shows a high number of soft page faults for the C version, which spends much less of its processing capacity on the computation itself. ILNumerics reuses the same memory over and over (leading to almost no page faults) and is busy computing the results instead, on two cores in parallel.
Managing all available resources optimally is still a topic of scientific research. However, the level of abstraction and the mature runtime environment give ILNumerics a clear advantage over established native languages. Our team of engineers is working hard to make more and more resources available. Your advantage: you will not be required to modify any existing code in order to profit from future speed-ups!