ILNumerics Accelerator Compiler (prerelease)
The ILNumerics Accelerator compiler has one goal: to speed up general array algorithms. It supports algorithms created with the ILNumerics Computing Engine.
The compiler introduces new optimizations on multiple levels of granularity, at compile time and at runtime. Compared to other approaches, ILNumerics uses much more information, and uses it at the moment it becomes available. It lets the computer decide how to use the hardware resources more efficiently and which parts of the algorithm are best executed in parallel: where, when and how.
Some enabling features:
- Deterministic disposal and pooling of objects and memory.
- Removal of unnecessary operations and intermediate results.
- Array pipelining for array ops:
- Automatic, robust detection of parallel potential.
- Automatic JIT building of efficient, specialized low-level kernels ...
- ... for the .NET CLR and for OpenCL devices.
- Latency hiding for kernel building and memory ops.
- Low overhead, asynchronous, in-order execution.
- High-level optimizations, transformation of array expressions.
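As a generic illustration of what "removal of intermediate results" can mean, here is a minimal Python sketch. It is not ILNumerics code and does not show how the Accelerator works internally; the function names `unfused` and `fused` are hypothetical. It only contrasts an element-wise expression `a * b + c` evaluated in two passes with a temporary array against the same expression evaluated in a single fused pass.

```python
def unfused(a, b, c):
    # Two passes: the product a * b is materialized as a temporary list first.
    tmp = [x * y for x, y in zip(a, b)]
    return [t + z for t, z in zip(tmp, c)]


def fused(a, b, c):
    # One pass: each output element is computed directly, so no
    # intermediate array is allocated for the product.
    return [x * y + z for x, y, z in zip(a, b, c)]
```

Both functions return the same values; the fused variant simply avoids allocating and traversing the temporary, which is the kind of saving that matters for large arrays.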
Many optimizations in ILNumerics are data-driven. The compiler tracks and uses the actual state of the data at runtime. This includes not only the size of arrays (important for workload and cost analysis), the shape and storage scheme (for cache awareness) and the element type (for SIMD support), but also the location (which device currently hosts the data), and more.
This way ILNumerics Accelerator gains great freedom to make important decisions autonomously and at runtime. It removes the need for static, global program analysis. It relieves the programmer of implementing low-level, device-specific optimizations. Programs automatically adapt to any hardware. Projects are finished faster, run faster and remain maintainable.
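The idea of a data-driven decision can be sketched in a few lines of Python. This is a conceptual illustration under stated assumptions, not the Accelerator's actual logic: the `ArrayInfo` descriptor, the `choose_strategy` function, the strategy names and the threshold value are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ArrayInfo:
    """Runtime facts about an array (hypothetical descriptor)."""
    num_elements: int
    dtype: str      # e.g. "float64" or "object"
    location: str   # "host" or "device"


# Assumed cut-off: below this size, parallel overhead is taken to dominate.
PARALLEL_THRESHOLD = 10_000

# Assumed set of element types for which vectorized kernels exist.
SIMD_DTYPES = {"float32", "float64", "int32", "int64"}


def choose_strategy(info: ArrayInfo) -> str:
    """Pick an execution strategy from the current state of the data."""
    if info.location == "device":
        # The data already lives on an accelerator: avoid copying it back.
        return "device-kernel"
    if info.num_elements < PARALLEL_THRESHOLD:
        # Small workload: scheduling threads would cost more than it saves.
        return "sequential"
    if info.dtype in SIMD_DTYPES:
        return "multithreaded-simd"
    return "multithreaded"
```

Because the decision is taken per call from the live descriptor, the same line of user code may run sequentially for a small array and in parallel for a large one, with no static analysis of the whole program.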
The ILNumerics Accelerator is currently available as prerelease version 7.0 on nuget.org.
The initial release of ILNumerics Accelerator focuses on fundamental unary, binary, reduce and generator array instructions, as well as more complex instructions on the CPU: FFT, linear algebra and interpolation. The ultimate goal is for all parts of an algorithm to consist of supported ILNumerics array instructions. Such regions will utilize all parallel compute resources with optimal efficiency.
The documentation for ILNumerics Accelerator is structured into the following sections:
Getting Started Guide II: Speeding-up the k-means algorithm
Getting Started Guide III: Array Pipelining for Faster Fast Fourier Transforms (FFFT)
Getting Started Guide IV: Array Pipelining vs. Multithreading
Table of supported Features & Roadmap
Don't miss our introductory blog article series: part 1, part 2.