Tag Archives: ILNumerics

ILNumerics, Numerical Algorithms, Scientific Computing

ILNumerics Accelerator – A better Approach to faster Array Codes, Part I

October 31, 2022 haymo Leave a comment

A Truth most Programmers won’t tell.

TLDR: It was twenty years ago when computer manufacturers, while trying to boost CPU performance, were confronted with the hard wall imposed by physical limitations. It took a long time for everyone involved to realize that individual processors could no longer become significantly faster in the future. The previous approach made it too simple to deal with the constantly growing demands caused by ever larger data and ever more complex programs. Haymo Kutschbach, founder and CEO of ILNumerics explains the challenges involved for automatic parallelization on today’s hardware and why traditional compilers will not satisfy the strong demand for a feasible solution. A better, working approach is presented.

Reading time: 10 minutes

For many decades, the rule applied: “Your program is too slow? Simply buy the latest hardware!” This was made possible by the availability of (general purpose) programming languages, as well as a stream of innovations in processor architectures and the associated compilers. They allowed the programs to be written largely independently of the execution hardware. The compiler adapted abstract codes to the low-level features of the processors. Everything fit together so conveniently!

And then nature threw a spanner in the works on this easy calculation. We’ve seen new computers continueing to obey Moore’s Law and become more powerful with each generation. But this new computing power is now distributed over many processors! A program can only use the new speed if it can use all processors at the same time. In addition, a significant part of the computing power of today’s computers is found in heterogeneous architectures such as GPUs and other accelerator hardware.

The consequences can be observed today in probably every development department: Programs are written abstractly. Efforts are made to create ‘clean code’ that can be tested and maintained. Unfortunately, it is often executed too slowly, though! So you start to examine the codes for optimization potential. You identify individual bottlenecks, decide manually on a parallelization strategy, implement, test and measure again. Many programmers don’t see this as a problem – their expert knowledge guarantees them a good income for many years to come! Since there are no alternatives, the enormous delays in time to market are accepted. Just like the fact that the next generation of devices will again require a complete rewrite for large parts of the software.

Hardware today is far more diverse and heterogeneous than it was 20 years ago. But why is that actually a problem? Why are we still not able to automatically adapt and run our programs on such hardware again?

If it’s too slow, blame the Compiler!

For one, it’s because the compiler market has traditionally been dominated by processor manufacturers. Their compilers served to make their own hardware more accessible. Even though we’ve seen a recent trend towards vendor-independent compilers, they still continue the traditional approach: compiling (“lowering”) code of a general purpose programming language (like C or Fortran) to hardware instructions that can be executed on a specific (and previously selected) processor architecture. What we actually need, however, is a compiler that translates an abstract program for a “computer” – with all its diverse computing resources.

This task, however, is much more demanding! Why? Here we come to the second reason that has so far prevented a working solution: granularity. In order to be able to use parallel resources, the software must contain additional instructions that take care of the distribution of the individual program parts. This additional overhead makes a certain minimum program size (workload) mandatory. Parallelization only makes sense if the gain in speed through parallel execution exceeds the additional management effort!

Here, the term “granularity” refers to the number of individual low -level instructions that a piece of code requires to calculate its result. Apparently, the granularity increases with the size or complexity of a program part, because more low-level instructions are executed at the end. So, in order to make efficient use of parallelism, it is not enough to consider individual instructions of a program. Rather, a compiler must merge larger program parts (‘grains’) and distribute them to the available computing resources . One of the most important challenges of a parallelizing compiler is to find the optimal size of these “chunks” – and thus the optimal granularity.

This brings us to the third reason: complexity. In general, programs can become arbitrarily complex. Compilers have the task of translating programs into another, useful form. But how does one translate information that can be arbitrarily complex? Traditional compilers achieve this goal by limiting their consideration to individual instructions in the program. They know the (manageable) set of possible translations for each of these instructions. The finished program is then little more than the chain of such individual translations. Nevertheless, traditional compilers are already considered to be extremely complex software!

Unfortunately, this “complexity reduction” trick is in direct contradiction to the necessary, minimal granularity described above. A parallelizing compiler must face the challenge of finding an optimal translation even for program chunks that consist of multiple or many instructions. As always, low hanging fruits exist (see e.g.: tensorflow). However, an optimal solution must not restrict itself to individual, specific instructions just because their translation is particularly simple and therefore easy to implement!

And then there is a fourth aspect that also plays into the area of complexity. In addition to the optimal chunk size (granularity), it is also important to identify those grains of a program that can be executed independently of one another. The program analysis required must also take into account the best possible granularity. A compiler must therefore be able to understand large program sections, form chunks of an optimal grain size and execute them efficiently in parallel on heterogeneous hardware.

Ultimately, it is the complexity of this task that has so far prevented such a compiler from being born. In general, development departments still follow the manual approach described earlier. It requires expert knowledge and enormous effort to manually perform the necessary steps (granularity and independence analysis, parallel implementation) for a special program . But despite the enormous amount of time, despite the enormous maintenance issues associated, this approach allows to scratch the surface only! It can neither address nor solve the real problem.

Are we lost?

Not yet! Let’s wrap it up:

Traditional compilers work at too low a level of granularity. They consider too few instructions to cover a workload, suitable for efficient parallelization.
Previous attempts to include large/all program parts in an analysis have only been successful in a few special cases. Global analysis for general programs is an unsolved problem and, in the opinion of the author, will remain so for a long time to come.

But now we come to the topic of our headline! For a not too small class of programs ILNumerics succeeded in developing a compiler that fulfills all optimality requirements. It is the class of numeric, array-based programs. They are written by scientists, mathematicians, and engineers to encode the innovations of our modern times. They form the core of big data, machine learning and artificial intelligence, and all codes, making use of them. Numerical algorithms realize innovations – moreover: they are these innovations! They allow to formulate great complexity without great effort. They handle huge data and, therefore, require greatest speed.

A development department, faced with the challenge of accelerating its numerical array codes can only select and use the best possible language and libraries there are. Technical requirements often rule out prototyping languages, like numpy and Matlab. Individual parts may be outsourced to GPUs. Other parts are parallelized using multithreading. All of this slows down the industrial development process significantly. It is like writing GUI programs in assembler…

A Better Approach to faster Array Codes

ILNumerics proposes a new, faster approach to the execution of array codes. ILNumerics Accelerator, automatically finds parallel potential in array-based algorithms (as: numpy, Matlab, ILNumerics) and efficiently distributes its workload to heterogeneous computing resources – dynamically and at runtime, when all important information is available. It again scales your algorithm speed to any heterogeneous hardware without recompilation.

The online documentation describes technical details of the ILNumerics Accelerator Compiler. Make sure to register for our newsletter here and don’t miss any news!

Where can I get it ?

ILNumerics Accelerator has entered the public beta phase. It will be released with ILNumerics Ultimate VS version 7. The pre-release is available on nuget. Start here ! General documentation on the new Accelerator is found here. It will subsequently be completed within the next weeks. Read the getting started guide(s), get your hands dirty and please, let us know what you think!

Patents

ILNumerics software is protected by international patents and patent applications.

DE102011119404.9, WO2018197695A1, US11144348B2, CN110383247A, JP2020518881A, EP3443458A1, EP22156804, US20230259338A1, JP7495028B2, CN116610436A, EP23173406.2, PCT/EP2024/062463

. ILNumerics is a registered trademark of ILNumerics GmbH, Berlin, Germany.

ILNumerics, Usage, Visualization

Release Notes ILNumerics 4.13 (detailed)

May 3, 2017 haymo Leave a comment

See this changelog for a quick list of all changes in version 4.13. Here we dive into some details about specific topics.

Visualization Engine Improvements

The main goal of the GDI renderer is still to provide a fully compatible alternative to the OpenGL renderer. It automatically replaces the OpenGL default renderer if a problem / incompatible hardware etc. was detected at runtime. The focus lays on feature completeness and precision of the rendering result. The quality of the GDI renderer has been further improved to these regards in 4.13. Changes affect the following low level rendering features.

Antialiasing

Lines now recognize the ILLines.Antialiasing property for thick lines (Width > 2). The antialised rendering works in all situations: transparent lines, lines with stipple patterns, inside /outside of plot cubes and for logartithmic / linear axis scales. The quality of the antialiasing implementation is now on par with common OpenGL implementations.

The new default value for ILLines.Antialiasing is now true. However, thin lines would not profit from antialiasing, hence thin lines will continue to ignore the value of the ILLines.Antialiasing property.

Note, that some OpenGL drivers actually refuse to render certain combinations of line properties. We experienced such behavior for stippled, thick line strips having antialiased rendering configured. In such situations you should make sure to have the most recent OpenGL / graphics card drivers installed. Alternatively you may chose either stippled pattern – / or antialiased rendering by explicitly configuring your lines for either one setting. Or use the GDI renderer instead.

Following is a screenshot of some lines examples. The left side shows the OpenGL renderer result. On the right side the GDI version is shown. Use your browser to show the full, unscaled image (right click on the image):

… and with dotted lines:

Default Color for ILLines & ILLineStrip

Basic, low level line shapes are now created with a default color: black. Higher level objects (as ILLinePlot, ILSplinePlot, Markers, ILSurface etc.) should always provide a color for low-level objects or use vertex based coloring explicitly. If you are assembling your scene with the low level line objects make sure to check that you have done this. This is not a breaking change.

Note that the setting for the shapes Color property overrides the vertex colors buffer (Colors property). In order to use vertex based coloring one must set the Color property to null, enabling the colors information from the Colors buffer.

Auto Color for Spline Plots

ILSplinePlot (derived from ILLinePlot) now adopts the auto-coloring feature from the ILLinePlot class. When no line color was given at the time the plot was created the color gets assigned which comes next in the ILLinePlot.NextColor color enumeration. Use the linecolor constructor argument of ILSplinePlot or the Line.Color property in order to control the color of the spline line explicitly.

Camera Default Depth: 100

In earlier versions the default ILCamera.ZFar value was 1000 which led to depth buffer precision issues in some situations. The new default value of 100 increases the depth buffer precision significantly. However, the new value (just like the old one) does not take into account the actual depth of your scene! If you encounter unwanted clipping in the far clippping area now, set the value for scene.Camera.ZFar explicitly. At best you know the depth of your scene and use this value. Or – more simple – use the old default of 1000:

 scene.Camera.ZFar = 1000;

Visual Studio 2017 Compatibility

With the new version ILNumerics installs into the great, fresh new Visual Studio 2017 (released March 2017). … well, at least ‘kind of’… What is true is that with 4.13 you can again use ILNumerics in all recent Visual Studio editions supporting extensions at all (which is all editions except Express Edition – I wonder if anyone is using this, still??), including Visual Studio 2017 Community Edition. This is great and all … but:

As you noticed there was a great deal of changes coming with VS2017. This includes the installer system which is – great attempt! – much more slim now by omitting unneeded stuff from the installation. But VS2017 also requires changes to the manifest files facilitating every Visual Studio Extensibility project. A new version has been introduced: version3. This is the first version which is not compatible with Visual Studio 2010 anymore. As a consequence, by supporting the new extension packaging system we would be required to drop support for Visual Studio 2010. This is not too dramatic since the 6 years VS2010 is out now feel kind of an eternity in our dev-world. However, we wanted to give the remaining VS2010 customers at least one version iteration of deprecating VS2010 in order to jump to a more recent version smoothly.

Additionally, the way the new VS installer works seems to reflect not the last word spoken on that topic (at least this is what we hope). The bottom line: we use an MSI installer, wrapped in an exe bootstrapper. The MSI installs all GAC binaries, registers the development dlls in Visual Studio, maintains the singleton installation directory and triggers the VSIX installer which installs the extension into all supported Visual Studio installations located on the system. This may sound complicated but worked quite reliably over the years.

Now, with the new Visual Studio package system things become odd. Out of subtle reasons, the MSI cannot (reliably) trigger the VSIX package automatically (and quietly). The reason becomes obvious when considering that the new extension potentially needs to trigger the installation of new components which it depends on. This install would not be possible while the hosting MSI is being installed itself still. So, as a consequence, currently, MSI installs of VSIX extensions into VS2017 are simply not working. When you start the ILNumerics installation from the delivered *.exe you will find the extension being installed into VS2010 and upwards – with the exception of Visual Studio 2017

Currently, some smart (WIX) people are attempting a solution to this. But we are not aware of a clean solution released already. Therefore, we will wait for it and/or eventually consider a new deployment scheme for our extensions.

However, luckily there is a simple workaround for now! After the ILNumerics installation was finished, you can easily install our extension into VS2017 manually. Just go to the installation folder (by default: C:\Program Files (x86)\ILNumerics\ILNumerics Ultimate VS\bin) and find the ‘ILNumerics.VSExtension.vsix’ file. Double click on it to start the installation into the remaining Visual Studio instances manually. This should work without problems. Be sure to accept the warning dialog during the install. It originates from the fact that our extension must still support older Visual Studio extension techniques.

Keep in mind that this is a workaround, though! Once installed the extension will work as expected. But some things will not work consistentyl. Deinstallation, for example must be done manually from the “Extensions and Tools” dialog within Visual Studio 2017:

A manually installed extension requires a manual uninstall.

Also, in difference to the machine wide installation done by the (administrative) MSI install the manual VSIX installation changes the local user account only. You may have to repeat the VSIX install for other user accounts. Besides these lowered installation experience we know of no other incompatibilities in Visual Studio 2017.

ILNumerics, Installation, Usage

Installing ILNumerics – Unexpected behavior

March 24, 2015 Balázs Leave a comment

At ILNumerics we get a lot of support requests every day. During the last couple of months some questions were related to installing ILNumerics. In some cases an unexpected error message appears. The easy solution is to manually uninstall our extension from all Visual Studio instances and then reinstall ILNumerics. This issue will be resolved once we release our new installer.

Continue reading Installing ILNumerics – Unexpected behavior →

.NET, C#, ILNumerics, Visualization

ILNumerics for Science – How did you do the Visualization?

February 2, 2015 Jonas Leave a comment

In the third part of our series we focus on the visualization of scientific data. You learn how to easily display your data with ILNumerics and the ILPanels.

Continue reading ILNumerics for Science – How did you do the Visualization? →

.NET, ILNumerics, Usage, Visualization

Plotting Fun with ILNumerics and IronPython

September 15, 2014 haymo 2 Comments

Since the early days of IronPython, I keep shifting one bullet point down on my ToDo list:

* Evaluate options to use ILNumerics from IronPython

Several years ago there has been some attempts from ILNumerics users who successfully utilized ILNumerics from within IronPython. But despite our fascination for these attempts, we were not able to catch up and deeply evaluate all options for joining both projects. Years went by and Microsoft has dropped support for IronPython in the meantime. Nevertheless, a considerably large community seems to be active on IronPython. Finally, today is the day I am going to give this a first quick shot.

Continue reading Plotting Fun with ILNumerics and IronPython →

C#, Usage

ILNumerics Language Features: Limitations for C#, Part II: Compound operators and ILArray

December 29, 2013 haymo 2 Comments

A while ago I blogged about why the CSharp var keyword cannot be used with local ILNumerics arrays (ILArray<T>, ILCell, ILLogical). This post is about the other one of the two main limitations on C# language features in ILNumerics: the use of compound operators in conjunction with ILArray<T>. In the online documentation we state the rule as follows:

The following features of the C# language are not compatible with the memory management of ILNumerics and its use is not supported:

The C# var keyword in conjunction with any ILNumerics array types, and

Any compound operator, like +=, -=, /=, *= a.s.o. Exactly spoken, these operators are not allowed in conjunction with the indexer on arrays. So A += 1; is allowed. A[0] += 1; is not!

Let’s take a closer look at the second rule. Most developers think of compound operators as being just syntactic sugar for some common expressions:

int i = 1;
i += 2;

… would simply expand to:

int i = 1;
i  = i + 2;

For such simple types like an integer variable the actual effect will be indistinguishable from that expectation. However, compound operators introduce a lot more than that. Back in his times at Microsoft, Eric Lippert blogged about those subtleties. The article is worth reading for a deep understanding of all side effects. In the following, we will focus on the single fact, which becomes important in conjunction with ILNumerics arrays: when used with a compound operator, i in the example above is only evaluated once! In difference to that, in i = i + 2, i is evaluated twice.

Evaluating an int does not cause any side effects. However, if used on more complex types, the evaluation may does cause side effects. An expression like the following:

ILArray<double> A = 1;
A += 2;

… evaluates to something similiar to this:

ILArray<double> A = 1;
A = (ILArray<double>)(A + 2);

There is nothing wrong with that! A += 2 will work as expected. Problems arise, if we include indexers on A:

ILArray<double> A = ILMath.rand(1,10);
A[0] += 2;
// this transforms to something similar to the following: 
var receiver = A; 
var index = (ILRetArray<double>)0;
receiver[index] = receiver[index] + 2;

In order to understand what exactly is going on here, we need to take a look at the definition of indexers on ILArray:

public ILRetArray<ElementType> this[params ILBaseArray[] range] { ...

The indexer expects a variable length array of ILBaseArray. This gives most flexibility for defining subarrays in ILNumerics. Indexers allow not only scalars of builtin system types as in our example, but arbitrary ILArray and string definitions. In the expression A[0], 0 is implicitly converted to a scalar ILNumerics array before the indexer is invoked. Thus, a temporary array is created as argument. Keep in mind, due to the memory management of ILNumerics, all such implicitly created temporary arrays are immediately disposed off after the first use.

Since both, the indexing expression 0 and the object where the indexer is defined for (i.e.: A) are evaluated only once, we run into a problem: index is needed twice. At first, it is used to acquire the subarray at receiver[index]. The indexer get { ...} function is used for that. Once it returns, all input arguments are disposed – an important foundation of ILNumerics memory efficency! Therefore, if we invoke the index setter function with the same index variable, it will find the array being disposed already – and throws an exception.

It would certainly be possible to circumvent that behavior by converting scalar system types to ILArray instead of ILRetArray:

ILArray A = ...;
A[(ILArray)0] += 2;

However, the much less expressive syntax aside, this would not solve our problem in general either. The reason lies in the flexibility required for the indexer arguments. The user must manually ensure, all arguments in the indexer argument list are of some non-volatile array type. Casting to ILArray<T> might be an option in some situations. However, in general, compound operators require much more attention due to the efficient memory management in ILNumerics. We considered the risk of failing to provide only non-volatile arguments too high. So we decided not to support compound operators at all.

See: General Rules for ILNumerics, Function Rules, Subarrays

.NET, C#, ILNumerics

Troubleshooting: Adding ILNumerics 3D Controls to the VS Toolbox

December 5, 2013 Jonas Leave a comment

Adding ILNumerics visualizations to Visual Studio based projects has become a quite convenient task: It’s easy to use the ILNumerics math library for own projects in .NET. However, from time to time users have problems adding the ILNumerics controls to their Visual Studio Toolbox window.

Update: Since ILNumerics Ultimate VS version 4 this issue has been solved once for all. Simply install the MSI installer and find the ILNumerics ILPanel in the toolbox for all applicable situations.

That’s what a post on Stack Overflow from earlier this year was about: A developer who wanted to use our C# math library for 3d visualizations and simulations wasn’t able to access the ILNumerics controls. “How can I locate it?”, he was wondering. “Do I have to make some changes to my VS?”

Adding ILNumerics Controls to the Visual Studio Toolbox manually

If the ILNumerics Ultimate VS math library is installed on a system, normally the ILNumerics controls are automatically listed in the Visual Studio toolbox on all supported versions of Visual Studio. However, if that’s not the case there’s a way to a add them manually: After clicking right onto the toolbox, you can select “Choose Item”. The dialog allows you to select the assambly to load the controls from – that’s it! You will find the ILNumerics.dll in the installation folder on your system. By default this directory is located at: “C:\Program Files (x86)\ILNumerics\ILNumerics Ultimate VS\bin\ILNumerics.dll”.

However, if that doesn’t work straightaway, it often helps to clear the toolbox from any copies of custom controls before – simply right-click it and choose “Reset Toolbox”.

Need help? ILNumerics Documentation and Support

You want to know more about our math library and its installation? Check out our documentation and the Quick Start Guide! If you have any technical questions, have a look at our Support Section.

Interesting/ useless

Scientific Computing Online: IPython Notebook, Shiny (R) and ILNumerics

September 5, 2013 Jonas Leave a comment

It seems that we’re facing a trend at the moment: scientific computing, math and visualization software for web browsers. With our interactive web examples we have taken a step into that direction, too: Visitors of our website can change the C# code of our plotting and visualization demos in order to create a new SVG, PNG, JPG or EXE output. This allows people to easily try out the ILNumerics syntax and our powerful 2d and 3d visualization features for .NET. In addition to that, ILView allows a convenient way to interactively explore scenes that are created with ILNumerics.

There are two other web applications that cause a lot of excitement in the scientific community at the moment: The IPython Notebook and Shiny, a tool for creating web applications in R. Let’s have a closer look…

IPython Notebook: “Interactive Computational Environment”

The IPython Notebook adresses the huge amount of Python users in the scientific community. It basically offers a new way for writing papers: It’s a web based editor for code execution, math, text and visualization. Because the IPython Notebook combines all parts you normally need to write a scientific paper, you won’t have to import / export different elements from several domain specific software applications: “Everything related to my analysis is located in one unified place”, explains Philip J. Guo in his blog (http://www.pgbovine.net/ipython-notebook-first-impressions.htm). Once you have finished your paper, you can share your IPython Notebook as HTML and PDF with your colleagues, your professor etc.

Shiny: “Easy web applications in R”

Shiny stands for a different approach: It allows you to implement own analysis into web applications. While IPython obviously adresses Python users, Shiny is based on R, a still very popular programming language among statisticians. What makes Shiny interesting are its interactivity features: Most demos on the Shiny website offer the opportunity to choose input parameters from text fields or drop-downs to dynamically change the output visualization. The code seems to be quite similar to R, so users who are familiar with that language will easily be able to create interactive data visualization applications for their websites using Shiny.

Disadvantages: Performance does matter

Both approaches make web browsers accesable for specific needs of scientific visualization: The IPython Notebook offers a convenient tool to share the results of analytics related research; Shiny allows R developers to publish particular interactive plots on the web.

However, both projects are limited – namely because of technological issues. The level of performance that can be realized with both platforms is restricted: You’ll face that at the latest when you start creating complex 3d scenes with either Python or R. This holds true for the platforms’ web applications, too…

Outlook: Scientific Computing online

For certain purposes web based scientific computing software offers new convenient solutions. But if you want to realize complex interactive 3d visualizations, you still won’t use any of them but an application on your local machine instead.

Our interactive web examples point the direction we want to go. In order to make scientific computing more powerful, we’re working on the next step of our approach: a full WebGL support for ILNumerics. Stay tuned…

.NET, Features, ILNumerics, Numerical Algorithms

High Performance Fast Fourier Transformation in .NET

August 24, 2013 Jonas Leave a comment

„I started using ILNumerics for the FFT routines. The quality and speed are excellent in a .NET environment.“

The Fourier Transform (named after French mathematician and physicist Joseph Fourier) allows scientists to transform signals between time domain and frequency domain. This way, an arbitrary periodic function can be expressed as a sum of cosine terms. Think of the equalizer of your mp3-player: It expresses your music’s signal in terms of the frequencies it is composed of.

The Fast Fourier Transform (FFT) is an algorithm for the rapid computation of discrete Fourier Transforms’ values. Being one of the most popular numerical algorithms, it is used in physics, engineering, math and many other domains.

In terms of software engineering, the Fast Fourier Transform is a very demanding algorithm: In the .NET-framework, a naive approach would cause very low execution speeds. That’s the reason why many .NET-developers have to implement native C-libraries when it comes to FFTs.

ILNumerics uses Intel’s® MKL for Fast Fourier Transforms: That’s why our users don’t have to implement native library’s themselves for high performance FFTs. No matter if they have a scientific or an industrial background, many developers rely on ILNumerics because of its implementation of the Fast Fourier Transform. It’s the fastest you can get today – even for big amounts of data.

ILNumerics provides interfaces to forward and backward Fourier Transformations, for real and complex floating point data, in single and double precision, in one, two or n dimensions. In addition to the MKL’s FFTs, prepared interfaces for FFTW and for AMDs ACML exist.

Learn more about the ILNumerics library and its implementation of Fast Fourier Transformation in C#/.NET in the online documentation!

Usage

Putting on a Good Show with HDF5, ILNumerics, and PowerShell

September 14, 2012 haymo Leave a comment

It is certainly nice to have the option to do all kinds of numeric stuff right in your .NET application layer – without the need for interfacing any unmanaged module. But for some tasks, this still seems overkill.

Lets say, you went to that conference and want to give your new friends some insight into your brand new simulation results. The PC in the internet cafe enables you to fetch the data from your NAT storage at home. But will you be able to do anything with it on that plain Windows PC?

Or you want to localize a certain test data set but cannot remember its rather cryptic name. Or you might want to manage the latest measurement results from todays atmospheric observation satellite scans. The data are huge but often require some sort of preprocessing. There should be some easy way to filter them by the meta data within the files, right?

Other than getting the data from some application layer, we now want to interface plain old file objects. Of course, you store your data in HDF5 format, right? You do so, because HDF5 is portable, very efficient, flexible and you are in good company.

Let’s see. We have a fresh Windows PC and we know every Windows installation nowadays comes with Powershell. Powershell itself is based on the .NET framework and hence efficiently handles any .NET assembly. It should be easy to use ILNumerics with Powershell! All we still need is some way to access the HDF5 files. ILNumerics, natively is able to read and write Matlab mat files up to version 6. It currently lags on native HDF5 support.

Luckily, the HDF Group provides a large collection of high quality tools for HDF support. Among them you’ll find a .NET wrapper and … a brand new Powershell module: PSH5X! Together with Gerd Heber, the leading inventor of PSH5X, we did a feasibility study with the goal to investigate the options of utilizing HDF5 and ILNumerics together in Powershell. It can be downloaded here. We were quite impressed by the options this brings.

This blog post will describe the necessary steps to setup Powershell for ILNumerics and HDF5.

Getting Started

Basically, the installation process for any Powershell module consists of

Getting the module files and its dependencies from somewhere,
Deploying the module files into a special folder on your machine, and
Importing the module in your session.

The PSH5X homepage gives all information on how to get ready using the HDF5 Powershell module. Just download the package and follow the three steps on the page. At the end, HDF5 signals you a successful installation by displaying its version numbers.

Since ILNumerics depends on several other modules, we provide a small bootstrapper script. Just open up your favorite Powershell IDE (PowerShell_ISE.exe comes with any recent Windows) and copy/paste the following line:

(new-object Net.WebClient).DownloadString('http://ilnumerics.net/media/InstallILNumericsPSM.ps1') | iex

If you are curious, what this does – just ommit the trailing | iex and the script is not executed but displayed for your inspection.

The installer will ask for the installation folder (global under System32/ or local in your user profile), fetches the latest ILNumerics package for the current platform from the official nuget repository and install it into the selected module folder. In addition it loads the TypeAccelerator Powershell module and installs it into the same module directory. Note, the accelerators have been slightly modified in order to make them work with Powershell 3 and hence are fetched from our ILNumerics server. However, credits fully belong to poshoholic for his great work.

Note, the installation has to be done only once. Afterwards, on the next Powershell session, simply re-import needed modules by typing – lets say:

PS> Import-Module ILNumerics

Go!

If everything was setup correctly, we can now use the full spectrum of the involved modules:

PS> [ilmath]::rand(4,5).ToString()
<Double> [5,4]
   0,72918    0,87547    0,43167    0,94942
   0,58024    0,75562    0,96125    0,83148
   0,22454    0,20583    0,82285    0,83144
   0,13300    0,40047    0,58829    0,87012
   0,50751    0,05496    0,02814    0,48764

Nice. But what about the MKL? Are the correct binaries really installed as well?

PS> [ilf64] $A = [ilmath]::rand(1000,1000)
PS> Measure-Command { [ilf64]$C = [ilmath]::rank($A) }
Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 0
Milliseconds      : 920
Ticks             : 9202311
TotalDays         : 1,06508229166667E-05
TotalHours        : 0,00025561975
TotalMinutes      : 0,015337185
TotalSeconds      : 0,9202311
TotalMilliseconds : 920,2311

PS> $C.ToString()
1000

We have almost all options from C#:

PS> [ilf64] $part = $A['10:15;993:end']
PS> $part.ToString()
<Double> [11,7]
   0,08522    0,87217    0,59997    0,57363    0,22956    0,02006    0,02359
   0,33479    0,49003    0,65269    0,97772    0,28322    0,69505    0,70372
   0,30072    0,68705    0,47112    0,68627    0,65030    0,40454    0,63026
   0,15639    0,30391    0,22992    0,69310    0,65716    0,51797    0,68110
   0,72854    0,60188    0,50740    0,74499    0,13459    0,88481    0,12445
   0,80525    0,60180    0,69256    0,74825    0,64388    0,16792    0,45266

Lets sort the first row of $part, keeping track of original positions:

PS> [ilf64] $indices = 0.0
PS> [ilf64] $sorted = [ilmath]::sort($part['0,1;:'],$indices,0,$false)
PS> $sorted.ToString()
<Double> [2,7]
   0,02006    0,02359    0,08522    0,22956    0,57363    0,59997    0,87217
   0,28322    0,33479    0,49003    0,65269    0,69505    0,70372    0,97772
PS> $indices.ToString()
<Double> [2,7]
         5          6          0          4          3          2          1
         4          0          1          2          5          6          3

This is all interactive. Of course, we can write complete functions and even complex algorithms that way.
One of the best things: Even in Powershell ILNumerics saves your memory and meets all expectations regarding execution speed. Powershell allows you to consequently use ILNumerics’ typing and scoping rules.

In our feasibility study with Gerd Heber, we show how easy it gets to access an HDF5 file, to convert its data to ILNumerics arrays (implicitly), filter and manipulate a little and even create a full interactive 3D surface graph from it. We demonstrate how to use the type accelerators and to mimic the using statement for artificial scoping. Take a look and let us know, what you think!

The ILNumerics Blog