It is certainly nice to have the option to do all kinds of numeric stuff right in your .NET application layer, without the need to interface with any unmanaged module. But for some tasks, this still seems like overkill.
Let's say you went to that conference and want to give your new friends some insight into your brand new simulation results. The PC in the internet cafe enables you to fetch the data from your NAS storage at home. But will you be able to do anything with it on that plain Windows PC?
Or you want to locate a certain test data set but cannot remember its rather cryptic name. Or you might want to manage the latest measurement results from today's atmospheric observation satellite scans. The data are huge and often require some sort of preprocessing. There should be some easy way to filter them by the metadata within the files, right?
Rather than getting the data from some application layer, we now want to work with plain old files. Of course, you store your data in HDF5 format, right? You do so because HDF5 is portable, very efficient, and flexible, and because you are in good company.
Let’s see. We have a fresh Windows PC, and we know that every Windows installation nowadays comes with Powershell. Powershell itself is based on the .NET framework and hence efficiently handles any .NET assembly. It should be easy to use ILNumerics with Powershell! All we still need is some way to access the HDF5 files. ILNumerics can natively read and write Matlab mat files up to version 6, but it currently lacks native HDF5 support.
Luckily, the HDF Group provides a large collection of high-quality tools for HDF support. Among them you’ll find a .NET wrapper and … a brand new Powershell module: PSH5X! Together with Gerd Heber, the leading inventor of PSH5X, we did a feasibility study to investigate the options for using HDF5 and ILNumerics together in Powershell. The study can be downloaded here. We were quite impressed by the options this brings.
This blog post describes the steps necessary to set up Powershell for ILNumerics and HDF5.
Getting Started
Basically, the installation process for any Powershell module consists of
- Getting the module files and its dependencies from somewhere,
- Deploying the module files into a special folder on your machine, and
- Importing the module in your session.
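The three steps above can be sketched with a throwaway module. Everything here is an assumption made for illustration: the module name HelloDemo, its single function, and the temporary folders stand in for a real download and for the usual Modules folder in your user profile.

```powershell
# Hypothetical module name, used only for this illustration.
$moduleName = 'HelloDemo'

# Step 1: "get" the module files. Instead of downloading anything, we
# generate a one-function script module in a temporary staging folder.
$staging = Join-Path ([IO.Path]::GetTempPath()) ('staging_' + [Guid]::NewGuid().ToString('N'))
New-Item -ItemType Directory -Path $staging | Out-Null
$source = Join-Path $staging "$moduleName.psm1"
Set-Content -Path $source -Value 'function Get-Hello { "Hello from HelloDemo" }'

# Step 2: deploy the files into a folder named after the module, located
# below a directory that is listed in $env:PSModulePath.
$moduleRoot = Join-Path ([IO.Path]::GetTempPath()) ('modules_' + [Guid]::NewGuid().ToString('N'))
$target = Join-Path $moduleRoot $moduleName
New-Item -ItemType Directory -Path $target -Force | Out-Null
Copy-Item -Path $source -Destination $target
$env:PSModulePath = $moduleRoot + [IO.Path]::PathSeparator + $env:PSModulePath

# Step 3: import the module by name and call its function.
Import-Module $moduleName
$greeting = Get-Hello
$greeting
```

A real installation would copy the files to one of the folders already listed in $env:PSModulePath, so that the module is still found in later sessions.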
The PSH5X homepage gives all the information needed to get the HDF5 Powershell module ready. Just download the package and follow the three steps on the page. At the end, the module signals a successful installation by displaying its version numbers.
Since ILNumerics depends on several other modules, we provide a small bootstrapper script. Just open up your favorite Powershell IDE (PowerShell_ISE.exe comes with any recent Windows) and copy/paste the following line:
(new-object Net.WebClient).DownloadString('http://ilnumerics.net/media/InstallILNumericsPSM.ps1') | iex
If you are curious what this does, just omit the trailing | iex and the script is not executed but displayed for your inspection.
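If you prefer to keep inspection and execution as two separate steps, something like the following sketch may help. The helper name Get-ScriptText is made up for this example; it only wraps the same Net.WebClient call used in the one-liner.

```powershell
# Hypothetical helper: downloads a script as plain text without running it.
function Get-ScriptText {
    param([Parameter(Mandatory = $true)] [string] $Url)
    $client = New-Object Net.WebClient
    try {
        $client.DownloadString($Url)
    }
    finally {
        $client.Dispose()
    }
}

# Usage (needs network access; the URL is the one from the one-liner):
#   $text = Get-ScriptText 'http://ilnumerics.net/media/InstallILNumericsPSM.ps1'
#   $text                      # read it first
#   Invoke-Expression $text    # run it once you are satisfied
```

Invoke-Expression is the full name behind the iex alias, so the last line does exactly what the piped version does.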
The installer asks for the installation folder (global under System32/ or local in your user profile), fetches the latest ILNumerics package for the current platform from the official NuGet repository, and installs it into the selected module folder. In addition, it downloads the TypeAccelerator Powershell module and installs it into the same module directory. Note that the accelerators have been slightly modified to make them work with Powershell 3 and are therefore fetched from our ILNumerics server. Full credit, however, goes to poshoholic for his great work.
Note that the installation has to be done only once. In any later Powershell session, simply re-import the modules you need, for example:
PS> Import-Module ILNumerics
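In a script, the import can be guarded so that a missing module produces a readable warning instead of an error. This is only a sketch; the function name Import-IfAvailable is an assumption, while Get-Module -ListAvailable and Import-Module are standard cmdlets.

```powershell
# Hypothetical helper: imports a module only if it is installed.
function Import-IfAvailable {
    param([Parameter(Mandatory = $true)] [string] $Name)
    # -ListAvailable searches the folders in $env:PSModulePath.
    if (Get-Module -ListAvailable -Name $Name) {
        Import-Module -Name $Name
        return $true
    }
    Write-Warning ("Module '{0}' was not found in any PSModulePath folder." -f $Name)
    return $false
}

# $true if ILNumerics is installed and was imported, $false otherwise.
$loaded = Import-IfAvailable 'ILNumerics'
```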
Go!
If everything was set up correctly, we can now use the full spectrum of the involved modules:
PS> [ilmath]::rand(5,4).ToString()
<Double> [5,4]
0,72918 0,87547 0,43167 0,94942
0,58024 0,75562 0,96125 0,83148
0,22454 0,20583 0,82285 0,83144
0,13300 0,40047 0,58829 0,87012
0,50751 0,05496 0,02814 0,48764
Nice. But what about the MKL? Are the correct binaries really installed as well?
PS> [ilf64] $A = [ilmath]::rand(1000,1000)
PS> Measure-Command { [ilf64]$C = [ilmath]::rank($A) }
Days : 0
Hours : 0
Minutes : 0
Seconds : 0
Milliseconds : 920
Ticks : 9202311
TotalDays : 1,06508229166667E-05
TotalHours : 0,00025561975
TotalMinutes : 0,015337185
TotalSeconds : 0,9202311
TotalMilliseconds : 920,2311
PS> $C.ToString()
1000
We have almost all options from C#:
PS> [ilf64] $part = $A['10:15;993:end']
PS> $part.ToString()
<Double> [6,7]
0,08522 0,87217 0,59997 0,57363 0,22956 0,02006 0,02359
0,33479 0,49003 0,65269 0,97772 0,28322 0,69505 0,70372
0,30072 0,68705 0,47112 0,68627 0,65030 0,40454 0,63026
0,15639 0,30391 0,22992 0,69310 0,65716 0,51797 0,68110
0,72854 0,60188 0,50740 0,74499 0,13459 0,88481 0,12445
0,80525 0,60180 0,69256 0,74825 0,64388 0,16792 0,45266
Let's sort the first two rows of $part, keeping track of the original positions:
PS> [ilf64] $indices = 0.0
PS> [ilf64] $sorted = [ilmath]::sort($part['0,1;:'],$indices,1,$false)
PS> $sorted.ToString()
<Double> [2,7]
0,02006 0,02359 0,08522 0,22956 0,57363 0,59997 0,87217
0,28322 0,33479 0,49003 0,65269 0,69505 0,70372 0,97772
PS> $indices.ToString()
<Double> [2,7]
5 6 0 4 3 2 1
4 0 1 2 5 6 3
This is all interactive. Of course, we can write complete functions and even complex algorithms that way.
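As a small sketch of what such a function could look like, the following wraps the rank experiment from above. The function name Get-RandomMatrixRank and its parameter are made up for this example; it assumes the ILNumerics module and the type accelerators have already been imported, and it only uses calls that appear earlier in this post.

```powershell
# Hypothetical function; requires an imported ILNumerics module and the
# [ilmath] / [ilf64] type accelerators.
function Get-RandomMatrixRank {
    param([int] $Size = 100)
    # Create a square random matrix, as in the interactive example.
    [ilf64] $A = [ilmath]::rand($Size, $Size)
    # Compute its rank and return it as text.
    [ilf64] $r = [ilmath]::rank($A)
    $r.ToString()
}

# Usage, once ILNumerics is loaded:
#   Get-RandomMatrixRank -Size 200
```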
One of the best things: even in Powershell, ILNumerics conserves memory and meets all expectations regarding execution speed. Powershell allows you to apply ILNumerics’ typing and scoping rules consistently.
In our feasibility study with Gerd Heber, we show how easy it is to access an HDF5 file, to convert its data to ILNumerics arrays (implicitly), to filter and manipulate them a little, and even to create a fully interactive 3D surface graph from them. We demonstrate how to use the type accelerators and how to mimic the using statement for artificial scoping. Take a look and let us know what you think!
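Powershell has no using statement for disposable objects, but a common way to imitate one is a try/finally wrapper. The sketch below is a generic pattern and not necessarily the approach taken in the study; the function name Use-Object is an assumption, and the demo uses a plain .NET MemoryStream so that it runs without ILNumerics.

```powershell
# Hypothetical helper that imitates C#'s "using (obj) { ... }".
function Use-Object {
    param(
        [Parameter(Mandatory = $true)] [System.IDisposable] $InputObject,
        [Parameter(Mandatory = $true)] [scriptblock] $ScriptBlock
    )
    try {
        # Run the body; the disposable object is passed as its first argument.
        & $ScriptBlock $InputObject
    }
    finally {
        # Always release the object, even if the body throws.
        $InputObject.Dispose()
    }
}

# Demo with a plain .NET disposable.
$stream = New-Object IO.MemoryStream
$length = Use-Object -InputObject $stream -ScriptBlock {
    param($s)
    $s.WriteByte(42)   # write a single byte
    $s.Length          # the stream length is the block's result
}
$canWriteAfter = $stream.CanWrite   # a disposed MemoryStream is no longer writable
```

With ILNumerics, the disposable object would presumably be a scope object such as the one returned by ILScope.Enter in the C# API; that mapping is an assumption here.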