Fun with HDF5, ILNumerics and Excel

It is amazing how many complex business processes in major industries today are supported by a tool that shines through its simplicity: Microsoft Excel. ‘Recently’ (with Visual Studio 2010) Microsoft managed to polish the development tools for all Office applications significantly. The whole Office product line is now ready to serve as a convenient, flexible base framework for stunning custom business logic, custom computations and visualizations – with just a little help from tools like ILNumerics.

In this blog post I am going to show how easy it is to extend the common functionality of Excel. We will enable an Excel Workbook to load arbitrary HDF5 data files, inspect the content of such files and show the data as interactive 2D or 3D plots. HDF5 is an industry standard for the structured storage of technical data and is maintained by the HDF Group. Since version 4.0, ILNumerics supports the HDF5 file format with a very convenient, object-oriented API. Visualizations have been a popular feature since the early days of ILNumerics. And Excel can be another convenient GUI tool to marry both.

Prerequisites

Customizations like the one we are going to show are done in Visual Studio and utilize Visual Studio Tools for Office (VSTO). If you don’t own a copy of Visual Studio, you may find it useful that Microsoft gives away their flagship for free under certain conditions. Another prerequisite we will need: Office 2010 or 2013. I use Office 2010 on my computer. For some reason, Visual Studio 2013 did not allow me to create a new Workbook project for the 2010 version of Office and rather requested Office 2013. Therefore, I used Visual Studio 2012 instead. Just make sure to use the version of Office which is supported by your Visual Studio installation.

The last ingredient needed is ILNumerics. Download the free trial – it contains all features and tools you may possibly want and makes them available systemwide.

Setup

Let’s start with a fresh new Workbook project in Visual Studio:

We take all defaults here. This will create a new Visual Studio project with three worksheets, ready for your customizations. Double click on Sheet1.cs to open the designer view. In the Visual Studio Toolbox find the ILNumerics section and drag a new instance of ILPanel onto your sheet:

This will add the reference to ILNumerics to your project and place the ILPanel control on your sheet. You may want to resize and reposition the control. It will be available from now on as ‘iLNumerics_Drawing_ILPanel1’ in your code-behind classes. Feel free to adjust the name – for this demonstration we will leave it as it is.

Loading HDF5 Files in Excel

Excel does not support HDF5 file imports directly (yet?). Luckily, ILNumerics bridges the gap very efficiently and conveniently. First, we add a reference to the HDF5 assembly which comes with ILNumerics. In the last step the reference to ILNumerics was added automagically. For HDF5 we have to do this manually. Right click on the References node in the Solution Explorer and choose: Add Reference. Now search for “HDF5” on your system, select the item found and hit “OK”:

If no item was found, make sure you have installed the latest ILNumerics package using our installer and that you selected the Assemblies tab in the Reference Manager window.

Once we have the HDF5 assembly available, we can start coding. The idea is that the user should be able to load an HDF5 file from disk, inspect the datasets it contains and get the option to load and/or visualize their data. So, let’s add some more controls from the toolbox to the workbook sheet: an OpenFileDialog and a button to trigger the opening. Drag the button (Common Controls in the Toolbox) and the OpenFileDialog (Dialogs tab in the Toolbox) to the designer surface:

Now, rename the new button and double click it to open the auto-generated event handler method. This is where we are going to implement the code to open and inspect the HDF5 file:

private void button1_Click(object sender, EventArgs e) {
    var fod = new OpenFileDialog();
    if (fod.ShowDialog() == DialogResult.OK) {

        var filename = fod.FileName;
        // access the HDF5 file for reading
        using (var file = new H5File(filename, accessMode: FileAccess.Read)) {
            int row = 4; // start listing at row 4
            // list all datasets in the file
            foreach (var ds in file.Find<H5Dataset>()) {
                Cells[row,   3].Value = ds.Path;
                Cells[row++, 4].Value = ds.Size.ToString();
            }
            while (row < 100) {
                Cells[row, 3].Value = "";
                Cells[row++, 4].Value = "";
            }
            // display filename
            Cells[2, 4].Value = filename;
        }
    }
}

First, we ask for the filename to open. If the user provided a valid filename we open the file for reading. HDF5 files in ILNumerics are used in ‘using’ blocks. No matter how you leave the block, ILNumerics ensures that the file is not left open. Read here for more details.
Once we have the HDF5 file we start iterating over its datasets. C# foreach constructs make that really easy. Other languages have similar constructs. Inside the foreach loop we simply write the path of the current dataset and its size to columns 3 and 4 of the worksheet.
The while loop afterwards is only needed to clear entries potentially left over from earlier loads. It assumes that a file holds fewer than 100 datasets, since it only clears up to row 99. In production code you will want something more robust…
Finally, the name of the HDF5 file is written into Cells[2, 4] for user information. The double click handler shown later reads the filename back from this cell.

If we now run the workbook (hit F5) and click on our “Load HDF5 Dataset” button, a file dialog opens up. Once we select an existing HDF5 file from our disk, the file’s datasets are listed on the worksheet:

Loading HDF5 Dataset Content

Next, register a new event handler for the double click event in the worksheet. The worksheet template offers a good place to do so: the Sheet1_Startup event handler is auto-generated by Visual Studio. Add the following line to it in order to react to double click events on the sheet:

this.BeforeDoubleClick += Sheet1_BeforeDoubleClick;

The implementation of the Sheet1_BeforeDoubleClick method does all the work:

void Sheet1_BeforeDoubleClick(Excel.Range Target, ref bool Cancel) {
    // only take cells we are interested in
    if (Target.Value == null || Cells[2, 4].Value == null) return;
    // grab the hdf5 filename from the cell
    var curFilename = Cells[2, 4].Value.ToString();
    // check if this points to an existing file
    if (File.Exists(curFilename)) {
        // grab the dataset name (if the user clicked on it)
        var dsName = ((object)Target.Value).ToString();
        // reasonable?
        if (Target.Count == 1 && !String.IsNullOrEmpty(dsName)) {
            // open the file
            using (var file = new H5File(curFilename, accessMode: FileAccess.Read)) {
                // find the dataset in the file, we provide the full abs. path so we
                // are sure that there is only one such dataset
                var ds = file.First<H5Dataset>(dsName);
                if (ds != null) {
                    // add a new sheet with the name of the dataset
                    var sheet = (Excel.Worksheet)Globals.ThisWorkbook.Sheets.Add();
                    sheet.Name = checkName(dsName);
                    // ... and make it active
                    Globals.ThisWorkbook.Sheets[sheet.Name].Activate();
                    // load data using our extension method (see text)
                    sheet.Set(ds.Get<double>());
                } else {
                    // user has clicked on the size column -> plot the data
                    var size = ParseSize(dsName);
                    if (size != null && Target.Previous.Value != null) {
                        dsName = ((object)Target.Previous.Value).ToString();
                        // read data and render into panel
                        renderDS(file, dsName);
                    }
                }
            }
        }
    }
}

This is all straightforward: we do some very simple error checking here, just to make sure we only react to clicks on interesting columns. In your production code you will do much better error checking! Here, we first decide whether the user has clicked on a cell with a valid dataset name. If so, the file is opened (fetching the name from the HDF5 filename cell written to earlier), the dataset is located and its content loaded.
The content is written to a new workbook sheet. Attention must be paid to the naming of the sheet. If a sheet with the same name exists already, Excel will throw an exception. One easy solution is to add a timestamp to the name, which is left as an exercise. Here, we only do very simple name checking to make sure the sheet name contains no ‘/’ characters and is not longer than the 31 characters Excel allows:

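string checkName(string name) {
    // '/' separates HDF5 groups but is not allowed in Excel sheet names
    var ret = name.Replace('/','_');
    // Excel limits sheet names to 31 characters: keep the last 31
    if (ret.Length > 31) {
        ret = ret.Substring(ret.Length - 31);
    }
    return ret;
}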
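
The name clash mentioned above can also be handled without a timestamp. The following is only a sketch of one possible approach, not part of the original example: the class SheetNames and the method MakeUnique are hypothetical names, and a counter suffix is used instead of the timestamp suggested in the text. It uses plain .NET types so it can be tried outside of Excel.

```csharp
using System;
using System.Collections.Generic;

public static class SheetNames {
    // Excel limits sheet names to 31 characters and rejects \ / ? * [ ] :
    private const int MaxLength = 31;
    private static readonly char[] Invalid = { '\\', '/', '?', '*', '[', ']', ':' };

    // Returns a cleaned-up sheet name that does not occur in 'existing'.
    public static string MakeUnique(string name, ICollection<string> existing) {
        string clean = name;
        foreach (char c in Invalid) clean = clean.Replace(c, '_');
        if (clean.Length > MaxLength) clean = clean.Substring(clean.Length - MaxLength);

        string candidate = clean;
        int n = 1;
        while (Contains(existing, candidate)) {
            // append _1, _2, ... and shorten the base name so the limit still holds
            string suffix = "_" + (n++);
            int keep = Math.Min(clean.Length, MaxLength - suffix.Length);
            candidate = clean.Substring(clean.Length - keep) + suffix;
        }
        return candidate;
    }

    // Excel compares sheet names case-insensitively.
    private static bool Contains(ICollection<string> existing, string name) {
        foreach (string e in existing)
            if (string.Equals(e, name, StringComparison.OrdinalIgnoreCase)) return true;
        return false;
    }
}
```

In the workbook, the collection of existing names would be filled from the names of the sheets in Globals.ThisWorkbook.Sheets before calling the helper.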

The method to actually load the data from an ILArray to the new sheet is found in the following extension method. This is one rather efficient attempt to load large data:

public static void Set(this Worksheet worksheet, ILInArray<double> A, int fromColumn = 1, int fromRow = 1) {
    using (ILScope.Enter(A)) {

        var luCell = worksheet.Cells[fromRow, fromColumn];
        var rbCell = worksheet.Cells[fromRow + A.S[0] - 1, fromColumn + A.S[1] - 1];
        Range range = worksheet.Range[luCell, rbCell];
        range.Value = A.T.ToSystemMatrix();
    }
}

private static System.Array ToSystemMatrix<T>(this ILDenseArray<T> A) {
    using (ILScope.Enter(A)) {
        // some error checking (to be improved...)
        if (object.Equals(A, null)) throw new ArgumentException("A may not be null");
        if (!A.IsMatrix) throw new ArgumentException("Matrix expected");

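        // create return array with reversed dimensions
        // (Reverse() and ToArray() require 'using System.Linq;')
        System.Array ret = Array.CreateInstance(typeof(T), A.S.ToIntArray().Reverse().ToArray());
        // fetch underlying system array
        T[] workArr = A.GetArrayForRead();
        // copy memory block (Marshal requires 'using System.Runtime.InteropServices;',
        // Buffer.BlockCopy works for arrays of primitive element types only)
        Buffer.BlockCopy(workArr, 0, ret, 0, Marshal.SizeOf(typeof(T)) * A.S.NumberOfElements);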
        return ret;
    }
}

Set() creates the range in the sheet to load the data into. The size of the range is computed from the size of the incoming ILArray<T>. For performance reasons we do not want to iterate over each individual cell in order to load the data. One option is to set the Value property of the range to a two dimensional System.Array. ToSystemMatrix() does exactly that conversion. However, you have to be careful not to get transposed data unexpectedly. The reason is that .NET multidimensional arrays are stored in row major order. ILNumerics, on the other hand, stores arrays in column major order, the same order as Matlab, FORTRAN and other technical tools use. Hence, we need to transpose our data before we assign them to the range. Read more details here.
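
The effect of the two storage orders can be reproduced with plain .NET arrays. The following sketch does not use ILNumerics. StorageOrderDemo and CopyReversed are hypothetical names. It copies a column major buffer into a .NET array with reversed dimensions, just like the block copy above, and shows that the result is the transposed matrix.

```csharp
using System;

public static class StorageOrderDemo {
    // Copies a column major buffer describing a rows x cols matrix into a
    // .NET array with reversed dimensions (cols x rows), without reordering.
    public static double[,] CopyReversed(double[] columnMajor, int rows, int cols) {
        var ret = new double[cols, rows];
        Buffer.BlockCopy(columnMajor, 0, ret, 0, sizeof(double) * rows * cols);
        return ret;
    }

    public static void Main() {
        // the 2 x 3 matrix [1 2 3; 4 5 6], stored column by column
        double[] data = { 1, 4, 2, 5, 3, 6 };
        double[,] m = CopyReversed(data, 2, 3);
        // m is 3 x 2 and holds the transpose: m[c, r] equals original[r, c]
        Console.WriteLine(m[0, 1]); // element (row 1, column 0) of the original
        Console.WriteLine(m[2, 0]); // element (row 0, column 2) of the original
    }
}
```

Because the plain copy delivers the transpose, transposing the ILArray first (A.T in Set()) makes the data arrive in the worksheet in the expected orientation.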

Now, when we run the application and load an HDF5 file, we can double click on the cell holding a dataset name and have Excel load the dataset contents into a new worksheet – fast. This can easily be adapted to define ranges (hyperslabs) and load only parts of a dataset. You can also adapt the method described here for writing worksheet contents to HDF5 datasets.

Visualizing HDF5 Dataset Contents

Now let’s add another nice feature to our workbook: instead of simply loading the data from a dataset to the worksheet, we add the option of creating interactive, fully configurable and fast visualizations and plots of the data. We’ll use the predefined plot classes of ILNumerics Visualization Engine here.

Back to the double click event handler created earlier: we left out the path which is executed when the user clicks on the size displayed next to each dataset. What happens here is also straightforward.

First we parse the size to see if it gives something reasonable (again, you’ll add better error checking for a production release). If so, we pass the HDF5 file together with the dataset name to the renderDS() method which does the rendering:

private void renderDS(H5File file, string dsName) {
    var ds = file.First<H5Dataset>(dsName);
    if (ds != null) {
        using (ILScope.Enter()) {
            ILArray<float> A = ds.Get<float>();
            if (A.IsVector) {
                iLNumerics_Drawing_ILPanel1.Scene = new ILScene() {
                    new ILPlotCube(twoDMode: true) {
                        new ILLinePlot(A, markerStyle: MarkerStyle.Diamond, lineWidth: 2)
                    }
                };
            } else if (A.IsMatrix) {
                iLNumerics_Drawing_ILPanel1.Scene = new ILScene() {
                    new ILPlotCube(twoDMode: false) {
                        new ILSurface(A, colormap: Colormaps.Hot) {
                            UseLighting = true
                        }
                    }
                };
            }
            iLNumerics_Drawing_ILPanel1.Refresh();
        }
    }
}

This code does not need much commenting. It fetches the dataset and loads its content into an ILArray<float>. A new scene replaces the existing one in the worksheet ILPanel. The new scene contains a plot cube and a line plot or a surface plot. Which one is created depends on the shape of the data. Vector sized data create a line plot, matrices are rendered as a surface plot. In order to have the new scene show up, we must trigger a refresh on the panel.
Now, run the workbook, load an HDF5 file having some vector and/or matrix sized datasets, and select a dataset by double clicking on its size cell. The plot is created according to the data. Like all visualizations in ILNumerics we can interact with the data: rotation / zoom / pan is done with the left/right mouse buttons. And of course, you are free to apply any of the very flexible configuration options in order to customize your plots further. For line plots this includes: markers, dash styles, TeX labeling, advanced coloring, logarithmic axes and much more.
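
The ParseSize() helper used in the double click handler is not listed in this post. The following is a minimal sketch of what such a helper could look like, not the original implementation. It assumes that the size cell holds text like “[3,2]” or “3x2”. The exact text produced by ds.Size.ToString() is not shown here, so adjust the accepted characters as needed. SizeParser is a hypothetical class name.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SizeParser {
    // Extracts the dimension lengths from a size string such as "[3,2]" or "3x2".
    // Returns null if the text contains no number or any unexpected character,
    // which is the case for dataset paths like "/group/data".
    public static int[] ParseSize(string text) {
        if (string.IsNullOrWhiteSpace(text)) return null;
        // accept digits, brackets, parentheses, commas, 'x' and whitespace only
        if (!Regex.IsMatch(text, @"^[\s\[\]\(\),x0-9]+$")) return null;

        var dims = new List<int>();
        foreach (Match m in Regex.Matches(text, @"\d+")) {
            int value;
            if (!int.TryParse(m.Value, out value)) return null; // number too large
            dims.Add(value);
        }
        return dims.Count > 0 ? dims.ToArray() : null;
    }
}
```

Returning null for anything that does not look like a size matches the `size != null` check in the handler.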

Deploying ILNumerics Excel Workbooks

Once the workbook is ready for distribution, we build a Release version from Visual Studio and from now on do not need Visual Studio anymore! In order to make any machine ready for handling HDF5 data and creating Excel worksheets with nice visualizations on it, one only needs to run the ILNumerics installer and transfer the workbook to that machine. It will run on any platform, regardless of whether it is 32 or 64 bit!

You can find the full example for downloading on our examples page.

Performance on ILArray

Having a convenient data structure like ILArray<T> brings many advantages when handling numerical data in your algorithms. On the convenience side, there are flexible options for creating subarrays, altering existing data (e.g. lengthening or shortening individual dimensions on the fly), keeping dimensionality information together with the data, and last but not least: being able to formulate an algorithm by concentrating on the math rather than on loops and the like.

Convenience and Speed

Another advantage is performance: by writing C = A + B, with A and B being large arrays, the inner implementation is able to choose the most efficient way of evaluating this expression. Here is what ILNumerics does internally:

  • Pointer arithmetic – we remove the obligatory bounds checks on array accesses by using C# pointers.
  • Cache aware implementations – the biggest bottleneck in modern processors is memory, as you know. ILNumerics implements all basic math functions in a way that profits efficiently from the cache hierarchies in your processor. You will notice this if you time ILMath.sum over the rows of an array against a naive implementation.
  • Loops over large arrays are unrolled, saving a great amount of bookkeeping overhead.
  • Multithreading support – iterations over large arrays are split and distributed to multiple cores.
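
The influence of the memory access pattern can be tried out with plain .NET arrays. The following sketch is not the ILNumerics implementation. CacheDemo and its methods are hypothetical names. Both methods compute the column sums of a column major matrix. One reads memory sequentially, the other jumps through memory with a stride of one column length.

```csharp
using System;
using System.Diagnostics;

public static class CacheDemo {
    // Sums each column of a column major rows x cols matrix; memory is read sequentially.
    public static double[] SumColumnsSequential(double[] a, int rows, int cols) {
        var sums = new double[cols];
        for (int c = 0; c < cols; c++) {
            double s = 0;
            for (int r = 0; r < rows; r++) s += a[c * rows + r];
            sums[c] = s;
        }
        return sums;
    }

    // Same result, but the inner loop jumps 'rows' elements on every step.
    public static double[] SumColumnsStrided(double[] a, int rows, int cols) {
        var sums = new double[cols];
        for (int r = 0; r < rows; r++)
            for (int c = 0; c < cols; c++) sums[c] += a[c * rows + r];
        return sums;
    }

    public static void Main() {
        int rows = 2000, cols = 2000;
        var a = new double[rows * cols];
        for (int i = 0; i < a.Length; i++) a[i] = i % 7;

        var sw = Stopwatch.StartNew();
        SumColumnsSequential(a, rows, cols);
        Console.WriteLine("sequential: " + sw.ElapsedMilliseconds + " ms");

        sw.Restart();
        SumColumnsStrided(a, rows, cols);
        Console.WriteLine("strided:    " + sw.ElapsedMilliseconds + " ms");
    }
}
```

The measured times depend on the machine and on JIT warm-up, so treat them as a rough indication only.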

ILNumerics Accelerator

Here is what we are planning to add to this list:

  • Utilizing SIMD vector extensions (SSE, AVX). Today, this parallel potential is accessed via the (native) MKL in ILNumerics. There has been a recent update to MKL 11.2 in ILNumerics Ultimate VS 4.4 which brings AVX2 support on corresponding Intel processors. For the future, we are planning to bring in extended support for all managed evaluations as well. This will be introduced with the release of our ILNumerics Accelerator. Stay tuned – this will be a very exciting release… !
  • The Accelerator will also bring another highly efficient feature: removal of intermediate arrays. Expressions like C = A + 2 * B will no longer create an intermediate result for 2 * B, but will be reformulated before evaluation. Again: memory is the bottleneck and this will save a lot of memory transfer to the CPU.
  • With capable GPUs available in basically every desktop computer today, one naturally wants to use them – especially when waiting for results. ILNumerics Accelerator will do it! It will decide on its own during runtime which parts of your algorithms are best executed on which available hardware and do it in the most efficient way possible. Need a further speedup? Just buy a better graphics card!
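
What the removal of intermediate arrays means for C = A + 2 * B can be sketched with plain .NET arrays. This only illustrates the idea and is not how the Accelerator is implemented. FusionDemo and its methods are hypothetical names, and both inputs are assumed to have the same length.

```csharp
using System;

public static class FusionDemo {
    // C = A + 2 * B evaluated operator by operator: 2 * B is stored in a temporary first.
    public static double[] WithIntermediate(double[] a, double[] b) {
        var tmp = new double[b.Length];
        for (int i = 0; i < b.Length; i++) tmp[i] = 2 * b[i];
        var c = new double[a.Length];
        for (int i = 0; i < a.Length; i++) c[i] = a[i] + tmp[i];
        return c;
    }

    // The same expression in a single pass: no temporary array is allocated
    // and every element of A and B is read exactly once.
    public static double[] Fused(double[] a, double[] b) {
        var c = new double[a.Length];
        for (int i = 0; i < a.Length; i++) c[i] = a[i] + 2 * b[i];
        return c;
    }
}
```

The fused version writes and reads one array less, which is where the saving in memory traffic comes from.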

Limitations of n-dim arrays

Nothing comes for free, right? All the above is great for large data on vectorized algorithms. But what if we have a tight loop over small ILArrays or an algorithm which basically does a vast amount of scalar computations?

The flexibility of ILArray certainly comes at a price here. It becomes clear if you imagine the difference between a single double value and its corresponding representation in terms of a scalar ILArray<double>. The former, being a value type, typically lives on the stack rather than on the managed heap when used as a local variable. It is not involved in garbage collection and there is basically no cost involved for its construction, access and destruction – at least when compared to the full featured implementation of ILArray.

ILArray, on the other hand, is a class instance. It lives on the heap instead of the stack storage. Creating an instance corresponds to the creation with ‘new’. For the destruction ILNumerics is able to completely prevent the GC having to handle ILArray instances. But even this memory management comes with a cost, naturally.

So there are two sides of the coin: flexibility and better performance for ‘large’ data, but flexibility and slower speed for tiny to scalar data. For most algorithms, having a reasonable amount of management of small data does not hurt performance. But if such operations are to be done in tight loops, one starts looking for alternatives.

Alternatives

Especially for small data the flexible subarray features of ILArray are often not needed anyway. Such scalar parts of an algorithm are implemented much more efficiently with data structures that are designed for the job: System.ValueType and System.Array. The good news: ILNumerics makes it very easy to resort to such system types instead. Here comes a list of options:

Scalar operations on an ILArray<T>

Scalar access | Returns | Efficiency | Documentation
--------------|---------|------------|--------------
A[0,1] | ILArray<T> | Improved since 4.3, but will not be fastest on scalars (ILNumerics Accelerator might change this!) | Subarrays
A.GetValue(), A.SetValue() | T | faster and efficient if you only need a single element occasionally | Writing to Arrays
foreach (var a in A) { … } | T | very efficient iteration over all elements of A, gives copies as system types | ILArray and LINQ
A.GetArrayForRead(), A.GetArrayForWrite() | T[] | direct access to the underlying system array of A. Use this for hand tuned inner kernels or if you need access to native libraries. | Array Import / Export
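
The access options listed above could be used as in the following sketch. It requires a reference to ILNumerics and was written against the members named in this post (GetValue, foreach enumeration, GetArrayForRead, S.NumberOfElements). The exact overloads are assumptions, so check them against your ILNumerics version. ScalarAccessDemo is a hypothetical class name.

```csharp
using System;
using ILNumerics;

public static class ScalarAccessDemo {
    public static void Main() {
        ILArray<double> A = ILMath.rand(1000, 1);
        int n = A.S.NumberOfElements;

        // 1) occasional element access: GetValue returns a plain double
        double first = A.GetValue(0);

        // 2) iteration: foreach yields copies of the elements as system types
        double sum1 = 0;
        foreach (double a in A) sum1 += a;

        // 3) tight inner loop: work on the underlying system array directly.
        //    The loop stops at NumberOfElements rather than raw.Length, as the
        //    conversion methods in this post do (assumption: the system array
        //    may be longer than the number of elements stored in A).
        double[] raw = A.GetArrayForRead();
        double sum2 = 0;
        for (int i = 0; i < n; i++) sum2 += raw[i];

        Console.WriteLine(first + " " + sum1 + " " + sum2);
    }
}
```

Options 2 and 3 avoid creating a scalar ILArray for every element, which is what makes them suitable for tight loops.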

You may also consider the following references:
Stackoverflow posts dealing with the scalar access issue:
http://stackoverflow.com/questions/20944672/iteration-through-ilarraydouble-is-very-slow

http://stackoverflow.com/questions/19224626/performance-of-ilnumerics-of-simple-math-operations-over-vectors-vs-system-array

Uncommon data conversion with ILArray

ILNumerics Computing Engine supports the most common numeric data types out of the box: double, float, complex, fcomplex, byte, short, int, long, ulong

If you need to convert from, let’s say, ushort to float, you will not find a prepared conversion function in ILMath. Luckily, it is very easy to write your own.

Here is a method which implements the conversion from ushort to float. A straightforward version first:

        /// <summary>
        /// Convert ushort data to ILArray&lt;float>
        /// </summary>
        /// <param name="A">Input Array</param>
        /// <returns>Array of the same size as A, single precision float elements</returns>
        public static ILRetArray<float> UShort2Single(ILInArray<ushort> A) {
            using (ILScope.Enter(A)) {
                ILArray<float> ret = ILMath.zeros<float>(A.S);
                var retArr = ret.GetArrayForWrite();
                int c = 0;
                // the loop relies on A being enumerated in storage order
                foreach (ushort a in A) {
                    retArr[c++] = a; // implicit conversion ushort -> float
                }
                return ret;
            }
        }

The method is used like this:

            ushort[,] rawSensorData = new ushort[,] {{0,1,2},{3,4,5}};
            ILArray<float> converted = UShort2Single(rawSensorData);
            /*
             * <Single> [3,2]
             * [0]:          0          3
             * [1]:          1          4
             * [2]:          2          5
             */

            // continue working with 'converted' here...

The following method does the same but utilizes pointer arithmetic, hence it needs the /unsafe flag. Use this, if performance is critical and your data are sufficiently large:

        /// <summary>
        /// Convert ushort data to ILArray&lt;float> (unsafe version)
        /// </summary>
        /// <param name="A">Input Array</param>
        /// <returns>Array of the same size as A, single precision float elements</returns>
        public unsafe static ILRetArray<float> UShort2SingleUnsafe(ILInArray<ushort> A) {
            using (ILScope.Enter(A)) {
                ILArray<float> ret = ILMath.zeros<float>(A.S);
                var retArr = ret.GetArrayForWrite();
                var AArr = A.GetArrayForRead();

                fixed (ushort* pAArr = AArr)
                fixed (float* pRetArr = retArr) {
                    ushort* pInWalk = pAArr;
                    ushort* pInEnd = pAArr + A.S.NumberOfElements;
                    float* pRetWalk = pRetArr;
                    while (pInWalk < pInEnd) {
                        *(pRetWalk++) = /*implicit: (float)*/ (*(pInWalk++));
                    }
                }
                return ret;
            }
        }

Plotting Fun with ILNumerics and IronPython

Since the early days of IronPython, I keep shifting one bullet point down on my ToDo list:

* Evaluate options to use ILNumerics from IronPython

Several years ago there were some attempts by ILNumerics users who successfully utilized ILNumerics from within IronPython. But despite our fascination with these attempts, we were not able to catch up and deeply evaluate all options for joining both projects. Years went by and Microsoft has dropped support for IronPython in the meantime. Nevertheless, a considerably large community seems to be active on IronPython. Finally, today is the day I am going to give this a first quick shot.

Disclaimer: I am not a Python developer. My experience with IronPython ranges from 0 … zero (unfortunately, since the project deserves much more attention). For CPython it is only slightly better. So please bear with me if I went in some stupid direction or if I have completely missed better options in this article! Corrections and suggestions are always warmly welcome.

Setup

I downloaded and installed IronPython from CodePlex. Also, in Visual Studio 2013 there is a link in the NEW PROJECT dialog advertising the download of IronPython Tools for Visual Studio. I used that to set up Visual Studio for IronPython projects. All setup went smoothly and easily. Nice!

Creating Plots with IronPython

The challenge I was most interested in was how to utilize the great plotting capabilities of the ILNumerics Visualization Engine from IronPython projects. Since matplotlib seems not to be available for IronPython and other alternatives are also pretty rare (if any at all?), having our visualization engine available to IronPython projects seems to be a big improvement.

The good news first: it works and it does so very easily. The following plot is done purely in IronPython:

I started with a fresh new IronPython Windows Forms Application from the ‘New Project’ dialog in Visual Studio 2013.

This gives a template python file ‘IronPython_ILPanel.py’ with the following content:

import clr
clr.AddReference('System.Drawing')
clr.AddReference('System.Windows.Forms')

from System.Drawing import *
from System.Windows.Forms import *

class MyForm(Form):
    def __init__(self):
        # Create child controls and initialize form
        pass

Application.EnableVisualStyles()
Application.SetCompatibleTextRenderingDefault(False)

form = MyForm()
Application.Run(form)

If you are familiar with the common setup of Windows Forms you may notice some similarities with the stub provided by a new C# Windows Forms Application project. What is otherwise spread over multiple files is now done within the global scope of the one and only *.py file: the form class is defined, the static Application class is configured and the event loop is started with an instance of the previously defined MyForm class. Straightforward. This corresponds to the common setup for Windows.Forms from any language for which no designer support exists. The same could be done in PowerShell, to name just another example.

In order to integrate support for ILNumerics Visualization Engine, I did the following simple steps:

1) If not done so far, go and install ILNumerics Ultimate VS. We offer 30-day trials which contain all features without limitation.
2) Once installed, ILNumerics is available in the “Add Reference” dialog (right click on References in the Solution Explorer). ILNumerics.dll is added from the .NET tab:

3) Make the ILNumerics namespaces available: I added the following lines to the import section at the very top of IronPython_ILPanel.py:

import clr
clr.AddReference('System.Drawing')
clr.AddReference('System.Windows.Forms')
clr.AddReference('ILNumerics')

from System.Drawing import *
from System.Windows.Forms import *
from ILNumerics import *
from ILNumerics.Drawing import *
from ILNumerics.Drawing.Plotting import *

4) Now we are ready to add a plot to the form. In C# we would simply use the ilPanel1_Load event handler which is easily added by help of the forms designer. Since we do not have designer support here, we simply add the setup to the constructor of MyForm:

class MyForm(Form):
   def __init__(self):
     # Create child controls and initialize form
     ilpanel = ILPanel()
     ilpanel.Dock = DockStyle.Fill
     self.Controls.Add(ilpanel)
     # show some demo: first create a plotcube
     pc = ILPlotCube("pc", 0)
     # it will hold a surface with some sinc data in 3D
     # You can use ILRetArray returned from any ILNumerics Computing
     # Module as input argument here directly. Type conversions seem to happen automatically.
     sf = ILSurface(ILSpecialData.sincf(40,50))
     pc.Add(sf)
     # add the plotcube to the scene
     ilpanel.Scene.Add(pc)
     self.Text = "Plotting Fun with ILNumerics in IronPython"

This is all straightforward. The configuration of the panel and the plots is exactly as it would have been done in C# or Visual Basic. However, we have to add the ILPanel to the self.Controls collection manually. The result, of course, is exactly the same: a nice interactive form, utilizing OpenGL, with all manipulation options like rotation, pan and zoom. The full spectrum of the Visualization Engine should work out of the box. This includes several flexible 3D and 2D plotting types as well as the whole scene graph functionality for building your own interactive visualizations.

This is all that needs to be done in order to create nice professional plots directly within IronPython. Press F5 and start your application, rotate the plot with the mouse and learn about all the options you have.

Computing Fun with IronPython and ILNumerics

ILNumerics not only relies on its own flexible n-dimensional array implementation, it also introduces its very own memory management – for very good reasons. In order for the memory management to work, individual array types (ILArray, ILInArray, ILOutArray and ILRetArray) are used in ILNumerics algorithms. This makes great use of the strict type safety of .NET languages such as C# and Visual Basic and of automatic conversions between those types.

Python, on the other hand, is a dynamic language. It does not know the concept of types to the same extent. A straightforward application of the ILNumerics array types is not possible. IronPython offers workarounds (clr.Convert), but they are not able to provide the same syntactical convenience as a pure C# algorithm.

My recommendation for the utilization of ILNumerics Computing Engine therefore is as follows:

ILNumerics Computing Engine can be used without restriction. The utilization of existing algorithms is straightforward. Algorithms leveraging the ILNumerics Function Rules can be called directly without any type conversions. The creation of local variables, on the other hand, requires type conversions from ILRetArray to ILArray. This can be done with the help of IronPython’s clr.Convert function.

The type conversion issue possibly makes it less feasible to write your own ILNumerics Computing Engine algorithms in IronPython. But most of the time, one would rather want to use existing Python algorithms anyway. In order to actually create a new algorithm, one should rather use C# and compile the algorithm into its own .NET module, which can then easily be imported into your Python project and called from your Python code.
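
Such a C# module could look like the following sketch. The namespace MyAlgorithms, the class Preprocessing and the method ScaleAndShift are hypothetical names chosen for illustration. The method follows the same pattern as the other ILNumerics functions shown on this blog: ILInArray for input parameters, ILRetArray as return type and an ILScope block around the body.

```csharp
using ILNumerics;

namespace MyAlgorithms {   // hypothetical namespace and assembly name
    public static class Preprocessing {
        // Hypothetical example algorithm: returns scale * A + offset.
        public static ILRetArray<double> ScaleAndShift(ILInArray<double> A,
                                                       double scale, double offset) {
            using (ILScope.Enter(A)) {
                // local variables are declared as ILArray inside the scope
                ILArray<double> ret = A * scale + offset;
                return ret;
            }
        }
    }
}
```

After compiling this into a class library, the assembly would be made available to the IronPython script with clr.AddReference, in the same way as the ILNumerics assembly above, and the static method would be called from Python like any other .NET method.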

Summary

This blog post demonstrated how easy it is to utilize ILNumerics from IronPython. The Visualization Engine in particular is incorporated without any problems and offers the full set of visualization and plotting options out of the box. Algorithms created with ILNumerics Computing Engine can be called and used directly – in parallel with numpy algorithms, if the need arises. The syntax of the Computing Engine, however, is less expressive and clumsier than in C# due to the absence of implicit type conversions and operator overloads. Complex algorithms are therefore better implemented in C# or Visual Basic than in IronPython. For small algorithms, like data preprocessing and the like, ILNumerics Computing Engine serves as a faster alternative to numpy arrays.

Dark color schemes with ILPanel

I recently got a request for help in building an application, where ILPanel was supposed to create some plots with a dark background area. Dark color schemes are very popular in some industrial domains and ILNumerics’ ILPanel gives the full flexibility for supporting dark colors. Here comes a simple example:

And here comes the code used to create this example:

private void ilPanel1_Load(object sender, EventArgs e) {
    // create some test data
    ILArray<float> A = ILSpecialData.torus(1.3f, 0.6f);

    // create the plot: a simple surface
    ilPanel1.Scene.Add(new ILPlotCube(twoDMode: false) {
        new ILSurface(A, colormap: Colormaps.Summer) {
            // we also want a colorbar
            new ILColorbar() {
                Background = {
                    Color = Color.DarkGray
                }
            }
        }
    });

    // set the backcolor of the scene to black
    ilPanel1.BackColor = Color.Black; 

    // set labels color
    foreach (var label in ilPanel1.Scene.Find<ILLabel>()) {
        label.Color = Color.White;
        label.Fringe.Width = 0;
    }
            
    // set the color of the default labels for axis ticks
    foreach (var axis in ilPanel1.Scene.Find<ILAxis>()) {
        axis.Ticks.DefaultLabel.Color = Color.White;
        axis.Ticks.DefaultLabel.Fringe.Width = 0;
    }
            
    // some more configuration: the view limits
    ilPanel1.Scene.First<ILPlotCube>().Limits.Set(
        new Vector3(0, 0, 1), new Vector3(2, 2, -1));

}

First, we use the ILSpecialData class to create some test data. torus() creates the X, Y and Z values which eventually assemble a torus when used in ILSurface. The next statement creates and adds a new plot cube to the scene. We set its twoDMode parameter to false, so we can rotate the torus with the mouse.

The next line creates a new surface and provides the torus data to it. As colormap ‘Colormaps.Summer’ is configured. Most surfaces need a colorbar in order to help mapping colors to actual values. We add a new colorbar below the surface and set its background color to some dark value.

Next, the BackColor of the main panel is set to black. Note that in the current version (3.3.3) the background color of a panel must be set in code. This is due to a bug in ILPanel which causes settings made in the designer to be ignored!

Now we have a dark background color but the labels still remain black. So let’s fix this: all labels which are part of the regular scene graph can easily be set at once. We simply use the ILGroup.Find() function to enumerate all labels and set their color to white. Also, we remove the fringe around them. Alternatively we could have set the fringe color to some dark color.

The last remaining issue is that labels for ticks cannot be configured this way. The reason is that tick labels are created dynamically: they don’t even exist at the time this code executes. So we must configure a thing called ‘DefaultLabel’ instead. DefaultLabel is a member of the ticks collection of every axis object and is used at runtime to provide default properties for all tick labels in auto mode.

This gives a nice dark color scheme. Keep in mind that the default color values for all scene and plot objects are currently optimized for light background colors. Using dark backgrounds therefore requires one to adjust the color of all plot objects accordingly.

Large Object Heap Compaction – on Demand ??

In the 4.5.1 side-by-side update of the .NET framework a new feature has been introduced, which will really remove one annoyance for us: Edit & Continue for 64 bit debugging targets. That is really a nice one! Thanks a million, dear fellows in “the corp”!

Another useful one: One can now investigate the return value of functions during a debug session.

Now, while both features will certainly help to create better applications by helping you to get through your debug session more quickly and conveniently, another feature was introduced which deserves a more critical look: there now exists an option to explicitly compact the large object heap (LOH) during garbage collections. MSDN says:

If you assign the property a value of GCLargeObjectHeapCompactionMode.CompactOnce, the LOH is compacted during the next full blocking garbage collection, and the property value is reset to GCLargeObjectHeapCompactionMode.Default.

Hm… They state further:

You can compact the LOH immediately by using code like the following:

GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
GC.Collect(); 

Ok. Now, it looks like there has been quite some demand for ‘a’ solution to a serious problem: LOH fragmentation. This basically happens all the time when large objects are created within your application, released, created again and released again … you get the point: a disadvantageous allocation pattern with ‘large’ objects will almost certainly lead to holes in the heap. Reclaimed objects leave gaps, but other objects still residing in the same chunk prevent the chunk from being given back to the memory manager, so OutOfMemoryExceptions are thrown rather early …
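To make ‘large’ a bit more concrete: the .NET runtime places objects of 85,000 bytes or more on the LOH, and such objects are reported as belonging to the highest GC generation right from the start. The following is only a small sketch to observe this; the class and method names (LohDemo, GenerationOfNewArray) are made up for this illustration.

```csharp
using System;

static class LohDemo
{
    // Allocates a byte array of the given size and returns the GC generation
    // the runtime reports for it immediately after allocation.
    public static int GenerationOfNewArray(int lengthInBytes)
    {
        var buffer = new byte[lengthInBytes];
        return GC.GetGeneration(buffer);
    }

    static void Main()
    {
        // A small array starts its life in generation 0.
        Console.WriteLine("1 KB array:   generation " + GenerationOfNewArray(1024));

        // An array above the 85,000 byte threshold lives on the LOH,
        // which is collected together with the highest generation.
        Console.WriteLine("100 KB array: generation " + GenerationOfNewArray(100000));
        Console.WriteLine("GC.MaxGeneration: " + GC.MaxGeneration);
    }
}
```

Since LOH objects are only reclaimed by full collections and are not moved by default, repeatedly allocating and dropping arrays of varying sizes is exactly the pattern that leaves holes behind.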

If all this sounds new and confusing to you – no wonder! This is probably because you are using ILNumerics :) Its memory management reliably spares you from having to deal with these issues. How? Heap fragmentation is caused by garbage. And the best way to handle garbage is to avoid producing it, right? This is especially true for large objects and the .NET framework. And how would one avoid garbage? By reusing your plastic bags until they start disintegrating and your eggs get in danger of falling through (and switching to a solid basket afterwards, I guess).

In terms of computers this means: reuse your memory instead of throwing it away! Especially for large objects, throwing memory away puts way too much pressure on the garbage collector, and in the end it doesn’t even help, because fragmentation is still going on in the heap. For ‘reusing’ we must save the memory (i.e. large arrays in our case) somewhere. This directly leads to a pooling strategy: once an ILArray is not used anymore, its storage is kept safe in a pool and used for the next ILArray.

That way, no fragmentation occurs! And just as in real life – keeping the environment clean gives you even more advantages. It helps the caches by presenting recently used memory and it protects the application from having to waste half the execution time in the GC. Luckily, the whole pooling in ILNumerics works completely transparently in the background. There is nothing one needs to do in order to gain all advantages, except following the simple rules of writing ILNumerics functions. ILNumerics keeps track of the lifetime of the arrays, saves their underlying System.Arrays in the ILNumerics memory pool, and finds and returns any suitable array from there for the next computation.

The pool is smart enough to learn what ‘suitable’ means: if no array is available with the exact length as requested, a next larger array will do just as well:

public ILRetArray<double> CreateSymm(int m, int n) {
    using (ILScope.Enter()) {
        ILArray<double> A = rand(m,n); 
        // some very complicated stuff here...
        A = A * A + 2.3; 
        return multiply(A,A.T);
    }
}

// use this function without worrying about your heap!
while (true) {
   dosomethingWithABigMatrix(CreateSymm(1000,2000)); // one can even vary the sizes here!
   // at this point, your heap is clean ! No fragmentation! No GC gen.2 collections ! 
}
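The pooling idea described above can be sketched in a few lines of plain C#. This is explicitly not the ILNumerics memory pool, whose internals are not shown in this post; SimpleArrayPool, Rent and Return are hypothetical names for a deliberately naive illustration of ‘the next larger array will do just as well’.

```csharp
using System;
using System.Collections.Generic;

// Illustration only: a naive pool that hands out the smallest free array
// which is at least as long as requested.
sealed class SimpleArrayPool
{
    // Free arrays, grouped by their length; keys are enumerated in ascending order.
    private readonly SortedDictionary<int, Stack<double[]>> _free =
        new SortedDictionary<int, Stack<double[]>>();

    // Returns a pooled array with at least minLength elements,
    // or allocates a new one if the pool has nothing suitable.
    public double[] Rent(int minLength)
    {
        int foundKey = -1;
        foreach (var entry in _free)
        {
            if (entry.Key >= minLength) { foundKey = entry.Key; break; }
        }
        if (foundKey < 0) return new double[minLength];

        Stack<double[]> stack = _free[foundKey];
        double[] array = stack.Pop();
        if (stack.Count == 0) _free.Remove(foundKey);
        return array;
    }

    // Puts an array back into the pool instead of leaving it to the GC.
    public void Return(double[] array)
    {
        Stack<double[]> stack;
        if (!_free.TryGetValue(array.Length, out stack))
        {
            stack = new Stack<double[]>();
            _free.Add(array.Length, stack);
        }
        stack.Push(array);
    }
}

static class PoolDemo
{
    static void Main()
    {
        var pool = new SimpleArrayPool();
        double[] first = pool.Rent(2000000);
        pool.Return(first);

        // A smaller request is served from the larger array already in the pool.
        double[] second = pool.Rent(1500000);
        Console.WriteLine("Reused the same storage: " + ReferenceEquals(first, second));
    }
}
```

Note that a rented array may be longer than requested and may still contain old values, so the consumer has to track the logical length itself. A real pool would also need a size limit and thread safety, both of which this sketch leaves out.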

So keep in mind: the next time you encounter an unexpected OutOfMemoryException, you can either go out and try to make use of that obscure GCSettings.LargeObjectHeapCompactionMode property, or … simply start using ILNumerics and forget about that problem altogether.

ILNumerics Language Features: Limitations for C#, Part II: Compound operators and ILArray

A while ago I blogged about why the CSharp var keyword cannot be used with local ILNumerics arrays (ILArray<T>, ILCell, ILLogical). This post is about the other one of the two main limitations on C# language features in ILNumerics: the use of compound operators in conjunction with ILArray<T>. In the online documentation we state the rule as follows:

The following features of the C# language are not compatible with the memory management of ILNumerics and its use is not supported:

  • The C# var keyword in conjunction with any ILNumerics array types, and
  • Any compound operator, like +=, -=, /=, *= a.s.o. Exactly spoken, these operators are not allowed in conjunction with the indexer on arrays. So A += 1; is allowed. A[0] += 1; is not!

Let’s take a closer look at the second rule. Most developers think of compound operators as being just syntactic sugar for some common expressions:

int i = 1;
i += 2;

… would simply expand to:

int i = 1;
i  = i + 2; 

For simple types like an integer variable, the actual effect will be indistinguishable from that expectation. However, compound operators introduce a lot more than that. Back when he was at Microsoft, Eric Lippert blogged about those subtleties. The article is worth reading for a deep understanding of all side effects. In the following, we will focus on the single fact which becomes important in conjunction with ILNumerics arrays: when used with a compound operator, i in the example above is only evaluated once! In contrast, in i = i + 2, i is evaluated twice.
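The difference between one and two evaluations becomes visible as soon as evaluating the operand has a side effect. The following small sketch uses a plain int array and a counting helper; the names (CompoundEvaluationDemo, NextIndex) are made up for this illustration and have nothing to do with ILNumerics.

```csharp
using System;

static class CompoundEvaluationDemo
{
    // Counts how often the index expression has been evaluated.
    public static int Calls;

    static int NextIndex()
    {
        Calls++;
        return 0;
    }

    // Compound form: the index expression is evaluated a single time.
    public static int CountCompound()
    {
        var data = new int[1];
        Calls = 0;
        data[NextIndex()] += 2;
        return Calls;
    }

    // Expanded form: the index expression appears twice and is evaluated twice.
    public static int CountExpanded()
    {
        var data = new int[1];
        Calls = 0;
        data[NextIndex()] = data[NextIndex()] + 2;
        return Calls;
    }

    static void Main()
    {
        Console.WriteLine("compound: " + CountCompound() + " evaluation(s)");
        Console.WriteLine("expanded: " + CountExpanded() + " evaluation(s)");
    }
}
```

The compound form reports one evaluation, the expanded form two. Both forms leave the same value in the array; only the number of operand evaluations differs.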

Evaluating an int does not cause any side effects. However, if used on more complex types, the evaluation may cause side effects. An expression like the following:

ILArray<double> A = 1;
A += 2;

… evaluates to something similar to this:

ILArray<double> A = 1;
A = (ILArray<double>)(A + 2); 

There is nothing wrong with that! A += 2 will work as expected. Problems arise, if we include indexers on A:

ILArray<double> A = ILMath.rand(1,10);
A[0] += 2;
// this transforms to something similar to the following: 
var receiver = A; 
var index = (ILRetArray<double>)0;
receiver[index] = receiver[index] + 2; 

In order to understand what exactly is going on here, we need to take a look at the definition of indexers on ILArray:

public ILRetArray<ElementType> this[params ILBaseArray[] range] { ... 

The indexer expects a variable length array of ILBaseArray. This gives the most flexibility for defining subarrays in ILNumerics. Indexers allow not only scalars of built-in system types as in our example, but arbitrary ILArray and string definitions. In the expression A[0], 0 is implicitly converted to a scalar ILNumerics array before the indexer is invoked. Thus, a temporary array is created as argument. Keep in mind that, due to the memory management of ILNumerics, all such implicitly created temporary arrays are immediately disposed of after their first use.

Since both the indexing expression 0 and the object the indexer is defined on (i.e. A) are evaluated only once, we run into a problem: index is needed twice. At first, it is used to acquire the subarray at receiver[index]. The indexer get { ... } function is used for that. Once it returns, all input arguments are disposed – an important foundation of ILNumerics’ memory efficiency! Therefore, if we invoke the index setter function with the same index variable, it will find the array already disposed – and throw an exception.
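The mechanism can be reproduced without ILNumerics by using a mock. In the following sketch, TempIndex is a hypothetical stand-in for a temporary array that may only be used once, and MockArray is a hypothetical container whose indexer consumes it; neither is an ILNumerics type, and the real implementation certainly differs in its details.

```csharp
using System;

// Hypothetical stand-in for a temporary array that is released after its first use.
sealed class TempIndex
{
    private readonly int _value;
    private bool _disposed;

    private TempIndex(int value) { _value = value; }

    // Scalars are converted implicitly, similar to the int -> array conversion in the text.
    public static implicit operator TempIndex(int value) { return new TempIndex(value); }

    // Hands out the value once; any later use fails.
    public int Consume()
    {
        if (_disposed) throw new ObjectDisposedException("TempIndex");
        _disposed = true;
        return _value;
    }
}

// Hypothetical container whose getter and setter both consume their index argument.
sealed class MockArray
{
    private readonly double[] _data = new double[10];

    public double this[TempIndex index]
    {
        get { return _data[index.Consume()]; }
        set { _data[index.Consume()] = value; }
    }
}

static class DisposedIndexDemo
{
    // v[0] += 2 converts 0 once and passes the same TempIndex to get and set.
    public static bool CompoundThrows()
    {
        var v = new MockArray();
        try { v[0] += 2; return false; }
        catch (ObjectDisposedException) { return true; }
    }

    // v[0] = v[0] + 2 converts 0 twice, so each accessor gets a fresh TempIndex.
    public static bool ExpandedThrows()
    {
        var v = new MockArray();
        try { v[0] = v[0] + 2; return false; }
        catch (ObjectDisposedException) { return true; }
    }

    static void Main()
    {
        Console.WriteLine("compound form throws: " + CompoundThrows());
        Console.WriteLine("expanded form throws: " + ExpandedThrows());
    }
}
```

In the mock, only the compound form fails: the getter consumes the single temporary index and the setter then finds it already released. Writing the expression out in full creates a fresh temporary for each access and works.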

It would certainly be possible to circumvent that behavior by converting scalar system types to ILArray<double> instead of ILRetArray<double>:

ILArray<double> A = ...;
A[(ILArray<double>)0] += 2;

However, the much less expressive syntax aside, this would not solve our problem in general either. The reason lies in the flexibility required for the indexer arguments. The user would have to ensure manually that all arguments in the indexer argument list are of some non-volatile array type. Casting to ILArray<T> might be an option in some situations. However, in general, compound operators require much more attention due to the efficient memory management in ILNumerics. We considered the risk of failing to provide only non-volatile arguments too high. So we decided not to support compound operators in conjunction with indexers at all.

See: General Rules for ILNumerics, Function Rules, Subarrays

Troubleshooting: Adding ILNumerics 3D Controls to the VS Toolbox

Adding ILNumerics visualizations to Visual Studio based projects has become quite a convenient task: it’s easy to use the ILNumerics math library in your own .NET projects. However, from time to time users have problems adding the ILNumerics controls to their Visual Studio Toolbox window.

Update: Since ILNumerics Ultimate VS version 4 this issue has been solved once and for all. Simply run the MSI installer and find the ILNumerics ILPanel in the toolbox for all applicable situations.

That’s what a post on Stack Overflow from earlier this year was about: A developer who wanted to use our C# math library for 3d visualizations and simulations wasn’t able to access the ILNumerics controls. “How can I locate it?”, he was wondering. “Do I have to make some changes to my VS?”

Adding ILNumerics Controls to the Visual Studio Toolbox manually

If the ILNumerics Ultimate VS math library is installed on a system, the ILNumerics controls are normally listed automatically in the Visual Studio toolbox on all supported versions of Visual Studio. However, if that’s not the case, there’s a way to add them manually: after right-clicking the toolbox, select “Choose Items…”. The dialog allows you to select the assembly to load the controls from – that’s it! You will find the ILNumerics.dll in the installation folder on your system. By default it is located at: “C:\Program Files (x86)\ILNumerics\ILNumerics Ultimate VS\bin\ILNumerics.dll”.

If that doesn’t work straightaway, it often helps to first clear the toolbox of any copies of custom controls – simply right-click it and choose “Reset Toolbox”.

Need help? ILNumerics Documentation and Support

You want to know more about our math library and its installation? Check out our documentation and the Quick Start Guide! If you have any technical questions, have a look at our Support Section.

Using LAPACK in C#/.NET: Linear Equation Systems in ILNumerics

If you add a math library to your .NET/C# project, LAPACK will probably be one of the key features you expect from it: the routines provided by LAPACK (which actually means “Linear Algebra Package”) cover a wide range of functionality needed for nearly any numerical algorithm in natural sciences, computer science, and social science.

The LAPACK software library is written in FORTRAN – until 2008 it was even written in FORTRAN 77. That’s why adding LAPACK functions to an enterprise software project written in Java or C#/.NET can be quite a demanding task: the integration of native modules often causes problems regarding the maintainability and stability of enterprise applications.

Our LAPACK implementation for C#/.NET

ILNumerics offers a convenient implementation of LAPACK for C# and .NET: It provides software developers both the execution speed of highly optimized processor specific native code and the convenience of managed software frameworks. That allows our users to create powerful applications in a very short time.

For linear algebra functions ILNumerics uses the processor-optimized LAPACK library by the MIT and Intel’s MKL. ILMath.Lapack is a concrete interface wrapper class that provides the native LAPACK functions. The LAPACK wrapper is initialized when a call to any static method of ILMath is made. Once the corresponding binaries for your actual architecture have been found, consecutive calls will utilize them in a very efficient way.

The MKL is utilized (and needed) for all calls to any fft(A) function and for functions based on matrix decompositions (for example linsolve, rank, svd, qr etc.). The only exception to that is ILMath.multiply – the general matrix multiplication. Matrix multiplication is needed so often that a math library simply could not go without it. So we decided to implement ILMath.multiply() purely in managed code as well. The good thing: it is not really far behind the speed of the processor-optimized version! If MKL binaries are found at runtime, those will be used, of course. But in their absence, the managed version should be fast enough for the vast majority of situations.

In most cases using this kind of .NET/C# LAPACK implementation means: faster results and more stable software applications. Learn more about Linear Equation Systems and other features of ILNumerics in our Documentation.

C# for 3D visualizations and Plotting in .NET

2D and 3D visualizations are an important feature for a wide range of domains: both software developers and scientists often need convenient visualization facilities to create interactive scenes and to make data visible. The ILNumerics math library brings powerful visualization features to C# and .NET: ILView, the ILNumerics Scene Graph API and its plotting engine. We’d like to give an overview of our latest achievements.

ILView: a simple way to create interactive 3d visualizations

We have created ILView as an extension to our interactive web component: it allows you to simply try out ILNumerics’ 2D and 3D visualization features by choosing the output format .exe in our visualization examples. But that’s not all: ILView is also a general REPL for the evaluation of computational expressions using the C# language. ILView is Open Source – find it on GitHub!

Screenshot of ILView
Using ILView for interactive 3D Visualization

ILNumerics Scene Graph: realize complex visualizations in .NET

The ILNumerics scene graph is the core of ILNumerics’ visualization engine. No matter if you want to create complex interactive 3D visualizations, or if you aim at enhancing and re-configuring existing scenes in .NET: the ILNumerics scene graph offers a convenient way to realize stunning graphics with C#. It renders with OpenGL or GDI, and scenes can be exported to vector and pixel graphics.

Screenshot of an interactive 3D scene
Using C# for 3D visualizations: the ILNumerics Scene Graph

Scientific Plotting: visualize your data using C#

With ILNumerics’ visualization capabilities, C# becomes the language of choice for scientists, engineers and developers who need to visualize data: Our plotting API and different kinds of plotting types (contour plots, surface plots etc.) make easy work of creating beautiful scientific visualizations.

Screenshot of a Surface Plot in ILNumerics
Scientific Plotting in .NET: A Surface Plot created with ILNumerics