
Large Object Heap Compaction – on Demand ??

In the 4.5.1 side-by-side update of the .NET framework a new feature has been introduced, which will really remove one annoyance for us: Edit & Continue for 64 bit debugging targets. That is really a nice one! Thanks a million, dear fellows in “the corp”!

Another useful one: One can now investigate the return value of functions during a debug session.

Now, while both features will certainly help to create better applications by helping you to get through your debug session more quickly and conveniently, another feature was introduced which deserves a more critical look: there now exists an option to explicitly compact the large object heap (LOH) during garbage collections. MSDN says:

If you assign the property a value of GCLargeObjectHeapCompactionMode.CompactOnce, the LOH is compacted during the next full blocking garbage collection, and the property value is reset to GCLargeObjectHeapCompactionMode.Default.

Hm… They state further:

You can compact the LOH immediately by using code like the following:

GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
GC.Collect(); 
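For readers who want to try the quoted snippet in isolation, here is a minimal console sketch. The class and method names are made up for illustration, and the allocation sizes are arbitrary values above the LOH threshold of 85,000 bytes. It prints the property before and after the collection; according to the documentation quoted above, the value should be back at Default afterwards.

```csharp
using System;
using System.Runtime;

static class LohCompactionDemo
{
    // Requests a one-time LOH compaction, forces a full blocking collection
    // and returns the compaction mode that is active afterwards.
    public static GCLargeObjectHeapCompactionMode CompactLohNow()
    {
        GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
        GC.Collect();
        return GCSettings.LargeObjectHeapCompactionMode;
    }

    static void Main()
    {
        // Create some large garbage around a surviving large object.
        byte[] keep = new byte[200000];
        for (int i = 0; i < 10; i++)
        {
            byte[] garbage = new byte[200000 + i * 1000];
            garbage[0] = 1;
        }

        Console.WriteLine("before: " + GCSettings.LargeObjectHeapCompactionMode);
        Console.WriteLine("after:  " + CompactLohNow());
        GC.KeepAlive(keep);
    }
}
```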

Ok. Now, it looks like there has been quite some demand for ‘a’ solution for a serious problem: LOH fragmentation. This basically happens all the time when large objects are created within your application, released, created again and released again… you get the point: a disadvantageous allocation pattern with ‘large’ objects will almost certainly lead to holes in the heap. The holes are left behind by reclaimed objects while other objects still reside in the same chunk, so the chunk is not given back to the memory manager and OutOfMemoryExceptions are thrown rather early …

If all this sounds new and confusing to you – no wonder! This is probably because you are using ILNumerics :) Its memory management reliably saves you from having to deal with these issues. How? Heap fragmentation is caused by garbage. And the best way to handle garbage is to avoid it, right? This is especially true for large objects and the .NET framework. And how would one avoid garbage? By reusing your plastic bags until they start disintegrating and your eggs get in danger of falling through (and switching to a solid basket afterwards, I guess).

In terms of computers this means: reuse your memory instead of throwing it away! Throwing away large objects puts way too much pressure on the garbage collector, and in the end it doesn’t even help, because fragmentation still builds up on the heap. For ‘reusing’ we must save the memory (i.e. large arrays in our case) somewhere. This directly leads to a pooling strategy: once an ILArray is not used anymore, its storage is kept safe in a pool and used for the next ILArray.

That way, no fragmentation occurs! And just as in real life, keeping the environment clean gives you even more advantages. It helps the caches by presenting recently used memory and it protects the application from having to waste half the execution time in the GC. Luckily, the whole pooling in ILNumerics works completely transparently in the background. There is nothing one needs to do in order to gain all advantages, except following the simple rules of writing ILNumerics functions. ILNumerics keeps track of the lifetime of the arrays, saves their underlying System.Arrays in the ILNumerics memory pool, and finds and returns any suitable array for the next computation from there.

The pool is smart enough to learn what ‘suitable’ means: if no array is available with exactly the requested length, the next larger array will do just as well:

public ILRetArray<double> CreateSymm(int m, int n) {
    using (ILScope.Enter()) {
        ILArray<double> A = rand(m,n); 
        // some very complicated stuff here...
        A = A * A + 2.3; 
        return multiply(A,A.T);
    }
}

// use this function without worrying about your heap!
while (true) {
   dosomethingWithABigMatrix(CreateSymm(1000,2000)); // one can even vary the sizes here!
   // at this point, your heap is clean ! No fragmentation! No GC gen.2 collections ! 
}
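To make the pooling idea more tangible, here is a deliberately simplified sketch of a pool that hands out the smallest free array that is at least as long as requested. This is not the ILNumerics implementation: `SimpleArrayPool`, `Rent` and `Return` are hypothetical names chosen for this illustration, and a real pool would also have to deal with thread safety and with limiting its own size.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical toy pool for illustration only, NOT the ILNumerics memory pool.
sealed class SimpleArrayPool
{
    // Free arrays grouped by length; the sorted keys let us find
    // the smallest array that is still large enough.
    private readonly SortedDictionary<int, Stack<double[]>> _free =
        new SortedDictionary<int, Stack<double[]>>();

    // Returns a pooled array with at least minLength elements,
    // or allocates a new one if the pool has nothing suitable.
    public double[] Rent(int minLength)
    {
        foreach (KeyValuePair<int, Stack<double[]>> entry in _free) // ascending by length
        {
            if (entry.Key >= minLength && entry.Value.Count > 0)
                return entry.Value.Pop();
        }
        return new double[minLength];
    }

    // Hands an array back to the pool instead of leaving it to the GC.
    public void Return(double[] array)
    {
        Stack<double[]> stack;
        if (!_free.TryGetValue(array.Length, out stack))
        {
            stack = new Stack<double[]>();
            _free.Add(array.Length, stack);
        }
        stack.Push(array);
    }
}

static class PoolDemo
{
    static void Main()
    {
        SimpleArrayPool pool = new SimpleArrayPool();
        double[] big = pool.Rent(2000000);
        pool.Return(big);

        // A smaller request is served by the larger array already in the pool.
        double[] again = pool.Rent(1500000);
        Console.WriteLine(ReferenceEquals(big, again)); // prints True
        Console.WriteLine(again.Length);                // prints 2000000
    }
}
```

Note that the rented array may be longer than requested, so the caller has to keep track of the logical length separately.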

Keep in mind: the next time you encounter an unexpected OutOfMemoryException, you can either go out and try to make use of that obscure GCSettings.LargeObjectHeapCompactionMode property, or … simply start using ILNumerics and forget about that problem for good.

ILNumerics Language Features: Limitations for C#, Part II: Compound operators and ILArray

A while ago I blogged about why the CSharp var keyword cannot be used with local ILNumerics arrays (ILArray<T>, ILCell, ILLogical). This post is about the other one of the two main limitations on C# language features in ILNumerics: the use of compound operators in conjunction with ILArray<T>. In the online documentation we state the rule as follows:

The following features of the C# language are not compatible with the memory management of ILNumerics and its use is not supported:

  • The C# var keyword in conjunction with any ILNumerics array types, and
  • Any compound operator, like +=, -=, /=, *= a.s.o. Exactly spoken, these operators are not allowed in conjunction with the indexer on arrays. So A += 1; is allowed. A[0] += 1; is not!

Let’s take a closer look at the second rule. Most developers think of compound operators as being just syntactic sugar for some common expressions:

int i = 1;
i += 2;

… would simply expand to:

int i = 1;
i  = i + 2; 

For simple types such as an integer variable, the actual effect will be indistinguishable from that expectation. However, compound operators introduce a lot more than that. Back in his time at Microsoft, Eric Lippert blogged about those subtleties. The article is worth reading for a deep understanding of all side effects. In the following, we will focus on the single fact which becomes important in conjunction with ILNumerics arrays: when used with a compound operator, i in the example above is only evaluated once! In contrast, in i = i + 2, i is evaluated twice.
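The difference becomes visible as soon as the evaluated expression has a side effect. The following small sketch uses plain C# arrays only; `NextIndex` is a helper invented for this illustration that counts how often the index expression gets evaluated.

```csharp
using System;

static class CompoundEvaluationDemo
{
    static int calls;

    // Index expression with a visible side effect: counts its own evaluations.
    static int NextIndex()
    {
        calls++;
        return 0;
    }

    public static int CountCompound()
    {
        int[] data = { 1, 2, 3 };
        calls = 0;
        data[NextIndex()] += 2;                     // index expression evaluated once
        return calls;
    }

    public static int CountExpanded()
    {
        int[] data = { 1, 2, 3 };
        calls = 0;
        data[NextIndex()] = data[NextIndex()] + 2;  // index expression evaluated twice
        return calls;
    }

    static void Main()
    {
        Console.WriteLine(CountCompound());  // prints 1
        Console.WriteLine(CountExpanded());  // prints 2
    }
}
```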

Evaluating an int does not cause any side effects. However, on more complex types, the evaluation may cause side effects. An expression like the following:

ILArray<double> A = 1;
A += 2;

… evaluates to something similar to this:

ILArray<double> A = 1;
A = (ILArray<double>)(A + 2); 

There is nothing wrong with that! A += 2 will work as expected. Problems arise if we include indexers on A:

ILArray<double> A = ILMath.rand(1,10);
A[0] += 2;
// this transforms to something similar to the following: 
var receiver = A; 
var index = (ILRetArray<double>)0;
receiver[index] = receiver[index] + 2; 

In order to understand what exactly is going on here, we need to take a look at the definition of indexers on ILArray:

public ILRetArray<ElementType> this[params ILBaseArray[] range] { ... 

The indexer expects a variable length array of ILBaseArray. This gives the most flexibility for defining subarrays in ILNumerics. Indexers allow not only scalars of built-in system types as in our example, but arbitrary ILArray and string definitions. In the expression A[0], 0 is implicitly converted to a scalar ILNumerics array before the indexer is invoked. Thus, a temporary array is created as argument. Keep in mind that, due to the memory management of ILNumerics, all such implicitly created temporary arrays are disposed of immediately after their first use.

Since both the indexing expression 0 and the object the indexer is defined on (i.e. A) are evaluated only once, we run into a problem: index is needed twice. At first, it is used to acquire the subarray at receiver[index]. The indexer get { ... } function is used for that. Once it returns, all input arguments are disposed, which is an important foundation of the memory efficiency of ILNumerics! Therefore, if we invoke the index setter function with the same index variable, it will find the array disposed already and throw an exception.
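The mechanism can be reproduced with a small toy model that does not use ILNumerics at all. `TempIndex` and `ToyArray` below are hypothetical stand-ins invented for this sketch: `TempIndex` mimics a temporary array that may be used exactly once, and the indexer of `ToyArray` consumes its argument in both get and set. The real ILNumerics types are more involved, so take this as an assumption-laden illustration of the principle only.

```csharp
using System;

// Hypothetical stand-in for a temporary array that lives for exactly one use.
sealed class TempIndex
{
    private readonly int _value;
    private bool _disposed;

    public TempIndex(int value) { _value = value; }

    // Reads the value and marks the object as disposed ("dispose after first use").
    public int Consume()
    {
        if (_disposed) throw new ObjectDisposedException("TempIndex");
        _disposed = true;
        return _value;
    }

    // Scalars are converted implicitly, just like 0 in A[0].
    public static implicit operator TempIndex(int value) { return new TempIndex(value); }
}

// Hypothetical array type whose indexer consumes its index argument.
sealed class ToyArray
{
    private readonly double[] _data = new double[10];

    public double this[TempIndex index]
    {
        get { return _data[index.Consume()]; }
        set { _data[index.Consume()] = value; }
    }
}

static class DisposedIndexDemo
{
    public static bool CompoundThrows()
    {
        ToyArray a = new ToyArray();
        try
        {
            a[0] += 2;      // ONE TempIndex is created and handed to get AND set
            return false;
        }
        catch (ObjectDisposedException)
        {
            return true;
        }
    }

    public static bool ExpandedThrows()
    {
        ToyArray a = new ToyArray();
        try
        {
            a[0] = a[0] + 2; // two separate TempIndex objects, one per access
            return false;
        }
        catch (ObjectDisposedException)
        {
            return true;
        }
    }

    static void Main()
    {
        Console.WriteLine(CompoundThrows());  // prints True
        Console.WriteLine(ExpandedThrows());  // prints False
    }
}
```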

It would certainly be possible to circumvent that behavior by converting scalar system types to ILArray instead of ILRetArray:

ILArray<double> A = ...;
A[(ILArray<double>)0] += 2;

However, the much less expressive syntax aside, this would not solve our problem in general either. The reason lies in the flexibility required for the indexer arguments. The user would have to ensure manually that all arguments in the indexer argument list are of some non-volatile array type. Casting to ILArray<T> might be an option in some situations. However, in general, compound operators require much more attention due to the efficient memory management in ILNumerics. We considered the risk of failing to provide only non-volatile arguments too high. So we decided not to support compound operators on indexers at all.

See: General Rules for ILNumerics, Function Rules, Subarrays

AnyCPU Computing, limping Platform Specific Targets and a Happy Deployment End

EDIT 2014-10-02: The information in this article relates to ILNumerics Professional Edition, which is no longer maintained. However, the deployment scheme similarly holds for its successor: ILNumerics Ultimate VS. The goal is to make platform specific native binaries available at runtime. For ILNumerics Ultimate VS the number of such native libs has increased: next to mkl_custom.dll there are several native HDF5 related DLLs which need to be deployed with ILNumerics. All libs are found in the distribution package of any ILNumerics Ultimate VS Edition which allows distribution.

One issue has accompanied ILNumerics for long enough now. It is an issue which prevented ILNumerics from deploying to multiple platform targets seamlessly. It completely prevented designer support for visualization applications when targeting 64 bit. It prevented a developer from easily switching between 32 bit and 64 bit targets in Visual Studio for testing purposes. And it caused (no wonder) a whole bunch of confusion among our users and a correspondingly huge number of support requests: native dependencies.

AnyCPU in the Wild

It’s been a sad story from the beginning. There is that great feature of .NET which they call the ‘AnyCPU’ platform target. The idea is simple: one creates an application once and simply deploys it to any platform supporting .NET, regardless of whether the target computer runs on x86 or x64 or … ok, let’s stop here for now ;)! The idea is as simple as it is successful in the wild. Platform specific differences between 32 and 64 bit environments are transparently abstracted away by the .NET languages (see: IntPtr) and the CLR. It is all good and fine … until native dependencies come into play.

As the name suggests, native DLLs are not managed. They are compiled from pure unmanaged code. They do no abstraction work, nor do they support such attempts by the CLR. Most of the time (at least regarding the ILNumerics native dependencies) they incorporate all the nifty pointer calculations which squeeze out the last bit of performance, along with all the danger that eventually led us to move away from C/C++, right? Nevertheless, we sometimes still need those native libs, even though the number of such places is decreasing. All visualizations in ILNumerics run purely managed. Since version 3.0 we have provided our own purely managed matrix multiplication. It works similarly to the famous GotoBLAS and handles even the largest matrices in a very cache friendly and efficient way. It uses almost all tricks the MKL utilizes as well. And it beats all other managed implementations known to us by a considerable factor. However, it does not utilize AVX extensions (yet). Hence, it still lags behind the MKL …

That’s where the Hassle starts

So we sometimes need native dependencies. What is the best way to incorporate them into your project without also incorporating all their disadvantages? We certainly do not want the whole application to be tied to a specific target platform just because it utilizes some routines from a native DLL! We want ILNumerics to target ‘AnyCPU’ and let the final application and the machine it eventually runs on decide the bitness. The problem with this approach is that we need different native binaries for every platform. And even worse, the dependencies must be visible to the application at runtime.

A common deployment scheme for such native DLLs is to simply place them next to your application assembly in the same folder. When a module is loaded by .NET (and Windows in general), the loader first looks for matching modules in the folder where the application itself resides. This simple scheme is sufficient for most cases, provided the target platform is known in advance! However, when it comes to AnyCPU targets, this is not the case. We simply do not know whether the application will eventually run as a 64 bit or as a 32 bit process.

Placing all dependencies for both 32 bit and 64 bit into the execution folder obviously does not improve the situation either. The Intel® MKL for example is compiled to arbitrarily named DLLs. However, while the entry DLL can be given an individual name, differentiating between 32 and 64 bit, this is not true for dependencies of those dependencies. ‘libiomp5md.dll’ is needed by both. And it would require a serious dive into the MKL linking scheme to have individual mkl_custom.dll’s reference individually named dependencies. Hence, we cannot place all DLLs for all targets next to each other in the same folder. The former deployment scheme of ILNumerics (prior to version 3.2) used a naming scheme in order to solve those conflicts. But in the end, this did not really solve the issue but only helped to prevent accidentally mixing up files of different platforms, and not without introducing new problems…

Introducing an old Pal: %PATH%

Several ‘official’ solutions are proposed for the problem:
1. Installations. The maintainer of the application (you) takes care of selecting the correct binaries at installation time and installs the application for a specific target. Alternatively, on 64 bit systems, where applications have the option to run as both 32 bit and 64 bit processes, native dependencies are placed (‘installed’) into the corresponding system directories. All 64 bit DLLs go into %SystemRoot%\system32 (this is not a mistake) and all 32 bit DLLs go into %SystemRoot%\SysWoW64. Don’t blame me for the naming confusion. It is a good example of derived compatibility problems and actually makes sense, just not at first sight.

If at runtime the assembly loader attempts to load a native dependency, it looks into these individual directories as well. Which one it looks into depends on the bitness of the current process. Going that way, dependencies with similar names are nicely separated and everything goes well. Obviously, administrative rights are necessary to store the DLLs in those system folders. And, unless one is really careful, this may become the entry to the famous DLL hell…

EDIT: This is the deployment scheme used by all editions of ILNumerics Ultimate VS. It installs all CLR assemblies into the GAC and makes all native DLLs immediately available for them at runtime. Use this scheme for your deployment if possible! We recommend using Windows Installer technology in order to realize a smooth deployment experience for your users.

2. AppDomain.AssemblyResolve. This is the .NET way of dealing with the issue. Unfortunately, it introduces a whole chain of other issues which are not obvious at first sight. But the biggest argument against it is the simplicity and beauty of the third option:

3. The environment path method. It has long been established that modules are searched for in several directories, including those which are listed in the PATH environment variable. This offers an easy yet efficient way of dealing with native binaries for different platforms. All we have to do is make sure that the native dependencies are separated into individual (sub-)folders. At application or library startup the bitness of the current process is examined and the PATH environment variable is altered accordingly to include the correct directory. Several variants of that approach exist. One of them is to preload the correct version of a DLL from a subdirectory at startup and let the assembly loader cache handle repeated load attempts. However, due to its simplicity we stick to the PATH environment variable method. PATH can be altered even in a medium trust environment (which is important for our Web Code Component and ASP.NET in general). It needs some attention once at startup for the whole library but does not require special handling for individual dependencies afterwards.
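The environment path method boils down to a few lines of startup code. The following is a minimal sketch of the idea, not the code ILNumerics actually runs: the class and method names are invented for this example, and the subfolder names bin32 and bin64 follow the package layout described in the next section. It only relies on `Environment.Is64BitProcess` (available since .NET 4.0) and `Environment.SetEnvironmentVariable`, which changes the variable for the current process only.

```csharp
using System;
using System.IO;

static class NativePathSetup
{
    // Returns the subfolder holding the native binaries for the given bitness.
    public static string NativeFolderName(bool is64BitProcess)
    {
        return is64BitProcess ? "bin64" : "bin32";
    }

    // Prepends <baseDirectory>\bin32 or <baseDirectory>\bin64 to the PATH
    // of the current process and returns the directory that was added.
    public static string PrependNativeFolderToPath(string baseDirectory)
    {
        string nativeDir = Path.Combine(baseDirectory, NativeFolderName(Environment.Is64BitProcess));
        string oldPath = Environment.GetEnvironmentVariable("PATH") ?? string.Empty;
        Environment.SetEnvironmentVariable("PATH", nativeDir + Path.PathSeparator + oldPath);
        return nativeDir;
    }

    static void Main()
    {
        // Must run before the first call into any native dependency.
        string dir = PrependNativeFolderToPath(AppDomain.CurrentDomain.BaseDirectory);
        Console.WriteLine("Native binaries are searched for in: " + dir);
    }
}
```

Prepending rather than appending makes sure that the folder matching the current process wins over any identically named DLL found elsewhere on the PATH.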

Manually importing ILNumerics binaries into your project

Now let’s get our hands dirty! Here comes the new file scheme for deploying ILNumerics and its native dependencies. The whole setup is simplified dramatically by using the NuGet packages, which is described at the end of this post. The following manual steps are only required if you cannot or don’t want to use the NuGet package manager for some reason. Such rare cases are: you want to use the source code distribution of the Community Edition (GPLv3), or you want to set up ILNumerics without access to the official NuGet repository.

The deploy package of the Professional Edition shows the following scheme:

/ - Root folder of the zip package
- bin
|- bin32 - 32 bit (Windows and Linux) native dependencies
|- bin64 - 64 bit (Windows) native dependencies
|- ILNumerics.dll - AnyCPU .NET merged assembly, used for all targets
|- ILNumerics.dll.config - Config file, needed for Linux plotting only
|- ILNumerics.dll.xml - Intellisense support file for Visual Studio
- doc - documentation folder, changelog and offline documentation
- ... (other files not considered here)

Other editions have a similar file structure. The important parts: There are two binary folders ‘bin32’ and ‘bin64’. Each includes all native binaries necessary for the corresponding platform. The binaries for 32 bit and for 64 bit may have the same names but are strictly separated into individual folders.

If you are targeting a single platform only (let’s say: x86) you might reuse the old deployment scheme: take the binaries from the bin32 folder and make sure they are found at runtime by ILNumerics.dll. So one might choose to simply place the binaries next to ILNumerics.dll. This old scheme will still work! However, in order to enable real multi-platform target support for both 32 and 64 bit, follow the steps below:

Steps to incorporate ILNumerics as multi-platform target (AnyCPU) into your existing project:
1. Extract the whole distribution package into a directory on your hard disk.
2. Add a reference to ILNumerics.dll in the package as regular managed library reference for your project. Visual Studio copies the xml intellisense documentation and the corresponding .config file automatically.
3. Use Windows Explorer to copy both: bin32/ and bin64/ including all their content into the root folder of your new project. The root folder commonly is the one the *.csproj file lives in.
4. Back to Visual Studio, open the Solution Explorer and click on the ‘Show All Files’ icon. This will make all files from the directory visible – regardless if they are part of the VS project or not.
5. Find both folders (bin32 and bin64), right click and select: ‘Include In Project’.
6. Expand the content of both folders and select all DLLs contained. Press F4 to open the Property tool window.
7. Make sure to select ‘Copy to Output Directory’ -> ‘Copy If Newer’.

Your project is now set up to run with ILNumerics as AnyCPU target! You may try it out by simply switching the project target between x86, x64 (and AnyCPU if you like). In order to test whether the native binaries are available at runtime, run the following snippet somewhere in your code:

ILNumerics.ILArray A = ILNumerics.ILMath.fft(ILNumerics.ILMath.rand(100,200));

FFT in ILNumerics (still) depends on the native MKL binaries. Hence, this code fails if they are not found at runtime. Make sure that all needed platform targets are working by switching the application targets in the Project Properties in Visual Studio and rerunning your application.

The Happy End – Recommended Way of Importing ILNumerics

Now it is certainly nice to have a setup with native binaries which runs on every platform target without ever having to exchange native DLLs manually. However, the setup can be simplified further. NuGet comes in handy here. By utilizing the NuGet package manager the setup of ILNumerics for your project boils down to the following three simple steps:

1. Right Click on the project in the Visual Studio Solution Explorer. Select ‘Manage NuGet Packages’
2. Search for available packages by name: ‘ILNumerics’.
3. From the list of found packages, select ‘ILNumerics’ (not ‘ILNumerics32’ nor ‘ILNumerics64’, these are deprecated now!) and install the ‘ILNumerics’ package.

If you are familiar with our older packages, you will notice that ILNumerics is now split into two individual packages: ILNumerics and ILNumerics.Native. The former basically consists of ILNumerics.dll (AnyCPU) only. It is a purely managed assembly, merged with several other .NET assemblies needed by the ILNumerics visualization part. This package does not include any native binaries. These come into play as the dependency package ‘ILNumerics.Native’, referenced from the main ILNumerics package. It is automatically loaded when the main ILNumerics package is referenced.

NuGet does all the work described above in the manual setup for us: referencing the managed DLL, copying the bin32 and bin64 folder, including them into the project and making sure that the native binaries are deployed to your project output directory.

Note that the new AnyCPU target support is available from version 3.2. It replaces the old (platform specific) deployment scheme immediately. This is a breaking change for all users who have relied on the NuGet packages ILNumerics.32Bit or ILNumerics.64Bit. Both old packages are deprecated now. We recommend switching to the new deployment scheme soon.

Download ILNumerics here! Please report back any problems you may find or any restrictions the new scheme may introduce for your setup. Thanks!

Fresh 3D Plotting API for Programmers and Scientists (CTP)

ILNumerics opens the Community Technology Preview for a new 3D Plotting API for interested users. Fetch the package from here:

http://ilnumerics.net/img/ILNumerics_CTPversion3d.zip

The API is already pretty stable. Nevertheless, there is still a lot of work to be done until the final release. Please report back any issues and comments! Your feedback is highly appreciated and gives you the chance to considerably influence the development at this early stage! Simply use the commenting feature here on this blog or mail to: info@ilnumerics.net. Thanks!

EDIT 2013-05-27: The newest release preview is the second official Release Candidate for Version 3:

http://ilnumerics.net/release/EUJJRWPO225PlHDjW09/ILNumerics_v3.0_RCf.zip

Since a relevant group of users encountered problems with buggy OpenGL drivers, we have tried to make the OpenGL rendering more robust. The current RC especially handles the buggy implementation of gl_ClipDistance[] on GT 3xx (M) NVIDIA cards. Some of them suffer from the lack of updated drivers since 2011 because NVIDIA does not plan to provide updates for Hybrid Power technology with Intel chipsets … :| The newest version expects a stable and standard-conforming implementation of OpenGL 3.1. If it encounters errors, it will switch to a more error tolerant shader version automatically.

We would appreciate any feedback: does this version run on your hardware? Does it produce any errors? Does it correctly fall back to GDI on unsupported graphics cards?

Thanks

EDIT 2013-05-30: Due to a lot of very useful feedback and bug reports, we prepared another update for the release candidate. No new features, we just made it more stable and robust against driver issues. So the ILPanel now always tries to establish an OpenGL rendering context. It even tries to circumvent common driver issues. If everything fails, it automatically switches back to plain GDI+ rendering as a last resort:

http://ilnumerics.net/release/EUJJRWPO225PlHDjW09/ILNumerics_v3.0_RCg.zip

Please keep the feedback coming! :) Thanks

Using ILArray as Class Attributes

Update: Information in this article relates to older versions of ILNumerics. For version 5 and later updated information is found here: https://ilnumerics.net/ClassRules.html.

A lot of people are confused about how to use ILArray as class member variables. The documentation is really sparse on this topic. So let’s get into it!

Take the following naive approach:

class Test {

    ILArray<double> m_a;

    public Test() {
        using (ILScope.Enter()) {
            m_a = ILMath.rand(100, 100);
        }
    }

    public void Do() {
        System.Diagnostics.Debug.WriteLine("m_a:" + m_a.ToString());
    }

}

If we run this:

    Test t = new Test();
    t.Do();

… we get … an exception :( Why that?

ILNumerics Arrays as Class Attributes

We start with the rules and explain the reasons later.

  1. If an ILNumerics array is used as class member, it must be a local ILNumerics array: ILArray<T>
  2. Initialization of those types must utilize a special function: ILMath.localMember<T>
  3. Assignments to the local variable must utilize the .a property (.Assign() function in VB)
  4. Classes with local array members should implement the IDisposable interface.
  5. UPDATE: it is recommended to mark all ILArray local members as readonly

By applying the rules 1..3, the corrected example reads:

class Test {

    ILArray<double> m_a = ILMath.localMember<double>();

    public Test() {
        using (ILScope.Enter()) {
            m_a.a = ILMath.rand(100,100);
        }
    }

    public void Do() {
        System.Diagnostics.Debug.WriteLine("m_a:" + m_a.ToString());
    }

}

This time, we get, as expected:

m_a:<Double> [100,100]
   0,50272    0,21398    0,66289    0,75169    0,64011    0,68948    0,67187    0,32454    0,75637    0,07517    0,70919    0,71990    0,90485    0,79115    0,06920    0,21873    0,10221 ...
   0,73964    0,61959    0,60884    0,59152    0,27218    0,31629    0,97323    0,61203    0,31014    0,72146    0,55119    0,43210    0,13197    0,41965    0,48213    0,39704    0,68682 ...
   0,41224    0,47684    0,33983    0,16917    0,11035    0,19571    0,28410    0,70209    0,36965    0,84124    0,13361    0,39570    0,56504    0,94230    0,70813    0,24816    0,86502 ...
   0,85803    0,13391    0,87444    0,77514    0,78207    0,42969    0,16267    0,19860    0,32069    0,41191    0,19634    0,14786    0,13823    0,55875    0,87828    0,98742    0,04404 ...
   0,70365    0,52921    0,22790    0,34812    0,44606    0,96938    0,05116    0,84701    0,89024    0,73485    0,67458    0,26132    0,73829    0,10154    0,26001    0,60780    0,01866 ...
...

If you came to this post while looking for a short solution to an actual problem, you may stop reading here. The scheme will work out fine, if the rules above are blindly followed. However, for the interested user, we’ll dive into the dirty details next.

Some unimportant Details

Now, let’s inspect the reasons behind the rules. They are somewhat complex and most users can safely ignore them. But here they are:

The first rule is easy. Why should one use anything other than a local array? So let’s step to rule two:

  • Initialization of those types must utilize a special function: ILMath.localMember<T>

A fundamental mechanism of the ILNumerics memory management is related to the lifetime associated with certain array types. All functions return temporary arrays (ILRetArray<T>) which live for exactly one use only. After the first use, they get disposed of automatically. In order to make use of such arrays multiple times, one needs to assign them to a local variable. This is the place where they get converted and the underlying storage is taken over by the local, persistent array variable.

At the same time, we need to make sure that the array is released after the current ILNumerics scope (using (ILScope.Enter()) { … }) has been left. Therefore, the conversion to a local array is used: since we know during the conversion that a new array is about to come into existence, we track that new array for later disposal in the current scope.

If the scope is left, it does exactly what it promises: it disposes of all arrays created since its creation. Now, local array members require a different behavior. They commonly live for the lifetime of the class instance, not of the current ILNumerics scope. In order to prevent the local array from getting cleaned up after the scope in the constructor body has been left, we need something else.

The ILMath.localMember<T>() function is the only exception to the rule. It is the only function which does not return a temporary array, but a local array. In fact, the function is more than simple. All it does is create a new ILArray<T> and return it. Since the types on both sides of the assignment match, no conversion is necessary and the new array is not registered in the current scope. Hence it is not disposed of: just what we need!

What if we have to assign the return value of some function to the local array? Here, the next rule comes in:

  • Assignments to the local variable must utilize the .a property (.Assign() function in VB)

Assigning to a local array directly would activate the disposal mechanism described above. Hence, in order to prevent this for a longer living class attribute, one needs to assign to the variable via the .a property. In Visual Basic, the .Assign() function does the same. This will prevent the array from getting registered into the scope.
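The interplay of scope, conversion and the .a property can be imitated with a small toy model. Everything in the following sketch (`ToyScope`, `ToyTemp`, `ToyLocal`, `Member()`) is hypothetical and invented for illustration; it is written from the description above and is not how ILNumerics is actually implemented. It only demonstrates why an array created through a tracked conversion dies with the scope, while an untracked member that receives its storage through a property setter survives.

```csharp
using System;
using System.Collections.Generic;

// Toy scope: disposes every array that was registered while it was active.
sealed class ToyScope : IDisposable
{
    private static readonly Stack<ToyScope> Scopes = new Stack<ToyScope>();
    private readonly List<ToyLocal> _tracked = new List<ToyLocal>();

    public static ToyScope Enter()
    {
        ToyScope scope = new ToyScope();
        Scopes.Push(scope);
        return scope;
    }

    public static void Track(ToyLocal array)
    {
        if (Scopes.Count > 0) Scopes.Peek()._tracked.Add(array);
    }

    public void Dispose()
    {
        foreach (ToyLocal array in _tracked) array.Release();
        Scopes.Pop();
    }
}

// Toy counterpart of a temporary array returned by a function.
sealed class ToyTemp
{
    public readonly double[] Storage;
    public ToyTemp(double[] storage) { Storage = storage; }
}

// Toy counterpart of a persistent local array.
sealed class ToyLocal
{
    private double[] _storage;

    public bool IsDisposed { get { return _storage == null; } }

    // Plain assignment: the conversion creates a NEW local and registers it.
    public static implicit operator ToyLocal(ToyTemp temp)
    {
        ToyLocal local = new ToyLocal { _storage = temp.Storage };
        ToyScope.Track(local);
        return local;
    }

    // Counterpart of localMember<T>(): created directly, never registered.
    public static ToyLocal Member() { return new ToyLocal { _storage = new double[0] }; }

    // Counterpart of the .a property: takes over the storage, no registration.
    public ToyTemp a { set { _storage = value.Storage; } }

    public void Release() { _storage = null; }
}

static class ScopeDemo
{
    static ToyTemp Rand(int n) { return new ToyTemp(new double[n]); }

    public static bool NaiveMemberSurvives()
    {
        ToyLocal member;
        using (ToyScope.Enter()) { member = Rand(100); }   // converted, hence tracked
        return !member.IsDisposed;
    }

    public static bool RuleMemberSurvives()
    {
        ToyLocal member = ToyLocal.Member();
        using (ToyScope.Enter()) { member.a = Rand(100); } // untracked
        return !member.IsDisposed;
    }

    static void Main()
    {
        Console.WriteLine(NaiveMemberSurvives()); // prints False
        Console.WriteLine(RuleMemberSurvives());  // prints True
    }
}
```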

Example ILNumerics Array Utilization Class

Now that we have managed to prevent our local array attribute from getting disposed of magically, we should, for the sake of completeness, make sure it gets disposed somewhere. The recommended way of disposing of things in .NET is … the IDisposable interface. In fact, for most scenarios, IDisposable is not necessary. The array would be freed once the application is shut down. But we recommend implementing IDisposable, since it makes a lot of things more consistent and less error prone. However, we provide the IDisposable interface for convenience reasons only. We do not rely on it like we would for the disposal of unmanaged resources. Therefore, a simplified version is sufficient here and we can omit the finalizer method for the class.

Here comes the full test class example, having all rules implemented:

class Test : IDisposable {
    // declare local array attribute as ILArray<T>,
    // initialize with ILMath.localMember<T>()!
    readonly ILArray<double> m_a = ILMath.localMember<double>();

    public Test() {
        using (ILScope.Enter()) {
            // assign via .a property only!
            m_a.a = ILMath.rand(100,100);
        }
    }

    public void Do() {
        // assign via .a property only!
        m_a.a = m_a + 2; 

        System.Diagnostics.Debug.WriteLine("m_a:" + m_a.ToString());
    }

    #region IDisposable Members
    // implement IDisposable for the class for transparent
    // clean up by the user of the class. This is for
    // convenience only. No harm is done by omitting the
    // call to Dispose().
    public void Dispose() {
        // simplified disposal pattern: we allow
        // calling dispose multiple times or not at all.
        if (!ILMath.isnull(m_a)) {
            m_a.Dispose();
        }
    }

    #endregion
}

For the user of your class, this brings one big advantage: she can – without knowing the details – clean up its storage easily.

    using (Test t = new Test()) {
        t.Do();
    }

@UPDATE: by declaring your ILArray members as readonly, you gain the convenience that the compiler will prevent you from accidentally assigning to the member somewhere in the code. The other rules must still be fulfilled. But by only using readonly ILArray<T>, the rest follows almost automatically.

ILArray, Properties and Lazy Initialization

@UPDATE2: Another common usage pattern for local class attributes is to delay the initialization until the first use. Let’s say an attribute requires costly computations but is not always needed. One would usually create a property and compute the attribute value only in the get accessor:

class Class {

    // attribute, initialization is done in the property get accessor
    Tuple<int> m_a;

    public Tuple<int> A {
        get {
            if (m_a == null) {
                m_a = Tuple.Create(1);  // your costly initialization here
            }
            return m_a;
        }
        set { m_a = value; }
    }
}

How does this scheme get along with ILNumerics’ ILArray? Pretty well:

class Class1 : ILMath, IDisposable {

    readonly ILArray<double> m_a = localMember<double>();

    public ILRetArray<double> A {
        get {
            if (isempty(m_a)) {
                m_a.a = rand(1000, 2000); // your costly initialization here
            }
            return m_a; // this will only return a lazy copy to the caller!
        }
        set {
            m_a.a = value;
        }
    }

    public void Dispose() {
        // ... common dispose implementation
    }
}

Instead of checking for null in the get accessor, we simply check for an empty array. Alternatively, you may initialize the attribute with some marker value in the constructor. NaN, MinValue or 0 might be good candidates.

Putting on a Good Show with HDF5, ILNumerics, and PowerShell

It is certainly nice to have the option to do all kinds of numeric stuff right in your .NET application layer – without the need for interfacing any unmanaged module. But for some tasks, this still seems overkill.

Let’s say you went to that conference and want to give your new friends some insight into your brand new simulation results. The PC in the internet cafe lets you fetch the data from your NAS storage at home. But will you be able to do anything with it on that plain Windows PC?

Or you want to locate a certain test data set but cannot remember its rather cryptic name. Or you might want to manage the latest measurement results from today’s atmospheric observation satellite scans. The data are huge and often require some sort of preprocessing. There should be some easy way to filter them by the metadata within the files, right?

Rather than getting the data from some application layer, we now want to work with plain old files directly. Of course, you store your data in HDF5 format, right? You do so because HDF5 is portable, very efficient and flexible, and because you are in good company.

Let’s see. We have a fresh Windows PC and we know every Windows installation nowadays comes with Powershell. Powershell itself is based on the .NET framework and hence efficiently handles any .NET assembly. It should be easy to use ILNumerics with Powershell! All we still need is some way to access the HDF5 files. ILNumerics is natively able to read and write Matlab mat files up to version 6. It currently lacks native HDF5 support.

Luckily, the HDF Group provides a large collection of high quality tools for HDF support. Among them you’ll find a .NET wrapper and … a brand new Powershell module: PSH5X! Together with Gerd Heber, the leading inventor of PSH5X, we did a feasibility study with the goal of investigating the options for using HDF5 and ILNumerics together in Powershell. It can be downloaded here. We were quite impressed by the options this brings.

This blog post describes the necessary steps to set up Powershell for ILNumerics and HDF5.

Getting Started

Basically, the installation process for any Powershell module consists of

  1. Getting the module files and its dependencies from somewhere,
  2. Deploying the module files into a special folder on your machine, and
  3. Importing the module in your session.

The PSH5X homepage gives all the information on how to get ready to use the HDF5 Powershell module. Just download the package and follow the three steps on the page. At the end, HDF5 signals a successful installation by displaying its version numbers.

Since ILNumerics depends on several other modules, we provide a small bootstrapper script. Just open up your favorite Powershell IDE (PowerShell_ISE.exe comes with any recent Windows) and copy/paste the following line:

(new-object Net.WebClient).DownloadString('http://ilnumerics.net/media/InstallILNumericsPSM.ps1') | iex

If you are curious what this does, just omit the trailing | iex: the script is then not executed but displayed for your inspection.

The installer asks for the installation folder (global under System32/ or local in your user profile), fetches the latest ILNumerics package for the current platform from the official nuget repository and installs it into the selected module folder. In addition, it downloads the TypeAccelerator Powershell module and installs it into the same module directory. Note that the accelerators have been slightly modified in order to make them work with Powershell 3 and hence are fetched from our ILNumerics server. However, full credit belongs to poshoholic for his great work.

Note that the installation has to be done only once. Afterwards, in the next Powershell session, simply re-import the needed modules by typing, let’s say:

PS> Import-Module ILNumerics 

Go!

If everything was set up correctly, we can now use the full spectrum of the involved modules:

PS> [ilmath]::rand(5,4).ToString()
<Double> [5,4]
   0,72918    0,87547    0,43167    0,94942
   0,58024    0,75562    0,96125    0,83148
   0,22454    0,20583    0,82285    0,83144
   0,13300    0,40047    0,58829    0,87012
   0,50751    0,05496    0,02814    0,48764 

Nice. But what about the MKL? Are the correct binaries really installed as well?

PS> [ilf64] $A = [ilmath]::rand(1000,1000)
PS> Measure-Command { [ilf64]$C = [ilmath]::rank($A) }
Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 0
Milliseconds      : 920
Ticks             : 9202311
TotalDays         : 1,06508229166667E-05
TotalHours        : 0,00025561975
TotalMinutes      : 0,015337185
TotalSeconds      : 0,9202311
TotalMilliseconds : 920,2311

PS> $C.ToString()
1000

We have almost all options from C#:

PS> [ilf64] $part = $A['10:15;993:end']
PS> $part.ToString()
<Double> [11,7]
   0,08522    0,87217    0,59997    0,57363    0,22956    0,02006    0,02359
   0,33479    0,49003    0,65269    0,97772    0,28322    0,69505    0,70372
   0,30072    0,68705    0,47112    0,68627    0,65030    0,40454    0,63026
   0,15639    0,30391    0,22992    0,69310    0,65716    0,51797    0,68110
   0,72854    0,60188    0,50740    0,74499    0,13459    0,88481    0,12445
   0,80525    0,60180    0,69256    0,74825    0,64388    0,16792    0,45266 

Let’s sort the first two rows of $part, keeping track of the original positions:

PS> [ilf64] $indices = 0.0
PS> [ilf64] $sorted = [ilmath]::sort($part['0,1;:'],$indices,0,$false)
PS> $sorted.ToString()
<Double> [2,7]
   0,02006    0,02359    0,08522    0,22956    0,57363    0,59997    0,87217
   0,28322    0,33479    0,49003    0,65269    0,69505    0,70372    0,97772
PS> $indices.ToString()
<Double> [2,7]
         5          6          0          4          3          2          1
         4          0          1          2          5          6          3 

This is all interactive. Of course, we can write complete functions and even complex algorithms that way.
One of the best things: even in Powershell, ILNumerics saves your memory and meets all expectations regarding execution speed. Powershell allows you to apply ILNumerics’ typing and scoping rules consistently.

In our feasibility study with Gerd Heber, we show how easy it is to access an HDF5 file, to convert its data to ILNumerics arrays (implicitly), to filter and manipulate them a little and even to create a fully interactive 3D surface graph from them. We demonstrate how to use the type accelerators and how to mimic the using statement for artificial scoping. Take a look and let us know what you think!

ILNumerics and LINQ

As you may have noticed, ILNumerics arrays implement the IEnumerable interface. This makes them compatible with ‘foreach’ loops and all the nice features of LINQ!

Consider the following example: (Don’t forget to include ‘using System.Linq’ and derive your class from ILNumerics.ILMath!)

ILArray<double> A = vec(0, 10);
Console.WriteLine(String.Join(Environment.NewLine, A));

Console.WriteLine("Evens from A:"); 
var evens = from a in A where a % 2 == 0 select a; 
Console.WriteLine(String.Join(Environment.NewLine, evens));
Console.ReadKey(); 
return; 

Some people prefer the extension method syntax:

 
var evens = A.Where(a => a % 2 == 0);

I personally find both equally expressive.

Considerations for IEnumerable<T> on ILArray<T>

IEnumerable<T> offers no way to specify a dimensionality. Since ILNumerics arrays store their elements in column major order, enumerating an ILNumerics array walks the elements in storage order, i.e. along the first dimension. When used on a matrix, the enumerator therefore runs down the columns, one column after the other:

ILArray<double> A = counter(3,4); 
Console.WriteLine(A + Environment.NewLine); 
            
Console.WriteLine("IEnumerable:"); 
foreach(var a in A) 
    Console.WriteLine(a);

… will give the following:

<Double> [3,4]
         1          4          7         10
         2          5          8         11
         3          6          9         12

IEnumerable:
1
2
3
4
5
6
7
8
9
10
11
12
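The order shown above follows directly from the column major layout. As a rough illustration in plain C# (no ILNumerics types are involved; `ColumnMajorDemo` and `Enumerate` are names made up for this sketch), a linear position in a column major buffer maps to row and column indices like this:

```csharp
using System;
using System.Collections.Generic;

static class ColumnMajorDemo {
    // Walks a rows x cols matrix that is stored column by column in a flat
    // buffer and yields (row, column, value) in storage order.
    public static IEnumerable<Tuple<int, int, double>> Enumerate(double[] data, int rows, int cols) {
        if (data.Length != rows * cols)
            throw new ArgumentException("buffer size does not match dimensions");
        for (int i = 0; i < data.Length; i++) {
            int row = i % rows;   // position within the current column
            int col = i / rows;   // index of the current column
            yield return Tuple.Create(row, col, data[i]);
        }
    }

    static void Main() {
        // the values 1..12, laid out like the 3x4 matrix shown above
        double[] data = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 };
        foreach (var e in Enumerate(data, 3, 4))
            Console.WriteLine("[{0},{1}] = {2}", e.Item1, e.Item2, e.Item3);
    }
}
```

The fourth element visited (value 4) sits at row 0 of column 1, which is why the enumeration finishes one column before it moves on to the next.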

Secondly, as is well known, elements returned from IEnumerable<T> can only be accessed in a read-only manner! In order to alter elements of ILNumerics arrays, one should use the explicit API provided by our arrays. See SetValue, SetRange, A[..] = .. and GetArrayForWrite().

Lastly, performance should be considered when IEnumerable<T> is used heavily in situations where high performance computations are desired. ILNumerics integrates well with IEnumerable<T> – but how well IEnumerable<T> integrates with the memory management of ILNumerics should be investigated with the help of your favorite profiler. I would suspect that most everyday scenarios work out pretty well with LINQ, since it chains all expressions and queries and iterates the ILNumerics array only once. However, let us know your experiences!
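The single pass is a property of LINQ’s deferred execution and can be observed without ILNumerics. The following is a small sketch in plain C#; the counting `Source` method is a made-up stand-in for an enumerable array and says nothing about how ILNumerics implements its enumerator:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SinglePassDemo {
    public static int Visited;   // counts how many source elements were read

    // Stand-in for an enumerable array; counts every element handed out.
    static IEnumerable<double> Source(int n) {
        for (int i = 0; i < n; i++) {
            Visited++;
            yield return i;
        }
    }

    public static double Run(int n) {
        Visited = 0;
        // Three chained operators; nothing is evaluated until Sum() pulls
        // the elements through the whole chain, one element at a time.
        return Source(n)
            .Where(x => x % 2 == 0)
            .Select(x => x * x)
            .Sum();
    }

    static void Main() {
        double sum = Run(10);
        Console.WriteLine("sum = {0}, source elements read = {1}", sum, Visited);
    }
}
```

For ten source elements the counter ends at ten, even though three operators are chained. Calling ToArray() or ToList() between the operators would still read the source once, but would materialize an intermediate collection each time.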

Why the ‘var’ keyword is not allowed in ILNumerics

One of the rules related to the new ILNumerics memory management, which potentially causes issues, is not to use the var keyword of C#. In Visual Basic, similar functionality is available by omitting the explicit type declarations for array types.

ILArray<double> A = rand(5,4,3); // explicit array type declaration required
var B = rand(5,4,3); // NOT ALLOWED!

Let’s take a look at the reasons why this – otherwise convenient – language feature is not allowed in conjunction with array declarations. In order to make this clear, a little excursion into the internals of the ILNumerics memory management is needed.

In ILNumerics, local arrays are of one of the types ILArray<T>, ILLogicalArray, ILCell. Those array types are one building block of the memory management in ILNumerics. By using one of those array types in a function, one can be sure that the array is kept alive and available as long as the function has not been left. On the other hand, as soon as the function is left, the array will be recycled immediately.

Other array types exist in ILNumerics. They serve different purposes regarding their lifetime and mutability. Input arrays like ILInArray<T>, for example, make sure that arrays given as function parameters cannot be altered.

Another important type is the return type of any function. Every function, property or operator in ILNumerics returns arrays of either ILRetArray<T>, ILRetLogical or ILRetCell. Return arrays are volatile or temporary arrays. Their use is restricted to exactly one time. After the first use, return arrays are disposed immediately. For expressions like the following, this behavior drastically increases memory efficiency:

abs(pow(cos(A*pi/2+t),2))

Assuming A to be a (rather large) array, 7 temporary memory storages of the size of A would be necessary in order to evaluate the whole expression. But if we take the above assumption regarding the lifetime of return arrays into account, that number is reduced to at most 2 temporary arrays. The reason: A * pi needs one storage for the result: Result1. It is then used to compute [Result1] / 2. Here, another storage is needed for the new result: Result2. At the end of the division operation, [Result1] has already been used once. Since it is a return type, it is released and its storage recycled. For the next calculation [Result2] + t, the storage from [Result1] is already found in the memory pool of ILNumerics. Therefore, no new storage is needed and both temporary storages are used alternately for the subsequent evaluations.
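The alternating reuse can be reproduced with a toy model. The sketch below is plain C# and only mimics the idea: the `BufferPool` class, its `Rent` and `Return` methods and the placeholder operation are invented for this illustration and are not how ILNumerics implements its memory pool. The chain length of 6 is an arbitrary choice as well.

```csharp
using System;
using System.Collections.Generic;

// Toy model of a buffer pool; all buffers share one length in this sketch.
class BufferPool {
    readonly Stack<double[]> free = new Stack<double[]>();
    public int Allocations { get; private set; }

    public double[] Rent(int length) {
        if (free.Count > 0) return free.Pop();   // reuse a recycled buffer
        Allocations++;
        return new double[length];
    }

    public void Return(double[] buffer) { free.Push(buffer); }
}

static class TempReuseDemo {
    // Applies 'steps' elementwise operations to 'input'. If releaseEarly is
    // true, each intermediate result is recycled right after its single use.
    public static int CountAllocations(double[] input, int steps, bool releaseEarly) {
        var pool = new BufferPool();
        double[] current = input;
        for (int s = 0; s < steps; s++) {
            double[] result = pool.Rent(input.Length);
            for (int i = 0; i < input.Length; i++)
                result[i] = current[i] + 1;       // placeholder operation
            // 'current' has now been used once; recycle it unless it is
            // the caller's own input array
            if (releaseEarly && !ReferenceEquals(current, input))
                pool.Return(current);
            current = result;
        }
        return pool.Allocations;
    }

    static void Main() {
        double[] a = new double[1000];
        Console.WriteLine(CountAllocations(a, 6, false));  // one buffer per step
        Console.WriteLine(CountAllocations(a, 6, true));   // two buffers, used alternately
    }
}
```

Without early release, every step allocates a fresh buffer. With it, the model never needs more than two buffers, no matter how long the chain gets.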

Let’s assume the expression above only makes sense if we can retrieve the result and use it in subsequent expressions inside our function. The most common case would be to assign the result to a local variable. Now, we get close to the interesting part: if we allowed the var keyword to be used, C# would generate a local variable B of type ILRetArray<double>:

// DO NOT DO THIS!!
var B = abs(pow(cos(A*pi/2+t),2));    // now B is of type ILRetArray<double> !

Console.Out.Write(B.Length);
Console.Out.Write(B.ToString());    //<-- fails! 

Besides the fact that this would conflict with the function rules of ILNumerics (local array types must be of ILArray<T> or similar), B could only be used exactly once! In the example, B.Length executes normally. After that statement, B gets disposed. Therefore, the statement B.ToString() will already fail. This is why var is not permitted.

By explicitly declaring the local array type, the compiler will use implicit type conversions in order to convert a return array to a local array, which is then available for the rest of the function block:

// this code is correct
ILArray<double> B = abs(pow(cos(A*pi/2+t),2));    // now B is of type ILArray<double> !

Console.Out.Write(B.Length);
Console.Out.Write(B.ToString());    // works as expected
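The difference between the two declarations comes down to which type the compiler picks for the variable. The mechanism can be tried out without ILNumerics. In the following sketch, `RetValue` and `LocalValue` are hypothetical stand-ins written for this illustration only. They are not the ILNumerics types and merely imitate the "use once" behavior described above:

```csharp
using System;

// Hypothetical stand-in for a return type: may be used exactly once.
class RetValue {
    double[] data;
    public RetValue(double[] data) { this.data = data; }

    // Hands out the storage and invalidates the temporary.
    public double[] Consume() {
        if (data == null)
            throw new ObjectDisposedException("RetValue", "temporary was already used");
        double[] d = data;
        data = null;
        return d;
    }

    public int Length { get { return Consume().Length; } }
}

// Hypothetical stand-in for a local type: stays valid after each use.
class LocalValue {
    readonly double[] data;
    LocalValue(double[] data) { this.data = data; }
    public int Length { get { return data.Length; } }

    // The implicit conversion only runs when the target type is spelled out.
    public static implicit operator LocalValue(RetValue r) {
        return new LocalValue(r.Consume());
    }
}

static class VarDemo {
    public static RetValue Compute() { return new RetValue(new double[] { 1, 2, 3 }); }

    static void Main() {
        LocalValue ok = Compute();        // converted: usable repeatedly
        Console.WriteLine(ok.Length);
        Console.WriteLine(ok.Length);

        var bad = Compute();              // inferred as RetValue
        Console.WriteLine(bad.Length);    // first use succeeds
        try {
            Console.WriteLine(bad.Length);    // second use fails
        } catch (ObjectDisposedException e) {
            Console.WriteLine("second use failed: " + e.Message);
        }
    }
}
```

With var, the compiler takes the declared return type of the expression and the implicit conversion never happens. Writing the type out is what triggers the conversion to the long-lived local type.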

See: General Rules for ILNumerics, Function Rules