Category Archives: Interesting/ useless

Fun with Visual Studio regexp search

I only recently realized that the Visual Studio regexp feature in Search & Replace is even able to handle regexp captures. Example: In order to locate and replace all line endings in your code which exist at the end of non-empty lines, exluding lines which end at ‘/’ or with other non-alphanumeric characters one can use:

Search Pattern: ([0-9a-zA-Z)])\r\n
Replace: $1;\r\n

searchandreplacevisualstudioMatches inside () parenthesis are captured. When it comes to replacing the match the new value is created by first inserting the captured value at the position marked by ‘$1′. This way it it possible to, let’s say insert a new char in the middle of a search pattern – if it fulfills certain conditions.

There appears to be an error in the MSDN documentation, unfortunately at the point describing how to reference captures in Visual Studio regexp replace patterns. Anyway, using the ‘$’ char in conjunction with the index is the common way and it works here just fine.

Multiple captures are possible as well. Just refer to the captured content by the index according to the count of ‘()’ captures in the search pattern.

The Visual Studio regexp support is really helpful when translating huge C header files to C# – to name only one situations. It saved me a huge amount of time already. Hope you find this useful too.

Array Visualizer Extension Integration Tests. A Practical Guide.

Overview

Testing Visual Studio extensions against different versions of Visual Studio poses many challenges due to the large array of Visual Studio versions. The following post describes a generally applicable solution on the basis of the example of the ILNumerics Array Visualizer extension. Furthermore, this article will elaborate on how we managed to perform (relatively) stable test integrations into our build scripts.

When implementing new features into your code, today it is a well accepted fact that your code needs to be tested against as much potential scenarios as possible. You need to write unit tests and integration tests to anticipate the behavior of your application and to ensure it is working in the expected way in all possible situations.

So how do you test a Visual Studio extension? First of all, you should define integration test scenarios, to check if your code works as expected. Make sure to include many different scenarios to achieve maximum code coverage.

What do we have to test?

The ILArrayVisualizer is a Visual Studio extension that allows you to visualize large data sets in a number of ways during your debug session. Our current version 4.11 can not only be installed in different versions of Visual Studio but also supports various project languages and data types. To be precise, the full parameter space has the following dimensions:

  • Visual Studio versions: 2010, 2012, 2013, 2015
  • Programming languages: C#, Visual Basic, F#, Fortran, C/C++
  • All common project types: dll, exe
  • Platforms: 32/ 64 bit
  • All supported array types for each individual language: 1D/ n-dim, ILArray<T>, pointers, std::array, a.s.f.
  • All numeric element types: double, float, (f)complex, (u)int32/16/64), bytes, char, …
  • Arrays may also contain special numbers (NaNs, Inf), empty shapes in various forms, uninitialized arrays, or NULL.

The huge parameter space forbids any test strategies that are based on manually performed testing. In order to cover the most important cases only, we’ve identified more than 1000 integration test cases! As a result, automated integration tests are obligatory. Furthermore, the tests need to be integrated into our build system and have to support manual (debug) as well as automated (release) triggers.

One may ask: why not simply testing the array visualizer service in a regular unit test project? Why do we need to perform integration tests inside Visual Studio at all? The answer is that Visual Studio provides a complex environment which needs to fullfill a huge amount of requirements and to support us in a vast number of situations. A complex interaction with the Visual Studio debugger uncovers incompatibilities here and there. Performing automated integration tests over nearly the whole parameter space provides good coverage of all expected usage scenarios and indeed allowed us to identify any incompatibilities – and to work around them.

First: Define a Testing Strategy, Second: Write the Code

To test the ILArrayVisualizer, we had to define a testing strategy first. The testing strategy includes procedures that are performed during each test run and therefore have to be automated in a reliable way:

  • Run the experimental instance of Visual Studio including the installed ILArrayVisualizer extension.
  • Load a predefine test project of a certain programming language as supported by the Array Visualizer.
  • Stop the debugger at a certain, predefined position in the project.
  • Inspect various array instances and check that the Array Visualizer service is able to deliver correct results for it.

Hereby, the main challenge was that integration tests for Visual Studio extensions must run inside another instance of Visual Studio and that we needed a method to ensure this reliably for all supported version of Visual Studio. The test target (our extension project) must be installed in the target Visual Studio version in advance. Consequently, the tests were run in a semi-automatic mode: Our testing framework had to be executed once for each Visual Studio version.

How to Run the Test Code in an Experimental Instance of Visual Studio

Do you know how to write standard unit tests in C#? It is quite simple: While the attribute [TestClass] is applied to each test class object, the attribute [TestMethod] is applied to each test method. You can find more information about all attributes in the official MSDN Article “Anatomy of Unit Tests”.

For the integration tests things become a little more demanding. Let’s take a look at our testing strategy one more time: For each test method we have to create a new experimental instance of Visual Studio. Since we have to run more than 1000 test cases for each Visual Studio version, creating new instances, opening the test project and other actions will take too much time. So let’s change our strategy: We will start the experimental instance only once for all test methods, leveraging the attributes mentioned above.

To begin with, we implemented a [ClassInitialize] method, which runs the experimental instance for our test methods once:

[ClassInitialize]
[HostType("VS IDE")]
[TestProperty("VsHiveName", "14.0Exp")]

public static void TestClassInitialize(TestContext testContext)
{
// test class initialize
}
  • The attribute [ClassInitialize] tells the compiler that the following public method is defined as a constructor for our test class.
  • The attribute [HostType(“VS IDE”)] defines that the following method should be executed in Visual Studio IDE. It’s also possible to create an instance of another host type. You can read more about it in MSDN.
  • The attribute [TestProperty(“VsHiveName”,”14.0Exp”)] defines that the following test method will be executed in Visual Studio Version 14.0(VS 2015) inside the experimental instance.

Why do we actually need an Experimental instance? Because it gives us a way to run, debug and test our extension code – from inside Visual Studio.  The Experimental instance runs parallel to the
‘main’ instance. It allows a clear separation in terms of configuration and installed extensions. During the development of our extension, we can build and run our project by installing the target VSIX extension only inside the experimental instance.

Experimental Instance is launched, what’s next?

As our test method runs in an experimental instance of Visual Studio, we need to obtain an environment DTE. The DTE object is the root of the automation model, which other object models often call “Application”.

ILArrayVisualizer Extension works in the COM model of Visual Studio and as such has its own GUID which is used to identify the service among the list of available serives in Visual Studio. To call the methods to be tested, we also need to obtain a service provider. By using the service provider, we can access the ILArrayVisualizer Service inside the experimental instance and call the methods to be tested.

var m_envdte = VsIdeTestHostContext.Dte;
var m_shellservice = VsIdeTestHostContext.ServiceProvider.GetService(typeof(SVsShell)) as IVsShell;

Guid packageGuid = new Guid("XXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX");

m_shellservice.LoadPackage(ref packageGuid, out m_package);

m_serviceProvider = new ServiceProvider(m_envdte as Microsoft.VisualStudio.OLE.Interop. IServiceProvider);

What about the Debugger and the test methods?

To obtain a debugger session descriptor in another instance of Visual Studio, the IDE uses the following code:

var m_debugger = m_envdte.Debugger;

So, this refers to the Debugger object inside the (experimental) instance referenced by the Environment DTE object. This way it is actually possible to control other instances of the VS IDE.

Basically, each integration test simulates the actions a regular user of the Array Visualizer would take: inside a debug session she inspects certain arrays (large variation here!) in the Array Visualizer and receives a number of correct outputs for them. All test projects (i.e. the debug target handled by our virtual users) have to be predefined. Each language comes with its own test projects files, in which we defined a large set of supported array expressions, including a reasonable number of edge cases. The projects are loaded via remote control from our integration tests, are started and halted in debug mode inside the remotely controlled Visual Studio instance. Visual Studio is used to set breakpoints in them, to stop the debugger and perform queries to the Array Visualizer service – one query for each of the predefined test array objects (more than 1000 test arrays exist).

In our case, we just open one of the predefined C++, C#, Visual Basic or FORTRAN project, run it in a new debug session, and iterate all contained array expressions by calling the evaluate() method of our ILArrayVisualizer service on it – all from our testing framework.

To open the existing projects and add those, we use the following method:

EnvDTE.Solution.AddFromFile(FileName);

To build the Solution:

EnvDTE.Solution.SolutionBuild.Build(true); 
// True means, wait until build is ready.

As we get debugger session descriptor, setting the breakpoint is as easy as:

Debugger.Breakpoints.Add(FileName, LineNumber);

All necessary initialization code could be packed into the test class initializer. In this case only one experimental instance will be created for each language and platform and reused to test  many test methods at once.

Passing the Test Code to Visual Studio IDE Experimental Instance

By default, all integration tests are hosted and run in the VSTestHost.exe process. To pass the test code to be executed to the experimental instance (in another environment), we have to use the Test Host Adapter. UIThreadInvoker allows us to implement a delegate method and pass it to our test environment.

UIThreadInvoker.Initialize();
UIThreadInvoker.Invoke((ThreadInvoker)delegate ()
{
// some test cases
}

The integration tests are not only for finding bugs inside your code, but also help you write correct code. To write unit tests, we specified how a class or method should work first. Read more about it here: “Test-Driven Development”. We used this strategy to write a correct expression evaluator for C++ code for the ILArrayVisualizer. We wrote all possible test methods for each possible expression and tested the expected values – for all supported languages.

When implementing a new feature it is crucial to develop a testing strategy to ensure proper functionality and achieve maximum code coverage. In Visual Studio there are testing tools that you can use for this purpose.

ILNumerics for Scientists – Going 3D

Recap

Last time I started with one of the easiest problems in quantum mechanics: the particle in a box. This time I’ll add 1 dimension and we’ll see a particle in a 2D box. To visualize its wave function and density we need 3D surface plots.

2D Box

This time we have a particle that is confined in a 2D box. The potential within the box is zero and outside the box infinity. Again the solution is well-known and can be found on Wikipedia. This time the state of the wave function is determined by two numbers. These are typically called quantum numbers and refer to the X and the Y direction, respectively.

The absolute size of the box doesn’t really matter and we didn’t worry about it in the 1D case. However, the relative size of the length and the width make a difference. The solution to our problem reads

$\Psi_{n,k}(x,y) = \sqrt{\frac{4}{L_x L_y}} \cdot \sin(n \cdot \pi \cdot x / L_x) \cdot \sin(k \cdot \pi \cdot y / L_y)$

The Math

Very similar to the 1D case I quickly coded the wave function and the density for further plotting. I had to make sure that the arrays are fit for 3D plotting, so the code looks a little bit different compared to last post’s

     public static ILArray<double> CalcWF(int EVXID, int EVYID, double LX, double LY, int MeshSize)
     {
        ILArray<double> X = linspace<double>(0, LX, MeshSize);
        ILArray<double> Y = linspace<double>(0, LY, MeshSize);

        ILArray<double> Y2d = 1;
        ILArray<double> X2d = meshgrid(X, Y, Y2d);

        ILArray<double> Z = sqrt(4.0 / LX / LY) * sin(EVXID * pi * X2d / LX) * sin(EVYID * pi * Y2d / LY);

        return Z.Concat(X2d,2).Concat(Y2d,2);
     }

Again, this took me like 10 minutes and I was done.

The Visualization

This time the user can choose the quantum numbers for X and Y direction, the ratio between the length and the width of the box and also the number of mesh points along each axis for plotting. This makes the visualization panel a little bit more involved. Nevertheless, it’s still rather simple and easy to use. This time it took me only 45 minutes – I guess I learned a lot from last time.

The result

Here is the result of my little program. You can click and play with it. If you’re interested, you can download the Particle2DBox source code. Have fun!

Particle2DBoxThis is a screenshot of the application. I chose the second quantum number along the x axis and the fourth quantum number along the y axis. The box is twice as long in y direction as it is in x direction. The mesh size is 100 in each direction. On the left hand side you see the wave function and on the right hand side the probability density.

Using LAPACK in C#/.NET: Linear Equotation Systems in ILNumerics

If you install a math library to your .NET/C# project, LAPACK will be probably one of the key feature you expect from that: The routines provided by LAPACK (which actually means: “Linear Algebra Package”) cover a wide range of functionalities needed for nearly any numerical algorithm, in natural sciences, computer science, and social science.

The LAPACK software library is written in FORTRAN code – until 2008 it was even written in FORTRAN 77. That’s why adding LAPACK functions to an enterprise software project written in Java or C#/.NET can be quite a demanding task: The implementation of native modules often causes problems regarding maintainability and steadiness of enterprise applications.

Our LAPACK implementation for C#/.NET

ILNumerics offers a convenient implementation of LAPACK for C# and .NET: It provides software developers both the execution speed of highly optimized processor specific native code and the convenience of managed software frameworks. That allows our users to create powerful applications in a very short time.

For linear algebra functions ILNumerics uses the processor-optimized LAPACK library by the MIT and Intel’s MKL. ILMath.Lapack is a concrete interface wrapper class that provides the native LAPACK functions. The LAPACK wrapper is initialized when a call to any static method of ILMath is made. Once the corresponding binaries for your actual architecture have been found, consecutive calls will utilize them in a very efficient way.

The MKL is utilized (and needed) for all calls to any fft(A) function, for matrix decompositions (like for example linsolve, rank, svd, qr etc.). The only exception to that is ILMath.multiply – the general matrix multiplication. Matrix multiplication is just such an often needed feature, a math library simply could not go without. So we decided to implement ILMath.multiply() purely in managed code. The good thing: it is not really far behind the speed of the processor optimized version! If MKL binaries are found at runtime, those will be used, of course. But in the case of their absence, the managed version should work out just fast enough for the very most situations.

In most cases using this kind of .NET/C# LAPACK implementation means: faster results and more stable software applications. Learn more about Linear Equation Systems and other features of ILNumerics in our Documentation.

Are you afraid of software developers?

In the 1980s and 1990s software developers had to face a bunch of bad prejudices: They were known to be sociophobic nerds, neglecting their real lifes in favor of hanging in front of the computer for writing code, discussing in hacking newsgroups and eating pizza.

Even though we’re still not living in a society of hackers, geekism has become mainstream. Not only the fact that most people spend a lot of time with their smartphones and computers: nerd culture is more popular than ever. Some weeks ago Luke Maciak wrote a nice article on that topic.

The establishment towards nerdism changed, and so did the general attitude towards software developers. In a way, programmers have become role models for the 21st century – not at least because they are an important factor regarding economic growth in the digital age.

However, having visited some events for start ups in Berlin has made us come across a new kind of prejudices towards developers. Most start ups in Berlin are more or less in the tech business: They create games, offer online services or develop facebook apps. Many of them have no CTOs in their teams, though. That’s why they employ freelancer developers.

Working together with software developers on this early stage of business is challanging for start ups. They often don’t have much money to spend: that’s why the wages developers ask for seem to be too high. Start ups want a strong team spirit: that’s why they don’t like developers to work from another place than their office.

But the most important problem is: As most founders aren’t developers theirselves, they don’t understand what their expensive freelancer is actually doing when he spends his days coding at home. For theis reason, young CEOs often become nervous: As their business depends on software, they feel like being on their developer’s mercy because he seems to be the only one who is actually able to understand his code.

In most cases we can calm down our fellows: Developers are used to get paid well and work when and where they want to. There’s also no reason to be afraid that no other developer would find his way into your software’s code: Modern languages and frameworks like .NET, Java or Ruby make most applications clean and well organized. So even in case you really have to split up with your developer, it won’t be that hard to find a new one who can continue his or her antecessor’s work.

In other words: In most cases there’s no need to be afraid of software developers. It’s pretty convenient to monitor enterprise software development these days.

However, the following question shows that this kind of convenience hasn’t arrived everywhere yet: “Why does scientific computing today still use only technology of the last century?”, someone claimed on reddit some days ago. This kind of question is the reason we have created ILNumerics: For the first time it brings the convenience and the improved efficiency and maintainence of modern managed languages to the development of numerical algorithms and 3d visualizations.

Scientific Computing Online: IPython Notebook, Shiny (R) and ILNumerics

It seems that we’re facing a trend at the moment: scientific computing, math and visualization software for web browsers. With our interactive web examples we have taken a step into that direction, too: Visitors of our website can change the C# code of our plotting and visualization demos in order to create a new SVG, PNG, JPG or EXE output. This allows people to easily try out the ILNumerics syntax and our powerful 2d and 3d visualization features for .NET. In addition to that, ILView allows a convenient way to interactively explore scenes that are created with ILNumerics.

There are two other web applications that cause a lot of excitement in the scientific community at the moment: The IPython Notebook and Shiny, a tool for creating web applications in R. Let’s have a closer look…

IPython Notebook: “Interactive Computational Environment”

The IPython Notebook adresses the huge amount of Python users in the scientific community. It basically offers a new way for writing papers: It’s a web based editor for code execution, math, text and visualization. Because the IPython Notebook combines all parts you normally need to write a scientific paper, you won’t have to import / export different elements from several domain specific software applications: “Everything related to my analysis is located in one unified place”, explains Philip J. Guo in his blog (http://www.pgbovine.net/ipython-notebook-first-impressions.htm). Once you have finished your paper, you can share your IPython Notebook as HTML and PDF with your colleagues, your professor etc.

Shiny: “Easy web applications in R”

Shiny stands for a different approach: It allows you to implement own analysis into web applications. While IPython obviously adresses Python users, Shiny is based on R, a still very popular programming language among statisticians. What makes Shiny interesting are its interactivity features: Most demos on the Shiny website offer the opportunity to choose input parameters from text fields or drop-downs to dynamically change the output visualization. The code seems to be quite similar to R, so users who are familiar with that language will easily be able to create interactive data visualization applications for their websites using Shiny.

Disadvantages: Performance does matter

Both approaches make web browsers accesable for specific needs of scientific visualization: The IPython Notebook offers a convenient tool to share the results of analytics related research; Shiny allows R developers to publish particular interactive plots on the web.

However, both projects are limited – namely because of technological issues. The level of performance that can be realized with both platforms is restricted: You’ll face that at the latest when you start creating complex 3d scenes with either Python or R. This holds true for the platforms’ web applications, too…

Outlook: Scientific Computing online

For certain purposes web based scientific computing software offers new convenient solutions. But if you want to realize complex interactive 3d visualizations, you still won’t use any of them but an application on your local machine instead.

Our interactive web examples point the direction we want to go. In order to make scientific computing more powerful, we’re working on the next step of our approach: a full WebGL support for ILNumerics. Stay tuned…

Using ILArray as Class Attributes

A lot of people are confused about how to use ILArray as class member variables. The documentation is really sparse on this topic. So let’s get into it!

Take the following naive approach:

class Test {

    ILArray<double> m_a;

    public Test() {
        using (ILScope.Enter()) {
            m_a = ILMath.rand(100, 100);
        }
    }

    public void Do() {
        System.Diagnostics.Debug.WriteLine("m_a:" + m_a.ToString());
    }

}

If we run this:

    Test t = new Test(); 
    t.Do(); 

… we get … an exception :( Why that?

ILNumerics Arrays as Class Attributes

We start with the rules and explain the reasons later.

  1. If an ILNumerics array is used as class member, it must be a local ILNumerics array: ILArray<T>
  2. Initialization of those types must utilize a special function: ILMath.localMember<T>
  3. Assignments to the local variable must utilize the .a property (.Assign() function in VB)
  4. Classes with local array members should implement the IDisposable interface.
  5. UPDATE: it is recommended to mark all ILArray local members as readonly

By applying the rules 1..3, the corrected example displays:

class Test {

    ILArray<double> m_a = ILMath.localMember<double>();

    public Test() {
        using (ILScope.Enter()) {
            m_a.a = ILMath.rand(100,100);
        }
    }

    public void Do() {
        System.Diagnostics.Debug.WriteLine("m_a:" + m_a.ToString()); 
    }

}

This time, we get, as expected:

m_a:<Double> [100,100]
   0,50272    0,21398    0,66289    0,75169    0,64011    0,68948    0,67187    0,32454    0,75637    0,07517    0,70919    0,71990    0,90485    0,79115    0,06920    0,21873    0,10221 ...
   0,73964    0,61959    0,60884    0,59152    0,27218    0,31629    0,97323    0,61203    0,31014    0,72146    0,55119    0,43210    0,13197    0,41965    0,48213    0,39704    0,68682 ...   
   0,41224    0,47684    0,33983    0,16917    0,11035    0,19571    0,28410    0,70209    0,36965    0,84124    0,13361    0,39570    0,56504    0,94230    0,70813    0,24816    0,86502 ...   
   0,85803    0,13391    0,87444    0,77514    0,78207    0,42969    0,16267    0,19860    0,32069    0,41191    0,19634    0,14786    0,13823    0,55875    0,87828    0,98742    0,04404 ...   
   0,70365    0,52921    0,22790    0,34812    0,44606    0,96938    0,05116    0,84701    0,89024    0,73485    0,67458    0,26132    0,73829    0,10154    0,26001    0,60780    0,01866 ...
...

If you came to this post while looking for a short solution to an actual problem, you may stop reading here. The scheme will work out fine, if the rules above are blindly followed. However, for the interested user, we’ll dive into the dirty details next.

Some unimportant Details

Now, let’s inspect the reasons behind. They are somehow complex and most users can silently ignore them. But here they are:

The first rule is easy. Why should one use anything else than a local array? So lets step to rule two:

  • Initialization of those types must utilize a special function: ILMath.localMember<T>

A fundamental mechanism of the ILNumerics memory management is related to the associated livetime of certain array types. All functions return temporary arrays (ILRetArray<T>) which do only live for exactly one use. After the first use, they get disposed off automatically. In order to make use of such arrays multiple times, one needs to assign them to a local variable. This is the place, where they get converted and the underlying storage is taken for the local, persistent array variable.

At the same time, we need to make sure, the array is released after the current ILNumerics scope (using (ILScope.Enter())) { … }) was left. Thereforem the conversion to a local array is used. During the conversion, since we know, there is going to be a new array out there, we track the new array for later disposal in the current scope.

If the scope is left, it does exactly what it promises: it disposes off all arrays created since its creation. Now, local array members require a different behavior. They commonly live for the livetime of the class – not of the current ILNumerics scope. In order to prevent the local array to get cleaned up after the scope in the constructor body was left, we need something else.

The ILMath.localMember() function is the only exception to the rule. It is the only function, which does not return a temporary array, but a local array. In fact, the function is more than simple. All it does, is to create a new ILArray<T> and return that. Since bothe types of both sides of the assignment match, no conversion is necessary and the new array is not registered in the current scope, hence it is not disposed off – just what we need!

What, if we have to assign the return value from any function to the local array? Here, the next rule jumps in:

  • Assignments to the local variable must utilize the .a property (.Assign() function in VB)

Assigning to a local array directly would activate the disposal mechanism described above. Hence, in order to prevent this for a longer living class attribute, one needs to assign to the variable via the .a property. In Visual Basic, the .Assign() function does the same. This will prevent the array from getting registered into the scope.

Example ILNumerics Array Utilization Class

Now, that we archieved to prevent our local array attribute from getting disposed off magically, we – for the sake of completeness – should make sure, it gets disposed somewhere. The recommended way of disposing off things in .NET is … the IDisposal interface. In fact, for most scenarios, IDisposal is not necessary. The array would freed, once the application is shut down. But we recommend implementing IDisposable, since it makes a lot of things more consistent and error safe. However, we provide the IDisposable interface for convenience reasons only – we do not rely on it like we would for the disposal of unmanaged ressources. Therefore, a simplified version is sufficient here and we can omit the finalizer method for the class.

Here comes the full test class example, having all rules implemented:

class Test : IDisposable {
    // declare local array attribute as ILArray<T>, 
    // initialize with ILMath.localMember<T>()! 
    readonly ILArray<double> m_a = ILMath.localMember<double>();

    public Test() {
        using (ILScope.Enter()) {
            // assign via .a property only!
            m_a.a = ILMath.rand(100,100);
        }
    }

    public void Do() {
        // assign via .a property only! 
        m_a.a = m_a + 2; 

        System.Diagnostics.Debug.WriteLine("m_a:" + m_a.ToString()); 
    }


    #region IDisposable Members
    // implement IDisposable for the class for transparent 
    // clean up by the user of the class. This is for con-
    // venience only. No harm is done by ommitting the 
    // call to Dispose(). 
    public void Dispose() {
        // simplified disposal pattern: we allow 
        // calling dispose multiple times or not at all.
        if (!ILMath.isnull(m_a)) {
            m_a.Dispose(); 
        }
    }

    #endregion
}

For the user of your class, this brings one big advantage: she can – without knowing the details – clean up its storage easily.

 
    using (Test t = new Test()) {
        t.Do();
    }

@UPDATE: by declaring your ILArray members as readonly one gains the convenience that the compiler will prevent you from accidentally assigning to the member somewhere in the code. The other rules must still be fullfilled. But by only using readonly ILArray<T> the rest is almost automatically.

ILArray, Properties and Lazy Initialization

@UPDATE2: Another common usage pattern for local class attributes is to delay the initialization to the first use. Let’s say, an attribute requires costly computations but is not needed always. One would usually create a property and compute the attribute value only in the get accessor:

class Class {
        
    // attribute, initialization is done in the property get accessor
    Tuple<int> m_a;

    public Tuple<int> A {
        get {
            if (m_a == null) {
                m_a = Tuple.Create(1);  // your costly initialization here
            }
            return m_a; 
        }
        set { m_a = value }
    }
}

How does this scheme go along with ILNumerics’ ILArray? Pretty well:

class Class1 : ILMath, IDisposable {

    readonly ILArray<double> m_a = localMember<double>();

    public ILRetArray<double> A {
        get {
            if (isempty(m_a)) {
                m_a.a = rand(1000, 2000); // your costly initialization here
            }
            return m_a; // this will only return a lazy copy to the caller!
        }
        set {
            m_a.a = value; 
        }
    }

    public void Dispose() {
        // ... common dispose implementation
    }
}

Instead of checking for null in the get accessor, we simply check for an empty array. Alternatively you may initialize the attribute with some marking value in the constructor. NaN, MinValue, 0 might be good candidates.

Social aspects of programming languages

Leo A. Meyerovichs and Ariel S. Rabkins investigation on “Social Influences on Language Adoption” enlighted me today. From the abstract:

“Why do some programming languages succeed and others fail? … we gathered and quantitatively analyzed several large datasets, including over 200,000 SourceForge projects and multiple surveys of 1,000-13,000 programmers. We find that social factors usually outweigh intrinsic technical ones. In fact, the larger the organization, the more important social factors become. … our results help explain the process by which languages become adopted or not.”

After I found this Google Tech Talk I couldn’t resist to play around with the data they publish on their website. Filtering for some of the most relevant (IMPO) languages produced the following picture (click to enlarge):

It marks C# to be the language, people rely on most for GUI projects. Did we expect something else? ;) What really did surprise me is the fact, people already put about the same preference on C++, C# and Scala regarding the suitability for scientific computing. I wonder, if this picture would change, if we would also take ILNumerics into account!?

Go ahaed and visit the interactive visualizations yourself!
Btw, the most important slices for me in the tech talk are found around minute 28:35. Here, Leo talks about catalyst factors in the adaptation process and identifies “Simplicity, relative advantage, trialability, observability and compatibility” – nothing really new to you, I suppose. But I certainly feel comfortable, to have them all together as a nice “to keep in mind” list…

Intel MKL release Notes & Fixes Lists

Since every time I need them I find them hard to find … here are some links to fixes lists for various Intel MKL versions:

http://denali.princeton.edu/intel_mkl_10_docs/Release_Notes.htm
http://software.intel.com/en-us/articles/intel-mkl-102-fixes-list
http://software.intel.com/en-us/articles/intel-mkl-103-bug-fixes/
http://software.intel.com/en-us/articles/intel-mkl-110-bug-fixes/ as well as http://software.intel.com/en-us/articles/intel-math-kernel-library-110

First Look at Julia on Windows

I recently blogged about the upcoming Lang.NEXT 2012 conference in Redmond. And since the videos are not uploaded yet (and the talk about Julia should pretty much only soon start) I decided to use the time to do some early evaluation of the language with the beautiful suggestive name everyone seems to fall in love immediately. Since we all know how prone love is to projection, I felt I needed a more rational look on the language. And – as usual – as things get clearer you get to know each other more and more and butterflies turn into even more beautiful butterflies … or into something completely different ….

Lets start with some motivation. Julia wants to bridge the gap between established convenient mathematical (prototyping, desktop) systems and high performance computing (parallel) resources. So, basically, it wants to be comfortable and fast. “Huh?” – I hear you say, “this is what ILNumerics does as well!” – and of course you are right. But Julia originates from a very different motivation as ILNumerics. For us, the goal is to provide convenient numeric capabilities with high performance and a comfortable syntax – but to do it directly in a general purpose language. Basically, this brings a lot of advantages when it comes to deployment of your algorithm and it is much easier to utilize all those convenient development tools which are already there for C#. Furthermore, (frequent) transition from business logic to your numerical algorithms can become nasty and error prone.

Julia, on the other side, has to fight other enemies: dynamic language design. Things like dispatching schemes, type inference and – promotion, lexer and parser and certainly a lot more. I really bow to those guys! From a first view they did really succeed. And at the same time, I am glad, that Eric Lippert and his colleagues took away the hard stuff from us. But, of course: by going through all that pain of language design (ok, it sometimes might be fun as well) – you gain the opportunity to optimize your syntax to far less limits. A ‘plus’ of convenience.

Lets take a look at some code. Readers of this blog are already familiar with what turns out to become our favorite algorithm for comparing languages: the kmeans algorithm in its beauty and simplicity. Here comes the Julia version I managed to run on Windows:

function kmeansclust (X, k, maxIterations)

nan_ = 0.0 / 0.0;
n = size(X,2); 
classes = zeros(Int32,1,n); 
centers = rand(size(X,1),k); 
oldCenters = copy(centers); 
while (maxIterations > 0)
        println("iterations left: $maxIterations"); 
        maxIterations = maxIterations - 1;
        for i = 1:n
                Xexp = repmat(X[:,i],1,k);
            	dists = sum(abs(centers - Xexp),1); 
	        classes[i] = find(min(dists) == dists)[1];
        end
        for i = 1:k
            inClass = X[:,find(classes == i)];
            if (isempty(inClass))
                centers[:,i] = nan_;
            else
                centers[:,i] = mean(inClass,2);
            end
        end
        if (all(oldCenters == centers))
            break;
        end
        oldCenters = copy(centers);
 end
 (centers, classes)
end

Did you notice any differences to the Matlab version? They are subtle:

  • Line 29 returns the result as tuple – a return keyword is not required. Moreover, what is returned does not need to be defined in the function definition.
  • Julia implements reference semantics on arrays. This makes the copy() function necessary for assignments on full arrays (-> lines 7 and 27). For function calls this implies, that the function potentially alters its input! Julia states the convention to add a ! to the name of any function, which alters its input parameter.

Besides that, the syntax of Julia can be pretty much compatible to MATLAB® – which is really impressive IMO. Under the hood, Julia even offers much more than MATLAB® scripts are able to do: type inference and multiple dispatch, comprehensions, closures and nifty string features like variable expansion within string constants, as known from php. Julia utilizes the LLVM compiler suite for JIT compilation.

Julia is too young to judge, really. I personally find reference semantics for arrays somehow confusing. But numpy does it as well and nevertheless found a reasonable number of users.

While the above code run after some fine tuning, the current shape of the Windows prebuilt binaries somehow prevented a deeper look in terms of performance. It still needs some quirks and bugs removed. (The Windows version was provided only some hours earlier and it was the first publicly available version for Windows at all.) As soon as a more stable version comes out, I will provide some numbers – possibly with an optimized version (@bsxfun is not implemented yet which renders every comparison unfair). According to their own benchmarks, I would expect Julia to run around the speed of ILNumerics.