All posts by haymo

Social aspects of programming languages

Leo A. Meyerovich's and Ariel S. Rabkin's investigation on “Social Influences on Language Adoption” enlightened me today. From the abstract:

“Why do some programming languages succeed and others fail? … we gathered and quantitatively analyzed several large datasets, including over 200,000 SourceForge projects and multiple surveys of 1,000-13,000 programmers. We find that social factors usually outweigh intrinsic technical ones. In fact, the larger the organization, the more important social factors become. … our results help explain the process by which languages become adopted or not.”

After I found this Google Tech Talk, I couldn't resist playing around with the data they publish on their website. Filtering for some of the most relevant (IMO) languages produced the following picture (click to enlarge):

It marks C# as the language people rely on most for GUI projects. Did we expect anything else? ;) What really did surprise me is the fact that people already put about the same preference on C++, C# and Scala regarding their suitability for scientific computing. I wonder if this picture would change if we also took ILNumerics into account!?

Go ahead and visit the interactive visualizations yourself!
Btw, the most important part of the tech talk for me is found around minute 28:35. Here, Leo talks about catalyst factors in the adoption process and identifies “Simplicity, relative advantage, trialability, observability and compatibility” – nothing really new to you, I suppose. But it certainly feels good to have them all together as a nice “to keep in mind” list…

Intel MKL release Notes & Fixes Lists

Since I find them hard to track down every time I need them … here are some links to the fixes lists for various Intel MKL versions:

http://denali.princeton.edu/intel_mkl_10_docs/Release_Notes.htm
http://software.intel.com/en-us/articles/intel-mkl-102-fixes-list
http://software.intel.com/en-us/articles/intel-mkl-103-bug-fixes/
http://software.intel.com/en-us/articles/intel-mkl-110-bug-fixes/ as well as http://software.intel.com/en-us/articles/intel-math-kernel-library-110

Putting on a Good Show with HDF5, ILNumerics, and PowerShell

It is certainly nice to have the option to do all kinds of numeric stuff right in your .NET application layer – without the need to interface with any unmanaged module. But for some tasks, this still seems overkill.

Let's say you went to that conference and want to give your new friends some insight into your brand new simulation results. The PC in the internet café lets you fetch the data from your NAS storage at home. But will you be able to do anything with it on that plain Windows PC?

Or you want to locate a certain test data set but cannot remember its rather cryptic name. Or you might want to manage the latest measurement results from today's atmospheric observation satellite scans. The data are huge but often require some sort of preprocessing. There should be some easy way to filter them by the metadata within the files, right?

Rather than getting the data from some application layer, we now want to interface with plain old file objects. Of course, you store your data in the HDF5 format, right? You do so because HDF5 is portable, very efficient, flexible – and you are in good company.

Let’s see. We have a fresh Windows PC, and we know every Windows installation nowadays comes with Powershell. Powershell itself is based on the .NET framework and hence handles any .NET assembly efficiently. It should be easy to use ILNumerics with Powershell! All we still need is some way to access the HDF5 files. ILNumerics natively reads and writes Matlab mat files up to version 6, but it currently lacks native HDF5 support.

Luckily, the HDF Group provides a large collection of high quality tools for HDF support. Among them you'll find a .NET wrapper and … a brand new Powershell module: PSH5X! Together with Gerd Heber, the lead inventor of PSH5X, we did a feasibility study investigating the options for utilizing HDF5 and ILNumerics together in Powershell. It can be downloaded here. We were quite impressed by the options this brings.

This blog post will describe the necessary steps to setup Powershell for ILNumerics and HDF5.

Getting Started

Basically, the installation process for any Powershell module consists of the following steps (a minimal sketch in code follows the list):

  1. Getting the module files and its dependencies from somewhere,
  2. Deploying the module files into a special folder on your machine, and
  3. Importing the module in your session.
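For steps 2 and 3, a minimal sketch might look like the following (the module name MyModule is just a placeholder, and the first entry of $env:PSModulePath is usually your per-user module folder – adjust names and paths as needed):

# pick the per-user module directory (usually the first entry of PSModulePath)
$moduleRoot = ($env:PSModulePath -split ';')[0]
# step 2: deploy the module files into that special folder
Copy-Item -Recurse .\MyModule (Join-Path $moduleRoot 'MyModule')
# step 3: import the module into the current session
Import-Module MyModule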

The PSH5X homepage gives all the information needed to get ready to use the HDF5 Powershell module. Just download the package and follow the three steps on the page. At the end, the module signals a successful installation by displaying its version numbers.

Since ILNumerics depends on several other modules, we provide a small bootstrapper script. Just open up your favorite Powershell IDE (PowerShell_ISE.exe comes with any recent Windows) and copy/paste the following line:

(new-object Net.WebClient).DownloadString('http://ilnumerics.net/media/InstallILNumericsPSM.ps1') | iex

If you are curious what this does – just omit the trailing | iex and the script is not executed but displayed for your inspection.
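That is, for inspection only, the line reduces to:

(new-object Net.WebClient).DownloadString('http://ilnumerics.net/media/InstallILNumericsPSM.ps1')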

The installer will ask for the installation folder (global under System32/ or local in your user profile), fetch the latest ILNumerics package for the current platform from the official NuGet repository and install it into the selected module folder. In addition, it loads the TypeAccelerator Powershell module and installs it into the same module directory. Note, the accelerators have been slightly modified in order to make them work with Powershell 3 and hence are fetched from our ILNumerics server. However, credits fully belong to poshoholic for his great work.

Note, the installation has to be done only once. Afterwards, in the next Powershell session, simply re-import the needed modules by typing – let's say:

PS> Import-Module ILNumerics 
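If you also want the HDF5 and type accelerator functionality back in a new session, re-import those modules as well. Note that the module names below are assumptions for illustration – check the PSH5X and TypeAccelerator documentation for the names actually registered on your machine:

PS> Import-Module HDF5              # PSH5X module (assumed name)
PS> Import-Module TypeAccelerator   # accelerators module (assumed name)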

Go!

If everything was setup correctly, we can now use the full spectrum of the involved modules:

PS> [ilmath]::rand(4,5).ToString()
<Double> [5,4]
   0,72918    0,87547    0,43167    0,94942
   0,58024    0,75562    0,96125    0,83148
   0,22454    0,20583    0,82285    0,83144
   0,13300    0,40047    0,58829    0,87012
   0,50751    0,05496    0,02814    0,48764 

Nice. But what about the MKL? Are the correct binaries really installed as well?

PS> [ilf64] $A = [ilmath]::rand(1000,1000)
PS> Measure-Command { [ilf64]$C = [ilmath]::rank($A) }
Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 0
Milliseconds      : 920
Ticks             : 9202311
TotalDays         : 1,06508229166667E-05
TotalHours        : 0,00025561975
TotalMinutes      : 0,015337185
TotalSeconds      : 0,9202311
TotalMilliseconds : 920,2311

PS> $C.ToString()
1000

We have almost all options from C#:

PS> [ilf64] $part = $A['10:15;993:end']
PS> $part.ToString()
<Double> [11,7]
   0,08522    0,87217    0,59997    0,57363    0,22956    0,02006    0,02359
   0,33479    0,49003    0,65269    0,97772    0,28322    0,69505    0,70372
   0,30072    0,68705    0,47112    0,68627    0,65030    0,40454    0,63026
   0,15639    0,30391    0,22992    0,69310    0,65716    0,51797    0,68110
   0,72854    0,60188    0,50740    0,74499    0,13459    0,88481    0,12445
   0,80525    0,60180    0,69256    0,74825    0,64388    0,16792    0,45266 

Let's sort the first two rows of $part, keeping track of the original positions:

PS> [ilf64] $indices = 0.0
PS> [ilf64] $sorted = [ilmath]::sort($part['0,1;:'],$indices,0,$false)
PS> $sorted.ToString()
<Double> [2,7]
   0,02006    0,02359    0,08522    0,22956    0,57363    0,59997    0,87217
   0,28322    0,33479    0,49003    0,65269    0,69505    0,70372    0,97772
PS> $indices.ToString()
<Double> [2,7]
         5          6          0          4          3          2          1
         4          0          1          2          5          6          3 

This is all interactive. Of course, we can write complete functions and even complex algorithms that way.
One of the best things: even in Powershell, ILNumerics conserves your memory and meets all expectations regarding execution speed. Powershell allows you to consistently apply ILNumerics' typing and scoping rules.
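As a quick sketch of how such a function might look (purely illustrative – it reuses the [ilf64] accelerator and [ilmath]::rand from above and assumes [ilmath]::sum behaves like ILMath.sum in C#, summing along the first dimension):

function Get-ColumnSums {
    param([int]$rows = 4, [int]$cols = 5)
    # a random test matrix, as in the examples above
    [ilf64]$A = [ilmath]::rand($rows, $cols)
    # sum along the first dimension, giving one value per column
    [ilf64]$sums = [ilmath]::sum($A)
    $sums.ToString()
}

PS> Get-ColumnSums 1000 3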

In our feasibility study with Gerd Heber, we show how easy it gets to access an HDF5 file, convert its data to ILNumerics arrays (implicitly), filter and manipulate it a little and even create a full interactive 3D surface graph from it. We demonstrate how to use the type accelerators and how to mimic the using statement for artificial scoping. Take a look and let us know what you think!

apps in math — or math in apps ?

John D. Cook explains on his blog how he likes writing actual mathematical applications best. He favors Python and SciPy – but for the same reasons ASP.NET and C# developers favor ILNumerics. Both approaches take a general purpose programming language and extend it with a mathematical library, so one can do mathematical programming in a general purpose language.

First Look at Julia on Windows

I recently blogged about the upcoming Lang.NEXT 2012 conference in Redmond. Since the videos are not uploaded yet (and the talk about Julia is only about to start), I decided to use the time for some early evaluation of the language with the beautiful, suggestive name everyone seems to fall in love with immediately. Since we all know how prone love is to projection, I felt I needed a more rational look at the language. And – as usual – as things get clearer and you get to know each other more and more, butterflies turn into even more beautiful butterflies … or into something completely different …

Let's start with some motivation. Julia wants to bridge the gap between established, convenient mathematical (prototyping, desktop) systems and high performance (parallel) computing resources. So, basically, it wants to be comfortable and fast. “Huh?” – I hear you say, “this is what ILNumerics does as well!” – and of course you are right. But Julia originates from a very different motivation than ILNumerics. For us, the goal is to provide convenient numeric capabilities with high performance and a comfortable syntax – but to do it directly in a general purpose language. Basically, this brings a lot of advantages when it comes to deploying your algorithm, and it is much easier to utilize all those convenient development tools which are already there for C#. Furthermore, the (frequent) transition between business logic and numerical algorithms can otherwise become nasty and error prone.

Julia, on the other hand, has to fight other enemies: dynamic language design. Things like dispatching schemes, type inference and type promotion, the lexer and parser, and certainly a lot more. I really bow to those guys! From a first look, they really did succeed. At the same time, I am glad that Eric Lippert and his colleagues took the hard stuff away from us. But of course, by going through all that pain of language design (ok, it sometimes might be fun as well), you gain the opportunity to optimize your syntax with far fewer limits. A ‘plus’ of convenience.

Let's take a look at some code. Readers of this blog are already familiar with what turns out to be our favorite algorithm for comparing languages: the kmeans algorithm, in all its beauty and simplicity. Here comes the Julia version I managed to run on Windows:

function kmeansclust (X, k, maxIterations)

nan_ = 0.0 / 0.0;
n = size(X,2); 
classes = zeros(Int32,1,n); 
centers = rand(size(X,1),k); 
oldCenters = copy(centers); 
while (maxIterations > 0)
        println("iterations left: $maxIterations"); 
        maxIterations = maxIterations - 1;
        for i = 1:n
                Xexp = repmat(X[:,i],1,k);
                dists = sum(abs(centers - Xexp),1);
                classes[i] = find(min(dists) == dists)[1];
        end
        for i = 1:k
            inClass = X[:,find(classes == i)];
            if (isempty(inClass))
                centers[:,i] = nan_;
            else
                centers[:,i] = mean(inClass,2);
            end
        end
        if (all(oldCenters == centers))
            break;
        end
        oldCenters = copy(centers);
 end
 (centers, classes)
end

Did you notice any differences from the Matlab version? They are subtle:

  • Line 29 returns the result as a tuple – a return keyword is not required. Moreover, what is returned does not need to be declared in the function definition.
  • Julia implements reference semantics on arrays. This makes the copy() function necessary for assignments of full arrays (-> lines 7 and 27): after b = a, modifying b also modifies a, whereas b = copy(a) creates an independent array. For function calls this implies that a function can potentially alter its input! Julia therefore states the convention of appending a ! to the name of any function which alters its input parameters.

Besides that, the syntax of Julia can be pretty much compatible with MATLAB® – which is really impressive, IMO. Under the hood, Julia even offers much more than MATLAB® scripts are able to do: type inference and multiple dispatch, comprehensions, closures and nifty string features like variable expansion within string constants, as known from PHP. Julia utilizes the LLVM compiler suite for JIT compilation.

Julia is too young to judge, really. I personally find reference semantics for arrays somewhat confusing. But numpy does it as well and has nevertheless found a reasonable number of users.

While the above code ran after some fine tuning, the current shape of the prebuilt Windows binaries somehow prevented a deeper look in terms of performance. It still needs some quirks and bugs removed. (The Windows version had been provided only some hours earlier and was the first publicly available Windows version at all.) As soon as a more stable version comes out, I will provide some numbers – possibly with an optimized version (bsxfun is not implemented yet, which renders every comparison unfair). According to their own benchmarks, I would expect Julia to run at around the speed of ILNumerics.

Lang.NEXT 2012

I know, it is almost obligatory to show overwhelming excitement for every upcoming 'trend technology' conference. But this time it is not an act! Lang.NEXT 2012 in Redmond exhibits a truly interesting list of speakers and projects. One being Julia (I talked about 'her' in my last post). Others are:

  • IKVM.NET – enabling Java applications to run on .NET
  • Roslyn – a more than promising approach to expose an API to the C# and VB compiler services, making them more attractive for runtime utilization and
  • Dart – ‘A Well Structured Web Programming Language’ by Google

But there are also talks about C++11 and ECMAScript 6 and … just too many to really get to know them all. ‘Luckily’ I never really felt much excitement for functional-related stuff, so it's easier for me to concentrate on the ‘rest’. But that's not fair – those F# projects do deserve your attention as well. And of course D will be present at Lang.NEXT, too.

Lang.NEXT 2012 starts tomorrow. I will definitely spend some time on the recordings and eventually report back here as well.

Julia, Math .NET M#, FORTRAN .NET, managed LAPACK, MKL and outlook

With the recent advances in the ILNumerics core module, we were able to improve the computational part of our libraries a lot. Not only was the execution speed increased by orders of magnitude – while catching up with C++ and FORTRAN, the .NET platform becomes more attractive to an even wider community of scientists, engineers and programmers of numerical applications.

We find ourselves as part of a very exciting evolution. A whole bunch of young and not so young projects are targeting goals similar to those of ILNumerics: convenience and performance. One interesting project among them is the Julia language. A language very similar to the MATLAB syntax (and hence to ILNumerics' syntax as well) is combined with a JIT compiler from the LLVM suite (what else?). While the convenience of the language is out of question, the speed provided by the LLVM JIT is “in the range of 2x C++”. The language is dynamic, which marks an important difference from ILNumerics.

Interestingly enough, one of the developers of Julia has been involved in the creation of M# (according to this blog post):

Jeff [Bezanson] was a principal developer of M#, an implementation of the MATLAB language running on .NET

And this is where it starts getting even more interesting. Consider having a compiler for ‘ILM#’ (an imaginary extension of Julia/MATLAB with type safety), outputting .NET IL code and at the same time incorporating the deterministic disposal patterns of ILNumerics! However, I have not been able to find any working MATLAB-to-.NET compiler yet, and no M# project either. Anyone out there know where it lives today?

The idea of being able to convert complete MATLAB code branches into ILNumerics libraries, making them run at the speed of C/FORTRAN, is very appealing indeed. And there is another potential language as a conversion source: FORTRAN itself! While a lot of developers value the platform independence and convenience of C# over FORTRAN (especially when it comes to GUI development or even RAD) – they arguably will not love the idea of rewriting all their grown-over-the-years FORTRAN algorithms again in ILNumerics. Having the option to automatically convert that code into C#/ILNumerics would not only save them from PInvoking into native FORTRAN libraries, but would even make that code run on all platforms supported by .NET.

With this in mind, I recently did some searching for matching projects. The two attempts I found:

  • Lahey Fujitsu, LF .NET Fortran compiler. Seems to be discontinued?
  • Silverfrost FTN95: Fortran 95 for Windows

I did some tests with FTN95. With some help from Paul Laider of Salford, I have been able to create a ‘fully managed’ LAPACK version right from the netlib sources, with only very minor modifications to the official FORTRAN code. I say ‘fully managed’ because at the end you get a real .NET assembly. However, the compiler comes with some drawbacks IMO, which I will write about in a later post.

However, this brings us further toward one of our goals (and to the last CAPITALIZED buzzword in our headline): not having to rely on the MKL anymore. Since we have been able to speed up the matrix multiplication to around half the speed of the MKL, having all the LAPACK stuff within C# marks the next milestone. When all is finished, the user will have the option to choose from these deployment schemes:

  • ILNumerics fully managed version. Suitable for Silverlight, Office Addons, Visual Studio Plugins etc., <8 MB, all platforms supported, no native libs
  • ILNumerics 32 or 64 bit, with native support, platform specific, around 2 times faster, considerably larger binaries

And this is still without potential improvements on the “half the speed of MKL” issue …

As always: any comments welcome.

ILNumerics Version 2.11 released

Today's release of ILNumerics is a minor bugfix release. It ensures that the var() function stays memory efficient even in certain optimization scenarios. Also, if you were using ILNumerics in conjunction with a free Trial License, you may have encountered exceptions which did not always make clear what the problem was. They were related to the regular ILInvalidLicenseException and thus represent expected behavior. Nevertheless, it should now be easier to track those exceptions down.

Some users recently experienced problems with our Evaluation Licenses. These should be fixed now. In case you had problems getting the Evaluation License to run as well: please try again! Everything should run smoothly now.

As always: in case of problems, our forum is here to help you out fast.