    • CommentAuthor: jaredbroad
    • CommentTime: Jun 23rd 2010
     
    Hey Forum,

    I am running up against speed barriers again, and re-profiling the application revealed this:

    ILArray<double> ilX = Property.SampleData[":;2"];
    ILArray<double> ilY = Property.SampleData[":;5"];
    this.RawData[ePeriod] = STMath.Gradient(ilX, ilY);

    Calculating the gradient takes 1% of my application's time, but simply *selecting* the ilX and ilY arrays takes 8% each (16% total!)... any ideas how to select a sub-array faster? There are numerous selections like this throughout the code and I've done everything I can to minimise double work so it comes down to this or re-writing the gradient algorithm.

    Sample data is a 23,000 row x 15 col double data array.

    Thank you,

    Jared
    • CommentAuthor: jaredbroad
    • CommentTime: Jun 23rd 2010
     
    I played with this:
    // Initialise the IL memory pool
    ILNumerics.Misc.ILMemoryPool.Pool.Reset(20000, 500);

    This improved overall speed by 30%, but it only shifted the problem to the same issue elsewhere: a bisection method I wrote to select subarrays of the whole day's data. Same problem, basically...

    return ilAllData["0:" + iProcessedCacheIdx.ToString() + ";:"];
    return ilAllData[iSearch.ToString() + ":" + (iProcessedCacheIdx - 1).ToString() + ";:"];

    The two lines above account for almost 20% of the application's workload, and they are called only 15,000 times each.

    They are simply selecting a subarray... Thanks for any assistance,

    Jared
    • CommentAuthor: haymo
    • CommentTime: Jun 23rd 2010
     

    Since it appears you are mainly dealing with vectors here, the referencing feature is probably not the culprit. You may as well disable it completely by setting ILSettings.MinimumRefDimensions to 10 or so.

    How did you measure the workload for single functions? With a profiler?

    I assume you are aware of the pitfalls of measuring managed applications (in order to get comparable results)? Most of the uncertainty is introduced by the garbage collector running in the background, so we cannot be sure that the workload is really produced by code in those functions. It may instead be the GC kicking in while those functions are executing. You can make sure by carefully profiling the GC and the managed heap size(s) as well. Here the Windows performance counter snap-in can be a good friend.
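    One rough way to check whether the GC is running inside a suspect call is to compare the collection counts before and after it. The sketch below is an illustration only: GcProbe and Measure are hypothetical names, and it uses nothing beyond the standard GC.CollectionCount and Stopwatch APIs.

```csharp
using System;
using System.Diagnostics;

static class GcProbe
{
    // Runs 'work' once. Returns, per GC generation, how many collections
    // happened while it ran, and reports the elapsed wall-clock time.
    public static int[] Measure(Action work, out TimeSpan elapsed)
    {
        int generations = GC.MaxGeneration + 1;
        int[] before = new int[generations];
        for (int g = 0; g < generations; g++)
            before[g] = GC.CollectionCount(g);

        Stopwatch sw = Stopwatch.StartNew();
        work();
        sw.Stop();
        elapsed = sw.Elapsed;

        int[] delta = new int[generations];
        for (int g = 0; g < generations; g++)
            delta[g] = GC.CollectionCount(g) - before[g];
        return delta;
    }
}
```

    Wrapping a batch of subarray selections in the delegate passed to Measure would show whether collections cluster around them. Non-zero counts in the higher generations during a call that should be cheap would point at the GC rather than at the selection code itself.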

    • CommentAuthor: jaredbroad
    • CommentTime: Jun 23rd 2010
     
    Thanks for the reply, Haymo,

    I am profiling the application with RedGate ANTS 5.1; it is pretty good.

    * I found MinimumRefDimensions while poking through the IL source and tried it on a hunch, but it just caused the memory usage to explode (1200MB) with no change in speed.

    * I have performance counter form controls running, and when I added ILNumerics.Misc.ILMemoryPool.Pool.Reset(20000, 900); the memory usage dropped from about 600MB to 30MB. It is pretty stable at 25-40MB.

    * I don't need copies of the master-data for analysis, a read-only reference to a range is fine. (i.e. Shouldn't I be able to spend 100bytes to reference a vector of 100Mb?)

    I'm only maintaining one master, and the rest are subsets of the master (e.g. selecting the latest/top 10 minutes of data). I'm not disposing or creating any bulk data.
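    To illustrate the kind of cheap reference to a range described above, here is a minimal sketch that is independent of ILNumerics. It assumes the master data sits in a single column-major double[]; ColumnMajorMatrix and ColumnView are hypothetical names. A run of rows within one column is then contiguous, so a standard ArraySegment<double> can point at it without copying.

```csharp
using System;

sealed class ColumnMajorMatrix
{
    // Column-major layout: element (r, c) lives at data[c * Rows + r].
    private readonly double[] data;
    public int Rows { get; private set; }
    public int Cols { get; private set; }

    public ColumnMajorMatrix(int rows, int cols)
    {
        Rows = rows;
        Cols = cols;
        data = new double[rows * cols];
    }

    public double this[int r, int c]
    {
        get { return data[c * Rows + r]; }
        set { data[c * Rows + r] = value; }
    }

    // Returns a view of rows [firstRow, firstRow + count) of column c.
    // No elements are copied; the segment only records array, offset and count.
    public ArraySegment<double> ColumnView(int c, int firstRow, int count)
    {
        if (c < 0 || c >= Cols) throw new ArgumentOutOfRangeException("c");
        if (firstRow < 0 || count < 0 || firstRow + count > Rows)
            throw new ArgumentOutOfRangeException("firstRow");
        return new ArraySegment<double>(data, c * Rows + firstRow, count);
    }
}
```

    Element i of a view v is read as v.Array[v.Offset + i]. Note that ArraySegment exposes the underlying array, so the view is read-only by convention only, and a sub-range spanning several columns is not contiguous in this layout and would need one view per column.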

    To confirm, I ran the RedGate Memory Profiler and it showed that 30-40% of processor time was spent in GC. Any ideas how to stop this? Maybe I should change how it references the arrays?
    When MinimumRef dimensions=10 it only spent 10% of time in GC.

    Thanks

    Jared
    • CommentAuthor: haymo
    • CommentTime: Jun 24th 2010
     

    * It appears your application is heading in the direction of 'number crunching'? This is not the intended area for any managed math lib. Anyway, you are not lost ... ;)

    * There is nothing wrong with the memory filling up! You bought the memory, why not use it? "Stable at 40MB" is good for unmanaged applications. The managed heap will get used up completely before the GC automatically kicks in. If it kicks in so often with such a small heap, there is a chance that it is being triggered manually somewhere. If so, try to get around it!

    * I suggest you first try disabling the referencing feature (via MinRefDimensions or by using ILNumericsLight). Let the memory grow and see whether you reach a "stable" memory usage value for your problem size. Don't hesitate to let this limit get large. Do use the Pool! If you do not get any OutOfMemoryExceptions, everything is fine and this is the best you can easily get.

    * Getting further is also possible - at the risk of getting wasteful as well...

    • CommentAuthor: jaredbroad
    • CommentTime: Jun 25th 2010
     
    Thank you for the reply,

    I will keep these ideas in mind. Possibly I will store the data as basic double arrays instead of ILArrays and write my own vector wrappers if needed later. For now it's not the best, but it's not worth writing my own matrix lib.

    Just FYI I got OutOfMemoryExceptions with MinRefDimensions=10, PC idles around 3GB usage, nearly 4GB(full) when running with MinRef=10.

    From my understanding, the MemoryPool keeps fixed-dimension storage from being disposed so that it can be reused in the next loop. My masters are basically static objects with a fixed 25,000 rows pre-allocated (never disposed), and I wrap my own search/indexing method around them. The references vary in length (10 to 20,000 rows, all 16 columns). Since I'm only disposing about 10 references in each loop (rather than the 10000 pt array in the example), it didn't seem to apply.

    How would I use the MemPool in this example?
    ILArray<double> ilX = Property.SampleData[":;2"];
    ILArray<double> ilY = Property.SampleData[":;5"];
    this.RawData[ePeriod] = STMath.Gradient(ilX, ilY);

    Export ilX, ilY to double[] and then save that to the pool? (And analyse the double arrays instead of vectors?)

    Thank you,
    • CommentAuthor: haymo
    • CommentTime: Jun 25th 2010
     

    No, just call ilX.Dispose() and ilY.Dispose(). Since these are vectors, they own individual storage, which is good for reuse on matching requests. If the size of 'SampleData' does not change, that storage can be reused in the next cycle.

    Also, could you please post your ILMemoryPool statistics? (http://ilnumerics.net/main.php?site=51432)

    • CommentAuthor: jaredbroad
    • CommentTime: Jun 25th 2010 (edited)
     
    I added the Dispose calls; the difference is negligible (benchmark of 5000 iterations = 18300ms, with Dispose 18200ms).

    ILNumerics.Misc.ILMemoryPool.Pool.Info(true)
    "CurMB: 0| CurObj: 0| ReclMB: 0| ReclObj: 0"

    That could be why :)?

    - So I changed the ILMemPool settings to accept shorter arrays:
    ILNumerics.Misc.ILMemoryPool.Pool.Reset(200, 900);

    - Which predictably gives this:
    "CurMB: 188| CurObj: 15479| ReclMB: 459| ReclObj: 92629"
    But the 5000-iteration benchmark now takes 45175ms.

    The profiler is still showing a 25% load on selecting the sub-reference arrays.

    So! I added the ILNumerics source to my project and profiled it at the same time. The percentages were shaved off down through the method tree, but eventually it came down to "RegularSpacedList.Add(int)" being called 21 million times for 20% of total CPU (for 31,000 reference requests, about 6 per iteration of the 5000-iteration benchmark).

    for (int t = start; t <= ende;) {
        idxList.Add (t++); // 20% CPU load, 21,000,000 hits
    }

    Any thoughts? Am I doing something wrong or is this an integral part of ILNumerics?
    • CommentAuthor: jaredbroad
    • CommentTime: Jun 26th 2010
     
    Changing Count saved another 200ms:
    /// <summary>
    /// Add elements to this list
    /// </summary>
    /// <param name="value">new element</param>
    public new void Add(int value) {
    base.Add(value);
    int iCnt = Count;
    if (iCnt == 1) {
    m_lastValue = value;
    } else if (iCnt == 2) {

    I think calling the property "Count" actually counts the elements of the array. It won't change between those lines, so this is a fairly safe change.

    So it's compiling a list of the indexes... If it's an "unstepped range", can we make it just note the start and end index? E.g. for a 100-million-element array this command could take minutes, versus simply noting the start and end position plus the master address. You already have "start/ende". And similarly, could it remember a "Func" for addressing a stepped range?

    Anyway.. thanks for help, looks like that is as fast as it can be for now. Nice excuse for RAM/CPU upgrade ;)
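    The idea of noting only the start and end can be sketched as follows. This is not ILNumerics code: RegularRange is a hypothetical type, restricted here to positive steps, that computes each index on demand instead of storing it.

```csharp
using System;

// Describes the indices start, start + step, start + 2 * step, ... up to
// and including 'end' (when reachable) without materialising them.
struct RegularRange
{
    public readonly int Start;
    public readonly int Step;
    public readonly int Count;

    public RegularRange(int start, int end, int step)
    {
        if (step < 1) throw new ArgumentOutOfRangeException("step");
        Start = start;
        Step = step;
        // An end before the start yields an empty range.
        Count = end < start ? 0 : (end - start) / step + 1;
    }

    // Maps position i within the range to an index into the master array.
    public int this[int i]
    {
        get
        {
            if (i < 0 || i >= Count) throw new IndexOutOfRangeException();
            return Start + i * Step;
        }
    }
}
```

    Creating such a range costs the same whether it covers ten rows or a hundred million, and each lookup is one multiply and one add. An irregular index set would still need a stored list.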
    • CommentAuthor: haymo
    • CommentTime: Jun 26th 2010
     

    Jared, you are of course absolutely right! Subarray creation still has a lot of potential for performance improvements. For example, on writing A[":;4"], an ILRange object is created first, then an ILIndexOffset is derived from it and stored in the destination array. ILRange, at least, is probably completely avoidable. Also, for regularly spaced ranges, as you found, the addressing indices do not have to be stored explicitly. Right now this is done for simplicity only. (That way, it is possible to handle ALL referencing storages the same way when one must find the final index into the data array. Otherwise, there would be if-else cascades or virtual function calls.) So the whole of subarray creation (referencing or not) might be another big area for future improvements ...

    But the inherent problem remains, even if subarray creation were more efficient. One way I am thinking of right now would be to avoid the "new" operator 'completely'. This would imply having all computational functions work on predefined storage and/or in place on incoming arrays. This has not been done so far, since I could not figure out a way to make the nice operator overloads for ILArray work in that scheme too without losing the intuitive syntax. Possibly there will be 2 interfaces ...
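    As a rough illustration of a function working on predefined storage, here is a sketch on plain double[] buffers. It is neither ILNumerics code nor the STMath.Gradient from earlier in the thread: InPlaceMath.Gradient is a hypothetical finite-difference slope that writes into a buffer supplied by the caller, so repeated calls allocate nothing.

```csharp
using System;

static class InPlaceMath
{
    // Writes the finite-difference slope dy/dx of the first n points into
    // 'result'. One-sided differences are used at the two ends and central
    // differences in between. 'result' must not be the same array as x or y,
    // because earlier inputs are still read after earlier outputs are written.
    public static void Gradient(double[] x, double[] y, double[] result, int n)
    {
        if (n < 2) throw new ArgumentException("need at least two points", "n");
        if (x.Length < n || y.Length < n || result.Length < n)
            throw new ArgumentException("buffers are shorter than n");

        result[0] = (y[1] - y[0]) / (x[1] - x[0]);
        for (int i = 1; i < n - 1; i++)
            result[i] = (y[i + 1] - y[i - 1]) / (x[i + 1] - x[i - 1]);
        result[n - 1] = (y[n - 1] - y[n - 2]) / (x[n - 1] - x[n - 2]);
    }
}
```

    The caller allocates the result buffer once, sized for the largest expected n, and passes it in on every cycle. The price is the one mentioned above: the call no longer reads like an expression built from operator overloads.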