Unity (Pre 5.5) Memory Safe Enumerators with C5 Generic Collection Library

DISCLAIMER: The topic treated in this article is only valid for versions of Unity up to 5.4

A long time ago I posted an article on how disposable value types were treated in Unity and why they used to generate unnecessary and unwanted garbage. It emerged that in the official Mono compiler, as well as in the Microsoft C# compiler (but not in Unity), a violation of the C# specification led to an optimisation of disposable structs within a using statement. Disposable value types are used in C# mainly to implement enumerators, which are used to iterate over collections. Two years ago I decided to fix this issue by re-implementing the enumerators in a library called C5, which is a project for generic collection classes for C# and other CLI languages. However, with the release of Unity 5.5 back in March 2017, version 4.4 of the Mono C# compiler was shipped and this issue was finally fixed properly and became history.

This solution is not relevant anymore 🙂 unless you use an old version of Unity but I would still like to share with you the solution I came up with before the release of the new Mono compiler.

C5 implements a lot of data structures not provided by the standard .NET Framework, such as persistent trees, heap based priority queues, hash indexed array lists and linked lists, and events on collection changes. The source code is available on GitHub and the MIT license lets you modify and redistribute it freely. I started my journey by creating my own enumerator implementation for the main collections (ArrayList, HashDictionary, SortedDictionary etc.) and I came up with the idea of a “reusable” enumerator.

With this approach only one enumerator instance per iterated collection is used at a time. Naturally this has some limitations. For example, multiple simultaneous iterations of the same collection, multithreaded access and LINQ will not work.

To accommodate all the cases I implemented three memory models called:

  1. Normal: An enumerator is created every time the collection is iterated. This is the normal, expected behaviour and thus is not memory safe, but it supports multiple iterations, multithreading and LINQ.
  2. Safe: An enumerator is created once and then re-used. This approach doesn’t generate garbage. However, if the collection is iterated using nested loops or accessed by multiple threads, a new enumerator is created. The collection will save memory unless it is forced not to do so.
  3. Strict: An enumerator is created only once. This approach avoids garbage at all costs: if the collection is iterated using nested loops or accessed by multiple threads, an exception is thrown.

The memory model is implemented as an enum and it is passed to the constructor. For example:

   HashSet<int> a = new HashSet<int>(MemoryType.Strict);

Figure 1 – MemoryType.Normal – 56 bytes of garbage every frame

Figures 1 and 2 show two different scenarios: in the former, garbage is generated by iterating over an ArrayList, while in the latter no garbage is reported by the memory profiler.

MemoryType.Normal replicates the normal behaviour. The amount of garbage generated depends on the size of the struct used to iterate the collection, so it can vary. Figure 2 shows instead that no garbage is generated when an ArrayList is iterated.

Figure 2 – C5 ArrayList with MemoryType.Safe – No garbage

This is possible by reusing the same enumerator. Although it is not shown, 56 bytes are allocated only the first time the collection is iterated.


Currently the garbage free memory model is implemented for the following collections:

  • ArrayList<T>
  • HashedArrayList<T>
  • SortedArray<T>
  • WrappedArray<T>
  • CircularQueue<T>
  • HashSet<T>
  • TreeBag<T>
  • HashBag<T>
  • HashDictionary<K,V>
  • TreeDictionary<K,V>
  • TreeSet<T>
  • LinkedList<T>
  • HashedLinkedList<T>
  • IntervalHeap<T>

This is the source code for the MemorySafeEnumerator:

 using System;
 using System.Collections;
 using System.Collections.Generic;
 using System.Threading;

 internal abstract class MemorySafeEnumerator<T> : IEnumerator<T>, IEnumerable<T>, IDisposable {
     private static int MainThreadId;
     //-1 means an iterator is not in use.
     protected int IteratorState;
     protected MemoryType MemoryType { get; private set; }
     protected static bool IsMainThread {
         get { return Thread.CurrentThread.ManagedThreadId == MainThreadId; }
     }
     protected MemorySafeEnumerator(MemoryType memoryType) {
         // Store the requested mode so that GetEnumerator() can switch on it.
         MemoryType = memoryType;
         MainThreadId = Thread.CurrentThread.ManagedThreadId;
         IteratorState = -1;
     }
     protected abstract MemorySafeEnumerator<T> Clone();
     public abstract bool MoveNext();
     public abstract void Reset();
     public T Current { get; protected set; }
     object IEnumerator.Current {
         get { return Current; }
     }
     public virtual void Dispose() {
         // Mark the shared enumerator as free again.
         IteratorState = -1;
     }
     public IEnumerator<T> GetEnumerator() {
         IEnumerator<T> enumerator;
         switch (MemoryType) {
             case MemoryType.Normal:
                 enumerator = Clone();
                 break;
             case MemoryType.Safe:
                 if (IsMainThread) {
                     // Reuse this instance unless it is already iterating.
                     enumerator = IteratorState != -1
                         ? Clone()
                         : this;
                     IteratorState = 0;
                 }
                 else {
                     enumerator = Clone();
                 }
                 break;
             case MemoryType.Strict:
                 if (!IsMainThread) {
                     throw new ConcurrentEnumerationException("Multithread access detected! In Strict memory mode it is not possible to iterate the collection from different threads");
                 }
                 if (IteratorState != -1) {
                     throw new MultipleEnumerationException("Multiple Enumeration detected! In Strict memory mode it is not possible to iterate the collection multiple times");
                 }
                 enumerator = this;
                 IteratorState = 0;
                 break;
             default:
                 throw new ArgumentOutOfRangeException();
         }
         return enumerator;
     }
     IEnumerator IEnumerable.GetEnumerator() {
         return GetEnumerator();
     }
 }

Everything happens in the GetEnumerator() method. In Normal mode the enumerator is always cloned. In Safe mode the enumerator is cloned only for multithreaded access and/or multiple enumerations; otherwise the same instance is reused. Strict mode avoids allocation at all costs and throws an exception in those other cases.


This solution is clearly outdated and I’m glad that Unity has eventually adopted a proper version of the Mono compiler. However I had a lot of fun coding a hand-made solution, and it was also a good opportunity to dive into the nitty-gritty implementation of C5 and I learnt a lot about data structures. Next time I will remember to publish an article on time 😀

See you soon!

A Static Code Analysis in C++ for Bullet Physics


Hello folks! I’m here again, this time to talk about static analysis. If you are a developer with little to no knowledge of the subject, this is the right article for you. Static analysis is the process of analysing the code of a program without actually running it, as opposed to dynamic analysis, where code is analysed at run time. This process helps developers to identify potential design issues and bugs, to improve performance and to ensure conformance to coding guidelines. Continue reading “A Static Code Analysis in C++ for Bullet Physics”

Unity Mono Runtime – The Truth about Disposable Value Types

When I started making games using Unity, after almost 10 years of C# development, I was very concerned to learn that foreach loops are widely avoided in Unity because they allocate unnecessary memory on the heap. Personally I love the clean syntax of a foreach. It aids readability and clarity and it also raises the abstraction level. A very clear and neat explanation of the memory issue can be found in a blog article posted on Gamasutra by Wendelin Reich.

From Wendelin’s analysis it emerged that the version of the Mono compiler adopted in Unity behaves differently from Microsoft’s implementation. In particular, enumerators, which are usually implemented in the .NET framework as mutable value types, are boxed by the compiler, causing an unnecessary generation of garbage. Boxing is the process of converting a value type (allocated on the stack) into a reference type, thus allocating a new instance on the heap. Continue reading “Unity Mono Runtime – The Truth about Disposable Value Types”

Unity and Reflection – Optimising Memory using Caching on iOS



I really love reflection. Reflection is a technique used for obtaining type information at run-time. Not only that: with reflection it is possible to examine and change information of objects and to generate (technically, to emit IL) new classes, methods and so on, all at run-time. It’s a powerful technique but it is known, under certain circumstances, for being slow. If you are a game developer and you are targeting mobile devices (iOS or Android for instance) using Unity, you definitely want to preserve your memory and save precious clock cycles. Moreover, with AOT (Ahead of Time compilation) IL cannot be emitted at run-time as it is pre-generated at compile time. Therefore a large part of reflection, e.g. expression trees, anonymous types etc., is just not available.

The Problem

Recently I have worked on a dynamic prefab serializer and I needed to use reflection to retrieve types from their string representations. In general to retrieve a type in C# you have three options:

  • typeof(MyClass), which is an operator to obtain a type known at compile-time.
  • GetType() is a method you call on individual objects, to get the execution-time type of the object.
  • Type.GetType("Namespace.MyClass, MyAssembly") gives you a type from its string representation at runtime.

Continue reading “Unity and Reflection – Optimising Memory using Caching on iOS”

Profiling CUDA on Tegra K1 (Shield Tablet)

Recently I have struggled a lot to profile a CUDA application on the Shield Tablet. If you were thinking “What the hell would you need a CUDA app for, on a tablet?” I would understand :D. CUDA is not for everyday use but it can be very powerful.

As of now (Late 2015), the Shield has the most powerful mobile GPU on the market (Tegra Kepler architecture with 192 streaming processors). I decided to evaluate and profile physics algorithms using such architecture.

Reading through documentation, keynotes from GDC, and presentations, I found out that it is currently not possible to profile a CUDA application from an APK!

NVIDIA offers the Android Works package, previously called Tegra Android Development Pack. This package provides developers with a big suite of handy tools to debug, test and deploy applications on the Shield. Recently, I’ve found this presentation from the GPU Technology Conference in 2014 about profiling CUDA apps. In general, there exist several graphical and command-line tools, but only one is available for Android. See the image below:

Graphical and Command-Line Profiling Tools


As you see, for Android, you can only use nvprof. Nvprof is a command-line tool to profile CUDA applications and it will be explained in the next paragraph. If you look at the red rectangle at the bottom of the picture you will notice that CUDA APK profiling is not supported yet! I.e., if you have in your APK any CUDA kernel, or calls to any library that uses CUDA….you simply can’t profile it. Continue reading “Profiling CUDA on Tegra K1 (Shield Tablet)”

Deploying Assimp Using Visual Studio and Android NDK for Tegra Devices

Hello folks, welcome back to my blog, hope you are ready for a new adventure. This time I promise it is going to be an adventure with the capital A. I’ve been working on a finite element method algorithm using C++ (and later CUDA) to prove that the latest generation of mobile devices (more specifically the Kepler architecture in the Shield Tablet) is capable of running such complex algorithms.

The Shield is shipped with Android Kit-Kat 4.4 thus using C++ or Java and OpenGL ES 2.0 is not a problem…well not just yet 😀

Setting up the environment is not too difficult either. I used the Tegra Android Development Pack, which installs all the tools you need to start developing on Android (including extensions for Visual Studio and the whole Eclipse IDE). After a few clicks you have everything up and running.


The Problem

I need to load 3D models. Although I could have written my own parser (which I think could have been less painful), I decided to use Assimp instead. Assimp is a very handy library that can handle a plethora of different file formats. I’ve used it extensively in all my projects so far. It supports Android and iOS (as stated on its GitHub page).

I read the docs a lot, but I found no easy way (well, at least under Windows) to generate a Visual Studio solution (sorry, I’m a Visual Studio addict) to compile it using the Android NDK. I searched the web for a long while and I found a couple of articles that explain how to compile Assimp for Android (this: Assimp on Desktop and Mobile and this other: Compile Assimp Open Source Library For Android). The procedure is quite troublesome and requires Cygwin under Windows and a lot of patience. Luckily, in the second article mentioned above, the author posted a pre-compiled Assimp 3.0 lib with headers included.

Download Assimp 3.0 lib for Android here.

Having Assimp already compiled was truly helpful. It saved me a lot of time that I would have spent figuring out how to put everything together.

Here comes the tricky part. Assimp was compiled as a shared library (an .so). Referencing it is pretty easy: the include and lib paths have to be set and then the name of the library specified. Visual Studio doesn’t use the Android.mk (whereas Eclipse does, I think) that tells the Ant build and the apk builder how to pack the apk and which local shared libs to include. This has to be done in the project’s properties instead.

After setting up the whole thing, the solution compiled, linked and the apk was created correctly. I was confident that Assimp would be deployed with the apk, but I soon found out it was not. Surprisingly I got this error instead on the tablet when I ran the application:

Unfortunately, NativeActivity has stopped…

Looking at the LogCat I found this error message too:


Figure 1

“java.lang.IllegalArgumentException: Unable to load native library: /data/app-lib/com.shield.fem-1/libShieldFiniteElementMethod.so”, which told me absolutely nothing about the nature of the problem. Fortunately, the only thing I knew I had changed was the reference to Assimp, so it was clear to me that this was the cause of the problem. But the why and the how weren’t explained at all by the log files. It was easy to spot, though: I looked at the output window and libassimp.so (see Figure 2 below) was not included at all.

Figure 2 – Output library list

The Solutions

I found two solutions for this issue. I like to call them “The easy way” and “The way of pain” respectively. I had already added an external library (I had to use libpng for loading textures), but in that case it went smoothly because it was a static library. Static libraries are .a (or in Windows .lib) files. All the code relating to the library is in this file, and it is directly linked into the program at compile time. Shared libraries are .so (or in Windows .dll, or in OS X .dylib) files. All the code relating to the library is in this file, and it is referenced at run-time by the programs using it, which is why it is not deployed with the apk unless explicitly requested.

Way of pain

DISCLAIMER: This solution involves rooting your device, so I’m not responsible if it voids your warranty. Please do it at your own risk.

This was my first attempt to shove in libassimp. By default all the libraries stored in /system/lib on the device are loaded automatically at startup, so it is seamless: if a lib is there, the running process can use it. I used the command adb shell (adb is installed as part of the development pack), which gave me access to the bash-like shell on the tablet. As I expected, Assimp was not in the system lib folder. My first idea was to upload the lib manually into /system/lib, so I ran:

 adb push libassimp.so /system/lib

Unless your Android device is rooted and /system is mounted read-write, this is the message you will get:

Failed to copy ‘libassimp.so’ to ‘/system/lib/libassimp.so’: Read-only file system

The only solution, as I said, is to root your device first. This can be quite painful and it depends on your model. There are a few good guides around. Use Google, grab a cup of coffee and have a lot of patience. Personally, to root mine (a Shield Tegra) I used this guide and the app adbd Insecure, available on Google Play, which lets you run adbd in root mode once your device has been rooted.

At this stage I assume your Android friend is rooted so you can finally remount the system folder in order to add read-write permissions. Use this command:

adb shell
root@shieldtablet:/ # mount -o rw,remount /system

Later if you want you can restore its original read-only permission by executing:

adb shell
root@shieldtablet:/ # mount -o ro,remount /system

OK, at that stage I had permission to do whatever I wanted with /system, so I was finally able to upload Assimp. Executing the adb push command again showed no error this time:


Figure 3 – Upload has been successful!

At this stage I didn’t have to do anything really. Once the application starts it will load Assimp (and any other libs in there) automatically.

The Easy Way

I found this easier solution only after I went through hell using the first, painful approach (trust me, it took me a while to understand how to root the device and which commands to run). Here you don’t need to root your device at all, but you will have to change your code a little bit to load Assimp (and shared libs in general) dynamically. Let’s start!

First of all, I didn’t know it was possible to upload shared libraries through Visual Studio (d’oh!). I didn’t find it written anywhere (well, maybe I didn’t search well) but looking at my project’s properties I found this:

Figure 4 – Project properties

In the Ant build it is possible to specify Native library dependencies! At this very stage I would imagine you laughing knowing what I went through with the “way of pain” 😀

Anyway, I set references to Assimp right here, look at figure 5:

Figure 5 – Project properties with the reference to Assimp

Using this approach the shared library is built seamlessly into the apk! The only drawback is that it won’t be loaded automatically, so another little trick is needed. If you try to execute/debug your program now, you will likely get the same error message as in Figure 1 again.

You need to load any shared library before your native activity. To do this a Java class is to be used. Something like:

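package com.your.package;

public class Loader extends android.app.NativeActivity {
    // Load libassimp.so before the native activity and its own library start.
    static { System.loadLibrary("assimp"); }
}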

It is important that Loader.java goes under the folder src in your project and that it is wrapped in a folder structure that respects your package declaration (I know if you’re a Java guy it is evident for you, but I’m more a C#/C++ one so it took me again a while to figure it out 😛 ).

The last bit: change your AndroidManifest.xml. android:hasCode must be set to true, and android:name in the activity tag must change from android.app.NativeActivity to Loader (i.e. the name of your Java class).

 <!-- Our activity is the built-in NativeActivity framework class.  This will take care of integrating with our NDK code. -->

That’s finally it!


I’m a total newbie with Android development and it’s been quite hard for me to figure out how to deploy a shared library in Visual Studio, as it wasn’t very intuitive. A lot of the examples I found online use command line scripts to compile and/or different IDEs. The most common approach is to use an .mk file in which properties, libraries etc. are defined. Mk files are (apparently) completely ignored by VS so it wasn’t possible for me to use one.

I really hope this article can help you. I am looking forward to reading your comments, hoping that there are other simpler ways to achieve what I did today.

See you soon!

C++ Tail Recursion Using 64-bit variables – Part 2

In my previous post I talked about recursion problems in a Fibonacci function using 64-bit variables as function parameters, compiled using the Microsoft Visual C++ compiler. It turned out that while tail recursion was enabled by the compiler with 32-bit types, it was not when switching to 64-bit ones. Just as a reminder, tail recursion is an optimization performed by the compiler: it is the process of transforming certain types of tail calls into jumps instead of function calls. More about tail recursion here.

My conclusion was that tail recursion is not handled properly by the Visual C++ compiler and a possible explanation could be the presence of a bug.

The calculation of Fibonacci sequences of big integers is not an everyday task but it can still be a reliable example to show how tail calls are implemented.

Not happy with my conclusions, and following several suggestions from readers’ comments (here on the blog, on Reddit and on StackOverflow), I wanted to understand more about this issue and to explore other solutions using different compilers.
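
One way to compare compilers and integer widths is to write the accumulator-style function once as a template and instantiate it for both 32-bit and 64-bit types, then look at the assembly each compiler emits. The sketch below is only an illustration under that assumption; it is not the code from the previous post, and the name fib_acc is made up for this example.

```cpp
#include <cstdint>

// Hypothetical helper, not the function from the original post.
// The recursive call is in tail position, so a compiler may replace it with a jump.
template <typename UInt>
UInt fib_acc(UInt n, UInt a = 0, UInt b = 1) {
    if (n == 0) {
        return a;  // a holds F(n) once the counter reaches zero
    }
    return fib_acc<UInt>(n - 1, b, a + b);  // tail call: no work remains after it
}

// Explicit instantiations so both widths appear in the generated assembly.
template std::uint32_t fib_acc<std::uint32_t>(std::uint32_t, std::uint32_t, std::uint32_t);
template std::uint64_t fib_acc<std::uint64_t>(std::uint64_t, std::uint64_t, std::uint64_t);
```

With GCC or Clang, compiling with -O2 -S and searching the output for a remaining call to fib_acc is one way to check whether the tail call was eliminated; with Visual C++, /O2 /FA produces a comparable listing.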

Continue reading “C++ Tail Recursion Using 64-bit variables – Part 2”

C++ Tail Recursion Using 64-bit variables

For this second coding adventure I want to share with you a problem I ran into while comparing iterative and recursive functions in C++. There are several differences between recursion and iteration; this article explains the topic nicely if you want to know more. In general, in languages like Java, C, and Python, recursion is fairly expensive compared to iteration because it requires the allocation of a new stack frame. It is possible to eliminate this overhead in C/C++ by enabling a compiler optimization that performs tail recursion, which transforms certain types of recursion (actually, certain types of tail calls) into jumps instead of function calls. For the compiler to perform this optimization, the last thing a function does before it returns must be to call another function (in this case itself). In this scenario it should be safe to jump to the start of the second routine. The main disadvantage of recursion in imperative languages is that it is not always possible to have tail calls, which means that the return address (and related variables, such as structs) is pushed onto the stack at each call. For deeply recursive functions this can cause a stack-overflow exception because of the limit on the maximum size of the stack, which is typically smaller than the size of RAM by quite a few orders of magnitude.
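
To make the tail-call condition concrete, here is a minimal sketch contrasting three hypothetical functions (the names are made up for this example, and this is not the function from the full article). In fib_plain the addition still has to run after both recursive calls return, so neither call is a tail call. In fib_tail the recursive call is the very last action, which is the shape a compiler can turn into a jump. fib_loop shows the loop that such an optimisation effectively produces.

```cpp
#include <cstdint>

// Not tail recursive: the '+' runs after the recursive calls return,
// so each pending call keeps its own stack frame alive.
inline std::uint64_t fib_plain(std::uint64_t n) {
    return n < 2 ? n : fib_plain(n - 1) + fib_plain(n - 2);
}

// Tail recursive: partial results travel in the accumulators a and b,
// and the recursive call is the last thing the function does.
inline std::uint64_t fib_tail(std::uint64_t n, std::uint64_t a = 0, std::uint64_t b = 1) {
    return n == 0 ? a : fib_tail(n - 1, b, a + b);
}

// The same computation written as a loop, with no stack growth at all.
inline std::uint64_t fib_loop(std::uint64_t n) {
    std::uint64_t a = 0, b = 1;
    while (n-- > 0) {
        const std::uint64_t next = a + b;
        a = b;
        b = next;
    }
    return a;
}
```

All three return the same values for small inputs; they differ only in how much stack they need, and whether fib_tail actually compiles down to a loop depends on the compiler and its optimisation settings.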

I have written a simple Fibonacci function as an exercise in C++ using Visual Studio to test Tail Recursion and to see how it works: Continue reading “C++ Tail Recursion Using 64-bit variables”