Profiling CUDA on Tegra K1 (Shield Tablet)

Recently I struggled a lot to profile a CUDA application on the Shield Tablet. If you are thinking “What the hell would you need a CUDA app for, on a tablet?” I would understand :D. CUDA is not for everyday use, but it can be very powerful.

As of now (late 2015), the Shield has the most powerful mobile GPU on the market (Tegra K1, Kepler architecture with 192 streaming processors). I decided to evaluate and profile physics algorithms on this architecture.

Reading through the documentation, GDC keynotes, and presentations, I found out that it is currently not possible to profile a CUDA application from an APK!

Read More

Deploying Assimp Using Visual Studio and Android NDK for Tegra Devices

Hello folks, welcome back to my blog, I hope you are ready for a new adventure. This time I promise it is going to be an adventure with a capital A. I’ve been working on a finite element method algorithm in C++ (and later CUDA) to prove that the latest generation of mobile devices (more specifically the Kepler architecture in the Shield Tablet) is capable of running such complex algorithms.

The Shield ships with Android 4.4 KitKat, so using C++ or Java and OpenGL ES 2.0 is not a problem… well, not just yet 😀

Setting up the environment is not too difficult either. I used the Tegra Android Development Pack, which installs all the tools you need to start developing on Android (including extensions for Visual Studio and the whole Eclipse IDE). After a few clicks you have everything up and running.

Read More

C++ Tail Recursion Using 64-bit variables – Part 2

In my previous post I talked about recursion problems in a Fibonacci function that takes 64-bit variables as parameters, compiled with the Microsoft Visual C++ compiler. It turned out that while the compiler applied tail recursion with 32-bit types, it did not when switching to 64-bit ones. Just as a reminder, tail recursion is an optimization performed by the compiler. It is the process of transforming certain types of tail calls into jumps instead of function calls. More about tail recursion here.

My conclusion was that tail recursion is not handled properly by the Visual C++ compiler and a possible explanation could be the presence of a bug.

The calculation of Fibonacci sequences of big integers is not an everyday task but it can still be a reliable example to show how tail calls are implemented.

Not happy with my conclusions, and following several suggestions from readers’ comments (here on the blog, on Reddit and on StackOverflow), I wanted to understand more about this issue and to explore other solutions using different compilers.

Read More

C++ Tail Recursion Using 64-bit variables

For this second coding adventure I want to share a problem I ran into while comparing iterative and recursive functions in C++. There are several differences between recursion and iteration; this article explains the topic nicely if you want to know more. In general, in languages like Java, C, and Python, recursion is fairly expensive compared to iteration because each call requires the allocation of a new stack frame. In C/C++ it is possible to eliminate this overhead by enabling compiler optimizations that perform tail recursion, which transforms certain types of recursion (actually, certain types of tail calls) into jumps instead of function calls. For the compiler to perform this optimization, the last thing a function does before it returns must be a call to another function (in this case itself). In this scenario it should be safe to jump to the start of the second routine.

The main disadvantage of recursion in imperative languages is that tail calls are not always possible, which means that the return address and the function's local variables (structs, for instance) are pushed onto the stack at each call. For deeply recursive functions this can cause a stack overflow, because the maximum size of the stack is limited and is typically smaller than the size of RAM by quite a few orders of magnitude.

I have written a simple Fibonacci function in C++ as an exercise, using Visual Studio, to test tail recursion and see how it works:

Read More