Contrary to what some inexperienced developers think, the NDK is not a silver bullet for every performance problem you face in your app. Before you decide to rewrite a portion of your code in C, think twice. Your effort might not pay off as much as you expect, because native code is not unconditionally faster than Dalvik code.
If you have already spent some time learning the NDK, you know that it can't do everything either. There is no such thing as a "100% native app", since the NDK can only operate in certain areas. No matter how hard you try, the main UI framework of your app will one way or another run on Dalvik. For example, if you want to process touch events in your native code, you have to capture them in Java code (using onTouchEvent()) and then pass the data to native code. In such cases, it's worth remembering that native method invocations carry a penalty. The "bridging" cost of moving argument data across the boundary and shielding Java code from the native code can defeat the whole purpose of using native code for better performance.
In addition, if your native code calls a lot of Java methods, keep in mind that "reverse" JNI invocations are costly too. In general, if your native code and Java code are heavily intermixed, you are unlikely to gain much from the NDK.
Add to that the facts that 1) Java is not as slow as one might think and 2) native code is more bug-prone and requires more maintenance, and you really need good grounds for using the NDK in each specific case. Is there any way to tell in advance whether native code will be faster than a given piece of Java code? I don't think so. What you should do is measure the performance of Java vs native code in each specific case.
Example app: bitmap processing
I created a simple app that looks like this:

When you press Start, the app applies a Sharpen effect to the bitmap using a simple convolution matrix. The Native toggle button switches between the native implementation of the processing algorithm and the Java one.
(I attached the source of this app so you can play with it on your own device - the native library is pre-compiled.)
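To make the "simple convolution matrix" concrete: the expression 5 * center - left - right - top - bottom used in the processing code corresponds to a 3x3 kernel with 5 in the middle, -1 on the four direct neighbours and 0 in the corners. The following is a minimal sketch of that idea on a single 8-bit channel. The class and method names (SharpenKernelDemo, sharpenAt) are made up for illustration and are not part of the sample app.

```java
// Illustrative sketch only: applies the 3x3 sharpen kernel to one 8-bit channel.
// SharpenKernelDemo and sharpenAt are hypothetical names, not part of the sample app.
final class SharpenKernelDemo {

    // Kernel equivalent to "5 * center - left - right - top - bottom".
    static final int[][] KERNEL = {
        { 0, -1,  0},
        {-1,  5, -1},
        { 0, -1,  0},
    };

    // Convolves the pixel at (x, y) of a row-major, single-channel image.
    // The caller must keep (x, y) at least one pixel away from the border.
    static int sharpenAt(int[] channel, int width, int x, int y) {
        int sum = 0;
        for (int ky = -1; ky <= 1; ky++) {
            for (int kx = -1; kx <= 1; kx++) {
                sum += KERNEL[ky + 1][kx + 1] * channel[(y + ky) * width + (x + kx)];
            }
        }
        // Clamp to the valid 0..255 range, as clip() does in the article's code.
        return Math.max(0, Math.min(255, sum));
    }

    public static void main(String[] args) {
        int[] flat = {100, 100, 100, 100, 100, 100, 100, 100, 100};
        int[] spot = {100, 100, 100, 100, 120, 100, 100, 100, 100};
        System.out.println(sharpenAt(flat, 3, 1, 1)); // prints 100
        System.out.println(sharpenAt(spot, 3, 1, 1)); // prints 200
    }
}
```

Because the kernel weights sum to 1, a uniform area keeps its value, while a pixel that differs from its neighbours has that difference amplified. The real code does the same thing for the red, green and blue channels of each packed pixel.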
Here's the code of the Java bitmap processing algorithm:
@Override
public void processBitmap(RawBitmap src, RawBitmap dest) {
    final int[] srcPixels = src.getPixels();
    final int[] destPixels = dest.getPixels();
    final int width = src.getWidth();
    final int height = src.getHeight();

    // Start at (1, 1): the one-pixel border has no full neighbourhood and is skipped.
    int pos = width + 1;
    for (int y = 1; y < height - 1; y++) {
        for (int x = 1; x < width - 1; x++) {
            int left = srcPixels[pos - 1];
            int right = srcPixels[pos + 1];
            int top = srcPixels[pos - width];
            int bottom = srcPixels[pos + width];
            int center = srcPixels[pos];

            int r = 5 * red(center) - red(left) - red(right) - red(top) - red(bottom);
            int g = 5 * green(center) - green(left) - green(right) - green(top) - green(bottom);
            int b = 5 * blue(center) - blue(left) - blue(right) - blue(top) - blue(bottom);

            // clipping
            r = clip(r);
            g = clip(g);
            b = clip(b);

            destPixels[pos] = rgb(r, g, b);
            ++pos;
        }
        // Skip the last pixel of this row and the first pixel of the next one.
        pos += 2;
    }
}

private static int clip(int v) {
    return (v < 255) ? (v < 0 ? 0 : v) : 255;
}

private static int rgb(int r, int g, int b) {
    return 0xff000000 | (((r & 0xff) << 16) | ((g & 0xff) << 8) | (b & 0xff));
}

private static int red(int color) { return (color >> 16) & 0xff; }
private static int green(int color) { return (color >> 8) & 0xff; }
private static int blue(int color) { return color & 0xff; }
Here's the C counterpart. Note it's almost the same, except we need to use JNI mechanics to get access to array data:
#include <jni.h>

/* Helpers are defined before use; "static inline" keeps them local to this file. */
static inline int clip(int v) { return (v < 255 ? (v < 0 ? 0 : v) : 255); }
static inline int red(int c) { return (c >> 16) & 0xff; }
static inline int green(int c) { return (c >> 8) & 0xff; }
static inline int blue(int c) { return (c & 0xff); }
static inline int rgb(int r, int g, int b) {
    return (int) (0xff000000u | (((r & 0xff) << 16) | ((g & 0xff) << 8) | (b & 0xff)));
}

JNIEXPORT void JNICALL
Java_com_wiseandroid_samples_ndkperf_NativeBitmapProcessor_nativeProcess(
        JNIEnv* env,
        jclass classObject,
        jintArray srcArray,
        jintArray destArray,
        jint width,
        jint height) {
    jint* srcPixels;
    jint* destPixels;
    jint y, x, pos;

    srcPixels = (*env)->GetIntArrayElements(env, srcArray, NULL);
    if (srcPixels == NULL) {
        return; /* could not get the source array elements */
    }
    destPixels = (*env)->GetIntArrayElements(env, destArray, NULL);
    if (destPixels == NULL) {
        (*env)->ReleaseIntArrayElements(env, srcArray, srcPixels, JNI_ABORT);
        return;
    }

    /* Start at (1, 1): the one-pixel border is skipped. */
    pos = width + 1;
    for (y = 1; y < height - 1; y++) {
        for (x = 1; x < width - 1; x++) {
            int left = srcPixels[pos - 1];
            int right = srcPixels[pos + 1];
            int top = srcPixels[pos - width];
            int bottom = srcPixels[pos + width];
            int center = srcPixels[pos];

            int r = 5 * red(center) - red(left) - red(right) - red(top) - red(bottom);
            int g = 5 * green(center) - green(left) - green(right) - green(top) - green(bottom);
            int b = 5 * blue(center) - blue(left) - blue(right) - blue(top) - blue(bottom);

            /* clipping */
            r = clip(r);
            g = clip(g);
            b = clip(b);

            destPixels[pos] = rgb(r, g, b);
            ++pos;
        }
        /* Skip the last pixel of this row and the first pixel of the next one. */
        pos += 2;
    }

    /* The source was only read, so JNI_ABORT releases it without copying back. */
    (*env)->ReleaseIntArrayElements(env, srcArray, srcPixels, JNI_ABORT);
    /* Mode 0 copies any changes back to the Java array and frees the buffer. */
    (*env)->ReleaseIntArrayElements(env, destArray, destPixels, 0);
}
The application does 3 iterations of processing and uses the average of all iterations as the resulting time. In addition, you might be interested in my RawBitmap class that wraps bitmap data for easier modification.
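The timing code itself is not shown here, so the following is only a sketch of how such an averaged measurement might look, assuming the work to be timed can be wrapped in a Runnable. AverageTimer and averageMillis are hypothetical names, and the loop in main is a stand-in workload rather than the real bitmap processor.

```java
// Illustrative sketch only: averages the wall-clock time of several runs.
// AverageTimer and averageMillis are hypothetical names, not part of the sample app.
final class AverageTimer {

    // Runs the task `iterations` times and returns the mean duration in milliseconds.
    static double averageMillis(Runnable task, int iterations) {
        if (iterations <= 0) {
            throw new IllegalArgumentException("iterations must be positive");
        }
        long totalNanos = 0;
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            task.run();
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / (iterations * 1_000_000.0);
    }

    public static void main(String[] args) {
        // Stand-in workload; in the app this would be one processBitmap() call.
        int[] pixels = new int[512 * 512];
        Runnable work = () -> {
            for (int i = 0; i < pixels.length; i++) {
                pixels[i] = pixels[i] * 31 + i;
            }
        };
        System.out.printf("average: %.2f ms%n", averageMillis(work, 3));
    }
}
```

This sketch has no warm-up run, so the first iteration may include one-time costs such as class loading. With only a few iterations that can noticeably skew the average, which is worth keeping in mind when comparing the two implementations.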
The outcome
In my case, the outcome was amazing. Native code beat Java by a factor of 20: the average time for the Java code was around 1000 ms, while the native code did the same thing in about 50 ms. Now I can be sure that in this and similar cases it's worth writing native code in the first place.
Conclusion
As you can see, rewriting some parts of your app in native code can bring large performance gains. However, before you do that, you should spend some effort prototyping and measuring the actual performance advantages you're about to get. I suggest the following strategy for using NDK:
- Find the bottlenecks with profiling tools and focus on the problematic areas, so that you move as little code as possible to native while gaining as much as possible.
- Make sure to switch between Java and native code as rarely as possible to reduce "bridging" costs.
- Test actual performance gains in an isolated prototype app.
- Don't leak memory in the native code. Release every JNI resource you acquire, as with the arrays in the example above.
Sticking to that strategy will allow you to benefit from native code as much as possible.
Source: ndkperf.zip (37.42 kb)