pcchen said:
Java still has higher overhead than C/C++ because it can't put objects on the stack (has to be in the heap, IIRC), so it's highly dependent on automatic garbage collection.
This is both incorrect and repeats a common myth about garbage collector performance. First, Java runtimes can perform escape analysis on objects and allocate some of them on the stack. There are a couple of non-Sun VMs that do this. Second, heap allocation overhead in modern VMs is on the order of stack allocation: it consists of incrementing a pointer and returning the old one. You pay almost nothing upfront for heap allocation; the cost is paid later, amortized over time, when the collector runs.
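To make the "increment a pointer and return the old one" point concrete, here is a toy sketch in Java. BumpArena and its methods are hypothetical names invented for illustration; this is not how any particular JVM is implemented, only the shape of the allocation fast path.

```java
// Hypothetical toy arena illustrating bump-pointer allocation.
// A real VM does this on raw memory; a byte[] stands in for the young generation here.
final class BumpArena {
    private final byte[] heap; // stand-in for a young-generation region
    private int top = 0;       // offset of the next free byte

    BumpArena(int capacity) {
        heap = new byte[capacity];
    }

    /** Returns the offset of the new block, or -1 if full (a real VM would trigger a collection). */
    int allocate(int size) {
        if (size < 0 || size > heap.length - top) {
            return -1;
        }
        int old = top; // remember the old pointer
        top += size;   // bump it
        return old;    // hand back the old one
    }

    /** Toy "collection": assume everything is garbage and reset in constant time. */
    void reset() {
        top = 0;
    }

    int used() {
        return top;
    }
}
```

The fast path is one comparison and one addition, which is why the upfront cost is comparable to adjusting a stack pointer. The real cost shows up in whatever replaces reset() in an actual collector.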
The collector gradually promotes objects that live longer from a young, "stack-like" generation to an older, more "malloc()-like" generation. It then runs different algorithms on the different heaps. The "stack-like" heap, with its high-churn, short-lifespan objects, gets a parallel multithreaded copying collector. The long-lived objects in the old generation get a concurrent mark-sweep collector.
Proper tuning of the size and promotion rules for the GC heaps, guided by a built-in profiler, can reduce the "overhead" of GC to very small levels.
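Before tuning anything, it helps to measure how much time the collectors actually take. A minimal sketch using the standard java.lang.management API follows; the GcStats class name is made up for this example, and the collector names it reports depend entirely on which VM and collector you run.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: sums what the running VM reports about its collectors.
final class GcStats {
    /** Total approximate milliseconds spent collecting, across all collectors the VM exposes. */
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 means the VM does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** One line per collector: name, collection count, accumulated time. */
    static String report() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(gc.getName())
              .append(": count=").append(gc.getCollectionCount())
              .append(", timeMs=").append(gc.getCollectionTime())
              .append('\n');
        }
        return sb.toString();
    }
}
```

Comparing totalCollectionMillis() against wall-clock run time gives a rough idea of whether GC is worth tuning for a given workload.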
Other things like forced dynamic type checking and dynamic binding also make higher overhead.
None of these are forced. Not only can the JVM eliminate dynamic dispatch, it can INLINE functions across dynamic boundaries, something C++ doesn't normally do, because modern JVMs use profile-based optimizations at runtime.
Just like C++, much of the time the compiler can prove which implementation a particular virtual call dispatches to, and remove the virtual dispatch. The programmer can also devirtualize a method via "final". Unlike C++, Java VMs keep detailed profile statistics about runtime method invocations, types, and class hierarchy. Using Class Hierarchy Analysis (CHA), the VM can further rule out the possibility of a polymorphic dispatch. After CHA, using profile data, the VM can speculatively compile an inlined version of a polymorphic dispatch and do type inference.
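As a rough illustration of what devirtualization plus inlining buys you, here is a sketch. Digest, XorDigest, and DispatchDemo are hypothetical names, and the "digest" is a toy, not MD5 or SHA. Whether a given VM actually inlines run() this way depends on the VM and its profile data; the second method only shows by hand what the optimized loop is equivalent to.

```java
// Hypothetical interface with a single loaded implementation,
// the case where class hierarchy analysis can rule out polymorphism.
interface Digest {
    int update(int state, int b);
}

// "final" tells both the reader and the VM that no subclass can override update().
final class XorDigest implements Digest {
    @Override
    public int update(int state, int b) {
        return (state * 31) ^ b;
    }
}

final class DispatchDemo {
    /** Written against the interface: a virtual call per iteration, as far as the source goes. */
    static int run(Digest d, byte[] data) {
        int state = 17;
        for (byte b : data) {
            state = d.update(state, b);
        }
        return state;
    }

    /** What the loop amounts to once the call is devirtualized and inlined (done by hand here). */
    static int runInlinedByHand(byte[] data) {
        int state = 17;
        for (byte b : data) {
            state = (state * 31) ^ b;
        }
        return state;
    }
}
```

Both methods compute the same value; the point is that the OO version does not have to cost more once the VM has seen that only one Digest implementation is in play.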
Furthermore, with respect to automatic array bounds checking in loops: in the majority of cases, the checks are eliminated. In typical cases like iterating over a collection or through an array, they are easily removed.
If the compiler cannot prove it can eliminate a bounds check for the whole loop, it uses loop splitting. It splits the loop into an initialization prolog, a main part in which the bounds check is unnecessary, and an epilogue in which the bounds check still needs to occur.
In no way does a loop in Java like
for(int i=0; i<n; i++) { a[i] ... }
generate code that loops like this
for(int i=0; i<n; i++) { if(i < a.length) { a[i] ... safe, check passed } }
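Here is a hand-written sketch of what loop splitting amounts to, assuming a simple sum over an array. LoopSplit and its methods are hypothetical names; a real JIT does this on its internal representation rather than on source code, and its exact strategy will differ. The prolog is empty in this case because i starts at 0.

```java
// Hypothetical illustration of loop splitting, written out by hand.
final class LoopSplit {
    /** Straightforward version: language semantics require a bounds check on every a[i]. */
    static long sumChecked(int[] a, int n) {
        long s = 0;
        for (int i = 0; i < n; i++) {
            s += a[i];
        }
        return s;
    }

    /** Same behavior, split so the main loop provably never goes out of bounds. */
    static long sumSplit(int[] a, int n) {
        long s = 0;
        int safe = Math.min(n, a.length); // iterations known to be in bounds
        int i = 0;
        // Main loop: 0 <= i < a.length always holds, so the check is redundant here.
        for (; i < safe; i++) {
            s += a[i];
        }
        // Epilogue: only entered if n > a.length; the check is kept and throws as before.
        for (; i < n; i++) {
            s += a[i];
        }
        return s;
    }
}
```

Both versions return the same result and throw ArrayIndexOutOfBoundsException in the same situations, but in the split version the hot loop carries no per-iteration check.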
If your application is non-interactive, GC pauses won't hurt you, and you will run with similar if not better performance than C++ (if lots of polymorphism is used). If your application is interactive, GC pauses may or may not be an issue, and tuning will matter a lot.
I'm not advocating Java for games, but there is a lot of nonsense about Java performance still going around. I've personally written Java code that trounced C++ code when object-oriented features were used. One example was a C++ cryptographic library which was nicely OO. Java's cryptographic library (JCE) killed it in performance, because Java can inline polymorphic calls and C++ can't, so thousands of iterations of MD5() or SHA() ran much faster. When I converted to C with a heavily non-OO algorithm, using lots of #define macros, Java got beaten, but the resulting code was nowhere near as readable or maintainable.
C++'s big performance win over Java is templates; I won't argue there. If you use Generic Programming instead of OO programming, you'll win.
If you want to see C++ get its ass kicked by a real language that supports GC, OO, and GP, look at OCaml's benchmarks.