SuperGNES™
SNES Emulator for Android Phones

Chipping away at the bottlenecks

The ARM CPU (which most mobile devices run on) is a very interesting processor to optimize for. Your typical Intel chip has few registers, tons of MHz and fat L2 and L3 memory caches. ARM is different. It’s RISC based, has 16 registers and has a small CPU cache (which the Linux kernel hogs :( ). The lack of cache makes you code differently: you need to get as much work done as possible within those 16 registers before you have to read or write to memory. This can make a big difference in speed. In C++ terms this means local variables and constants are your friends. Using gcc’s option to output assembly code has revealed how certain routines of ours are killing the memory bus. On Intel, taxing the memory bus is not a problem because the L2 cache is so fast. For ARM, however, we’ve taken the small cache into account and refactored our code to be more memory I/O efficient, and so far it has increased our performance. More FPS improvement news will follow soon.
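To make the "local variables are your friends" point concrete, here is a minimal sketch. It is not taken from the SuperGNES source: the `Blender` struct and both function names are made up for illustration. The first version touches the struct fields inside the loop. Since `dst` is a byte pointer that could alias the struct, the compiler typically has to reload `brightness` and store `checksum` on every pass. The second version copies both into locals, which can usually stay in registers until the loop is done.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical state, for illustration only.
struct Blender {
    std::uint32_t brightness;  // scale factor, 0..256
    std::uint32_t checksum;    // running total of the output bytes
};

// Works through memory: b.brightness is read and b.checksum is
// read and written on every iteration.
void scale_through_memory(Blender& b, const std::uint8_t* src,
                          std::uint8_t* dst, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] = static_cast<std::uint8_t>((src[i] * b.brightness) >> 8);
        b.checksum += dst[i];
    }
}

// Works in locals: the struct is read once before the loop and
// written once after it.
void scale_in_registers(Blender& b, const std::uint8_t* src,
                        std::uint8_t* dst, std::size_t n) {
    const std::uint32_t brightness = b.brightness;
    std::uint32_t checksum = b.checksum;
    for (std::size_t i = 0; i < n; ++i) {
        const std::uint8_t out =
            static_cast<std::uint8_t>((src[i] * brightness) >> 8);
        dst[i] = out;
        checksum += out;
    }
    b.checksum = checksum;  // single store back to memory
}
```

Both functions produce the same output. Whether the second one is actually faster depends on the compiler and the chip, so it is worth comparing the generated assembly (for example with `g++ -O2 -S`) and counting the loads and stores inside the loop.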

For more optimization info check out “Writing Efficient C” and “C Code Optimization”.

6 Responses to “Chipping away at the bottlenecks”

  1. Tyler says:

    Keep up the great work guys! Still waiting on that donation link. And if you need beta testers, etc etc. ;-)

  2. frito says:

    How much performance?

  3. Paul Bruner says:

    God, don’t get me started. Look at the assembly for just the ARM processor. Not only does each instruction use 4 bits for a condition to run, but each one can also do a shift!

    Granted, the MIPS32 can do shift instructions, but it’s an entirely different animal from even programming in x86 assembly. You can do a few instructions at one time for each instruction on a 6502 :P

  4. admin says:

    With the graphics optimizations so far we’ve gotten about a 5-10 frame per second increase. Still some ways to go. We’re really excited about the new CPU core code that we’re working on. Should be a big win when it gets integrated.

  5. Frito says:

    Man, yong released snesoid and it runs every game that I can find besides ones that have 3D and a few others.
    It runs smooth with sound on also. I’d say in the 20 to 50 FPS range.
