Reversing Games with... Hashcat???

Wherein I investigate New Super Mario Bros. Wii, use Hashcat to help me recover symbols from the Nvidia Shield port, and go on a bunch of tangents about the other Nintendo games that use the same engine.


I've been poking at the NSMB series for a long time, starting with the level editor I wrote for the original DS game in 2007, and then re-wrote and open-sourced. When the Wii version was released in 2009, I built Reggie! Level Editor for it, and I subsequently spent a few years reverse-engineering the game engine while working on the Newer Super Mario Bros. Wii mod.

Nintendo has a long tradition of shipping games with interesting leftovers. Many first-party GameCube games came with .map files produced by the linker, telling you the name and address of every symbol in the executable. This was less common on the Wii, but they still sent out a few titles with un-stripped .sel files (used by their bargain-basement shared library system) that effectively gave us the same info. They even made the same mistake on the Switch; multiple versions of Splatoon 2 included full symbols.

They've never done this for any of the New Super Mario Bros. games, though. We don't know if there's an official name for this engine, but we've found the following titles so far that all use it:

This cross-pollination of code has some fun effects. The DS games all share the same crash debug screen. The single-purpose "Save Data Update Channel" for Skyward Sword includes random pieces of AC: City Folk.


I've researched NSMBW heavily, and built a ton of knowledge about the game engine. It always irked me though that I had very few official names for aspects of the engine. The state machine system in NSMBW gave us the names for classes that used it, since the binary included a plaintext name for every state (like daYoshi_c::StateID_Jump). I also had some names for engine classes, since City Folk and SM64DS were both compiled with RTTI (runtime type information) enabled.

We did have full symbols for the Zelda titles on the GameCube. Both Wind Waker and Twilight Princess are based off what appeared to be a predecessor to the engine in NSMB, sharing many of its concepts and even the cryptic naming conventions that led to functions like fpcLyIt_OnlyHereLY and mDoDvdThd_param_c::addition.

They're not the same, though. One major example is the actor/process system: in WW/TP, each actor/process has a table of function pointers that has to be awkwardly passed around. In the NSMB iteration, these are just virtual functions on the fBase_c class which can be overridden by subclasses as necessary.

I'd lost hope that Nintendo would ever leak symbols for NSMBW (or one of the closely related titles), since it seemed like they'd moved on. The 3DS and Wii U iterations of the NSMB franchise are significantly different from an internal standpoint, and future titles would almost certainly diverge further even if we were somehow lucky enough to get symbols for them. Unless...?

NVIDIA joins the battle!

Nintendo has historically offered their games in China through limited partnerships with other companies, leading to odd things like the iQue Player (a digital-distribution-only console based on the Nintendo 64) and, more recently, the release of some high-definition Wii ports for the NVIDIA Shield in 2017.

Very few games were ported, but NSMBW was one of them. This immediately sparked my interest - ongoing development on the game meant another chance for somebody to mess up and leak useful info.

Since the Shield runs Android, the games are shipped as Android APKs (although they won't run on other devices, since they include DRM that ties them to Nvidia's hardware). The assets are all encrypted, but a tool like iQiPack can be used to unpack them. Combined with some traditional Android reversing techniques, it's pretty easy to learn about how they're put together.

Lingcod Overview

The Shield games use a bespoke Wii emulator called Lingcod. It cannot run unmodified Wii games; it appears to rely heavily on high-level emulation. Games are modified and recompiled, but the code is still PowerPC (Lingcod does just-in-time recompilation). The game's DVD filesystem is present, along with various Lingcod-specific configuration files and assets.

The NSMB-specific nsmb.ini file contains some interesting info about how the game is put together, including configuration parameters that weren't available in the non-Shield builds:

###################################################################################################
# New Super Mario Brothers [Production Build] use preprocessor define USE_NSMB_PRD
###################################################################################################
#  DVDRoot     = %P4ROOT%\odin\northport\%GAME_BRANCH%\nsmb\source\US\PRD\RVL\NDEVImage\DVDRoot
#  ELF Files   = %P4ROOT%\odin\northport\%GAME_BRANCH%\nsmb\source\US\PRD\RVL\NDEVImage\WIIMJ2DNP.elf
#  Source Code = %P4ROOT%\odin\northport\%GAME_BRANCH%\nsmb\source
###################################################################################################

###################################################################################################
# New Super Mario Brothers [Developer Build] use preprocessor define USE_NSMB_DEV
###################################################################################################
#  DVDRoot     = %P4ROOT%\odin\northport\%GAME_BRANCH%\nsmb\source\US\DEV\RVL\NDEVImage\DVDRoot
#  ELF Files   = %P4ROOT%\odin\northport\%GAME_BRANCH%\nsmb\source\US\DEV\RVL\NDEVImage\WIIMJ2DNV.elf
#  Source Code = %P4ROOT%\odin\northport\%GAME_BRANCH%\nsmb\source
###################################################################################################

###################################################################################################
# New Super Mario Brothers [Debug Build] use preprocessor define USE_NSMB_DBG
###################################################################################################
#  DVDRoot     = %P4ROOT%\odin\northport\%GAME_BRANCH%\nsmb\source\US\DBG\RVL\NDEVImage\DVDRoot
#  ELF Files   = %P4ROOT%\odin\northport\%GAME_BRANCH%\nsmb\source\US\DBG\RVL\NDEVImage\WIIMJ2DND.elf
#  Source Code = %P4ROOT%\odin\northport\%GAME_BRANCH%\nsmb\source
###################################################################################################

@include "nsmb/nsmb_controller.ini"

####################################################################
### Use this block to boot NSMB Production version
####################################################################
@if USE_NSMB_PRD
[ELF_FILE]
ELF_LAUNCH=WIIMJ2DNP.elf
ELF_SAVE_GAME=default       # You can use this .INI setting to switch between different 'save' game files
ELF_SOURCE=%P4ROOT%\odin\northport\%GAME_BRANCH%\nsmb\source

[DATA_ASSET_MANAGER]
DATA_LOCATION=%P4ROOT%\odin\northport\%GAME_BRANCH%\nsmb\source\US\PRD\RVL\NDEVImage
FILE_CACHE_SIZE=32          # Size of the disk-file cache in MB
pakfile1=nsmb_prd.pak
dvdroot1=DVDRoot
@endif
####################################################################

####################################################################
### Use this block to boot NSMB Pilot version
####################################################################
@if USE_NSMB_PLT
[ELF_FILE]
ELF_LAUNCH=WIIMJ2DNL.elf
ELF_SAVE_GAME=default       # You can use this .INI setting to switch between different 'save' game files
ELF_SOURCE=%P4ROOT%\odin\northport\%GAME_BRANCH%\nsmb\source

[DATA_ASSET_MANAGER]
DATA_LOCATION=%P4ROOT%\odin\northport\%GAME_BRANCH%\nsmb\source\US\PLT\RVL\NDEVImage
FILE_CACHE_SIZE=32          # Size of the disk-file cache in MB
pakfile1=nsmb_plt.pak
dvdroot1=DVDRoot
@endif
####################################################################

####################################################################
### Use this block to boot NSMB Developer version
####################################################################
@if USE_NSMB_DEV
[ELF_FILE]
ELF_LAUNCH=WIIMJ2DNV.elf
ELF_SAVE_GAME=default       # You can use this .INI setting to switch between different 'save' game files
ELF_SOURCE=%P4ROOT%\odin\northport\%GAME_BRANCH%\nsmb\source

[DATA_ASSET_MANAGER]
DATA_LOCATION=%P4ROOT%\odin\northport\%GAME_BRANCH%\nsmb\source\US\DEV\RVL\NDEVImage
FILE_CACHE_SIZE=32          # Size of the disk-file cache in MB
pakfile1=nsmb_dev.pak
dvdroot1=DVDRoot
@endif
####################################################################

####################################################################
### Use this block to boot NSMB Debug version
####################################################################
@if USE_NSMB_DBG
[ELF_FILE]
ELF_LAUNCH=WIIMJ2DND.elf
ELF_SAVE_GAME=default       # You can use this .INI setting to switch between different 'save' game files
ELF_SOURCE=%P4ROOT%\odin\northport\%GAME_BRANCH%\nsmb\source

[DATA_ASSET_MANAGER]
DATA_LOCATION=%P4ROOT%\odin\northport\%GAME_BRANCH%\nsmb\source\US\DBG\RVL\NDEVImage
FILE_CACHE_SIZE=32          # Size of the disk-file cache in MB
pakfile1=nsmb_dbg.pak
dvdroot1=DVDRoot
@endif
####################################################################

[TEXTURE_OVERRIDE_HASH]
NORMAL_MAP_HASH_PC=14499773120875347533
NORMAL_MAP_HASH_ANDROID=13351154790720945832

[NSMB]
NOTCH_SPEED=0x1C0   ; Original Value in Game: 0x230, the larger the faster
CANNON_SPEED=0x140      ; Original Value in Game: 0x190, the larger the faster
WIRE_ADD_RATE=0x500     ; Original Value in Game: 0x400, the larger the slower
WIRE_SUB_RATE=0x500     ; Original Value in Game: 0x400, the larger the slower
AUTO_PILOTING=false
AUTO_PILOTING_COURSE_INDEX=-1 ; when AUTO_PILOTING is enabled, -1 means run all game courses. Value in [0, 68] is to run a specific course (see d_s_restart_crsin_static.cpp)
CREDIT_SCREEN_SHORTCUT=false    ; Change this to true will load credit screen on start
NINTENDO_LOGO_TIME=2            ; Display time of Nintendo logo if no key is pressed
NINTENDO_LOGO_CANCEL_TIME=1     ; Minimal display time before keypress can dismiss Nintendo
NV_LOGO_TIME=2                  ; Display time of LightSpeed Studios logo if no key is pressed
NV_LOGO_CANCEL_TIME=0           ; Minimal display time before keypress can dismiss LightSpeed Studios logo
MOC_SCREEN_TIME=2               ; Display time of MoC Anti-addiction screen if no key is pressed
MOC_SCREEN_CANCEL_TIME=0        ; Minimal display time before keypress can dismiss MoC Anti-addiction screen
DOF_SCALE=1.5                   ; Additional multiplier after Depth of Field effect is scaled down based on EFB multiply

[SYSTEM_MENU]
# about, controls & achievements menu options are opt-in. 
ENABLE_ABOUT_PAGE=1
ENABLE_CONTROLS_PAGE=1

There's a patch_functions.ini file which explains which functions get high-level emulated, with some fun comments - all the C standard library functions from string.h are marked as "On by default unless we find issues", except for memcmp, which is disabled with the note "Disabling because implementation on aarch64 doesn't match with ppc".

This also sheds light on how the Lingcod options are implemented. There are stub functions added to the game, like lingcod_getIniState and lingcod_OpenMovie, which do nothing by default but activate specific behaviours when running inside Lingcod.

Ultimately, though, my goal was to learn more about the game itself and not so much about the emulator. From nsmb.ini I now knew that there was a source file named d_s_restart_crsin_static.cpp, which is neat, but... not all that useful.

Debug builds of the game would have had linker maps inside the maps/ directory, but unfortunately this was still missing from the Shield production builds. The existence of patch_functions.ini suggested, however, that there must be some form of symbol map somewhere. How else would Lingcod be able to resolve lingcod_getIniState to the address 0x801007B0?

Reading the Symbol Map

GameCube and Wii games include their main executables using the .dol format, which is incredibly simple. The header specifies the address, offset and size of each section, along with the entry point address. No other metadata is provided.
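To make "incredibly simple" concrete, here is a minimal Python sketch of reading a .dol header. The field offsets (18 section slots, 7 text and 11 data, followed by the BSS range and the entry point) come from widely circulated community documentation of the format rather than from this article, so treat them as an assumption. The function and helper names are made up for illustration.

```python
import struct

NUM_TEXT, NUM_DATA = 7, 11  # section slots in the header (assumed layout)


def parse_dol_header(header: bytes):
    """Parse the first 0x100 bytes of a .dol file (all fields big-endian)."""
    count = NUM_TEXT + NUM_DATA
    offsets = struct.unpack_from(f">{count}I", header, 0x00)
    addresses = struct.unpack_from(f">{count}I", header, 0x48)
    sizes = struct.unpack_from(f">{count}I", header, 0x90)
    bss_address, bss_size, entry_point = struct.unpack_from(">3I", header, 0xD8)

    sections = []
    for index, (offset, address, size) in enumerate(zip(offsets, addresses, sizes)):
        if size == 0:
            continue  # unused slot
        kind = "text" if index < NUM_TEXT else "data"
        sections.append({"kind": kind, "offset": offset, "address": address, "size": size})
    return sections, (bss_address, bss_size), entry_point


def build_demo_header() -> bytes:
    """Hypothetical header with a single 0x20-byte text section, for demonstration."""
    header = bytearray(0x100)
    struct.pack_into(">I", header, 0x00, 0x100)       # file offset of text0
    struct.pack_into(">I", header, 0x48, 0x80004000)  # load address of text0
    struct.pack_into(">I", header, 0x90, 0x20)        # size of text0
    struct.pack_into(">I", header, 0xE0, 0x80004000)  # entry point
    return bytes(header)


if __name__ == "__main__":
    sections, bss, entry = parse_dol_header(build_demo_header())
    print(sections, bss, hex(entry))
```

Nothing in that header names anything, which is why a symbol table has to come from somewhere else.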

Lingcod instead includes an .alf file. It was fairly easy to figure out the format using just a hex editor. There's a minimal header, a set of code/data sections and then... the coveted symbol table. Unfortunately, there's a catch. It doesn't include names; it just includes two seemingly random 32-bit values for each symbol. The two values were identical for some symbols, but different for many others. I assumed (correctly) that these were hashes of the mangled and demangled names of the symbol in question.

I needed to know how the hashes were computed. After some fairly uninteresting reversing work on libnsmb.so (a native ARM64 binary contained in the Android APK, with crudely obfuscated strings), I had my answer. It was a very simple algorithm:

h = 0x00001505
for each character c:
  h = (h * 33) ^ c
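As a runnable version of the pseudocode above, here is a small Python sketch. It assumes the running value is truncated to 32 bits (the table stores 32-bit values) and that names are plain ASCII. The function name symbol_hash is made up for illustration.

```python
def symbol_hash(name: str) -> int:
    """Hash a symbol name: start at 0x1505, then multiply by 33 and XOR in each byte."""
    h = 0x1505
    for byte in name.encode("ascii"):
        # Masking on every step keeps the value within 32 bits (assumed width).
        h = ((h * 33) ^ byte) & 0xFFFFFFFF
    return h


if __name__ == "__main__":
    for name in ("", "a", "__dt__Q23m3d5mdl_cFv"):
        print(f"{name!r}: #{symbol_hash(name):08X}")
```

Because the hash only ever mixes in one character at a time, two names that share a prefix share the intermediate state for that prefix, which is convenient when testing many suffixes against one class name.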

I had a pool of possible symbol names to test this theory on, courtesy of other Wii games (since lots of library code is shared across titles) and my own prior reversing work on the game. I modified my crude ALF parser to test all the hashes against that list, with vaguely promising results.

All of the SDK functions I expected to be present were indeed named by this table. I'd even successfully named the destructors for some non-SDK classes, thanks to the names leaked by the state machine system and by Animal Crossing's RTTI. It was surprisingly rewarding to see those come up.

04:8016c030 00000044 00000000 | b'#6F194B24' b'#A781B404'
04:8016c080 00000068 00000000 | __dt__Q23m3d5mdl_cFv b'#C1CD0CA3'
04:8016c0f0 00000154 00000000 | b'#9E61F1C2' b'#152DDF2A'
04:8016c250 00000038 00000000 | b'#95BE8615' b'#A75404F5'
04:8016c290 00000008 00000000 | b'#A430FC17' b'#69091E07'
04:8016c2a0 00000034 00000000 | b'#DAFC0AD7' b'#ABE0A0B7'
04:8016c2e0 00000074 00000000 | b'#2A507E91' b'#3E25FCDB'
04:8016c360 00000008 00000000 | b'#2E0671CD' b'#0543E930'
04:8016c370 00000018 00000000 | b'#EAD2A71D' b'#53EFA331'
04:8016c390 00000060 00000000 | __dt__Q23m3d9scnLeaf_cFv b'#3729AD16'
04:8016c3f0 00000044 00000000 | b'#82298AEC' b'#39361280'
04:8016c440 00000008 00000000 | b'#06E948DE' b'#C2E3DE32'
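The matching step behind a listing like this can be sketched in a few lines of Python. This is not the parser used for the article: the helper names are hypothetical, the hash follows the pseudocode given earlier, and 32-bit truncation is assumed. Unknown hashes fall back to a #XXXXXXXX placeholder, as in the output above.

```python
def symbol_hash(name: str) -> int:
    """Hash from the earlier pseudocode, truncated to 32 bits (assumed)."""
    h = 0x1505
    for byte in name.encode("ascii"):
        h = ((h * 33) ^ byte) & 0xFFFFFFFF
    return h


def build_lookup(candidate_names):
    """Map hash -> name for every candidate symbol name we already know."""
    return {symbol_hash(name): name for name in candidate_names}


def label(hash_value: int, lookup) -> str:
    """Return the recovered name, or a '#XXXXXXXX' placeholder when unknown."""
    return lookup.get(hash_value, f"#{hash_value:08X}")


if __name__ == "__main__":
    # Two names taken from the listing; a real pool would be far larger.
    pool = ["__dt__Q23m3d5mdl_cFv", "__dt__Q23m3d9scnLeaf_cFv"]
    lookup = build_lookup(pool)
    print(label(symbol_hash(pool[0]), lookup))
    print(label(0x6F194B24, lookup))
```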

Looking into libnsmb.so also gave me one further lead. I suspected that Nvidia may have decided to high-level emulate more of the game's functions, even if they weren't listed in any of the INIs. That was indeed the case. One function was patched, and I had the name for it: searchNodeByID__9fLiMgBa_cCF9fBaseID_e

That demangles to fLiMgBa_c::searchNodeByID( fBaseID_e ) const. After comparing it with my heavily documented copy of NSMBW (EU 1.0), this turned out to be the function that searches a linked list for an fBase_c actor/process instance with a particular unique ID.

This is where I stopped in April 2019. Brute-forcing names seemed unrealistic. How was I supposed to figure anything out when I had to guess the names of functions and classes and the signatures... especially with a naming scheme this opaque?

The Inevitable AC:NH Tangent

In March 2020, a lot of things happened, including the release of Animal Crossing: New Horizons. That one game with the turnips and Raymond.

With AC:NH, I sank myself into reversing a Nintendo game for the first time in ages, as an escape from my impending dissertation deadline (I passed, thankfully) and the state of the Outside World.

AC:NH stores tons of tabular data in .bcsv files (spicy CSVs, minus the comma separation, which I guess just makes them Vs). The fields in these are identified using CRC32 checksums of their name and type, like ResName string64 or Month2021 u8.
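A rough Python illustration of that kind of field identifier follows. It assumes the standard CRC-32 (the one zlib implements) and that the hashed string is simply the name and type joined by a single space, as the examples are written. Neither detail is confirmed here, and field_hash is a made-up name.

```python
import zlib


def field_hash(name: str, type_name: str) -> int:
    """CRC32 of 'name type' (assumed input format), as an unsigned 32-bit value."""
    text = f"{name} {type_name}"
    return zlib.crc32(text.encode("ascii")) & 0xFFFFFFFF


if __name__ == "__main__":
    for name, type_name in (("ResName", "string64"), ("Month2021", "u8")):
        print(f"{name} {type_name}: {field_hash(name, type_name):08x}")
```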

There's also a bunch of .byml files (spicy YAML) which encode the structure of the savefile, used when migrating save data from one game version to another. In these, fields are identified using two 32-bit MurmurHash3 hashes, one of the field name and one of the type name.
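For reference, the 32-bit variant of MurmurHash3 is short enough to write out in Python. This is a sketch of the public algorithm (the x86 32-bit flavour), not code taken from the game. The seed AC:NH uses is not stated here, so seed=0 is only a placeholder default.

```python
MASK32 = 0xFFFFFFFF


def rotl32(value: int, count: int) -> int:
    """Rotate a 32-bit value left by count bits."""
    return ((value << count) | (value >> (32 - count))) & MASK32


def murmur3_32(data: bytes, seed: int = 0) -> int:
    """MurmurHash3 (x86, 32-bit) of data; the seed value is an assumption."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & MASK32
    body_len = len(data) - (len(data) % 4)

    # Body: mix in each complete little-endian 4-byte block.
    for i in range(0, body_len, 4):
        k = int.from_bytes(data[i:i + 4], "little")
        k = (rotl32((k * c1) & MASK32, 15) * c2) & MASK32
        h = rotl32(h ^ k, 13)
        h = (h * 5 + 0xE6546B64) & MASK32

    # Tail: the remaining 1-3 bytes, if any.
    tail = data[body_len:]
    if tail:
        k = int.from_bytes(tail, "little")
        k = (rotl32((k * c1) & MASK32, 15) * c2) & MASK32
        h ^= k

    # Finalisation: fold in the length and avalanche the bits.
    h ^= len(data)
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & MASK32
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK32
    h ^= h >> 16
    return h


if __name__ == "__main__":
    print(f"{murmur3_32(b'ResName'):08x}")
```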

I once again had a reason to try and crack hashes. These were slightly less intimidating than NSMBW's symbols, because I didn't have to figure out quite as many components at once.

OpenCL Moment

So, I wondered... what if I used hashcat and offloaded it to my GPU? Can a lowly GTX 1060 help me discover Tom Nook's secrets?

Hashcat includes a CRC32 kernel. It doesn't have MurmurHash3, but all the kernels are written using OpenCL so it's pretty easy for me to take the CRC32 one and replace the calculations.

Brute-forcing these hashes on a character-by-character basis is not really an option. All of the ones I've dealt with, across AC:NH and NSMBW, are 32-bit, which produces a vast number of collisions. Naively assuming an equal distribution, there's a 1 in 4,294,967,296 chance of a given string matching a particular hash. That seems tiny. However, if I want to brute-force a 10-character string composed of letters, numbers and underscores, then that's 63^10 possibilities... and I would expect to receive around 229,321,954 strings that claim to match my hash.
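The arithmetic behind that estimate is easy to reproduce. The sketch below makes the same naive assumption of a uniformly distributed 32-bit hash; the function name is illustrative only.

```python
ALPHABET_SIZE = 26 + 26 + 10 + 1  # upper case, lower case, digits, underscore


def expected_false_positives(length: int,
                             alphabet_size: int = ALPHABET_SIZE,
                             hash_bits: int = 32) -> float:
    """Expected number of strings of this length that match one given hash."""
    candidates = alphabet_size ** length
    return candidates / 2 ** hash_bits


if __name__ == "__main__":
    for length in (6, 8, 10):
        print(length, round(expected_false_positives(length)))
```

Every extra character multiplies the number of false matches by 63, so longer names only make pure brute force worse.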

That's not going to realistically work, even if I have the computing power to compute that many hashes. What else?

Compiling Dictionaries

Hashcat lets you combine two lists of words together, so that seemed to be my best option. If I submit "Red, Green, Blue" and "Sandwich, Turnip" then it'll try RedSandwich, RedTurnip, GreenSandwich, GreenTurnip, BlueSandwich and BlueTurnip.

I wrote some scripts which would try and assemble lists of words that I expected to see in the names - and crucially, combinations of words as well. If I want to figure out the name "SpecialNpcBitFlag", then my list needs to include two parts of that string so that hashcat can assemble them together and go "aye, this is it, Special + NpcBitFlag matches this one hash" or whatever.
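In spirit, the combination approach does something like the following Python sketch, only at GPU speed. It is an illustration, not hashcat's implementation. The helper names are hypothetical, and for simplicity it hashes the bare name with CRC32, whereas the real field identifiers also involve the type.

```python
from itertools import product
import zlib


def crc32(text: str) -> int:
    """Unsigned CRC32 of an ASCII string (stand-in hash for this demo)."""
    return zlib.crc32(text.encode("ascii")) & 0xFFFFFFFF


def combine(left, right):
    """Yield every left+right concatenation, left word varying slowest."""
    for a, b in product(left, right):
        yield a + b


def crack(target_hashes, left, right, hash_fn=crc32):
    """Return {hash: [matching candidates]} for the combined word lists."""
    hits = {}
    for candidate in combine(left, right):
        h = hash_fn(candidate)
        if h in target_hashes:
            hits.setdefault(h, []).append(candidate)
    return hits


if __name__ == "__main__":
    print(list(combine(["Red", "Green", "Blue"], ["Sandwich", "Turnip"])))
    # Pretend we only know the hash, not the name behind it.
    target = crc32("SpecialNpcBitFlag")
    print(crack({target}, ["Special", "Red"], ["NpcBitFlag", "Turnip"]))
```

Keeping a list of candidates per hash matters: with 32-bit hashes, more than one combination can match, and a human still has to pick the plausible one.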

My main approach was to gather strings from as many sources as possible:

I didn't just throw these into the list directly - I also split them up into their component parts. The command EventFlowActionSetDeliveryItem (seen in many scripts) gave me ActionSet, ActionSetDelivery, ActionSetDeliveryItem, SetDelivery, SetDeliveryItem and DeliveryItem. This undoubtedly results in lots of nonsense combinations but improves my chances of successfully guessing what a multiple-word name is going to contain.
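One way to produce fragments like those is to split a name at capital letters and emit every run of two or more consecutive words. The Python sketch below reproduces the example above under two assumptions that are guesses at the original scripts' behaviour: the EventFlow prefix is dropped first, and single words are handled separately. All names here are made up for illustration.

```python
import re


def split_camel_case(name: str):
    """Split 'ActionSetDelivery' into ['Action', 'Set', 'Delivery']."""
    return re.findall(r"[A-Z][a-z0-9]*", name)


def word_runs(words, min_words: int = 2):
    """Every run of at least min_words consecutive words, joined back together."""
    return [
        "".join(words[start:end])
        for start in range(len(words))
        for end in range(start + min_words, len(words) + 1)
    ]


def fragments(name: str, prefix: str = "EventFlow"):
    """Strip a known prefix (assumption), then list the multi-word fragments."""
    if name.startswith(prefix):
        name = name[len(prefix):]
    return word_runs(split_camel_case(name))


if __name__ == "__main__":
    print(fragments("EventFlowActionSetDeliveryItem"))
```

This simple splitter breaks acronyms such as NPC into single letters, so a real word-list builder would need extra rules for those.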

This iterative technique gave me tons of names. I was able to document the vast majority of the fields in the AC:NH savefile, in turn increasing my understanding of the game's code since so many subsystems would read from/write to structures in it.

I did some fun things with Animal Crossing, but I eventually burned out hard and wanted to move on to something different.

Returning to the Mushroom Kingdom

In late 2020, I decided to try dabbling with NSMBW again. After my success with cracking well over a thousand hashes in New Horizons, I wondered if I could apply the same tricks to the Mario symbol map.

Firstly, here's a bit of background on the exact problem space. NSMBW is written in C++. One of the cool features in C++ is that you can overload functions (create multiple functions with the same name that take different parameters). In order to distinguish these, compilers use what is called C++ name mangling, where some garbage is added to the function name.

CodeWarrior Name Mangling

If I define the function add_numbers(int a, int b), the CodeWarrior compiler (used by all GameCube and Wii games) will mangle this name by adding __Fii to the end, giving the end result add_numbers__Fii.

Methods (and functions in a namespace) also have the name of the class/namespace added, so that m3d::anmTexPat_c::checkFrame(float, long) const becomes checkFrame__Q23m3d11anmTexPat_cCFfl. It looks like nonsense but ultimately makes sense once you split it up into its components.

Mangled              Explanation
checkFrame           method name
Q23m3d11anmTexPat_c  the Q signifies a multi-part name with 2 elements, followed by m3d (3 characters long) and anmTexPat_c (11 characters long)
C                    the method is const
F                    this is a method or function (no distinction is made), and the arguments follow
f                    a float
l                    a long

Certain special methods, like constructors and operators, are identified by fixed names beginning with two underscores. The constructor for EGG::Allocator that accepts EGG::Heap*, long as its two arguments is mangled to __ct__Q23EGG9AllocatorFPQ23EGG4Heapl:

Mangled           Explanation
__ct              this is a constructor
Q23EGG9Allocator  a multi-part name encoding EGG::Allocator
F                 this is a method or function
P                 a pointer to...
Q23EGG4Heap       ...an EGG::Heap
l                 a long

All the primitive types get a single-character code. Modifiers like pointers, references, signedness and const are specified before the type.

Static class variables and global variables inside namespaces are also mangled, but in a less painful form. C++ does not let you have two globals with the same name, so there's no need to specify any types; you just specify what the variable is inside of.

Mangled                        Demangled
m_tmpCtProfName__7fBase_c      fBase_c::m_tmpCtProfName
TYPE_NAME__Q34nw4r3g3d6ScnMdl  nw4r::g3d::ScnMdl::TYPE_NAME
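The rules described so far are compact enough to capture in a toy mangler. This Python sketch covers only the cases shown above (qualified names, the const marker, and pre-encoded argument codes supplied by the caller). It is not a complete implementation of CodeWarrior's scheme, and the function names are made up.

```python
def encode_qualified(parts) -> str:
    """Encode ['m3d', 'anmTexPat_c'] as 'Q23m3d11anmTexPat_c'; one part has no Q prefix."""
    encoded = "".join(f"{len(part)}{part}" for part in parts)
    return encoded if len(parts) == 1 else f"Q{len(parts)}{encoded}"


def mangle_function(name: str, scope=(), arg_codes: str = "v", const: bool = False) -> str:
    """Mangle a function or method; arg_codes must already be encoded (e.g. 'fl')."""
    owner = encode_qualified(scope) if scope else ""
    const_marker = "C" if const else ""
    return f"{name}__{owner}{const_marker}F{arg_codes}"


def mangle_variable(name: str, scope) -> str:
    """Mangle a static member or namespaced global: no type information at all."""
    return f"{name}__{encode_qualified(scope)}"


if __name__ == "__main__":
    print(mangle_function("add_numbers", (), "ii"))
    print(mangle_function("checkFrame", ("m3d", "anmTexPat_c"), "fl", const=True))
    heap_pointer = "P" + encode_qualified(("EGG", "Heap"))
    print(mangle_function("__ct", ("EGG", "Allocator"), heap_pointer + "l"))
    print(mangle_variable("TYPE_NAME", ("nw4r", "g3d", "ScnMdl")))
```

Going in this direction is trivial. The hard part is that a hash cracker has to guess every input to these functions at once.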

Once you get the hang of this scheme, it's alright. It's a nightmare for cracking names, though. The same hashed string encodes the name of the function/method, the name of the class, the types of the arguments, and whether it is const or not (for methods).

The First Attempt

I started up a new Rust project and wrote code which would try and create some word lists, in a similar fashion to what I did for AC:NH. I took strings from a bunch of places:

Using these, I compiled two major lists. One included words (and combinations of words) which I expected to see inside method names. The other one included class names that I expected to see.

My first attempt was very basic. I tried hashing everything in my word list combined with everything in my class list, and then coupled that with common signatures like (const char*), (void) and (float, float). While rudimentary, this got me a bunch of low-hanging fruit such as:

I knew I needed to expand my class list, since without knowing what classes existed, I had no hope of even trying to crack the method names. An easy way to do that was by relying on symbols that many classes would have.

C++ Classes

Any class with at least one virtual function will have a vtable (containing pointers to all its virtual method implementations), which is always named __vt__[class name].

Destructors are almost always __dt__[class name]Fv (except for when multiple inheritance is involved, but NSMBW almost never uses that), as they cannot have arguments.

Note for nitpickers: technically, destructors do have an argument. Some compilers generate multiple variants of the destructor for each class. CodeWarrior on PowerPC only generates one, and accepts a hidden argument in r4 which specifies how it should behave.

This, however, is not encoded in the mangled name; it ends with Fv, claiming that it only accepts void.

Constructors can sometimes have arguments, which makes them a bit trickier, but we can still try this technique by assuming that they will take no arguments and that the symbol will take the form __ct__[class name]Fv. We will almost certainly miss some, but it'll still hopefully help a bit.
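Put together, each guessed class name yields three symbol names worth hashing. The Python sketch below generates them from a C++-style qualified name. The helper names are hypothetical, and the constructor form is the deliberately optimistic no-argument guess described above.

```python
def encode_qualified(parts) -> str:
    """Length-prefix each part; add a Q<count> prefix for multi-part names."""
    encoded = "".join(f"{len(part)}{part}" for part in parts)
    return encoded if len(parts) == 1 else f"Q{len(parts)}{encoded}"


def class_symbol_candidates(qualified_name: str):
    """Vtable, destructor and no-argument constructor names for one class guess."""
    encoded = encode_qualified(qualified_name.split("::"))
    return [
        f"__vt__{encoded}",    # vtable, present if the class has virtual functions
        f"__dt__{encoded}Fv",  # destructor, mangled as taking void
        f"__ct__{encoded}Fv",  # constructor, assuming it takes no arguments
    ]


if __name__ == "__main__":
    for guess in ("fBase_c", "m3d::mdl_c", "m3d::scnLeaf_c"):
        print(class_symbol_candidates(guess))
```

A hit on any one of the three confirms the class name, which can then be fed back into the class list for method-name guessing.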

So, I tried combining words to see if any matching constructors, destructors or vtables would appear. I successfully found a bunch more. I knew I would need to understand the game's naming scheme better in order to expand my heuristics, though.

Analysing the Developers' Minds

I mentioned that Animal Crossing: City Folk was compiled with runtime type information, which gives me a bunch of class names. Some things immediately become obvious here.

I have a full linker map for Wind Waker and Twilight Princess, so I can draw similar conclusions for those games.

The linker map also lists which .cpp file and which library (if any) a particular symbol came from, which is really interesting for tracing the lineage of these bits of code.

Functions  Filenames                Library       Purpose
d          d_com_inf_game.cpp, etc  None          Game-specific code
f          f_pc_base.cpp, etc       None          Actor/process management
mDo        m_Do_audio.cpp, etc      None          GameCube-specific functionality
mRe        m_Re_controller_pad.cpp  None          Wii-specific functionality
c          c_angle.cpp, etc         SComponent.a  General utility code
s          s_basic.cpp              SStandard.a   Two very simple functions

They also include JSystem (internal Nintendo middleware used in lots of their first-party GameCube games and even some Wii titles), and of course the Dolphin/Revolution SDKs that every game uses.

Finally, I can also glean some info from what I know about New Super Mario Bros. Wii itself. It creates lots of small heaps for memory allocation, and many of these are named by passing a string to the heap creation function. Some of these leak names...

2Dリソース用ヒープ(d2d::ResAccMultLoader_c::create)  [heap for 2D resources]
dBgActorManager_c::m_allocator
dBgTexMng_c::m_allocator
dCaptureMng_c::m_allocator
ダイナミックリンク制御用ヒープ(dDyl::cCc_frmHeap)  [heap for dynamic link control]
daMask_c::m_allocator
dMaskMng_c::m_allocator
dRes_c::info_c::mDataHeap
dSys_c::RootHeapMEM1
dSys_c::RootHeapMEM2
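各プロセスが個別で持てるヒープ(fBase_c::mHeap)  [heap that each process can own individually]
2D表示用ヒープ(m2d::create)  [heap for 2D display]
アニメ切り替え用アロケータ(m3d::banm_c::m_heap)  [allocator for animation switching]
ゲーム用汎用ヒープ1(mHeap::gameHeaps[1])  [general-purpose game heap 1]
ゲーム用汎用ヒープ2(mHeap::gameHeaps[2])  [general-purpose game heap 2]
汎用ファイル読み込み用ヒープ(mHeap::archiveHeap)  [general-purpose file loading heap]
DVD読み込みコマンド用ヒープ(mHeap::commandHeap)  [heap for DVD read commands]
ダイナミックリンク用ヒープ(mHeap::dylinkHeap)  [heap for dynamic linking]
アサートヒープ(mHeap::assertHeap)  [assert heap]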

This development team seems to be very bad at sticking to any consistent naming scheme. All these games contain myriad examples of classes and methods that alternate between naming styles. Some ignore the _c suffix entirely.

Static Initialisers

When you create a complex object as a global variable (either at the top level of a C++ file, within a namespace, or as a static field in a class), that object's constructor needs to run at some point. Usually, compilers will generate a 'static initialiser' function for each translation unit (equivalent to a .cpp file in most cases), and reference it from the .ctors section.

The compiler used in NSMBW names these using the format __sinit_file_name_cpp. Up to now, I've managed to brute force most of these, giving me useful insight into how the game is laid out:

.ctors:802F2480                 .section ".ctors"
.ctors:802F2480 __init_cpp_exceptions_reference:.long __init_cpp_exceptions
.ctors:802F2480                                         # DATA XREF: __init_cpp+14↑o
.ctors:802F2484                 .long __sinit__d_3d_cpp
.ctors:802F2488                 .long __sinit__d_CourseSelectGuide_cpp
.ctors:802F248C                 .long __sinit__d_WarningBattery_cpp
.ctors:802F2490                 .long __sinit__d_WarningErrorInfo_cpp
.ctors:802F2494                 .long __sinit__d_WarningNunchuk_cpp
.ctors:802F2498                 .long __sinit__d_WarningOther_cpp
.ctors:802F249C                 .long __sinit__d_WarningYoKo_cpp
.ctors:802F24A0                 .long __sinit__d_a_boss_demo_cpp
.ctors:802F24A4                 .long __sinit__d_a_bullet_cpp
.ctors:802F24A8                 .long __sinit__d_a_en_bigpile_cpp
.ctors:802F24AC                 .long __sinit__d_a_en_blockmain_cpp
.ctors:802F24B0                 .long __sinit__d_a_en_bros_base_cpp
.ctors:802F24B4                 .long __sinit__d_a_en_carry_cpp
.ctors:802F24B8                 .long __sinit__d_a_en_coin_main_cpp
.ctors:802F24BC                 .long __sinit__d_a_en_dfpakkun_cpp
.ctors:802F24C0                 .long __sinit__d_a_en_door_cpp
.ctors:802F24C4                 .long __sinit__d_a_en_dpakkun_cpp
.ctors:802F24C8                 .long __sinit__d_a_en_dpakkun_base_cpp
.ctors:802F24CC                 .long __sinit__d_a_en_jimen_pakkun_base_cpp
.ctors:802F24D0                 .long __sinit__d_a_en_kuribo_base_cpp
.ctors:802F24D4                 .long __sinit__d_a_en_lkuribo_base_cpp
.ctors:802F24D8                 .long __sinit__d_a_en_net_nokonoko_base_cpp
.ctors:802F24DC                 .long __sinit__d_a_en_obj_coinblock_cpp
.ctors:802F24E0                 .long __sinit__d_a_en_shell_cpp
.ctors:802F24E4                 .long __sinit__d_a_en_super_bigpile_cpp
.ctors:802F24E8                 .long __sinit__d_a_en_togezo_base_cpp
.ctors:802F24EC                 .long __sinit__d_a_fireball_base_cpp
.ctors:802F24F0                 .long __sinit__d_a_lift_down_on_base_cpp
.ctors:802F24F4                 .long __sinit__d_a_move_pipe_cpp
.ctors:802F24F8                 .long __sinit__d_a_net_enemy_cpp
.ctors:802F24FC                 .long __sinit__d_a_player_base_cpp
.ctors:802F2500                 .long __sinit__d_a_player_demo_manager_cpp
.ctors:802F2504                 .long hashname_93e17764_93e17764
.ctors:802F2508                 .long __sinit__d_a_player_manager_cpp
.ctors:802F250C                 .long __sinit__d_a_right_base_cpp
.ctors:802F2510                 .long __sinit__d_a_rot_objs_base_cpp
.ctors:802F2514                 .long __sinit__d_a_spin_child_base_cpp
.ctors:802F2518                 .long __sinit__d_actor_cpp
.ctors:802F251C                 .long __sinit__d_actor_groupid_mng_cpp
.ctors:802F2520                 .long __sinit__d_actor_state_cpp
.ctors:802F2524                 .long __sinit__d_audio_cpp
.ctors:802F2528                 .long __sinit__d_base_actor_cpp
.ctors:802F252C                 .long __sinit__d_bc_cpp
.ctors:802F2530                 .long __sinit__d_bg_cpp
.ctors:802F2534                 .long __sinit__d_bg_actor_mng_cpp
.ctors:802F2538                 .long __sinit__d_bg_unit_cpp
.ctors:802F253C                 .long __sinit__d_capture_mng_cpp
.ctors:802F2540                 .long __sinit__d_cc_cpp
.ctors:802F2544                 .long __sinit__d_center_save_mng_cpp
.ctors:802F2548                 .long __sinit__d_coin_cpp
.ctors:802F254C                 .long __sinit__d_dylink_cpp
.ctors:802F2550                 .long __sinit__d_effactor_mng_cpp
.ctors:802F2554                 .long __sinit__d_effectmanager_cpp
.ctors:802F2558                 .long __sinit__d_enemy_boss_cpp
.ctors:802F255C                 .long __sinit__d_enemy_boss_koopa_jr_base_cpp
.ctors:802F2560                 .long __sinit__d_enemy_carry_cpp
.ctors:802F2564                 .long __sinit__d_enemy_death_cpp
.ctors:802F2568                 .long __sinit__d_enemy_jr_clown_base_cpp
.ctors:802F256C                 .long __sinit__d_enemy_state_cpp
.ctors:802F2570                 .long __sinit__d_enemy_toride_kokoopa_cpp
.ctors:802F2574                 .long __sinit__d_font_manager_cpp
.ctors:802F2578                 .long __sinit__d_fukidashiInfo_cpp
.ctors:802F257C                 .long __sinit__d_game_common_cpp
.ctors:802F2580                 .long hashname_3560d3e1_3560d3e1
.ctors:802F2584                 .long hashname_ac80e780_ac80e780
.ctors:802F2588                 .long __sinit__d_ice_param_cpp
.ctors:802F258C                 .long __sinit__d_iggy_wan_kusari_cpp
.ctors:802F2590                 .long __sinit__d_info_cpp
.ctors:802F2594                 .long hashname_5ee5d3b9_5ee5d3b9
.ctors:802F2598                 .long __sinit__d_line_mng_cpp
.ctors:802F259C                 .long __sinit__d_lytbase_cpp
.ctors:802F25A0                 .long __sinit__d_mask_draw_cpp
.ctors:802F25A4                 .long __sinit__d_mask_mng_cpp
.ctors:802F25A8                 .long __sinit__d_md_actor_cpp
.ctors:802F25AC                 .long __sinit__d_message_cpp
.ctors:802F25B0                 .long __sinit__d_objblock_mng_cpp
.ctors:802F25B4                 .long __sinit__d_pad_cpp
.ctors:802F25B8                 .long __sinit__d_player_model_manager_cpp
.ctors:802F25BC                 .long hashname_1f109cbd_1f109cbd
.ctors:802F25C0                 .long __sinit__d_remocon_mng_cpp
.ctors:802F25C4                 .long __sinit__d_replay_play_cpp
.ctors:802F25C8                 .long __sinit__d_screen_cpp
.ctors:802F25CC                 .long __sinit__d_stage_cpp
.ctors:802F25D0                 .long __sinit__d_system_cpp
.ctors:802F25D4                 .long __sinit__d_tencoin_mng_cpp
.ctors:802F25D8                 .long __sinit__d_wan_kusari_cpp
.ctors:802F25DC                 .long __sinit__d_wm_MapModel_cpp
.ctors:802F25E0                 .long hashname_e6eff101_e6eff101
.ctors:802F25E4                 .long hashname_88d7c583_88d7c583
.ctors:802F25E8                 .long __sinit__d_wm_actor_cpp
.ctors:802F25EC                 .long __sinit__d_wm_connect_cpp
.ctors:802F25F0                 .long __sinit__d_wm_csvdata_cpp
.ctors:802F25F4                 .long __sinit__d_wm_demo_actor_cpp
.ctors:802F25F8                 .long __sinit__d_wm_enemy_cpp
.ctors:802F25FC                 .long __sinit__d_wm_lib_cpp
.ctors:802F2600                 .long __sinit__d_wm_obj_actor_cpp
.ctors:802F2604                 .long __sinit__d_wm_player_base_cpp
.ctors:802F2608                 .long __sinit__d_stage_field_cpp
.ctors:802F260C                 .long hashname_5a32e048_5a32e048
.ctors:802F2610                 .long hashname_4febefa7_4febefa7
.ctors:802F2614                 .long hashname_80b1673c_80b1673c
.ctors:802F2618                 .long __sinit__d_s_stage_static_cpp
.ctors:802F261C                 .long __sinit__d_s_world_map_static_cpp
.ctors:802F2620                 .long __sinit__d_wm_bgm_sync_cpp
.ctors:802F2624                 .long hashname_a31805a9_a31805a9
.ctors:802F2628                 .long hashname_6e8484a6_6e8484a6
.ctors:802F262C                 .long hashname_c00c4da8_c00c4da8
.ctors:802F2630                 .long hashname_669cde8b_669cde8b
.ctors:802F2634                 .long __sinit__d_reset_cpp
.ctors:802F2638                 .long hashname_5db7568e_5db7568e
.ctors:802F263C                 .long __sinit__d_GoalManager_cpp
.ctors:802F2640                 .long __sinit__d_SmallScoreManager_cpp
.ctors:802F2644                 .long __sinit__d_WarningManager_cpp
.ctors:802F2648                 .long __sinit__d_a_cursor_cpp
.ctors:802F264C                 .long __sinit__d_a_en_eatcoin_cpp
.ctors:802F2650                 .long __sinit__d_a_en_hatena_balloon_cpp
.ctors:802F2654                 .long __sinit__d_a_enemy_ice_cpp
.ctors:802F2658                 .long __sinit__d_a_farBG_cpp
.ctors:802F265C                 .long __sinit__d_a_fireball_player_cpp
.ctors:802F2660                 .long __sinit__d_a_ice_cpp
.ctors:802F2664                 .long __sinit__d_a_iceball_cpp
.ctors:802F2668                 .long __sinit__d_a_player_cpp
.ctors:802F266C                 .long __sinit__d_a_yoshi_cpp
.ctors:802F2670                 .long __sinit__d_fukidashiManager_cpp
.ctors:802F2674                 .long __sinit__d_gamedisplay_cpp
.ctors:802F2678                 .long __sinit__d_pausewindow_cpp
.ctors:802F267C                 .long __sinit__d_s_boot_cpp
.ctors:802F2680                 .long __sinit__s_StateID_cpp
.ctors:802F2684                 .long __sinit__c_math_cpp
.ctors:802F2688                 .long __sinit__f_manager_cpp
.ctors:802F268C                 .long __sinit__m_angle_cpp
.ctors:802F2690                 .long __sinit__m_dvd_cpp
.ctors:802F2694                 .long __sinit__m_ef_cpp
.ctors:802F2698                 .long __sinit__m_mtx_cpp
.ctors:802F269C                 .long __sinit__m_pad_cpp
.ctors:802F26A0                 .long __sinit__m_vec_cpp
.ctors:802F26A4                 .long __sinit__lyt_bounding_cpp
.ctors:802F26A8                 .long __sinit__lyt_pane_cpp
.ctors:802F26AC                 .long __sinit__lyt_picture_cpp
.ctors:802F26B0                 .long __sinit__lyt_textBox_cpp
.ctors:802F26B4                 .long __sinit__lyt_window_cpp
.ctors:802F26B8                 .long __sinit__ut_TextWriterBase_cpp
.ctors:802F26BC                 .long __sinit__ut_IOStream_cpp
.ctors:802F26C0                 .long __sinit__ut_FileStream_cpp
.ctors:802F26C4                 .long __sinit__ut_DvdFileStream_cpp
.ctors:802F26C8                 .long __sinit__ut_DvdLockedFileStream_cpp
.ctors:802F26CC                 .long __sinit__ut_NandFileStream_cpp
.ctors:802F26D0                 .long __sinit__ut_LockedCache_cpp
.ctors:802F26D4                 .long __sinit__ut_TextWriterBase_cpp_001
.ctors:802F26D8                 .long __sinit__g3d_state_cpp
.ctors:802F26DC                 .long __sinit__snd_AxManager_cpp
.ctors:802F26E0                 .long __sinit__snd_BasicSound_cpp
.ctors:802F26E4                 .long __sinit__snd_SeqSound_cpp
.ctors:802F26E8                 .long __sinit__snd_Sound3DManager_cpp
.ctors:802F26EC                 .long __sinit__snd_SoundSystem_cpp
.ctors:802F26F0                 .long __sinit__snd_StrmSound_cpp
.ctors:802F26F4                 .long __sinit__snd_WaveSound_cpp
.ctors:802F26F8                 .long __sinit__ef_effectsystem_cpp
.ctors:802F26FC                 .long __sinit__ef_particlemanager_cpp
.ctors:802F2700                 .long __sinit__ef_resource_cpp
.ctors:802F2704                 .long __sinit__ef_emform_cpp
.ctors:802F2708                 .long __sinit__ef_drawstrategyimpl_cpp
.ctors:802F270C                 .long __sinit__lyt_pane_cpp_001
.ctors:802F2710                 .long __sinit__lyt_picture_cpp_001
.ctors:802F2714                 .long __sinit__lyt_textBox_cpp_001
.ctors:802F2718                 .long __sinit__lyt_window_cpp_001
.ctors:802F271C                 .long __sinit__lyt_bounding_cpp_001
.ctors:802F2720                 .long __sinit__eggDisplay_cpp
.ctors:802F2724                 .long __sinit__eggController_cpp
.ctors:802F2728                 .long __sinit__eggMatrix_cpp
.ctors:802F272C                 .long __sinit__eggVector_cpp
.ctors:802F2730                 .long __sinit__eggDrawHelper_cpp
.ctors:802F2734                 .long __sinit__eggDrawGX_cpp
.ctors:802F2738                 .long __sinit__eggFrustum_cpp
.ctors:802F273C                 .long hashname_4adc928a_4adc928a
.ctors:802F2740                 .long __sinit__eggScreen_cpp
.ctors:802F2744                 .long __sinit__eggScreenEffectBase_cpp
.ctors:802F2748                 .long __sinit__eggAudioFxMgr_cpp
.ctors:802F274C                 .long __sinit__eggAudioUtility_cpp
.ctors:802F2750                 .long __sinit__eggEffect_cpp

I had always wondered why dCourseSelectGuide_c and the Warning classes were at the very beginning of the executable. It turns out that they simply capitalised the filenames, so the linker picked them first. Naming conventions are hard.

The labels starting with hashname_... are placeholders where I haven't yet discovered the actual name - the first hash is the mangled one, and the second hash is the demangled one. For functions that aren't mangled at all, the same hash is in both slots.

Unfortunately this is just a subset of the names. If a particular .cpp file doesn't contain anything that needs to be statically constructed, then the compiler won't generate a static initialiser function for it.

This still gives us a fairly good outline of the game and shows just how closely it matches the framework set out by Wind Waker/Twilight Princess.

Hot Tricks

Anyhow, that's probably enough of looking at the game's structure. We've got a big pool of info to look at based on other games with the same heritage. And... as we find out more about the symbols in NSMBW, we can draw on that to expand our knowledge, pick up on more of the subtle naming patterns used by the developers, and hopefully get further towards cracking over 30,000 hashes.

Let's start attacking more of them! We're going to need some shenanigans.

Detecting False Positives

I outgrew the capabilities of my simple Rust brute-forcer and decided to employ hashcat once again. The hashing algorithm is so simple compared to CRC32 and MurmurHash3 that it was trivial to implement in OpenCL as a hashcat kernel, and really fast even on my lowly GTX 1060.

This gave me an amusing issue. Hashcat was able to try out so many possibilities that I had a massive amount of possible results, but they were full of false positives. I know that the hash 4A4C2014 resolves to fBase_c::createChild( unsigned short, fBase_c*, unsigned long, unsigned char ), but the following symbols (and thousands more) all hash to the exact same thing when mangled:

The good thing is that I can actually exclude almost all of the false positives programmatically.

Remember, the ALF symbol map includes two hashes for each symbol: one is the mangled form, and one is the demangled form. If I can demangle candidates and produce the exact same format, then I can hash that and see if they both match.

So, I decided to try and replicate NVIDIA's demangler. A bit of manual testing showed me the basic rules. Argument lists were padded with a space inside each parenthesis, void was included in empty argument lists, and special names like __ct for constructors were simply output without being changed.

I needed a larger data set to be sure, so I went back to the Shield game collection. Twilight Princess is available for it, and I had linker maps for both the original builds of WW and TP. I didn't want to use the TP map since it was already demangled, using a scheme that didn't match NVIDIA's, so I stuck to the symbols that were in both WW and TP.

My test script's logic was simple. I went through every symbol I knew about in NSMBW and WW. I hashed its mangled name and searched the Shield symbol tables for that hash. If I found it, I demangled the name myself and checked my result against the symbol table's demangled hash.

As it turns out, their demangling code is extremely sloppy and actually breaks on quite a few examples.

I managed to implement matching output for all of these except for function pointers, which were figured out almost a year later (thanks, RoadrunnerWMC!).

With a little more work, I had a script which would parse through Hashcat's output and tell me which of the many, many results actually matched up with the corresponding demangled hash.
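The filtering step can be sketched roughly as follows. This is a minimal Python sketch under stated assumptions, not the actual script: `sym_hash` assumes an initial value of 5381 (the text only describes the multiply-by-33-then-XOR step), `demangle_simple` is a toy stand-in that only understands member functions with no arguments, and all function names here are made up for illustration.

```python
import re

MASK32 = 0xFFFFFFFF


def sym_hash(text, seed=5381):
    """Multiply by 33, then XOR in each byte. The seed value is an assumption."""
    h = seed & MASK32
    for byte in text.encode("ascii"):
        h = ((h * 33) & MASK32) ^ byte
    return h


def demangle_simple(mangled):
    """Toy demangler: only understands name__<len><Class>Fv."""
    m = re.fullmatch(r"(\w+?)__(\d+)(\w+)Fv", mangled)
    if m is None or int(m.group(2)) != len(m.group(3)):
        return None
    # Mimic the observed output format: a space inside each parenthesis,
    # and an explicit void for an empty argument list.
    return f"{m.group(3)}::{m.group(1)}( void )"


def filter_candidates(candidates, demangled_hash, demangle=demangle_simple):
    """Keep only candidates whose demangled form hashes to the known value.

    The candidates are assumed to already match the mangled hash, since
    that is what the brute-forcer searched for.
    """
    kept = []
    for mangled in candidates:
        pretty = demangle(mangled)
        if pretty is not None and sym_hash(pretty) == demangled_hash:
            kept.append(mangled)
    return kept
```

A real version would need a demangler that reproduces every quirk of NVIDIA's output, since a single differing character changes the demangled hash.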

A Known-Suffix Attack

When I was working on figuring out the BCSV field header hashes in AC:NH, a really cool breakthrough came thanks to GitHub user zbanks who shared a trick that allowed you to discover pairs of CRC32 hashes that differed only in specific characters. Can I pull off some similar nonsense here?

The hashing algorithm used by Lingcod's symbol table is even simpler than CRC32. For each character, you multiply the hash value by 33 and then XOR the result by the character.
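As a rough illustration, the per-character step could look like this in Python. The starting value is not given in the text, so the default `seed` of 5381 below is purely an assumption, and the name `sym_hash` is made up.

```python
MASK32 = 0xFFFFFFFF


def sym_hash(text, seed=5381):
    """Hash a symbol name: multiply by 33, then XOR in each character.

    The seed is an assumption; only the per-character step is described.
    """
    h = seed & MASK32
    for byte in text.encode("ascii"):
        h = ((h * 33) & MASK32) ^ byte  # truncate to 32 bits, then XOR
    return h
```

Python integers don't overflow, so the explicit mask stands in for the 32-bit truncation that happens naturally in C or OpenCL.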

I thought it would be cool if I could invert this process, by taking a hash and a known final character, and then removing that character. It seemed to me, intuitively, that this wouldn't work - surely you're throwing away information when you multiply the hash by 33 and then truncate the result to 32 bits? I decided to test it anyway, and to my surprise, it actually worked.

Say that hash(H, c) = (H * 33) XOR c. We know hash(H, c), and we presumably know character c, but we don't know H. If you XOR the hash by c again, you get (H * 33), truncated to 32 bits. If the result is divisible by 33, then you've won. If it's not, then you know bits were chopped off, so you can simply add 0x100000000 until you get a value of (H * 33) that is divisible by 33. This never takes more than 32 additions, and the answer is unique: 33 is odd, so multiplying by 33 modulo 2^32 is reversible and no information is actually thrown away.
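Here is a small Python sketch of that inversion. The names `strip_char` and `tail` are mine, and the seed of 5381 in `sym_hash` is an assumption (the inversion itself doesn't depend on the seed at all).

```python
MASK32 = 0xFFFFFFFF


def sym_hash(text, seed=5381):
    """Forward hash. The seed value is an assumption."""
    h = seed & MASK32
    for byte in text.encode("ascii"):
        h = ((h * 33) & MASK32) ^ byte
    return h


def strip_char(h, char_code):
    """Undo one hashing step: recover H from (H * 33) ^ c."""
    product = h ^ char_code        # (H * 33), truncated to 32 bits
    while product % 33 != 0:
        product += 0x100000000     # put back the bits that were chopped off
    return product // 33


def tail(h, suffix):
    """Remove a known suffix from a hash, last character first."""
    for char_code in reversed(suffix.encode("ascii")):
        h = strip_char(h, char_code)
    return h
```

Stripping a suffix from the hash of a full name gives exactly the hash of whatever came before it, which is what makes the attacks below possible.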

This is unbelievably powerful because it means that in certain cases, I can discover parts of connected symbols rather than having to guess the whole thing at once. Consider the following symbols, which I've already found:

I knew that all of these were connected because they're all called from one particular function in a large switch block, depending on the value in a field. There are no parameters other than the implicit this parameter, so the signature is almost certainly ( void ).

I can use my truncation trick (which I've called "tail" in my scripts) to remove ( void ) from the demangled hashes for these functions, or __13daPyDemoMng_cFv from the mangled hashes.

I've written another tool which takes a set of hashes and then tries to tail every word in my word list against each one. If tail(hash1, "Jump") == tail(hash2, "Land"), then it's very possible that these two hashes are connected and that the suffixes are indeed "Jump" and "Land". At that point, I can use the newly obtained hash of their common prefix and apply other techniques to it.
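A minimal sketch of that matching tool might look like the following. It is not the real tool: the function names are invented, the seed of 5381 is an assumption, and the prefix used in the test data is a dummy.

```python
from collections import defaultdict

MASK32 = 0xFFFFFFFF


def sym_hash(text, seed=5381):
    """Forward hash. The seed value is an assumption."""
    h = seed & MASK32
    for byte in text.encode("ascii"):
        h = ((h * 33) & MASK32) ^ byte
    return h


def strip_char(h, char_code):
    """Undo one hashing step."""
    product = h ^ char_code
    while product % 33 != 0:
        product += 0x100000000
    return product // 33


def tail(h, suffix):
    """Remove a known suffix from a hash."""
    for char_code in reversed(suffix.encode("ascii")):
        h = strip_char(h, char_code)
    return h


def find_shared_prefixes(hashes, words):
    """Group (hash, word) pairs by the prefix hash left after tailing.

    Only prefix hashes reached from at least two different input hashes
    are returned, since a lone hit tells us nothing.
    """
    by_prefix = defaultdict(set)
    for h in hashes:
        for word in words:
            by_prefix[tail(h, word)].add((h, word))
    return {
        prefix: hits
        for prefix, hits in by_prefix.items()
        if len({h for h, _ in hits}) >= 2
    }
```

With a large word list, some of the groups will be coincidences, so each shared prefix is only a lead to follow up with the other techniques.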

Guessing Function Arguments

I can get away with throwing a few typical signatures at a lot of functions, but that approach doesn't scale. Consider the following function: set__9dBg_ctr_cFP8dActor_cffffPFP8dActor_cP8dActor_c_vPFP8dActor_cP8dActor_c_vPFP8dActor_cP8dActor_cUc_vUcUcP7mVec3_c

This demangles to: dBg_ctr_c::set( dActor_c*, float, float, float, float, void(*)(dActor_c*, dActor_c*), void(*)(dActor_c*, dActor_c*), void(*)(dActor_c*, dActor_c*, unsigned char), unsigned char, unsigned char, mVec3_c* )

There is no way I could have brute-forced the signature to this 11-argument function (incorporating 3 function pointer types) before the heat death of the universe. Manual analysis is necessary.

Integer and Struct Arguments

You can whittle down the possibilities for a particular type by looking at how it's accessed in the function, and how it's provided to the function by its callers. There are a few obvious patterns that guarantee you an easy win:

Virtual functions can make for easy red herrings in this regard. There are quite a few empty stub functions in classes like dActor_c which accept parameters. The only way to tell here is to track down the locations that call that particular function, and look at subclasses' implementations of that function if possible.

Bonus trick: The known-suffix attack can be used as an oracle to confirm if you've figured out a virtual method's signature correctly.

If you have two hashes for the same method in different classes, then you can try and remove the class name and the signature from each one. If the result is the same hash, then you know you have the right value.
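The oracle can be sketched like this for mangled hashes. Everything named here is illustrative: `mangled_suffix` only covers plain (non-nested, non-template) classes, the seed of 5381 is an assumption, and `stepA`/`stepB` in any usage are made-up method names.

```python
MASK32 = 0xFFFFFFFF


def sym_hash(text, seed=5381):
    """Forward hash. The seed value is an assumption."""
    h = seed & MASK32
    for byte in text.encode("ascii"):
        h = ((h * 33) & MASK32) ^ byte
    return h


def strip_char(h, char_code):
    """Undo one hashing step."""
    product = h ^ char_code
    while product % 33 != 0:
        product += 0x100000000
    return product // 33


def tail(h, suffix):
    """Remove a known suffix from a hash."""
    for char_code in reversed(suffix.encode("ascii")):
        h = strip_char(h, char_code)
    return h


def mangled_suffix(class_name, mangled_args):
    """Build the '__<len><Class>F<args>' part of a mangled member name."""
    return f"__{len(class_name)}{class_name}F{mangled_args}"


def same_method(hash_a, class_a, hash_b, class_b, mangled_args):
    """True if both hashes reduce to the same method-name hash."""
    name_a = tail(hash_a, mangled_suffix(class_a, mangled_args))
    name_b = tail(hash_b, mangled_suffix(class_b, mangled_args))
    return name_a == name_b
```

If `same_method` returns True for a guessed argument string, the guess is very likely right, and the shared value is the hash of the bare method name.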

Floating-point Arguments

Using the PowerPC EABI calling convention, floating-point arguments are passed using the floating-point registers. This muddles up the process further, as there's no way to tell where the floating-point arguments originally sat relative to the integer ones. Consider the following examples:

void func(float x, int a, int b);  // fp1 = x, r3 = a, r4 = b
void func(int a, float x, int b);  // r3 = a, fp1 = x, r4 = b
void func(int a, int b, float x);  // r3 = a, r4 = b, fp1 = x

These three signatures are all different, but you can't tell from the resulting assembly since the arguments will always go into the same locations.
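To make the ambiguity concrete, here is a deliberately simplified Python model of the register assignment. It is only a sketch: it ignores 64-bit integers, structs passed by value and what happens when the registers run out, and the function names are invented.

```python
def assign_registers(arg_types):
    """Assign each argument to the next free register of its class.

    Integer and pointer arguments take r3, r4, ... in order, while
    float/double arguments take fp1, fp2, ... independently of them.
    """
    next_gpr, next_fpr = 3, 1
    assigned = []
    for arg_type in arg_types:
        if arg_type in ("float", "double"):
            assigned.append(f"fp{next_fpr}")
            next_fpr += 1
        else:
            assigned.append(f"r{next_gpr}")
            next_gpr += 1
    return assigned


def register_footprint(arg_types):
    """The set of registers used, which is all the assembly reveals."""
    return frozenset(assign_registers(arg_types))
```

All three orderings from the example produce the same footprint of r3, r4 and fp1, so the position of the float has to be recovered some other way (such as the hash oracle).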

Exploring the Engine

Now that I've gone over a bunch of tricks for bruteforcing symbols, it's probably a good time to explain some other parts of how NSMBW is put together. A lot of these topics were pretty well-understood based on my original research in 2010-2012, but we didn't have official names for many of these aspects until now.

Processes ('Bases')

I've mentioned fBase_c a few times, which is the core class underpinning all actors and various other processes.

A quick primer on the engine's structure, so you know what's going on here. Practically all game logic exists inside of a process. Processes are stored in a tree, and each one has a 'profile' which determines how it behaves (is it a Goomba, or Mario, or the HUD, or the abstract concept of a world map?) and the priority it has relative to other processes.

Scenes

At any given point, you have a root 'scene' process (inheriting from dScene_c) which acts as a controller for that part of the game. It's effectively a singleton - the game won't stop you from creating extras, but it would break things (and not even in a cool way). NSMBW contains the following scenes.

| Profile Name | Class Name | Purpose |
| --- | --- | --- |
| BOOT | dScBoot_c | The game's initialisation sequence (wrist strap warning screen, "hold the Wii Remote sideways", savefile creation and validation) |
| GAME_SETUP | dScGameSetup_c | Controls the 'File Select', 'Number of Players' and controller pairing screens |
| MULTI_PLAY_COURSE_SELECT | dScMulti_c | Allows a level to be selected in the special multiplayer modes |
| RESULT | dScResult_c | Results of a multiplayer mode session |
| WORLD_9_DEMO | dScWorld9DeMo_c | Cutscene that unlocks World 9 |
| AUTO_SELECT | Unknown | Completely empty and unused |
| CRSIN | dScCrsin_c | Level loading sequence ("World 1-1" screen) |
| GAMEOVER | dScGameOver_c | Displays the 'Game Over' and 'Continue?' screens/prompts |
| MOVIE | dScMovie_c | Displays the cutscenes that don't take place in-game (intro, ending) |
| RESTART_CRSIN | Unknown | Enters a level without displaying the typical load screen (used for title screen, Peach's Castle) |
| SELECT | Unknown | Completely empty and unused |
| STAGE | dScStage_c | The classic 2D platforming gameplay everyone knows and loves |
| WORLD_MAP | dScWMap_c | A three-dimensional world map that lets you select a level |

Each scene is responsible for loading all the data and creating all the other processes it needs. The game is able to transition between scenes by storing parameters, asking the current scene to tear itself down, and then creating the new one afterwards.

Actors and Other Processes

Actors all inherit from dBaseActor_c, although typically they will use a more specific subclass (dActor_c in-game, dWmActor_c on the world map, etc.) which allows them to play nicely with scene-specific functionality. This gives them fields and methods for dealing with funky stuff like positioning themselves inside the game world.

There's a smattering of processes that aren't actors or scenes. These all inherit from dBase_c, and are sometimes simply referred to as 'bases'. These are pretty much just user interface elements or singleton managers (such as dBgGm_c which renders the level, and dWarningManager_c which controls messages that can pop up mid-game).

Structure

Each fBase_c object stores a bunch of data.

The last part is the most interesting, but it's also quite difficult to understand. Most of the functions are inlined, so you have no insight into how the objects are laid out in the original types. A partial decompilation of the constructor for fBase_c looks like this:

  this->mUniqueID = fBase_c::m_rootUniqueID;
  this->mParam = fBase_c::m_tmpCtParam;
  this->mProfName = fBase_c::m_tmpCtProfName;
  this->mGroupType = fBase_c::m_tmpCtGroupType;
  cTreeNd_c::cTreeNd_c(&this->mManager.connectNode);
  this->mManager.connectNode.owner = this;
  this->mManager.executeNode.prev = 0;
  this->mManager.executeNode.next = 0;
  this->mManager.executeNode.mOwner = this;
  this->mManager.executeNode.mCurrentPriority = 0;
  this->mManager.executeNode.mNextPriority = 0;
  this->mManager.drawNode.prev = 0;
  this->mManager.drawNode.next = 0;
  this->mManager.drawNode.mOwner = this;
  this->mManager.drawNode.mCurrentPriority = 0;
  this->mManager.drawNode.mNextPriority = 0;
  this->mManager.otherNode.prev = 0;
  this->mManager.otherNode.next = 0;
  this->mManager.otherNode.mOwner = this;
  this->list54.head = 0;
  this->list54.tail = 0;
  if (++fBase_c::m_rootUniqueID == -1) {
    for (;;) { }
  }
  fManager_c::m_connectManage.addTreeNode(&this->mManager.connectNode, fBase_c::m_tmpCtConnectParent);
  tableNum = this->mManager.getSearchTableNum();
  fManager_c::m_searchManage[tableNum].prepend(&this->mManager.otherNode); // assumed function name

I originally assumed that the list and tree nodes were direct members of fBase_c, because of how they were constructed. However, at the end of this snippet, fManager_c::getSearchTableNum(void) is called. This reads this->otherNode.mOwner, which happens to be at offset 0x3C inside the manager object. Therefore I know that the Manager object includes that field, as well as everything that appears before it (the connect, execute and draw nodes).

LoopProc

The game's main loop is mediated by five core operations: Connect, Create, Delete, Execute and Draw.

Fun fact: If you look at the crash debugger in the DS games, it shows the current LoopProc and ProfName for whichever process caused the crash.

There are five global data structures for these:

Every tick, the static fManager_c::mainLoop(void) method is called. This method acts on those lists in that order:

connectProc is responsible for some process management tasks. If fBase_c::deleteRequest() is called on a process, an internal flag is set, and the process is moved to the Delete list during the next Connect loop. Similarly, if a process has been created successfully, it will be moved to the Execute and Draw lists during the next Connect loop.

The four 'Pack' methods are fairly straightforward in theory, but not in practice. fBase_c has three virtual methods for each of these operations:

| LoopProc | Method 1 ("Do") | Method 2 ("Pre") | Method 3 ("Post") |
| --- | --- | --- | --- |
| Create | create(void) | preCreate(void) | postCreate(fBase_c::MAIN_STATE_e) |
| Delete | doDelete(void) | preDelete(void) | postDelete(fBase_c::MAIN_STATE_e) |
| Execute | execute(void) | preExecute(void) | postExecute(fBase_c::MAIN_STATE_e) |
| Draw | draw(void) | preDraw(void) | postDraw(fBase_c::MAIN_STATE_e) |

The names of the 'Post' methods eluded me for ages because none of my brute-forcing managed to hit upon the obligatory fBase_c::MAIN_STATE_e type - I simply did not expect it to be all capitalised. It was finally figured out in August 2021 by RootCubed long after I had given up.

Each of the four 'Pack' methods calls a generic method (whose name I'm missing), passing it a pointer to the corresponding methods.

The generic method first calls the 'Pre' method. If that returns a non-zero value (likely true), it then invokes the 'Do' method. Afterwards, in all cases, it calls the 'Post' method, with an argument telling it what happened.

| Return from preX | Return from doX | Argument to postX |
| --- | --- | --- |
| true | 0 | 3 (waiting for something) |
| true | 1 | 2 (completed successfully) |
| true | any other value | 1 (something went wrong) |
| false | N/A (not executed) | 0 (we cancelled the operation early) |
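The control flow in that table can be modelled in a few lines of Python. This is a sketch of the observed behaviour, not decompiled code: the name `common_pack` and the `PostArg` member names are placeholders I made up, and only the numeric values come from the table.

```python
from enum import IntEnum


class PostArg(IntEnum):
    """Placeholder names; the real enum value names are unknown."""
    CANCELLED = 0
    FAILED = 1
    SUCCEEDED = 2
    WAITING = 3


def common_pack(pre, do, post):
    """Run one Pre/Do/Post cycle and report what happened to post()."""
    if not pre():
        outcome = PostArg.CANCELLED      # 'Do' is never called
    else:
        result = do()
        if result == 0:
            outcome = PostArg.WAITING
        elif result == 1:
            outcome = PostArg.SUCCEEDED
        else:
            outcome = PostArg.FAILED
    post(outcome)                        # always called, in every case
    return outcome
```

In the real engine the three callables would be pointers to member functions of the process, such as preCreate, create and postCreate for the Create operation.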

These two return values are probably separate enum types under fBase_c, in a similar fashion to MAIN_STATE_e. It would take a miracle to figure these out. There is precisely one symbol that will contain them, and it's the aforementioned generic method.

Its signature will look something like this... fBase_c::unknownName( fBase_c::UNKNOWN_ENUM_1_e (fBase_c::*)(void), fBase_c::UNKNOWN_ENUM_2_e (fBase_c::*)(void), void (fBase_c::*)(fBase_c::MAIN_STATE_e) )

Process Lifecycle

Naming aside - what's the intended use case for these? We can get a better idea by looking at how the default implementations of these methods behave.

It looks like Nintendo reserves the 'doX' methods for actor-specific behaviour, while using the 'preX' and 'postX' methods to supply common shared behaviour in base classes.

There's no way we can figure out what the actual names of the enum values are, but we can get a good sense of what they signify from this.

If a 'doX' method returns 0, this seems to mean that the process is waiting for something (like setup/teardown of another object), and this allows a process to remain in the create/delete state across multiple ticks. The engine will continue calling the create/delete method in each tick until something changes.

If a 'doX' method returns 1, that means the operation succeeded - this is the "happy path".

If a 'doX' method returns any other value, that means the operation failed entirely. This is seen in the empty/unused actor EN_GHOST_JUGEM, where create() returns 2 which leads to the actor being deleted immediately.
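Here is a tiny Python model of how a create method's return values play out across ticks, based only on the behaviour described above. The function name `drive_create`, the result strings and the tick limit are all invented for the sketch.

```python
def drive_create(create, max_ticks=60):
    """Call create() once per tick until it stops asking to wait.

    Returns a (status, tick) pair. 0 means "call me again next tick",
    1 means success, and anything else is treated as a failure that
    gets the process deleted.
    """
    for tick in range(1, max_ticks + 1):
        result = create()
        if result == 0:
            continue                  # still waiting on something
        if result == 1:
            return ("created", tick)  # the happy path
        return ("deleted", tick)      # e.g. a create() that returns 2
    return ("still waiting", max_ticks)
```

A process that needs a few ticks to load resources would return 0 each time until it is ready, while something like the EN_GHOST_JUGEM stub bails out on the first call.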

Collision Detection

"It's just a 2D Mario game! How complicated can it be?"

Movement in a 2D platformer looks straightforward to implement, but... it really isn't. Making a game that plays in a satisfying fashion is actually really tricky! I'm not going to go into the fine intricacies of everything involved, but I want to give you an overview of how NSMBW implements this.

There are three totally distinct forms of collision detection in this game. Ouch.

Actor-to-Actor Collisions

Players and enemies are able to detect collisions with each other by using the class dCc_c (sometimes unofficially called ActivePhysics because that's the terrible name 15-year-old me came up with for it). Each instance represents a collidable entity which can be a rectangle, circle or 'daikei' (Japanese for trapezoid).

Each activated instance of dCc_c is stored into a global linked list, and the game checks collisions between them each tick. It's configured using a structure where you can specify the following properties:

After dCc_c has been initialised using this structure, other fields can be modified to configure the 'LineKind' (which is used for actors that are placed on climbable fences and which can be on either side of the fence) and the layer (not actually used by Nintendo in practice).

Category

Category, Attack and the two associated bitfields are the most interesting aspects here, as they allow actors to interact with each other in a somewhat generic fashion.

For a collision between actors A and B to occur, the following condition must be true: ((A.categoryBitfield & (1 << B.category)) != 0) && ((B.categoryBitfield & (1 << A.category)) != 0)

In plain English... the category bitfield determines which kinds of actors an actor wants to hear from. For A and B to collide, A must be interested in B, and B must also be interested in A.
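The mutual-interest rule from the plain-English description (both actors must be interested in each other) can be written as a short Python sketch. The `Collider` class and its field names are stand-ins for the relevant dCc_c fields, not the real layout.

```python
from dataclasses import dataclass


@dataclass
class Collider:
    """Stand-in for the two dCc_c fields involved in the category check."""
    category: int
    category_bitfield: int


def can_collide(a, b):
    """True only if each collider's bitfield includes the other's category."""
    a_wants_b = (a.category_bitfield & (1 << b.category)) != 0
    b_wants_a = (b.category_bitfield & (1 << a.category)) != 0
    return a_wants_b and b_wants_a
```

For example, a category 0 collider whose bitfield includes bit 3 will only collide with a category 3 collider if that collider's bitfield also includes bit 0.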

| Category | Description |
| --- | --- |
| 0 | Player/Yoshi generic |
| 1 | Player/Yoshi attacks |
| 3 | General entities |
| 4 | Balloons |
| 5 | Powerups/collectables |
| 6 | Projectiles |

Attack

Additionally, if an actor is of Category 1, then its attack value is also checked against the opposing actor's attack bitfield.

Entity actors that inherit from dEn_c can all make use of a default HitCallback that looks at the properties in dCc_c and then calls an appropriate virtual method based on the opposing actor's properties. This means that we have official names for most of them, based off the names of these virtual methods!

| Attack | Description |
| --- | --- |
| 1 | Fire (shot by the player) |
| 2 | Ice (shot by the player) |
| 3 | Star (invincibility) |
| 5 | Slip (sliding down slopes) |
| 7 | HipAttk/HipAttack (ground pounding) |
| 8 | WireNet/NetPunch (hitting a climbable fence) |
| 9 | Shell (a Koopa/Buzzy Beetle/Spiny shell kicked by the player) |
| 10 | PenguinSlide/PenguinSlip (sliding as Penguin Mario) |
| 11 | Spin (spin jumping) |
| 13 | SpinFall (drilling as Propeller Mario) |
| 14 | Fire (used for explosions, AFAIK) |
| 15 | YoshiEat |
| 16 | YoshiMouth |
| 17 | Cannon (player shot out of a pipe cannon) |
| 18 | SpinLiftUp |
| 19 | YoshiBullet |
| 20 | YoshiFire |
| 21 | Ice |

There are additional virtual methods that handle collisions for 'Large', 'Rolling' and 'Screw', but these are never called by the game. Perhaps they mapped to the gaps at 4, 6 and 12.

Actor-to-Solid ("BG") Collisions

The second collision plane in NSMBW is used for terrain and solid* objects (which Nintendo calls "BG") - this is far more complicated.

With actor-to-actor collisions, an instance of dCc_c allows your actor to exist in the world and detect other actors. No such luck for BG collisions - creating a BG object and detecting BG objects are entirely distinct concerns.

*I wrote half of this section assuming that these objects were always solids, and then I realised that this isn't always the case... there's some non-solids that are treated this way, such as climbable fences and coins. Oops.

Tiles

NSMBW sticks doggedly to the template set out by Super Mario Bros. in 1985, even though it came out on the Wii in 2009.

Levels are built out of tiles placed on a strict grid. Each tile is 16 by 16... I would normally have said pixels here, but the raw graphics are actually 24x24, and the pixel dimensions on-screen are dependent on the level's configuration, which region and aspect ratio you're playing the game in, the camera movement, and so on.

Each tile (Nintendo calls them "units") has two 32-bit values called type and kind which determine how it behaves - this is where they've moved up from SMB1. There's a staggering number of options so I won't list them all, but here's the general gist of what they can do:

BG Objects

Actors can create BG objects in the game world by creating an instance of the dBg_ctr_c class. This is used by actors such as the question/brick blocks, the moving stone blocks from tower levels and the ice blocks that appear when you freeze an enemy.

Each dBg_ctr_c object can be either rectangular or round, and can be rotated. The game is smart enough to keep track of actors that are standing on one, so that they receive the correct displacement if the object moves/rotates.

They're mainly configured using one field which determines the overall behaviour type (standard, coin, climbable fence, ...) and a bitfield which allows extra flags to be specified (icy surface, forbids sliding, ...) - I've never successfully documented all of them as some of them don't seem to have obvious purposes or effects.

They also have two sets of callbacks (each set includes one function for each movement direction). The first set is called 'CheckRev', and returns a boolean - I'm not 100% clear on what the effects of this are. The second set is called when an actor touches the BG object under certain conditions.

Sensors

dBc_c (likely standing for BG Check) is the class used when an actor needs to detect/sense where it is in the world - this allows it to engage with tiles and with BG objects. It is complicated but also immensely powerful.

It's configured by using structures I call 'sensors' (I don't know the official name for them yet). There are three sensors, but you don't need to configure and use all of them.

Each sensor is either a point or a line (vertical for the Wall sensor, horizontal for the Head/Foot sensor) which is defined using an X/Y offset from the actor's base position. It cannot be rotated.

Each sensor also has a bitfield assigned to it, which contains flags determining how it interacts with various parts of the world. It took me a long time to figure out how these worked, but it was a major "eureka!" moment because it explained so many aspects of the game that you just take for granted.

The best way to explain this is by looking at a few examples from the game itself.

Various methods on dBc_c produce extra result bitfields which actors can query to determine whether a particular sensor has interacted with specific kinds of object. At its most basic level, this allows enemies to turn around when they hit a wall (or reach the edge of a platform), but there's other stuff exposed as well.

Rideable Actors

The final collision plane is only used by a small subset of actors, but it's still important! This actually encompasses a few different kinds of colliders.

Rideable Surfaces (dRide_ctr_c)

Every kind of rideable surface inherits from dRide_ctr_c - and each active instance is stored in a global linked list, just like with dCc_c and dBg_ctr_c.

Although these are only straight lines, actors can employ multiple instances to create more complex shapes.

Ride Actors

Every actor automatically includes an instance of dRc_c (which I believe stands for 'Ride Check'), which allows it to interact with rideable surfaces. It uses various linked lists to keep track of which surface an actor is standing on ('riding'), and automatically updates its position as the actor moves around.

In most cases, you don't need to interact with dRc_c directly as this is done for you when calling methods on dBc_c. So, if you use dBc_c to allow your actor to sense solid objects and avoid falling through the ground/going through walls, then your actor should automatically be able to ride on surfaces as well.

Combining Collision Types

Lots of actors need to employ more than one of these collision types to function correctly. dActor_c includes an instance of dCc_c and dBc_c, but these are disabled by default, and some actors actually create extras to do more complex things.

Once again, I think the best way to explain these is with some examples from the game. These are displayed using the code I wrote for Newer Super Mario Bros. Wii which adds a wireframe overlay over different kinds of collision objects.

If you have any of the Newer mods, you can press + on the world map and select Star Coins, and then press the - button sixteen times. This will enable the debug rendering.

Non-Rectangular Objects

These rotating blocks are built out of four dBg_ctr_c instances, one forming each side. They overlap at the corners, but that's not a problem at all.

Parabeetles

Regular Parabeetles use a single dRide2Point_c, a flat line. Heavy Parabeetles use three instances of it, creating a hump-like shape that roughly matches the shape of the enemy. Both of these also include a rectangular dCc_c which is visible in the middle - this is responsible for hurting the player if they come into contact with the underside of the Parabeetle.

Take a closer look at Mario and the Propeller Block as well. Since both of these actors are able to collide with terrain, they have dBc_c sensors activated - these are the green lines/dots you can see around the Propeller Block and the icy blue lines you can see around Mario.

King Bills

The King Bill is actually one of the most interesting enemies in the game, collision-wise - and my half-assed debugging code doesn't really do it justice for a couple of reasons, but I hope the screenshot still helps.

First off, it uses two dCc_c instances so that it actually hurts the player when touched. Although the screenshot shows two boxes, one of them is actually a circle - this is how the bullet shape is created.

Secondly, the King Bill breaks blocks it passes through, as shown there. They accomplish this by creating 15 separate dBc_c instances, each of which has a sensor assigned depending on the direction that the King Bill is travelling in. The sensor's flags are configured to destroy any blocks they touch.

By carefully positioning the sensors to follow the curve, they can create the illusion that the circular end of the King Bill is destroying blocks even though the sensors are actually just straight lines. Since the blocks are on a grid anyway, it's not obvious that they're faking it.

The extra sensors aren't visible in the debug layer, because there wasn't an easy way for me to iterate over them in the debugging code. There's no linked list of dBc_c instances as there is for most of the other collision-related classes. Therefore, I can only display info for actors that use the default instance that exists in dActor_c.

Mysteries

There's a few aspects of this game's internals that I've devoted specific sections to, because I wanted to cover them in depth for specific reasons - some have been fully solved, while others have awkward bits and pieces missing.

What The Hell Is fBaseID_e?

I mentioned that the only NSMBW-specific symbol leaked by Nvidia's emulator was searchNodeByID__9fLiMgBa_cCF9fBaseID_e, or fLiMgBa_c::searchNodeByID( fBaseID_e ) const. What can I learn about this?

I was really confused by fBaseID_e, because _e sounded to me like it would be an enumeration. The ID in question is essentially a serial number for each fBase_c instance, and is used to allow actors to reference other actors without having to hold a pointer that may be invalidated (e.g. Mario referencing the block he is carrying). After 0xFFFFFFFF processes have been constructed, the ID overflows and the game goes into a deliberate infinite loop inside the fBase_c constructor.

(In hindsight, it was a pretty lucky break that Nvidia gave us that particular symbol by HLEing it in their emulator - there is no way I would have discovered it myself!)

Anyhow, why would the developers use an enum for a type that has no meaningful fixed values? My guess is that it's for type-checking.
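Here's a small sketch of what that type-checking buys you, and of how a serial-number ID avoids dangling pointers. This is modern C++ written for illustration, not recovered code: the fixed underlying type on the enum, the `Process` struct and the `ProcessList` class are all my own assumptions, and only the name fBaseID_e comes from the game.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <type_traits>

// Assumed shape: an enum with no enumerators, used purely as a distinct type.
// (The ": std::uint32_t" part is a C++11 convenience for this sketch.)
enum fBaseID_e : std::uint32_t {};

// A raw integer will not silently convert into an ID...
static_assert(!std::is_convertible<std::uint32_t, fBaseID_e>::value,
              "integers need an explicit cast to become IDs");
// ...but an ID can still be read back as a plain number.
static_assert(std::is_convertible<fBaseID_e, std::uint32_t>::value,
              "IDs decay to integers");

// Hypothetical stand-in for an fBase_c-like object.
struct Process {
    fBaseID_e id{};
};

// Hypothetical registry: hands out serial numbers and looks objects up by ID.
class ProcessList {
    std::map<std::uint32_t, Process *> live;
    std::uint32_t nextSerial = 1;

public:
    fBaseID_e add(Process *p) {
        p->id = static_cast<fBaseID_e>(nextSerial++);
        live[static_cast<std::uint32_t>(p->id)] = p;
        return p->id;
    }

    void remove(fBaseID_e id) { live.erase(static_cast<std::uint32_t>(id)); }

    // Returns nullptr once the object is gone, instead of a stale pointer.
    Process *searchByID(fBaseID_e id) const {
        auto it = live.find(static_cast<std::uint32_t>(id));
        return it == live.end() ? nullptr : it->second;
    }
};
```

In this sketch, whoever is carrying a block would hold on to the block's ID rather than its address, and would simply get nullptr back from `searchByID` after the block has been removed.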

It can't be a typedef, because those are not reflected in mangled symbols - if it was a typedef of unsigned long, then the symbol would just contain Ul rather than 9fBaseID_e. It can't be a structure, because if it was, this compiler would not pass it by value in a single register.
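The typedef argument is easy to check by building the mangled name by hand. The sketch below only covers the simplest case (a method on a non-nested, non-template class), and the helper names `lenPrefixed` and `mangleMethod` are my own inventions rather than anything from a real toolchain.

```cpp
#include <cassert>
#include <string>

// Class and enum names are written with their length in front,
// e.g. "fBaseID_e" becomes "9fBaseID_e".
std::string lenPrefixed(const std::string &name) {
    return std::to_string(name.size()) + name;
}

// Builds a CodeWarrior-style name of the form: method__<class>[C]F<args>
// 'C' marks a const method; mangledArgs must already be in mangled form.
std::string mangleMethod(const std::string &method, const std::string &cls,
                         bool isConst, const std::string &mangledArgs) {
    return method + "__" + lenPrefixed(cls) + (isConst ? "C" : "") + "F" + mangledArgs;
}
```

Passing `lenPrefixed("fBaseID_e")` as the argument list reproduces the leaked symbol exactly, whereas passing "Ul" (what a typedef of unsigned long would collapse to) gives searchNodeByID__9fLiMgBa_cCFUl, which is a different string and would therefore hash differently.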

State Machines

The state machine system in this game has been one of the biggest challenges in the symbol quest so far. It is heavily reliant on C++ templates. I've managed to locate the following classes so far...

Interface        Description
sStateIDChkIf_c  validator for state IDs (never used for anything meaningful in this game, from what I can see)
sStateIDIf_c     state ID interface
sStateIf_c       calls state methods on the class that owns the state machine
sStateFctIf_c    factories that create an implementation of sStateIf_c
sStateMgrIf_c    state manager

Class                       Description
sStateMethod_c              abstract state machine
sStateMethodUsr_FI_c        a subclass of sStateMethod_c that implements some of its pure virtuals
sStateIDChk_c               an implementation of sStateIDChkIf_c that just accepts all state IDs
sStateID_c                  a base class for state IDs which includes no actions
sFStateID_c<T>              an implementation of sStateIDIf_c which can call methods on an instance of class T
sFStateVirtualID_c<T>       a variant of sFStateID_c<T> for states that can have their implementations overridden in a subclass of T
sFState_c<T>                an implementation of sStateIf_c for class T
sFStateFct_c<T>             an implementation of sStateFctIf_c for class T
sFStateMgr_c<T,M>           an implementation of sStateMgrIf_c for class T and state method class M
sFStateStateMgr_c<T,M1,M2>  a class which holds an sFStateMgr_c<T,M1> and an sFStateMgr_c<T,M2>, multiplexing between them
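If you've not seen this pattern before, the core idea is just "a state ID is an object holding pointers to the member functions that implement that state". Here's a heavily simplified sketch of it. `MiniStateID`, `MiniStateMgr` and `ToyActor` are invented names, the field layout is a guess, and the real system splits this work across the interfaces and factories listed above.

```cpp
#include <cassert>
#include <string>

// Simplified stand-in for sFStateID_c<T>: a named state plus pointers to the
// methods on T that implement it. The field layout is an assumption.
template <class T>
struct MiniStateID {
    const char *name;
    void (T::*initialize)();
    void (T::*execute)();
    void (T::*finalize)();
};

// Simplified stand-in for a state manager: remembers the current state ID
// and calls the matching methods on the owner.
template <class T>
class MiniStateMgr {
    T *owner;
    const MiniStateID<T> *current;

public:
    MiniStateMgr(T *owner, const MiniStateID<T> &initial)
        : owner(owner), current(&initial) { }

    void initializeState() { (owner->*(current->initialize))(); }
    void executeState() { (owner->*(current->execute))(); }
    void finalizeState() { (owner->*(current->finalize))(); }

    // Leave the old state, then enter the new one.
    void changeState(const MiniStateID<T> &next) {
        finalizeState();
        current = &next;
        initializeState();
    }

    const MiniStateID<T> *getStateID() const { return current; }
};

// Hypothetical actor with two states; it logs each call so the order is visible.
struct ToyActor {
    std::string log;

    void initializeState_Walk() { log += "[init Walk]"; }
    void executeState_Walk() { log += "[exec Walk]"; }
    void finalizeState_Walk() { log += "[fin Walk]"; }
    void initializeState_Jump() { log += "[init Jump]"; }
    void executeState_Jump() { log += "[exec Jump]"; }
    void finalizeState_Jump() { log += "[fin Jump]"; }

    static const MiniStateID<ToyActor> StateID_Walk;
    static const MiniStateID<ToyActor> StateID_Jump;
};

const MiniStateID<ToyActor> ToyActor::StateID_Walk = {
    "ToyActor::StateID_Walk", &ToyActor::initializeState_Walk,
    &ToyActor::executeState_Walk, &ToyActor::finalizeState_Walk};
const MiniStateID<ToyActor> ToyActor::StateID_Jump = {
    "ToyActor::StateID_Jump", &ToyActor::initializeState_Jump,
    &ToyActor::executeState_Jump, &ToyActor::finalizeState_Jump};
```

The plaintext name stored alongside each state is the same sort of thing that gave us class names in NSMBW in the first place.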

I've figured out the names and signatures of every single method. The only thing I'm missing in the entire system is the direct base class of each sFStateMgr_c<...> and sFStateStateMgr_c<...> instantiation. I know the underlying code will look something like this...

template <class T, class M> // implied, but seemingly incorrect
class UnknownMysteryClass : public sStateMgrIf_c {
  sStateIDChk_c chk;
  sFStateFct_c<T> fct;
  M method;

public:
  UnknownMysteryClass(T *owner, const sStateIDIf_c &initialState)
    : fct(owner)
    , method(chk, fct, initialState)
  { }

  virtual ~UnknownMysteryClass() { }
  // every virtual below just forwards to the state method object
  virtual void initializeState() { method.initializeStateMethod(); }
  virtual void executeState() { method.executeStateMethod(); }
  virtual void finalizeState() { method.finalizeStateMethod(); }
  virtual void changeState(const sStateIDIf_c &id) { method.changeStateMethod(id); }
  virtual void refreshState() { method.refreshStateMethod(); }
  virtual sStateIf_c *getState() const { return method.getState(); }
  virtual sStateIDIf_c *getNewStateID() const { return method.getNewStateID(); }
  virtual sStateIDIf_c *getStateID() const { return method.getStateID(); }
  virtual sStateIDIf_c *getOldStateID() const { return method.getOldStateID(); }
};

template <class T, class M>
class sFStateMgr_c : public UnknownMysteryClass<T,M> {
public:
  sFStateMgr_c(T *owner, const sStateIDIf_c &initialState)
    : UnknownMysteryClass<T,M>(owner, initialState)
  { }
  virtual ~sFStateMgr_c() { }
};

However, I still can't figure out the name of the mystery class that sits in between. All of these constructors (with the exception of M's, which is always sStateMethodUsr_FI_c) are inlined, which makes it a bit tricky to infer the exact structure of the classes. I do know that the mystery class has to contain fct and method, since it defines virtual methods that reference those fields.

A very similar structure holds for sFStateStateMgr_c<T,M1,M2>. There's a base class with a name I can't crack which defines all of the virtual methods and the instances of M1 and M2.

I was able to figure out sFStateMgr_c and sFStateStateMgr_c by using the known-suffix attack on two different instantiations of them - I wrote a script that would put together different permutations of classes I expected to see after T, and try each one until it found a match.
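The candidate generation part of that script boils down to something like the sketch below. It's a reconstruction of the idea rather than the script itself: `withLength` and `findTemplateName` are hypothetical names, the `matches` callback stands in for "hash the candidate and compare it against the unknown symbol", and I'm assuming template instantiations are mangled as a length-prefixed `Name<arg,arg>` string.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Length-prefixed name, e.g. "daYoshi_c" becomes "9daYoshi_c".
std::string withLength(const std::string &name) {
    return std::to_string(name.size()) + name;
}

// Tries every ordering of `pool`, using the first `count` entries as the
// template arguments that follow the owner class T. Returns the first
// candidate accepted by `matches`, or an empty string if none fits.
// (Orderings that share the same first `count` entries are retried; that is
// wasteful but harmless for a pool this small.)
std::string findTemplateName(const std::string &tmpl, const std::string &owner,
                             std::vector<std::string> pool, std::size_t count,
                             const std::function<bool(const std::string &)> &matches) {
    std::sort(pool.begin(), pool.end()); // next_permutation needs a sorted start
    do {
        std::string args = withLength(owner);
        for (std::size_t i = 0; i < count && i < pool.size(); i++)
            args += "," + withLength(pool[i]);
        std::string candidate = withLength(tmpl + "<" + args + ">");
        if (matches(candidate))
            return candidate;
    } while (std::next_permutation(pool.begin(), pool.end()));
    return "";
}
```

With a pool of known state classes and `count` set to 1, this would try sFStateMgr_c<T,X> for every X in the pool; raising `count` widens the search to longer argument lists at the cost of many more candidates.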

The same trick hasn't worked for the base classes, however. Since the mystery class includes instances of sStateIDChk_c and sFStateFct_c<T>, I hypothesised that these could possibly also be template arguments on it. That doesn't seem to be the case, though; I've tried permutations of all the different state classes I currently know about, as well as primitive types, with no luck.

Bespoke Struct and Enum Types

I've run into a bunch of structures and enumerations specific to parts of the game logic. These are rather frustrating because they add extra variance to the guessing process, but I've still managed to derive a few function names involving them, such as:

Figuring this out required me to make informed guesses (what namespace could this be defined under...?).

Data structures are worse, as they seem to treat the naming conventions as a mere suggestion. I have found all of the following examples:

I've made pretty good progress on these, but there are still a bunch of unknowns that are really bothering me, like the sensor structures used by dBc_c and the configuration structure used by dCc_c.

The c Module

One thing that's pretty cool is that, as I highlighted in the section about static initialisers, the original game's build tools will link specific .cpp files into the output in alphabetical order. This means you can often guess at what letter a particular module's naming is going to start with, based on where it occurs in the executable.
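As a tiny sketch of that reasoning: if two neighbouring modules are already identified, any guess for the one in between has to sort between them. I'm assuming a plain byte-wise comparison of file names here, which may not be exactly what the build tools did, and `fitsBetween` is a made-up helper.

```cpp
#include <cassert>
#include <string>

// True if `candidate` could sit between two known neighbours in a list that
// is sorted byte-wise by file name (an assumption about the sort order).
bool fitsBetween(const std::string &before, const std::string &candidate,
                 const std::string &after) {
    return before < candidate && candidate < after;
}
```

For example, with hypothetical neighbours c_lib.cpp and c_m3d.cpp, a guess of c_list.cpp passes this check, while c_tree.cpp does not.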

c_math and c_lib

Consider this block from the linker map in Wind Waker, which lists the functions imported from c_math.cpp:

  00240964 0005ec 80246044  1 .text     SComponent.a c_math.cpp
  00240964 000058 80246044  4 cM_rad2s__Ff  SComponent.a c_math.cpp
  002409bc 000034 8024609c  4 U_GetAtanTable__Fff   SComponent.a c_math.cpp
  002409f0 0001a0 802460d0  4 cM_atan2s__Fff    SComponent.a c_math.cpp
  00240b90 000048 80246270  4 cM_atan2f__Fff    SComponent.a c_math.cpp
  00240bd8 000010 802462b8  4 cM_initRnd__Fiii  SComponent.a c_math.cpp
  00240be8 0000e8 802462c8  4 cM_rnd__Fv    SComponent.a c_math.cpp
  00240cd0 000038 802463b0  4 cM_rndF__Ff   SComponent.a c_math.cpp
  00240d08 000048 802463e8  4 cM_rndFX__Ff  SComponent.a c_math.cpp
  00240d50 000010 80246430  4 cM_initRnd2__Fiii     SComponent.a c_math.cpp

U_GetAtanTable(float, float) exists in NSMBW as well, with the same name. I noticed that the functions surrounding it were suspiciously similar. It turns out that they'd simply done some refactoring and moved these into either a namespace called cM, or made them static methods of a class called cM - the CodeWarrior mangling scheme doesn't distinguish, so we can't know for sure.
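For reference, the symbols in that map are simple enough to decode by hand: everything before `__F` is the function name, and each letter after it is one argument type. Here's a minimal sketch that handles only this case (free functions taking float, int or void); `demangleSimple` is my own helper and gives up on anything more complicated.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Decodes names like "cM_rad2s__Ff" into "cM_rad2s(float)".
// Returns the input unchanged if it is not in this simple form.
std::string demangleSimple(const std::string &sym) {
    std::size_t sep = sym.rfind("__F");
    if (sep == std::string::npos || sep == 0)
        return sym;

    std::string out = sym.substr(0, sep) + "(";
    for (std::size_t i = sep + 3; i < sym.size(); i++) {
        const char *type;
        switch (sym[i]) {
            case 'f': type = "float"; break;
            case 'i': type = "int"; break;
            case 'v': type = "void"; break;
            default: return sym; // pointers, classes etc. are out of scope
        }
        if (i > sep + 3)
            out += ", ";
        out += type;
    }
    return out + ")";
}
```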

The same thing has occurred with some of the cLib_ utility functions, which have been comfortably relocated into a namespace or class called cLib - with the only exception being templated/type-specific functions which Nintendo have unified into templated functions under sLib.

Zelda WW/TP                                                 NSMB Wii
cLib_addCalcPos(cXyz*, const cXyz&, float, float, float)    cLib::addCalcPos(mVec3_c*, const mVec3_c&, float, float, float)
cLib_chasePos(cXyz*, const cXyz&, float)                    cLib::chasePos(mVec3_c*, const mVec3_c&, float)
cLib_targetAngleX(const cXyz*, const cXyz*)                 cLib::targetAngleX(const mVec3_c&, const mVec3_c)
cM_rad2s(float)                                             cM::rad2s(float)
cM_rnd(void)                                                cM::rnd(void)
T cLib_calcTimer<T>(T*)                                     T sLib::calcTimer<T>(T*)
cLib_addCalc(float*, float, float, float, float)            sLib::addCalc(float*, float, float, float, float)
cLib_chaseUC(unsigned char*, unsigned char, unsigned char)  int sLib::chaseT<T>(T*, T, T)
cLib_chaseS(short*, short, short)                           int sLib::chaseT<T>(T*, T, T)
cLib_chaseF(float*, float, float)                           int sLib::chaseT<T>(T*, T, T)

Lists and Trees

In spite of this, there's still a few aspects that are eluding me... WW/TP have a linked list implementation in c_list.cpp, and a tree implementation in c_tree.cpp and c_node.cpp, with broadly similar APIs.

  0024a15c 000014 80250a7c  4 cLs_Init(node_list_class *)   SComponent.a c_list.o
  0024a170 000084 80250a90  4 cLs_SingleCut(node_class *)   SComponent.a c_list.o
  0024a1f4 000074 80250b14  4 cLs_Addition(node_list_class *, node_class *)     SComponent.a c_list.o
  0024a268 000084 80250b88  4 cLs_Insert(node_list_class *, int, node_class *)  SComponent.a c_list.o
  0024a2ec 000048 80250c0c  4 cLs_GetFirst(node_list_class *)   SComponent.a c_list.o
  0024a334 000004 80250c54  4 cLs_Create(node_list_class *)     SComponent.a c_list.o
  0024a338 00001c 80250c58  4 cLsIt_Method(node_list_class *, int (*)(node_class *, void *), void *)    SComponent.a c_list_iter.o
  0024a354 00001c 80250c74  4 cLsIt_Judge(node_list_class *, void *(*)(node_class *, void *), void *)   SComponent.a c_list_iter.o
  0024a370 000030 80250c90  4 cNd_LengthOf(node_class *)    SComponent.a c_node.o
  0024a3a0 000030 80250cc0  4 cNd_First(node_class *)   SComponent.a c_node.o
  0024a3d0 000030 80250cf0  4 cNd_Last(node_class *)    SComponent.a c_node.o
  0024a400 000050 80250d20  4 cNd_Order(node_class *, int)  SComponent.a c_node.o
  0024a450 000034 80250d70  4 cNd_SingleCut(node_class *)   SComponent.a c_node.o
  0024a484 000020 80250da4  4 cNd_Cut(node_class *)     SComponent.a c_node.o
  0024a4a4 000034 80250dc4  4 cNd_Addition(node_class *, node_class *)  SComponent.a c_node.o
  0024a4d8 000078 80250df8  4 cNd_Insert(node_class *, node_class *)    SComponent.a c_node.o
  0024a550 000028 80250e70  4 cNd_SetObject(node_class *, void *)   SComponent.a c_node.o
  0024a578 000008 80250e98  4 cNd_ClearObject(node_class *)     SComponent.a c_node.o
  0024a580 000014 80250ea0  4 cNd_ForcedClear(node_class *)     SComponent.a c_node.o
  0024a594 000014 80250eb4  4 cNd_Create(node_class *, void *)  SComponent.a c_node.o
  0024a5a8 000090 80250ec8  4 cNdIt_Method(node_class *, int (*)(node_class *, void *), void *)     SComponent.a c_node_iter.o
  0024a638 00008c 80250f58  4 cNdIt_Judge(node_class *, void *(*)(node_class *, void *), void *)    SComponent.a c_node_iter.o
  0024a6c4 000004 80250fe4  4 cTr_SingleCut(node_class *)   SComponent.a c_tree.o
  0024a6c8 00002c 80250fe8  4 cTr_Addition(node_lists_tree_class *, int, node_class *)  SComponent.a c_tree.o
  0024a6f4 00002c 80251014  4 cTr_Insert(node_lists_tree_class *, int, node_class *, int)   SComponent.a c_tree.o
  0024a720 000058 80251040  4 cTr_Create(node_lists_tree_class *, node_list_class *, int)   SComponent.a c_tree.o
  0024a778 000074 80251098  4 cTrIt_Method(node_lists_tree_class *, int (*)(node_class *, void *), void *)  SComponent.a c_tree_iter.o
  0024a7ec 000070 8025110c  4 cTrIt_Judge(node_lists_tree_class *, void *(*)(node_class *, void *), void *)     SComponent.a c_tree_iter.o

In NSMBW, these are rather different, and harder to understand because of how many functions have been inlined by the compiler. I've successfully located the cTreeNd_c (tree node) and cTreeMg_c classes, which appear to serve the roles played by cNd/node_class and cTr/node_lists_tree_class in the GC Zeldas:

cTreeNd_c::cTreeNd_c(void)
cTreeNd_c::forcedClear(void)
cTreeMg_c::addTreeNode(cTreeNd_c*, cTreeNd_c*)
cTreeMg_c::removeTreeNode(cTreeNd_c*)
cTreeNd_c::getTreeNext(void) const
cTreeNd_c::getTreeNextNotChild(void) const

There are four linked list functions in NSMBW, located directly after cLib and directly before cM3d (3D math), which suggests that they're probably still in c_list.cpp. These take the form of:

I hoped that similar rules would apply here, and that they would have turned these into classes like cListMg_c and cListNd_c. That doesn't seem to be the case, though.

What Next?

I worked on this pretty heavily for a few months around the end of 2020 and the beginning of 2021, but then put it aside to focus on other projects. I started writing this blog post in March 2021, left off halfway through, and have written the rest of it in February 2022. Isn't ADHD fun?!

Shoutout to RoadrunnerWMC and RootCubed for discovering some of the things that absolutely drove me nuts here, like the broken demangling for function pointers and the real name of fBase_c::MAIN_STATE_e. I actually had to remove a paragraph I'd written about how it was impossible to find out what MAIN_STATE_e was, because... now we actually have it.

But wait, there's more - I want to go on another small tangent before I cap off this post.

The Post-Wii Era

I've written at length here about the other games that use variations on the same engine. The NSMB, Animal Crossing and Zelda teams are all tied to it. So what happened after that?

There's not a whole lot to go off in the newer games, but there's scraps of information we can pull at.

On the Wii U side, we've got two leads. The mainline Zelda series brings us Breath of the Wild, which I can't really say much about. To my knowledge, it's vastly different from any of the other games Nintendo has built - it doesn't even use LunchPack. And there's New Super Mario Bros. U of course, which is obviously derived from NSMBW, but I've not really investigated it in depth.

On the 3DS side, there's Animal Crossing: New Leaf and New Super Mario Bros. 2. I've only looked into these minimally, but there's a bunch of interesting strings in these games.

The byzantine naming scheme like dWhatever_c has seemingly been abandoned. Both games have shifted to naming all profiles in a far more consistent way, with prefixes 'Ac', 'Bs' and 'Sc' for actors, bases and scenes respectively.

However, the thing I found most interesting is that there's a particular save management class which exists in both City Folk (Wii) and New Leaf (3DS), where we have names from the former via RTTI and from the latter via debug strings. These are straight up identical.

City Folk                            New Leaf
dSvMgr_c::stepLoad_c                 BsSvMgr::stepLoad
dSvMgr_c::stepSaveConnectNetHst_c    BsSvMgr::stepSaveConnectNetHst
dSvMgr_c::stepSaveConnectNetVst_c    BsSvMgr::stepSaveConnectNetVst
dSvMgr_c::stepSaveContinueNetHst_c   BsSvMgr::stepSaveContinueNetHst
dSvMgr_c::stepSaveContinueNetVst_c   BsSvMgr::stepSaveContinueNetVst
dSvMgr_c::stepSaveInterruptNetHst_c  BsSvMgr::stepSaveInterruptNetHst
dSvMgr_c::stepSaveInterruptNetVst_c  BsSvMgr::stepSaveInterruptNetVst
dSvMgr_c::stepSaveNormal_c           BsSvMgr::stepSaveNormal
What's more, NSMB2 includes a bunch of asserts that give us the original paths to files in the source repository.

d:\home\bigred\Project\CTR\BIGRED\src\dNetECommerceListup.cpp
d:\home\bigred\Project\CTR\BIGRED\src\dNetECommerceMount.cpp
d:\home\bigred\Project\CTR\BIGRED\src\dNetECommercePurchase.cpp
d:\home\bigred\Project\CTR\BIGRED\src\dNetGameCntMgr.cpp
d:\home\bigred\Project\CTR\BIGRED\src\dNetGameMgr.cpp
d:\home\bigred\Project\CTR\BIGRED\src\dNetGameRcvMgr.cpp
d:\home\bigred\Project\CTR\BIGRED\src\dNetGameSndMgr.cpp

There's that pesky d prefix on files again...

Between these similarities and the continued use of the actor/base/scene terminology, I'm reasonably convinced that ACNL and NSMB2 both use what is the next 'generation' in this engine lineage.

After that, it basically cuts off. We've not had a new NSMB since NSMB2/NSMBU. New Horizons switches to LunchPack 2, which seems to be Nintendo EAD's go-to engine nowadays. Guess it's time to pay a dFarewell_c to this weird part of Nintendo's development history which I have spent far too much time investigating... :p


I hope you enjoyed this absurdly long exploration into NSMBW and various other Nintendo EAD games. I originally intended for this post to just be about the Hashcat symbol table nonsense, but it just kept on growing. There's still more I could talk about, like their middleware (JSystem, NintendoWare, EGG, sead) and the DRM in the Shield release, but I've got to draw the line somewhere!

