MSBT (File Format)
Latest revision as of 16:05, 7 March 2024
MSBT stands for Message Binary Text. It contains all of the text used in Super Mario Galaxy 2.
Header
Offset | Type | Description |
---|---|---|
0x00 | String | MsgStdBn in ASCII. |
0x08 | UInt16 | Endianness. 0xFEFF for Big Endian, 0xFFFE for Little Endian. |
0x0A | UInt32 | Version. |
0x0E | UInt16 | Number of sections. |
0x10 | UInt16 | Padding. |
0x12 | UInt32 | File length. |
0x16 | Byte[0xA] | Padding. |
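The header table above can be read with a few `struct` calls. The following is a minimal sketch, not a reference implementation: the names `MsbtHeader` and `parse_msbt_header` are made up for illustration, and it assumes the endianness values in the table are what you get when reading the field as big endian (so the raw bytes `FE FF` mean a big endian file).

```python
import struct
from typing import NamedTuple


class MsbtHeader(NamedTuple):
    """Hypothetical container for the header fields listed in the table."""
    magic: bytes
    big_endian: bool
    version: int
    section_count: int
    file_length: int


def parse_msbt_header(data: bytes) -> MsbtHeader:
    """Decode the 0x20-byte MSBT header using the offsets from the table."""
    if len(data) < 0x20:
        raise ValueError("MSBT header needs at least 0x20 bytes")
    magic = data[0x00:0x08]
    if magic != b"MsgStdBn":
        raise ValueError("not an MSBT file")
    # Assumption: the raw bytes of the endianness field tell us how to read
    # every other multi-byte field in the file.
    bom = data[0x08:0x0A]
    if bom == b"\xFE\xFF":
        order = ">"
    elif bom == b"\xFF\xFE":
        order = "<"
    else:
        raise ValueError("unknown endianness marker")
    # Version (0x0A), section count (0x0E), padding (0x10), file length (0x12).
    version, section_count, _pad, file_length = struct.unpack_from(
        order + "IHHI", data, 0x0A
    )
    return MsbtHeader(magic, order == ">", version, section_count, file_length)
```

The explicit `>` or `<` prefix disables `struct`'s native alignment, which matters here because the file length sits at the unaligned offset 0x12.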
Sections
The sections follow the header. Each section is padded to the next 0x10-byte boundary with the value 0xAB.
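As a rough illustration of how the padding affects parsing, the sketch below walks the sections of a file. It assumes the 0x20-byte file header from the table above and the 0x10-byte section header layout (4-byte name, UInt32 size, 8 bytes of padding) that this page lists for each section. The names `align_up` and `iter_sections` are hypothetical.

```python
import struct
from typing import Iterator, Tuple

HEADER_SIZE = 0x20          # size of the file header
SECTION_HEADER_SIZE = 0x10  # name (4) + size (4) + padding (8)


def align_up(offset: int, alignment: int = 0x10) -> int:
    """Round offset up to the next multiple of alignment (a power of two)."""
    return (offset + alignment - 1) & ~(alignment - 1)


def iter_sections(data: bytes, section_count: int,
                  order: str = ">") -> Iterator[Tuple[str, bytes]]:
    """Yield (name, body) for each section, skipping the 0xAB padding."""
    pos = HEADER_SIZE
    for _ in range(section_count):
        name, size = struct.unpack_from(order + "4sI", data, pos)
        start = pos + SECTION_HEADER_SIZE
        # The size field does not include the section header or the padding.
        yield name.decode("ascii"), data[start:start + size]
        pos = align_up(start + size)
```

For example, a section body that ends at offset 0x35 is followed by eleven 0xAB bytes, so the next section header starts at 0x40.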
LBL1
LBL1 contains the "labels" that are assigned to text. It is organized as a hash table: the index at which a label is stored is determined by hashing the label name and reducing the result modulo the number of entries in the list.
Header
Offset | Type | Description |
---|---|---|
0x00 | String | LBL1 in ASCII. |
0x04 | UInt32 | Section size. Does not account for this header or padding. |
0x08 | Byte[0x8] | Padding. |
Label Entries
After the header come the label entries.
Offset | Type | Description |
---|---|---|
0x00 | UInt32 | Label count. |
0x04 | LabelArray[N] | Labels, where N is the label count read above. |
A label is defined by the following structure:
Offset | Type | Description |
---|---|---|
0x00 | UInt32 | Number of strings associated with this entry. |
0x04 | UInt32 | Offset to the strings associated with this label, relative to the label count field. N strings are read from there, where N is the count read above. |
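The two tables above can be read as follows. This is only a sketch: `parse_label_entries` is a hypothetical name, `section` is assumed to be the LBL1 body starting at the label count field, and the strings themselves are not decoded because their layout is not described here.

```python
import struct
from typing import List, Tuple


def parse_label_entries(section: bytes, order: str = ">") -> List[Tuple[int, int]]:
    """Return (string_count, offset) for every entry in the label array."""
    (label_count,) = struct.unpack_from(order + "I", section, 0x00)
    entries = []
    for i in range(label_count):
        # Each entry is two UInt32 values, starting right after the count.
        string_count, offset = struct.unpack_from(
            order + "II", section, 0x04 + i * 8
        )
        entries.append((string_count, offset))
    return entries
```

Since the offsets are relative to the label count field, `section[offset:]` would be the start of the strings for that entry.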
Label Indices
The following code can determine the index of a specific label:
    def hash_label(label, bucket_count):
        val = 0
        for c in label:
            val = ((val * 0x492) + ord(c)) & 0xFFFFFFFF
        return val % bucket_count
For example, in a file with 0x65 label entries, the label Pichan001_Flow001 is at index 0 of the label list, PichanRacer000_Flow002 is at index 1, and so on. If there is a gap between two indices, it is filled with the previous entry's offset until the next index is determined.
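To make the bucket idea more concrete, the sketch below groups a list of labels by their index and reports how many strings each index would hold. `hash_label` is repeated so the block runs on its own. `group_labels` and `bucket_sizes` are hypothetical helpers, and the offsets used to fill gaps are not modelled.

```python
from collections import defaultdict
from typing import Dict, List


def hash_label(label: str, bucket_count: int) -> int:
    val = 0
    for c in label:
        val = ((val * 0x492) + ord(c)) & 0xFFFFFFFF
    return val % bucket_count


def group_labels(labels: List[str], bucket_count: int) -> Dict[int, List[str]]:
    """Map each index of the label list to the labels stored there."""
    buckets: Dict[int, List[str]] = defaultdict(list)
    for label in labels:
        buckets[hash_label(label, bucket_count)].append(label)
    return dict(buckets)


def bucket_sizes(labels: List[str], bucket_count: int) -> List[int]:
    """Number of strings per index; 0 marks a gap in the list."""
    groups = group_labels(labels, bucket_count)
    return [len(groups.get(i, [])) for i in range(bucket_count)]
```

With 0x65 entries, the single-character label `A` lands at index 65, because its hash is just its character code.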
ATR1
Header
Offset | Type | Description |
---|---|---|
0x00 | String | ATR1 in ASCII. |
0x04 | UInt32 | Section size. Does not account for this header or padding. |
0x08 | Byte[0x8] | Padding. |
Attribute Entries
Offset | Type | Description |
---|---|---|
0x00 | UInt32 | Attribute set count. |
0x04 | UInt32 | Attribute size. The size of each attribute in bytes. |
0x08 | AttributeSetArray[N] | Attribute sets, where N is the set count read above. |
The attribute structure is dependent on the game, but in Super Mario Galaxy 2 each attribute is of size 0x0C and has the following structure:
Offset | Type | Description |
---|---|---|
0x00 | Byte | NPC Sound ID. TODO: Add a list of NPC Sounds here? |
0x01 | Byte | Camera type. |
0x02 | Byte | Trigger type. |
0x03 | Byte | Dialog style. |
0x04 | UInt16 | Camera ID. This is the ID that you find inside the CameraParam.bcam file. Needs to be higher than 0. |
0x06 | Byte | MessageArea ID. This matches with MessageArea's Obj Arg 0. |
0x07 | Byte | |
0x08 | UInt32 | Offset to a null-terminated, UTF-16 encoded string. Seems to be unused. |
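A single 0x0C-byte attribute with the layout above could be unpacked like this. The class and field names (`Smg2Attribute`, `unknown_07`, and so on) are invented for the example, and the byte at 0x07 is kept as an opaque value because its meaning is not documented here.

```python
import struct
from typing import NamedTuple

ATTRIBUTE_SIZE = 0x0C


class Smg2Attribute(NamedTuple):
    """Hypothetical view of one Super Mario Galaxy 2 attribute."""
    sound_id: int
    camera_type: int
    trigger_type: int
    dialog_style: int
    camera_id: int
    message_area_id: int
    unknown_07: int
    string_offset: int


def parse_attribute(raw: bytes, order: str = ">") -> Smg2Attribute:
    """Unpack four bytes, a UInt16, two bytes and a UInt32 (0x0C bytes)."""
    if len(raw) < ATTRIBUTE_SIZE:
        raise ValueError("attribute needs at least 0x0C bytes")
    return Smg2Attribute(*struct.unpack_from(order + "4BH2BI", raw))
```

Other games may use a different attribute size, so the size read from the section should be checked against `ATTRIBUTE_SIZE` before using this layout.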
TXT2
Header
Offset | Type | Description |
---|---|---|
0x00 | String | TXT2 in ASCII. |
0x04 | UInt32 | Section size. Does not account for this header or padding. |
0x08 | Byte[0x8] | Padding. |
Text Entries
Offset | Type | Description |
---|---|---|
0x00 | UInt32 | Text count. |
0x04 | Text[N] | Texts, where N is the text count read above. |
A text entry is a UTF-16 string with embedded "tags" that affect the way the text displays and functions. A tag begins with the value 0x000E, followed by the type of the tag. The following tag types are possible:
Type | Description |
---|---|
0x0 | System Group |
0x1 | Display Group |
0x2 | Sound Group |
0x3 | Picture Group |
0x4 | Font Size Group |
0x5 | Localize Group |
0x6 | Number Group |
0x7 | String Group |
0x9 | Race Time Group |
0xA | Font Group |
A "tag" can have a "command", which tells it how to function.
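The sketch below shows one way to separate plain text from tags. It relies on an assumption that this page does not confirm for every group: after the 0x000E marker and the group value, each tag carries a UInt16 type and a UInt16 data size in bytes, followed by that many bytes of data (the layout shown for the System and Display groups). `split_text` and `GROUP_NAMES` are hypothetical names.

```python
import struct
from typing import List, Tuple, Union

TAG_MARKER = 0x000E
GROUP_NAMES = {
    0x0: "System", 0x1: "Display", 0x2: "Sound", 0x3: "Picture",
    0x4: "Font Size", 0x5: "Localize", 0x6: "Number", 0x7: "String",
    0x9: "Race Time", 0xA: "Font",
}

# A token is either a run of text or a (group name, type, data) tuple.
Token = Union[str, Tuple[str, int, bytes]]


def split_text(raw: bytes, order: str = ">") -> List[Token]:
    codec = "utf-16-be" if order == ">" else "utf-16-le"
    count = len(raw) // 2
    units = struct.unpack_from(f"{order}{count}H", raw)
    tokens: List[Token] = []
    i = run_start = 0
    while i < count:
        if units[i] != TAG_MARKER:
            i += 1
            continue
        if i > run_start:
            tokens.append(raw[run_start * 2:i * 2].decode(codec))
        if i + 4 > count:
            raise ValueError("truncated tag")
        # Assumed layout: marker, group, type, data size, then the data.
        group, tag_type, size = units[i + 1:i + 4]
        data_start = (i + 4) * 2
        data = raw[data_start:data_start + size]
        tokens.append((GROUP_NAMES.get(group, hex(group)), tag_type, data))
        i += 4 + (size + 1) // 2  # skip the data so it is not read as text
        run_start = i
    if count > run_start:
        tokens.append(raw[run_start * 2:count * 2].decode(codec))
    return tokens
```

Skipping the data by its size matters because a value inside a tag can itself be 0x000E, which would otherwise be mistaken for the start of another tag.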
System Group
Offset | Type | Description |
---|---|---|
0x00 | UInt16 | Type. [0 == Japanese, 3 == Color] |
0x02 | UInt16 | Data size. |
0x04 | UInt16 | The color to use. |
Display Group
Offset | Type | Description |
---|---|---|
0x00 | UInt16 | Display type. |
0x02 | UInt16 | Data size. |
0x04 | UInt16 | Padding. |
0x06 | UInt16 | Number of frames to wait if the type is 0. |
Sound Group
Plays a given sound.
Offset | Type | Description |
---|---|---|
0x00 | UInt16 | Should always be 2. |
0x02 | UInt16 | |
0x04 | UInt16 | String length. |
0x06 | String[Length] | UTF-16 string containing the sound effect to play. |
Picture Group
Offset | Type | Description |
---|---|---|
0x00 | UInt16 | Character index. |
0x02 | UInt16 | Font. |
0x04 | UInt16 | Character ID. |
Font Size Group
Offset | Type | Description |
---|---|---|
0x00 | UInt16 | Font Size. |
Localize Group
Inserts the current player's name.
Offset | Type | Description |
---|---|---|
0x00 | Byte[0x6] |