BCSV (File format)

The content described on this page is 100% documented.

BCSV stands for Binary Comma Separated Values and is the most common data format in both Super Mario Galaxy games. Some older GameCube titles, such as Luigi's Mansion and Donkey Kong Jungle Beat, use it as well. As the name suggests, BCSV is a binary variant of comma-separated values (CSV): the data is laid out as a table of entries (rows) and fields (columns). Column names are stored as hashes for faster access. The data is FlatBuffers-like in that it is loaded directly into memory and read in place, so it does not have to be deserialized first.

The game supports reading data as signed and unsigned integers (8, 16 and 32 bits), single-precision floats and strings. Each string is null-terminated and encoded in Shift-JIS (code page 932).

All BCSV files are padded to the next 32-byte boundary with '@' (0x40). There is no consistent file extension for BCSV data; the games contain BCSV, BANMT, BCAM, PA and TBL files that all use this format. BCSV files that use the TBL extension are expected to be sorted in ascending order by a specific field.
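The padding rule can be illustrated with a small helper. This is only a sketch: the function name `pad_bcsv` is hypothetical, and it assumes that a file whose length is already a multiple of 32 receives no extra padding, which the description above does not state explicitly.

```python
PAD_BYTE = b"@"   # 0x40, the padding character named above
ALIGNMENT = 32    # files are padded to a 32-byte boundary


def pad_bcsv(data: bytes) -> bytes:
    """Append '@' bytes until the length is a multiple of 32.

    Assumption: already aligned data is returned unchanged.
    """
    remainder = len(data) % ALIGNMENT
    if remainder == 0:
        return data
    return data + PAD_BYTE * (ALIGNMENT - remainder)
```

A tool that writes BCSV files would presumably call such a helper once, after the string pool has been appended.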

Header

Each BCSV file starts with a header:

Offset	Type	Description
0x00	u32	Entry count
0x04	u32	Field count
0x08	u32	Offset to the entry data section
0x0C	u32	The size of each entry in bytes
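As a rough illustration, the header table above maps onto a single `struct` format string. The names `BcsvHeader` and `read_header` are hypothetical, and big-endian byte order is an assumption based on the GameCube and Wii hardware; this page does not specify endianness.

```python
import struct
from typing import NamedTuple


class BcsvHeader(NamedTuple):
    entry_count: int   # 0x00
    field_count: int   # 0x04
    data_offset: int   # 0x08, offset to the entry data section
    entry_size: int    # 0x0C, size of each entry in bytes


# Assumption: big-endian, four unsigned 32-bit values.
HEADER_FORMAT = ">4I"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 16 bytes


def read_header(buf: bytes) -> BcsvHeader:
    """Unpack the 16-byte header found at the start of the buffer."""
    return BcsvHeader(*struct.unpack_from(HEADER_FORMAT, buf, 0))
```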

Fields Section

Right after the header comes the list of fields. The structure of a single field is as follows:

Offset	Type	Description
0x00	u32	Name hash
0x04	u32	Bitmask
0x08	u16	Offset to the data under this field in an individual entry
0x0A	u8	Data shift amount
0x0B	u8	The type of data that this field uses
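The field structure above is 12 bytes long, so the field list can be read as a sequence of fixed-size records starting right after the 16-byte header. The sketch below uses the hypothetical names `BcsvField` and `read_fields` and again assumes big-endian byte order.

```python
import struct
from typing import List, NamedTuple


class BcsvField(NamedTuple):
    name_hash: int  # 0x00
    mask: int       # 0x04, bitmask
    offset: int     # 0x08, offset of the value inside an entry
    shift: int      # 0x0A, data shift amount
    type_id: int    # 0x0B, data type ID


HEADER_SIZE = 16
# Assumption: big-endian; u32, u32, u16, u8, u8 with no padding.
FIELD_FORMAT = ">IIHBB"
FIELD_SIZE = struct.calcsize(FIELD_FORMAT)  # 12 bytes


def read_fields(buf: bytes, field_count: int) -> List[BcsvField]:
    """Read field_count field records located directly after the header."""
    return [
        BcsvField(*struct.unpack_from(FIELD_FORMAT, buf, HEADER_SIZE + i * FIELD_SIZE))
        for i in range(field_count)
    ]
```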

Data types

Fields may cover one of the following data types:

Name	ID	Size (in bytes)	Description
LONG	0x00	4	32-bit integer. Signedness is not specified. ANDed with the bitmask and shifted right by the field's shift amount.
STRING	0x01	32	Embedded string. Deprecated. Use STRING_OFFSET instead.
FLOAT	0x02	4	Single-precision floating-point value.
LONG_2	0x03	4	32-bit integer. Signedness is not specified. ANDed with the bitmask and shifted right by the field's shift amount.
SHORT	0x04	2	16-bit integer. Signedness is not specified. ANDed with the bitmask and shifted right by the field's shift amount.
CHAR	0x05	1	8-bit integer. Signedness is not specified. ANDed with the bitmask and shifted right by the field's shift amount.
STRING_OFFSET	0x06	4	32-bit offset into string table.
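The mask-and-shift rule for the integer types can be sketched as follows. The helper `read_int` is hypothetical; it assumes big-endian storage and treats the raw value as unsigned, since the table does not specify signedness.

```python
import struct

# Assumption: big-endian, unsigned reads for the integer type IDs above.
INT_FORMATS = {
    0x00: ">I",  # LONG
    0x03: ">I",  # LONG_2
    0x04: ">H",  # SHORT
    0x05: ">B",  # CHAR
}


def read_int(entry: bytes, offset: int, mask: int, shift: int, type_id: int) -> int:
    """Read an integer field: AND with the bitmask, then shift right."""
    (raw,) = struct.unpack_from(INT_FORMATS[type_id], entry, offset)
    return (raw & mask) >> shift


# Two hypothetical fields that read different bits of the same 32-bit word.
entry = struct.pack(">I", 0x00012345)
low = read_int(entry, 0, 0x0000FFFF, 0, 0x00)    # 0x2345
high = read_int(entry, 0, 0xFFFF0000, 16, 0x00)  # 0x0001
```

With a mask of 0xFFFFFFFF and a shift of 0, the value is returned unchanged.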

Field Order & Entry Size

For efficiency and because of hardware limitations, the field offsets and the total entry size are calculated from a special ordering of the fields by data type. This only affects the order of the data within an entry, not the order of the fields in the fields section. When saving, a tool should calculate the field offsets and the total entry size according to this order: STRING < FLOAT < LONG < LONG_2 < SHORT < CHAR < STRING_OFFSET. A sample implementation that shows how to calculate these properly can be found in pyjmap on GitHub.
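One possible reading of this ordering rule is sketched below. It is not taken from pyjmap; the function `layout_fields` is hypothetical and assumes that fields are packed back to back in the stated type order, that no two fields share storage through their bitmasks, and that the entry size is rounded up to a multiple of four bytes.

```python
# Sizes in bytes, taken from the data type table.
TYPE_SIZES = {0x00: 4, 0x01: 32, 0x02: 4, 0x03: 4, 0x04: 2, 0x05: 1, 0x06: 4}

# STRING < FLOAT < LONG < LONG_2 < SHORT < CHAR < STRING_OFFSET
LAYOUT_ORDER = [0x01, 0x02, 0x00, 0x03, 0x04, 0x05, 0x06]
LAYOUT_RANK = {type_id: rank for rank, type_id in enumerate(LAYOUT_ORDER)}


def layout_fields(type_ids):
    """Return (offsets, entry_size) for fields given in fields-section order.

    offsets[i] is the entry offset of the field with type type_ids[i].
    Assumption: tight packing, entry size rounded up to 4 bytes.
    """
    # sorted() is stable, so fields of the same type keep their relative order.
    order = sorted(range(len(type_ids)), key=lambda i: LAYOUT_RANK[type_ids[i]])
    offsets = [0] * len(type_ids)
    cursor = 0
    for i in order:
        offsets[i] = cursor
        cursor += TYPE_SIZES[type_ids[i]]
    entry_size = (cursor + 3) & ~3
    return offsets, entry_size


# Fields declared as CHAR, LONG, FLOAT, SHORT.
offsets, entry_size = layout_fields([0x05, 0x00, 0x02, 0x04])
# offsets == [10, 4, 0, 8], entry_size == 12
```

The fields keep their declared order in the fields section; only the offsets stored in each field record change.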

Data Section

The data section contains the individual data entries. The layout of each entry is specified by the BCSV's fields. Each entry is aligned to four bytes.

String Pool

Right after the data section comes the string pool, which contains all strings used within the BCSV.
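Since the pool directly follows the entries, its position can be derived from the header values. The sketch below uses hypothetical helper names and assumes that STRING_OFFSET values are relative to the start of the pool; strings are decoded with Python's `cp932` codec, matching the encoding named in the introduction.

```python
def string_pool_offset(data_offset: int, entry_count: int, entry_size: int) -> int:
    """The pool starts right after the last entry of the data section."""
    return data_offset + entry_count * entry_size


def read_pool_string(buf: bytes, pool_offset: int, string_offset: int) -> str:
    """Read a null-terminated Shift-JIS (code page 932) string.

    Assumption: string_offset is relative to the start of the pool.
    """
    start = pool_offset + string_offset
    end = buf.index(b"\x00", start)  # raises ValueError if no terminator exists
    return buf[start:end].decode("cp932")
```

Searching for the first zero byte is safe here because a zero byte never appears inside a multi-byte Shift-JIS character.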
