Chunk

From Crash Bandicoot Hacking Wiki

Chunks (a.k.a. pages) are the highest-level data containers in NSF files. Each chunk is 64 KB (0x10000 bytes) in size, matching the 64 KB pages allocated to hold them.

Format

Uncompressed chunks

Non-texture chunks

Offset                Field              Size           Value
0x0                   Magic Number       2 bytes        0x1234
0x2                   Type               2 bytes        != 1
0x4                   ID                 4 bytes        *
0x8                   Entry Count        4 bytes        c
0xC                   Checksum           4 bytes        *
0x10                  Entry Offsets      4 x c bytes    *
0x10 + (4 x c)        End of Last Entry  4 bytes        *
0x10 + (4 x (c + 1))  Entries            * x c bytes    *
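
As a minimal sketch of how the layout above might be read, the following C helper locates one entry inside a non-texture chunk. It rests on several assumptions that this page does not state: fields are little-endian, entry offsets are measured from the start of the chunk, and entries are stored back to back so that entry i ends where entry i + 1 begins. The names rd16, rd32 and chunk_entry_span are hypothetical and do not come from the game's code.

```c
#include <stdint.h>

#define CHUNK_SIZE 0x10000u

/* Little-endian reads; the byte order is an assumption. */
static uint16_t rd16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: find the start and size of entry i in a
 * non-texture chunk. Returns 0 on success, -1 if the chunk is not an
 * uncompressed non-texture chunk or its offset table looks malformed. */
int chunk_entry_span(const unsigned char *chunk, uint32_t i,
                     uint32_t *start, uint32_t *size)
{
    if (rd16(chunk) != 0x1234 || rd16(chunk + 0x2) == 1)
        return -1;

    uint32_t count = rd32(chunk + 0x8);
    /* The offset table holds count + 1 slots and must fit in the chunk. */
    if (count > (CHUNK_SIZE - 0x10) / 4 - 1 || i >= count)
        return -1;

    uint32_t table_end = 0x10 + 4 * (count + 1);
    uint32_t begin = rd32(chunk + 0x10 + 4 * i);
    /* Slot i + 1 is either the next entry's offset or, for the last
     * entry, the "End of last Entry" field. */
    uint32_t end = rd32(chunk + 0x10 + 4 * (i + 1));
    if (begin < table_end || begin > end || end > CHUNK_SIZE)
        return -1;

    *start = begin;
    *size = end - begin;
    return 0;
}
```

Calling chunk_entry_span(chunk, 0, &start, &size) on a 65536-byte buffer would, under these assumptions, yield the byte range of the first entry. The bounds checks are a defensive choice of this sketch, not documented behavior of the game.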

Texture chunks

Offset  Field                 Size                  Value
0x0     Magic Number          2 bytes               0x1234
0x2     Type                  2 bytes               1
0x4     Texture Page EID      4 bytes               *
0x8     Tex. Page Entry Type  4 bytes               5
0xC     Checksum              4 bytes               *
0x10    Texture Data          0xFFF0 (65520) bytes  *

Compressed chunks

Offset         Field              Size                 Value
0x0            Magic Number       2 bytes              0x1235
0x2            Unused             2 bytes              *
0x4            Length             4 bytes              l
0x8            Skip               4 bytes              s
0xC            Compressed Data    n bytes              *
0xC + n        Skipped Data       s bytes              *
0xC + n + s    Uncompressed Data  (0x10000 - l) bytes  *

Structure

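// Layout pseudocode, not compilable C: array lengths depend on fields.
// unsigned short is 2 bytes and unsigned long is 4 bytes here,
// matching the field sizes in the tables above.
struct chunk
{
  const unsigned short magic = 0x1234;
  unsigned short type;
  union
  {
    struct // normal (non-texture) chunks
    {
      unsigned long CID;
      unsigned long entrycount;
      unsigned long checksum;
      union
      {
        // entrycount offsets plus the end-of-last-entry offset
        unsigned long entryoffset[entrycount + 1];
        entry *entryptr[entrycount + 1];
      };
      entry entries[entrycount]; // entries vary in size
    };
    struct // texture chunks
    {
      unsigned long EID;
      unsigned long entrytype = 5;
      unsigned long checksum;
      unsigned char data[0xFFF0];
    };
  };
};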

struct chunk_compressed
{
  const unsigned short magic = 0x1235;
  const unsigned short unused = 0;
  unsigned long length;
  unsigned long skip;
  unsigned char compressed[];
  unsigned char skipped[skip];
  unsigned char uncompressed[0x10000-length];
};

Types

Type  Name
0     Normal Chunk
1     Texture Chunk
2     Prototype Sound Chunk
3     Sound Chunk
4     Wavebank Chunk
5     Speech Chunk
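
For tools that inspect chunks, the type table above could be mirrored in C roughly as follows. The enumerator names and the chunk_type_name helper are hypothetical, chosen for this sketch only, and the "Unknown" fallback for values outside the table is likewise an assumption.

```c
#include <stdint.h>

/* Chunk types as listed in the table above; the enumerator names
 * are illustrative. */
enum chunk_type {
    CHUNK_NORMAL      = 0,
    CHUNK_TEXTURE     = 1,
    CHUNK_PROTO_SOUND = 2,
    CHUNK_SOUND       = 3,
    CHUNK_WAVEBANK    = 4,
    CHUNK_SPEECH      = 5
};

/* Hypothetical helper: map the 16-bit type field to its table name. */
const char *chunk_type_name(uint16_t type)
{
    static const char *const names[] = {
        "Normal Chunk",
        "Texture Chunk",
        "Prototype Sound Chunk",
        "Sound Chunk",
        "Wavebank Chunk",
        "Speech Chunk"
    };
    if (type < sizeof names / sizeof names[0])
        return names[type];
    return "Unknown"; /* value not listed in the table */
}
```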

Non-texture chunks

Non-texture chunks are containers for entries, the intermediate-level data containers in NSF files.

Texture chunks

Texture chunks contain palette and texture data used by game models.

Chunk IDs/CIDs

Each chunk has a unique ID, assigned according to its index in the NSF file. The formula for a chunk ID, or CID, is '(index << 1) - 1' when chunk indices start at 1, which is equivalent to '(index << 1) | 1' when chunks are zero-indexed. Thus a chunk's zero-based index is held in the upper 31 bits of its CID. The least significant bit (LSB) of a CID is an indicator bit, or tag, and is always 1.

A CID is typically stored in a tagged union with its replacement entry pointer as the key in an entry hash table slot. The replacement pointer locates an entry contained in the chunk with that CID.
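
The CID arithmetic can be sketched in C as follows. The function names are hypothetical. The slot_holds_cid check additionally assumes that an entry pointer sharing the slot always has its LSB clear (for example because it is at least 2-byte aligned), which this page implies but does not state.

```c
#include <stdint.h>

/* CID for a zero-based chunk index: (index << 1) | 1. */
static inline uint32_t cid_from_index(uint32_t index0)
{
    return (index0 << 1) | 1u;
}

/* Recover the zero-based chunk index from the upper 31 bits of a CID. */
static inline uint32_t index_from_cid(uint32_t cid)
{
    return cid >> 1;
}

/* Tag test for a tagged-union slot: LSB set means the slot holds a CID.
 * Assumes pointers stored in the same slot have their LSB clear. */
static inline int slot_holds_cid(uint32_t slot)
{
    return (slot & 1u) != 0;
}
```

For example, the sixth chunk in a file (zero-based index 5) has CID (5 << 1) | 1 = 11, which matches the one-based form (6 << 1) - 1 = 11.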

Texture chunks do not have a CID field with a precomputed value. Instead, the texture paging routines compute the CIDs for texture chunks themselves.

Checksums

Each chunk includes a 32-bit checksum which can be used to verify the integrity of the chunk data. At least some PAL versions of the games enforce these checksums.

The checksum is computed as follows:

  1. Start with the value 0x12345678.
  2. For each byte of the 64 KB chunk, in order:
    • Add that byte to the current value. For the bytes from the checksum field itself (bytes 12 through 15), consider their value to be zero.
    • Circularly left-shift the current value by three bits.
  3. The final value is the correct checksum.

This procedure is also described by this C function:

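#include <stdint.h>

/* Compute the checksum of one 64 KB (65536-byte) chunk. */
uint32_t nsfChecksum(const unsigned char *data)
{
    uint32_t checksum = 0x12345678;
    for (int i = 0; i < 65536; i++) {
        /* Bytes 12..15 hold the checksum field and count as zero. */
        if (i < 12 || i >= 16)
            checksum += data[i];
        /* Rotate left by three bits. */
        checksum = checksum << 3 | checksum >> 29;
    }
    return checksum;
}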
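
A tool that edits chunk data would typically need to verify or refresh the stored checksum. The sketch below shows one way to do that. It repeats the computation described above under the hypothetical name chunk_checksum so that it compiles on its own, and it assumes the checksum field at offset 0xC is stored little-endian, which this page does not state explicitly. The helper names chunk_checksum_ok and chunk_checksum_update are likewise illustrative.

```c
#include <stdint.h>

#define CHUNK_SIZE 0x10000

/* Same procedure as described above: start at 0x12345678, add each byte
 * (treating bytes 12..15 as zero), rotating left by 3 after every byte. */
static uint32_t chunk_checksum(const unsigned char *data)
{
    uint32_t sum = 0x12345678;
    for (int i = 0; i < CHUNK_SIZE; i++) {
        if (i < 12 || i >= 16)
            sum += data[i];
        sum = sum << 3 | sum >> 29;
    }
    return sum;
}

/* Returns nonzero if the stored checksum matches the computed one.
 * Assumes the field is little-endian. */
int chunk_checksum_ok(const unsigned char *chunk)
{
    uint32_t stored = (uint32_t)chunk[12] | ((uint32_t)chunk[13] << 8) |
                      ((uint32_t)chunk[14] << 16) | ((uint32_t)chunk[15] << 24);
    return stored == chunk_checksum(chunk);
}

/* Recompute the checksum and write it back into the chunk
 * (little-endian, by the same assumption). */
void chunk_checksum_update(unsigned char *chunk)
{
    uint32_t sum = chunk_checksum(chunk);
    chunk[12] = (unsigned char)(sum & 0xFF);
    chunk[13] = (unsigned char)((sum >> 8) & 0xFF);
    chunk[14] = (unsigned char)((sum >> 16) & 0xFF);
    chunk[15] = (unsigned char)((sum >> 24) & 0xFF);
}
```

After modifying any byte outside the checksum field, calling chunk_checksum_update on the 65536-byte buffer makes chunk_checksum_ok return nonzero again. Because the checksum bytes are treated as zero during the computation, the update can be done in place.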