Classic Game Resource Reader simplifies parsing game resource files

Project description

Classic Game Resource Reader

cgrr holds utility functions used by other modules for parsing game resource files.

Package contents

At present, cgrr.py provides three things:

verify, a simple function to verify that certain files exist in a path
File, a namedtuple to be used with verify
FileReader, a class used for reading files into dictionaries

verify

Pass this function a list of files (instances of the File namedtuple) and a path and it will verify that those files exist in that path. It is intended to be used to verify that a certain program resides in the given path, e.g. by checking that the program's main executable is in the expected place.

identifying_files = [
    File("ARCHERY.EXE", 31616,  "d8fae202edcc48d51a72026cbfbe7fa8"),
]
path = "path/to/archery"
verify(identifying_files, path)

The call to verify above returns True if and only if the file path/to/archery/ARCHERY.EXE exists, is 31,616 bytes long, and has the MD5 hash d8fae202edcc48d51a72026cbfbe7fa8. If identifying_files contains multiple File namedtuples, all of the files described in the list must be present.
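The three checks described above (existence, size, MD5 hash) can be sketched with the standard library alone. This is not cgrr's actual implementation: verify_sketch is a hypothetical name, and the File namedtuple below is a local stand-in that only mirrors the documented field names.

```python
import hashlib
import os
from collections import namedtuple

# Local stand-in for cgrr's File namedtuple (documented fields: path, size, md5).
File = namedtuple("File", ["path", "size", "md5"])


def verify_sketch(identifying_files, base_path):
    """Return True only if every listed file exists under base_path
    with the expected size in bytes and the expected MD5 hex digest."""
    for expected in identifying_files:
        full_path = os.path.join(base_path, expected.path)
        if not os.path.isfile(full_path):
            return False
        if os.path.getsize(full_path) != expected.size:
            return False
        with open(full_path, "rb") as handle:
            digest = hashlib.md5(handle.read()).hexdigest()
        if digest != expected.md5:
            return False
    return True
```

Checking the size before hashing is a design choice in this sketch: it rejects most mismatches without reading the whole file.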

File

File is simply a namedtuple representing a file. The fields of the namedtuple are path, size, and md5.

To create a new File:

example = File("path/to/example.tle", 12345, "0123456789abcdef0123456789abcdef")

The path should be given relative to some base path (e.g. the main path to the program to be identified by that file) which will be passed to verify separately.

size is the file size in bytes.

md5 is the md5 hash of the file.
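When writing a list of identifying files, the size and md5 values have to be measured from a known-good copy of the program. A minimal sketch of that step follows; describe_file is a hypothetical helper, not part of cgrr, and File is again a local stand-in with the documented field names.

```python
import hashlib
import os
from collections import namedtuple

# Local stand-in for cgrr's File namedtuple (documented fields: path, size, md5).
File = namedtuple("File", ["path", "size", "md5"])


def describe_file(base_path, relative_path):
    """Build a File entry for base_path/relative_path.

    The stored path stays relative, because the base path is
    passed to verify separately."""
    full_path = os.path.join(base_path, relative_path)
    with open(full_path, "rb") as handle:
        contents = handle.read()
    return File(relative_path, len(contents), hashlib.md5(contents).hexdigest())
```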

FileReader

FileReader is a factory that produces readers for specific file formats. A reader provides two methods, unpack and pack, used respectively for parsing and unparsing data from files. Under the hood, it uses the struct module.

Construct a file reader with FileReader(format), where format is a string describing the file format, such as:

score_reader = FileReader("""
<
Uint32      score         # Score at index 0x00, before name
string[16]  name
options[6]  game_options  # A six byte field with a custom data format
""")

Each line takes one of two forms:

TYPE VARIABLE_NAME

TYPE[COUNT] VARIABLE_NAME

If COUNT is not specified, it defaults to 1.

Optionally, a line may contain a single character describing the endianness of the numbers in the file, in the style of struct. By default, little-endian ('<') integers are assumed.

Characters following a pound sign ('#') are treated as comments and ignored.

If TYPE is one of the builtin types supported by the struct module (e.g. Uint16), it will be processed by struct. For builtin types, COUNT is treated as the repeat count for struct: Uint32[4] means four 32-bit unsigned integers (16 bytes), and string[4] means a 4 byte string.
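The repeat-count behaviour can be checked directly against the struct module. The mapping used here (Uint32 to the struct code "I", string to "s") is an assumption about how the type names translate; the README does not list the codes.

```python
import struct

# Assumed struct equivalents of the README's examples.
four_uint32 = struct.Struct("<4I")       # Uint32[4]: four 32-bit unsigned integers
four_byte_string = struct.Struct("<4s")  # string[4]: one 4-byte string

print(four_uint32.size)       # 16
print(four_byte_string.size)  # 4

packed = four_uint32.pack(1, 2, 3, 4)
print(four_uint32.unpack(packed))        # (1, 2, 3, 4)
print(four_byte_string.unpack(b"ABCD"))  # (b'ABCD',)
```

Note the asymmetry in struct itself: a count in front of "I" yields that many separate integers, while a count in front of "s" yields a single bytes value of that length.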

Otherwise, TYPE is treated as a user-defined type. Then COUNT is the number of bytes occupied by the variable, and the FileReader will look for a function named parse_TYPE (e.g. parse_options) when unpacking the data. If found, the function will be called with the bytestring as an argument and the return value assigned as the value of the variable. Similarly, the FileReader will pass the variable to a function named unparse_TYPE (e.g. unparse_options) which should return a bytestring of length COUNT when packing the data. If those functions are not defined, the bytes will be returned as-is.

The Struct used by this module can be accessed directly as score_reader.struct, if desired.

The reader specified above will extract three variables from a 26-byte file: score, a (little-endian) 32-bit unsigned integer; name, a 16-byte string; and game_options, a 6-byte field in a custom format.
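The 26-byte total follows from 4 + 16 + 6. As a rough illustration, the same layout can be written as a plain struct format; the format string "<I16s6s", the NUL padding of the name, and the ASCII decoding are assumptions made for this sketch and may differ from what FileReader does internally.

```python
import struct

# Assumed struct equivalent of the format above:
# "<" little-endian, "I" = Uint32, "16s" = string[16], "6s" = options[6].
SCORE_LAYOUT = struct.Struct("<I16s6s")
print(SCORE_LAYOUT.size)  # 26


def unpack_score(data):
    """Turn one 26-byte record into a dictionary keyed by variable name."""
    score, name, game_options = SCORE_LAYOUT.unpack(data)
    return {
        "score": score,
        # Assumption: names are ASCII and NUL-padded to 16 bytes.
        "name": name.rstrip(b"\x00").decode("ascii"),
        "game_options": game_options,
    }


# struct pads the short name with NUL bytes up to 16 bytes.
sample = SCORE_LAYOUT.pack(1234, b"SomeName", b"\x00" * 6)
print(unpack_score(sample))
# {'score': 1234, 'name': 'SomeName', 'game_options': b'\x00\x00\x00\x00\x00\x00'}
```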

Given a file in the required format, the file can be parsed with:

data = scorefile.read(26)
scores = score_reader.unpack(data)

which will produce scores, a dictionary with three entries:

scores = {"name" : "SomeName", "score" : 1234, "game_options" : b'......'}

Given a dictionary with these entries, pack can be used to generate a scorefile in the original format.

data = score_reader.pack( {"name" : "Cheater",
                           "score" : 9999,
                           "game_options" : b'......'} )
scorefile.write(data)

Since we didn't define parse_options and unparse_options functions, the six bytes devoted to that variable are simply assigned directly. It might be more useful to parse the options, however:

def parse_options(b):
    return { 'option' + str(i) : b[i] for i in range(6) }

def unparse_options(o):
    return bytes([o['option' + str(i)] for i in range(6)])
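A quick round trip shows that the two functions above are inverses of each other, which is what pack and unpack need from a user-defined type. The sample bytes are made up for illustration.

```python
def parse_options(b):
    # Map each of the six bytes to a named integer entry.
    return {'option' + str(i): b[i] for i in range(6)}


def unparse_options(o):
    # Rebuild the six-byte field from the named entries, in order.
    return bytes([o['option' + str(i)] for i in range(6)])


raw = bytes([1, 0, 3, 0, 0, 255])  # hypothetical six-byte options field
parsed = parse_options(raw)
print(parsed['option0'], parsed['option5'])  # 1 255

# Unparsing the parsed value must give back the original bytes.
assert unparse_options(parsed) == raw
assert len(unparse_options(parsed)) == 6
```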

FileReader.from_offsets

If you know the offsets of data in a file, but not necessarily the format of the whole file, the from_offsets constructor may be more useful.

Construct a file reader with from_offsets(format_def), where format_def is a string describing the file format, such as:

FileReader.from_offsets('''
<
0x00 Uint32      score    # Score at index 0x00, before name
0x04 string[16]  name
0x14 options[6]  options  # A six byte field with a custom data format
0x1a EOF
''')

Each line takes one of two forms:

OFFSET TYPE VARIABLE_NAME

OFFSET TYPE[COUNT] VARIABLE_NAME

The final line of format_def may be:

FILE_LENGTH EOF

OFFSET and FILE_LENGTH must be specified in hexadecimal. Each number must begin with '0x'; the hex digits may be uppercase or lowercase, so 0x1a and 0x1A are equivalent.
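For reference, Python's built-in int accepts this notation when given base 16. The helper below is a hypothetical sketch of the rule as stated, not cgrr's parser; in particular, rejecting a capital '0X' prefix is an assumption.

```python
def parse_offset(token):
    """Convert a hexadecimal token such as '0x1a' to an integer."""
    if not token.startswith("0x"):
        raise ValueError("offset must begin with '0x': %r" % token)
    return int(token, 16)


print(parse_offset("0x1a"))  # 26
print(parse_offset("0x1A"))  # 26
```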

It is not required to specify offsets in any particular order.

Optionally, a line may contain a single character describing the endianness of the numbers in the file, in the style of struct. By default, little-endian ('<') integers are assumed.

For an explanation of the remaining segment of each line, see the documentation for FileReader.

This function is useful if a file format contains unknown segments, because from_offsets will automatically fill in the unknown segments with dummy variables. So:

FileReader.from_offsets('''
<
0x00 Uint32      score    # Score at index 0x00, before name
0x04 string[16]  name
0x24 options[6]  options  # A six byte field with a custom data format
0x50 EOF
''')

is equivalent to:

FileReader('''
<
Uint32      score   # 0x00-0x03: Score at index 0x00, before name
string[16]  name    # 0x04-0x13
unknown[16] unk1    # 0x14-0x23
options[6]  options # 0x24-0x29: A six byte field with a custom data format
unknown[38] unk2    # 0x2a-0x4f
''')
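The gap-filling arithmetic behind this equivalence can be sketched as follows. fill_gaps is a hypothetical function written for illustration; the unk1, unk2 naming follows the example above, but the real from_offsets may order or name things differently.

```python
def fill_gaps(fields, eof=None):
    """Return a contiguous layout of (offset, size, name) tuples.

    fields is a list of (offset, size, name) tuples in any order.
    Bytes not covered by a field become 'unkN' entries."""
    layout = []
    position = 0       # first byte not yet accounted for
    unknown_count = 0
    for offset, size, name in sorted(fields):
        if offset < position:
            raise ValueError("field %r overlaps the previous field" % name)
        if offset > position:
            unknown_count += 1
            layout.append((position, offset - position, "unk%d" % unknown_count))
        layout.append((offset, size, name))
        position = offset + size
    if eof is not None and eof > position:
        unknown_count += 1
        layout.append((position, eof - position, "unk%d" % unknown_count))
    return layout


fields = [(0x00, 4, "score"), (0x04, 16, "name"), (0x24, 6, "options")]
for offset, size, name in fill_gaps(fields, eof=0x50):
    print("0x%02x %2d %s" % (offset, size, name))
# 0x00  4 score
# 0x04 16 name
# 0x14 16 unk1
# 0x24  6 options
# 0x2a 38 unk2
```

Sorting by offset first is what lets the fields be listed in any order.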

The EOF statement is optional. If it is omitted, the file is presumed to end where the variable with the highest offset ends.

Example usage

cgrr.py is used by other modules in the CGRR project. For example:

cgrr-gameboy, which reads and edits Game Boy ROM headers
cgrr-gamecube, which reads and edits GameCube GCI files
cgrr-mariospicross, which reads and edits puzzles for the Game Boy game Mario's Picross
cgrr-pokemon, which reads and edits save files for Pokemon games

License

CGRR is available under the GPL v3 or later. See the file COPYING for details.

Project details

Release history

1.3.0 (this version), Jul 26, 2018

1.2.1, Jul 25, 2018

Download files

Source Distribution

cgrr-1.3.0.tar.gz (21.5 kB)

Uploaded Jul 26, 2018 Source

Built Distribution

cgrr-1.3.0-py3-none-any.whl (10.5 kB)

Uploaded Jul 26, 2018 Python 3

File details

Details for the file cgrr-1.3.0.tar.gz.

File metadata

Download URL: cgrr-1.3.0.tar.gz
Upload date: Jul 26, 2018
Size: 21.5 kB
Tags: Source
Uploaded using Trusted Publishing? No

File hashes

Hashes for cgrr-1.3.0.tar.gz
Algorithm	Hash digest
SHA256	`f50670cee22c09081029764643288786f86ae22fbd9362476cc5821f759d2689`
MD5	`99e9aa5519abbe6c35e5c8e9acf6d89c`
BLAKE2b-256	`4325f1a38097816e09435e234279f3f44b4e9040db9379de1411ea4142f886d1`

File details

Details for the file cgrr-1.3.0-py3-none-any.whl.

File metadata

Download URL: cgrr-1.3.0-py3-none-any.whl
Upload date: Jul 26, 2018
Size: 10.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No

File hashes

Hashes for cgrr-1.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`10e35bc2e4f92278097976653e4f878799136cf81a4f8f1aa708c5f273212126`
MD5	`3c18eef9a81dce316429f284bbeca934`
BLAKE2b-256	`c8e616fa73e91a0d5b2d11c4a735d5addbf845a0b07ea267ca7bb7868edf5fdc`