Package of Hachoir parsers used to open binary files

Project description

hachoir-parser is a package of parsers for the most common file formats, written for the Hachoir framework. Not all parsers are complete: some are very good, while others are poor and, for example, parse only the first level of the tree.

A perfect parser has no “raw” field: with a perfect parser, you can know the meaning of every bit. Some good (but not perfect ;-)) parsers:

  • Matroska video
  • Microsoft RIFF (AVI video, WAV audio, CDA file)
  • PNG picture
  • TAR and ZIP archive

The GnomeKeyring parser requires the Python Crypto module: http://www.amk.ca/python/code/crypto.html

Website: http://bitbucket.org/haypo/hachoir/wiki/hachoir-parser

hachoir-parser 1.3.4 (2010-07-26)

  • update matroska parser to support WebM videos

hachoir-parser 1.3.3 (2010-04-15)

  • fix setup.py: don’t use the with statement, to stay compatible with Python 2.4

hachoir-parser 1.3.2 (2010-03-01)

  • Include the README file in the tarball
  • setup.py reads the README file instead of using README.py to break the build dependency on hachoir-core

hachoir-parser 1.3.1 (2010-01-28)

  • Create MANIFEST.in to include extra files: README.py, README.header, tests/run_testcase.py, etc.
  • Create an INSTALL file

hachoir-parser 1.3 (2010-01-20)

  • New parsers:
    • BLP: Blizzard Image
    • PRC: Palm resource
  • HachoirParserList() is no longer a singleton: use HachoirParserList.getInstance() to get a singleton
  • Add an optional tags argument to createParser(); it can be used, for example, to force a parser
  • Fix ParserList.print_(): first argument is now the title and not ‘out’. If out is not specified, use sys.stdout.
  • MP3: support encapsulated objects (GEOB in ID3)
  • Create a dictionary: Windows codepage => charset name (CODEPAGE_CHARSET)
  • ASN.1: support boolean and enum types; fix bit string parser
  • MKV: use textHandler()
  • AVI: create index parser, use file size header to detect padding at the end
  • ISO9660: strip nul bytes in application name
  • JPEG: add ICC profile chunk name
  • PNG: fix transparency parser (tRNS)
  • BPLIST: support empty value for markers 4, 5 and 6
  • Microsoft Office summary: support more codepages (CP874, Windows 1250..1257)
  • tcpdump: support ICMPv6 and IPv6
  • Java: add bytecode parser, support JDK 1.6
  • Python: parse lnotab content, fill a string table for the references
  • MPEG Video: parse many more chunks
  • MOV: Parse file type header, create the right MIME type
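
The HachoirParserList change above separates "construct a new list" from "get the shared list". The pattern can be sketched as follows. This is a minimal illustration in Python 3 that does not import hachoir; the class ParserRegistry and its add() method are hypothetical names, not hachoir's actual implementation.

```python
class ParserRegistry:
    """Hypothetical stand-in for a parser list (not hachoir's class)."""

    _instance = None  # shared instance, created on first request

    def __init__(self):
        # Calling the constructor always builds an independent object.
        self.parsers = []

    @classmethod
    def getInstance(cls):
        # Create the shared instance lazily, then always return it.
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance

    def add(self, name):
        self.parsers.append(name)


shared_a = ParserRegistry.getInstance()
shared_b = ParserRegistry.getInstance()
private = ParserRegistry()   # independent of the shared instance
shared_a.add("png")          # visible through shared_b, not through private
```

With this layout, code that needs its own list calls the constructor, and code that wants the common list calls getInstance().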

hachoir-parser 1.2.1 (2008-10-16)

  • Improve OLE2 and MS Office parsers:
    • support small blocks
    • fix the charset of the summary properties
    • summary property integers are unsigned
    • use TimedeltaWin64 for the TotalEditingTime field
    • create minimum Word document parser
  • Python parser: support the magic numbers of Python 3000 with keyword-only arguments
  • MPEG audio: reject files with neither a valid frame nor an ID3 header
  • Skip subfiles in JPEG files
  • Create Apple/NeXT Binary Property List (BPLIST) parser by Robert Xiao

hachoir-parser 1.2 (2008-09-03)

  • Create FLAC parser, written by Esteban Loiseau
  • Create Action Script parser used in Flash parser, written by Sebastien Ponce
  • Create Gnome Keyring parser: able to parse the stored passwords using Python Crypto if the main password is written in the code :-)
  • GIF: support text extension field; parse image content (LZW compressed data)
  • Fix charset of IPTC string (guess it, it’s not always ISO-8859-1)
  • TIFF: Sebastien Ponce improved the parser: parse image data, add many tags, etc.
  • MS Office: guess the charset for summary strings since it could be ISO-8859-1 or UTF-8
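
One simple way to guess between UTF-8 and ISO-8859-1, as the IPTC and MS Office entries above describe, is to try strict UTF-8 first and fall back to ISO-8859-1, which accepts any byte sequence. The sketch below is written in Python 3 and assumes this fallback strategy; the function name guess_decode is hypothetical, and the heuristic hachoir actually uses may differ.

```python
def guess_decode(raw):
    """Return (text, charset_name) for a byte string.

    Strict UTF-8 decoding fails on most ISO-8859-1 text that contains
    non-ASCII bytes, so a decoding error is used as the signal to fall back.
    """
    try:
        return raw.decode("utf-8"), "UTF-8"
    except UnicodeDecodeError:
        # ISO-8859-1 maps every byte to a character, so this cannot fail.
        return raw.decode("iso-8859-1"), "ISO-8859-1"


utf8_result = guess_decode("caf\u00e9".encode("utf-8"))
latin1_result = guess_decode(b"caf\xe9")
```

Pure ASCII input decodes identically under both charsets, so it is reported as UTF-8 here.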

hachoir-parser 1.1 (2008-04-01)

Main changes: add “EFI Platform Initialization Firmware Volume” (PIFV) and “Microsoft Windows Help” (HLP) parsers. Details:

  • MPEG audio:
    • add createContentSize() to support hachoir-subfile
    • support files starting with ID3v1
    • if the file doesn’t contain any frame, use ID3v1 or ID3v2 to create the description
  • EXIF:
    • use “count” field value
    • create RationalInt32 and RationalUInt32
    • fix for empty value
    • add GPS tags
  • JPEG:
    • support Ducky (APP12) chunk
    • support Comment chunk
    • improve validate(): make sure that first 3 chunk types are known
  • RPM: use bzip2 or gzip handler to decompress content
  • S3M: fix some parser bugs
  • OLE2: reject negative block index (or special block index)
  • ip2name(): catch KeyboardInterrupt and don’t resolve the following addresses
  • ELF: support big endian
  • PE: createContentSize() works on PE program, improve resource section detection
  • AMF: stop mixed array parser on empty key
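
The EXIF entry above mentions RationalInt32 and RationalUInt32. In EXIF, a rational value is stored as two consecutive 32-bit integers (numerator, then denominator), signed or unsigned, in the byte order of the file. The following Python 3 sketch reads such a value with the standard struct module; the function read_rational and its parameters are hypothetical and do not reproduce hachoir's field classes.

```python
import struct
from fractions import Fraction


def read_rational(data, offset=0, signed=False, big_endian=False):
    """Read a numerator/denominator pair of 32-bit integers.

    Returns a Fraction, or None when the denominator is zero
    (treated here as "no usable value", which is an assumption).
    """
    byte_order = ">" if big_endian else "<"
    int_pair = "ii" if signed else "II"
    numerator, denominator = struct.unpack_from(byte_order + int_pair, data, offset)
    if denominator == 0:
        return None
    return Fraction(numerator, denominator)


# Example: an exposure time of 1/200 s, stored little endian and unsigned.
exposure = read_rational(struct.pack("<II", 1, 200))
```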

hachoir-parser 1.0 (2007-07-11)

Changes:

  • OLE2: Support files bigger than 6 MB (support many DIFAT blocks)
  • OLE2: Add createContentSize() to guess content size
  • LNK: Improve parser (now able to parse the whole file)
  • EXE PE: Add more subsystem names
  • PYC: Support Python 2.5c2
  • Fix many spelling mistakes

Minor changes:

  • PYC: Fix long integer parser (negative number), add (disabled) code to disassemble bytecode, use self.code_info to avoid replacing self.info
  • OLE2: Add “.msi” file extension
  • OLE2: Fix to support documents generated on Mac
  • EXIF: set max IFD entry count to 1000 (instead of 200)
  • EXIF: don’t limit BYTE/UNDEFINED IFD entry count
  • EXIF: add “User comment” tag
  • GIF: fix image and screen description
  • bzip2: catch decompressor error to be able to read trailing data
  • Fix file extensions of AIFF
  • Windows GUID: use the new TimestampUUID60 field type
  • RIFF: convert class constant names to upper case
  • Fix RIFF: don’t replace self.info method
  • ISO9660: Write parser for terminator content

Parser list

Archive

  • 7zip: Compressed archive in 7z format
  • ace: ACE archive
  • bzip2: bzip2 archive
  • cab: Microsoft Cabinet archive
  • gzip: gzip archive
  • mar: Microsoft Archive
  • rar: Roshal archive (RAR)
  • rpm: RPM package
  • tar: TAR archive
  • unix_archive: Unix archive
  • zip: ZIP archive

Audio

  • aiff: Audio Interchange File Format (AIFF)
  • fasttracker2: FastTracker2 module
  • flac: FLAC audio
  • itunesdb: iPod iTunesDB file
  • midi: MIDI audio
  • mod: Uncompressed Amiga module
  • mpeg_audio: MPEG audio version 1, 2, 2.5
  • ptm: PolyTracker module (v1.17)
  • real_audio: Real audio (.ra)
  • s3m: ScreamTracker3 module
  • sun_next_snd: Sun/NeXT audio

Container

  • asn1: Abstract Syntax Notation One (ASN.1)
  • matroska: Matroska multimedia container
  • ogg: Ogg multimedia container
  • ogg_stream: Ogg logical stream
  • real_media: RealMedia (rm) Container File
  • riff: Microsoft RIFF container
  • swf: Macromedia Flash data

File System

  • ext2: EXT2/EXT3 file system
  • fat12: FAT12 filesystem
  • fat16: FAT16 filesystem
  • fat32: FAT32 filesystem
  • iso9660: ISO 9660 file system
  • linux_swap: Linux swap file
  • msdos_harddrive: MS-DOS hard drive with Master Boot Record (MBR)
  • ntfs: NTFS file system
  • reiserfs: ReiserFS file system

Game

  • blp1: Blizzard Image Format, version 1
  • blp2: Blizzard Image Format, version 2
  • lucasarts_font: LucasArts Font
  • spiderman_video: The Amazing Spider-Man vs. The Kingpin (Sega CD) FMV video
  • zsnes: ZSNES Save State File (only version 143)

Image

  • bmp: Microsoft bitmap (BMP) picture
  • gif: GIF picture
  • ico: Microsoft Windows icon or cursor
  • jpeg: JPEG picture
  • pcx: PC Paintbrush (PCX) picture
  • png: Portable Network Graphics (PNG) picture
  • psd: Photoshop (PSD) picture
  • targa: Truevision Targa Graphic (TGA)
  • tiff: TIFF picture
  • wmf: Microsoft Windows Metafile (WMF)
  • xcf: Gimp (XCF) picture

Misc

  • 3do: renderdroid 3d model.
  • 3ds: 3D Studio Max model
  • bplist: Apple/NeXT Binary Property List
  • chm: Microsoft’s HTML Help (.chm)
  • gnomekeyring: Gnome keyring
  • hlp: Microsoft Windows Help (HLP)
  • lnk: Windows Shortcut (.lnk)
  • ole2: Microsoft Office document
  • pcf: X11 Portable Compiled Font (pcf)
  • pdf: Portable Document Format (PDF) document
  • tcpdump: Tcpdump file (network)
  • torrent: Torrent metainfo file
  • ttf: TrueType font

Program

  • elf: ELF Unix/BSD program/library
  • exe: Microsoft Windows Portable Executable
  • java_class: Compiled Java class
  • pifv: EFI Platform Initialization Firmware Volume
  • prc: Palm Resource File
  • python: Compiled Python script (.pyc/.pyo files)

Video

  • asf: Advanced Streaming Format (ASF), used for WMV (video) and WMA (audio)
  • flv: Macromedia Flash video
  • mov: Apple QuickTime movie
  • mpeg_ts: MPEG-2 Transport Stream
  • mpeg_video: MPEG video, version 1 or 2

Total: 78 parsers

