Skip to main content

Core of Hachoir framework: parse and edit binary files

Project description

Hachoir project

Hachoir is a Python library used to represent of a binary file as a tree of Python objects. Each object has a type, a value, an address, etc. The goal is to be able to know the meaning of each bit in a file.

Why using slow Python code instead of fast hardcoded C code? Hachoir has many interesting features:

  • Autofix: Hachoir is able to open invalid / truncated files
  • Lazy: Open a file is very fast since no information is read from file, data are read and/or computed when the user ask for it
  • Types: Hachoir has many predefined field types (integer, bit, string, etc.) and supports string with charset (ISO-8859-1, UTF-8, UTF-16, …)
  • Addresses and sizes are stored in bit, so flags are stored as classic fields
  • Endian: You have to set endian once, and then number are converted in the right endian
  • Editor: Using Hachoir representation of data, you can edit, insert, remove data and then save in a new file.



For the installation, use or see:

hachoir-core 1.3.3 (2010-02-26)

  • Add writelines() method to UnicodeStdout

hachoir-core 1.3.2 (2010-01-28)

  • includes also the documentation

hachoir-core 1.3.1 (2010-01-21)

  • Create to include ChangeLog and other files for

hachoir-core 1.3 (2010-01-20)

  • Add more charsets to GenericString: CP874, WINDOWS-1250, WINDOWS-1251, WINDOWS-1254, WINDOWS-1255, WINDOWS-1256,WINDOWS-1257, WINDOWS-1258, ISO-8859-16
  • Fix initLocale(): return charset even if config.unicode_stdout is False
  • initLocale() leave sys.stdout and sys.stderr unchanged if the readline module is loaded: Hachoir can now be used correctly with ipython
  • HachoirError: replace “message” attribute by “text” to fix Python 2.6 compatibility (message attribute is deprecated)
  • StaticFieldSet: fix Python 2.6 warning, object.__new__() takes one only argument (the class).
  • Fix GenericFieldSet.readMoreFields() result: don’t count the number of added fields in a loop, use the number of fields before/after the operation using len()
  • GenericFieldSet.__iter__() supports iterable result for _fixFeedError() and _stopFeeding()
  • New seekable field set implementation in hachoir_core.field.new_seekable_field_set

hachoir-core 1.2.1 (2008-10)

  • Create configuration option “unicode_stdout” which avoid replacing stdout and stderr by objects supporting unicode string
  • Create TimedeltaWin64 file type
  • Support WINDOWS-1252 and WINDOWS-1253 charsets for GenericString
  • guessBytesCharset() now supports ISO-8859-7 (greek)
  • durationWin64() is now deprecated, use TimedeltaWin64 instead

hachoir-core 1.2 (2008-09)

  • Create Field.getFieldType(): describe a field type and gives some useful informations (eg. the charset for a string)
  • Create TimestampUnix64
  • GenericString: only guess the charset once; if the charset attribute if not set, guess it when it’s asked by the user.

hachoir-core 1.1 (2008-04-01)

Main change: string values are always encoded as Unicode. Details:

  • Create guessBytesCharset() and guessStreamCharset()
  • GenericString.createValue() is now always Unicode: if charset is not specified, try to guess it. Otherwise, use default charset (ISO-8859-1)
  • RawBits: add createRawDisplay() to avoid slow down on huge fields
  • Fix SeekableFieldSet.current_size (use offset and not current_max_size)
  • GenericString: fix UTF-16-LE string with missing nul byte
  • Add __nonzero__() method to GenericTimestamp
  • All stream errors now inherit from StreamError (instead of HachoirError), and create and OutputStreamError
  • humanDatetime(): strip microseconds by default (add optional argument to keep them)

hachoir-core 1.0 (2007-07-10)

Version 1.0.1 changelog:
  • Rename parser.tags to parser.PARSER_TAGS to be compatible with future hachoir-parser 1.0
Visible changes:
  • New field type: TimestampUUID60
  • SeekableFieldSet: fix __getitem__() method and implement __iter__() and __len__() methods, so it can now be used in hachoir-wx
  • String value is always Unicode, even on conversion error: use
  • OutputStream: add readBytes() method
  • Create Language class using ISO-639-2
  • Add hachoir_core.profiler module to run a profiler on a function
  • Add hachoir_core.timeout module to call a function with a timeout
Minor changes:
  • Fix many spelling mistakes
  • Dict: use iteritems() instead of items() for faster operations on huge dictionaries

Project details

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Files for hachoir-core, version 1.3.3
Filename, size File type Python version Upload date Hashes
Filename, size hachoir_core-1.3.3-py2.4.egg (176.6 kB) File type Egg Python version 2.4 Upload date Hashes View hashes
Filename, size hachoir_core-1.3.3-py2.5.egg (176.7 kB) File type Egg Python version 2.5 Upload date Hashes View hashes
Filename, size hachoir_core-1.3.3-py2.6.egg (176.6 kB) File type Egg Python version 2.6 Upload date Hashes View hashes
Filename, size hachoir-core-1.3.3.tar.gz (91.5 kB) File type Source Python version None Upload date Hashes View hashes

Supported by

Elastic Elastic Search Pingdom Pingdom Monitoring Google Google BigQuery Sentry Sentry Error logging AWS AWS Cloud computing DataDog DataDog Monitoring Fastly Fastly CDN DigiCert DigiCert EV certificate StatusPage StatusPage Status page