A package to ease mass data extraction from Excel files

Project description



xlscrap is an MIT-licensed package that eases mass data extraction from Excel files.

See the documentation.


Have you ever felt the pain of extracting data from a lot of Excel files?

  • When you have hundreds or thousands of files that look similar but differ in slightly annoying details.
  • When data cell coordinates can't be used because they change from file to file.
  • When you have to spot dozens or hundreds of fields with different strategies.
  • When the same field appears at a different position or in a differently named sheet.
  • When the label of the same field changes.
  • When the data cell is sometimes to the right of the label and sometimes below it.
  • When you need to check that the collected data is correct.

xlscrap helps you scrape data out of your Excel files.
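
The label-based lookup described in the list above (find a label, then read the cell to its right or below it) can be sketched in plain Python. This is only an illustration of the idea, not xlscrap's implementation: the names `normalize` and `find_field`, and the representation of a sheet as a list of rows, are assumptions made for this example.

```python
from typing import Any, Iterable, Optional, Sequence

# Assumption for this sketch: a sheet is a list of rows, each a list of cell values.
Grid = Sequence[Sequence[Any]]


def normalize(text: Any) -> str:
    """Lower-case a cell value and strip whitespace and trailing colons."""
    return str(text).strip().rstrip(":").strip().lower()


def find_field(grid: Grid, labels: Iterable[str]) -> Optional[Any]:
    """Return the value next to the first cell matching one of `labels`.

    The cell to the right of the label is tried first, then the cell below.
    Returns None when no label matches or both neighbours are empty.
    """
    wanted = {normalize(label) for label in labels}
    for r, row in enumerate(grid):
        for c, cell in enumerate(row):
            if cell is None or normalize(cell) not in wanted:
                continue
            # Candidate 1: the cell on the right of the label.
            if c + 1 < len(row) and row[c + 1] not in (None, ""):
                return row[c + 1]
            # Candidate 2: the cell below the label.
            if r + 1 < len(grid) and c < len(grid[r + 1]):
                below = grid[r + 1][c]
                if below not in (None, ""):
                    return below
    return None


# Two hypothetical sheets holding the same field under different labels and layouts.
sheet_a = [["Name:", "Diana"], ["Age", 47]]
sheet_b = [["Full name", None, "Age"], ["Bob", None, 52]]

print(find_field(sheet_a, ["name", "full name"]))  # Diana
print(find_field(sheet_b, ["name", "full name"]))  # Bob
print(find_field(sheet_b, ["age"]))                # 52
```

The sketch is deliberately naive: if the cell to the right of a label is itself another label, it is returned as the value. Real files call for stricter strategies.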


>>> import xlscrap
>>> s = xlscrap.Scrapper()
>>> s.field('name')
>>> s.field('age')
>>> s.field('address')
>>> s.table('pets', fields=['name', 'breed', 'age'])
>>> s.scrap('excel-files/*.xls*')
looking for 4 fields in 5 files in excel-files/*.xls*,
file 1/5, found 4/4 fields in diana.xlsx
file 2/5, found 4/4 fields in bob.xls
file 3/5, found 3/4 fields in richard.ods
file 4/5, found 0/4 fields in alien.xls
file 5/5, found 4/4 fields in maria.xlsm
>>> s.result
{'name': 'Diana',
 'age': 47,
 'address': '44 rue du Louvre\n75000 Paris\nFrance',
 'pets': []}
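
Checking that the collected data is correct, the last point in the list above, could look like the following sketch. It is independent of xlscrap: `check_record` and `EXPECTED_TYPES` are hypothetical names, and the sketch assumes a scraped record is a plain dict shaped like the result shown above.

```python
from typing import Any, Dict, List

# Assumed schema for this example, mirroring the fields of the quickstart.
EXPECTED_TYPES: Dict[str, type] = {
    "name": str,
    "age": int,
    "address": str,
    "pets": list,
}


def check_record(record: Dict[str, Any], expected: Dict[str, type]) -> List[str]:
    """Return a list of human-readable problems; an empty list means the record passed."""
    problems = []
    for field, expected_type in expected.items():
        value = record.get(field)
        if value is None or value == "":
            problems.append(f"{field}: missing")
        elif not isinstance(value, expected_type):
            problems.append(
                f"{field}: expected {expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
    return problems


good = {
    "name": "Diana",
    "age": 47,
    "address": "44 rue du Louvre\n75000 Paris\nFrance",
    "pets": [],
}
# A hypothetical incomplete record: no address, and the age scraped as text.
bad = {"name": "Richard", "age": "47 years", "pets": []}

print(check_record(good, EXPECTED_TYPES))  # []
print(check_record(bad, EXPECTED_TYPES))   # ['age: expected int, got str', 'address: missing']
```

An empty `pets` list passes the check here, because an empty table is treated as a valid value rather than a missing one.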


  • set gitlab URL in
  • clone gitlab/github
  • complete quickstart in README

Download files

Files for xlscrap, version 0.1.0

Filename                        Size    File type  Python version
xlscrap-0.1.0.tar.gz            3.4 kB  Source     None
xlscrap-0.1.0-py3-none-any.whl  3.4 kB  Wheel      py3