Library for extracting cellar data

Project description

Cellar extractor

This library provides two functions for retrieving CELLAR case law data from EUR-Lex.

Version

Python 3.9

Contributors

pranavnbapat (Pranav Bapat)
Cloud956 (Piotr Lewandowski)
shashankmc
gijsvd

How to install?

pip install cellar-extractor

What are the functions?

  • Cellar Extractor
    1. get_cellar
       Gets all the ECLI data from the EUR-Lex SPARQL endpoint and returns it in CSV or JSON format, either in-memory or as a saved file.
    2. get_cellar_extra
       Gets all the ECLI data from the EUR-Lex SPARQL endpoint and additionally scrapes the EUR-Lex website for the full text, keywords, case law directory codes and EuroVoc identifiers. The full text is returned as JSON and the rest of the data as CSV, either in-memory or as saved files.
  • ECHR - Work in progress
  • Rechtspraak - rechtspraak_extractor

What are the parameters?

    1. get_cellar
      • max_ecli: int, optional, default 100
        Maximum number of ECLIs to retrieve
      • sd: date, optional, default '2022-05-01'
        The start publication date (yyyy-mm-dd)
      • ed: date, optional, default current date
        The end publication date (yyyy-mm-dd)
      • save_file: ['y', 'n'], optional, default 'y'
        Save the data as a file in a data folder ('y'), or return it in-memory ('n')
      • file_format: ['csv', 'json'], optional, default 'csv'
        Return the data as JSON / a dictionary, or as CSV / a Pandas DataFrame object
    2. get_cellar_extra
      Defaults: ed=None, save_file='y', max_ecli=100, sd="2022-05-01", threads=10
      • max_ecli: int, optional, default 100
        Maximum number of ECLIs to retrieve
      • sd: date, optional, default '2022-05-01'
        The start publication date (yyyy-mm-dd)
      • ed: date, optional, default current date
        The end publication date (yyyy-mm-dd)
      • save_file: ['y', 'n'], optional, default 'y'
        With 'y', save the full text of the cases as a JSON file and the rest of the data as a CSV file. With 'n', return the full text as a dictionary and the rest of the data as a Pandas DataFrame object
      • threads: int, optional, default 10
        Number of threads used to extract the additional data. Extracting this data is slow, and multi-threading can cut the time down. Even so, the method may take a couple of minutes for a couple of hundred cases. A maximum of 15 threads is recommended, as this method may also affect the device's internet connection.
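The parameter rules above can be checked before a long-running call is started. The sketch below is not part of cellar_extractor: `build_extra_kwargs` and `MAX_RECOMMENDED_THREADS` are hypothetical names introduced here for illustration. It assumes dates are passed as yyyy-mm-dd strings, as in the defaults listed above, and caps `threads` at the recommended maximum of 15.

```python
from datetime import date, datetime

# Upper bound recommended in the parameter notes; the constant name is an assumption.
MAX_RECOMMENDED_THREADS = 15


def build_extra_kwargs(max_ecli=100, sd="2022-05-01", ed=None, save_file="y", threads=10):
    """Validate arguments and return them as keyword arguments for get_cellar_extra.

    Hypothetical helper for illustration; it does not call the library itself.
    """
    # strptime raises ValueError if a date is not in yyyy-mm-dd form.
    start = datetime.strptime(sd, "%Y-%m-%d").date()
    end = datetime.strptime(ed, "%Y-%m-%d").date() if ed else date.today()
    if start > end:
        raise ValueError("sd must not be later than ed")
    if save_file not in ("y", "n"):
        raise ValueError("save_file must be 'y' or 'n'")
    if max_ecli < 1:
        raise ValueError("max_ecli must be a positive integer")

    # Keep the thread count between 1 and the recommended maximum.
    threads = max(1, min(threads, MAX_RECOMMENDED_THREADS))

    kwargs = {"max_ecli": max_ecli, "sd": sd, "save_file": save_file, "threads": threads}
    if ed:
        # Leave ed out when it is not given, so the library default (current date) applies.
        kwargs["ed"] = ed
    return kwargs
```

The returned dictionary can then be unpacked into the call, for example `cell.get_cellar_extra(**build_extra_kwargs(max_ecli=200, threads=40))`, which would run with 15 threads.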

    Examples

    import cellar_extractor as cell
    
    Below are examples for in-file saving:
    
    cell.get_cellar(save_file='y', max_ecli=200, sd='2022-01-01', file_format='csv')
    cell.get_cellar_extra(max_ecli=100, sd='2022-01-01', threads=15)
    
    Below are examples for in-memory saving:
    
    df = cell.get_cellar(save_file='n', file_format='csv', sd='2022-01-01', max_ecli=1000)
    df, full_text = cell.get_cellar_extra(save_file='n', max_ecli=1000, sd='2022-01-01', threads=15)
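Results returned in-memory can be written to disk later, for instance after filtering them. The following is a minimal sketch under stated assumptions: `save_results` and the output file names are hypothetical and not part of cellar_extractor, the first value returned by `get_cellar_extra` is assumed to offer the pandas `to_csv` method, and the second is assumed to be JSON-serialisable.

```python
import json
from pathlib import Path


def save_results(metadata, full_text, out_dir="data"):
    """Write in-memory results to disk (hypothetical helper, not part of the library).

    metadata  -- object with a pandas-style to_csv(path, index=False) method
    full_text -- JSON-serialisable dict or list holding the full texts (assumed)
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # The file names below are assumptions chosen for this example.
    csv_path = out / "cellar_metadata.csv"
    json_path = out / "cellar_full_text.json"

    metadata.to_csv(csv_path, index=False)
    with open(json_path, "w", encoding="utf-8") as fh:
        # ensure_ascii=False keeps accented characters readable in the output file.
        json.dump(full_text, fh, ensure_ascii=False, indent=2)
    return csv_path, json_path
```

With the two values returned by the in-memory `get_cellar_extra` call, this would be invoked as `save_results(first_value, second_value)`.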
    

    License

    License: Apache 2.0

    Previously under the MIT License, as of 28/10/2022 this work is licensed under the Apache License, Version 2.0.

    Apache License, Version 2.0
    
    Copyright (c) 2022 Maastricht Law & Tech Lab
    
    Licensed under the Apache License, Version 2.0 (the "License");
    you may not use this file except in compliance with the License.
    You may obtain a copy of the License at
        
        http://www.apache.org/licenses/LICENSE-2.0
    
    Unless required by applicable law or agreed to in writing, software
    distributed under the License is distributed on an "AS IS" BASIS,
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License.
    
