
Scrape problem submissions from different platforms

Project description

Scraping Engine for Competitive Programming Accelerated Retriever (Secpar)

Overview

Secpar is a Python command-line tool that scrapes your code submissions from online programming platforms and stores them in a GitHub repository. It supports Codeforces, CSES (University of Helsinki), and Vjudge. This documentation describes Secpar's features and how to use them.

Demo Repo: CP-Submissions

Table of Contents

  1. Features

  2. Installation

  3. Usage

  4. Command-Line Interface

  5. Scraper Configuration

  6. Customization

  7. Data Storage

  8. FAQs

  9. Contributing

  10. License

  11. Upcoming Features

1. Features

  • Supported Platforms: Secpar supports the following programming platforms:

    • Codeforces

    • CSES (University of Helsinki)

    • Vjudge

  • GitHub Integration: Code submissions are stored in a GitHub repository, making it easy to manage and share your solutions.

  • Automatic README Generation: Secpar automatically generates a README file for your GitHub repository, listing all your code submissions with problem details, links, and tags.

  • Incremental Scraping: Secpar keeps track of previously scraped submissions, ensuring that only new submissions are added to your repository.

  • Multi-account Scraping: If you have several accounts on the same platform, you can scrape that platform once per account without creating duplicate entries.

  • Authentication: Securely authenticate with the supported platforms to access your submissions.

  • Customization: Customize the formatting of your README and configure other scraper options.
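
The incremental and multi-account behaviour described above can be pictured with a small sketch. This is not Secpar's actual implementation: the function names, the JSON state file, and the `platform/id` key format are all assumptions made for illustration. The idea is to record a key for every stored submission and skip any fetched submission whose key is already known.

```python
import json
from pathlib import Path


def submission_key(submission):
    """Build a key that is unique per platform and submission id (hypothetical format)."""
    return f"{submission['platform']}/{submission['id']}"


def select_new_submissions(fetched, seen_keys):
    """Return fetched submissions that are neither recorded nor repeated in this batch."""
    new = []
    batch_keys = set()
    for submission in fetched:
        key = submission_key(submission)
        if key in seen_keys or key in batch_keys:
            continue  # already stored, or a duplicate within this batch
        batch_keys.add(key)
        new.append(submission)
    return new


def load_seen_keys(path):
    """Read recorded keys from a JSON file; a missing file means nothing was scraped yet."""
    path = Path(path)
    if not path.exists():
        return set()
    return set(json.loads(path.read_text(encoding="utf-8")))


def save_seen_keys(path, seen_keys):
    """Persist the recorded keys as a sorted JSON list."""
    Path(path).write_text(json.dumps(sorted(seen_keys)), encoding="utf-8")
```

Because the key depends on the platform and the submission id rather than on the account, scraping a second account on the same platform only adds submissions that are not yet recorded.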

2. Installation

To use Secpar, follow these installation steps:

  1. Clone the Repository: Clone the Secpar repository to your local machine:

    git clone https://github.com/your-username/scraper.git
    
  2. Install Dependencies: Install the necessary Python dependencies by navigating to the repository directory and running:

    pip install -r requirements.txt
    
  3. Configuration: Set up your GitHub repository and obtain a GitHub access token.

  4. Initialize: Run the initialization command to set up your user data and repository configuration:

    python main.py -c init
    

    Follow the prompts to provide your GitHub username, repository name, and access token.

3. Usage

Secpar has two primary modes of operation: initialization and scraping.

Initialization

Initialization is the first step to configure your scraper for a GitHub repository.

  1. Run the initialization command:

    python main.py -c init
    
  2. Follow the prompts to enter your GitHub username, repository name, and access token.

Scraping

Scraping allows you to retrieve code submissions from supported platforms and store them in your GitHub repository.

  1. To scrape submissions, use the following command:

    python main.py -s PLATFORM_NAME
    

    Replace PLATFORM_NAME with one of the supported platforms: codeforces, cses, or vjudge.

  2. Depending on the platform, you may need to provide additional information such as your platform username and password.

  3. Secpar will fetch new submissions and update your GitHub repository.
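
If you scrape several platforms regularly, the command above can be wrapped in a short Python script. The sketch below is an assumption-laden convenience, not part of Secpar: `build_scrape_command` and `scrape_all` are hypothetical helpers, and the script assumes it runs from the repository directory that contains `main.py`.

```python
import subprocess
import sys

# Platform names accepted by the -s option, as listed above.
PLATFORMS = ("codeforces", "cses", "vjudge")


def build_scrape_command(platform, script="main.py"):
    """Return the argument list for `python main.py -s PLATFORM_NAME`."""
    if platform not in PLATFORMS:
        raise ValueError(f"unsupported platform: {platform}")
    # sys.executable reuses the interpreter that runs this wrapper.
    return [sys.executable, script, "-s", platform]


def scrape_all(platforms=PLATFORMS, run=subprocess.run):
    """Run the scraper once per platform and stop at the first failure."""
    for platform in platforms:
        # check=True raises CalledProcessError on a non-zero exit code.
        run(build_scrape_command(platform), check=True)
```

Each child process inherits the terminal, so any prompts for platform credentials still appear as usual.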

4. Command-Line Interface

Secpar provides a command-line interface with the following options:

  • -c, --command: Specify the command (init for initialization or update for scraping).

  • -s, --scrap: Specify the platform to scrape data from (codeforces, cses, or vjudge).

  • --help: Display usage instructions and available options.

Example usage:

python main.py -c init

python main.py -s codeforces
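
For readers who want to see how options like these are typically declared, here is a minimal `argparse` sketch that mirrors the list above. Whether Secpar itself uses `argparse`, and how it validates the values, is an assumption; only the option names and accepted values come from this documentation.

```python
import argparse

PLATFORMS = ("codeforces", "cses", "vjudge")


def build_parser():
    """Declare the -c/--command and -s/--scrap options described above."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Scrape code submissions and store them in a GitHub repository.",
    )
    parser.add_argument(
        "-c", "--command",
        choices=("init", "update"),
        help="command to run: init for initialization, update for scraping",
    )
    parser.add_argument(
        "-s", "--scrap",
        choices=PLATFORMS,
        help="platform to scrape data from",
    )
    return parser


# Parse an explicit argument list instead of sys.argv, for demonstration.
args = build_parser().parse_args(["-s", "codeforces"])
print(args.command, args.scrap)
```

`argparse` adds `--help` automatically, and using `choices` makes it reject an unsupported platform name with a usage message.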

5. Scraper Configuration

Secpar can be configured in several ways:

  • GitHub Configuration: Set up your GitHub repository details and access token during initialization.

  • Platform Authentication: Authenticate with your platform credentials (e.g., Codeforces username and password) for scraping.

  • Customization: Customize the formatting of your README and configure scraper options in the code (e.g., maximum requests, submissions per update).

6. Customization

You can customize Secpar's behavior by modifying the source code. Here are some customization options:

  • Formatting: Customize the formatting of the generated README for each platform. You can modify the formatting in the corresponding Formatter class.

  • Configuration: Adjust Secpar's settings, such as maximum requests, submissions per update, or other platform-specific parameters.
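
To illustrate the kind of change a formatter customization involves, the sketch below defines a hypothetical formatter and a subclass that drops one column. The class names, method names, and the `Submission` fields are assumptions for illustration and do not reflect the interface of Secpar's real Formatter classes, so check the source before adapting it.

```python
from dataclasses import dataclass, field


@dataclass
class Submission:
    """Illustrative record with the details listed in the README."""
    problem_name: str
    problem_link: str
    language: str
    solution_link: str
    tags: list = field(default_factory=list)
    date: str = ""


class ReadmeFormatter:
    """Hypothetical base formatter that renders submissions as a Markdown table."""
    columns = ("Problem", "Language", "Solution", "Tags", "Date")

    def cells(self, sub):
        return [
            f"[{sub.problem_name}]({sub.problem_link})",
            sub.language,
            f"[Solution]({sub.solution_link})",
            ", ".join(sub.tags),
            sub.date,
        ]

    def format_row(self, sub):
        return "| " + " | ".join(self.cells(sub)) + " |"

    def format_table(self, subs):
        header = "| " + " | ".join(self.columns) + " |"
        separator = "| " + " | ".join("---" for _ in self.columns) + " |"
        return "\n".join([header, separator, *(self.format_row(s) for s in subs)])


class CompactFormatter(ReadmeFormatter):
    """Customized variant that omits the Tags column."""
    columns = ("Problem", "Language", "Solution", "Date")

    def cells(self, sub):
        full = super().cells(sub)
        return full[:3] + full[4:]  # skip the tags cell at index 3
```

Keeping the column list and the cell list side by side in one class means a customization only has to override those two members.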

7. Data Storage

Secpar stores code submissions and related information in your GitHub repository. Each submission is listed in the README with details such as problem name, language, solution link, tags, and submission date.

8. FAQs

  • What platforms does Secpar support?

    • Secpar currently supports Codeforces, CSES (University of Helsinki), and Vjudge.

  • Is it safe to store my GitHub access token?

    • Access tokens should be stored securely. Secpar stores them in a configuration file, and it's essential to protect this file.

  • How can I customize the README format?

    • You can customize the README format by modifying the corresponding Formatter class for each platform.
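
On the question of protecting a file that holds an access token, one common precaution is to restrict it to the owning user. The sketch below is a general technique, not a description of how Secpar writes its configuration: the function name, the file name, and the JSON layout are assumptions.

```python
import json
import os
import stat
from pathlib import Path


def write_private_config(path, data):
    """Write data as JSON to a file readable and writable by the owner only."""
    path = Path(path)
    # Create the file with mode 0o600 so it is never world-readable, even briefly.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(data, handle)
    # Tighten permissions in case the file already existed with a looser mode.
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    return path
```

On Windows, `os.chmod` can only toggle the read-only flag, so file permissions there need to be managed through the operating system's own access controls.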

9. Contributing

Contributions to Secpar are welcome! Feel free to fork the repository, make improvements, and create pull requests.

10. License

Secpar is released under the MIT License. See the LICENSE file for details.

11. Upcoming Features

  • More platforms: Support for platforms such as AtCoder and CodeChef is currently being worked on.

Note: This documentation provides an overview of Secpar's functionality and usage. For detailed code explanations, refer to the source code and comments in Secpar's repository.
