
Scrape problem submissions from different platforms


Scraping Engine for Competitive Programming Accelerated Retriever (Secpar)


Overview

Secpar is a Python command-line tool that scrapes code submissions from online programming platforms and stores them in a GitHub repository. It supports Codeforces, CSES (University of Helsinki), and Vjudge. This documentation describes the scraper's features and how to use them.

Demo Repo: CP-Submissions

Table of Contents

  1. Features

  2. Installation

  3. Usage

  4. Command-Line Interface

  5. Scraper Configuration

  6. Customization

  7. Data Storage

  8. FAQs

  9. Contributing

  10. License

  11. Upcoming Features

1. Features

  • Supported Platforms: Secpar supports the following programming platforms:

    • Codeforces

    • CSES (University of Helsinki)

    • Vjudge

  • GitHub Integration: Code submissions are stored in a GitHub repository, making it easy to manage and share your solutions.

  • Automatic README Generation: Secpar automatically generates a README file for your GitHub repository, listing all your code submissions with problem details, links, and tags.

  • Incremental Scraping: Secpar keeps track of previously scraped submissions, ensuring that only new submissions are added to your repository.

  • Multi-account Scraping: If you have several accounts on the same platform, you can scrape that platform once per account without creating duplicate entries.

  • Authentication: Securely authenticate with the supported platforms to access your submissions.

  • Customization: Customize the formatting of your README and configure other scraper options.
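
The incremental and multi-account behaviour described above can be pictured with a small sketch. This is not Secpar's actual code: the function name `select_new`, the dictionary keys `platform` and `id`, and the in-memory `seen` set are all assumptions made for illustration. The idea is only that a record of already-stored submissions lets a later run skip them.

```python
def select_new(submissions, seen):
    """Return only submissions not recorded in `seen`, and record them.

    `submissions` is a list of dicts with hypothetical keys "platform"
    and "id"; `seen` is a set of (platform, id) pairs from earlier runs.
    """
    fresh = []
    for sub in submissions:
        key = (sub["platform"], sub["id"])
        if key not in seen:
            seen.add(key)       # remember it so the next run skips it
            fresh.append(sub)
    return fresh
```

Keying on the platform as well as the submission ID means that the same ID on two platforms is not mistaken for a duplicate, while a submission fetched twice from one platform is added only once.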

2. Installation

Install the Secpar package on your local machine with pip:

pip install Secpar

3. Usage

Secpar has two primary modes of operation: initialization and scraping.

Initialization

Initialization is the first step to configure your scraper for a GitHub repository.

  1. Run the initialization command:

    secpar -c init
    
  2. Follow the prompts to enter your GitHub username, repository name, and access token.

Scraping

Scraping allows you to retrieve code submissions from supported platforms and store them in your GitHub repository.

  1. To scrape submissions, use the following command:

    secpar -s PLATFORM_NAME
    

    Replace PLATFORM_NAME with one of the supported platforms: codeforces, cses, or vjudge.

  2. Depending on the platform, you may need to provide additional information such as your platform password.

  3. Secpar will fetch new submissions and update your GitHub repository.

Note:

To upload submission code from Codeforces, you need to have Tor installed, and your torrc file must contain these two lines:

SocksPort 9050

ControlPort 9051

Open Tor, make sure you can browse through it, and keep it open until the scrape in your terminal has finished.
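
Before starting a Codeforces scrape, it can help to confirm that something is listening on the two ports from the torrc lines above. The following is a minimal sketch using only the Python standard library. The helper name `port_open` and the `127.0.0.1` host are assumptions. It only tests that the ports accept TCP connections, not that Tor itself is working correctly.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    # Port numbers taken from the torrc lines in the note above.
    for name, port in (("SocksPort", 9050), ("ControlPort", 9051)):
        state = "reachable" if port_open("127.0.0.1", port) else "not reachable"
        print(f"{name} {port}: {state}")
```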

4. Command-Line Interface

Secpar provides a command-line interface with the following options:

  • -c, --command: Specify the command (init for initialization or update for scraping).

  • -s, --scrape: Specify the platform to scrape data from (codeforces, cses, or vjudge).

  • -h, --help: Display usage instructions and available options.

Example usage:

secpar -c init

secpar -s codeforces
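
For readers curious how options like these are typically declared, here is a sketch using Python's standard `argparse` module. It mirrors the flags listed above but is an illustration only, not Secpar's actual parser. The function name `build_parser` and the help strings are assumptions.

```python
import argparse

PLATFORMS = ("codeforces", "cses", "vjudge")


def build_parser():
    """Build a parser with the -c and -s options described above."""
    parser = argparse.ArgumentParser(
        prog="secpar",
        description="Scrape code submissions into a GitHub repository.",
    )
    parser.add_argument("-c", "--command", choices=("init", "update"),
                        help="command to run")
    parser.add_argument("-s", "--scrape", choices=PLATFORMS,
                        help="platform to scrape")
    return parser  # argparse adds -h/--help automatically


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Declaring the platform names as `choices` makes the parser reject a misspelled platform with a usage message instead of failing later.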

5. Scraper Configuration

Secpar can be configured in several ways:

  • GitHub Configuration: Set up your GitHub repository details and access token during initialization.

  • Platform Authentication: Authenticate with your platform credentials (e.g., Codeforces username and password) for scraping.

  • Customization: Customize the formatting of your README and configure scraper options in the code (e.g., maximum requests, submissions per update).

6. Customization

You can customize Secpar's behavior by modifying the source code. Here are some customization options:

  • Formatting: Customize the formatting of the generated README for each platform. You can modify the formatting in the corresponding Formatter class.

  • Configuration: Adjust Secpar's settings, such as maximum requests, submissions per update, or other platform-specific parameters.

7. Data Storage

Secpar stores code submissions and related information in your GitHub repository. Each submission is listed in the README with details such as problem name, language, solution link, tags, and submission date.
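
To make the README listing concrete, the sketch below renders one submission as a Markdown table row using the fields named above. It is a hypothetical example: the `Submission` and `TableFormatter` names, the column order, and the sample values are assumptions and do not reproduce Secpar's real Formatter classes.

```python
from dataclasses import dataclass


@dataclass
class Submission:
    """Hypothetical record holding the details listed in the README."""
    problem: str
    language: str
    solution_url: str
    tags: list
    date: str


class TableFormatter:
    """Illustrative formatter that emits Markdown table rows."""

    header = ("| Problem | Language | Solution | Tags | Date |\n"
              "|---|---|---|---|---|")

    def row(self, sub):
        tags = ", ".join(sub.tags)
        return (f"| {sub.problem} | {sub.language} | "
                f"[code]({sub.solution_url}) | {tags} | {sub.date} |")


if __name__ == "__main__":
    sample = Submission("Two Sets", "C++",
                        "https://example.com/two_sets.cpp",
                        ["math", "greedy"], "2023-10-01")
    fmt = TableFormatter()
    print(fmt.header)
    print(fmt.row(sample))
```

Changing the README layout in a design like this would mean editing `header` and `row` together so the columns stay aligned.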

8. FAQs

  • What platforms does Secpar support?

    • Secpar currently supports Codeforces, CSES (University of Helsinki), and Vjudge.

  • Is it safe to store my GitHub access token?

    • Access tokens should be stored securely. Secpar stores them in a configuration file, and it's essential to protect this file.

  • How can I customize the README format?

    • You can customize the README format by modifying the corresponding Formatter class for each platform.
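
On the access-token question above, one common way to protect a configuration file on Linux or macOS is to make it readable and writable by its owner only. The sketch below shows that step with the standard library. The helper name `restrict_to_owner` is an assumption, and the location of Secpar's configuration file is not given in this documentation, so the path must be supplied by you. On Windows, `os.chmod` supports only a read-only flag, so this approach does not apply there.

```python
import os
import stat


def restrict_to_owner(path):
    """Set owner-only read/write permissions and return the new mode."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)  # equivalent to chmod 600
    return stat.S_IMODE(os.stat(path).st_mode)
```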

9. Contributing

Contributions to Secpar are welcome! Feel free to fork the repository, make improvements, and create pull requests.

10. License

Secpar is released under the MIT License. See the LICENSE file for details.

11. Upcoming Features

  • More platforms: Support for platforms such as AtCoder and CodeChef is currently being worked on.

Note: This documentation provides an overview of Secpar's functionality and usage. For detailed code explanations, refer to the source code and comments in Secpar's repository.

