A command-line tool for deploying Scrapy spiders to APCloudy

PAB - APCloudy Deployment Tool

PAB is a command-line tool for deploying Scrapy spiders to APCloudy, much as shub does for Scrapinghub. It packages your Scrapy project, uploads it, and lets you manage projects and spiders on the APCloudy platform.

📖 Documentation

For comprehensive documentation, visit: https://pab-cli.readthedocs.io/

Features

  • 🚀 Easy deployment of Scrapy spiders to APCloudy
  • 🔐 Secure authentication and credential management
  • 📦 Automatic project packaging and upload
  • 📋 Project and spider management
  • 🔄 Real-time deployment status tracking
  • 🌟 Cross-platform support (Windows, macOS, Linux)

Installation

You can install PAB using pip:

pip install pab-cli

Or install from source:

git clone https://github.com/fawadss1/pab-cli.git
cd pab-cli
pip install -e .

Quick Start

1. Log in to APCloudy

pab login

This prompts you for your APCloudy API key and saves it securely.

2. List Available Projects

pab projects

This lists all available projects with their IDs. You need a project ID for the deploy command.

3. Deploy a Spider

Navigate to your Scrapy project directory and run:

pab deploy <project-id>

For example:

pab deploy 5465

PAB automatically packages your project and deploys it to the specified project on APCloudy.

You can also set a version tag and deploy from a different directory:

pab deploy 5465 --version v0.2.4 --target /path/to/project

Commands

Authentication

  • pab login - Log in to APCloudy with an API key
  • pab logout - Log out of APCloudy
  • pab status - Show the current authentication status

Deployment

  • pab deploy <project-id> - Deploy the current project to the specified APCloudy project
  • pab deploy <project-id> --version <version> - Deploy with a specific version tag
  • pab deploy <project-id> --target <path> - Deploy from a specific directory

Project Management

  • pab projects - List all available projects
  • pab spiders <project-id> - List spiders in a project

Configuration

PAB stores configuration in:

  • Windows: %APPDATA%\pab\pab_config.json
  • macOS/Linux: ~/.pab/pab_config.json
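
For scripts that need to locate this file, the two locations above can be resolved roughly as follows. This is a minimal sketch, not PAB's own code: the helper name `pab_config_path` is hypothetical, and the fallback used when `APPDATA` is unset is an assumption.

```python
import os
import sys
from pathlib import Path


def pab_config_path(platform=sys.platform, env=None):
    """Return the documented location of pab_config.json (hypothetical helper)."""
    env = os.environ if env is None else env
    if platform.startswith("win"):
        # Windows: %APPDATA%\pab\pab_config.json
        # Assumption: fall back to the home directory if APPDATA is unset.
        base = Path(env.get("APPDATA", str(Path.home())))
        return base / "pab" / "pab_config.json"
    # macOS/Linux: ~/.pab/pab_config.json
    return Path.home() / ".pab" / "pab_config.json"


if __name__ == "__main__":
    print(pab_config_path())
```

The `platform` and `env` parameters exist only so the function can be exercised without depending on the machine it runs on.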

Examples

Basic Usage

# Log in to APCloudy
pab login

# List available projects to get project IDs
pab projects

# Deploy to project ID 5465
pab deploy 5465

# Check authentication status
pab status

# List spiders in a project
pab spiders 5465

Advanced Usage

# Deploy with specific version
pab deploy 5465 --version production-2024

# Deploy from different directory
pab deploy 5465 --target /path/to/project

# Deploy with custom version and target
pab deploy 5465 --version v1.2.3 --target /my/scrapy/project
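
When deployments are driven from a script (for example in CI), the same command lines can be assembled programmatically. The sketch below only builds the argument list from the options documented above; `build_deploy_command` is a hypothetical helper, not part of PAB, and how `pab` reports failure through its exit code is not covered here.

```python
import shlex


def build_deploy_command(project_id, version=None, target=None):
    """Build the argv list for `pab deploy` (hypothetical helper)."""
    cmd = ["pab", "deploy", str(project_id)]
    if version is not None:
        cmd += ["--version", version]
    if target is not None:
        cmd += ["--target", target]
    return cmd


if __name__ == "__main__":
    cmd = build_deploy_command(5465, version="v1.2.3", target="/my/scrapy/project")
    # Print a shell-safe rendering of the command.
    print(" ".join(shlex.quote(part) for part in cmd))
    # To execute it, pass the list to subprocess.run(cmd).
```

Passing the command as a list rather than a single string avoids quoting problems when the target path contains spaces.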

API Endpoints

PAB communicates with APCloudy using the following API endpoints:

  • POST /api/cli/auth/authenticate - API key authentication
  • POST /api/cli/auth/refresh - Token refresh
  • GET /api/cli/projects - List projects
  • POST /api/cli/projects/{id}/deploy - Deploy spider
  • GET /api/cli/projects/{id}/spiders - List spiders
  • GET /api/cli/deployments/{id}/status - Deployment status
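
To make the `{id}` placeholders concrete, the sketch below expands the paths listed above into full URLs. The host `https://apcloudy.example` is a placeholder, since the real base URL is not given here, and the `ENDPOINTS` table and `endpoint` helper are illustrative names rather than PAB internals.

```python
from urllib.parse import urljoin

# Placeholder host: the real APCloudy base URL is not given in this document.
BASE_URL = "https://apcloudy.example"

# HTTP methods and path templates copied from the endpoint list above.
ENDPOINTS = {
    "authenticate": ("POST", "/api/cli/auth/authenticate"),
    "refresh": ("POST", "/api/cli/auth/refresh"),
    "list_projects": ("GET", "/api/cli/projects"),
    "deploy": ("POST", "/api/cli/projects/{id}/deploy"),
    "list_spiders": ("GET", "/api/cli/projects/{id}/spiders"),
    "deployment_status": ("GET", "/api/cli/deployments/{id}/status"),
}


def endpoint(name, base_url=BASE_URL, **params):
    """Return (method, url) for a named endpoint, filling in path parameters."""
    method, template = ENDPOINTS[name]
    return method, urljoin(base_url, template.format(**params))


if __name__ == "__main__":
    print(endpoint("list_spiders", id=5465))
```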

Requirements

  • Python 3.7+
  • Scrapy 2.0+
  • Valid APCloudy account and API key
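
A quick way to check the first two requirements locally is sketched below. The helper name `check_requirements` is hypothetical, and the check only confirms that Scrapy is importable, not that it is version 2.0 or later.

```python
import importlib.util
import sys


def check_requirements():
    """Report whether the local environment meets the listed requirements."""
    return {
        "python>=3.7": sys.version_info >= (3, 7),
        # Only checks that Scrapy can be found, not which version it is.
        "scrapy_installed": importlib.util.find_spec("scrapy") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```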

Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Support

For support, please contact:

Changelog

v0.2.0 (2025-11-10)

  • Pre-deployment Validation: Added comprehensive spider validation before deployment
    • Syntax error detection using Python's compile() function
    • Undefined variable detection with AST-based static analysis
    • Import error validation to catch missing modules and import failures
    • Spider structure validation for Scrapy-specific requirements
  • Smart Validation: Script files in the scripts/ directory are skipped during import validation to avoid false positives from API calls
  • Package Validation: Added package size validation to prevent uploading empty (0-byte) packages
  • Enhanced Error Reporting: Improved error messages with file paths and line numbers for easier debugging
  • Code Optimization: Removed unused imports and variables for a cleaner codebase
  • Builtin Support: Fixed validator to correctly recognize Python builtins (len, str, int, Exception, etc.)
  • Token Refresh Handling: The HTTP client now reads file content into memory so that requests can be retried reliably after a token refresh
  • Debug Logging: Added detailed logging during package creation for better troubleshooting
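
To illustrate the kind of pre-deployment checks listed above, here is a minimal sketch of syntax checking with `compile()` and a deliberately naive, scope-unaware undefined-name check built on `ast`. It is not PAB's actual validator: the function names are hypothetical, and a real checker would need proper scope handling.

```python
import ast
import builtins


def check_syntax(source, filename="<spider>"):
    """Return None if the source compiles, else a 'file:line: message' string."""
    try:
        compile(source, filename, "exec")
    except SyntaxError as exc:
        return f"{exc.filename}:{exc.lineno}: {exc.msg}"
    return None


def undefined_names(source):
    """Names that are read but never bound anywhere in the module (no scoping)."""
    tree = ast.parse(source)
    defined = set(dir(builtins))  # len, str, int, Exception, ...
    loaded = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Load):
                loaded.append((node.id, node.lineno))
            else:
                defined.add(node.id)  # assignment or deletion target
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            defined.add(node.name)
        elif isinstance(node, ast.arg):
            defined.add(node.arg)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                defined.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, ast.ExceptHandler) and node.name:
            defined.add(node.name)
    return sorted({(name, line) for name, line in loaded if name not in defined})


if __name__ == "__main__":
    spider = (
        "import scrapy\n"
        "\n"
        "class S(scrapy.Spider):\n"
        "    name = 's'\n"
        "    def parse(self, response):\n"
        "        return len(respnse.text)\n"
    )
    print(check_syntax(spider))
    print(undefined_names(spider))
```

Seeding the set of known names with `dir(builtins)` is what keeps names such as `len` and `Exception` from being reported.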

v0.1.0 (2025-08-06)

  • Initial release
  • Basic authentication and deployment functionality
  • Project and spider management
  • Cross-platform support

Made with ❤️ by Fawad Ali for AskPablos
