A utility that simplifies the download and generation of Matrix Market (`.mtx`) files.

MtxMan 🔢

This is a utility that simplifies the download and generation of Matrix Market (.mtx) files.

Requirements

  • gcc: to build dependencies:
    • distributed_mmio to convert matrices to BMTX format.
    • Graph500 and PaRMAT generators.
  • python3/pip/pipx to download and set up MtxMan.

Setup

First, set up your Python environment (if needed).

Virtual Environment Setup

# If you don't already have one, create and activate a venv
python3 -m venv .venv
source .venv/bin/activate

pip install pipx
pipx ensurepath

You may need to restart your terminal for the changes to take effect.

MtxMan Installation

Once pipx is installed, you can install MtxMan from PyPI:

# Install MtxMan CLI
pipx install mtxman

Great, you now have MtxMan installed! You can check out the available commands by running:

mtxman --help

Developer Installation

# Clone the repository
git clone git@github.com:ThomasPasquali/MtxMan.git
cd MtxMan
# Install the project in editable mode
pip install -e .

Now the mtxman command should use the local version of the package.
Any changes you make to the code will be reflected immediately when you run the command.

Usage: matrices download/generation

Once MtxMan is available on your system:

  1. Create your own YAML configuration file (check out the example below for the syntax)
  2. Run the following command:
mtxman sync <your_config_file>.yaml

By default this command will download/generate all the configured matrices.

For more details, run mtxman sync --help.

Example Configuration File

# This is the base folder for storing the Matrix Market files
path: ./datasets

# This is an example subfolder/category of matrices
matrices_category_1:

  # Generators configuration
  generators:
    # Graph500 Kronecker
    graph500:
      # This will generate two graphs:
      # 1) Scale 4, Edge-factor 5
      # 2) Scale 6, Edge-factor 10
      scale:
        - 4
        - 6
      edge_factor:
        - 5
        - 10

    # PaRMAT generator
    parmat:
    # Parameters:
    # N - Number of vertices
    # M - Number of edges
    # a,b,c - RMAT probabilities. "d" will be deduced automatically. (default: a,b,c=0.25)
    # noDuplicateEdges, undirected, noEdgeToSelf, sorted - Flags. Set a flag to 1 to enable it. (default: 0)
      defaults: # This is optional
        N: 32
        a: 0.25
        b: 0.25
        c: 0.25
        undirected: 1
        noDuplicateEdges: 1
      matrices: # Specify the list of matrices. Default parameters can be overridden
        - { M: 64 }
        - { M: 128 }
        - { N: 64, M: 64, a: 0.7, b: 0.1, c: 0.1, noEdgeToSelf: 1 } # Overriding defaults

  # List of matrices to be downloaded from SuiteSparse
  # Format: "<group>/<matrix_name>"
  suite_sparse_matrix_list:
    - HB/ash219
    - HB/arc130
    - Averous/epb0
  
  # This allows downloading matrices based on their metadata
  # Internally, these options will be passed to the `ssgetpy` package
  suite_sparse_matrix_range:
    min_nnzs: 100
    max_nnzs: 1000
    limit: 4

  # Configuration for downloading files directly from publicly available URLs
  # Supported archive types: `zip`, `tar`, `tar.gz` (`tgz`)
  # `filename` is REQUIRED. Be sure to include the file extension (.mtx or .bmtx)
  # `rename` is optional. If set, the matrix and containing folder will be renamed
  direct_urls:
    - url: https://suitesparse-collection-website.herokuapp.com/MM/HB/1138_bus.tar.gz
      filename: 1138_bus.mtx
      rename: renamed_1138_bus.mtx

    - url: https://suitesparse-collection-website.herokuapp.com/MM/HB/1138_bus.tar.gz
      filename: 1138_bus.mtx

# This is ANOTHER example subfolder/category of matrices
# The configuration structure is as above
# Keys 'generators', 'suite_sparse_matrix_list' and 'suite_sparse_matrix_range' are OPTIONAL
matrices_category_2:
  suite_sparse_matrix_list:
    - Simon/olafu

matrices_category_3:
  generators:
    graph500:
      # This will generate three graphs:
      # 1) Scale 6, Edge-factor 5
      # 2) Scale 8, Edge-factor 5
      # 3) Scale 9, Edge-factor 5
      edge_factor: 5
      scale:
        - 6
        - 8
        - 9
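
The comments in the example configuration describe how generator parameters are combined. The following Python sketch illustrates that behaviour; it is not MtxMan's actual code. The function names `expand_graph500` and `expand_parmat` are hypothetical. Rejecting `scale` and `edge_factor` lists of different lengths is an assumption, and so is computing "d" as the probability mass left over after a, b and c.

```python
def expand_graph500(cfg):
    """Pair `scale` and `edge_factor` values as the config comments describe."""
    scales = cfg["scale"]
    factors = cfg["edge_factor"]
    # A scalar value is applied to every entry of the other list.
    if not isinstance(scales, list):
        scales = [scales] * (len(factors) if isinstance(factors, list) else 1)
    if not isinstance(factors, list):
        factors = [factors] * len(scales)
    # Assumption: two lists of different lengths are treated as an error.
    if len(scales) != len(factors):
        raise ValueError("scale and edge_factor lists must have the same length")
    return list(zip(scales, factors))


def expand_parmat(cfg):
    """Merge the optional `defaults` into each entry of `matrices`."""
    defaults = cfg.get("defaults", {})
    expanded = []
    for entry in cfg["matrices"]:
        params = {**defaults, **entry}  # per-matrix values override defaults
        a, b, c = (params.get(key, 0.25) for key in ("a", "b", "c"))
        # Assumption: "d" is the remaining probability, d = 1 - a - b - c.
        params["d"] = round(1.0 - a - b - c, 10)
        expanded.append(params)
    return expanded


if __name__ == "__main__":
    print(expand_graph500({"scale": [4, 6], "edge_factor": [5, 10]}))
    print(expand_graph500({"edge_factor": 5, "scale": [6, 8, 9]}))
    parmat = {
        "defaults": {"N": 32, "a": 0.25, "b": 0.25, "c": 0.25,
                     "undirected": 1, "noDuplicateEdges": 1},
        "matrices": [{"M": 64}, {"N": 64, "M": 64, "a": 0.7, "b": 0.1, "c": 0.1}],
    }
    for params in expand_parmat(parmat):
        print(params)
```

With the values from `matrices_category_1` and `matrices_category_3`, the Graph500 helper yields the two and three (scale, edge-factor) pairs listed in the comments.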

Files Structure

The downloaded/generated files are structured as follows:

<config.path>
├── <category_0>
│   ├── <SuiteSparse_group_0> # Matrices from SuiteSparse "list"
│   │   └── <matrix_0>
│   │       └── <matrix_0>.mtx
│   ├── <SuiteSparse_group_1>
│   │   ├── <matrix_0>
│   │   │   └── <matrix_0>.mtx
│   │   └── <matrix_1>
│   │       └── <matrix_1>.mtx
│   ...
│   │
│   ├── Graph500
│   │   ├── graph500_<scale_0>_<edge_factor0>
│   │   ├── graph500_<scale_1>_<edge_factor1>
│   │   ...
│   │
│   ├── PaRMAT
│   │   ├── parmat_N<N_0>_M<M_0>_<other parmat parameters 0>
│   │   ├── parmat_N<N_1>_M<M_1>_<other parmat parameters 1>
│   │   ...
│   │
│   ├── SuiteSparse_<min_nnz>_<max_nnz>_<limit> # Matrices from SuiteSparse "range"
│   │   ├── <SuiteSparse_group_0> # Matrices from SuiteSparse "list"
│   │   │   └── <matrix_0>
│   │   │       └── <matrix_0>.mtx
│   │   ...
│   ├── matrices_list.txt     # Summary file, contains <category_0> matrices paths
│   ├── matrices_list_mtx.txt # This file will be generated only if running the sync command with `-bmtx -kmtx`.
│   │                         # It will contain paths to .mtx files
│   └── matrices_metadata.csv # Summary file, contains <category_0> matrices metadata (if available)
├── <category_1>
│   │
│   ... # Same structure
...
├── matrices_list.txt     # Summary file, contains all matrices paths
├── matrices_list_mtx.txt # Same as the category-specific file
└── matrices_metadata.csv # Summary file, contains all matrices metadata (number of rows, columns, non-zeros etc.)
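
The layout above maps directly onto paths. The following Python sketch builds the expected location of a SuiteSparse "list" matrix and of a Graph500 entry. Both helper names are hypothetical and not part of MtxMan. The paths follow only what the tree shows, so what a Graph500 entry contains is left open.

```python
from pathlib import Path


def suite_sparse_mtx_path(base, category, entry):
    """Expected .mtx path for a "<group>/<matrix_name>" list entry."""
    group, name = entry.split("/", 1)
    # Layout from the tree: <config.path>/<category>/<group>/<matrix>/<matrix>.mtx
    return Path(base) / category / group / name / f"{name}.mtx"


def graph500_entry_path(base, category, scale, edge_factor):
    """Expected Graph500 entry: <category>/Graph500/graph500_<scale>_<edge_factor>."""
    return Path(base) / category / "Graph500" / f"graph500_{scale}_{edge_factor}"


if __name__ == "__main__":
    print(suite_sparse_mtx_path("./datasets", "matrices_category_1", "HB/ash219"))
    print(graph500_entry_path("./datasets", "matrices_category_3", 6, 5))
```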

Optimize Required Disk Space and Read Time

To optimize space requirements, run the sync command as follows:

mtxman sync <your_config_file>.yaml --binary-mtx

This converts the .mtx files to .bmtx, saving 50 to 80% of disk space.
Reading .bmtx files is handled by https://github.com/HicrestLaboratory/distributed_mmio. Check it out!
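
To run the sync from a script, the two invocations shown in this README (plain `mtxman sync` and `mtxman sync ... --binary-mtx`) can be wrapped in a small helper. This is a minimal sketch: `build_sync_command` and `run_sync` are hypothetical names, and it uses only the command and flag documented above.

```python
import shutil
import subprocess


def build_sync_command(config_file, binary_mtx=False):
    """Return the argument list for `mtxman sync`, optionally with --binary-mtx."""
    cmd = ["mtxman", "sync", str(config_file)]
    if binary_mtx:
        cmd.append("--binary-mtx")
    return cmd


def run_sync(config_file, binary_mtx=False):
    """Run the sync command, failing early if the CLI is not installed."""
    if shutil.which("mtxman") is None:
        raise RuntimeError("mtxman not found on PATH; install it with `pipx install mtxman`")
    # check=True raises CalledProcessError if mtxman exits with a non-zero status.
    return subprocess.run(build_sync_command(config_file, binary_mtx), check=True)


if __name__ == "__main__":
    # Only prints the command; call run_sync(...) to actually execute it.
    print(" ".join(build_sync_command("config.yaml", binary_mtx=True)))
```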

Download files

Source Distribution

mtxman-0.0.7.tar.gz (19.4 kB view details)

Uploaded Source

Built Distribution

mtxman-0.0.7-py3-none-any.whl (20.3 kB view details)

Uploaded Python 3

File details

Details for the file mtxman-0.0.7.tar.gz.

File metadata

  • Download URL: mtxman-0.0.7.tar.gz
  • Upload date:
  • Size: 19.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for mtxman-0.0.7.tar.gz
Algorithm    Hash digest
SHA256       e7b7e0a62e8ca41b7a2075e7c94e04848d93f6c9befdfc6f40c831941f46e553
MD5          b6504bc3503b0d990ce524856f2b453d
BLAKE2b-256  69bb21166ecad5f7b35ce797ed249f2e06fd9a580ebb6f205472708d08d1989b

Provenance

The following attestation bundles were made for mtxman-0.0.7.tar.gz:

Publisher: pypi-publish.yml on ThomasPasquali/MtxMan

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file mtxman-0.0.7-py3-none-any.whl.

File metadata

  • Download URL: mtxman-0.0.7-py3-none-any.whl
  • Upload date:
  • Size: 20.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for mtxman-0.0.7-py3-none-any.whl
Algorithm    Hash digest
SHA256       e4c6eb62476c6fe2de9168b4fcc090b5a7641c51aa60de4bd5df2f134d7bfbe6
MD5          334fb71c88aa20d4fdd80838ab83b7dd
BLAKE2b-256  8f7a2f886c5f9d98948257613f5387a91f96ce5594f026fa963484597e4d4e75

Provenance

The following attestation bundles were made for mtxman-0.0.7-py3-none-any.whl:

Publisher: pypi-publish.yml on ThomasPasquali/MtxMan

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
