MVANet

This is a fork of the original MVANet, with bug fixes and packaging improvements.

Installation

pip install git+https://github.com/creative-graphic-design/MVANet

Usage

from PIL import Image
from mvanet.predictor import MVANetPredictor

test_image = Image.open("/path/to/image.png")

predictor = MVANetPredictor()

# Predict the RGBA image
predicted_image = predictor(test_image, output_type="rgba")
predicted_image.save("rgba.png")

# Predict the mask image
predicted_mask = predictor(test_image, output_type="mask")
predicted_mask.save("mask.png")

The official repo of the CVPR 2024 paper (Highlight), Multi-view Aggregation Network for Dichotomous Image Segmentation

Introduction

Dichotomous Image Segmentation (DIS) has recently emerged towards high-precision object segmentation from high-resolution natural images. When designing an effective DIS model, the main challenge is how to balance the semantic dispersion of high-resolution targets in the small receptive field and the loss of high-precision details in the large receptive field. Existing methods rely on tedious multiple encoder-decoder streams and stages to gradually complete the global localization and local refinement.

The human visual system captures regions of interest by observing them from multiple views. Inspired by this, we model DIS as a multi-view object perception problem and provide a parsimonious multi-view aggregation network (MVANet), which unifies the feature fusion of the distant view and close-up view into a single stream with one encoder-decoder structure. Specifically, we split the high-resolution input images from the original view into distant view images with global information and close-up view images with local details. Together, they constitute a set of complementary multi-view low-resolution input patches.
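As a rough illustration of the split described above, the following toy sketch builds one distant view and a set of close-up views from a single image. It is not the repository's implementation: the nested-list image, the 2x2 patch grid, the nearest-neighbour downsampling, and the function names are all assumptions chosen to keep the example dependency-free.

```python
from typing import List

# Toy grayscale image: a list of rows of pixel values (assumption for illustration).
ImageGrid = List[List[float]]


def distant_view(image: ImageGrid, factor: int = 2) -> ImageGrid:
    """Downsample the whole image by keeping every `factor`-th pixel."""
    return [row[::factor] for row in image[::factor]]


def close_up_views(image: ImageGrid, grid: int = 2) -> List[ImageGrid]:
    """Cut the image into grid x grid non-overlapping patches, row-major order."""
    height, width = len(image), len(image[0])
    ph, pw = height // grid, width // grid
    return [
        [row[c * pw:(c + 1) * pw] for row in image[r * ph:(r + 1) * ph]]
        for r in range(grid)
        for c in range(grid)
    ]


def multi_view_patches(image: ImageGrid, grid: int = 2) -> List[ImageGrid]:
    """Return the distant view followed by the close-up patches.

    All returned views share the same low resolution, assuming the image
    height and width are divisible by `grid`.
    """
    return [distant_view(image, factor=grid)] + close_up_views(image, grid=grid)


# A 4x4 toy image whose pixel value encodes its position.
toy = [[float(4 * y + x) for x in range(4)] for y in range(4)]
views = multi_view_patches(toy)
print(len(views), "views of size", len(views[0]), "x", len(views[0][0]))
```

The distant view keeps global context at reduced detail, while each close-up patch keeps full detail for one region; stacking them gives the network inputs of a single, uniform size.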

image

Moreover, two efficient transformer-based multi-view complementary localization and refinement modules (MCLM & MCRM) are proposed to jointly capture the localization and restore the boundary details of the targets.

image

We achieve state-of-the-art performance in terms of almost all metrics on the DIS benchmark dataset.

image

We have since optimized the code and improved inference speed, reaching 15.2 FPS.
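For readers who want to measure throughput on their own hardware, here is a minimal timing sketch. It is not the benchmark protocol behind the figure above; the helper name `measure_fps`, the warm-up count, and the stand-in workload are assumptions.

```python
import time
from typing import Callable


def measure_fps(run_once: Callable[[], object], n_runs: int = 20, warmup: int = 3) -> float:
    """Average calls per second of `run_once` over `n_runs` timed calls."""
    for _ in range(warmup):
        run_once()  # untimed calls, so one-off setup cost is excluded
    start = time.perf_counter()
    for _ in range(n_runs):
        run_once()
    elapsed = time.perf_counter() - start
    # Guard against a zero reading for extremely cheap workloads.
    return n_runs / elapsed if elapsed > 0 else float("inf")


# Stand-in workload; swap in a real inference call, for example
# `lambda: predictor(test_image, output_type="mask")`.
fps = measure_fps(lambda: sum(range(10_000)))
print(f"{fps:.1f} FPS")
```

When the model runs on a GPU, the work is queued asynchronously, so a fair measurement would also need to wait for the device to finish (for example with `torch.cuda.synchronize()`) before reading the clock.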

image

Here are some of our visual results:

image

I. Requirements

  • python==3.7
  • torch==1.10.0
  • torchvision==0.11.0
  • mmcv-full==1.3.17
  • mmdet==2.17.0
  • mmengine==0.8.1
  • mmsegmentation==0.19.0
  • numpy
  • ttach
  • einops
  • timm
  • scipy
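To confirm that an environment matches the pins listed above, a small check along these lines may help. It is a sketch, not part of the repository: the helper names are hypothetical, and it relies on `importlib.metadata`, which is in the standard library only from Python 3.8 (on Python 3.7 the `importlib_metadata` backport offers the same interface).

```python
from importlib import metadata
from typing import Callable, Dict, Optional, Tuple

# Pins copied from the requirements list; unpinned packages are omitted.
PINNED = {
    "torch": "1.10.0",
    "torchvision": "0.11.0",
    "mmcv-full": "1.3.17",
    "mmdet": "2.17.0",
    "mmengine": "0.8.1",
    "mmsegmentation": "0.19.0",
}


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(
    pins: Dict[str, str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each package that is missing or off-pin to (expected, found)."""
    mismatches = {}
    for name, expected in pins.items():
        found = lookup(name)
        # Ignore local build tags such as "+cu113" when comparing.
        if found is None or found.split("+")[0] != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in find_mismatches(PINNED).items():
        print(f"{name}: expected {expected}, found {found}")
```

An empty report means every pinned package is present at the expected version.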

II. Training

  1. Download the pretrained model at Google Drive.
  2. Then, you can start training by simply running:
python train.py

III. Testing

  1. Update the data path in the config file ./utils/config.py (lines 4-8)

  2. Replace the existing path with the path to your saved model in ./predict.py (line 14)

    You can also download our trained model at Google Drive.

  3. Start predicting by:

python predict.py

  4. Change the predicted map path in ./test.py (line 17) and start testing:

python test.py

You can get our prediction maps at Google Drive.
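If you want a quick sanity check on downloaded or freshly generated prediction maps, mean absolute error (MAE) against the ground-truth mask is one of the metrics commonly reported on DIS benchmarks. The sketch below is illustrative only: it does not reproduce what ./test.py computes, and the nested-list maps with values in [0, 1] are an assumption made to avoid extra dependencies.

```python
from typing import List

Map2D = List[List[float]]  # rows of values in [0, 1] (assumed normalisation)


def mean_absolute_error(pred: Map2D, gt: Map2D) -> float:
    """Average per-pixel absolute difference between two same-sized maps."""
    if len(pred) != len(gt) or any(len(p) != len(g) for p, g in zip(pred, gt)):
        raise ValueError("prediction and ground truth must have the same shape")
    total = sum(
        abs(p - g)
        for pred_row, gt_row in zip(pred, gt)
        for p, g in zip(pred_row, gt_row)
    )
    count = sum(len(row) for row in gt)
    return total / count


# Tiny hand-made example: three pixels are slightly off, one is exact.
pred = [[0.9, 0.1], [0.8, 0.0]]
gt = [[1.0, 0.0], [1.0, 0.0]]
mae = mean_absolute_error(pred, gt)
print(f"MAE = {mae:.3f}")
```

Lower is better; a value of 0 means the prediction matches the ground truth exactly.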

To Do List

  • Release our camera-ready paper on arXiv (done)
  • Release our training code (done)
  • Release our model checkpoints (done)
  • Release our prediction maps (done)

Citations

@article{yu2024multi,
  title={Multi-view Aggregation Network for Dichotomous Image Segmentation},
  author={Yu, Qian and Zhao, Xiaoqi and Pang, Youwei and Zhang, Lihe and Lu, Huchuan},
  journal={arXiv preprint arXiv:2404.07445},
  year={2024}
}
