Visualizers of the PET dataset goldstandard data and baselines prediction
Project description
PET Dataset Goldstandard and Baselines predictions visualizers
This package provides two graphical interfaces to visualize the PET dataset goldstandard annotations and the baselines predictions.
created by Patrizio Bellan
The PET Dataset
Abstract. Process extraction from text is an important task of process discovery, for which various approaches have been developed in recent years. However, in contrast to other information extraction tasks, there is a lack of gold-standard corpora of business process descriptions that are carefully annotated with all the entities and relationships of interest. Due to this, it is currently hard to compare the results obtained by extraction approaches in an objective manner, whereas the lack of annotated texts also prevents the application of data-driven information extraction methodologies, typical of the natural language processing field. Therefore, to bridge this gap, we present the PET dataset, a first corpus of business process descriptions annotated with activities, gateways, actors, and flow information. We present our new resource, including a variety of baselines to benchmark the difficulty and challenges of business process extraction from text. PET can be accessed via huggingface.co/datasets/patriziobellan/PET
PET Dataset Data
Baseline Predictor Agreement Interface
This visualizer shows the agreement between the baseline predictors and the goldstandard annotations.
from Visualizers.BaselinePredictorAgreementInterface import Show as agreements
agreements()
Baselines Comparison LineView
This visualizer provides a GUI to compare the goldstandard process element annotations and the Baseline 1 ones.
from Visualizers.BaselinesComparisonLineView import Show as lineview
lineview()
Cite the PET dataset
PET Dataset
@inproceedings{DBLP:conf/bpm/BellanADGP22,
author = {Patrizio Bellan and
Han van der Aa and
Mauro Dragoni and
Chiara Ghidini and
Simone Paolo Ponzetto},
editor = {Cristina Cabanillas and
Niels Frederik Garmann{-}Johnsen and
Agnes Koschmider},
title = {{PET:} An Annotated Dataset for Process Extraction from Natural Language
Text Tasks},
booktitle = {Business Process Management Workshops - {BPM} 2022 International Workshops,
M{\"{u}}nster, Germany, September 11-16, 2022, Revised Selected
Papers},
series = {Lecture Notes in Business Information Processing},
volume = {460},
pages = {315--321},
publisher = {Springer},
year = {2022},
url = {https://doi.org/10.1007/978-3-031-25383-6\_23},
doi = {10.1007/978-3-031-25383-6\_23},
timestamp = {Tue, 14 Feb 2023 09:47:10 +0100},
biburl = {https://dblp.org/rec/conf/bpm/BellanADGP22.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
Guidelines
@inproceedings{DBLP:conf/aiia/BellanGDPA22,
author = {Patrizio Bellan and
Chiara Ghidini and
Mauro Dragoni and
Simone Paolo Ponzetto and
Han van der Aa},
editor = {Debora Nozza and
Lucia C. Passaro and
Marco Polignano},
title = {Process Extraction from Natural Language Text: the {PET} Dataset and
Annotation Guidelines},
booktitle = {Proceedings of the Sixth Workshop on Natural Language for Artificial
Intelligence {(NL4AI} 2022) co-located with 21th International Conference
of the Italian Association for Artificial Intelligence (AI*IA 2022),
Udine, November 30th, 2022},
series = {{CEUR} Workshop Proceedings},
volume = {3287},
pages = {177--191},
publisher = {CEUR-WS.org},
year = {2022},
url = {https://ceur-ws.org/Vol-3287/paper18.pdf},
timestamp = {Fri, 10 Mar 2023 16:23:01 +0100},
biburl = {https://dblp.org/rec/conf/aiia/BellanGDPA22.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
File details
Details for the file PETGoldstandardBaselinesVisualizers-1.0.2.tar.gz
.
File metadata
- Download URL: PETGoldstandardBaselinesVisualizers-1.0.2.tar.gz
- Upload date:
- Size: 20.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.0
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 3cc838315b83496837c607530c9fae5d6cccb4583c376f2d038ea3c0b1253eac |
|
MD5 | 7cb27b84f047671d6224029780de6742 |
|
BLAKE2b-256 | 37e8a9c243205b2d10528a40f56c3699adbd4be4d2f026481bbfb46f41129a2b |