Project description

SIMRE

*SimRE is a Python-based tool for automatically identifying requirements' similarity in SPL projects.

Parameters description

You need to enter seven parameters. These parameters are the following:
- nameNewReq: name of the CSV file that contains the list of new requirements. Must be in fileserver folder.
- nameFeatures: name of the XML or UVL file that contains the existing requirements. Must be in fileserver folder.
- nameReqDescription: name of the JSON file that contains the requirements description. Must be in fileserver folder.
- language: 'en' for English and 'es' for Spanish
- listModels: array with the models. optional. The default model is 1. The models are the following:
  - 1:Model multilingual MiniLM-L12-v2
  - 2:Model multilingual distiluse-cased-v2
  - 3:Model multilingual mpnet-base-v2
  - 4:'Model word2vec
  - 5:'Model fastText
- threshold: optional. The default value is 0.7.
- preprocess: optional. The values are: True to allow the pre-processing (default value), and False for without pre-processing

Installation by pip

Its necessary to have installed al least Python 3.8.10 and pip 23.1.2
The tool can be used by installing the library or downloading the code.
After download the library using pip, use the following code to download several pre-trained models and put them on caché.

from simre import init_models
init_models.main()

from simre import similarity
models = similarity.load_models()

This process may take several minutes depending on the processor's capacity and memory. We recommend a RAM of at least of 16 GB.
When the process finishes, you should confirm that a "fileserver" folder has been created and contains four files.

Usage

The first three parameters must be on the fileserver folder.
The method similarity_process perform the similarity process.

similarity.similarity_process(nameNewReq, nameFeatures, nameReqDescription, 'en',models)

In the examples folder are some files to make a test.

similarity.similarity_process('newRequirements.csv', 'featureModel.xml', 'descRequirements_en.json', 'en',models)

With all the parameters.

similarity.similarity_process('newRequirements.csv', 'featureModel.xml', 'descRequirements_en.json', 'en',models,[1,2,3],0.6,False)

The results will be provided in a CSV file on the fileserver folder ('Similarity List.csv').

Installation by code

Download the code
Install the required libraries. All necessary libraries are listed in the requirements.txt file. To install them, execute the following command pip install -r ./requirements.txt
It is necessary to download several pre-trained models. This can be done automatically or manually. To download the models automatically, execute the following command: python process.py init. To download the models manually, follow these steps:
a. Download the models of spacy for Spanish and English: python -m spacy download es_core_news_sm and python -m spacy download en_core_web_sm
b. Download the models of fastText for Spanish and English: cc.es.300.bin and cc.es.300.bin from https://fasttext.cc/docs/en/crawl-vectors.html
c. Download the word2vec-based models for Spanish and English: SBW-vectors-300-min5.bin.gz from https://crscardellino.github.io/SBWCE/ and GoogleNews-vectors-negative300.bin.gz from https://code.google.com/archive/p/word2vec/
d. At the same directory level as the src folder, create a new folder named fileserver. Place all the pre-trained models into this folder.

Usage

To execute the tool without optional parameters, use the following command: python process.py nameNewReq nameFeatures nameReqDescription language. Example:: python process.py newRequirements.csv featureModel.xml descRequirements_en.json en.
This is an example using all the parameters: python process.py newRequirements.csv featureModel.xml descRequirements_en.json en 1,2,3,5 0.7 False
The results will be provided in a CSV file on the fileserver folder ('Similarity List.csv').

Formats

*XML file:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<featureModel chosenLayoutAlgorithm="4">
  <struct>
    <and mandatory="true" name="GEMA_SPL">      
      <and name="UserManagement">
        <or name="UM_Registration">
          <feature mandatory="true" name="UM_R_ByAdmin"/>
          <feature mandatory="true" name="UM_R_Anonymous"/>
        </or>              
      </and>      
    </and>
  </struct>
  <featureOrder userDefined="false"/>
</featureModel>

*UVL file:

features
	UserManagement 
		optional
			UM_Registration 
				or
					UM_R_ByAdmin 
					UM_R_Anonymous

*JSON file:

{
  "UserManagement": { 
      "label": "User Management",
      "desc": "User Management" 
  },
  "UM_Registration": { 
      "label": "User Registration",
      "desc": "User Registration" 
  },
  
}

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

0.0.3

Jun 5, 2024

0.0.2

Jun 5, 2024

This version

0.0.1

Jun 5, 2024

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

simre-0.0.1.tar.gz (10.9 kB view hashes)

Uploaded Jun 5, 2024 Source

Built Distribution

simre-0.0.1-py3-none-any.whl (9.9 kB view hashes)

Uploaded Jun 5, 2024 Python 3

Hashes for simre-0.0.1.tar.gz

Hashes for simre-0.0.1.tar.gz
Algorithm	Hash digest
SHA256	`ab88ad3de4ff25d8a336570d66ee21f05184c6bdb9cb75ff9ae937ae067ee9ca`
MD5	`493c6c3a333ba82798a08e5907d2cfa9`
BLAKE2b-256	`6c1c41f8e642ee600141f323e46c3ad42a0ee950b7f07f0830c25df693c254dc`

Hashes for simre-0.0.1-py3-none-any.whl

Hashes for simre-0.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`89fb1cfc6eaa991e5646b9ef402c0243270fb069139f4b22246eddc6af7cef59`
MD5	`67a2fbbf8471d9ec37b7c6bec449ae31`
BLAKE2b-256	`0e77556ddace04e9d35e7cb5531257f804bd75feccc9f1e8eb6e65acc0eccd62`