A Source Code Similarity System

scoss

A Source Code Similarity System - SCOSS

There are four supported metrics:

  • count_operator: A metric that counts how often each operator occurs in the source code to compute a similarity score.
  • set_operator: A metric that checks which operators are present in the source code to compute a similarity score.
  • hash_operator: A metric that uses combinations of adjacent operators to compute a similarity score.
  • SMoss: A wrapper around MOSS (the same as mosspy).
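To make the three operator-based ideas more concrete, here is a minimal sketch in plain Python. It is not the SCOSS implementation: the operator list, the regex tokenizer, and the choice of cosine similarity (for counts) and Jaccard similarity (for sets and for adjacent pairs) are all assumptions made for illustration, and the function names are hypothetical.

```python
import re
from collections import Counter
from math import sqrt

# Hypothetical operator list; the tokenizer SCOSS really uses is not described here.
OPERATORS = ["<<=", ">>=", "==", "!=", "<=", ">=", "&&", "||", "++", "--",
             "+=", "-=", "*=", "/=", "<<", ">>", "+", "-", "*", "/", "%",
             "=", "<", ">", "!", "&", "|", "^"]
# Try longer operators first so that "==" is not read as two "=" tokens.
_PATTERN = re.compile(
    "|".join(re.escape(op) for op in sorted(OPERATORS, key=len, reverse=True))
)


def operators(source):
    """Return the operators of `source` in order of appearance."""
    return _PATTERN.findall(source)


def _jaccard(left, right):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = left | right
    return len(left & right) / len(union) if union else 0.0


def count_similarity(a, b):
    """Assumed 'count' idea: cosine similarity of operator frequencies."""
    ca, cb = Counter(operators(a)), Counter(operators(b))
    dot = sum(ca[op] * cb[op] for op in ca)
    norm = sqrt(sum(v * v for v in ca.values()) * sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def set_similarity(a, b):
    """Assumed 'set' idea: Jaccard similarity of the operators that are present."""
    return _jaccard(set(operators(a)), set(operators(b)))


def hash_similarity(a, b):
    """Assumed 'hash' idea: Jaccard similarity of adjacent operator pairs."""
    def pairs(source):
        ops = operators(source)
        return set(zip(ops, ops[1:]))
    return _jaccard(pairs(a), pairs(b))


first = "int x = a + b; x += 1;"
second = "int y = c + d; y += 2;"  # same structure, renamed variables
print(count_similarity(first, second))
print(set_similarity(first, second))
print(hash_similarity(first, second))
```

Because only operators are compared, renaming variables or changing literals leaves all three scores unchanged in this sketch, which is the property that makes operator-based metrics useful for spotting lightly disguised copies.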

Installation

This package requires Python 3.6 or later.

pip install scoss

Usage

You can use SCOSS through a command line interface, as a library in your project, or through a web-app interface.

Command Line Interface (CLI)

Coming soon...

Using as a library

  1. Define a Scoss object and register some metrics:
from scoss import Scoss
sc = Scoss(lang='cpp')
# only show pairs whose similarity score exceeds the threshold
sc.add_metric('count_operator', threshold=0.7)
sc.add_metric('set_operator', threshold=0.5)
  2. Register source files with the Scoss object:
sc.add_file('./tests/data/a.cpp')
sc.add_file('./tests/data/b.cpp')
sc.add_file('./tests/data/c.cpp')
# or add files by wildcard
sc.add_file_by_wildcard('./tests/data/problem_A_*.cpp')
  3. Run Scoss and get the results:
sc.run()
# filter results by combining the thresholds of the registered metrics (and_thresholds)
print(sc.get_matches(and_thresholds=True))
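One way to picture the and_thresholds filter is the sketch below. It does not call SCOSS and does not reflect its internals: the filter_matches helper, the shape of the scores dictionary, and the "any metric" behaviour when and_thresholds is False are all assumptions made for illustration. Only the strict "score > threshold" comparison is taken from the comment in step 1.

```python
def filter_matches(scores, thresholds, and_thresholds=True):
    """Keep the file pairs whose scores exceed the metric thresholds.

    scores:     {(file_a, file_b): {metric_name: score}}  (hypothetical shape)
    thresholds: {metric_name: threshold}
    With and_thresholds=True a pair must exceed every threshold; otherwise
    (an assumption) exceeding a single threshold is enough.
    """
    combine = all if and_thresholds else any
    return {
        pair: per_metric
        for pair, per_metric in scores.items()
        if combine(per_metric.get(name, 0.0) > limit
                   for name, limit in thresholds.items())
    }


# Made-up scores for three file pairs, using the thresholds from step 1.
scores = {
    ("a.cpp", "b.cpp"): {"count_operator": 0.9, "set_operator": 0.8},
    ("a.cpp", "c.cpp"): {"count_operator": 0.8, "set_operator": 0.4},
    ("b.cpp", "c.cpp"): {"count_operator": 0.2, "set_operator": 0.1},
}
thresholds = {"count_operator": 0.7, "set_operator": 0.5}

print(sorted(filter_matches(scores, thresholds, and_thresholds=True)))
print(sorted(filter_matches(scores, thresholds, and_thresholds=False)))
```

Requiring every metric to agree trades recall for precision: fewer pairs are reported, but each reported pair is flagged by all registered metrics.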

SMoss defines the same behaviour. You can create an SMoss object to use the MOSS system.

Web-app interface

Please check our web-app interface here.

Issues

This project is still in development. If you find any issues, please create an issue here.

Contributors

Ngoc Bui, Thai Do, Tran Vien.

Acknowledgements

This project is sponsored and led by Prof. Do Phan Thuan, Hanoi University of Science and Technology.

Part of this code adapts https://github.com/soachishti/moss.py as the baseline for SMoss.
