Project description

Library for detecting plagiarism in source code. Useful for online judges, teachers, developers, and maybe lawyers.

Plagiarism uses the method described [here](http://…). The basic idea is to classify each submitted file according to several metrics and then run a series of k-means clustering passes to determine which objects are most similar to each other. This approach has an O(N log N) cost and scales fairly well to large samples.
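As a rough illustration of the idea, and not this library's actual API, the sketch below maps each file to a small vector of metrics and groups the vectors with a single plain k-means pass. The function names `metrics` and `kmeans`, the three metrics chosen, and the fixed iteration count are all assumptions made for the example; the series of clustering passes and the O(N log N) behaviour described above are not reproduced here.

```python
import math
import random


def metrics(text):
    """Map a file's text to a numeric feature vector (illustrative metrics only)."""
    lines = text.splitlines() or [""]
    words = text.split()
    n_chars = max(len(text), 1)
    return [
        sum(len(line) for line in lines) / len(lines),  # mean line length
        len(set(words)) / max(len(words), 1),           # distinct-word ratio
        sum(c.isspace() for c in text) / n_chars,       # whitespace ratio
    ]


def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means: return one cluster label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k distinct points as starting centroids
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels
```

Files that land in the same cluster are candidates for a closer pairwise comparison. In practice the metrics would need to be scaled to comparable ranges before clustering, since a raw line length would otherwise dominate the two ratios.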

The algorithm can be applied to natural text and source code, and can even be adapted to run on arbitrary data structures (such as the parse tree of a computer program, ASM output, or even binary executables). It requires some tuning for each application, and accuracy may vary widely from one application to another. You should expect better results when grading Python and C source code. Performance on other programming languages, or in other domains, may vary.
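To make the parse-tree case concrete, here is a minimal sketch, assuming Python source as input, that turns a program into a histogram of AST node types using the standard `ast` module and compares two histograms with cosine similarity. The helpers `node_histogram` and `cosine_similarity` are hypothetical names for this example and are not taken from the library; the choice of cosine similarity is likewise an assumption.

```python
import ast
import math
from collections import Counter


def node_histogram(source):
    """Count how often each AST node type occurs in a Python program."""
    tree = ast.parse(source)
    return Counter(type(node).__name__ for node in ast.walk(tree))


def cosine_similarity(a, b):
    """Cosine similarity between two Counter histograms (0.0 if either is empty)."""
    dot = sum(a[key] * b[key] for key in set(a) | set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


original = "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s\n"
renamed = "def acc(values):\n    r = 0\n    for v in values:\n        r += v\n    return r\n"
unrelated = "import os\nprint(os.getcwd())\n"

same_shape = cosine_similarity(node_histogram(original), node_histogram(renamed))
different = cosine_similarity(node_histogram(original), node_histogram(unrelated))
```

Because the histogram records node types rather than identifiers, renaming variables leaves it unchanged, which is one reason structural features can be more robust than raw text metrics. Such a histogram could serve as one of the per-file metric vectors fed to the clustering step.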

