Project description
defected
Open source projects thrive on collaboration, but their openness comes with risks. Contributors may unknowingly or intentionally exhibit suspicious behaviors, such as:
- Frequent timezone changes in their commit metadata.
- Working at unusual hours or during public holidays.
- Unusual patterns in commit activity.
These anomalies could indicate automation scripts, compromised accounts, or malicious actions.
Defected is a CLI tool designed to help maintainers detect and flag suspicious commit patterns. By analyzing Git logs, Defected provides insights into contributors’ behaviors, helping ensure the security and integrity of your project.
Problem Statement
Risks in Open Source Collaboration:
- Frequent Timezone Changes:
- Automation scripts or account misuse can result in rapid changes in timezone metadata.
- Unusual Working Hours:
- Commits made during public holidays or odd hours may indicate suspicious activity.
- Behavioral Anomalies:
- Patterns of activity inconsistent with normal contributor behavior could point to automation or malicious intent.
Manually detecting these patterns is tedious and impractical for large projects.
Objective
Defected addresses these challenges by:
- Detecting frequent timezone changes in commit metadata.
- Highlighting contributors with irregular commit patterns.
- Flagging potential risks for maintainers to investigate.
- Providing clear and exportable reports for further analysis.
Features
- Easy-to-Use CLI:
- Installable via PyPI, Defected is simple to run directly from your terminal.
- Commit Metadata Analysis:
- Extracts author, email, date, and timezone data from Git logs.
- Timezone Change Detection:
- Flags contributors exceeding a configurable threshold of timezone changes.
- Customizable Options:
- Adjust thresholds and filter the output down to suspicious results.
- Exportable Reports:
- Saves results in CSV format for further analysis.
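The timezone change detection described above can be sketched in a few lines of Python. This is a minimal illustration, not Defected's actual implementation: the function name `count_timezone_changes`, its input shape (chronologically ordered `(author, utc_offset)` pairs), and the choice to flag only when the count is strictly greater than the threshold are all assumptions made for the example.

```python
from collections import defaultdict


def count_timezone_changes(commits, threshold=2):
    """Summarize timezone changes per author.

    `commits` is an iterable of (author, utc_offset) pairs, assumed to be
    in chronological order. Hypothetical helper, not Defected's API.
    """
    offsets_by_author = defaultdict(list)
    for author, offset in commits:
        offsets_by_author[author].append(offset)

    report = {}
    for author, offsets in offsets_by_author.items():
        # A "change" is any pair of consecutive commits with different offsets.
        changes = sum(1 for prev, cur in zip(offsets, offsets[1:]) if prev != cur)
        report[author] = {
            "total_commits": len(offsets),
            "unique_timezones": len(set(offsets)),
            "timezone_changes": changes,
            # Assumption: "exceeding" means strictly greater than the threshold.
            "suspicious": changes > threshold,
        }
    return report


# Made-up sample data for illustration only.
sample = [
    ("alice", "+0100"), ("alice", "+0800"), ("alice", "+0100"), ("alice", "+0800"),
    ("bob", "+0000"), ("bob", "+0000"),
]
report = count_timezone_changes(sample, threshold=2)
```

Counting consecutive changes rather than only distinct timezones is what separates a contributor who moved once from one who keeps switching back and forth.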
Installation
Install Defected using pip:
$ pip install defected
Usage
Defected provides a single command-line interface with subcommands. You can list all the available commands by using:
$ defected -h
Available Commands
The analyze sub-command helps you find suspicious contributors:
$ defected analyze [OPTIONS]
List all the available options by using:
$ defected analyze -h
Examples
Analyze the current repository for timezone changes:
defected analyze
Clone and analyze a remote repository. Analyze a remote repository by providing its URL:
defected analyze --repo https://github.com/user/repo.git
Filter only suspicious results. Display and export only contributors flagged as suspicious:
defected analyze --only-suspicious
Example Output
Terminal Output
Extracting Git logs...
150 commits extracted.
Analyzing timezones with a threshold of 2 timezone changes...
Showing only suspicious results:
author email total_commits unique_timezones timezone_changes suspicious
0 Alice Smith alice@example.com 45 3 4 True
1 Bob Johnson bob@example.com 30 2 3 True
Saving analysis to 'timezone_analysis.csv'...
Analysis saved.
CSV Output
| author | email | total_commits | unique_timezones | timezone_changes | suspicious |
|---|---|---|---|---|---|
| Alice Smith | alice@example.com | 45 | 3 | 4 | True |
| Bob Johnson | bob@example.com | 30 | 2 | 3 | True |
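The exported report can be post-processed with Python's standard `csv` module. The sketch below assumes the file is comma-separated and that its header row uses the column names shown above. The helper `load_suspicious` is hypothetical, and the sample reuses the example rows plus one made-up non-suspicious row.

```python
import csv
import io


def load_suspicious(csv_file):
    """Return the rows whose 'suspicious' column reads True."""
    reader = csv.DictReader(csv_file)
    return [
        row for row in reader
        if (row.get("suspicious") or "").strip().lower() == "true"
    ]


# In-memory stand-in for timezone_analysis.csv (column layout is an assumption).
SAMPLE = """author,email,total_commits,unique_timezones,timezone_changes,suspicious
Alice Smith,alice@example.com,45,3,4,True
Bob Johnson,bob@example.com,30,2,3,True
Carol Doe,carol@example.com,12,1,0,False
"""

rows = load_suspicious(io.StringIO(SAMPLE))
for row in rows:
    print(row["author"], row["timezone_changes"])
```

To read a real report, pass `open("timezone_analysis.csv", newline="")` instead of the in-memory sample.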
Real world use case
JiaT75 and the xz Backdoor
Background
In February 2024, a contributor named JiaT75 managed to introduce a backdoor into the popular compression utility xz. This backdoor could have allowed unauthorized access to systems using the library, creating a serious security risk.
Upon investigation, it was discovered that JiaT75 exhibited suspicious behavior:
- They made commits from multiple, rapidly changing timezones over a short period.
- Their activity patterns were inconsistent with typical open source contributors, suggesting potential misuse of accounts or automation.
Defected can help identify such patterns in contributors' Git activity.
Detecting JiaT75's Behavior with Defected
Suppose you have a repository of xz and suspect malicious activity. You can use Defected to analyze the commit logs for anomalies.
Let's analyze the xz repository:
defected analyze --repo https://github.com/tukaani-project/xz --only-suspicious
This command outputs something like the following:
Cloning remote repository: https://github.com/tukaani-project/xz...
Extracting Git logs...
2676 commits extracted.
Parsing logs...
Analyzing timezones with a threshold of 2 timezone changes...
Showing only suspicious results:
author total_commits unique_timezones timezone_changes suspicious email
36 Lasse Collin 2102 3 36 True lasse.collin@tukaani.org
28 Jia Tan 449 3 14 True jiat0218@gmail.com
32 Jonathan Nieder 9 3 4 True jrnieder@gmail.com
Saving analysis to 'timezone_analysis.csv'...
Analysis saved.
Results are exported in CSV format and can be loaded into a spreadsheet:
| author | total_commits | unique_timezones | timezone_changes | suspicious | email |
|---|---|---|---|---|---|
| Lasse Collin | 2102 | 3 | 36 | True | lasse.collin@tukaani.org |
| Jia Tan | 449 | 3 | 14 | True | jiat0218@gmail.com |
| Jonathan Nieder | 9 | 3 | 4 | True | jrnieder@gmail.com |
Interpretation
The results show that Jia Tan, also known as JiaT75:
- Contributed 449 commits to the repository.
- Operated from 3 different timezones during their activity period.
- Exhibited 14 timezone changes, exceeding the threshold of 2, which flags them as "suspicious."
These irregular patterns warrant further investigation and could have raised red flags before the backdoor was merged.
Obviously, not all flagged activity is malicious. The results above also include legitimate activity, such as that of Lasse and Jonathan. Jia's activity, however, has been proven to be part of a security attack carried out through social engineering.
Lessons Learned
This case highlights the importance of monitoring contributor activity, especially in critical open source projects.
By using tools like Defected, maintainers can:
- Proactively identify suspicious contributors.
- Investigate anomalies in commit patterns.
- Prevent security risks, such as backdoors, before they impact end users.
Why This Matters
The case of JiaT75 is a reminder that even trusted repositories can be compromised. Open source maintainers need tools like Defected to protect their projects from potential threats by identifying early warning signs such as irregular timezone changes.
Of course, not all timezone changes are suspicious. Many are legitimate, but as the xz example demonstrates, some are real attack attempts. JiaT75 tried to appear to be located in Asia, yet some of their commits carry a western European timezone. Some of these changes happen within such a short interval that JiaT75 would have had to travel faster than light.
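The "faster than light" observation suggests a simple check: look for consecutive commits by the same author whose UTC offsets differ while the elapsed time between them is too short for real travel. The sketch below is an assumption-laden illustration rather than a feature documented for Defected. The function name `rapid_offset_switches` and the six-hour default are hypothetical, and the timestamps are made up.

```python
from datetime import datetime, timedelta


def rapid_offset_switches(dates, min_gap=timedelta(hours=6)):
    """Return pairs of consecutive commits whose UTC offset changed
    in less than `min_gap`. `dates` are timezone-aware datetimes
    belonging to a single author."""
    ordered = sorted(dates)  # aware datetimes compare by absolute instant
    hits = []
    for prev, cur in zip(ordered, ordered[1:]):
        if prev.utcoffset() != cur.utcoffset() and (cur - prev) < min_gap:
            hits.append((prev, cur))
    return hits


# Made-up timestamps: a +08:00 commit followed one hour later by a +02:00 commit.
commits = [
    datetime.fromisoformat("2024-01-01T10:00:00+08:00"),
    datetime.fromisoformat("2024-01-01T05:00:00+02:00"),
    datetime.fromisoformat("2024-01-03T09:00:00+08:00"),
]
hits = rapid_offset_switches(commits)
```

A hit is only a prompt for manual review: remote machines, containers, and misconfigured clocks can all produce the same pattern.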
How It Works
- Log Extraction:
- Extracts contributor metadata (author, email, date, timezone) using Git.
- Analysis:
- Groups commits by contributors.
- Detects timezone changes and flags irregular patterns.
- Results:
- Outputs analysis to the terminal.
- Exports results to a CSV file.
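The log extraction step could look roughly like the following sketch. It is one possible way to pull the same fields out of Git, not a description of Defected's internals. The `git log` placeholders `%an`, `%ae`, `%aI`, and `%x09` are standard Git format specifiers, while `parse_log`, `extract_log`, and the tab-separated layout are choices made for this example.

```python
import subprocess
from datetime import datetime

# Author name, author email, strict ISO 8601 author date, separated by tabs.
LOG_FORMAT = "%an%x09%ae%x09%aI"


def parse_log(text):
    """Turn tab-separated `git log` output into a list of dicts."""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue
        author, email, stamp = line.rsplit("\t", 2)
        # Some Git versions print "Z" for UTC; older Pythons cannot parse it.
        if stamp.endswith("Z"):
            stamp = stamp[:-1] + "+00:00"
        when = datetime.fromisoformat(stamp)
        records.append({
            "author": author,
            "email": email,
            "date": when,
            "timezone": when.strftime("%z"),  # e.g. "+0800"
        })
    return records


def extract_log(repo_path="."):
    """Run `git log` in `repo_path`, oldest commit first, and parse it."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--reverse",
         f"--pretty=format:{LOG_FORMAT}"],
        capture_output=True, text=True, check=True,
    )
    return parse_log(result.stdout)
```

Calling `extract_log(".")` inside a Git working tree returns one record per commit, ready to be grouped by contributor.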
Future Improvements
- Holiday Detection:
- Cross-reference commit dates with public holiday calendars for anomaly detection.
- Commit Pattern Visualization:
- Add heatmaps or graphs to visualize contributors' activity.
- CI/CD Integration:
- Automate detection in pipelines to secure projects during updates.
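As a rough idea of what holiday detection might involve, the sketch below checks commit dates against a small hard-coded calendar. Everything here is hypothetical: the `HOLIDAYS` set, the function name `commits_on_holidays`, and the decision to use the commit's own local date (as recorded in its timezone offset) are assumptions, and a real implementation would need per-country calendars.

```python
from datetime import date, datetime

# Hypothetical calendar for illustration; real calendars vary by country.
HOLIDAYS = {date(2024, 1, 1), date(2024, 12, 25)}


def commits_on_holidays(commit_dates, holidays=HOLIDAYS):
    """Return the commits whose local calendar date is a listed holiday.

    `commit_dates` are timezone-aware datetimes; `.date()` yields the date
    in the commit's own recorded offset."""
    return [d for d in commit_dates if d.date() in holidays]


# Made-up timestamps.
dates = [
    datetime.fromisoformat("2024-12-25T09:30:00+08:00"),
    datetime.fromisoformat("2024-12-26T09:30:00+08:00"),
]
flagged = commits_on_holidays(dates)
```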
Contributing
We welcome contributions to Defected! To contribute:
- Fork the repository.
- Create a feature branch.
- Submit a pull request with a detailed description of your changes.
License
Defected is licensed under the MIT License. See the LICENSE file for details.
Acknowledgments
This project is inspired by the open source community and aims to empower maintainers with tools to ensure project security and integrity.
File details
Details for the file defected-0.1.0.tar.gz.
File metadata
- Download URL: defected-0.1.0.tar.gz
- Upload date:
- Size: 22.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.1.1 CPython/3.12.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a6edd6713db99efa76238ee0ea88a3a8ad0514de294011af5cbab2f6a53429aa |
| MD5 | 9c793d3f15006ed3f9a7e6d929112ed9 |
| BLAKE2b-256 | 09faf312dc3979256271b27d914600f48f8f6b338e3afd72f39718292f802068 |
Provenance
The following attestation bundles were made for defected-0.1.0.tar.gz:
Publisher: main.yml on 4383/defected
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: defected-0.1.0.tar.gz
- Subject digest: a6edd6713db99efa76238ee0ea88a3a8ad0514de294011af5cbab2f6a53429aa
- Sigstore transparency entry: 152015937
- Sigstore integration time:
- Permalink: 4383/defected@3aaa075a02e152ce6882e587ea5db1fb4bc2a36d
- Branch / Tag: refs/tags/0.1.0
- Owner: https://github.com/4383
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: main.yml@3aaa075a02e152ce6882e587ea5db1fb4bc2a36d
- Trigger Event: push
File details
Details for the file defected-0.1.0-py3-none-any.whl.
File metadata
- Download URL: defected-0.1.0-py3-none-any.whl
- Upload date:
- Size: 20.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.1.1 CPython/3.12.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | bdcfcaeb22df4cc99a8ebd8017ede1f0fa9f1ea9c6d54576634798f7991d08d4 |
| MD5 | 6fae7f2dea161b9db52ddbe7f92cf2a2 |
| BLAKE2b-256 | 826d844455f31522a871eb5a360bdf4e989bc5310ea41d098703d9c62df44c6b |
Provenance
The following attestation bundles were made for defected-0.1.0-py3-none-any.whl:
Publisher: main.yml on 4383/defected
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: defected-0.1.0-py3-none-any.whl
- Subject digest: bdcfcaeb22df4cc99a8ebd8017ede1f0fa9f1ea9c6d54576634798f7991d08d4
- Sigstore transparency entry: 152015938
- Sigstore integration time:
- Permalink: 4383/defected@3aaa075a02e152ce6882e587ea5db1fb4bc2a36d
- Branch / Tag: refs/tags/0.1.0
- Owner: https://github.com/4383
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: main.yml@3aaa075a02e152ce6882e587ea5db1fb4bc2a36d
- Trigger Event: push