Pychemist
The Alchemist of Data Science
Transform raw data into insights.
Pychemist is a lightweight Python library designed to simplify your data science workflow. Inspired by the transformation mindset of an alchemist, it helps you turn raw data into golden insights. It replaces complex, repetitive code with a clean, intuitive syntax for data cleaning, transformation, and preparation. It also helps you present results in a clean, well-structured form, so findings are easier to communicate.
Features
- Create lagged or lead variables for time-series and panel data
- Run quick, readable t-tests on treatment groups
- Filter model summaries to hide fixed effects
- Conditional mutation of DataFrames
- Pandas accessor (.chem) for fluent, chainable workflows
Installation
pip install pychemist
Usage
import pychemist as chem
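The .chem accessor listed under Features is presumably registered through pandas' accessor extension mechanism when the package is imported; the README does not say so, so treat that as an assumption. The sketch below shows the general pattern with a made-up accessor named demo and a hypothetical helper add_constant. Neither is part of pychemist.

```python
import pandas as pd


# "demo" is a made-up accessor name, chosen so it cannot clash with pychemist's .chem
@pd.api.extensions.register_dataframe_accessor("demo")
class DemoAccessor:
    def __init__(self, pandas_obj: pd.DataFrame) -> None:
        # pandas passes in the DataFrame the accessor was called on
        self._df = pandas_obj

    def add_constant(self, column: str, value) -> pd.DataFrame:
        """Hypothetical helper: return a copy with `column` set to `value`."""
        return self._df.assign(**{column: value})


df = pd.DataFrame({"x": [1, 2, 3]})

# Each method returns a new DataFrame, so calls can be chained fluently
result = df.demo.add_constant("flag", 1).demo.add_constant("source", "raw")
print(result)
```

Because every method returns a DataFrame, the result can be reassigned to df, which is the same pattern the examples below use.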
Example 1: Conditional mutation using the df.chem.mutate DataFrame accessor.
Update the total_assets for a specific company and year to a given value:
df = df.chem.mutate('company_id == "8ga62sav" & year == 2025', "total_assets", 82000000)
Example 2: Conditional mutation with a fallback value using the df.chem.mutate DataFrame accessor.
Set the Promotion column to 1 for managers who haven't been promoted in 3 or more years and have a performance rating of at least 4; otherwise set it to 0.
df = df.chem.mutate('YearsSinceLastPromotion >= 3 & JobRole == "Manager" & PerformanceRating >= 4', 'Promotion', 1, 0)
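For readers who want to see what the two mutate calls above correspond to in plain pandas, here is a rough equivalent. It is a sketch, not pychemist's implementation: the function name mutate_sketch and the parameter name otherwise are hypothetical, and evaluating the condition string with DataFrame.eval is an assumption about how such strings are interpreted.

```python
import numpy as np
import pandas as pd


def mutate_sketch(df, condition, column, value, otherwise=None):
    """Plain-pandas stand-in for a conditional mutation (hypothetical helper)."""
    # Assumption: the condition string is evaluated against the DataFrame's columns
    mask = df.eval(condition).astype(bool)
    out = df.copy()
    if otherwise is None:
        # Only matching rows change; all other rows keep their current value
        out.loc[mask, column] = value
    else:
        # Matching rows get `value`, every other row gets `otherwise`
        out[column] = np.where(mask, value, otherwise)
    return out


firms = pd.DataFrame({
    "company_id": ["8ga62sav", "zzz"],
    "year": [2025, 2025],
    "total_assets": [1.0, 2.0],
})
firms = mutate_sketch(
    firms, '(company_id == "8ga62sav") & (year == 2025)', "total_assets", 82000000.0
)

staff = pd.DataFrame({
    "YearsSinceLastPromotion": [4, 1, 5],
    "JobRole": ["Manager", "Manager", "Analyst"],
    "PerformanceRating": [4, 5, 5],
})
staff = mutate_sketch(
    staff,
    '(YearsSinceLastPromotion >= 3) & (JobRole == "Manager") & (PerformanceRating >= 4)',
    "Promotion", 1, 0,
)
```

The sketch parenthesizes each comparison so that it does not rely on how the expression parser ranks & against the comparison operators.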
Example 3: Creating lagged variables using the df.chem.lag DataFrame accessor.
Create lagged versions of total assets and net income for each ticker, only when the year difference is exactly 1:
df = df.chem.lag(['total assets', 'net income'], 'ticker', 'year')
Example 4: Creating leading variables using the df.chem.lead DataFrame accessor.
Create lead (future) versions of total assets and net income for each ticker, only when the year difference is exactly 1:
df = df.chem.lead(['total assets', 'net income'], 'ticker', 'year')
Example 5: Creating 2-year lagged variables using the df.chem.lag DataFrame accessor.
Create lagged versions of total assets and net income for each ticker, only when the year difference is exactly 2:
df = df.chem.lag(['total assets', 'net income'], 'ticker', 'year', 2)
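The "only when the year difference is exactly n" rule in Examples 3 to 5 is what separates these lags and leads from a simple groupby shift, which would silently bridge gaps in the panel. One way to get that behavior in plain pandas is a self-merge on the time key shifted by n. This is a sketch under assumptions: the helper shift_sketch and the output column names (such as "total assets_lag1") are hypothetical, pychemist may name its columns differently, and each (ticker, year) pair is assumed to be unique.

```python
import pandas as pd


def shift_sketch(df, columns, group, time, n=1, label="lag"):
    """Attach values observed exactly n periods earlier (n > 0) or later (n < 0)."""
    lookup = df[[group, time] + list(columns)].copy()
    # Relabel each row's period so it lines up with the row that should receive it
    lookup[time] = lookup[time] + n
    # Hypothetical naming scheme for the new columns
    lookup = lookup.rename(columns={c: f"{c}_{label}{abs(n)}" for c in columns})
    # Left merge: rows without an exact match n periods away get NaN
    return df.merge(lookup, on=[group, time], how="left")


panel = pd.DataFrame({
    "ticker": ["A", "A", "A", "B", "B"],
    "year": [2020, 2021, 2023, 2020, 2021],  # ticker A has no 2022 observation
    "total assets": [10.0, 11.0, 13.0, 20.0, 21.0],
})

lag1 = shift_sketch(panel, ["total assets"], "ticker", "year", n=1)
lag2 = shift_sketch(panel, ["total assets"], "ticker", "year", n=2)
lead1 = shift_sketch(panel, ["total assets"], "ticker", "year", n=-1, label="lead")
print(lag1)
```

In lag1, ticker A's 2023 row stays empty because 2022 is missing, whereas a plain groupby shift would have filled it with the 2021 value.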
Example 6: T-test between treated and control groups
chem.ttest(df, variable="outcome", treatment="treated")
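As a point of comparison, a treated-versus-control t-test can be written by hand with SciPy as below. This is a sketch under assumptions, not a description of chem.ttest: the README does not state whether pychemist uses a pooled or Welch test or how the treatment column must be coded, so the 0/1 coding and equal_var=False are choices made for this illustration, and the data is invented.

```python
import pandas as pd
from scipy import stats

# Invented data: "treated" is assumed to be coded 1 (treated) / 0 (control)
df = pd.DataFrame({
    "treated": [1, 1, 1, 1, 0, 0, 0, 0],
    "outcome": [5.1, 4.8, 5.6, 5.3, 4.0, 4.4, 3.9, 4.2],
})

treated = df.loc[df["treated"] == 1, "outcome"]
control = df.loc[df["treated"] == 0, "outcome"]

# Welch's t-test (unequal variances); an assumption, not necessarily pychemist's default
result = stats.ttest_ind(treated, control, equal_var=False)
mean_diff = treated.mean() - control.mean()

print(f"difference in means: {mean_diff:.3f}")
print(f"t = {result.statistic:.3f}, p = {result.pvalue:.4f}")
```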
Example 7: Model summary without fixed effects
import statsmodels.formula.api as smf
model = smf.ols("y ~ x + C(firm)", data=df).fit()
print(chem.summary_no_fe(model))
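To show the idea behind hiding fixed effects, the sketch below builds a small coefficient table from a fitted statsmodels result and drops the rows that belong to C(...) terms. It relies on an assumption: fixed effects are identified by parameter names starting with "C(", and summary_no_fe may detect and format them differently. The data is invented for illustration.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Invented panel: three firms, three observations each
df = pd.DataFrame({
    "firm": ["a", "a", "a", "b", "b", "b", "c", "c", "c"],
    "x": [1.0, 2.0, 3.0, 1.5, 2.5, 3.5, 1.0, 3.0, 5.0],
    "y": [2.1, 3.9, 6.2, 4.0, 6.1, 7.8, 1.2, 5.1, 9.3],
})

model = smf.ols("y ~ x + C(firm)", data=df).fit()

# Collect coefficients, standard errors, and p-values into one table
table = pd.DataFrame({
    "coef": model.params,
    "std_err": model.bse,
    "p_value": model.pvalues,
})

# Assumption: fixed-effect dummies are the parameters whose names start with "C("
no_fe = table[~table.index.str.startswith("C(")]
print(no_fe.round(3))
```

The filtered table keeps only the intercept and x, which is the part of the output most readers care about when the firm dummies are just controls.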
MIT License. Copyright (c) Jeroen van Raak (2025)