clean-column-names
Clean pandas DataFrame column names into predictable, consistent case styles.
Installation
uv add clean-column-names
Usage
import pandas as pd
import clean_column_names
df = pd.DataFrame(
    columns=[
        "First Name",
        "Café Sales ($)",
        "HTTPStatusCode",
        "",
        None,
        "First Name",
    ]
)
df = df.pipe(clean_column_names.clean_column_names)
print(df.columns.tolist())
Output:
[
    "first_name",
    "cafe_sales_$",
    "http_status_code",
    "column",
    "column_1",
    "first_name_1",
]
clean_column_names returns a new DataFrame; the original DataFrame is not modified.
API
df = df.pipe(
    clean_column_names.clean_column_names,
    case="snake",
    replace=None,
    remove_accents=True,
)
Arguments
- df: A pandas DataFrame.
- case: The target case style. Defaults to "snake".
- replace: Optional mapping of literal text replacements to apply before case conversion. Matching is case-insensitive.
- remove_accents: When True, accented characters are transliterated to ASCII where possible. Defaults to True.
Case Styles
| case | Example |
|---|---|
| "snake" | column_name |
| "kebab" | column-name |
| "camel" | columnName |
| "pascal" | ColumnName |
| "const" | COLUMN_NAME |
| "sentence" | Column name |
| "title" | Column Name |
| "lower" | column name |
| "upper" | COLUMN NAME |
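To make the table concrete, here is a minimal sketch of how a name could be split into words and rendered in each style. It is not the package's implementation: `split_words`, `to_case`, and the word pattern are hypothetical names chosen for illustration. The sketch only recognizes ASCII letters and digits, so it drops symbols such as "$" that the package keeps, and it assumes accents have already been removed.

```python
import re
from typing import List

# Hypothetical word pattern: acronym runs ("HTTP" before "Status"),
# capitalized or lowercase words, remaining uppercase runs, digit runs.
_WORD = re.compile(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|[0-9]+")


def split_words(name: str) -> List[str]:
    """Split a column name into lowercase words."""
    return [word.lower() for word in _WORD.findall(name)]


# One renderer per case style listed in the table above.
_STYLES = {
    "snake": lambda w: "_".join(w),
    "kebab": lambda w: "-".join(w),
    "camel": lambda w: w[0] + "".join(x.capitalize() for x in w[1:]),
    "pascal": lambda w: "".join(x.capitalize() for x in w),
    "const": lambda w: "_".join(w).upper(),
    "sentence": lambda w: " ".join(w).capitalize(),
    "title": lambda w: " ".join(x.capitalize() for x in w),
    "lower": lambda w: " ".join(w),
    "upper": lambda w: " ".join(w).upper(),
}


def to_case(name: str, case: str = "snake") -> str:
    """Render a name in the requested case style (illustrative only)."""
    if case not in _STYLES:
        raise ValueError(f"unknown case style: {case!r}")
    words = split_words(name)
    return _STYLES[case](words) if words else ""


for style in _STYLES:
    print(f"{style:9} {to_case('Column Name', style)}")
```

Splitting on the acronym boundary first is what lets "HTTPStatusCode" become three words rather than one.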
Examples
Use kebab case:
df = df.pipe(
    clean_column_names.clean_column_names,
    case="kebab",
)
print(df.columns.tolist())
[
    "first-name",
    "cafe-sales-$",
    "http-status-code",
    "column",
    "column-1",
    "first-name-1",
]
Apply replacements before cleaning:
df = df.pipe(
    clean_column_names.clean_column_names,
    replace={"HTTP": "API"},
)
print(df.columns.tolist())
[
    "first_name",
    "cafe_sales_$",
    "api_status_code",
    "column",
    "column_1",
    "first_name_1",
]
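A literal, case-insensitive replacement like the one above can be sketched with the standard library. This is an assumption about how such a step might work, not the package's code; `apply_replacements` is a hypothetical helper.

```python
import re
from typing import Mapping, Optional


def apply_replacements(name: str, replace: Optional[Mapping[str, str]]) -> str:
    """Apply literal, case-insensitive text replacements to one name."""
    for old, new in (replace or {}).items():
        if not old:
            continue  # an empty pattern would match between every character
        # re.escape treats `old` as literal text; the lambda keeps `new`
        # literal as well, so backslashes in it are not interpreted.
        name = re.sub(
            re.escape(old),
            lambda _match, new=new: new,
            name,
            flags=re.IGNORECASE,
        )
    return name


print(apply_replacements("HTTPStatusCode", {"http": "API"}))
```

Running the replacement before case conversion means the substituted text is split and cased like any other part of the name.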
Keep accented characters:
df = df.pipe(
    clean_column_names.clean_column_names,
    case="title",
    remove_accents=False,
)
print(df.columns.tolist())
[
    "First Name",
    "Café Sales ($)",
    "Http Status Code",
    "Column",
    "Column 1",
    "First Name 1",
]
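For comparison, accent removal of the kind that remove_accents=True describes is commonly done with Unicode decomposition. The sketch below assumes that approach; `strip_accents` is a hypothetical helper and the package may transliterate differently.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Drop combining marks after decomposing characters (NFKD)."""
    decomposed = unicodedata.normalize("NFKD", text)
    # Combining marks have a non-zero combining class; base letters do not.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("Café Sales ($)"))
```

Characters with no decomposition, such as "ø", pass through unchanged under this approach, which is one reading of "where possible".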
Behavior Notes
Blank and null column names are replaced with the placeholder name "column", rendered in the target case style (for example, "Column" in title case).
If multiple columns clean to the same name, numeric suffixes are added using the target case style's separator:
df = pd.DataFrame(columns=["Name", "Name", "Name"])
df = df.pipe(clean_column_names.clean_column_names)
print(df.columns.tolist())
["name", "name_1", "name_2"]
This package supports ordinary flat columns and pandas MultiIndex columns.
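The placeholder and suffix rules in the notes above can be sketched for a flat list of already-cleaned names. This is a minimal illustration under assumptions, not the package's code: `fill_blank` and `dedupe` are hypothetical helpers, and how the package resolves a clash with an already-suffixed name is not documented here.

```python
import math


def fill_blank(name, placeholder="column"):
    """Return the placeholder for None, NaN, or blank names."""
    if name is None:
        return placeholder
    if isinstance(name, float) and math.isnan(name):
        return placeholder
    text = str(name)
    return text if text.strip() else placeholder


def dedupe(names, sep="_"):
    """Append numeric suffixes so every returned name is unique."""
    used = set()
    counts = {}
    result = []
    for name in names:
        candidate = name
        n = counts.get(name, 0)
        # Keep counting until the candidate does not clash with a used name.
        while candidate in used:
            n += 1
            candidate = f"{name}{sep}{n}"
        counts[name] = n
        used.add(candidate)
        result.append(candidate)
    return result


cleaned = [fill_blank(n) for n in ["first_name", "", None, "first_name"]]
print(dedupe(cleaned))
```

Passing sep="-" reproduces the kebab-case suffixes shown in the examples.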
License
MIT
Download files
Source Distribution
File details
Details for the file clean_column_names-0.1.0.tar.gz.
File metadata
- Download URL: clean_column_names-0.1.0.tar.gz
- Upload date:
- Size: 3.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.11.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2a275956aa45ca4c4285ec68751d1b76c7238dfb29c8267c43afab7028512289 |
| MD5 | 2fafeb448ca46151faca15f3a473becf |
| BLAKE2b-256 | 8b867b540b5e9647225ee77c9d42198f7d0580c75ab3e6df1f569e69d2c355e7 |