Convert a Pandas DataFrame/Series with dtype str/string/object to the best available dtypes
What is it used for?
Converts a pandas DataFrame or Series whose dtype is str, string, or object to the most compact dtype that fits each column's values (for example uint8, float64, or category).
Installation
pip install a-pandas-ex-string-to-dtypes
Usage
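import pandas as pd
from a_pandas_ex_string_to_dtypes import pd_add_string_to_dtypes

# Makes the ds_string_to_best_dtype method used below available on pandas objects.
pd_add_string_to_dtypes()

# Load the Titanic sample data; pandas infers numeric dtypes where it can.
df = pd.read_csv("https://github.com/pandas-dev/pandas/raw/main/doc/data/titanic.csv")
print(df)
print(df.dtypes)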
PassengerId Survived Pclass ... Fare Cabin Embarked
0 1 0 3 ... 7.2500 NaN S
1 2 1 1 ... 71.2833 C85 C
2 3 1 3 ... 7.9250 NaN S
3 4 1 1 ... 53.1000 C123 S
4 5 0 3 ... 8.0500 NaN S
.. ... ... ... ... ... ... ...
886 887 0 2 ... 13.0000 NaN S
887 888 1 1 ... 30.0000 B42 S
888 889 0 3 ... 23.4500 NaN S
889 890 1 1 ... 30.0000 C148 C
890 891 0 3 ... 7.7500 NaN Q
[891 rows x 12 columns]
PassengerId int64
Survived int64
Pclass int64
Name object
Sex object
Age float64
SibSp int64
Parch int64
Ticket object
Fare float64
Cabin object
Embarked object
dtype: object
# Cast every column to the pandas "string" dtype to simulate data that arrived as text.
dfstring = df.astype("string")
print(dfstring)
print(dfstring.dtypes)
PassengerId Survived Pclass ... Fare Cabin Embarked
0 1 0 3 ... 7.25 <NA> S
1 2 1 1 ... 71.2833 C85 C
2 3 1 3 ... 7.925 <NA> S
3 4 1 1 ... 53.1 C123 S
4 5 0 3 ... 8.05 <NA> S
.. ... ... ... ... ... ... ...
886 887 0 2 ... 13.0 <NA> S
887 888 1 1 ... 30.0 B42 S
888 889 0 3 ... 23.45 <NA> S
889 890 1 1 ... 30.0 C148 C
890 891 0 3 ... 7.75 <NA> Q
[891 rows x 12 columns]
PassengerId string
Survived string
Pclass string
Name string
Sex string
Age string
SibSp string
Parch string
Ticket string
Fare string
Cabin string
Embarked string
dtype: object
converted = dfstring.ds_string_to_best_dtype()
print(converted)
print(converted.dtypes)
PassengerId Survived Pclass ... Fare Cabin Embarked
0 1 0 3 ... 7.2500 <NA> S
1 2 1 1 ... 71.2833 C85 C
2 3 1 3 ... 7.9250 <NA> S
3 4 1 1 ... 53.1000 C123 S
4 5 0 3 ... 8.0500 <NA> S
.. ... ... ... ... ... ... ...
886 887 0 2 ... 13.0000 <NA> S
887 888 1 1 ... 30.0000 B42 S
888 889 0 3 ... 23.4500 <NA> S
889 890 1 1 ... 30.0000 C148 C
890 891 0 3 ... 7.7500 <NA> Q
[891 rows x 12 columns]
PassengerId uint16
Survived uint8
Pclass uint8
Name string
Sex category
Age object
SibSp uint8
Parch uint8
Ticket object
Fare float64
Cabin category
Embarked category
dtype: object
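The package's internals are not shown here, so the following is only a rough sketch of how a comparable result could be reached with plain pandas. The helper name sketch_best_dtype and its max_unique_ratio threshold are hypothetical and are not part of a_pandas_ex_string_to_dtypes; the sketch also assumes a unique index and will not reproduce every choice in the output above (for example, it does not leave any numeric column as object).

```python
import pandas as pd


def sketch_best_dtype(series: pd.Series, max_unique_ratio: float = 0.5) -> pd.Series:
    """Hypothetical helper, not the package's code: pick a compact dtype."""
    # Work on the non-missing values as plain Python objects.
    present = series.dropna().astype(object)
    if present.empty:
        return series

    # Unparseable entries become NaN, so "all parsed" means the column is numeric.
    parsed = pd.to_numeric(present, errors="coerce")
    if parsed.notna().all():
        if len(present) == len(series):
            # No gaps: shrink non-negative whole numbers to the smallest unsigned int.
            return pd.to_numeric(parsed, downcast="unsigned")
        # Gaps present: reindexing restores them as NaN, which forces float64.
        return parsed.reindex(series.index)

    # Few distinct values relative to the length: store as a categorical.
    if present.nunique() / len(present) <= max_unique_ratio:
        return series.astype("category")
    return series.astype("string")


# Small made-up frame in which every column is stored as text.
strings = pd.DataFrame(
    {
        "Pclass": ["3", "1", "3", "1", "3", "2"],
        "Fare": ["7.25", "71.2833", "7.925", "53.1", "8.05", "13.0"],
        "Sex": ["male", "female", "female", "female", "male", "male"],
    },
    dtype="string",
)
converted_sketch = pd.DataFrame(
    {name: sketch_best_dtype(strings[name]) for name in strings.columns}
)
print(converted_sketch.dtypes)
```

The order of the checks matters in this sketch: numeric parsing is tried first, so a column of digit strings with few distinct values (such as Pclass) ends up as an unsigned integer and not as a category.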
Parameters:
df: Union[pd.DataFrame, pd.Series]
Returns:
Union[pd.DataFrame, pd.Series]
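One way to see what a conversion like this buys is to compare memory_usage(deep=True) before and after. The sketch below uses a small synthetic frame and plain pandas casts as a stand-in for the package call, so it runs without the package installed; the data and variable names are made up for illustration.

```python
import pandas as pd

# Synthetic stand-in for a frame whose columns were all read as text.
as_strings = pd.DataFrame(
    {
        "Pclass": ["1", "2", "3"] * 300,
        "Sex": ["male", "female", "male"] * 300,
    },
    dtype="string",
)

# Plain pandas casts standing in for the package's conversion.
compact = pd.DataFrame(
    {
        "Pclass": pd.to_numeric(as_strings["Pclass"].astype(object), downcast="unsigned"),
        "Sex": as_strings["Sex"].astype("category"),
    }
)

# deep=True counts the Python string objects, not just the pointers to them.
bytes_before = int(as_strings.memory_usage(deep=True).sum())
bytes_after = int(compact.memory_usage(deep=True).sum())
print(f"as strings: {bytes_before} bytes, compact: {bytes_after} bytes")
```

With the package installed, the same comparison could be run on dfstring and converted from the usage example.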