Convert a Pandas DataFrame/Series with dtype str/string/object to the best available dtypes

What is it used for?

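Data loaded into pandas often arrives as text (dtype str, string, or object) even when the values are really numbers or a small set of repeated labels. This package adds a method, ds_string_to_best_dtype, that converts such a DataFrame or Series to better-fitting dtypes, for example uint8, float64, or category.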

Installation

pip install a-pandas-ex-string-to-dtypes

Usage

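    from a_pandas_ex_string_to_dtypes import pd_add_string_to_dtypes
    import pandas as pd

    # Patch pandas so that ds_string_to_best_dtype (used further down) is available
    pd_add_string_to_dtypes()

    # Load the Titanic sample data; pandas infers numeric dtypes while reading
    df = pd.read_csv("https://github.com/pandas-dev/pandas/raw/main/doc/data/titanic.csv")
    print(df)
    print(df.dtypes)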

         PassengerId  Survived  Pclass  ...     Fare Cabin  Embarked
    0              1         0       3  ...   7.2500   NaN         S
    1              2         1       1  ...  71.2833   C85         C
    2              3         1       3  ...   7.9250   NaN         S
    3              4         1       1  ...  53.1000  C123         S
    4              5         0       3  ...   8.0500   NaN         S
    ..           ...       ...     ...  ...      ...   ...       ...
    886          887         0       2  ...  13.0000   NaN         S
    887          888         1       1  ...  30.0000   B42         S
    888          889         0       3  ...  23.4500   NaN         S
    889          890         1       1  ...  30.0000  C148         C
    890          891         0       3  ...   7.7500   NaN         Q
    [891 rows x 12 columns]

    PassengerId      int64
    Survived         int64
    Pclass           int64
    Name            object
    Sex             object
    Age            float64
    SibSp            int64
    Parch            int64
    Ticket          object
    Fare           float64
    Cabin           object
    Embarked        object
    dtype: object

    # Cast every column to the "string" dtype to simulate all-text input
    dfstring = pd.concat(
        [df[x].astype("string") for x in df.columns], axis=1, ignore_index=True
    )
    dfstring.columns = df.columns
    print(dfstring)
    print(dfstring.dtypes)

        PassengerId Survived Pclass  ...     Fare Cabin Embarked
    0             1        0      3  ...     7.25  <NA>        S
    1             2        1      1  ...  71.2833   C85        C
    2             3        1      3  ...    7.925  <NA>        S
    3             4        1      1  ...     53.1  C123        S
    4             5        0      3  ...     8.05  <NA>        S
    ..          ...      ...    ...  ...      ...   ...      ...
    886         887        0      2  ...     13.0  <NA>        S
    887         888        1      1  ...     30.0   B42        S
    888         889        0      3  ...    23.45  <NA>        S
    889         890        1      1  ...     30.0  C148        C
    890         891        0      3  ...     7.75  <NA>        Q
    [891 rows x 12 columns]

    PassengerId    string
    Survived       string
    Pclass         string
    Name           string
    Sex            string
    Age            string
    SibSp          string
    Parch          string
    Ticket         string
    Fare           string
    Cabin          string
    Embarked       string
    dtype: object

    # Convert the all-string frame; each column gets the dtype chosen by the package
    converted = dfstring.ds_string_to_best_dtype()
    print(converted)
    print(converted.dtypes)

         PassengerId  Survived  Pclass  ...     Fare Cabin Embarked
    0              1         0       3  ...   7.2500  <NA>        S
    1              2         1       1  ...  71.2833   C85        C
    2              3         1       3  ...   7.9250  <NA>        S
    3              4         1       1  ...  53.1000  C123        S
    4              5         0       3  ...   8.0500  <NA>        S
    ..           ...       ...     ...  ...      ...   ...      ...
    886          887         0       2  ...  13.0000  <NA>        S
    887          888         1       1  ...  30.0000   B42        S
    888          889         0       3  ...  23.4500  <NA>        S
    889          890         1       1  ...  30.0000  C148        C
    890          891         0       3  ...   7.7500  <NA>        Q
    [891 rows x 12 columns]

    PassengerId      uint16
    Survived          uint8
    Pclass            uint8
    Name             string
    Sex            category
    Age              object
    SibSp             uint8
    Parch             uint8
    Ticket           object
    Fare            float64
    Cabin          category
    Embarked       category
    dtype: object
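
One plausible reason to prefer dtypes such as uint8 and category over text is memory. The sketch below does not use this package; it builds a small stand-in frame (the column values are made up for illustration), converts it with plain pandas calls, and compares the footprint reported by memory_usage(deep=True).

```python
import pandas as pd

# Stand-in data (an assumption for illustration): 1000 rows of text values.
as_text = pd.DataFrame(
    {
        "Survived": ["0", "1", "1", "0"] * 250,
        "Sex": ["male", "female", "female", "male"] * 250,
    },
    dtype=object,
)

# Convert with plain pandas: numeric text to the smallest unsigned integer
# type that fits, repeated labels to a categorical column.
compact = pd.DataFrame(
    {
        "Survived": pd.to_numeric(as_text["Survived"], downcast="unsigned"),
        "Sex": as_text["Sex"].astype("category"),
    }
)

# deep=True also counts the Python string objects behind object columns.
text_bytes = int(as_text.memory_usage(deep=True).sum())
compact_bytes = int(compact.memory_usage(deep=True).sum())
print(f"text: {text_bytes} bytes, compact: {compact_bytes} bytes")
```

The exact byte counts depend on the pandas and Python versions, so only the relative difference is meaningful here.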

    

    

        Parameters:
            df: Union[pd.DataFrame, pd.Series]
        Returns:
            Union[pd.DataFrame, pd.Series]
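
To give a feel for the kind of decision such a conversion involves, here is a minimal sketch written with plain pandas. It is not the implementation used by this package: the helper name shrink_text_column, its max_unique_ratio parameter, and the rule for choosing category over string are all assumptions made for illustration, and the sample values are made up.

```python
import pandas as pd


def shrink_text_column(col: pd.Series, max_unique_ratio: float = 0.5) -> pd.Series:
    """Hypothetical helper, NOT part of a_pandas_ex_string_to_dtypes."""
    try:
        # Parse the text as numbers. downcast="unsigned" then picks the
        # smallest unsigned integer type that holds every value; columns
        # with fractions, negatives, or missing values stay int64/float64.
        return pd.to_numeric(col, downcast="unsigned")
    except (ValueError, TypeError):
        pass  # at least one value is not numeric text
    # Assumed rule of thumb: few distinct values -> category, else string.
    if col.nunique() / max(len(col), 1) <= max_unique_ratio:
        return col.astype("category")
    return col.astype("string")


# Made-up sample values, stored as text on purpose.
text = pd.DataFrame(
    {
        "PassengerId": ["1", "2", "3", "891"],
        "Survived": ["0", "1", "1", "0"],
        "Fare": ["7.25", "71.2833", "7.925", "53.1"],
        "Sex": ["male", "female", "female", "male"],
        "Name": ["Braund", "Cumings", "Heikkinen", "Futrelle"],
    },
    dtype=object,
)
shrunk = pd.DataFrame({name: shrink_text_column(text[name]) for name in text.columns})
print(shrunk.dtypes)
```

In this sketch PassengerId ends up as uint16 because 891 does not fit in uint8, while Survived fits in uint8 and Fare stays float64. The 0.5 threshold for category is an arbitrary choice; the package may well use different rules.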

This version

0.1
