Data access and analysis of baby names statistics

babe

Note that the first time you import babe, you need Internet access, and the import will take a few seconds (depending on bandwidth) while the required data is downloaded.

The data is then automatically saved to a local file, so subsequent imports are faster.
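The download-once, reuse-later behaviour described above can be illustrated with a minimal sketch. This is not babe's actual implementation: `load_with_cache`, `fake_download`, and the pickle cache file are hypothetical names and choices made for illustration, and the download is replaced by a stand-in function so the sketch runs offline.

```python
import pickle
import tempfile
from pathlib import Path


def load_with_cache(cache_path, fetch):
    """Return the cached data if the file exists; otherwise call fetch() and cache it."""
    cache_path = Path(cache_path)
    if cache_path.exists():
        # Fast path: the data was saved by an earlier call.
        with cache_path.open("rb") as f:
            return pickle.load(f)
    data = fetch()  # slow step, e.g. a download
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    with cache_path.open("wb") as f:
        pickle.dump(data, f)
    return data


# Hypothetical stand-in for the real download; it records how often it is called.
calls = []


def fake_download():
    calls.append(1)
    return {"Cora": {1911: 8, 1912: 9}}


cache_file = Path(tempfile.mkdtemp()) / "names.pkl"
first = load_with_cache(cache_file, fake_download)   # fetches and writes the cache
second = load_with_cache(cache_file, fake_download)  # served from the local file
```

The second call never reaches `fake_download`, which is the reason later imports would be faster under this scheme.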

from babe import names_by_us_states, names_all_us_states

names_all_us_states

This data frame gives the popularity of names of babies born in the US between 1910 and 2019, indexed by name and year.

names_all_us_states
name     year  popularity
Aaban    2013           6
         2014           6
Aadam    2019           6
Aadan    2008          12
         2009           6
...      ...          ...
Zyriah   2013           7
         2014           6
         2016           5
Zyron    2015           5
Zyshonne 1998           5

594681 rows × 1 columns

names = set(names_all_us_states.reset_index()['name'].values)
print(f"{len(names)} unique names")
31862 unique names
years = set(names_all_us_states.reset_index()['year'])
print(f"Popularity stats cover years {min(years)} through {max(years)} (or subset thereof, depending on the name)")
Popularity stats cover years 1910 through 2019 (or subset thereof, depending on the name)
names_all_us_states.loc['Vanessa'].plot(figsize=(15, 4), style='-o', grid=True)
<AxesSubplot:xlabel='year'>

[Plot: popularity of the name Vanessa by year]

names_all_us_states.loc['Cora'].plot(figsize=(15, 4), style='-o', grid=True)
<AxesSubplot:xlabel='year'>

[Plot: popularity of the name Cora by year]
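The lookups above rely on the data frame being indexed by (name, year). The sketch below rebuilds that index shape on a toy frame, using only the first five rows of the table shown earlier, so it runs without downloading anything. The variable names (`toy`, `aadan`, `peak_year`, and so on) are illustrative and not part of babe.

```python
import pandas as pd

# Toy rows copied from the head of the names_all_us_states table above.
rows = [
    ("Aaban", 2013, 6),
    ("Aaban", 2014, 6),
    ("Aadam", 2019, 6),
    ("Aadan", 2008, 12),
    ("Aadan", 2009, 6),
]
toy = (
    pd.DataFrame(rows, columns=["name", "year", "popularity"])
    .set_index(["name", "year"])
    .sort_index()
)

# Selecting on the first index level leaves a frame indexed by year,
# which is why .loc['Vanessa'].plot(...) puts year on the x-axis.
aadan = toy.loc["Aadan"]

# Year in which the name has its highest popularity value.
peak_year = aadan["popularity"].idxmax()

# Unique names taken straight from the index, without reset_index().
unique_names = toy.index.get_level_values("name").unique()

# All names recorded in one year: a cross-section on the second index level.
in_2013 = toy.xs(2013, level="year")
```

The same calls should apply to the full `names_all_us_states` frame, assuming it keeps the (name, year) index shown in its output.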

names_by_us_states

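This data frame provides the same popularity figures as above, broken down by state and with an additional gender column. It covers 51 state codes (presumably the 50 states plus the District of Columbia).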

names_by_us_states
state name     year  gender  popularity
AK    Mary     1910       F          14
      Annie    1910       F          12
      Anna     1910       F          10
      Margaret 1910       F           8
      Helen    1910       F           7
...   ...      ...      ...         ...
WY    Theo     2019       M           5
      Tristan  2019       M           5
      Vincent  2019       M           5
      Warren   2019       M           5
      Waylon   2019       M           5

6122890 rows × 2 columns

states = set(names_by_us_states.reset_index()['state'])
print(f"{len(states)} states")
51 states
names_by_us_states.loc['CA']
name     year  gender  popularity
Mary     1910       F         295
Helen    1910       F         239
Dorothy  1910       F         220
Margaret 1910       F         163
Frances  1910       F         134
...      ...      ...         ...
Zayvion  2019       M           5
Zeek     2019       M           5
Zhaire   2019       M           5
Zian     2019       M           5
Ziyad    2019       M           5

387781 rows × 2 columns

names_by_us_states.loc['CA'].loc['Cora']
year  gender  popularity
1911       F           8
1912       F           9
1913       F          15
1914       F          15
1915       F          17
...      ...         ...
2015       F         269
2016       F         244
2017       F         284
2018       F         282
2019       F         256

109 rows × 2 columns

names_by_us_states.loc['CA'].loc['Cora'].plot(figsize=(15, 4), style='-o', grid=True)
<AxesSubplot:xlabel='year'>

[Plot: popularity of the name Cora in California (CA) by year]

names_by_us_states.loc['GA'].loc['Cora'].plot(figsize=(15, 4), style='-o', grid=True)
<AxesSubplot:xlabel='year'>

[Plot: popularity of the name Cora in Georgia (GA) by year]
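Beyond chained `.loc` calls, the (state, name, year) index also supports cross-sections and group-wise aggregation. The sketch below is an illustration on a toy frame built from a few of the 1910 rows for AK and CA shown above. The variable names are hypothetical, and summing the toy rows is not claimed to reproduce `names_all_us_states`.

```python
import pandas as pd

# A few 1910 rows copied from the AK and CA tables above.
rows = [
    ("AK", "Mary", 1910, "F", 14),
    ("AK", "Annie", 1910, "F", 12),
    ("AK", "Helen", 1910, "F", 7),
    ("CA", "Mary", 1910, "F", 295),
    ("CA", "Helen", 1910, "F", 239),
    ("CA", "Dorothy", 1910, "F", 220),
]
toy = (
    pd.DataFrame(rows, columns=["state", "name", "year", "gender", "popularity"])
    .set_index(["state", "name", "year"])
    .sort_index()
)

# One name across all states: cross-section on the 'name' level,
# leaving a frame indexed by (state, year).
mary = toy.xs("Mary", level="name")

# Popularity summed over states, per (name, year).
totals = toy.groupby(level=["name", "year"])["popularity"].sum()

# Names in one state for one year, highest popularity first.
ca_1910 = (
    toy.loc["CA"]
    .xs(1910, level="year")
    .sort_values("popularity", ascending=False)
)
```

Note that the group-wise sum here ignores the gender column, so a name recorded for both genders in the same state and year would be merged into one total.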


