Data access and analysis of baby names statistics
Project description
babe
Note that the first time you import the names data, you need Internet access, and the download of the required data takes a few seconds (depending on bandwidth).
The data is then saved automatically to a local file, so subsequent imports are faster.
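The download-once-then-reuse behaviour described above can be illustrated with a minimal sketch. This is not babe's actual implementation: `load_with_cache`, `fake_fetch`, and the JSON cache file are hypothetical names and choices made only for illustration, and the fetch step is a stand-in for the real download.

```python
import json
import tempfile
from pathlib import Path


def load_with_cache(cache_path, fetch):
    """Return cached data if present; otherwise call fetch() and save the result."""
    cache_path = Path(cache_path)
    if cache_path.exists():
        # Later calls: read the local copy, no network needed.
        return json.loads(cache_path.read_text())
    data = fetch()  # First call: the slow step (for example, a download).
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    cache_path.write_text(json.dumps(data))
    return data


# Hypothetical stand-in for the real download; it records how often it runs.
calls = []


def fake_fetch():
    calls.append(1)
    return {"Aaban": {"2013": 6, "2014": 6}}


cache_file = Path(tempfile.mkdtemp()) / "names.json"
first = load_with_cache(cache_file, fake_fetch)   # fetches and writes the file
second = load_with_cache(cache_file, fake_fetch)  # reads the file instead
print(len(calls))  # the fetch ran only once
```

Passing the fetch step in as a function keeps the caching logic independent of where the data comes from, which also makes it easy to try out without a network connection.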
from babe import names_by_us_states, names_all_us_states
names_all_us_states
This data frame gives the popularity of names of babies born in the US between 1910 and 2019, indexed by name and year.
names_all_us_states
name | year | popularity |
---|---|---|
Aaban | 2013 | 6 |
Aaban | 2014 | 6 |
Aadam | 2019 | 6 |
Aadan | 2008 | 12 |
Aadan | 2009 | 6 |
... | ... | ... |
Zyriah | 2013 | 7 |
Zyriah | 2014 | 6 |
Zyriah | 2016 | 5 |
Zyron | 2015 | 5 |
Zyshonne | 1998 | 5 |
594681 rows × 1 columns
names = set(names_all_us_states.reset_index()['name'].values)
print(f"{len(names)} unique names")
31862 unique names
years = set(names_all_us_states.reset_index()['year'])
print(f"Popularity stats cover years {min(years)} through {max(years)} (or subset thereof, depending on the name)")
Popularity stats cover years 1910 through 2019 (or subset thereof, depending on the name)
names_all_us_states.loc['Vanessa'].plot(figsize=(15, 4), style='-o', grid=True)
<AxesSubplot:xlabel='year'>
names_all_us_states.loc['Cora'].plot(figsize=(15, 4), style='-o', grid=True)
<AxesSubplot:xlabel='year'>
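Beyond plotting, the (name, year) index lends itself to a few other common queries. The sketch below avoids the download by rebuilding a tiny frame from rows that appear in the excerpt of `names_all_us_states` shown earlier; `sample` and the derived variable names are illustrative only, and the same calls should apply to the full data frame assuming it has the structure displayed.

```python
import pandas as pd

# Rows copied from the excerpt of names_all_us_states shown earlier.
rows = [
    ("Aaban", 2013, 6),
    ("Aaban", 2014, 6),
    ("Aadam", 2019, 6),
    ("Aadan", 2008, 12),
    ("Aadan", 2009, 6),
]
sample = pd.DataFrame(rows, columns=["name", "year", "popularity"]).set_index(
    ["name", "year"]
)

# Selecting on the first index level leaves a frame indexed by year.
aadan = sample.loc["Aadan"]

# Year in which the name was most popular (within this sample).
peak_year = aadan["popularity"].idxmax()

# Total popularity per name across all years in the sample.
totals = sample.groupby(level="name")["popularity"].sum()

# Unique names, read directly from the index without reset_index().
unique_names = sample.index.get_level_values("name").unique()

print(peak_year)
print(totals.to_dict())
print(list(unique_names))
```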
names_by_us_states
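This data frame provides the same information as above, broken down by state and with an additional gender column. 51 state codes are covered (presumably the 50 states plus the District of Columbia).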
names_by_us_states
state | name | year | gender | popularity |
---|---|---|---|---|
AK | Mary | 1910 | F | 14 |
AK | Annie | 1910 | F | 12 |
AK | Anna | 1910 | F | 10 |
AK | Margaret | 1910 | F | 8 |
AK | Helen | 1910 | F | 7 |
... | ... | ... | ... | ... |
WY | Theo | 2019 | M | 5 |
WY | Tristan | 2019 | M | 5 |
WY | Vincent | 2019 | M | 5 |
WY | Warren | 2019 | M | 5 |
WY | Waylon | 2019 | M | 5 |
6122890 rows × 2 columns
states = set(names_by_us_states.reset_index()['state'])
print(f"{len(states)} states")
51 states
names_by_us_states.loc['CA']
name | year | gender | popularity |
---|---|---|---|
Mary | 1910 | F | 295 |
Helen | 1910 | F | 239 |
Dorothy | 1910 | F | 220 |
Margaret | 1910 | F | 163 |
Frances | 1910 | F | 134 |
... | ... | ... | ... |
Zayvion | 2019 | M | 5 |
Zeek | 2019 | M | 5 |
Zhaire | 2019 | M | 5 |
Zian | 2019 | M | 5 |
Ziyad | 2019 | M | 5 |
387781 rows × 2 columns
names_by_us_states.loc['CA'].loc['Cora']
year | gender | popularity |
---|---|---|
1911 | F | 8 |
1912 | F | 9 |
1913 | F | 15 |
1914 | F | 15 |
1915 | F | 17 |
... | ... | ... |
2015 | F | 269 |
2016 | F | 244 |
2017 | F | 284 |
2018 | F | 282 |
2019 | F | 256 |
109 rows × 2 columns
names_by_us_states.loc['CA'].loc['Cora'].plot(figsize=(15, 4), style='-o', grid=True)
<AxesSubplot:xlabel='year'>
names_by_us_states.loc['GA'].loc['Cora'].plot(figsize=(15, 4), style='-o', grid=True)
<AxesSubplot:xlabel='year'>
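Chained `.loc` calls select one state at a time. To look at one name across all states at once, or to combine states, a cross-section and a group-by on the index levels are one option. The sketch below is built from 1910 rows copied from the AK and CA excerpts shown earlier rather than from the downloaded data; `sample` and the derived variable names are illustrative, and the (state, name, year) index layout is assumed from the displayed tables.

```python
import pandas as pd

# 1910 rows copied from the AK and CA excerpts shown earlier.
rows = [
    ("AK", "Mary", 1910, "F", 14),
    ("AK", "Margaret", 1910, "F", 8),
    ("AK", "Helen", 1910, "F", 7),
    ("CA", "Mary", 1910, "F", 295),
    ("CA", "Helen", 1910, "F", 239),
    ("CA", "Margaret", 1910, "F", 163),
]
sample = (
    pd.DataFrame(rows, columns=["state", "name", "year", "gender", "popularity"])
    .set_index(["state", "name", "year"])
    .sort_index()
)

# Cross-section on the 'name' level: one name across every state.
mary = sample.xs("Mary", level="name")

# One column per state, one row per year, for side-by-side comparison.
mary_by_state = mary["popularity"].unstack("state")

# Sum over the states in the sample to get one value per (name, year).
combined = sample.groupby(level=["name", "year"])["popularity"].sum()

print(mary_by_state)
print(combined.to_dict())
```

Calling `.plot()` on a frame shaped like `mary_by_state` draws one line per state on a single chart, which can be easier to compare than separate per-state plots.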