chicksexer - Python package for gender classification
=================================================================
![Chicksexer](images/chicksexer.jpg?raw=true "Title")
`chicksexer` is a Python package that performs **gender classification**. It takes a person's name as a string and returns a probability estimate for each gender:
```python
>>> from chicksexer import predict_gender
>>> predict_gender('John Smith')
{'female': 0.0027230381965637207, 'male': 0.9972769618034363}
```
Using a classifier has several advantages over simply looking up a name in a list of known male/female names:
* Simple name lookup sometimes fails. For instance, "Ryu" is likely to be a male first name when it is followed by a Japanese surname, but it can also be a Korean surname, in which case it is gender neutral.
* It can predict the gender of a name that does not appear in any list of male/female names.
* It can handle a typo in a name relatively easily.
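
To make the limitations listed above concrete, here is a minimal sketch of a naive lookup approach. It is not part of `chicksexer`; the table `KNOWN_FIRST_NAMES` and the function `lookup_gender` are hypothetical names with illustrative entries.

```python
# Hypothetical lookup table; the entries are illustrative, not real data.
KNOWN_FIRST_NAMES = {"john": "male", "mary": "female", "ryu": "male"}


def lookup_gender(full_name):
    """Return a label based only on the first token of the name."""
    tokens = full_name.lower().split()
    if not tokens:
        return "unknown"
    return KNOWN_FIRST_NAMES.get(tokens[0], "unknown")


print(lookup_gender("Ryu Ito"))       # male
print(lookup_gender("Ryu Seo-yeon"))  # male, although "Ryu" is the surname here
print(lookup_gender("Jonh Smith"))    # unknown, because of the typo
```

The lookup ignores the rest of the name and fails on any spelling it has not seen, which are the cases the classifier is meant to handle.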
You can also get an estimate as a simple string as follows:
```python
>>> predict_gender('Oliver Butterfield', return_proba=False)
'male'
>>> predict_gender('Naila Ata', return_proba=False)
'female'
>>> predict_gender('Saldivar Anderson', return_proba=False)
'neutral'
>>> predict_gender('Ponyo', return_proba=False) # name of a character from the film
'neutral'
>>> predict_gender('Ponya', return_proba=False) # modify the name such that it sounds like a female name
'female'
>>> predict_gender('Ryu Ito', return_proba=False) # Ryu here is a Japanese first name
'male'
>>> predict_gender('Ryu Seo-yeon', return_proba=False) # Ryu is a Korean surname, Seo-yeon is a popular first name for girls
'female'
```
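
If you prefer to control when a name counts as `'neutral'`, you can derive the label from the probability dictionary yourself. The sketch below is only an illustration: the helper `to_label` and the `0.8` threshold are assumptions, and the rule `chicksexer` uses internally for `return_proba=False` may differ.

```python
def to_label(proba, threshold=0.8):
    """Map {'female': p, 'male': q} to 'male', 'female', or 'neutral'.

    The threshold is an assumed value for illustration, not the package's.
    """
    if proba["male"] >= threshold:
        return "male"
    if proba["female"] >= threshold:
        return "female"
    return "neutral"


# Probabilities for 'John Smith' as shown earlier in this README.
print(to_label({'female': 0.0027230381965637207, 'male': 0.9972769618034363}))
# Made-up probabilities for an ambiguous name.
print(to_label({'female': 0.45, 'male': 0.55}))
```

Raising the threshold makes the helper more conservative, so more names fall into `'neutral'`.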
To predict the gender of multiple names, use the `predict_genders` (plural) function instead:
```python
>>> from chicksexer import predict_genders
>>> predict_genders(['Ichiro Suzuki', 'Haruki Murakami'])
[{'female': 3.039836883544922e-05, 'male': 0.9999696016311646},
{'female': 1.2040138244628906e-05, 'male': 0.9999879598617554}]
>>> predict_genders(['Ichiro Suzuki', 'Haruki Murakami'], return_proba=False)
['male', 'male']
```
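
When working with a batch, it is often convenient to keep each name next to its label. The following sketch assumes that `predict_genders` returns results in the same order as its input, as the example above suggests. The helper `label_names` and the stand-in `fake_predict_genders` are hypothetical and exist only so the snippet runs without the package installed.

```python
def label_names(names, predict_fn):
    """Pair each name with its predicted label, skipping blank entries."""
    cleaned = [name.strip() for name in names if name and name.strip()]
    if not cleaned:
        return {}
    labels = predict_fn(cleaned, return_proba=False)
    # Duplicate names collapse into a single dictionary key.
    return dict(zip(cleaned, labels))


# With the package installed, you would pass the real function:
#   from chicksexer import predict_genders
#   label_names(['Ichiro Suzuki', 'Haruki Murakami'], predict_genders)


def fake_predict_genders(names, return_proba=True):
    """Stand-in predictor that labels every name 'male' (illustration only)."""
    return ['male' for _ in names]


print(label_names(['Ichiro Suzuki', ' Haruki Murakami ', ''], fake_predict_genders))
```

Passing the predictor as an argument keeps the helper easy to test and makes a single batched call instead of one call per name.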
Installation
------------
- This repository runs on Ubuntu 14.04 LTS and Mac OS X 10.x (not tested on other operating systems)
- Tested only on Python 3.5
`chicksexer` depends on [NumPy and SciPy](https://www.scipy.org/install.html), Python packages for scientific computing. You might need to install them before installing `chicksexer`.
You can install `chicksexer` with:
```pip install chicksexer```
`chicksexer` also depends on the `tensorflow` package. By default, it tries to install the CPU-only version of `tensorflow`. If you want to use a GPU, you need to install `tensorflow` with GPU support yourself. (See [Installing TensorFlow](https://www.tensorflow.org/install/).)
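
Before installing, you may want to check which of the dependencies mentioned above are already available in your environment. This is a small sketch using only the standard library; the helper name `missing_packages` is hypothetical, and it checks only whether each module can be found, not its version or GPU support.

```python
import importlib.util


def missing_packages(module_names):
    """Return the top-level module names that cannot be found."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


missing = missing_packages(["numpy", "scipy", "tensorflow"])
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All dependencies found.")
```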
Built Distribution
------------------
`chicksexer-0.1.0-py3-none-any.whl` (34.0 MB)

Hashes for `chicksexer-0.1.0-py3-none-any.whl`:

Algorithm | Hash digest
---|---
SHA256 | d14ce9ba71a4ceae42f63e26063abed43c124bd76140381ba69ea78906b9f388
MD5 | 24373bfbdba0509f5f77fbe88c5a09e0
BLAKE2b-256 | 60b88d3930d226caf18da5f4f852ebcc5cc1144feb8fea2b1d53dd100644a8dd