Pandas extension with NLP functionalities
NLP Pandas
pandas-nlp is a pandas extension that adds NLP functionality to string columns through the nlp accessor: language detection, string embedding, and sentence splitting.
Installation
Install with:
pip install -U pandas-nlp
Requirements
- Python >= 3.8
Key features
Language detection
import pandas as pd
import pandas_nlp
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "text": [
        "I like cats",
        "Me gustan los gatos",
        "M'agraden els gats",
        "J'aime les chats",
        "Ich mag Katzen",
    ],
})
df.text.nlp.language()
Output
0 en
1 es
2 ca
3 fr
4 de
Name: text_language, dtype: object
df.text.nlp.language(confidence=True).apply(pd.Series)
Output
  language  confidence
0       en    0.897090
1       es    0.982045
2       ca    0.999806
3       fr    0.999713
4       de    0.997995
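The confidence column can be used to discard uncertain detections. The following is a minimal sketch, not part of the library: it rebuilds the table from the output above by hand so that it runs without pandas-nlp installed, and the `MIN_CONFIDENCE` threshold is a hypothetical value chosen only for illustration.

```python
import pandas as pd

# Values copied from the output above, so the sketch runs without pandas-nlp.
detected = pd.DataFrame({
    "language": ["en", "es", "ca", "fr", "de"],
    "confidence": [0.897090, 0.982045, 0.999806, 0.999713, 0.997995],
})

# Hypothetical cut-off; tune it for your own data.
MIN_CONFIDENCE = 0.9

# Boolean mask: keep only the rows whose confidence meets the threshold.
reliable = detected[detected["confidence"] >= MIN_CONFIDENCE]
print(reliable)
```

With real data, `detected` would presumably be the result of `df.text.nlp.language(confidence=True).apply(pd.Series)`, assuming it yields the two columns shown above.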
String embedding
import pandas as pd
import pandas_nlp
df = pd.DataFrame(
    {"id": [1, 2, 3], "text": ["cat", "dog", "violin"]}
)
df.text.nlp.embedding()
Output
0 [2.0860276, 0.78038394, 0.20159146, -1.2828196...
1 [0.96052396, 1.0350337, 0.11549556, -1.2252672...
2 [1.2934866, 0.10021937, 0.71453714, -1.3288003...
Name: text_embedding, dtype: object
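A common next step with embeddings is to compare strings by cosine similarity. The sketch below assumes each element of the embedding Series is a list of floats, as the output above suggests. It uses made-up 3-dimensional vectors instead of real embeddings (the real values are truncated above), and the string labels in the index are added only for readability.

```python
import numpy as np
import pandas as pd

# Toy vectors standing in for real embeddings (made-up values, not library output).
emb = pd.Series(
    [[1.0, 0.0, 0.0], [0.8, 0.6, 0.0], [0.0, 0.0, 1.0]],
    index=["cat", "dog", "violin"],
    name="text_embedding",
)

# Stack the per-row lists into a 2-D array: one row per string.
matrix = np.vstack(emb.to_list())

# Normalize each row to unit length; the dot product of unit vectors
# is their cosine similarity.
unit = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
similarity = pd.DataFrame(unit @ unit.T, index=emb.index, columns=emb.index)
print(similarity.round(3))
```

Values close to 1 indicate vectors pointing in nearly the same direction. Whether real embeddings place "cat" closer to "dog" than to "violin" depends on the underlying model.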
Sentence splitting
import pandas as pd
import pandas_nlp
df = pd.DataFrame(
    {"id": [0, 1], "text": ["Hello, how are you?", "Code. Sleep. Eat"]}
)
df.text.nlp.sentences()
Output
0 [Hello, how are you?]
1 [Code., Sleep., Eat]
Name: text_sentences, dtype: object
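Since each row holds a list of sentences, it is often convenient to reshape the result to one sentence per row. The sketch below uses only standard pandas and rebuilds the output above by hand, assuming the list elements are plain strings. The `sentence` column name is a hypothetical choice.

```python
import pandas as pd

# Sentence lists copied from the output above, so the sketch runs without pandas-nlp.
df = pd.DataFrame({
    "id": [0, 1],
    "text_sentences": [["Hello, how are you?"], ["Code.", "Sleep.", "Eat"]],
})

# explode() emits one row per list element and repeats the other columns.
long_df = df.explode("text_sentences", ignore_index=True)
long_df = long_df.rename(columns={"text_sentences": "sentence"})
print(long_df)
```

The `id` column is repeated for every sentence, so each sentence can still be traced back to its source row.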