Python package for loading Stanford Sentiment Treebank corpus
SST Utils
---------
Utilities for loading and visualizing Stanford Sentiment Treebank.
See examples below for usage.
@author Jonathan Raiman
JavaScript code by Jason Chuang and Stanford NLP, modified from the [Stanford NLP Sentiment Analysis demo](http://nlp.stanford.edu:8080/sentiment/rntnDemo.html).
### Visualization
Trees can be rendered inside an IPython notebook using Jason Chuang's JavaScript and CSS:
```python
import pytreebank
# load the sentiment treebank corpus in the parenthesis format,
# e.g. "(4 (2 very ) (3 good))"
dataset = pytreebank.import_tree_corpus("train.txt")
# add JavaScript and CSS to the IPython notebook
pytreebank.LabeledTree.inject_visualization_javascript()
# select an example to visualize
example = dataset[0]
# display it in the page
example.display()
```
![Example visualization using pytreebank](visualization_example.png)
### Lines and Labels
To extract the labeled spans of a tree, call the `to_labeled_lines` or `to_lines` method of a `LabeledTree`. Each returns a list with one entry per span, and the first entry is always the root sentence:
```python
import pytreebank
dataset = pytreebank.import_tree_corpus("train.txt")
example = dataset[0]
# extract spans from the tree.
for label, sentence in example.to_labeled_lines():
    print("%s has sentiment label %s" % (
        sentence,
        ["very negative", "negative", "neutral", "positive", "very positive"][label]
    ))
```
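To make the relationship between the parenthesis format and the labeled spans concrete, here is a minimal, self-contained sketch that parses a string such as `(4 (2 very ) (3 good))` and lists every span with its label, root first. It does not use pytreebank and is not its implementation: the helper names `parse_tree`, `span_text` and `labeled_spans` are hypothetical, and the order of the non-root spans is an assumption of this sketch.

```python
# Illustrative only: a tiny parser for the parenthesis format.
# parse_tree, span_text and labeled_spans are hypothetical helpers,
# not part of the pytreebank API.

def parse_tree(text):
    """Parse '(label child child)' or '(label word)' into nested dicts."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def parse_node():
        nonlocal pos
        assert tokens[pos] == "(", "every node starts with '('"
        pos += 1
        label = int(tokens[pos])  # sentiment label, 0 to 4
        pos += 1
        children, words = [], []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(parse_node())
            else:
                words.append(tokens[pos])
                pos += 1
        pos += 1  # consume the closing ')'
        return {"label": label, "words": words, "children": children}

    return parse_node()


def span_text(node):
    """Join the words covered by a node, left to right."""
    if node["children"]:
        return " ".join(span_text(child) for child in node["children"])
    return " ".join(node["words"])


def labeled_spans(node):
    """Return (label, text) pairs in pre-order, so the root comes first."""
    spans = [(node["label"], span_text(node))]
    for child in node["children"]:
        spans.extend(labeled_spans(child))
    return spans


tree = parse_tree("(4 (2 very ) (3 good))")
spans = labeled_spans(tree)
for label, text in spans:
    print(label, text)
```

Every node of the tree contributes one span, so a single sentence yields one labeled training example per phrase, which is what makes the treebank useful for phrase-level sentiment models.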