medspaCy NLP pipeline for detecting patient housing stability.
ReHouSED NLP
Overview
This package is a medspaCy implementation of an NLP system for identifying patient housing stability in clinical texts. The system was originally developed in the Department of Veterans Affairs to study housing outcomes of Veterans participating in the Supportive Services for Veteran Families (SSVF) program. Its development and validation are described in "ReHouSED: A Novel Measurement of Veteran Housing Stability Using Natural Language Processing" by Chapman et al. (accepted and in press).
This system attempts to classify housing stability at two levels:
- Document-level: Each document processed by the NLP is classified as one of "STABLY_HOUSED", "UNSTABLY_HOUSED", or "UNKNOWN".
- Patient-level: A set of documents over a period of time is processed and aggregated to the patient level, yielding a numeric score from 0 to 1 called "Relative Housing Stability in Electronic Documentation" (ReHouSED).
Detailed examples and explanations of the logic are provided in notebooks/.
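To make the relationship between the two levels concrete, here is a minimal sketch of how document-level labels could be rolled up into a 0-1 score. The formula used here (the share of unstable documents among those with a known classification) and the direction of the score are assumptions made for illustration, and the helper name unstable_fraction is hypothetical. The package's own calculate_rehoused function and the notebooks are the authoritative source for the actual logic.

```python
from collections import Counter
from typing import Iterable, Optional

# The three document-level labels named in this README.
STABLE = "STABLY_HOUSED"
UNSTABLE = "UNSTABLY_HOUSED"
UNKNOWN = "UNKNOWN"


def unstable_fraction(labels: Iterable[str]) -> Optional[float]:
    """Hypothetical aggregation: fraction of unstable documents among
    documents classified as stable or unstable.

    ASSUMPTION: this formula is illustrative only and is not taken from
    the rehoused_nlp source code.
    """
    counts = Counter(labels)
    informative = counts[STABLE] + counts[UNSTABLE]
    if informative == 0:
        # Only "UNKNOWN" documents (or none at all): no score can be computed.
        return None
    return counts[UNSTABLE] / informative


# Document labels for one patient over one time window.
labels = [STABLE, UNSTABLE, UNKNOWN, UNSTABLE]
score = unstable_fraction(labels)
print(score)
```

In this sketch, "UNKNOWN" documents are ignored, so a patient with only unclassifiable documents gets no score rather than a score of 0.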
Disclaimer
This system is an approximation of the system described in the manuscript and has been modified to exclude logic specific to VA documentation. It is far from perfect and will certainly make mistakes!
Installation
You can install rehoused_nlp using pip:
pip install rehoused-nlp
Or install from the source code in this repository:
python setup.py install
rehoused_nlp requires Python 3.7 or 3.8, medspaCy, and spaCy 2.2.X. spaCy 3 is not currently supported.
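Because the version constraints are narrow, it can help to check an environment before installing. The following is a small sketch of such a check. The function names parse_version and check_requirements are hypothetical helpers and are not part of rehoused_nlp.

```python
from typing import List, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a string such as '2.2.4' into (2, 2, 4).

    Parsing stops at the first component that does not start with a digit.
    """
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_requirements(python_version: Tuple[int, int], spacy_version: str) -> List[str]:
    """Return a list of human-readable problems (empty if none)."""
    problems = []
    # The README states Python 3.7 or 3.8.
    if python_version not in {(3, 7), (3, 8)}:
        problems.append("Python %d.%d is outside 3.7-3.8" % python_version)
    # The README states spaCy 2.2.X; spaCy 3 is not supported.
    if parse_version(spacy_version)[:2] != (2, 2):
        problems.append("spaCy %s is not a 2.2.X release" % spacy_version)
    return problems


print(check_requirements((3, 8), "2.2.4"))
print(check_requirements((3, 9), "3.0.1"))
```

In a real environment, the arguments could be supplied as sys.version_info[:2] and spacy.__version__.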
Quick start
Document-level example
from rehoused_nlp import build_nlp, visualize_doc_classification
nlp = build_nlp()
text = """
History of present illness: The patient was evicted from her apartment two months ago.
Since then she has lived in a shelter while looking for an apartment.
Past medical history:
1. Pneumonia
2. Afib
3. Homelessness
Housing Status: Stably Housed
Assessment/Plan: The patient was accepted to an apartment and signed the lease last week.
"""
doc = nlp(text)
visualize_doc_classification(doc)
Patient-level example
from rehoused_nlp import calculate_rehoused
import pandas as pd
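# Load the document-level input data (tab-separated)
df = pd.read_csv("path/to/data.tsv", sep="\t")
print("Input:")
print(df.head())
# Aggregate the documents to patient-level ReHouSED scores
rehoused = calculate_rehoused(df)
print("Output:")
print(rehoused.head())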