A package for quantitative linguistics

QuantLing

QuantLing: A Python package for Quantitative Linguistics.

Description

QuantLing is a Python library for Quantitative Linguistics. It provides functionality to quantify linguistic structures and explore language patterns.

The package consists of the following main modules:

  • depval.py: indicators of dependency structures and valency structures.
  • lawfitter.py: a small fitter for several laws of quantitative linguistics (QL).
  • lingnet.py: a module for constructing complex networks.

Installation

You can install QuantLing via pip:

pip install quantling

nltk and conllu are required.

pip install nltk conllu

Quick Start

The examples below show how to use each module:

1. depval

DependencyAnalyzer: indicators of dependency structures.

from quantling.depval import DependencyAnalyzer   
data = open(r'your_treebank.conllu',encoding='utf-8')
dep = DependencyAnalyzer(data) 

# dependency distance distribution
dep.dd_distribution()
# mean dependency distance of a specific word class
dep.mdd(upos='NOUN')
# mean dependency distance of specific dependency relations
dep.mdd(dependency='subj')
# proportion of dependency distance
dep.pdd()
# tree width and tree depth
dep.tree()
# tree width distribution and tree depth distribution
dep.tree_distribution()
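
To make the measures above concrete, here is a minimal sketch of how dependency distance can be computed from CoNLL-U text without QuantLing. It assumes the common definition (the absolute difference between the position of a word and the position of its head, with the root excluded). The helper names `read_sentences` and `dependency_distances` and the sample sentence are made up for illustration; they are not part of the QuantLing API, whose exact computation may differ.

```python
from collections import Counter

# A hand-made CoNLL-U sentence (illustrative only): "The big dog chased cats"
ROWS = [
    "1\tThe\tthe\tDET\t_\t_\t3\tdet\t_\t_",
    "2\tbig\tbig\tADJ\t_\t_\t3\tamod\t_\t_",
    "3\tdog\tdog\tNOUN\t_\t_\t4\tnsubj\t_\t_",
    "4\tchased\tchase\tVERB\t_\t_\t0\troot\t_\t_",
    "5\tcats\tcat\tNOUN\t_\t_\t4\tobj\t_\t_",
]
SAMPLE = "\n".join(ROWS) + "\n"


def read_sentences(text):
    """Yield sentences as lists of (id, upos, head, deprel) tuples."""
    sentence = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment
        cols = line.split("\t")
        # Skip multiword tokens ("1-2"), empty nodes ("1.1") and missing heads ("_").
        if not cols[0].isdigit() or not cols[6].isdigit():
            continue
        sentence.append((int(cols[0]), cols[3], int(cols[6]), cols[7]))
    if sentence:
        yield sentence


def dependency_distances(sentences, upos=None, deprel=None):
    """Collect |id - head| for every non-root word, with optional filters."""
    distances = []
    for sentence in sentences:
        for word_id, word_upos, head, word_deprel in sentence:
            if head == 0:
                continue  # the root has no governor, so no distance
            if upos is not None and word_upos != upos:
                continue
            if deprel is not None and word_deprel != deprel:
                continue
            distances.append(abs(word_id - head))
    return distances


sentences = list(read_sentences(SAMPLE))
all_dd = dependency_distances(sentences)
distribution = Counter(all_dd)            # dependency distance distribution
mdd = sum(all_dd) / len(all_dd)           # mean dependency distance
noun_dd = dependency_distances(sentences, upos="NOUN")
print(distribution, mdd, noun_dd)
```

The filters mirror the idea of restricting the mean to one word class or one dependency relation; on a real treebank, pass the file contents as `text`.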

ValencyAnalyzer: indicators of valency structures.

from quantling.depval import ValencyAnalyzer   
data = open(r'your_treebank.conllu',encoding='utf-8')
val = ValencyAnalyzer(data) 

# mean valency
val.mean_valency()
# valency distribution
val.distribution()
# probabilistic valency pattern
val.PVP()

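Or use the helper functions:

# assumption: both helpers are importable from quantling.depval
from quantling.depval import getDepFeatures, getValFeatures
dep = getDepFeatures(data)
val = getValFeatures(data)
print(dep)
print(val)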
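
As a rough illustration of the valency indicators, the sketch below counts valency by hand for one sentence. It assumes valency is taken to be the number of direct dependents a word governs; QuantLing may define or weight it differently. The `valencies` helper and the sentence are hypothetical and exist only for this example.

```python
from collections import Counter

# (id, form, upos, head) rows for one made-up sentence: "The big dog chased cats"
SENTENCE = [
    (1, "The", "DET", 3),
    (2, "big", "ADJ", 3),
    (3, "dog", "NOUN", 4),
    (4, "chased", "VERB", 0),
    (5, "cats", "NOUN", 4),
]


def valencies(sentence):
    """Map each word id to the number of words that name it as their head."""
    counts = {word_id: 0 for word_id, _, _, _ in sentence}
    for _, _, _, head in sentence:
        if head != 0:  # head 0 marks the root, which has no governor
            counts[head] += 1
    return counts


per_word = valencies(SENTENCE)
valency_distribution = Counter(per_word.values())      # valency -> number of words
mean_valency = sum(per_word.values()) / len(per_word)  # average over all words
print(per_word, valency_distribution, mean_valency)
```

Here "dog" and "chased" each govern two words and the remaining three words govern none.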

2. lawfitter

from quantling.lawfitter import fit   
#results = fit(data,model,variant)
results = fit([[1,2,3,4,5,6],[3,4,2,6,8,15]],'zipf')
print(results)
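
For readers who want to see what fitting a Zipf-type law involves, here is a minimal standalone sketch. It fits f(r) = C * r^(-a) by ordinary least squares on log-transformed data, which is only one simple way to do it; the method used inside `fit` is not documented here and may differ. The function name `loglog_fit` and the frequency counts are invented for illustration.

```python
import math


def loglog_fit(ranks, freqs):
    """Least-squares fit of log f = log C - a * log r; returns (C, a)."""
    xs = [math.log(r) for r in ranks]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # The slope of the log-log line is -a, and the intercept is log C.
    return math.exp(intercept), -slope


ranks = [1, 2, 3, 4, 5, 6]
freqs = [120, 60, 40, 30, 24, 20]  # made-up counts that follow f = 120 / r
C, a = loglog_fit(ranks, freqs)
print(f"C = {C:.2f}, a = {a:.2f}")
```

Because the toy data follow the law exactly, the fit recovers the constant and the exponent used to generate them; real frequency data will scatter around the line.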

3. lingnet

from quantling.lingnet import conllu2edge
import networkx as nx   
# use a conllu file to construct a network
data = open(r'your_treebank.conllu',encoding='utf-8')
edges = conllu2edge(data,mode='dependency')
# or to construct a co-occurrence network
#edges = conllu2edge(data,mode='adjacency')
G = nx.Graph()
G.add_edges_from(edges)

# estimate the degree exponent
# note: fitPowerLaw is not imported above; import it from the quantling module that provides it
degree = [d for _, d in G.degree()]
degree_exponents = fitPowerLaw(degree)
print(degree_exponents)
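
The two network modes can be pictured with a small standalone sketch. It assumes that a dependency network links each word to its head and that an adjacency (co-occurrence) network links neighbouring words; the actual output of `conllu2edge` may differ, for example in whether nodes are word forms or lemmas. All function names and the sentence below are hypothetical.

```python
# (id, form, head) rows for one made-up sentence: "The big dog chased cats"
SENTENCE = [
    (1, "The", 3),
    (2, "big", 3),
    (3, "dog", 4),
    (4, "chased", 0),
    (5, "cats", 4),
]


def dependency_edges(sentence):
    """One (dependent, head) edge per non-root word."""
    forms = {word_id: form for word_id, form, _ in sentence}
    return [(form, forms[head]) for _, form, head in sentence if head != 0]


def adjacency_edges(sentence):
    """One edge per pair of neighbouring words."""
    forms = [form for _, form, _ in sentence]
    return list(zip(forms, forms[1:]))


def degrees(edges):
    """Node degrees in the undirected graph over the edges, ignoring self-loops."""
    neighbours = {}
    for a, b in edges:
        if a == b:
            continue
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    return {node: len(adjacent) for node, adjacent in neighbours.items()}


dep_degrees = degrees(dependency_edges(SENTENCE))
adj_degrees = degrees(adjacency_edges(SENTENCE))
print(dep_degrees)
print(adj_degrees)
```

Either edge list can be passed to `G.add_edges_from(...)` on a `networkx.Graph`, as in the example above.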

Documentation

For more detailed information, please refer to the video (in Chinese).

Features

  • Dependency distance distribution
  • Mean dependency distance of specific word classes
  • Mean dependency distance of specific dependency relations
  • Proportion of dependency distance
  • Tree width and tree depth
  • Tree width distribution and tree depth distribution
  • Mean valency
  • Valency distribution
  • Probabilistic valency pattern
  • Law fitter
  • Complex network construction

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contact

Acknowledgements

If this project has been helpful to you, please give it a star and cite our article. We would be very grateful.

@article{Yang_2022,
doi = {10.1209/0295-5075/ac8bf2},
url = {https://dx.doi.org/10.1209/0295-5075/ac8bf2},
year = {2022},
month = {sep},
publisher = {EDP Sciences, IOP Publishing and Società Italiana di Fisica},
volume = {139},
number = {6},
pages = {61002},
author = {Mu Yang and Haitao Liu},
title = {The role of syntax in the formation of scale-free language networks},
journal = {Europhysics Letters},
abstract = {The overall structure of a network is determined by its micro features, which are different in both syntactic and non-syntactic networks. However, the fact that most language networks are small-world and scale-free raises the question: does syntax play a role in forming the scale-free feature? To answer this question, we build syntactic networks and co-occurrence networks to compare the generation mechanisms of nodes, and to investigate whether syntactic and non-syntactic factors have distinct roles. The results show that frequency is the foundation of the scale-free feature, while syntax is beneficial to enhance this feature. This research introduces a microscopic approach, which may shed light on the scale-free feature of language networks.}
}

