A statistical package to manage and analyse data

xuerui-stat

An open-source Python package for using statistical tools and methods.

Source code

https://github.com/Xuerui-Yang/xuerui-stat

Installation

pip install xuerui-stat

Tools and methods

DataManager

  • Description

DataManager is a tool for managing data files. When a data file is imported, the tool can move it to a specified data directory automatically. When importing data, it can also search for files in the data directory.

  • Example

from xuerui_stat import DataManager
dm = DataManager(enable=True)

This creates a DataManager instance. On first use, call 'set_dir(yourpath)' with your preferred path to set the data directory, for example,

dm.set_dir('/home/xuerui/Documents/Data/')

Once it is done, the directory path is printed like this:

Data directory: '/home/xuerui/Documents/Data/'

The 'enable' parameter of the class can be set to True or False. When it is True, a data file imported with the command below is moved to the data directory automatically.

df=dm.import_data('/home/xuerui/Documents/PythonProjects/FactorAnalysis/example.csv')

Otherwise, users can move files manually with a command such as:

dm.addto_dir('/home/xuerui/Documents/example.csv')

Moved data files are renamed by prefixing the name of the importing script, so that they can be identified. For example, if 'example.csv' is imported in 'my_example.py' using the above command, it is moved to the data directory and renamed 'MyExample_example.csv'.
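The renaming rule can be illustrated with a minimal sketch. The helper name 'prefixed_name' is hypothetical and not part of xuerui_stat; it only reproduces the one example given above (snake_case script name converted to CamelCase, then joined to the file name with an underscore), and the package may handle other cases differently.

```python
from pathlib import Path


def prefixed_name(script_path, data_path):
    """Return the data file name prefixed with the CamelCase script name.

    Hypothetical helper illustrating the naming rule; not part of xuerui_stat.
    """
    # 'my_example.py' -> 'my_example'
    script_stem = Path(script_path).stem
    # 'my_example' -> 'MyExample'
    camel = "".join(part.capitalize() for part in script_stem.split("_") if part)
    # Keep only the file name of the data path, dropping its directories
    return f"{camel}_{Path(data_path).name}"


print(prefixed_name("my_example.py", "/home/xuerui/Documents/example.csv"))
# prints: MyExample_example.csv
```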

Users can also check the contents under the data directory using

dm.list_dir()
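For readers who want a picture of what moving a file into a data directory and listing that directory involve, here is a rough sketch using only the standard library. The functions 'add_to_dir' and 'list_files' are assumptions made for illustration; they are not the package's implementation, and they leave out the renaming step.

```python
import shutil
import tempfile
from pathlib import Path


def add_to_dir(file_path, data_dir):
    """Move file_path into data_dir and return the new path (hypothetical helper)."""
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)  # create the directory if needed
    target = data_dir / Path(file_path).name
    shutil.move(str(file_path), str(target))
    return target


def list_files(data_dir):
    """Return the sorted names of the files directly under data_dir."""
    return sorted(p.name for p in Path(data_dir).iterdir() if p.is_file())


# Demonstration in a temporary directory, so no real files are touched
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "example.csv"
    source.write_text("a,b\n1,2\n")
    add_to_dir(source, Path(tmp) / "Data")
    print(list_files(Path(tmp) / "Data"))  # prints: ['example.csv']
```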

DecisionTree, PlotTree, RandomForest

  • Description

These classes provide tree-based methods for data mining.

  • Example

As above, the following commands import the data to be analysed.

from xuerui_stat import DataManager, DecisionTree, PlotTree, RandomForest
dm = DataManager(enable=False)
data = dm.import_data("/home/xuerui/Documents/PythonProjects/test.csv")

The decision tree method can be applied to the data by specifying the name of the category.

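# Train a decision tree with 'Cat' as the category name
dt = DecisionTree(data, 'Cat')
dt.train()
t = dt.tree
print(t)

# Plot the structure of the trained tree
pt = PlotTree(dt)
pt.tree_structure_plot()

# Test the tree, then plot the confusion matrix
dt.test()
pt.confusion_matrix_plot()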

As shown above, the tree and the confusion matrix are plotted via the 'PlotTree' class.
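As background, a confusion matrix counts how often each actual category was predicted as each category. The sketch below computes one in plain Python; the function and the toy labels are illustrative assumptions and do not reflect how xuerui_stat stores or computes its matrix.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Return (labels, matrix) where matrix[i][j] counts rows whose actual
    label is labels[i] and whose predicted label is labels[j]."""
    labels = sorted(set(actual) | set(predicted))
    counts = Counter(zip(actual, predicted))
    matrix = [[counts[(a, p)] for p in labels] for a in labels]
    return labels, matrix


# Toy data, made up for this illustration
actual = ["A", "A", "B", "B", "B"]
predicted = ["A", "B", "B", "B", "A"]
labels, matrix = confusion_matrix(actual, predicted)
print(labels)  # prints: ['A', 'B']
print(matrix)  # prints: [[1, 1], [1, 2]]
```

The diagonal entries are the correctly classified rows; everything off the diagonal is a misclassification.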

A random forest can be used as follows:

rf=RandomForest(data,'Cat')
rf.train(num_tree=300,max_depth=0,min_gini=0)
print(rf.oob_error)
dt.test()
print(dt.confusion_matrix)
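The names 'min_gini' and 'oob_error' suggest two standard random forest ideas: Gini impurity as a measure of how mixed the categories in a node are, and out-of-bag (OOB) rows, which are the rows left out of a tree's bootstrap sample and can be used to estimate its error. The sketch below shows the textbook definitions only; how xuerui_stat uses these quantities internally is an assumption, and both function names are hypothetical.

```python
import random
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum(p_k ** 2) over the category proportions p_k."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def bootstrap_split(n_rows, rng):
    """Draw n_rows row indices with replacement; return (in_bag, out_of_bag)."""
    in_bag = [rng.randrange(n_rows) for _ in range(n_rows)]
    out_of_bag = sorted(set(range(n_rows)) - set(in_bag))
    return in_bag, out_of_bag


print(gini_impurity(["A", "A", "B", "B"]))  # prints: 0.5
print(gini_impurity(["A", "A", "A"]))       # prints: 0.0

# The out-of-bag rows of one tree are those never drawn into its sample
in_bag, out_of_bag = bootstrap_split(10, random.Random(0))
print(len(in_bag), set(in_bag).isdisjoint(out_of_bag))  # prints: 10 True
```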

