Stata-like functions tab and summarize

Project description

StataLovers package

License: MIT

This package contains two functions that produce output similar to Stata's tab and summarize (sum) commands.

Installation

pip install StataLovers

Usage

  • The OOP way
  • There is (for now) one main class with two functions:
    • summarize - prints output similar to that of the Stata command of the same name.
    • tab - similar to the tab command in Stata; for now it takes only one or two arguments.
    • The arguments to summarize are a list of column names (strings) and the pandas DataFrame that contains all of those columns.
    • The arguments to tab are either one pandas Series (such as a DataFrame column) or two pandas Series (such as two columns of a DataFrame), each passed as a separate argument.

Example 1: summarize

import StataLovers

# df is an existing pandas DataFrame that contains all of the listed columns
StataLovers.summarize(["BirthYear", "Year", "Married", "Health"], df)

Output

     Variable |        Obs        Mean    Std. Dev.       Min        Max
--------------+----------------------------------------------------------
    BirthYear |    59807.0    1913.154      245.723       -8.0     1987.0
         Year |    59807.0    2011.708        3.404     2006.0     2017.0
      Married |    59807.0       0.717        0.451        0.0        1.0
       Health |    59767.0       2.856        0.853        1.0        4.0
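
To make the columns of this table concrete, here is a minimal standard-library sketch of the quantities they report. It is not the StataLovers implementation. The helper name summarize_rows and the toy data are hypothetical, and the sketch assumes that Obs counts non-missing values and that Std. Dev. is the sample standard deviation (n - 1 denominator), as in Stata's summarize.

```python
import math
import statistics


def summarize_rows(columns):
    """Return (name, obs, mean, sd, min, max) per column, ignoring missing values.

    Hypothetical helper for illustration; `columns` maps a name to a list of numbers.
    """
    rows = []
    for name, values in columns.items():
        # Treat None and NaN as missing, so Obs can differ between columns.
        present = [
            v for v in values
            if v is not None and not (isinstance(v, float) and math.isnan(v))
        ]
        obs = len(present)
        if obs == 0:
            nan = float("nan")
            rows.append((name, 0, nan, nan, nan, nan))
            continue
        mean = statistics.fmean(present)
        # Assumption: sample standard deviation (n - 1), undefined for one value.
        sd = statistics.stdev(present) if obs > 1 else float("nan")
        rows.append((name, obs, round(mean, 3), round(sd, 3), min(present), max(present)))
    return rows


# Toy data: one missing Health value, so its Obs is smaller than Married's.
toy = {"Married": [1, 0, 1, 1], "Health": [1.0, 2.0, None, 4.0]}
rows = summarize_rows(toy)
for row in rows:
    print(row)
```

In the output above, Health has fewer observations (59767) than the other variables (59807), which is consistent with missing values being left out of Obs.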

Example 2: tab

StataLovers.tab(df['Health'])

Output

      Health|      Freq.     Percent        Cum.
------------+-----------------------------------
         1.0|       4649       7.779       7.779
       1.333|        683       1.143       8.921
       1.667|       1947       3.258      12.179
         2.0|       5938       9.935      22.114
       2.333|       3648       6.104      28.218
       2.667|       5689       9.519      37.737
         3.0|      17681      29.583       67.32
       3.333|       4934       8.255      75.575
       3.667|       4651       7.782      83.357
         4.0|       9947      16.643       100.0
------------+-----------------------------------
      Total |      59767      100.00
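
The Freq., Percent and Cum. columns can be reproduced from the frequencies alone. The sketch below uses only the standard library and the counts printed in the table above. It is an illustration of the arithmetic, not the package's code, and the helper name one_way_tab is hypothetical. It assumes that Cum. is computed from the running count rather than by adding the rounded percentages.

```python
from collections import Counter


def one_way_tab(values):
    """Return ([(value, freq, percent, cumulative_percent), ...], total).

    Hypothetical helper for illustration; None is treated as missing.
    """
    present = [v for v in values if v is not None]
    counts = Counter(present)
    total = len(present)
    rows = []
    running = 0
    for value in sorted(counts):
        running += counts[value]
        percent = round(100 * counts[value] / total, 3)
        # Cumulative percent from the running count avoids summing rounded values.
        cumulative = round(100 * running / total, 3)
        rows.append((value, counts[value], percent, cumulative))
    return rows, total


# Frequencies copied from the Example 2 output, expanded back into raw values.
freqs = {1.0: 4649, 1.333: 683, 1.667: 1947, 2.0: 5938, 2.333: 3648,
         2.667: 5689, 3.0: 17681, 3.333: 4934, 3.667: 4651, 4.0: 9947}
values = [level for level, n in freqs.items() for _ in range(n)]

rows, total = one_way_tab(values)
for row in rows:
    print(row)
print("Total", total)
```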

Example 3: tab

StataLovers.tab(df["Health"], df["Female"])

Output

             |      Female
      Health |           0           1|     Total
-------------+------------------------+----------
         1.0 |        2052        2597|      4649
       1.333 |         404         279|       683
       1.667 |         909        1038|      1947
         2.0 |        2196        3742|      5938
       2.333 |        1752        1896|      3648
       2.667 |        2466        3223|      5689
         3.0 |        7074       10607|     17681
       3.333 |        2500        2434|      4934
       3.667 |        2160        2491|      4651
         4.0 |        5045        4902|      9947
-------------+------------------------+----------
       Total |       26581       33226|     59807
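
A two-way table like this one is a count of (row value, column value) pairs, with totals along each margin. The following standard-library sketch shows that counting on made-up toy data. It is not how StataLovers is implemented, the helper name two_way_tab is hypothetical, and it assumes both inputs have the same length with no missing values.

```python
from collections import Counter


def two_way_tab(row_values, col_values):
    """Cross-tabulate two equally long sequences.

    Hypothetical helper for illustration. Returns the sorted column levels,
    a dict mapping each row level to its list of cell counts, the row totals,
    and the column totals.
    """
    cells = Counter(zip(row_values, col_values))
    row_levels = sorted({r for r, _ in cells})
    col_levels = sorted({c for _, c in cells})
    # Counter returns 0 for pairs that never occur, which fills empty cells.
    table = {r: [cells[(r, c)] for c in col_levels] for r in row_levels}
    row_totals = {r: sum(counts) for r, counts in table.items()}
    col_totals = [sum(table[r][i] for r in row_levels) for i in range(len(col_levels))]
    return col_levels, table, row_totals, col_totals


# Toy data, not taken from the dataset used in the examples.
health = [1.0, 1.0, 2.0, 2.0, 2.0, 3.0]
female = [0, 1, 1, 1, 0, 0]

col_levels, table, row_totals, col_totals = two_way_tab(health, female)
for level, counts in table.items():
    print(level, counts, row_totals[level])
print("Total", col_totals, sum(col_totals))
```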

Author

Kamila Kolpashnikova 2021

Source Distribution

StataLovers-1.3.tar.gz (3.8 kB)
