This package provides algorithms for regular vine modeling, sampling and testing. It also includes routines for several popular bivariate copulas, designed to support a wide range of parameters with high precision and good performance.
Regular vine copulas provide rich models of dependence structure. They combine vine structures with families of bivariate copulas to construct multivariate distributions that can model a wide range of dependence patterns, with different tail dependence for different pairs. Two special cases of regular vine copulas, C-vine and D-vine copulas, have been studied extensively.
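As a small illustration of how bivariate copulas are combined along a vine, the standard three-variable D-vine decomposition of a joint density is written out here. The symbols are not taken from this package: f_i and F_i denote marginal densities and distribution functions, c_{12} and c_{23} are pair-copula densities in the first tree, and c_{13|2} is the pair-copula density in the second tree. In vine copula models, c_{13|2} is usually assumed not to depend on the conditioning value x_2.

```latex
\begin{aligned}
f(x_1, x_2, x_3) ={}& f_1(x_1)\, f_2(x_2)\, f_3(x_3) \\
&\times c_{12}\bigl(F_1(x_1), F_2(x_2)\bigr)\; c_{23}\bigl(F_2(x_2), F_3(x_3)\bigr) \\
&\times c_{13|2}\bigl(F_{1|2}(x_1 \mid x_2),\, F_{3|2}(x_3 \mid x_2)\bigr)
\end{aligned}
```

Each factor can use a different bivariate family, which is what allows different pairs to have different tail dependence.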
We propose the Python package pyvine for modeling, sampling and testing the more general regular vine copula (R-vine for short). The R-vine modeling algorithm searches for the R-vine structure that maximizes the vine tree dependence, i.e., the sum of the absolute values of Kendall’s tau for the paired variables on the edges, using Prim’s spanning-tree algorithm sequentially, tree by tree. The maximum likelihood estimation algorithm takes the sequential estimate as its initial value and uses the L-BFGS-B algorithm to optimize the likelihood. The R-vine sampling algorithm traverses all the edges of the vine structure recursively, starting from the last tree, and generates the marginal samples on each edge according to the nested conditioning sets. The goodness-of-fit testing algorithm first generates the Rosenblatt-transformed data E, then tests the composite hypothesis H_0*: E ~ C* using the Anderson-Darling statistic, where C* is the independence copula. A bootstrap procedure generates replications of the Anderson-Darling statistic, and their empirical distribution is used to compute an adjusted p-value.
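The structure search described above can be sketched for the first vine tree only. This is a minimal illustration in pure Python, not pyvine's implementation: the function names `kendall_tau` and `first_tree` and the toy data are hypothetical, the tau estimate is the simple tau-a without tie handling, and later trees (which need conditional data and the proximity condition) are not covered.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    score = 0
    for i, j in combinations(range(n), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        score += (prod > 0) - (prod < 0)
    return score / (n * (n - 1) / 2)


def first_tree(columns):
    """Grow a spanning tree that maximizes the sum of |tau| (Prim's algorithm).

    `columns` maps a variable name to its list of observations.
    Returns a list of edges (a, b, tau) in the order they were added.
    """
    names = list(columns)
    tau = {
        frozenset(pair): kendall_tau(columns[pair[0]], columns[pair[1]])
        for pair in combinations(names, 2)
    }
    in_tree = [names[0]]  # start from an arbitrary node
    edges = []
    while len(in_tree) < len(names):
        # Candidate edges connect a node in the tree to a node outside it.
        candidates = [(a, b) for a in in_tree for b in names if b not in in_tree]
        a, b = max(candidates, key=lambda e: abs(tau[frozenset(e)]))
        edges.append((a, b, tau[frozenset((a, b))]))
        in_tree.append(b)
    return edges


# Hypothetical toy data: y follows x closely, z moves against both.
data = {
    "x": [1, 2, 3, 4, 5, 6],
    "y": [1, 2, 3, 4, 6, 5],
    "z": [5, 6, 3, 4, 1, 2],
}
edges = first_tree(data)
print(edges)
```

On this toy data the tree links x to y and y to z; the weaker x-z pair is left for a later tree. Because the edge weight is the absolute value of tau, strong negative dependence is selected just as readily as strong positive dependence.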
Computing copula-related functions, such as cumulative distribution functions, often runs into overflow problems. We address this by reinvestigating six popular families of bivariate copulas: Normal, Student t, Clayton, Gumbel, Frank and Joe. Approximations of these functions are used when overflow would occur in the computation. All of this is implemented in bvcopula, a subpackage of pyvine whose subroutines are written in Fortran and wrapped into Python via f2py, which delivers both good performance and high precision.
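To illustrate the kind of overflow involved, consider the Clayton copula CDF, C(u, v) = (u^-theta + v^-theta - 1)^(-1/theta) for theta > 0. For large theta the powers u^-theta overflow in double precision. The sketch below avoids this by working in log space. It is only an illustration of the idea under that textbook formula: the function name `clayton_cdf` is hypothetical, and this is not the approximation scheme used in the Fortran routines of bvcopula.

```python
import math


def clayton_cdf(u, v, theta):
    """Clayton copula CDF for theta > 0, evaluated in log space."""
    if theta <= 0:
        raise ValueError("theta must be positive")
    if u <= 0.0 or v <= 0.0:
        return 0.0
    # a = log(u^-theta) and b = log(v^-theta); both are non-negative.
    a = -theta * math.log(u)
    b = -theta * math.log(v)
    m = max(a, b)
    # log(e^a + e^b - 1) = m + log(e^(a-m) + e^(b-m) - e^(-m)).
    # Every exponent is <= 0, so nothing can overflow.
    inner = math.exp(a - m) + math.exp(b - m) - math.exp(-m)
    log_sum = m + math.log(inner)
    return math.exp(-log_sum / theta)


# Moderate theta: agrees with the direct formula.
direct = (0.3 ** -2 + 0.5 ** -2 - 1) ** (-1 / 2)
stable = clayton_cdf(0.3, 0.5, 2.0)
print(direct, stable)

# Large theta: the direct formula raises OverflowError in Python,
# while the log-space version approaches min(u, v).
large = clayton_cdf(0.3, 0.5, 5000.0)
print(large)
```

As theta grows, the Clayton copula tends to the comonotonic copula min(u, v), which is what the large-theta call returns to within rounding error.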
An example of R-vine copula modeling is given below:
    # Example
    import pandas as pd
    import pyvine as pv

    ## read the data (the first column is the index and is parsed as dates)
    ## and do the rank transformation
    dat = pd.read_csv("data.csv", index_col=0, parse_dates=True)
    cp_dat = dat.rank() / (len(dat) + 1)

    ## initialize an R-vine object named rv
    rv = pv.Rvine(cp_dat)

    ## sequential estimation for rv. 'structure' accepts 'r' for R-vine,
    ## 'c' for C-vine and 'd' for D-vine; 'familyset' accepts a list of
    ## integers from 1 to 6; 'threads_num' accepts an integer giving the
    ## number of threads used to run the MLE on edges of the same vine
    ## tree simultaneously.
    rv.modeling(structure='r', familyset=[1, 2, 3, 4, 5, 6], threads_num=2)

    ## maximum likelihood estimation for rv. 'disp' controls whether the
    ## iteration progress of the L-BFGS-B algorithm is printed;
    ## 'threads_num' specifies the number of threads used to compute the
    ## log-likelihood value for each edge in the same vine tree.
    rv.mle(disp=False, threads_num=2)

    ## plot the R-vine structure of the modeled object rv. All the vine
    ## trees are plotted by default.
    rv.plot()

    ## display the estimation result on each edge. 'ndigits' controls the
    ## number of decimal digits in the result.
    rv.res(ndigits=3)

    ## goodness-of-fit testing
    rv.test()
To compile and install on Linux (on Windows, substitute ‘gnu95’ with ‘mingw32’):
    $ python setup.py config_fc --opt="-fopenmp" build --fcompiler=gnu95
    $ python setup.py install