Usage

Suppose that data are given in a NumPy array samples with shape (n, d), where n is the number of samples and d is the number of elements per sample. First, specify which of the elements are continuous. If, for instance, the distribution has three elements and the first and last elements are continuous whereas the second element is discrete:

is_continuous = [True, False, True]

To fit a mixed vine to the samples:

from mixedvines.mixedvine import MixedVine
vine = MixedVine.fit(samples, is_continuous)

vine is now a MixedVine object. Note that for the canonical vine, the order of elements is important. Elements should be sorted according to the importance of their dependencies to other elements, where elements with important dependencies to many other elements should come first. A heuristic way to select the order of elements is to calculate Kendall’s tau between all element pairs, to obtain a score for each element by summing the taus of the pairs the element occurs in and to sort elements in descending order according to their scores. This is what the MixedVine.fit method does internally by default to construct an improved canonical vine tree. This internal sorting is used to construct the vine tree only, so the order of elements is not changed in a user visible way. To prevent this internal sorting, set the keep_order argument to True.

To draw samples from the distribution, calculate their density and estimate the distribution entropy in units of bits:

samples = vine.rvs(size=100)
logpdf = vine.logpdf(samples)
entropy, standard_error_mean = vine.entropy(sem_tol=1e-2)