Makes it easy to generate FILtering Models for BIOlogical data

install

pip install biofilm
conda install -c conda-forge biofilm

Optimization options:

--methods str+ any  'extra_trees', 'passive_aggressive', 'random_forest', 'sgd', 'gradient_boosting', 'mlp'
--out str jsongoeshere
--n_jobs int 1
--time int 3600
--memoryMBthread int 8000
--randinit int 1337  # should be the same as the one in data.py
--preprocess bool False
--tmp_folder str
--refit bool True
--instancegroups str   # a jsonfile containing a dictionary instance_name -> group name
--autosk_debug bool False   # autosklearn logging
--autosk_debugfile str autosklearn.log
--autosk_debugout str+ file_handler  # where to write debug output; console is another option
--ensemble int 1  # ensemble size, autosklearn will combine the best models
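
The --instancegroups option expects a JSON file holding a dictionary instance_name -> group name. A minimal sketch of producing such a file is shown here; the instance names, group names, file name and helper functions are hypothetical and only illustrate the shape of the mapping.

```python
import json
import os
import tempfile

# Hypothetical instance names and group labels, for illustration only.
instance_groups = {
    "sample_001": "patient_A",
    "sample_002": "patient_A",
    "sample_003": "patient_B",
}


def write_instance_groups(groups, path):
    """Write an instance_name -> group_name dictionary as JSON."""
    with open(path, "w") as fh:
        json.dump(groups, fh, indent=2)


def read_instance_groups(path):
    """Read the dictionary back, e.g. to check it before a run."""
    with open(path) as fh:
        return json.load(fh)


# The file name is an assumption; any path can be passed to --instancegroups.
groups_path = os.path.join(tempfile.mkdtemp(), "groups.json")
write_instance_groups(instance_groups, groups_path)
```

The resulting path would then be given as the value of --instancegroups.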

Feature selection options:

For the full list, go to the biofilm directory and run python biofilm-features.py -h

# feature selection options
--method str lasso  # or svm, all, corr, variance
--out str numpycompressdumpgoeshere
--plot bool False
--svmparamrange float+ 0.01 0.15 0.001

# data reading options
--infile str myNumpyDump
--randinit int -1
--folds int 5
--subsample int -1
--Z bool False

data loading

a) write a dump with tools.ndumpfile([X, y, featurenames, instancenames], fname), where featurenames and instancenames are optional, or
b) provide --loader, whose read function will be called (see examples/npzloader)

default format: X, y in an npz dump; features and instances get enumerated
a custom data loader returns: X, y, features, instances
loadfolds returns: (X, Y, x, y), features, names of test instances
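
The two loading routes above can be sketched with plain NumPy. This is not biofilm's own code: write_dump stands in for tools.ndumpfile, and it assumes the dump is a compressed npz holding the arrays positionally. The read function's signature is also an assumption; examples/npzloader shows the real loader interface.

```python
import os
import tempfile

import numpy as np


def write_dump(path, X, y, featurenames=None, instancenames=None):
    """Hypothetical stand-in for tools.ndumpfile: store arrays positionally."""
    arrays = [X, y]
    if featurenames is not None:
        arrays.append(np.asarray(featurenames))
        if instancenames is not None:
            arrays.append(np.asarray(instancenames))
    np.savez_compressed(path, *arrays)  # saved as arr_0, arr_1, ...


def read(path):
    """Hypothetical loader: return X, y, features, instances."""
    with np.load(path) as dump:
        arrays = [dump[f"arr_{i}"] for i in range(len(dump.files))]
    X, y = arrays[0], arrays[1]
    # Missing names are replaced by enumeration, as in the default format.
    features = arrays[2] if len(arrays) > 2 else np.arange(X.shape[1])
    instances = arrays[3] if len(arrays) > 3 else np.arange(X.shape[0])
    return X, y, features, instances


# Toy data: 4 instances, 3 features, binary labels.
X_demo = np.arange(12, dtype=float).reshape(4, 3)
y_demo = np.array([0, 1, 0, 1])
dump_path = os.path.join(tempfile.mkdtemp(), "demo.npz")
write_dump(dump_path, X_demo, y_demo)
X_in, y_in, features_in, instances_in = read(dump_path)
```

With no names supplied, features_in and instances_in are simply column and row indices.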

outputs

optimize:
	out.model: {score:score, modelparams:modelparams}
	out.csv: instanceId, real label, predicted label, probability
feature selection:
	out: featuremask, featureproba, featureId
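
The out.csv columns listed above can be read back with the standard library. The sketch below assumes a comma-separated file without a header row and integer class labels; neither is confirmed by this page. The sample rows are made up for illustration.

```python
import csv
import io

# Made-up rows in the column order instanceId, real label,
# predicted label, probability. Not real biofilm output.
SAMPLE = """s1,1,1,0.91
s2,0,1,0.62
s3,0,0,0.08
s4,1,1,0.77
"""


def read_predictions(handle):
    """Parse rows into (instance_id, real, predicted, probability) tuples."""
    rows = []
    for instance_id, real, predicted, proba in csv.reader(handle):
        rows.append((instance_id, int(real), int(predicted), float(proba)))
    return rows


def accuracy(rows):
    """Fraction of rows where the predicted label equals the real label."""
    correct = sum(1 for _, real, predicted, _ in rows if real == predicted)
    return correct / len(rows)


rows = read_predictions(io.StringIO(SAMPLE))
acc = accuracy(rows)
```

For a real run, replace io.StringIO(SAMPLE) with an open file handle on out.csv and adjust the parsing if the file has a header or non-integer labels.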

Source Distribution

biofilm-0.1.126.tar.gz (17.0 kB)