Python3 wrapper for the Weka Machine Learning Workbench

Project description

The python-weka-wrapper3 package makes it easy to run Weka algorithms and filters from within Python 3. It offers access to the Weka API through thin wrappers around JNI calls made via the jpype1 library.

Forum for project at:

https://groups.google.com/forum/#!forum/python-weka-wrapper

Changelog

0.3.2 (2024-08-05)

  • improved logging output in command-line scripts (eg pww-classifier)

  • added sub-command freeze to the pww-packages command-line tool to output the currently installed packages in pip-like format (unofficial packages get URL appended) either to stdout or to text file (eg requirements.txt)

  • the sub-command install of the pww-packages command-line tool can process a requirements text file now as well

  • added pww_version() function to weka.core.version to obtain the installed version of python-weka-wrapper3

  • added with_graph_support() function to weka.core.version to see whether python-weka-wrapper3 has support for graphs using pygraphviz

  • added with_plot_support() function to weka.core.version to see whether python-weka-wrapper3 has plotting support via matplotlib

  • added sub-command bootstrap to the pww-packages command-line tool to output a Python script that can reinstall python-weka-wrapper3 and all the packages in another virtual environment or even another machine

  • added is_official_package function to weka.core.packages module to check whether a package is an official one from the Weka package repository or not

0.3.1 (2024-07-17)

  • the max_heap_size parameter of the start method from the weka.core.jvm module now has an effect

0.3.0 (2024-07-12)

  • added is_additional_measure_producer variable, additional_measures property and additional_measure(str) method to the weka.classifiers.Classifier class, giving convenient access to classes implementing weka.core.AdditionalMeasureProducer

  • replaced javabridge with jpype behind the scenes, simplifying native API calls

  • removed property JavaObject.jclasswrapper

  • property JavaObject.jwrapper is now deprecated and JavaObject.jobject should be used instead

  • renamed JavaArray.new_instance to JavaArray.new_array

  • now works on M1/M2 Macs

  • updated doc strings

  • updated documentation, especially around installation

0.2.16 (2024-05-10)

  • setup.py no longer lists python-javabridge with github URL, as it fails to install via pip otherwise

0.2.15 (2024-05-10)

  • the methods create_instances_from_lists and create_instances_from_matrices of the weka.core.dataset module can handle missing values now (None in case of lists, nan in case of matrices), as well as being able to force columns to be nominal

  • added the method load_csv_file to the module weka.core.converters to provide a more reliable way of loading CSV files compared to Weka’s native CSVLoader converter (uses Python’s csv module and then calls create_instances_from_lists).

  • added simple_range library as dependency to make it easier to generate integer list from range strings

  • added weka.core.utils module with following methods: correlation, variance, normalize

  • updated requirements (and install instructions) to make pww3 work with Python 3.11+
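
The load_csv_file entry above describes reading the file with Python's csv module and then handing the rows to create_instances_from_lists, with None marking missing values in lists. The reading step could look roughly like the sketch below. It assumes that empty cells and "?" denote missing values, that the first row is a header and that all other cells are numeric. The helper name read_csv_as_lists is hypothetical and not part of the library.

```python
import csv
import io


def read_csv_as_lists(handle, missing=("", "?")):
    """Read CSV text into (header, rows); missing cells become None.

    Hypothetical helper for illustration only, not the library's code.
    Assumes the first row is a header and all other cells are numeric.
    """
    reader = csv.reader(handle)
    header = next(reader)
    rows = []
    for record in reader:
        row = []
        for cell in record:
            cell = cell.strip()
            # None marks a missing value, everything else is parsed as float
            row.append(None if cell in missing else float(cell))
        rows.append(row)
    return header, rows


sample = io.StringIO("a,b,y\n1.0,2.5,0\n,3.5,1\n4.0,?,1\n")
header, rows = read_csv_as_lists(sample)
print(header)
print(rows)
```

The resulting rows would still need to be separated into input and output lists before being passed on.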

0.2.14 (2023-07-04)

  • extracted arpack_combined.jar, core.jar and mtj.jar from weka.jar to make them available under Java 17

0.2.13 (2023-04-26)

  • moved Configurable and JSONObject into configurable-objects library

  • moved base flow components into simple-data-flow library

  • added methods subset_results, rank_results to class AttributeSelection (module: weka.attribute_selection) to give access to cross-validation output in numeric rather than textual form (NB: it has to parse the textual CV output).

  • made the plot_experiments method (module weka.plot.experiments) more customizable

0.2.12 (2022-12-08)

  • method install_packages (module: weka.core.packages) now no longer interprets the installation message returned when installing from URL or local zip file as failure to install; also outputs any installation message now in the console; now has flags for fail_fast mode (first package that fails stops installation process) and whether to return details (dict per package rather than just a bool for all packages)

  • method install_package (module: weka.core.packages) can return either a boolean flag of success or detailed information

0.2.11 (2022-10-11)

  • methods install_packages and install_missing_packages of module weka.core.packages now allow a list of package names instead of tuples (name, version), assuming latest as version

  • method get_jclass in module weka.core.classes can handle primitive classes now as well (eg int -> java.lang.Integer.TYPE)

  • methods get_non_public_field and call_non_public_method in module weka.core.classes allow accessing private/protected fields and calling private/protected methods of Java objects, which avoids having to sub-class classes to get public access to them (NB: only works as long as the security manager allows that)

  • added split_commandline method to module weka.core.classes, which splits a command-line into classname and option list tuple

  • the Instances class (module: weka.core.dataset) now supports slicing

  • added methods plot_xmlbif_graph and xmlbif_to_dot to module weka.plot.graph for plotting XML BIF graphs generated by BayesNet using GraphViz

  • added method plot_graph to module weka.plot.graph to plot dot or XML BIF graphs
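
The split_commandline entry above mentions splitting a command-line into a classname and an option list. A rough sketch of that idea follows, using Python's shlex as a stand-in tokenizer. This is an assumption: Weka has its own quoting and escaping rules, which differ from POSIX shell rules, so results can differ for escaped input. The function name split_commandline_sketch is hypothetical.

```python
import shlex


def split_commandline_sketch(cmdline):
    """Split 'classname opt1 opt2 ...' into (classname, [options]).

    Illustration only: shlex follows POSIX shell quoting, which is
    assumed here as an approximation of Weka's option splitting.
    """
    tokens = shlex.split(cmdline)
    if not tokens:
        raise ValueError("empty command-line")
    return tokens[0], tokens[1:]


classname, options = split_commandline_sketch(
    "weka.classifiers.trees.J48 -C 0.3 -M 2"
)
print(classname)
print(options)

# quoted nested command-lines stay together as a single option value
nested_cls, nested_opts = split_commandline_sketch(
    'weka.classifiers.meta.FilteredClassifier '
    '-F "weka.filters.unsupervised.attribute.Remove -R 1" '
    '-W weka.classifiers.trees.J48'
)
print(nested_opts)
```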

0.2.10 (2022-06-27)

  • added logging_level parameter to the start method of the weka.core.jvm module, enabling the user to turn off debugging output in an easy way (https://github.com/fracpete/python-weka-wrapper3/issues/40)

  • added method cv_splits to class Instances from module weka.core.dataset to return a list of train/test tuples as used by cross-validation

  • the Tester class (module: weka.experiments) now has an option to swap columns/rows for comparing datasets rather than classifiers

  • the SimpleExperiment class and derived classes (module: weka.experiments) now have the additional parameters in the constructor: class_for_ir_statistics, attribute_id, pred_target_column

  • the method is_installed (module: weka.core.packages) now can check whether a specific version is installed

  • added pww-packages entry point to allow managing of Weka packages from the command-line (actions: list/info/install/uninstall/suggest/is-installed)
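
The cv_splits entry above describes a list of train/test tuples as used by cross-validation. The sketch below only illustrates that return shape on plain row indices. It assumes contiguous folds without randomization or stratification, which may not match what the library does internally. The function name cv_split_indices is hypothetical.

```python
def cv_split_indices(num_rows, num_folds):
    """Return a list of (train_indices, test_indices) tuples, one per fold.

    Illustration only: contiguous, unshuffled, unstratified folds.
    """
    if not 2 <= num_folds <= num_rows:
        raise ValueError("num_folds must be between 2 and num_rows")
    base, extra = divmod(num_rows, num_folds)
    splits = []
    start = 0
    for fold in range(num_folds):
        # the first 'extra' folds receive one additional row
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, num_rows))
        splits.append((train, test))
        start += size
    return splits


splits = cv_split_indices(10, 3)
for train, test in splits:
    print(len(train), len(test))
```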

0.2.9 (2022-04-17)

  • method JavaObject.new_instance in module weka.core.classes now automatically installs packages based on suggestions if the JVM was started with the auto_install flag enabled.

  • method test_model_once of class Evaluation (module: weka.classifiers) now has the additional parameter store, which allows the recording of the predictions (necessary for statistics like AUC)

0.2.8 (2022-03-24)

  • methods create_instances_from_lists and create_instances_from_matrices (module weka.core.dataset) now allow the specification of column names, for input and output variables.

0.2.7 (2022-02-22)

  • Added property for attribute indices to DistanceFunction class (module weka.core.distances) (thanks to Martin Trat, https://github.com/fracpete/python-weka-wrapper3/pull/39)

  • improved instantiation of classes, avoiding misleading output of exceptions

  • JavaArray class (module: weka.core.classes) now has __str__ and __repr__ methods that output classname and size

0.2.6 (2022-02-01)

  • upgraded bundled Weka to 3.9.6

0.2.5 (2021-12-17)

  • switched to python-javabridge, the new name (fork?) of the javabridge library

  • Package.__str__ (weka.core.packages module) method now returns a string rather than printing the name/version

  • added to_numpy(…) methods to Instance and Instances classes (module weka.core.dataset) to make it easy to obtain a numpy array from the Weka dataset

0.2.4 (2021-11-25)

  • added method help_for to weka.core.classes module to generate a help screen for a weka.core.OptionHandler class using just the classname.

  • the to_help method of the weka.core.classes.OptionHandler class now allows finer control over the generated output (e.g., which sections to output).

  • setting window title of Matplotlib is now dependent on version (to avoid deprecation notice being output)

  • plot_classifier_errors (module weka.plot.classifiers) now plots the diagonal after adding all the plot data to get the right limits

0.2.3 (2021-06-09)

  • added weka.core.distances module for distance functions, with DistanceFunction base class

  • added avg_silhouette_coefficient method to weka.clusterers to calculate the average silhouette coefficient
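
For readers unfamiliar with the measure named in the avg_silhouette_coefficient entry above: the silhouette of a point is (b - a) / max(a, b), where a is its mean distance to the other members of its own cluster and b is the lowest mean distance to the members of any other cluster. The sketch below computes the average over all points in plain Python. Euclidean distance and a silhouette of 0 for single-member clusters are assumptions, since the entry does not state what the library uses. The function name avg_silhouette is hypothetical.

```python
import math


def avg_silhouette(points, labels):
    """Average silhouette coefficient (assumes Euclidean distance)."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")

    def mean_dist(i, members):
        others = [j for j in members if j != i]
        return sum(math.dist(points[i], points[j]) for j in others) / len(others)

    total = 0.0
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # single-member cluster contributes 0 (assumption)
        a = mean_dist(i, own)
        b = min(
            mean_dist(i, members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


score = avg_silhouette([(0, 0), (0, 1), (10, 0), (10, 1)], [0, 0, 1, 1])
print(round(score, 4))
```

Values close to 1 indicate compact, well-separated clusters, as in the two tight pairs used here.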

0.2.2 (2021-04-23)

  • the Package class of the weka.core.packages module now has a version property to quickly access the version which is stored in the meta-data; the metadata property now returns a proper Python dictionary

  • added convenience methods to the weka.core.packages module: install_packages to install more than one package, install_missing_package and install_missing_packages to install one or more packages if missing (can automatically stop the JVM and exit the process), uninstall_packages to remove more than one package in one operation

0.2.1 (2021-04-12)

  • the ASEvaluation class in the weka.attribute_selection module now offers the following methods for attribute transformers like PCA: transformed_header, transformed_data, convert_instance

  • classes derived from weka.core.classes.JavaObject are now serializable via pickle

  • added the method copy_structure to the weka.core.dataset.Instances class to quickly get the header of a dataset

  • added the property header to the following classes that returns the training data structure: ASEvaluation, ASSearch, Associator, Classifier, Clusterer, TSForecaster

  • methods from weka.core.serialization have been moved into weka.core.classes, with the following methods getting the serialization_ prefix: write, write_all, read, read_all

0.2.0 (2021-02-21)

  • classes.new_instance method can take an options list now as well

  • added classes.get_enum method to return the instance of a Java enum item

  • added classes.new_instance method to create new instance of Java class

  • added typeconv.jstring_list_to_string_list method to convert a java.util.List containing strings into a Python list

  • added typeconv.jdouble_to_float method to convert a java.lang.Double to a Python float

  • in module typeconv renamed methods: string_array_to_list to jstring_array_to_list, string_list_to_array to string_list_to_jarray, double_matrix_to_ndarray to jdouble_matrix_to_ndarray, enumeration_to_list to jenumeration_to_list, double_to_float to float_to_jfloat

  • added weka.timeseries module that wraps the timeseriesForecasting Weka package

0.1.16 (2020-12-26)

  • upgraded Weka to 3.9.5

0.1.15 (2020-10-25)

0.1.14 (2020-05-26)

  • added AttributeSelectedClassifier meta-classifier to module weka.classifiers

  • added AttributeSelection meta-filter to module weka.filters

0.1.13 (2020-05-06)

  • added class_index parameter to weka.core.converters.load_any_file and weka.core.converters.Loader.load_file, which allows specifying the class index while loading a file (first, second, third, last-2, last-1, last or 1-based index).

  • added append and clear methods to weka.filters.MultiFilter and weka.classifiers.MultipleClassifiersCombiner to make adding of filters/classifiers easier.

  • added attribute_names() method to weka.core.dataset.Instances class

  • added subset method to weka.core.dataset.Instances class, which returns a subset of columns and/or rows.
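
The class_index entry above lists the accepted values (first, second, third, last-2, last-1, last or a 1-based index). One way such a value could be mapped to a 0-based attribute position is sketched below. This is not the library's parsing code, and the function name resolve_class_index is hypothetical.

```python
def resolve_class_index(spec, num_attributes):
    """Turn 'first', 'last-1', '3', ... into a 0-based attribute index.

    Illustration only; numeric strings are treated as 1-based.
    """
    named = {
        "first": 0,
        "second": 1,
        "third": 2,
        "last-2": num_attributes - 3,
        "last-1": num_attributes - 2,
        "last": num_attributes - 1,
    }
    spec = spec.strip().lower()
    index = named[spec] if spec in named else int(spec) - 1
    if not 0 <= index < num_attributes:
        raise ValueError(f"class index {spec!r} out of range")
    return index


for spec in ("first", "last", "last-1", "2"):
    print(spec, resolve_class_index(spec, 5))
```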

0.1.12 (2020-01-10)

  • added method list_property_names to weka.core.classes module to allow listing of Bean property names (which are used by GridSearch and MultiSearch) for a Java object.

0.1.11 (2020-01-04)

  • Upgraded Weka to 3.9.4

  • added method suggest_package to the weka.core.packages module for suggesting packages for partial class names/package names (NNge or .ft.) or exact class names (weka.classifiers.meta.StackingC)

  • the JavaObject.new_instance method now suggests packages (if possible) in case the instantiation fails due to package not installed or JVM not started with package support

0.1.10 (2019-12-02)

  • method train_test_split of the weka.core.dataset.Instances class now creates a copy of itself before applying randomization, to avoid changing the order of data for subsequent calls.
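
Why the copy matters can be shown with plain Python lists: shuffling in place would change the caller's data, so a second call would start from a different order. The sketch below shuffles a copy instead. The function name train_test_split_sketch and the seed parameter are hypothetical and only mimic the idea, not the library's signature.

```python
import random


def train_test_split_sketch(rows, percentage, seed=1):
    """Split rows into (train, test) without reordering the input.

    Illustration only: shuffles a copy using a seeded generator.
    """
    shuffled = list(rows)  # copy, so the caller's order stays intact
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * percentage / 100.0)
    return shuffled[:cut], shuffled[cut:]


data = list(range(10))
train, test = train_test_split_sketch(data, 70)
again = train_test_split_sketch(data, 70)
print(len(train), len(test))
```

Because the input is left untouched, repeated calls with the same seed return the same split.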

0.1.9 (2019-11-19)

  • method create_instances_from_matrices from module weka.core.dataset now works with pure numeric data again

  • added sections for creating datasets (manual, lists, matrices) to examples documentation

0.1.8 (2019-11-11)

  • added console scripts: pww-associator, pww-attsel, pww-classifier, pww-clusterer, pww-datagenerator, pww-filter

  • added serialize, deserialize methods to weka.classifiers.Classifier to simplify loading/saving model

  • added serialize, deserialize methods to weka.clusterers.Clusterer to simplify loading/saving model

  • added serialize, deserialize methods to weka.filters.Filter to simplify loading/saving filter

  • added methods plot_rocs and plot_prcs to weka.plot.classifiers module to plot ROC/PRC curve on same dataset for multiple classifiers

  • method plot_classifier_errors of weka.plot.classifiers module now allows plotting predictions of multiple classifiers by providing a dictionary

  • method create_instances_from_matrices from module weka.core.dataset now allows string and bytes as well

  • method create_instances_from_lists from module weka.core.dataset now allows string and bytes as well

0.1.7 (2019-01-11)

  • added wrapper classes for association classes that implement AssociationRuleProducer (package weka.associations): AssociationRules, AssociationRule, Item

  • added to_source method to weka.classifiers.Classifier and weka.filters.Filter (underlying Java classes must implement the respective Sourcable interface)

0.1.6 (2018-10-28)

  • fixed logging setup in weka.core.jvm to avoid setting the global logging level to DEBUG (thanks to https://github.com/Arnie97)

0.1.5 (2018-09-16)

  • upgraded to Weka 3.9.3

  • weka.jar now included in PyPI package

  • exposed the following methods in weka.classifiers.Evaluation: cumulative_margin_distribution, sf_prior_entropy, sf_scheme_entropy

0.1.4 (2018-02-18)

  • upgraded to Weka 3.9.2

  • properly initializing package support now, rather than adding package jars to classpath

  • added weka.core.ClassHelper Java class for obtaining classes and static fields, as javabridge only uses the system class loader

0.1.3 (2017-08-23)

  • added check_for_modified_class_attribute method to FilterClassifier class

  • added complete_classname method to weka.core.classes module, which allows completion of partial classnames like .J48 to weka.classifiers.trees.J48 if there is a unique match; JavaObject.new_instance and JavaObject.check_type now make use of this functionality, allowing for instantiations like Classifier(cls=".J48")

  • jvm.start(system_cp=True) no longer fails with a KeyError: 'CLASSPATH' if there is no CLASSPATH environment variable defined

  • Libraries mtj.jar, core.jar and arpack_combined_all.jar were added as is to the weka.jar in the 3.9.1 release instead of adding their content to it. Repackaged weka.jar to fix this issue (https://github.com/fracpete/python-weka-wrapper3/issues/5)
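
The complete_classname entry above describes completing a partial classname such as .J48 when there is a unique match. The sketch below illustrates the idea against a fixed list of names. Suffix matching and raising ValueError on zero or multiple matches are assumptions. The function name complete_classname_sketch and the KNOWN_CLASSES list are hypothetical.

```python
# Small fixed list for illustration; the library would consult Weka itself.
KNOWN_CLASSES = [
    "weka.classifiers.trees.J48",
    "weka.classifiers.trees.RandomForest",
    "weka.classifiers.rules.ZeroR",
    "weka.filters.supervised.attribute.Discretize",
    "weka.filters.unsupervised.attribute.Discretize",
]


def complete_classname_sketch(partial, known_classes):
    """Expand a partial classname if exactly one known class ends with it."""
    if partial in known_classes:
        return partial
    matches = [name for name in known_classes if name.endswith(partial)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"no class matches {partial!r}")
    raise ValueError(f"{partial!r} is ambiguous: {', '.join(matches)}")


print(complete_classname_sketch(".J48", KNOWN_CLASSES))
```

A partial name such as .Discretize matches two classes in this list and is therefore rejected rather than guessed.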

0.1.2 (2017-01-04)

  • typeconv.double_matrix_to_ndarray no longer assumes a square matrix (https://github.com/fracpete/python-weka-wrapper3/issues/4)

  • len(Instances) now returns the number of rows in the dataset (module weka.core.dataset)

  • added method insert_attribute to the Instances class

  • added class method create_relational to the Attribute class

  • upgraded Weka to 3.9.1

0.1.1 (2016-10-19)

  • plot_learning_curve method of module weka.plot.classifiers now accepts a list of test sets; * is index of test set in label template string

  • added missing_value() methods to weka.core.dataset module and Instance class

  • output variable y for convenience method create_instances_from_lists in module weka.core.dataset is now optional

  • added convenience method create_instances_from_matrices to weka.core.dataset module to easily create an Instances object from numpy matrices (x and y)

0.1.0 (2016-05-09)

  • initial release of Python3 port
