Bag of Factors
Bag of Factors allows you to analyze a corpus from its factors.
Free software: GNU General Public License v3
Documentation: https://balouf.github.io/bof/.
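To make the idea concrete: in the usual string-processing sense, a factor of a text is any contiguous substring of it. The following is a minimal pure-Python sketch of counting the factors of each document in a corpus. It does not use the bof API; the names `count_factors` and `n_range` are made up for this illustration, and capping the factor length is an assumption made to keep the output small.

```python
from collections import Counter


def count_factors(text, n_range=3):
    """Count every contiguous substring (factor) of `text` of length 1..n_range.

    Hypothetical helper for illustration only; not part of the bof package.
    """
    counts = Counter()
    for start in range(len(text)):
        for length in range(1, n_range + 1):
            if start + length > len(text):
                break  # longer factors would run past the end of the text
            counts[text[start:start + length]] += 1
    return counts


corpus = ["riri", "fifi"]
bags = [count_factors(doc, n_range=2) for doc in corpus]
print(dict(bags[0]))  # {'r': 2, 'ri': 2, 'i': 2, 'ir': 1}
```

Each document becomes a "bag" (multiset) of its factors, which is the representation the rest of the package works from.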
Features
TODO
Credits
This package was created with Cookiecutter and the francois-durand/package_helper_2 project template.
History
0.3.3 (2021-01-01): Cython/Numba balanced
All core CountVectorizer methods have been ported to Cython. They are roughly 2.5x faster than their sklearn counterparts, mainly because some features such as min_df/max_df are not implemented.
The Numba methods of Process have NOT been converted to Cython, as Numba seems to be about 20% faster for CSR manipulation.
Numba functions are cached to avoid compilation lag.
0.3.2 (2020-12-30): Going Cython
First attempt to use Cython.
Right now only the fit_transform method of CountVectorizer has been cythonized, for testing wheels.
If all goes well, Numba will probably be abandoned and all the heavy lifting will be done in Cython.
0.3.1 (2020-12-28): Simplification of core algorithm
Attributes of the CountVectorizer have been reduced to the minimum: one dict!
Now faster than the sklearn counterpart! (The reason is that only one case is considered here, so a lot of checks and attributes can be dropped.)
0.3.0 (2020-12-15): CountVectorizer and Process
The core is now the CountVectorizer class, which is lighter and faster. Only features are kept inside.
New process module inspired by fuzzywuzzy!
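As an illustration of what fuzzywuzzy-style matching on factors can look like, here is a rough sketch that ranks candidate strings by how much their factor multisets overlap with those of a query. It is a toy under stated assumptions, not the process module's actual algorithm or API: `factor_bag`, `similarity`, and `extract` are hypothetical names, and the Jaccard-style score is chosen only for simplicity.

```python
from collections import Counter


def factor_bag(text, n_range=3):
    """Multiset of contiguous substrings of length 1..n_range (hypothetical helper)."""
    return Counter(
        text[i:i + n]
        for i in range(len(text))
        for n in range(1, n_range + 1)
        if i + n <= len(text)
    )


def similarity(a, b, n_range=3):
    """Multiset Jaccard index between the two factor bags, in [0, 1]."""
    bag_a, bag_b = factor_bag(a, n_range), factor_bag(b, n_range)
    union = sum((bag_a | bag_b).values())
    return sum((bag_a & bag_b).values()) / union if union else 0.0


def extract(query, choices, limit=2):
    """Return the `limit` best-scoring choices as (choice, score), best first."""
    scored = [(choice, similarity(query, choice)) for choice in choices]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:limit]


choices = ["kiwi", "riri", "fifi", "loulou"]
print(extract("rir", choices))
```

With these inputs, "riri" comes out on top because it shares most of its short substrings with the query.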
0.2.0 (2020-12-15): Fit/Transform
Full refactoring to make the package fit/transform compliant.
Added a fit_sampling method that allows fitting only a (random) subset of factors.
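For readers unfamiliar with the fit/transform convention, the sketch below shows the general shape in pure Python: fit learns which factors exist, transform counts them in new texts, and a sampling variant learns only a random subset. This is not the bof implementation. The class `TinyFactorCounter`, its `features_` attribute, and the parameters `n_range`, `sampling_rate`, and `seed` are assumptions introduced for the example, and keeping each distinct factor with a fixed probability is only one possible reading of "random subset".

```python
import random


class TinyFactorCounter:
    """Toy fit/transform factor counter (hypothetical; not the bof implementation)."""

    def __init__(self, n_range=3):
        self.n_range = n_range
        self.features_ = {}  # factor -> column index

    def _factors(self, text):
        # Yield every contiguous substring of length 1..n_range.
        for i in range(len(text)):
            for n in range(1, self.n_range + 1):
                if i + n <= len(text):
                    yield text[i:i + n]

    def fit(self, corpus):
        """Learn every factor seen in the corpus."""
        for text in corpus:
            for factor in self._factors(text):
                self.features_.setdefault(factor, len(self.features_))
        return self

    def fit_sampling(self, corpus, sampling_rate=0.5, seed=0):
        """Learn each distinct factor with probability `sampling_rate` (assumed semantics)."""
        rng = random.Random(seed)
        distinct = dict.fromkeys(f for text in corpus for f in self._factors(text))
        for factor in distinct:
            if factor not in self.features_ and rng.random() < sampling_rate:
                self.features_[factor] = len(self.features_)
        return self

    def transform(self, corpus):
        """Count known factors per document; unknown factors are ignored."""
        rows = []
        for text in corpus:
            row = [0] * len(self.features_)
            for factor in self._factors(text):
                index = self.features_.get(factor)
                if index is not None:
                    row[index] += 1
            rows.append(row)
        return rows


corpus = ["riri", "fifi"]
full = TinyFactorCounter(n_range=2).fit(corpus)
print(full.transform(["rififi"]))  # [[1, 1, 3, 0, 2, 2, 2]]

sampled = TinyFactorCounter(n_range=2).fit_sampling(corpus, sampling_rate=0.5)
print(len(sampled.features_), "of", len(full.features_), "factors kept")
```

The columns of the printed row follow the order in which factors were first seen during fit: r, ri, i, ir, f, fi, if. A real implementation would return a sparse matrix rather than nested lists.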
0.1.1 (2020-12-12): Upgrades
Docstrings added
Common module (feat. save/load capabilities)
Joint Complexity module
0.1.0 (2020-12-12): First release
First release on PyPI.
Core FactorTree class added.