
A short set of functions to help analyze the cosine similarity between texts.

Project description

tf_idf


Install

pip install tf_idf_cosimm

How to use

The example below preprocesses two short texts, computes their TF-IDF weights, and compares them by cosine similarity:

import tf_idf.core as tf_idf
import pandas as pd
AI = 'For instance, in the design phase of a structural engineering project, Monte Carlo simulations can help evaluate the performance of a proposed design under different loading conditions and material properties, providing valuable insights into its reliability and safety'
ME = 'For instance, Monte Carlo simulations can simulate hundreds or thousands of different combinations of loading conditions and material properties to create statistical predictions of structure stiffness'
# Preprocess each text into a one-row DataFrame, then stack the two rows
compare = tf_idf.preprocess_text(AI)
compare = pd.concat([compare, tf_idf.preprocess_text(ME)], ignore_index=True)
compare
Row 0
  • DOCUMENT: For instance, in the design phase of a structural engineering project, Monte Carlo simulations can help evaluate the performance of a proposed design under different loading conditions and material properties, providing valuable insights into its reliability and safety
  • LOWERCASE: for instance, in the design phase of a structural engineering project, monte carlo simulations can help evaluate the performance of a proposed design under different loading conditions and material properties, providing valuable insights into its reliability and safety
  • CLEANING: for instance in the design phase of a structural engineering project monte carlo simulations can help evaluate the performance of a proposed design under different loading conditions and material properties providing valuable insights into its reliability and safety
  • TOKENIZATION: [for, instance, in, the, design, phase, of, a, structural, engineering, project, monte, carlo, simulations, can, help, evaluate, the, performance, of, a, proposed, design, under, different, loading, conditions, and, material, properties, providing, valuable, insights, into, its, reliability, and, safety]
  • STOP-WORDS: [instance, design, phase, structural, engineering, project, monte, carlo, simulations, evaluate, performance, proposed, design, different, loading, conditions, material, properties, providing, valuable, insights, reliability, safety]
  • STEMMING: [instanc, design, phase, structur, engin, project, mont, carlo, simul, evalu, perform, propos, design, differ, load, condit, materi, properti, provid, valuabl, insight, reliabl, safeti]

Row 1
  • DOCUMENT: For instance, Monte Carlo simulations can simulate hundreds or thousands of different combinations of loading conditions and material properties to create statistical predictions of structure stiffness
  • LOWERCASE: for instance, monte carlo simulations can simulate hundreds or thousands of different combinations of loading conditions and material properties to create statistical predictions of structure stiffness
  • CLEANING: for instance monte carlo simulations can simulate hundreds or thousands of different combinations of loading conditions and material properties to create statistical predictions of structure stiffness
  • TOKENIZATION: [for, instance, monte, carlo, simulations, can, simulate, hundreds, or, thousands, of, different, combinations, of, loading, conditions, and, material, properties, to, create, statistical, predictions, of, structure, stiffness]
  • STOP-WORDS: [instance, monte, carlo, simulations, simulate, hundreds, thousands, different, combinations, loading, conditions, material, properties, create, statistical, predictions, structure, stiffness]
  • STEMMING: [instanc, mont, carlo, simul, simul, hundr, thousand, differ, combin, load, condit, materi, properti, creat, statist, predict, structur, stiff]
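The column names above suggest a pipeline of lowercasing, punctuation cleaning, tokenization, and stop-word removal, followed by stemming. The sketch below illustrates the first four steps with the standard library only. It is not the package's implementation: the `preprocess` function and the tiny `STOP_WORDS` set are hypothetical, and stemming is left out because it needs a stemmer from a third-party library.

```python
import re

# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = {"a", "and", "can", "for", "in", "into", "its", "of", "or", "the", "to", "under"}

def preprocess(text):
    """Return each intermediate step, keyed like the columns shown above."""
    lowered = text.lower()
    # Drop everything except lowercase letters, digits, and whitespace.
    cleaned = re.sub(r"[^a-z0-9\s]", "", lowered)
    tokens = cleaned.split()
    kept = [t for t in tokens if t not in STOP_WORDS]
    return {
        "DOCUMENT": text,
        "LOWERCASE": lowered,
        "CLEANING": cleaned,
        "TOKENIZATION": tokens,
        "STOP-WORDS": kept,
    }

steps = preprocess(
    "For instance, Monte Carlo simulations can simulate hundreds or "
    "thousands of different combinations"
)
for name, value in steps.items():
    print(name, "->", value)
```

A real pipeline would normally use a fuller stop-word list and add a stemming step to reach the STEMMING column.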
compare_tfidf = tf_idf.calculate_tfidf(compare)
compare_tfidf
The DOCUMENT, LOWERCASE, CLEANING, TOKENIZATION, STOP-WORDS and STEMMING columns are unchanged from the preprocessing output above. The added TF-IDF columns, one per stemmed term, are:

   carlo     combin    condit    creat     ...  propos    provid    reliabl   safeti    simul     statist   stiff     structur  thousand  valuabl
0  0.158850  0.000000  0.158850  0.000000  ...  0.223259  0.223259  0.223259  0.223259  0.158850  0.000000  0.000000  0.158850  0.000000  0.223259
1  0.193068  0.271351  0.193068  0.271351  ...  0.000000  0.000000  0.000000  0.000000  0.386137  0.271351  0.271351  0.193068  0.271351  0.000000

2 rows × 35 columns
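The weights above are consistent with a smoothed inverse document frequency, ln((1 + N) / (1 + df)) + 1, multiplied by the raw term count and followed by L2 normalization of each row. This is the default weighting of scikit-learn's TfidfVectorizer. Whether `calculate_tfidf` computes them this way internally is an assumption. The sketch below is a standard-library illustration with a hypothetical `tfidf_rows` helper, using the stemmed tokens from the STEMMING column as input.

```python
import math
from collections import Counter

# Stemmed tokens copied from the STEMMING column above.
docs = [
    ["instanc", "design", "phase", "structur", "engin", "project", "mont",
     "carlo", "simul", "evalu", "perform", "propos", "design", "differ",
     "load", "condit", "materi", "properti", "provid", "valuabl", "insight",
     "reliabl", "safeti"],
    ["instanc", "mont", "carlo", "simul", "simul", "hundr", "thousand",
     "differ", "combin", "load", "condit", "materi", "properti", "creat",
     "statist", "predict", "structur", "stiff"],
]

def tfidf_rows(token_lists):
    """Return one {term: weight} dict per document (hypothetical helper)."""
    n_docs = len(token_lists)
    vocab = sorted({t for tokens in token_lists for t in tokens})
    # Document frequency: number of documents that contain each term.
    df = Counter(t for tokens in token_lists for t in set(tokens))
    # Smoothed IDF (assumed weighting, see the note above).
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in vocab}
    rows = []
    for tokens in token_lists:
        counts = Counter(tokens)
        weights = {t: counts[t] * idf[t] for t in vocab}
        # L2-normalize so every row has unit length.
        norm = math.sqrt(sum(w * w for w in weights.values()))
        rows.append({t: w / norm for t, w in weights.items()})
    return rows

rows = tfidf_rows(docs)
for term in ("carlo", "combin", "simul"):
    print(term, f"{rows[0][term]:.6f}", f"{rows[1][term]:.6f}")
```

Under these assumptions the sketch should reproduce the values in the table, for example 0.158850 for carlo in row 0 and 0.386137 for simul in row 1, where the stem appears twice.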

tf_idf.cosineSimilarity(compare)
Row 0
  • DOCUMENT: For instance, in the design phase of a structural engineering project, Monte Carlo simulations can help evaluate the performance of a proposed design under different loading conditions and material properties, providing valuable insights into its reliability and safety
  • STEMMING: [instanc, design, phase, structur, engin, project, mont, carlo, simul, evalu, perform, propos, design, differ, load, condit, materi, properti, provid, valuabl, insight, reliabl, safeti]
  • COSIM: 1.000000

Row 1
  • DOCUMENT: For instance, Monte Carlo simulations can simulate hundreds or thousands of different combinations of loading conditions and material properties to create statistical predictions of structure stiffness
  • STEMMING: [instanc, mont, carlo, simul, simul, hundr, thousand, differ, combin, load, condit, materi, properti, creat, statist, predict, structur, stiff]
  • COSIM: 0.337359
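In this output the COSIM column appears to hold each row's cosine similarity to the first document, which is why row 0 scores 1.0. The sketch below shows how such a score can be computed from sparse term-weight vectors. The `cosine_similarity` helper and the smoothed-IDF weighting are assumptions for illustration, not the package's own code. Because cosine similarity ignores vector length, the rows are not normalized first.

```python
import math
from collections import Counter

def cosine_similarity(u, v):
    """Cosine of the angle between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v)

# Stemmed tokens copied from the STEMMING column above.
docs = [
    ["instanc", "design", "phase", "structur", "engin", "project", "mont",
     "carlo", "simul", "evalu", "perform", "propos", "design", "differ",
     "load", "condit", "materi", "properti", "provid", "valuabl", "insight",
     "reliabl", "safeti"],
    ["instanc", "mont", "carlo", "simul", "simul", "hundr", "thousand",
     "differ", "combin", "load", "condit", "materi", "properti", "creat",
     "statist", "predict", "structur", "stiff"],
]

n_docs = len(docs)
# Document frequency: number of documents that contain each term.
df = Counter(t for tokens in docs for t in set(tokens))
# Term count times smoothed IDF (assumed weighting), one sparse vector per document.
vectors = [
    {t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1)
     for t, c in Counter(tokens).items()}
    for tokens in docs
]

# Compare every document against the first one.
scores = [cosine_similarity(vectors[0], v) for v in vectors]
print([f"{s:.6f}" for s in scores])
```

Only terms shared by both texts contribute to the dot product, so the score rises with the overlap in stemmed vocabulary.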

Download files


Source Distribution

tf_idf_cosimm-0.0.2.tar.gz (10.4 kB)

Uploaded Source

Built Distribution

tf_idf_cosimm-0.0.2-py3-none-any.whl (9.2 kB)

Uploaded Python 3

File details

Details for the file tf_idf_cosimm-0.0.2.tar.gz.

File metadata

  • Download URL: tf_idf_cosimm-0.0.2.tar.gz
  • Upload date:
  • Size: 10.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.10.14

File hashes

Hashes for tf_idf_cosimm-0.0.2.tar.gz
Algorithm Hash digest
SHA256 a3e9a38c4cd53e5720bca687215abdc273a71d5a39f7e59ae659f9abc4e69c96
MD5 a01ba3d2cea0953d7717ed50a3c0a26e
BLAKE2b-256 dedaf1897e332602ef43985bac7c301285a94e371c7ccdfd84935835037cc94b


File details

Details for the file tf_idf_cosimm-0.0.2-py3-none-any.whl.

File hashes

Hashes for tf_idf_cosimm-0.0.2-py3-none-any.whl
Algorithm Hash digest
SHA256 daac75b3065830310aa19fb844109e622f138a809e2d7d958545ce4a2e8cd667
MD5 055f30b238d0e168dc78460686e06c91
BLAKE2b-256 c6a0b83d5cd1985bc46ccab65f313708019a370deed2deee39de7e017f06cefd

