A tool that converts tag-like information, given as lists of strings, into feature vectors whose elements are 1s and 0s

Project description

tags2vec

Overview

  • Converts tag-like information, given as lists of strings, into binary feature vectors: each element is 1 if the corresponding tag is present and 0 otherwise

Usage

import tags2vec

# Training data: one list of tags per sample
train_tags = [
    ["Spicy", "Red", "Delicious"],
    ["Sweet", "Green"],
]
# Tag info -> vector (training); also returns the tag list that fixes the column order
train_x, tags_info = tags2vec.train_tr(train_tags)
"""
train_x (numpy array):
[[1. 1. 1. 0. 0.]
 [0. 0. 0. 1. 1.]]

tags_info: ["Spicy", "Red", "Delicious", "Sweet", "Green"]
"""

# Test data: may contain tags that were not seen during training ("Yellow")
test_tags = [
    ["Sweet", "Red", "Delicious"],
    ["Spicy", "Yellow"],
]
# Tag info -> vector (prediction), reusing tags_info from training
test_x = tags2vec.pred_tr(test_tags, tags_info)
"""
test_x (numpy array):
[[0. 1. 1. 1. 0.]
 [1. 0. 0. 0. 0.]]
"""
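
Because column i of every vector corresponds to tags_info[i], a row can be read back into tag names. The following is a minimal sketch that hard-codes the values from the example above as plain lists so it runs without the package; the helper name active_tags is hypothetical and is not part of tags2vec.

```python
# Values copied from the usage example above (plain lists instead of NumPy arrays).
tags_info = ["Spicy", "Red", "Delicious", "Sweet", "Green"]
test_x = [
    [0.0, 1.0, 1.0, 1.0, 0.0],
    [1.0, 0.0, 0.0, 0.0, 0.0],
]


def active_tags(row, tags_info):
    """Return the tag names whose column is set to 1 in this row (hypothetical helper)."""
    return [tag for tag, value in zip(tags_info, row) if value == 1]


decoded = [active_tags(row, tags_info) for row in test_x]
print(decoded)
```

The decoded tags come back in tags_info order rather than input order, and "Yellow" cannot be recovered from the second row because it was never part of tags_info.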

Detailed explanation

  • This tool is intended as a preprocessing step for supervised learning.
    • For that reason, the vectors are returned as a 2-D NumPy array with one row per sample and one column per tag.
  • In the training phase, train_tr also returns the list of tags as tags_info.
  • In the prediction phase, pred_tr builds the vectors from tags_info (the list of tags and their order).
    • This keeps the columns at prediction time consistent with those used in training.
  • A tag that appears at prediction time but was not present during training is ignored.
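
The behaviour described in the list above can be sketched in plain Python. This is not the tags2vec implementation: the function names build_tag_list and encode are hypothetical, the rows are plain lists rather than NumPy arrays, and the first-appearance ordering of tags is an assumption inferred from the usage example.

```python
def build_tag_list(samples):
    """Collect unique tags in order of first appearance (assumed ordering)."""
    tags = []
    seen = set()
    for sample in samples:
        for tag in sample:
            if tag not in seen:
                seen.add(tag)
                tags.append(tag)
    return tags


def encode(samples, tags):
    """Encode each sample as a 0/1 row whose columns follow the order of `tags`."""
    index = {tag: i for i, tag in enumerate(tags)}
    rows = []
    for sample in samples:
        row = [0.0] * len(tags)
        for tag in sample:
            i = index.get(tag)
            if i is not None:  # tags not seen during training are skipped
                row[i] = 1.0
        rows.append(row)
    return rows


train_tags = [["Spicy", "Red", "Delicious"], ["Sweet", "Green"]]
test_tags = [["Sweet", "Red", "Delicious"], ["Spicy", "Yellow"]]

tags = build_tag_list(train_tags)     # plays the role of tags_info
train_rows = encode(train_tags, tags)
test_rows = encode(test_tags, tags)   # "Yellow" is unknown and therefore dropped
```

Fixing the tag list once at training time and reusing it for every later call is what keeps the column layout identical between the training and prediction matrices.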

Download files

Source Distribution

tags2vec-0.0.0.tar.gz (3.5 kB)

Uploaded Source

Built Distribution

tags2vec-0.0.0-py3-none-any.whl (3.7 kB)

Uploaded Python 3

File details

Details for the file tags2vec-0.0.0.tar.gz.

File metadata

  • Download URL: tags2vec-0.0.0.tar.gz
  • Upload date:
  • Size: 3.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/3.10.0 pkginfo/1.7.0 requests/2.22.0 requests-toolbelt/0.9.1 tqdm/4.59.0 CPython/3.8.8

File hashes

Hashes for tags2vec-0.0.0.tar.gz
Algorithm    Hash digest
SHA256       8fb15ef17d1dbd11cb044d01037117f67508fe42734fbad60f01e9eb27289e6a
MD5          cb53e109a1ed6109abdbf5f25e3a4db8
BLAKE2b-256  bd9e7b424a52497b1c0ea8f5482da28912d9349d35cb940ca2669a4f1d31fe8f

File details

Details for the file tags2vec-0.0.0-py3-none-any.whl.

File metadata

  • Download URL: tags2vec-0.0.0-py3-none-any.whl
  • Upload date:
  • Size: 3.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/3.10.0 pkginfo/1.7.0 requests/2.22.0 requests-toolbelt/0.9.1 tqdm/4.59.0 CPython/3.8.8

File hashes

Hashes for tags2vec-0.0.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       532e518012dc0fa6302afce20a0a32cdf1b1c5f2d38073b1d35a52d348312f78
MD5          4ea8fdf38bf53d78fb1d0ba5175ff60c
BLAKE2b-256  17181662bd2f3b123fd7c60f6ccde30c0370a799b37dc8d32de448c9158c14f2
