image toolkit

What is it

imagetk is a Python package for image analysis, providing image processing, feature extraction, and edge detection.

Where to get it

The source code and binary installers for the latest released version are available on the Python Package Index:

https://pypi.org/project/imagetk

You can install imagetk with pip:

pip install imagetk

Alternatively, from the imagetk source directory, run setup.py:

python setup.py install

How to use it

Feature extraction

brisque

from imagetk import feature
import numpy
from PIL import Image

image = Image.open("test.png")  # read the image
image = image.convert('L')  # convert to grayscale
image = numpy.array(image)  # convert to a numpy.array
brisque_feature = feature.brisque(image)  # compute the BRISQUE feature

mean_grad

image = Image.open("test.png")  # read the image
image = image.convert('L')  # convert to grayscale
image = numpy.array(image)  # convert to a numpy.array
mean_grad = feature.mean_grad(image)  # compute mean_grad

gray_quantile_expose

image = Image.open("test.png")  # read the image
image = image.convert('L')  # convert to grayscale
image = numpy.array(image)  # convert to a numpy.array
bright, dark = feature.gray_quantile_expose(image, q=0.75, pix=(96, 192))  # compute gray_quantile_expose

block_value_expose

image = Image.open("test.png")  # read the image
image = numpy.array(image)  # convert to a numpy.array (no grayscale conversion here)
bright, dark = feature.block_value_expose(image, stride=8, channel_first=False)  # pass the array just created

cumprob

image = Image.open("test.png")  # read the image
image = image.convert('L')  # convert to grayscale
image = numpy.array(image)  # convert to a numpy.array
cumprob = feature.cumprob(image)

Edge detection

sobel

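from imagetk import edge  # assumed import, by analogy with feature and threshold

image = Image.open("test.png")  # read the image
image = image.convert('L')  # convert to grayscale
image = numpy.array(image)  # convert to a numpy.array
edges = edge.sobel(image)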
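
For readers unfamiliar with the operator, here is a minimal pure-Python sketch of a generic Sobel gradient magnitude. It is not imagetk's implementation: the name sobel_magnitude, the list-of-rows input, and the choice to leave border pixels at zero are all assumptions made for illustration.

```python
import math

# Standard 3x3 Sobel kernels for horizontal (KX) and vertical (KY) gradients.
KX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
KY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Return the gradient magnitude of a 2D grayscale image (list of rows).

    Hypothetical helper for illustration; border pixels are left at 0.
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    p = image[y + dy - 1][x + dx - 1]
                    gx += KX[dy][dx] * p
                    gy += KY[dy][dx] * p
            out[y][x] = math.hypot(gx, gy)
    return out


# A vertical step edge: dark on the left, bright on the right.
step = [[0, 0, 255, 255] for _ in range(4)]
magnitude = sobel_magnitude(step)
```

Interior pixels next to the step get a large magnitude, while a flat image yields zeros everywhere.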

canny

image = Image.open("test.png")  # read the image
image = image.convert('L')  # convert to grayscale
image = numpy.array(image)  # convert to a numpy.array
edges = edge.canny(image, kernel_size=3, sigma=1)
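
Canny edge detection conventionally starts by smoothing the image with a Gaussian filter. Assuming kernel_size and sigma above parameterize that filter (the source does not say so explicitly), the kernel could look like the sketch below. gaussian_kernel is a hypothetical helper, not part of imagetk.

```python
import math


def gaussian_kernel(kernel_size=3, sigma=1.0):
    """Return a normalized kernel_size x kernel_size Gaussian kernel.

    Hypothetical helper for illustration; kernel_size must be odd.
    """
    if kernel_size % 2 == 0:
        raise ValueError("kernel_size must be odd")
    half = kernel_size // 2
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-half, half + 1)]
    total = sum(weights)
    row = [w / total for w in weights]
    # The 2D kernel is the outer product of the normalized 1D kernel with itself.
    return [[a * b for b in row] for a in row]


kernel = gaussian_kernel(kernel_size=3, sigma=1)
```

A larger sigma spreads the weights more evenly, which blurs more and suppresses more noise before gradients are taken.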

houghline

from imagetk import threshold

image = Image.open("test.png")  # read the image
image = image.convert('L')  # convert to grayscale
image = numpy.array(image)  # convert to a numpy.array
# binarize with the Otsu threshold
thres = threshold.otsu(image)
image[image > thres] = 255
image[image <= thres] = 0
# edge detection
edge_image = edge.canny(image).astype(numpy.uint8)
# Hough line detection
lines = edge.houghlines(edge_image, rho=1, theta=numpy.pi/180, threshold=100)
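
The binarization step above relies on an Otsu threshold. As background, the following is a minimal pure-Python sketch of the textbook Otsu method, which picks the gray level that maximizes the between-class variance. It is not imagetk's threshold.otsu; the name otsu_threshold and the flat-list input are assumptions made for illustration.

```python
def otsu_threshold(pixels):
    """Return the gray level t that maximizes between-class variance.

    `pixels` is a flat iterable of integers in 0..255. Pixels <= t form
    the background class, matching the `image <= thres` split used above.
    Hypothetical helper, not imagetk's threshold.otsu.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0    # number of pixels in the background class
    sum_bg = 0  # sum of gray levels in the background class
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        w_fg = total - w_bg
        if w_bg == 0 or w_fg == 0:
            continue  # one class is empty, variance is undefined
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


pixels = [10] * 50 + [200] * 50            # two well-separated gray levels
t = otsu_threshold(pixels)
binary = [255 if p > t else 0 for p in pixels]
```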

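Image deduplication (dedup)

The image deduplication module finds and removes duplicate images, including deduplication based on a perceptual image similarity algorithm.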

imagetk.dedup.Hash(hash_func:Callable=hash.percept,dist_func:Callable=distance.hamming,process_func:Callable=None,
			            threshold:int=10,scale:Tuple[int,int]=None,hash_size:int=64,hash_hex:bool=False,
			            errors:Literal['ignore','raise','coerce']='raise',
			            suffix:Union[str,List[str]]=['JPEG', 'PNG', 'BMP', 'MPO', 'PPM', 'TIFF', 'GIF', 'WEBP', 'JPG'],
			            n_worker:int=cpu_count())

Parameters

  • hash_func : Callable, optional. Hash function. The default is hash.percept.
  • dist_func : Callable, optional. Distance metric function. The default is distance.hamming.
  • process_func : Callable, optional. Image preprocessing function; if None, self.process is used. The default is None.
  • threshold : int, optional. Threshold for judging duplicates: the distance is computed with dist_func, and images whose distance is below threshold are treated as duplicates. The default is 10.
  • scale : Tuple[int,int], optional. Scaling parameter used by self.process when resizing images. The default is None.
  • hash_size : int, optional. Length of the hash value. The default is 64.
  • hash_hex : bool, optional. Whether to convert the output of hash_func to hexadecimal, which saves storage space. The default is False.
  • errors : str, optional. How to handle errors while loading or deduplicating images; one of ['ignore','raise','coerce']. 'ignore' skips the file, 'raise' raises an exception, 'coerce' substitutes an empty image. The default is 'raise'.
  • suffix : str or list, optional. Image file suffixes that can be processed, case-insensitive. The default is ['JPEG', 'PNG', 'BMP', 'MPO', 'PPM', 'TIFF', 'GIF', 'WEBP', 'JPG'].
  • n_worker : int, optional. Number of parallel worker processes. The default is cpu_count().

Returns

None
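
To make the threshold and hash_hex parameters more concrete, here is a small sketch of how a Hamming distance over 64-bit hashes can be compared against a threshold, and how a bit sequence can be packed into hexadecimal. The helpers hamming, bits_to_hex and hex_to_bits are hypothetical and are not imagetk's distance.hamming or its internal encoding; the hashes are made-up values.

```python
def hamming(a, b):
    """Count the positions at which two equal-length bit sequences differ."""
    if len(a) != len(b):
        raise ValueError("hashes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def bits_to_hex(bits):
    """Pack a bit sequence (length divisible by 4) into a hex string."""
    value = int("".join(str(int(b)) for b in bits), 2)
    return format(value, "0{}x".format(len(bits) // 4))


def hex_to_bits(text):
    """Inverse of bits_to_hex."""
    value = int(text, 16)
    return [int(c) for c in format(value, "0{}b".format(len(text) * 4))]


hash_a = [1, 0] * 32           # a made-up 64-bit hash
hash_b = list(hash_a)
for i in (0, 5, 9):            # flip three bits
    hash_b[i] ^= 1

distance = hamming(hash_a, hash_b)
is_duplicate = distance < 10   # "distance below threshold" with threshold=10
packed = bits_to_hex(hash_a)   # 16 hex characters instead of 64 separate bits
```

Storing the packed form is what makes a hexadecimal representation cheaper to keep around; it can be unpacked again before distances are computed.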

Algorithms

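For the principle behind the perceptual image similarity algorithm, see the article "python计算图像感知相似度(PHash Sim)实例" (an example of computing perceptual image similarity, PHash Sim, in Python).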
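
As background, a commonly described DCT-based perceptual hash can be sketched as follows. This is a generic illustration and not necessarily what hash.percept does: it assumes the image has already been converted to grayscale and resized to a small square grid (32x32 here), and the names dct_1d, dct_2d and phash_bits are hypothetical.

```python
import math


def dct_1d(values):
    """Unnormalized DCT-II of a sequence."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * (i + 0.5) * k / n)
            for i, v in enumerate(values))
        for k in range(n)
    ]


def dct_2d(grid):
    """Separable 2D DCT: transform every row, then every column."""
    rows = [dct_1d(row) for row in grid]
    cols = [dct_1d(col) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def phash_bits(gray, low=8):
    """Hash a square grayscale grid into low*low bits.

    Keeps the low x low lowest-frequency DCT coefficients and sets a bit
    to 1 when its coefficient exceeds their (upper) median.
    """
    coeffs = dct_2d(gray)
    block = [coeffs[y][x] for y in range(low) for x in range(low)]
    median = sorted(block)[len(block) // 2]
    return [1 if c > median else 0 for c in block]


# A synthetic 32x32 test pattern standing in for a resized grayscale image.
gray = [[(x * x + 3 * y * y) % 256 for x in range(32)] for y in range(32)]
bits = phash_bits(gray)   # 64 bits, comparable with a Hamming distance
```

Because only low-frequency coefficients are kept, small changes such as mild compression or rescaling tend to flip few bits, which is why such hashes are compared by distance rather than by equality.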

Basic usage

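dedup.Hash provides two deduplication methods for each of file lists, folders, and archives:

Target     Find images similar to a given image             Find similar images within a given set
File list  find_dup_from_files(file,files)                  find_dup_in_files(files,score:Dict=None)
Folder     find_dup_from_folder(file,path:str)              find_dup_in_folder(path:str,score:Dict=None)
Archive    find_dup_from_archive(file,path,mode='zipfile')  find_dup_in_archive(path,score=None,mode='zipfile')

  • file: path of the given image
  • files: list of file paths to check for duplicates
  • path: path of the folder or archive to check for duplicates
  • score: image scores, as a dict. If score is provided, deduplication proceeds from the highest score to the lowest; once an image is found to be similar to a higher-scoring image, no other images are compared against it.
  • mode: archive format; zipfile and tarfile are supported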
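
One way to read the score rule above is as a greedy procedure: visit images from highest to lowest score, and let each surviving image claim every remaining image that is closer than the threshold. The sketch below illustrates that reading on tiny made-up hashes. The function dedup_by_score and its return format are hypothetical and may differ from what find_dup_in_files actually returns.

```python
def dedup_by_score(hashes, score, threshold=10):
    """Group near-duplicates, keeping the highest-scoring item of each group.

    hashes: dict name -> bit list; score: dict name -> number.
    Returns dict kept_name -> list of names judged duplicates of it.
    Hypothetical helper for illustration only.
    """
    order = sorted(hashes, key=lambda name: score.get(name, 0), reverse=True)
    removed = set()
    groups = {}
    for i, keeper in enumerate(order):
        if keeper in removed:
            continue  # already matched to a higher-scoring image
        groups[keeper] = []
        for other in order[i + 1:]:
            if other in removed:
                continue
            distance = sum(a != b for a, b in zip(hashes[keeper], hashes[other]))
            if distance < threshold:
                groups[keeper].append(other)
                removed.add(other)
    return groups


# Tiny 8-bit hashes and a small threshold, purely for demonstration.
hashes = {"a.png": [0] * 8, "b.png": [0] * 7 + [1], "c.png": [1] * 8}
score = {"a.png": 3, "b.png": 9, "c.png": 5}
groups = dedup_by_score(hashes, score, threshold=2)
```

Here b.png has the highest score, so it is kept and a.png is reported as its duplicate; c.png is too far from both and forms its own group.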

Examples

import os
import numpy
from imagetk import dedup
data_path=os.path.join(os.path.dirname(dedup.__file__),'../test/data')

if __name__=='__main__':
    # collect the test images (the suffix match is case-sensitive)
    images=[os.path.join(data_path,i) for i in os.listdir(data_path) if i.endswith(('.bmp','.png','.JPEG','.jpg'))]
    ref_image=data_path+'/I10.BMP'

    dedup_task=dedup.Hash()

    #find_dup_from_files
    dedup_task.find_dup_from_files(file=ref_image,files=images)
    #find_dup_in_files
    dedup_task.find_dup_in_files(files=images)
    #find_dup_in_files score
    score={i:numpy.random.randint(0,10) for i in images}
    dedup_task.find_dup_in_files(files=images,score=score)

    #find_dup_from_folder
    dedup_task.find_dup_from_folder(file=ref_image,path=data_path)
    #find_dup_in_folder
    dedup_task.find_dup_in_folder(path=data_path)

    #find_dup_from_archive zipfile
    dedup_task.find_dup_from_archive(file=data_path+'/ILSVRC2012_val_00020553.JPEG',path=data_path+'/imagewoof_train.zip',mode='zipfile')
    #find_dup_from_archive tarfile
    dedup_task.find_dup_from_archive(file=data_path+'/ILSVRC2012_val_00010420.JPEG',path=data_path+'/imagewoof_val.tar.gz',mode='tarfile')
    #find_dup_in_archive tarfile
    dedup_task.find_dup_in_archive(path=data_path+'/imagewoof_val.tar.gz',mode='tarfile')

Advanced usage

File encoding

Target     Encoding method
File       encode(image:numpy.ndarray)
File list  encode_files(files:List)
Folder     encode_folder(path)
Archive    encode_archive(path,mode='zipfile')

Deduplicating encoded files

Target      Find images similar to a given image            Find similar images within a given set
Encode map  find_dup_from_map(encode,encode_map:Dict=None)  find_dup_in_map(encode_map:Dict,score:Dict=None)

Examples

import os
import numpy
from imagetk import dedup
data_path=os.path.join(os.path.dirname(dedup.__file__),'../test/data')

if __name__=='__main__':
    images=[os.path.join(data_path,i) for i in os.listdir(data_path) if i.endswith(('.bmp','.png','.JPEG','.jpg'))]
    ref_image=data_path+'/I10.BMP'

    dedup_task=dedup.Hash()

    #encode file
    encode=dedup_task.encode_file(ref_image)
    #encode files
    encode_map=dedup_task.encode_files(images)
    #encode folder
    dedup_task.encode_folder(data_path)
    #encode archive
    dedup_task.encode_archive(data_path+'/imagewoof_train.zip',mode='zipfile')

    #find_dup_from_map
    dedup_task.find_dup_from_map(encode,encode_map)
    #find_dup_in_map
    dedup_task.find_dup_in_map(encode_map)

Project URL: https://github.com/idealo/imagededup
