swh.indexer

Software Heritage indexer

Development Status
- 5 - Production/Stable
Intended Audience
- Developers
License
- OSI Approved :: GNU General Public License v3 (GPLv3)
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

Tools to compute multiple indexes on SWH’s raw contents:

content:
- mimetype
- fossology-license
- metadata
origin:
- metadata (intrinsic, using the content indexer; and extrinsic)

An indexer is in charge of:

- looking up objects
- extracting information from those objects
- storing that information in the swh-indexer db

There are multiple indexers working on different object types:

- content indexer: works with content sha1 hashes
- revision indexer: works with revision sha1 hashes
- origin indexer: works with origin identifiers

Indexation procedure:

- receive a batch of ids
- retrieve the associated data, depending on the object type
- compute some index for that object
- store the result in swh’s storage
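
The procedure above can be sketched as a small loop. This is only an illustration under stated assumptions, not the swh.indexer implementation: `OBJECTS`, `INDEX`, `guess_kind`, and `index_batch` are hypothetical names, and plain dictionaries stand in for the object storage and the index store.

```python
import hashlib
from typing import Callable, Dict, Iterable, List

# Hypothetical in-memory stand-ins (assumptions, not swh APIs):
# OBJECTS plays the role of the object storage, INDEX the index store.
OBJECTS: Dict[str, bytes] = {}
INDEX: Dict[str, dict] = {}


def sha1_hex(data: bytes) -> str:
    """Return the hex sha1 of some content, used here as its id."""
    return hashlib.sha1(data).hexdigest()


def guess_kind(data: bytes) -> dict:
    """Toy index function: flag content as UTF-8 text or binary."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return {"kind": "binary"}
    return {"kind": "text"}


def index_batch(ids: Iterable[str], compute: Callable[[bytes], dict]) -> List[str]:
    """Run the four steps on a batch and return the ids that were indexed."""
    indexed = []
    for object_id in ids:                 # receive a batch of ids
        data = OBJECTS.get(object_id)     # retrieve the associated data
        if data is None:
            continue                      # unknown id: nothing to index
        result = compute(data)            # compute some index for the object
        INDEX[object_id] = result         # store the result
        indexed.append(object_id)
    return indexed
```

Passing the index function as a parameter keeps the loop identical for every indexer; only `compute` changes from one kind of index to another.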

Current content indexers:

- mimetype (queue swh_indexer_content_mimetype): detects the encoding and mimetype
- fossology-license (queue swh_indexer_fossology_license): computes the license
- metadata: translates files from ecosystem-specific formats to JSON-LD (using the schema.org/CodeMeta vocabulary)
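
To give an idea of what translating an ecosystem-specific file to JSON-LD means, here is a minimal sketch for an npm `package.json`. It is an assumption-laden illustration, not the mapping swh.indexer uses: `NPM_TO_CODEMETA` and `translate_package_json` are hypothetical names, and the mapping covers only a few keys whose names are the same in `package.json` and in the CodeMeta vocabulary.

```python
import json
from typing import Any, Dict

# JSON-LD context of CodeMeta 2.0.
CODEMETA_CONTEXT = "https://doi.org/10.5063/schema/codemeta-2.0"

# Partial, illustrative mapping from package.json keys to CodeMeta terms
# (an assumption for this sketch; a real crosswalk covers many more fields).
NPM_TO_CODEMETA = {
    "name": "name",
    "version": "version",
    "description": "description",
    "keywords": "keywords",
}


def translate_package_json(raw: bytes) -> Dict[str, Any]:
    """Translate the bytes of a package.json file to a CodeMeta-style dict."""
    package = json.loads(raw)
    document: Dict[str, Any] = {
        "@context": CODEMETA_CONTEXT,
        "@type": "SoftwareSourceCode",
    }
    for source_key, codemeta_term in NPM_TO_CODEMETA.items():
        if source_key in package:
            # Copy mapped fields; unmapped keys are dropped.
            document[codemeta_term] = package[source_key]
    return document
```

Calling `json.dumps(translate_package_json(raw), indent=2)` on the content of a `package.json` would produce a JSON-LD document that carries only the mapped fields.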

Current origin indexers:

- metadata: translates files from ecosystem-specific formats to JSON-LD (using the schema.org/CodeMeta and ForgeFed vocabularies)

Release history

3.6.0

Nov 5, 2024

3.5.0

Sep 25, 2024

3.4.0

Aug 13, 2024

3.3.0

Jun 14, 2024

3.2.0

Feb 2, 2024

3.1.0

Jan 17, 2024

3.0.0

Jan 11, 2024

2.12.0

Dec 4, 2023

2.11.1

Nov 22, 2023

2.11.0

Nov 21, 2023

2.10.0

Jul 3, 2023

2.9.4

Apr 17, 2023

2.9.3

Feb 13, 2023

2.9.1

Nov 30, 2022

2.9.0

Nov 29, 2022

2.8.0

Nov 23, 2022

2.7.3

Nov 2, 2022

2.7.2

Oct 27, 2022

2.7.1

Oct 7, 2022

2.6.0

Sep 12, 2022

2.5.0

Aug 31, 2022

2.4.4

Aug 31, 2022

2.4.3

Aug 30, 2022

2.4.2

Aug 25, 2022

2.4.1

Aug 25, 2022

2.4.0

Aug 25, 2022

2.3.0

Aug 10, 2022

2.2.2

Jul 29, 2022

2.2.1

Jul 29, 2022

2.2.0

Jul 25, 2022

2.1.0

Jul 21, 2022

2.0.2

Jun 22, 2022

2.0.1

Jun 10, 2022

2.0.0

Jun 3, 2022

1.9.4

Apr 17, 2023

1.2.0

Jun 1, 2022

1.1.0

May 30, 2022

1.0.0

Feb 24, 2022

0.8.2

Jan 12, 2022

0.8.1

Dec 21, 2021

0.8.0

May 28, 2021

0.7.0

Feb 3, 2021

0.6.4

Feb 1, 2021

0.6.3

Nov 27, 2020

0.6.2

Nov 27, 2020

0.6.1

Nov 27, 2020

0.6.0

Nov 26, 2020

0.5.0

Nov 6, 2020

0.4.2

Oct 30, 2020

0.4.1

Oct 16, 2020

0.4.0

Oct 15, 2020

0.3.0

Oct 8, 2020

0.2.4

Sep 25, 2020

0.2.3

Sep 11, 2020

0.2.2

Sep 4, 2020

0.2.1

Aug 20, 2020

0.2.0

Aug 6, 2020

0.1.1

Jul 28, 2020

0.1.0

Jun 23, 2020

0.0.171

Apr 23, 2020

0.0.170

Mar 8, 2020

0.0.169

Mar 6, 2020

0.0.168

Mar 5, 2020

0.0.167

Mar 4, 2020

0.0.166

Mar 4, 2020

0.0.165

Mar 4, 2020

0.0.164

Mar 4, 2020

0.0.163

Mar 4, 2020

0.0.162

Feb 27, 2020

0.0.161

Feb 25, 2020

0.0.160

Feb 5, 2020

0.0.159

Feb 5, 2020

0.0.158

Nov 20, 2019

0.0.157

Nov 8, 2019

0.0.156

Nov 5, 2019

0.0.155

Oct 15, 2019

0.0.154

Oct 7, 2019

0.0.153

Sep 11, 2019

0.0.152

Jul 19, 2019

0.0.151

Jul 3, 2019

0.0.150

Jul 3, 2019

0.0.149

Jul 2, 2019

0.0.148

Jul 1, 2019

0.0.147

May 23, 2019

0.0.146

Apr 11, 2019

0.0.145

Mar 15, 2019

0.0.144

Mar 14, 2019

0.0.143

Mar 13, 2019

0.0.142

Mar 1, 2019

0.0.141

Mar 1, 2019

0.0.140

Feb 25, 2019

0.0.139

Feb 22, 2019

0.0.138

Feb 22, 2019

0.0.137

Feb 22, 2019

0.0.136

Feb 14, 2019

0.0.135

Feb 14, 2019

0.0.134

Feb 14, 2019

0.0.133

Feb 12, 2019

0.0.132

Jan 30, 2019

0.0.131

Jan 30, 2019

0.0.129

Jan 29, 2019

0.0.128

Jan 29, 2019

0.0.127

Jan 15, 2019

0.0.126

Jan 14, 2019

0.0.125

Jan 11, 2019

0.0.124

Jan 8, 2019

0.0.123

Jan 7, 2019

0.0.121

Dec 18, 2018

0.0.120

Dec 14, 2018

0.0.118

Nov 30, 2018

0.0.68

Dec 17, 2018

0.0.67

Nov 30, 2018

0.0.65

Nov 26, 2018

0.0.60

Nov 21, 2018

0.0.59

Nov 20, 2018

0.0.57

Nov 20, 2018

0.0.56

Nov 20, 2018

0.0.55

Oct 30, 2018

0.0.54.post3

Oct 26, 2018

0.0.52

Oct 18, 2018

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

swh_indexer-3.6.0.tar.gz (153.2 kB)

Uploaded Nov 5, 2024 Source

Built Distribution

swh.indexer-3.6.0-py3-none-any.whl (195.8 kB)

Uploaded Nov 5, 2024 Python 3

Hashes for swh_indexer-3.6.0.tar.gz
Algorithm	Hash digest
SHA256	`1afb2a5af58b432de5583e6d5f75aba70082116d5be4d472c6a759a16b08e749`
MD5	`5c7aafa3d3003105bafb6629b656b6a6`
BLAKE2b-256	`f0d1a147d162d08bbc8111cfa12f779c0a0f9a675c780d4fdd479e6a187d0a40`

Hashes for swh.indexer-3.6.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`38dbe5203f3362a58ef79d00e229cc0ddc42d562ba2f87fc333c115ec76f2335`
MD5	`74e42d6ac5f5cf9a7875b1a57c427a25`
BLAKE2b-256	`66777c9f62bee7745e11e0bbc2036256b18f9c86b36a38cffcca1a0caf617bf2`
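
The digests above can be used to check a downloaded file. The following is a minimal sketch using only the Python standard library; `sha256_of` and `matches_published_digest` are hypothetical helper names, and the local file path is whatever you saved the download as.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches_published_digest("swh_indexer-3.6.0.tar.gz", "1afb2a5af58b432de5583e6d5f75aba70082116d5be4d472c6a759a16b08e749")` should return `True` for an intact copy of the source distribution listed above.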