Python library to work with ConceptNet offline without the need of PostgreSQL

conceptnet-lite

Conceptnet-lite is a Python library for working with ConceptNet offline without the need for PostgreSQL.

The basic usage is as follows.

Loading the database object

ConceptNet releases happen once a year. You can build your own database from an assertions file, but if a pre-built file is available, downloading it is faster. Here is the compressed database file for the ConceptNet 5.7 release.

import conceptnet_lite

conceptnet_lite.connect('/path/to/conceptnet.db')

Building the database for a new release

The assertion files for ConceptNet are provided here.

(building instructions TBA)

Accessing concepts

Concept objects are created by looking up every entry that matches the input string exactly. If no entry is found, the peewee.DoesNotExist exception is raised.

from conceptnet_lite import Label

cat_concepts = Label.get(text='cat').concepts
for c in cat_concepts:
    print("    Concept URI:", c.uri)
    print("    Concept text:", c.text)
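
The lookup above raises peewee.DoesNotExist when no label matches. A minimal sketch of one way to turn that into an empty result is given below. The helper concepts_or_empty and the stand-in objects are hypothetical and not part of conceptnet-lite; the helper takes the lookup and the exception class as parameters so that the example runs without a database.

```python
from types import SimpleNamespace


def concepts_or_empty(get_label, not_found):
    """Return the concepts of a label, or [] if the lookup raises `not_found`.

    Hypothetical helper: `get_label` is a zero-argument callable that performs
    the lookup, `not_found` is the exception class that signals a miss.
    """
    try:
        return list(get_label().concepts)
    except not_found:
        return []


# Stand-ins so the sketch runs without a database (assumptions, not library code).
class FakeDoesNotExist(Exception):
    pass


def fake_hit():
    return SimpleNamespace(concepts=["/c/en/cat", "/c/en/cat/n"])


def fake_miss():
    raise FakeDoesNotExist("no such label")


print(concepts_or_empty(fake_hit, FakeDoesNotExist))
print(concepts_or_empty(fake_miss, FakeDoesNotExist))
```

With the library itself, the call would presumably look like concepts_or_empty(lambda: Label.get(text='cat'), peewee.DoesNotExist).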

concept.uri provides access to ConceptNet URIs, as described here. You can also retrieve only the text of the entry with concept.text.
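
To illustrate what a concept URI carries, the sketch below splits a URI of the form /c/<language>/<term>[/<sense>...] into its parts using only the standard library. The names ConceptURI and parse_concept_uri are hypothetical and not part of conceptnet-lite; the sketch assumes the usual ConceptNet URI layout.

```python
from typing import NamedTuple, Optional


class ConceptURI(NamedTuple):
    language: str
    term: str
    sense: Optional[str]  # e.g. "n" for noun; None when the URI has no sense part


def parse_concept_uri(uri: str) -> ConceptURI:
    """Split a concept URI such as '/c/en/cat/n' into language, term and sense.

    Hypothetical helper for illustration; not part of conceptnet-lite.
    """
    parts = uri.strip("/").split("/")
    if len(parts) < 3 or parts[0] != "c":
        raise ValueError(f"not a concept URI: {uri!r}")
    # Everything after the term is kept together as the sense part.
    sense = "/".join(parts[3:]) or None
    return ConceptURI(language=parts[1], term=parts[2], sense=sense)


print(parse_concept_uri("/c/en/cat/n"))
print(parse_concept_uri("/c/en/cat"))
```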

Working with languages

You can restrict the search to a particular language. Label.get() takes an optional language argument, which is expected to be an instance of Language; you obtain one by calling Language.get() with the name argument. The available languages and their codes are listed here.

from conceptnet_lite import Label, Language

english = Language.get(name='en')
cat_concepts = Label.get(text='cat', language=english).concepts
for c in cat_concepts:
    print("    Concept URI:", c.uri)
    print("    Concept text:", c.text)
    print("    Concept language:", c.language.name)

Querying edges between concepts

To retrieve the set of relations between two concepts, first create the concept objects (optionally specifying the language, as described above). The edges_between() function then retrieves all edges between the specified concepts. For each edge you can access its URI and a number of other attributes, as shown below.

Some ConceptNet relations are symmetrical: for example, the antonymy between white and black works both ways. Other relations are asymmetrical: for example, the relation between cat and mammal is either hyponymy or hypernymy, depending on the direction. The two_way argument lets you choose whether edges are matched in both directions or only from the first set of concepts to the second.

from conceptnet_lite import Label, Language, edges_between

english = Language.get(name='en')
introvert_concepts = Label.get(text='introvert', language=english).concepts
extrovert_concepts = Label.get(text='extrovert', language=english).concepts
for e in edges_between(introvert_concepts, extrovert_concepts, two_way=False):
    print("  Edge URI:", e.uri)
    print(e.relation.name, e.start.text, e.end.text, e.etc)
  • e.relation.name: the name of the ConceptNet relation. Full list here.

  • e.start.text, e.end.text: the source and target concepts of the edge.

  • e.etc: the ConceptNet metadata dictionary, which contains the source dataset, sources, weight, and license. For example, the introvert:extrovert edge for English contains the following metadata:

{
	"dataset": "/d/wiktionary/en",
	"license": "cc:by-sa/4.0",
	"sources": [{
		"contributor": "/s/resource/wiktionary/en",
		"process": "/s/process/wikiparsec/2"
	}, {
		"contributor": "/s/resource/wiktionary/fr",
		"process": "/s/process/wikiparsec/2"
	}],
	"weight": 2.0
}
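
As a small illustration of working with this metadata, the sketch below pulls the dataset, the weight, and the distinct contributors out of such a dictionary. The helper summarize_metadata is hypothetical, and the sketch assumes that e.etc is available as a plain Python dict with the keys shown above; here the dictionary is loaded from a JSON string so that the example runs on its own.

```python
import json

# The metadata shown above, as a JSON string for the sake of a runnable example.
ETC_JSON = """
{
    "dataset": "/d/wiktionary/en",
    "license": "cc:by-sa/4.0",
    "sources": [
        {"contributor": "/s/resource/wiktionary/en", "process": "/s/process/wikiparsec/2"},
        {"contributor": "/s/resource/wiktionary/fr", "process": "/s/process/wikiparsec/2"}
    ],
    "weight": 2.0
}
"""


def summarize_metadata(etc):
    """Return the dataset, weight and sorted distinct contributors of an edge.

    Hypothetical helper; assumes `etc` is a dict shaped like the example above.
    """
    contributors = sorted(
        {s["contributor"] for s in etc.get("sources", []) if "contributor" in s}
    )
    return {
        "dataset": etc.get("dataset"),
        "weight": etc.get("weight"),
        "contributors": contributors,
    }


summary = summarize_metadata(json.loads(ETC_JSON))
print(summary)
```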

Accessing all relations for a given concept

You can also retrieve all relations between a given concept and all other concepts, with the same options as above:

from conceptnet_lite import Label, Language, edges_for

english = Language.get(name='en')
for e in edges_for(Label.get(text='introvert', language=english).concepts, same_language=True):
    print("  Edge URI:", e.uri)
    print(e.relation.name, e.start.text, e.end.text, e.etc)

Note that we used the optional argument same_language=True. With this argument, edges_for returns only relations whose two ends are in the same language. If it is omitted, you may get edges to concepts in languages other than that of the source concepts.
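
To make the effect of the same-language restriction concrete, the sketch below applies a comparable filter by hand to a list of edge-like objects. It is not the library's implementation: same_language_edges is a hypothetical helper, and the stand-in objects only mimic the start, end, and language.name attributes used in the examples above.

```python
from types import SimpleNamespace


def same_language_edges(edges):
    """Keep only edges whose start and end concepts share a language name."""
    return [e for e in edges if e.start.language.name == e.end.language.name]


# Stand-ins that mimic the attributes used in this README (not library objects).
def make_concept(text, lang):
    return SimpleNamespace(text=text, language=SimpleNamespace(name=lang))


def make_edge(start, end):
    return SimpleNamespace(start=start, end=end)


edges = [
    make_edge(make_concept("introvert", "en"), make_concept("extrovert", "en")),
    make_edge(make_concept("introvert", "en"), make_concept("introverti", "fr")),
]

for e in same_language_edges(edges):
    print(e.start.text, "->", e.end.text)
```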

Accessing concept edges with a given relation direction

You can also query the relations that have a specific concept as target or source. This is achieved with concept.edges_out and concept.edges_in, as follows:

from conceptnet_lite import Language, Label

english = Language.get(name='en')
introvert_concepts = Label.get(text='introvert', language=english).concepts
for c in introvert_concepts:
    print("    Concept text:", c.text)
    if c.edges_out:
        print("      Edges out:")
        for e in c.edges_out:
            print("        Edge URI:", e.uri)
            print("        Relation:", e.relation.name)
            print("        End:", e.end.text)
    if c.edges_in:
        print("      Edges in:")
        for e in c.edges_in:
            print("        Edge URI:", e.uri)
            print("        Relation:", e.relation.name)
            print("        Start:", e.start.text)
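
Outgoing edges are often easier to read when grouped by relation. The sketch below shows one way to do that with the standard library. The helper group_targets_by_relation is hypothetical, and the stand-in edges and their relation names are made up for illustration; only the relation.name and end.text attributes come from the examples above.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_targets_by_relation(edges_out):
    """Map each relation name to the texts of the concepts its edges point to."""
    grouped = defaultdict(list)
    for e in edges_out:
        grouped[e.relation.name].append(e.end.text)
    return dict(grouped)


# Stand-in edges with illustrative relation names (not real ConceptNet data).
def make_edge(relation, end_text):
    return SimpleNamespace(
        relation=SimpleNamespace(name=relation),
        end=SimpleNamespace(text=end_text),
    )


sample_edges = [
    make_edge("antonym", "extrovert"),
    make_edge("related_to", "shy"),
    make_edge("related_to", "quiet"),
]

grouped = group_targets_by_relation(sample_edges)
for relation, targets in grouped.items():
    print(relation, targets)
```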

Traversing all the data for a language

You can iterate over all concepts for a given language. For illustration, let us try Avestan, a "small" language with the code "ae" and a vocabulary size of 371, according to the ConceptNet language statistics.

from conceptnet_lite import Language

mylanguage = Language.get(name='ae')
for label in mylanguage.labels:
    print("  Label:", label.text)
    for c in label.concepts:
        print("    Concept URI:", c.uri)
        if c.edges_out:
            print("      Edges out:")
            for e in c.edges_out:
                print("        Edge URI:", e.uri)
        if c.edges_in:
            print("      Edges in:")
            for e in c.edges_in:
                print("        Edge URI:", e.uri)
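
For larger languages, printing every edge is impractical, so a summary may be more useful. The sketch below counts labels, concepts, and edges using the same traversal as above. The helper language_stats is hypothetical, and the stand-in language object only mimics the labels, concepts, edges_out, and edges_in attributes used in this section.

```python
from types import SimpleNamespace


def language_stats(language):
    """Count labels, concepts and edges reachable from a language object."""
    stats = {"labels": 0, "concepts": 0, "edges_out": 0, "edges_in": 0}
    for label in language.labels:
        stats["labels"] += 1
        for concept in label.concepts:
            stats["concepts"] += 1
            # Count by iterating, so any iterable of edges works.
            stats["edges_out"] += sum(1 for _ in concept.edges_out)
            stats["edges_in"] += sum(1 for _ in concept.edges_in)
    return stats


# Stand-in objects so the sketch runs without a database (not library objects).
fake_language = SimpleNamespace(
    labels=[
        SimpleNamespace(concepts=[
            SimpleNamespace(edges_out=["e1", "e2"], edges_in=["e3"]),
        ]),
        SimpleNamespace(concepts=[
            SimpleNamespace(edges_out=[], edges_in=["e4"]),
            SimpleNamespace(edges_out=["e5"], edges_in=[]),
        ]),
    ]
)

print(language_stats(fake_language))
```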

Todo:

  • add database file link
  • describe how to build the database
  • add sample outputs
