A fast interval tree-like implementation in C, wrapped for the Python ecosystem.

Nested containment list

Deprecation notice

I will continue to maintain this library, but I suggest switching to ruranges, a more lightweight and faster library that offers many more operations than NCLS.

NCLS

The Nested Containment List is a data structure for interval overlap queries, like the interval tree. It is usually an order of magnitude faster than the interval tree, both for building and for querying.
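To make the idea concrete, here is a minimal pure-Python sketch of a nested containment list. It is not the C implementation shipped in this library; the names `Node`, `build`, and `find_overlap` are illustrative, and it assumes half-open intervals `[start, end)` with numeric coordinates. Intervals that are fully contained in another interval go into that interval's sublist, so within any one list both starts and ends are sorted and a binary search finds the first candidate.

```python
from bisect import bisect_right


class Node:
    """One interval plus the intervals it fully contains."""

    def __init__(self, start, end, idx):
        self.start, self.end, self.idx = start, end, idx
        self.children = []    # nested sublist, sorted by start
        self.child_ends = []  # ends of the children, kept for binary search


def build(intervals):
    """Build a nested list from (start, end, id) tuples."""
    top, top_ends, stack = [], [], []
    # Sort by start; on ties the longer interval comes first so it becomes the parent.
    for start, end, idx in sorted(intervals, key=lambda t: (t[0], -t[1])):
        node = Node(start, end, idx)
        # Drop open intervals that end before this one: they cannot contain it.
        while stack and stack[-1].end < end:
            stack.pop()
        if stack:
            siblings, ends = stack[-1].children, stack[-1].child_ends
        else:
            siblings, ends = top, top_ends
        siblings.append(node)
        ends.append(end)
        stack.append(node)
    return top, top_ends


def find_overlap(nodes, ends, q_start, q_end, out=None):
    """Collect (start, end, id) for intervals overlapping [q_start, q_end)."""
    out = [] if out is None else out
    # Within one list no interval contains another, so ends are sorted too.
    i = bisect_right(ends, q_start)  # first interval with end > q_start
    while i < len(nodes) and nodes[i].start < q_end:
        node = nodes[i]
        out.append((node.start, node.end, node.idx))
        # Children can only overlap the query if their parent does.
        find_overlap(node.children, node.child_ends, q_start, q_end, out)
        i += 1
    return out


top, top_ends = build([(0, 10, 0), (2, 5, 1), (3, 4, 2), (6, 12, 3)])
hits = find_overlap(top, top_ends, 4, 6)
print(hits)  # [(0, 10, 0), (2, 5, 1)]
```

Interval 2 is nested inside interval 1, which is nested inside interval 0; the query skips whole sublists whose parent does not overlap, which is where the speed comes from.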

The implementation here is a revived version of the one used in the now-defunct PyGr library, which died of bitrot. I have made it less memory-consuming and added wrapper functions that allow batch-querying the NCLS for further speed gains.

It was implemented to be the cornerstone of the PyRanges project, but I have made it available to the Python community as a stand-alone library. Enjoy.

Original paper: https://academic.oup.com/bioinformatics/article/23/11/1386/199545

Cite: http://dx.doi.org/10.1093/bioinformatics/btz615

Cite

If you use this library in published research, please cite:

http://dx.doi.org/10.1093/bioinformatics/btz615

Install

pip install ncls

Usage

from ncls import NCLS

import pandas as pd

starts = pd.Series(range(0, 5))
ends = starts + 100
ids = starts

subject_df = pd.DataFrame({"Start": starts, "End": ends}, index=ids)

print(subject_df)
#    Start  End
# 0      0  100
# 1      1  101
# 2      2  102
# 3      3  103
# 4      4  104

ncls = NCLS(starts.values, ends.values, ids.values)

# python API, slower
it = ncls.find_overlap(0, 2)
for i in it:
    print(i)
# (0, 100, 0)
# (1, 101, 1)

starts_query = pd.Series([1, 3])
ends_query = pd.Series([52, 14])
indexes_query = pd.Series([10000, 100])

query_df = pd.DataFrame({"Start": starts_query.values, "End": ends_query.values}, index=indexes_query.values)

query_df
#        Start  End
# 10000      1   52
# 100        3   14


# everything done in C/Cython; faster
l_idxs, r_idxs = ncls.all_overlaps_both(starts_query.values, ends_query.values, indexes_query.values)
l_idxs, r_idxs
# (array([10000, 10000, 10000, 10000, 10000,   100,   100,   100,   100,
#          100]), array([0, 1, 2, 3, 4, 0, 1, 2, 3, 4]))

print(query_df.loc[l_idxs])
#        Start  End
# 10000      1   52
# 10000      1   52
# 10000      1   52
# 10000      1   52
# 10000      1   52
# 100        3   14
# 100        3   14
# 100        3   14
# 100        3   14
# 100        3   14
print(subject_df.loc[r_idxs])
#    Start  End
# 0      0  100
# 1      1  101
# 2      2  102
# 3      3  103
# 4      4  104
# 0      0  100
# 1      1  101
# 2      2  102
# 3      3  103
# 4      4  104

# return intervals in python (slow/mem-consuming)
intervals = ncls.intervals()
intervals
# [(0, 100, 0), (1, 101, 1), (2, 102, 2), (3, 103, 3), (4, 104, 4)]
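When checking batch results on small inputs, a brute-force comparison can be handy. The sketch below uses only NumPy and does not call this library; `all_overlaps_bruteforce` is a hypothetical helper, and it assumes half-open intervals, which is what the `find_overlap(0, 2)` output above suggests. It returns paired query and subject ids in the same two-array layout as `all_overlaps_both`.

```python
import numpy as np


def all_overlaps_bruteforce(starts, ends, ids, q_starts, q_ends, q_ids):
    """Return (query_ids, subject_ids) for every overlapping pair, O(n * m)."""
    # overlap[i, j] is True when query i and subject j share at least one position.
    overlap = (q_starts[:, None] < ends[None, :]) & (q_ends[:, None] > starts[None, :])
    q_pos, s_pos = np.nonzero(overlap)  # row-major order: grouped by query
    return q_ids[q_pos], ids[s_pos]


# Same subject and query intervals as in the usage example.
starts = np.arange(5)
ends = starts + 100
ids = starts.copy()

l_check, r_check = all_overlaps_bruteforce(
    starts, ends, ids,
    np.array([1, 3]), np.array([52, 14]), np.array([10000, 100]),
)
print(l_check)  # [10000 10000 10000 10000 10000   100   100   100   100   100]
print(r_check)  # [0 1 2 3 4 0 1 2 3 4]
```

The order of pairs returned by the library is not guaranteed to match this one in general, so sorting both sets of pairs before comparing is the safer check. The quadratic memory use makes this suitable for tests only.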

There is also an experimental floating-point version of the NCLS called FNCLS. See the examples folder.

Benchmark

Test file of 100 million intervals (created by subsetting gencode gtf with replacement):

Library    Function  Time (s)  Memory (GB)
bx-python  build     161.7     2.5
ncls       build     3.15      0.5
bx-python  overlap   148.4     4.3
ncls       overlap   7.2       0.5

Compared with bx-python, building is about 50 times faster and overlap queries are about 20 times faster. Memory usage is about one fifth for building and about one ninth for overlap queries.
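The ratios follow directly from the table. As a quick check, this short snippet recomputes them; the numbers are copied from the benchmark above and the variable names are only illustrative.

```python
# (time in seconds, memory in GB), copied from the benchmark table
bench = {
    "build": {"bx-python": (161.7, 2.5), "ncls": (3.15, 0.5)},
    "overlap": {"bx-python": (148.4, 4.3), "ncls": (7.2, 0.5)},
}

ratios = {}
for task, rows in bench.items():
    (bx_time, bx_mem), (ncls_time, ncls_mem) = rows["bx-python"], rows["ncls"]
    ratios[task] = (bx_time / ncls_time, bx_mem / ncls_mem)
    print(f"{task}: {ratios[task][0]:.1f}x faster, {ratios[task][1]:.1f}x less memory")

# build: 51.3x faster, 5.0x less memory
# overlap: 20.6x faster, 8.6x less memory
```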

Original paper

Alexander V. Alekseyenko, Christopher J. Lee; Nested Containment List (NCList): a new algorithm for accelerating interval query of genome alignment and interval databases, Bioinformatics, Volume 23, Issue 11, 1 June 2007, Pages 1386–1393, https://doi.org/10.1093/bioinformatics/btl647

