My python library of classes and functions that help me work
`|Build Status| <https://travis-ci.org/childsish/lhc-python>`_
lhc-python
==========
This is my personal library of Python classes and functions, many of
which have bioinformatics applications. The library changes constantly
and on a whim, so approach it with caution if you want to use it. Over
time, however, parts appear to be settling into a stable configuration.
lhc.binf
--------
**lhc.binf.alignment**
A pure Python implementation of the Smith-Waterman local alignment
algorithm.
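As an illustration of the algorithm itself (not the code in ``lhc.binf.alignment``), a minimal Smith-Waterman sketch might look like the following. The function name ``smith_waterman`` and the scoring parameters ``match``, ``mismatch`` and ``gap`` are assumptions chosen for this example, and a simple linear gap penalty is used.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return (best_score, aligned_a, aligned_b) for one optimal local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] is the best local alignment score ending at a[i-1], b[j-1].
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Scores are floored at zero, which is what makes the alignment local.
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero score is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append('-')
            i -= 1
        else:
            out_a.append('-')
            out_b.append(b[j - 1])
            j -= 1
    return best, ''.join(reversed(out_a)), ''.join(reversed(out_b))
```

Calling ``smith_waterman('TTACGTT', 'GGACGGG')`` aligns only the shared ``ACG`` core, because the mismatching flanks would lower the score.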
**lhc.binf.digen**
A C++ and a pure Python implementation of a sequence generation algorithm.
The generated sequence will have a specified dinucleotide frequency.
**lhc.binf.genomic\_coordinate**
An implementation of intervals and points for genomic coordinates.
Useful for representing gene models.
**lhc.binf.genetic\_code**
A class to read genetic codes and translate DNA sequences into protein
sequences.
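To show what translation involves, here is a small standalone sketch that does not use the ``lhc.binf.genetic_code`` class. It builds the standard genetic code (NCBI translation table 1) from its compact string form. The names ``CODON_TABLE`` and ``translate`` are assumptions for this example, and unknown codons are reported as ``X`` by choice.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = 'TCAG'
AMINO_ACIDS = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG'
CODON_TABLE = {
    ''.join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string codon by codon, starting at the first base."""
    dna = dna.upper()
    protein = []
    # Step through complete codons only; a trailing partial codon is ignored.
    for i in range(0, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], 'X'))
    return ''.join(protein)
```

Stop codons are emitted as ``*`` rather than ending translation, which keeps the sketch short; a real reader of genetic code files would likely make this configurable.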
**lhc.binf.iupac**
A class to convert amino acid names between the one and three letter codes
and the full name.
**lhc.binf.kmer**
A class that calculates k-mers for a given sequence. The class behaves
like a dict, but calculates new k-mers on the fly.
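The dict-like, compute-on-demand behaviour can be sketched with ``dict.__missing__``. This is only an illustration of the idea; the class name ``LazyKmerCounter`` is hypothetical and the real class in ``lhc.binf.kmer`` may be organised differently.

```python
class LazyKmerCounter(dict):
    """Dict of k-mer counts that fills itself in when a k-mer is first requested."""

    def __init__(self, sequence):
        super().__init__()
        self.sequence = sequence

    def __missing__(self, kmer):
        # Called by dict only when the key is absent: count overlapping
        # occurrences of the k-mer, cache the result, and return it.
        k = len(kmer)
        count = sum(
            1
            for i in range(len(self.sequence) - k + 1)
            if self.sequence[i:i + k] == kmer
        )
        self[kmer] = count
        return count
```

For example, ``LazyKmerCounter('ATATAT')['AT']`` scans the sequence once and stores the count, so later lookups of ``'AT'`` are plain dict accesses.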
**lhc.binf.skew**
A class that calculates skews for a given sequence. The class behaves
like a dict, but calculates new skews on the fly.
lhc.collections
---------------
Several collections, mostly for holding intervals. If only intervals need
to be held, use the IntervalTree; otherwise the MultiDimensionMap may be
more appropriate.
lhc.filetools
-------------
Classes for working with files
lhc.graph
---------
A pure Python implementation of graphs
lhc.indices
-----------
Intended to be my own code for indexing files, but it is still very unstable
and immature.
lhc.interval
------------
A class for intervals and interval operations
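The kind of operations meant here can be sketched as follows. This is a hypothetical stand-in, not the ``lhc.interval`` class: the name ``Interval``, its ``start``/``stop`` fields, the method names, and the half-open ``[start, stop)`` convention are all assumptions made for the example.

```python
from collections import namedtuple


class Interval(namedtuple('Interval', ['start', 'stop'])):
    """Half-open interval [start, stop)."""

    __slots__ = ()

    def overlaps(self, other):
        # Half-open intervals that merely touch do not overlap.
        return self.start < other.stop and other.start < self.stop

    def intersect(self, other):
        """Return the shared part of both intervals, or None if there is none."""
        if not self.overlaps(other):
            return None
        return Interval(max(self.start, other.start), min(self.stop, other.stop))

    def union(self, other):
        """Return one merged interval, or None if a gap separates the two."""
        # Touching intervals (one stops where the other starts) are merged too.
        if self.start > other.stop or other.start > self.stop:
            return None
        return Interval(min(self.start, other.start), max(self.stop, other.stop))
```

Basing the class on a named tuple keeps intervals immutable, hashable and comparable, which is convenient when they are stored in sets or sorted.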
lhc.io
------
Classes for parsing and working with several file formats
lhc.itertools
-------------
Classes for working with iterators
lhc.tools
---------
Various classes, mostly unused and out-of-date
lhc.random
----------
**lhc.random.reservoir**
An implementation of the reservoir sampling algorithm. Can also be run
from the command line to sample lines from files. To sample 50 lines
from a file called input\_file.txt, run:

::

    python -m lhc.random.reservoir input_file.txt 50

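For readers unfamiliar with reservoir sampling, the classic single-pass variant (often called Algorithm R) can be sketched as below. The function name ``reservoir_sample`` and its ``rng`` parameter are assumptions for this example and are not taken from ``lhc.random.reservoir``.

```python
import random


def reservoir_sample(iterable, k, rng=random):
    """Return up to k items chosen uniformly at random in a single pass."""
    sample = []
    for i, item in enumerate(iterable):
        if i < k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # Item i replaces a reservoir slot with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive at both ends
            if j < k:
                sample[j] = item
    return sample
```

Because only ``k`` items are held at any time, the same approach works on inputs of unknown length, such as lines streamed from a large file.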
lhc.stats
---------
Really old code. The NIPALS and PCA algorithms are probably the most useful parts.
lhc.test
--------
Unit tests! These should be mostly up-to-date now.
lhc.tools
---------
**lhc.tools.sorter**
A sorter for very large iterators. The iterator will be split into
chunks which are then sorted individually and then merged into a single
file.
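The split, sort and merge approach described above can be sketched with the standard library. This is an assumption-laden illustration rather than the code in ``lhc.tools.sorter``: the name ``sort_large``, the ``chunk_size`` parameter and the use of ``pickle`` with temporary files are choices made for the example, and the merged result is yielded as an iterator instead of being written to a single file.

```python
import heapq
import itertools
import pickle
import tempfile


def _read_chunk(fileobj):
    """Yield pickled items from a chunk file until it is exhausted."""
    try:
        while True:
            yield pickle.load(fileobj)
    except EOFError:
        return


def sort_large(iterable, chunk_size=100000):
    """Yield the items of iterable in sorted order using on-disk chunks."""
    iterator = iter(iterable)
    chunk_files = []
    try:
        while True:
            # Sort one chunk in memory, then spill it to a temporary file.
            chunk = sorted(itertools.islice(iterator, chunk_size))
            if not chunk:
                break
            fileobj = tempfile.TemporaryFile()
            chunk_files.append(fileobj)
            for item in chunk:
                pickle.dump(item, fileobj)
            fileobj.seek(0)
        # heapq.merge lazily merges the already sorted chunk streams.
        yield from heapq.merge(*(_read_chunk(f) for f in chunk_files))
    finally:
        for fileobj in chunk_files:
            fileobj.close()
```

Every chunk keeps one file handle open during the merge, so a very small ``chunk_size`` on a very large input could run into the operating system's open file limit.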
**lhc.tools.tokeniser**
A basic tokeniser. Users define which characters belong to which classes
and the tokeniser will split strings into substrings where all
characters have the same type.

::

    >>> tokeniser = Tokeniser({'word': 'abcdefghijklmnopqrstuvwxyz',
    ...                        'number': '0123456789',
    ...                        'space': ' \t'})
    >>> tokens = tokeniser.tokenise('there were 1000 bottles on the wall')
    >>> tokeniser.next()
    Token(type='word', value='there')
    >>> tokeniser.next()
    Token(type='space', value=' ')
    >>> tokeniser.next()
    Token(type='word', value='were')
    >>> tokeniser.next()
    Token(type='space', value=' ')
    >>> tokeniser.next()
    Token(type='number', value='1000')

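The splitting rule itself (consecutive characters of the same class form one token) can be sketched in a few lines with ``itertools.groupby``. This is a standalone illustration, not the ``lhc.tools.tokeniser`` implementation: the ``tokenise`` function below and its handling of unclassified characters (they get the type ``None``) are assumptions.

```python
from collections import namedtuple
from itertools import groupby

Token = namedtuple('Token', ['type', 'value'])


def tokenise(string, classes):
    """Yield Tokens by grouping runs of characters that share a class.

    classes maps a class name to a string of the characters it contains.
    """
    # Invert the mapping so that each character points at its class name.
    char_type = {ch: name for name, chars in classes.items() for ch in chars}
    # Characters missing from every class are grouped under the type None.
    for type_, group in groupby(string, key=char_type.get):
        yield Token(type_, ''.join(group))
```

With the same class definitions as in the example above, ``list(tokenise('there were 1000', classes))`` yields word, space, word, space and number tokens in that order.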
.. |Build Status| image:: https://travis-ci.org/childsish/lhc-python.svg?branch=master