A Python interface to OSCAR data
Project description
Python interface for OSCAR data
This is a convenience library to access OSCAR dataset files.
Installation
easy_install --user --upgrade oscar
Git objects
This package provides interfaces to git objects: Commit, Tree, Blog and Tag.
All git objects have sha
property, which represents SHA1 object hash as a hex string,
and bin_sha
, its binary counterpart.
All objects can be instantiated using their hashes, either a 40-char hex string or 20-bytes binary.
Example.:
>>> c = Commit('1e971a073f40d74a1e72e07c682e1cba0bae159b')
>>> c.sha
'1e971a073f40d74a1e72e07c682e1cba0bae159b'
>>> c.bin_sha
'\x1e\x97\x1a\x07?@\xd7J\x1er\xe0|h.\x1c\xba\x0b\xae\x15\x9b'
Another shared property is data
, a binary representation of a git object.
It is not expected to be used directly, but who knows what your research will take.
Commit
Represents Git commit. It has following properties:
-
tree - a
Tree
object referring to the root tree of this commit (see below) -
parents - a tuple of parent
Commit
objects. Note that there might be any number of parents, from zero (initial commit) to many. At least three parent commits were spotted in the wild. -
message - the first line of the commit message. Most of commits have only one line message and also the first line is what you will see by default on GitHub. However, messages can be arbitrary long. Messages of squashed commits are just a concatenation of source commit messages.
-
full_message - full message, including the first line
-
author - name and email, e.g.
'John Doe <johndoe@yahoo.com>'
-
committer - similar to author
-
authored_at - unix timestamp with a timezone as a sting, e.g.
'1336361613 +1100'
-
committed_at - similar to authored_at
All commit properties are lazy, i.e. they will be instantiated on the first access. However, the properties above will be instantiated at once if you access any of them.
- projects - list of project urls this commit belongs to
- children - list of projects having this commit as a parent
Example:
>>> commit = Commit('1e971a073f40d74a1e72e07c682e1cba0bae159b')
>>> commit.message
'Initial commit'
>>> commit.children
(<Commit: 9bd02434b834979bb69d0b752a403228f2e385e8>,)
>>> commit.projects
['user2589_minicms']
Commits can be queried by author, by project and by file name. Example:
>>> Commit.by_author('user2589 <valiev.m@gmail.com>')
(<Commit: 016ae4e8f82a88c7e136be26ec2e56ca37e8f0c4>,
... a rather long list of commits omitted ..
<Commit: fe7caac022031851d76f41216a2b3f44d52586a4>)
Note that Commit.by_file
returns only commits adding/changing/removing the file.
>>> Commit.by_file('minicms/templatetags/minicms_tags.py')
(<Commit: ba3659e841cb145050f4a36edb760be41e639d68>,
... 5 commits omitted ..
<Commit: d11431c3ef74770ac570a82b2fd9b19a690a4adc>)
Tree
Trees represent folders. Every commit has a root tree:
>>> commit.tree
<Tree: d20520ef8c1537a42628b72d481b8174c0a1de84>
Trees are iterable. Every element is a 3-string tuple: mode, filename, blob/tree sha:
>>> tree = Tree('a3a0624d9de2f153e4614863cc6ed2f086942b51')
>>> list(tree)
[('100755', '.gitignore', '9825f4f761657f2a8cc1352f2a5cd50a442fb624'),
('100644', 'MANIFEST.in', '96bc275bee57ddbe38acbd46776d907bc10f279f'),
('100644', 'README.rst', '7e2fa0485a64f0890f5f6ca7f8971bbd92dd9a87'),
('40000', 'minicms', '68223fc8336bc3c56e18cbe463d3713bb0d414ce'),
('100644', 'setup.py', 'a7550c30e0cb443ec79af189fc738ccf56ef3ed4')]
Note that subfolders are also trees. You can recognize them by mode "40000"
To recursively iterate a tree, use traverse()
:
>>> list(tree.traverse())
[('100755', '.gitignore', '9825f4f761657f2a8cc1352f2a5cd50a442fb624'),
('100644', 'MANIFEST.in', '96bc275bee57ddbe38acbd46776d907bc10f279f'),
('100644', 'README.rst', '7e2fa0485a64f0890f5f6ca7f8971bbd92dd9a87'),
('40000', 'minicms', '68223fc8336bc3c56e18cbe463d3713bb0d414ce'),
... some output omitted ...
('100644', 'minicms/views.py', '1e397174b6a04fdc4831ce809fb17dde2bd7a295'),
('100644', 'setup.py', 'a7550c30e0cb443ec79af189fc738ccf56ef3ed4')]
full()
will return this list as a string - it's helpful for debugging
>>> print tree.full()
100755 .gitignore 9825f4f761657f2a8cc1352f2a5cd50a442fb624
100644 MANIFEST.in 96bc275bee57ddbe38acbd46776d907bc10f279f
100644 README.rst 7e2fa0485a64f0890f5f6ca7f8971bbd92dd9a87
40000 minicms 68223fc8336bc3c56e18cbe463d3713bb0d414ce
... a bunch of omitted files...
100644 minicms/views.py 1e397174b6a04fdc4831ce809fb17dde2bd7a295
100644 setup.py a7550c30e0cb443ec79af189fc738ccf56ef3ed4
Note that traverse()
includes subtrees. If you want files only, files
will return a dictionary of {filename: blob_sha}
>>> tree.files
{'.gitignore': '9825f4f761657f2a8cc1352f2a5cd50a442fb624',
'MANIFEST.in': '96bc275bee57ddbe38acbd46776d907bc10f279f',
'README.rst': '7e2fa0485a64f0890f5f6ca7f8971bbd92dd9a87',
... some output omitted ...
'minicms/utils.py': '10ce01a41e4abb4da59a634a22bd0bb51c332ee9',
'minicms/views.py': '1e397174b6a04fdc4831ce809fb17dde2bd7a295',
'setup.py': 'a7550c30e0cb443ec79af189fc738ccf56ef3ed4'}
If you just want blobs without file names, there is a shortcut:
>>> tree.blobs
(<Blob: e69de29bb2d1d6434b8b29ae775ad8c2e48c5391>,
<Blob: fed6a5206e25905978fc9f0ff61fee5cdada74f1>,
... some output omitted ...
<Blob: 1e397174b6a04fdc4831ce809fb17dde2bd7a295>)
Parent trees, i.e. trees including this one:
>>> Tree('bd0930554fd24ee1c5b47125c1a206c2ac30621b').parents
(<Tree: 68223fc8336bc3c56e18cbe463d3713bb0d414ce>,
<Tree: e7826353f91d9ff5511027443624b455d32c96ed>)
Note that some (root) trees don't have parents
>>> tree.parents
()
Blob
Blobs represent file content. Blob is not exactly a file, since several trees might refer the same blob under different file names.
String representation of a blob is a file content:
>>> blob = tree.blobs[-1]
>>> print blob
# encoding: utf-8
from django.conf import settings
from django import http
from django.shortcuts import render_to_response
from django.template import RequestContext
... some output omitted ...
It is possible to access commits changing a blob with blob.commits
,
but this relation is not reliable. Please use with care.
Although blobs have property parents
, which used to point at parent trees,
it is not maintained and will throw a DeprecationWarning
Tag
This is the most useless object so far. It doesn't provide any functionality except validation that this tag exists
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distributions
Built Distribution
Hashes for oscar-0.0.1-py2.py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 4432d709637468a8ab78957e8151a747c3b8ab4279205c87171ab5a09803db5f |
|
MD5 | 20e2585d06ff9f85bd8fac5501fea241 |
|
BLAKE2b-256 | 615f8b6c64a66d3fa27c1e4880356f7bc79b90297ea1605ddd5f7509db96d4e6 |