Convenient filesystem interface over Oracle Cloud's Object Storage
Oracle Cloud Infrastructure Object Storage fsspec Implementation
The Oracle Cloud Infrastructure Object Storage service is an internet-scale, high-performance storage platform that offers reliable and cost-efficient data durability. With Object Storage, you can safely and securely store or retrieve data directly from the internet or from within the cloud platform.
ocifs is part of the fsspec (intake/filesystem_spec) ecosystem: a template or specification for a file-system interface that specific implementations should follow, so that applications using them can rely on a common interface and need not worry about the internal implementation decisions of any given backend.
ocifs joins the list of file systems supported by this package. The intake/filesystem_spec project is used by Pandas, Dask, and other Python data libraries; this package adds Oracle OCI Object Storage capabilities to these libraries.
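To make the "common interface" idea concrete, here is a minimal, standard-library-only sketch. The names `FileSystemSpec`, `InMemoryFileSystem`, `put_bytes`, and `total_bytes` are hypothetical and invented for illustration; they are not part of fsspec or ocifs. The point is only that application code written against the shared interface does not need to know which backend it is talking to.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class FileSystemSpec(ABC):
    """Hypothetical stand-in for a shared file-system specification."""

    @abstractmethod
    def cat(self, path: str) -> bytes:
        """Return the full contents stored at ``path``."""

    @abstractmethod
    def ls(self, path: str) -> List[str]:
        """Return the paths stored under the directory-like ``path``."""


class InMemoryFileSystem(FileSystemSpec):
    """Toy backend that keeps objects in a dict (illustration only)."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put_bytes(self, path: str, data: bytes) -> None:
        self._objects[path] = data

    def cat(self, path: str) -> bytes:
        return self._objects[path]

    def ls(self, path: str) -> List[str]:
        prefix = path.rstrip("/") + "/"
        return sorted(p for p in self._objects if p.startswith(prefix))


def total_bytes(fs: FileSystemSpec, directory: str) -> int:
    """Application code: depends only on the interface, not the backend."""
    return sum(len(fs.cat(p)) for p in fs.ls(directory))


memory_fs = InMemoryFileSystem()
memory_fs.put_bytes("data/a.txt", b"hello")
memory_fs.put_bytes("data/b.txt", b"world!")
print(total_bytes(memory_fs, "data"))  # prints 11
```

Any other backend that implements the same two methods could be passed to `total_bytes` unchanged, which is the role ocifs plays for Object Storage within fsspec.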
Example of OCIFS file-system-style operations:
from ocifs import OCIFileSystem
fs = OCIFileSystem("~/.oci/config")
# 1.Create empty file or truncate in OCI objectstorage bucket
fs.touch("oci://<my_bucket>@<my_namespace>/<my_prefix>/hello.txt", truncate=True, data=b"Writing to Object Storage!")
# 2.Fetch (potentially multiple) paths' contents
fs.cat("oci://<my_bucket>@<my_namespace>/<my_prefix>/hello.txt")
# 3.Get metadata about a file from a head or list call
fs.info("oci://<my_bucket>@<my_namespace>/<my_prefix>/hello.txt")
# 4.Get directory listing page
fs.ls("oci://<my_bucket>@<my_namespace>/<my_prefix>/", detail=True)
# 5.Is this entry directory-like?
fs.isdir("oci://<my_bucket>@<my_namespace>")
# 6.Is this entry file-like?
fs.isfile("oci://<my_bucket>@<my_namespace>/<my_prefix>/hello.txt")
# 7.If there is a file at the given path (including broken links)
fs.lexists("oci://<my_bucket>@<my_namespace>/<my_prefix>/hello.txt")
# 8.List of files for the given path
fs.listdir("oci://<my_bucket>@<my_namespace>/<my_prefix>", detail=True)
# 9.Get the first ``size`` bytes from file
fs.head("oci://<my_bucket>@<my_namespace>/<my_prefix>/hello.txt", size=1024)
# 10.Get the last ``size`` bytes from file
fs.tail("oci://<my_bucket>@<my_namespace>/<my_prefix>/hello.txt", size=1024)
# 11.Hash of file properties, to tell if it has changed
fs.ukey("oci://<my_bucket>@<my_namespace>/<my_prefix>/hello.txt")
# 12.Size in bytes of file
fs.size("oci://<my_bucket>@<my_namespace>/<my_prefix>/hello.txt")
# 13.Size in bytes of each file in a list of paths
paths = ["oci://<my_bucket>@<my_namespace>/<my_prefix>/hello.txt"]
fs.sizes(paths)
# 14.Normalise OCI path string into bucket and key.
fs.split_path("oci://<my_bucket>@<my_namespace>/<my_prefix>/hello.txt")
# 15.Delete a file from the bucket
fs.rm("oci://<my_bucket>@<my_namespace>/<my_prefix>/hello.txt")
# 16.Get the contents of the file as bytes
fs.read_bytes("oci://<my_bucket>@<my_namespace>/<my_prefix>/hello.txt", start=0, end=13)
# 17.Get the contents of the file as a string
fs.read_text("oci://<my_bucket>@<my_namespace>/<my_prefix>/hello.txt", encoding=None, errors=None, newline=None)
# 18.Read a block of bytes from the file
fs.read_block("oci://<my_bucket>@<my_namespace>/<my_prefix>/hello.txt", 0, 13)
# 19.Open a file for writing/flushing into file in OCI objectstorage bucket
# Ocifs sets the best-guessed content-type for hello.txt i.e "text/plain"
with fs.open("oci://<my_bucket>@<my_namespace>/<my_prefix>/hello.txt", 'w', autocommit=True) as f:
    f.write("Writing data to buffer, before manually flushing and closing.") # data is flushed and file closed
    f.flush()
# Ocifs uses the specified content-type passed in the open while writing to OCI objectstorage bucket
with fs.open("oci://<my_bucket>@<my_namespace>/<my_prefix>/hello.txt", 'w', content_type='text/plain') as f:
    f.write("Writing data to buffer, before manually flushing and closing.") # data is flushed and file closed
    f.flush()
# 20.Open a file for reading a file from OCI objectstorage bucket
with fs.open("oci://<my_bucket>@<my_namespace>/<my_prefix>/hello.txt") as f:
    print(f.read())
# 21.Space used by files and optionally directories within a path
fs.du("oci://<my_bucket>@<my_namespace>/<my_prefix>/hello10.csv")
# 22.Find files by glob-matching.
fs.glob("oci://<my_bucket>@<my_namespace>/<my_prefix>/*.txt")
# 23.Renames an object in a particular bucket in tenancy namespace on OCI
fs.rename("oci://<my_bucket>@<my_namespace>/<my_prefix>/hello.txt", "oci://<my_bucket>@<my_namespace>/<my_prefix>/hello2.txt")
# 24.Delete multiple files from the same bucket
pathlist = ["oci://<my_bucket>@<my_namespace>/<my_prefix>/hello2.txt"]
fs.bulk_delete(pathlist)
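Every path in the example above follows the pattern `oci://<bucket>@<namespace>/<prefix>/<object>`. The sketch below shows how such a path breaks down into its parts using only the standard library. It is an illustration of the URI layout, not the implementation behind `fs.split_path`; the names `OCIPath` and `parse_oci_path` are hypothetical.

```python
from typing import NamedTuple


class OCIPath(NamedTuple):
    """Hypothetical container for the parts of an oci:// path."""
    bucket: str
    namespace: str
    key: str


def parse_oci_path(path: str) -> OCIPath:
    """Split 'oci://bucket@namespace/key' into its three components."""
    scheme = "oci://"
    if not path.startswith(scheme):
        raise ValueError(f"expected a path starting with {scheme!r}: {path!r}")
    # Everything before the first "/" is "<bucket>@<namespace>";
    # everything after it is the object key (which may be empty).
    location, _, key = path[len(scheme):].partition("/")
    bucket, sep, namespace = location.partition("@")
    if not (bucket and sep and namespace):
        raise ValueError(f"expected '<bucket>@<namespace>' in {path!r}")
    return OCIPath(bucket, namespace, key)


parts = parse_oci_path("oci://my_bucket@my_namespace/my_prefix/hello.txt")
print(parts.bucket, parts.namespace, parts.key)
```

A path with no object part, such as the bucket root passed to `fs.isdir` above, yields an empty key in this sketch.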
Or Use With Pandas
import pandas as pd
import ocifs
df = pd.read_csv(
    "oci://my_bucket@my_namespace/my_object.csv",
    storage_options={"config": "~/.oci/config"},
)
Or Use With PyArrow
import pandas as pd
import ocifs
df = pd.read_csv(
    "oci://my_bucket@my_namespace/my_object.csv",
    storage_options={"config": "~/.oci/config"},
)
Or Use With ADSDataset
import ads
import pandas as pd
from ads.common.auth import default_signer
from ads.dataset.dataset import ADSDataset
ads.set_auth(auth="api_key", oci_config_location="~/.oci/config", profile="<profile_name>")
ds = ADSDataset(
    df=pd.read_csv("oci://my_bucket@my_namespace/my_object.csv", storage_options=default_signer()),
    type_discovery=False
)
print(ds.df)
Getting Started
python3 -m pip install ocifs
Software Prerequisites
Python >= 3.6
Environment Variables for Authentication:
export OCIFS_IAM_TYPE=api_key
export OCIFS_CONFIG_LOCATION=~/.oci/config
export OCIFS_CONFIG_PROFILE=DEFAULT
Note, if you are operating on OCI with an alternative valid signer, such as resource principal, instead set the following:
export OCIFS_IAM_TYPE=resource_principal
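The same variables can be set from Python before any filesystem object is created, which can be handy in notebooks. This is a minimal sketch: the helper name `ocifs_auth_env` is hypothetical, and it assumes ocifs reads these variables when a filesystem is instantiated without explicit credentials. Unlike the shell, Python does not expand `~`, so the sketch expands it explicitly.

```python
import os
from typing import Dict


def ocifs_auth_env(
    iam_type: str = "api_key",
    config_location: str = "~/.oci/config",
    profile: str = "DEFAULT",
) -> Dict[str, str]:
    """Build the OCIFS_* authentication variables described above."""
    env = {"OCIFS_IAM_TYPE": iam_type}
    if iam_type == "api_key":
        # Only API-key authentication needs a config file and profile.
        env["OCIFS_CONFIG_LOCATION"] = os.path.expanduser(config_location)
        env["OCIFS_CONFIG_PROFILE"] = profile
    return env


# Apply the settings to the current process.
os.environ.update(ocifs_auth_env())
```

For resource principal authentication, `ocifs_auth_env("resource_principal")` returns only `OCIFS_IAM_TYPE`, matching the single export shown above.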
Environment Variables for enabling Logging:
To quickly see all messages, you can set the environment variable OCIFS_LOGGING_LEVEL=DEBUG.
export OCIFS_LOGGING_LEVEL=DEBUG
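As an illustration of the general pattern, the sketch below maps a level name taken from an environment variable onto Python's standard `logging` levels. It does not reproduce how ocifs handles `OCIFS_LOGGING_LEVEL` internally; the helper name `level_from_env` and the fallback to `WARNING` are assumptions made for this example.

```python
import logging
import os


def level_from_env(var: str = "OCIFS_LOGGING_LEVEL", default: str = "WARNING") -> int:
    """Translate a level name such as 'DEBUG' into a logging level number."""
    name = os.environ.get(var, default).upper()
    level = logging.getLevelName(name)
    # getLevelName returns an int for known names and a string otherwise.
    return level if isinstance(level, int) else logging.WARNING


os.environ["OCIFS_LOGGING_LEVEL"] = "DEBUG"
logging.basicConfig(level=level_from_env())
```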
Documentation
Support
The built-in filesystems in fsspec are maintained by the intake project team, whereas ocifs is an external implementation (similar to s3fs, gcsfs, adl/abfs, and so on) that is maintained by Oracle.
Contributing
This project welcomes contributions from the community. Before submitting a pull request, please review our contribution guide.
Security
Please consult the security guide for our responsible security vulnerability disclosure process.
License
Copyright (c) 2021, 2023 Oracle and/or its affiliates.
Released under the Universal Permissive License v1.0 as shown at https://oss.oracle.com/licenses/upl/.