PyQvd
Utility library for reading/writing Qlik View Data (QVD) files in Python.
The PyQvd library provides a simple API for reading/writing Qlik View Data (QVD) files in Python. Using this library, it is possible to parse the binary QVD file format and convert it to a Python object structure or vice versa.
- Install
- Usage
- QVD File Format
- API Documentation
  - QvdDataFrame
    - @staticmethod from_qvd(path: str) -> QvdDataFrame
    - @staticmethod from_stream(source: BinaryIO) -> QvdDataFrame
    - @staticmethod from_dict(data: Dict[str, List[any]]) -> QvdDataFrame
    - @staticmethod from_pandas(data: pandas.DataFrame) -> QvdDataFrame
    - head(n: int) -> QvdDataFrame
    - tail(n: int) -> QvdDataFrame
    - select(*args: str) -> QvdDataFrame
    - rows(*args: int) -> QvdDataFrame
    - at(row: int, column: str) -> any
    - to_dict() -> Dict[str, List[any]]
    - to_qvd(path: str) -> None
    - to_stream(target: BinaryIO) -> None
    - to_pandas() -> pandas.DataFrame
- License
Install
PyQvd is a Python library available through PyPI. The recommended way to install and maintain PyQvd as a dependency is through the package installer pip. Before installing this library, download and install Python.
You can get PyQvd using the following command:
pip install PyQvd
Usage
Below is a quick example of how to use PyQvd.
from pyqvd import QvdDataFrame
df = QvdDataFrame.from_qvd('sample.qvd')
print(df.head(5))
The above example imports the PyQvd library and parses an example QVD file. A QVD file is typically loaded using the static method QvdDataFrame.from_qvd of the QvdDataFrame class. After loading the file's content, numerous methods and properties are available to work with the parsed data.
QVD File Format
The QVD file format is a binary file format that QlikView uses to store data. The format is proprietary, but it is well documented and can be parsed without a QlikView installation. A QVD file consists of three parts: an XML header and two binary parts, the symbol table and the index table. The XML header contains meta information about the QVD file, such as the number of data records and the names of the fields. The symbol table contains the distinct values of each field. The index table contains the data records as a list of indices that point to values in the symbol table.
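The split into distinct values and indices can be illustrated with a small, library-independent sketch. The function name encode_columns and the sample data are hypothetical and only model the idea in memory; this is not how PyQvd or QlikView lay out bytes on disk.

```python
from typing import Any, Dict, List, Tuple


def encode_columns(
    columns: List[str], rows: List[List[Any]]
) -> Tuple[Dict[str, List[Any]], List[List[int]]]:
    """Split rows into per-column lists of distinct values and rows of indices."""
    symbols: Dict[str, List[Any]] = {name: [] for name in columns}
    positions: Dict[str, Dict[Any, int]] = {name: {} for name in columns}
    index_rows: List[List[int]] = []

    for row in rows:
        index_row = []
        for name, value in zip(columns, row):
            lookup = positions[name]
            if value not in lookup:
                # First occurrence: append the value to the column's symbol list.
                lookup[value] = len(symbols[name])
                symbols[name].append(value)
            index_row.append(lookup[value])
        index_rows.append(index_row)

    return symbols, index_rows


columns = ["Country", "Year"]
rows = [["DE", 2023], ["FR", 2023], ["DE", 2024]]
symbols, index_rows = encode_columns(columns, rows)
print(symbols)     # {'Country': ['DE', 'FR'], 'Year': [2023, 2024]}
print(index_rows)  # [[0, 0], [1, 0], [0, 1]]
```

Each distinct value is stored once per column, and every record shrinks to one small integer per field.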
XML Header
The XML header is always located at the beginning of the file and is stored as human-readable text. It contains meta information about the QVD file: the number of data records, the names of the fields, and the data types of the fields.
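As a rough illustration, such a header can be read with Python's standard XML parser. The element names below (QvdTableHeader, QvdFieldHeader, FieldName, Offset, Length, NoOfRecords) are assumptions that this document does not state, and the header shown is a hypothetical, heavily shortened example rather than a complete one.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened header; element names are assumptions, and real
# headers carry many more elements than shown here.
HEADER = """<QvdTableHeader>
  <TableName>Sales</TableName>
  <Fields>
    <QvdFieldHeader>
      <FieldName>Country</FieldName>
      <Offset>0</Offset>
      <Length>8</Length>
    </QvdFieldHeader>
    <QvdFieldHeader>
      <FieldName>Year</FieldName>
      <Offset>8</Offset>
      <Length>10</Length>
    </QvdFieldHeader>
  </Fields>
  <NoOfRecords>3</NoOfRecords>
</QvdTableHeader>"""

root = ET.fromstring(HEADER)

# Number of data records stored in the file.
record_count = int(root.findtext("NoOfRecords"))

# One entry per field, in header order.
fields = [
    {
        "name": field.findtext("FieldName"),
        "offset": int(field.findtext("Offset")),
        "length": int(field.findtext("Length")),
    }
    for field in root.iter("QvdFieldHeader")
]

print(record_count)
print([f["name"] for f in fields])
```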
Symbol Table
The symbol table contains the distinct values of the fields and is located directly after the XML header. The order of columns in the symbol table corresponds to the order of the fields in the XML header. The length and offset of each column's symbol section are also stored in the XML header. Each symbol section consists of the unique symbols of the respective column. The type of a single symbol is determined by a type byte prefixed to the symbol value. The following types of symbols are supported:
Code | Type | Description |
---|---|---|
1 | Integer | signed 4-byte integer (little endian) |
2 | Float | signed 8-byte IEEE floating point number (little endian) |
4 | String | null terminated string |
5 | Dual Integer | signed 4-byte integer (little endian) followed by a null terminated string |
6 | Dual Float | signed 8-byte IEEE floating point number (little endian) followed by a null terminated string |
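The table above translates fairly directly into a decoding loop. The following is a minimal sketch, not PyQvd's actual implementation: the function names are hypothetical, strings are assumed to be UTF-8 encoded, and dual values are represented as (number, text) tuples purely for illustration.

```python
import struct
from typing import Any, List, Tuple


def read_cstring(buf: bytes, pos: int) -> Tuple[str, int]:
    """Read a null-terminated string; return it and the position after the terminator."""
    end = buf.index(b"\x00", pos)
    # UTF-8 is an assumption; the document does not name the encoding.
    return buf[pos:end].decode("utf-8"), end + 1


def parse_symbols(buf: bytes) -> List[Any]:
    """Decode one column's symbol section using the type codes from the table."""
    symbols: List[Any] = []
    pos = 0
    while pos < len(buf):
        code = buf[pos]
        pos += 1
        if code == 1:    # signed 4-byte integer, little endian
            (value,) = struct.unpack_from("<i", buf, pos)
            pos += 4
            symbols.append(value)
        elif code == 2:  # 8-byte IEEE float, little endian
            (value,) = struct.unpack_from("<d", buf, pos)
            pos += 8
            symbols.append(value)
        elif code == 4:  # null-terminated string
            text, pos = read_cstring(buf, pos)
            symbols.append(text)
        elif code == 5:  # integer followed by its string form
            (number,) = struct.unpack_from("<i", buf, pos)
            text, pos = read_cstring(buf, pos + 4)
            symbols.append((number, text))
        elif code == 6:  # float followed by its string form
            (number,) = struct.unpack_from("<d", buf, pos)
            text, pos = read_cstring(buf, pos + 8)
            symbols.append((number, text))
        else:
            raise ValueError(f"unknown symbol type code {code}")
    return symbols


# Hand-built symbol section containing one symbol of each type.
section = (
    b"\x01" + struct.pack("<i", 2023)
    + b"\x02" + struct.pack("<d", 1.5)
    + b"\x04" + b"DE\x00"
    + b"\x05" + struct.pack("<i", 7) + b"7\x00"
    + b"\x06" + struct.pack("<d", 0.25) + b"25%\x00"
)
print(parse_symbols(section))
# [2023, 1.5, 'DE', (7, '7'), (0.25, '25%')]
```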
Index Table
The index table follows the symbol table and contains the actual data records. Each record is stored as binary indices that reference the values of that row in the symbol table. The order of the columns in the index table corresponds to the order of the fields in the XML header. Hence, the index table does not contain the actual values of a data record, but only the indices that point to the values in the symbol table.
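Resolving a record therefore amounts to one lookup per column. The sketch below assumes the binary indices have already been decoded into plain Python integers; the on-disk packing of the indices is not covered here, and the name resolve_rows is hypothetical.

```python
from typing import Any, List


def resolve_rows(
    symbol_tables: List[List[Any]], index_rows: List[List[int]]
) -> List[List[Any]]:
    """Replace every index with the symbol it points to in its column's table."""
    return [
        [symbol_tables[column][index] for column, index in enumerate(index_row)]
        for index_row in index_rows
    ]


# One symbol list per column, in the same order as the fields in the header.
symbol_tables = [["DE", "FR"], [2023, 2024]]
# One list of indices per data record.
index_rows = [[0, 0], [1, 0], [0, 1]]

records = resolve_rows(symbol_tables, index_rows)
print(records)  # [['DE', 2023], ['FR', 2023], ['DE', 2024]]
```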
API Documentation
QvdDataFrame
The QvdDataFrame class represents the data frame stored in a parsed QVD file. It provides high-level access to the QVD file content, including meta information as well as the actual data records.
Property | Type | Description |
---|---|---|
shape | tuple[int, int] | The shape of the data frame. First value is number of rows, second value number of columns. |
data | list[list[any]] | The actual data. The first dimension represents the single rows. |
columns | list[str] | The names of the fields that are contained in the QVD file. |
@staticmethod from_qvd(path: str) -> QvdDataFrame
The static method QvdDataFrame.from_qvd loads a QVD file from the given path and parses it. The method returns a QvdDataFrame instance.
@staticmethod from_stream(source: BinaryIO) -> QvdDataFrame
The static method QvdDataFrame.from_stream loads a QVD file from the given binary stream. The method returns a QvdDataFrame instance.
@staticmethod from_dict(data: Dict[str, List[any]]) -> QvdDataFrame
The static method QvdDataFrame.from_dict constructs a data frame from a dictionary. The dictionary must contain the keys columns and data. The columns entry is a list of strings that contains the names of the fields in the QVD file. The data entry is a list of lists that contains the actual data records. The order of the values in the inner lists corresponds to the order of the fields in the QVD file.
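The dictionary layout described above can be written out as a plain Python literal. The sketch below only checks that layout and does not call PyQvd itself; check_layout is a hypothetical helper, not part of the library, and the sample values are made up.

```python
from typing import Any, Dict, List


def check_layout(raw: Dict[str, List[Any]]) -> None:
    """Raise ValueError if a row does not have one value per column."""
    columns, data = raw["columns"], raw["data"]
    for number, row in enumerate(data):
        if len(row) != len(columns):
            raise ValueError(
                f"row {number} has {len(row)} values, expected {len(columns)}"
            )


raw = {
    "columns": ["Country", "Year"],
    "data": [["DE", 2023], ["FR", 2023], ["DE", 2024]],
}
check_layout(raw)  # passes silently: every row has one value per column

# With PyQvd installed, a dictionary of this shape is what the description
# above says from_dict expects (not executed here):
# df = QvdDataFrame.from_dict(raw)
```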
@staticmethod from_pandas(data: pandas.DataFrame) -> QvdDataFrame
The static method QvdDataFrame.from_pandas constructs a data frame from a pandas data frame.
head(n: int) -> QvdDataFrame
The method head returns the first n rows of the data frame.
tail(n: int) -> QvdDataFrame
The method tail returns the last n rows of the data frame.
select(*args: str) -> QvdDataFrame
The method select returns a new data frame that contains only the specified columns.
rows(*args: int) -> QvdDataFrame
The method rows returns a new data frame that contains only the specified rows.
at(row: int, column: str) -> any
The method at returns the value at the specified row and column.
to_dict() -> Dict[str, List[any]]
The method to_dict returns the data frame as a dictionary. The dictionary contains the keys columns and data. The columns entry is a list of strings that contains the names of the fields in the QVD file. The data entry is a list of lists that contains the actual data records. The order of the values in the inner lists corresponds to the order of the fields in the QVD file.
to_qvd(path: str) -> None
The method to_qvd writes the data frame to a QVD file at the specified path.
to_stream(target: BinaryIO) -> None
The method to_stream writes the data frame as a QVD file to a binary stream.
to_pandas() -> pandas.DataFrame
The method to_pandas returns the data frame as a pandas data frame.
License
Copyright (c) 2024 Constantin Müller
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
See the MIT License or the LICENSE file for more details.