A python library used to query data from the Eigen Ingenuity system
CONTENTS
- About
- Eigen Ingenuity
- Asset Model Builder
- Coming soon
- Planned
- License
About
This library supports Python 3.10 onwards. It may work with earlier versions of Python 3, but these are not supported. Python 2 is not supported in any form.
The python-eigen-ingenuity library contains 2 modules:
1. Eigen Ingenuity
This module is used to query data from many of the databases in the Ingenuity Platform, including:
- timeseries historians (influx, PI, IP21, cognite)
- A Neo4j Graph database
- SQL sources (Oracle, msql, psql)
- Elasticsearch
The module can return data in several formats and supports multiple query types.
2. Model Builder
We provide a portable CLI tool that can be used to build a model onto a neo4j instance from a list of csv files that define a set of nodes/properties and their relationships.
It can be used either to create a model from scratch (Though it does require an existing blank neo4j container/machine), or it can be directed to an existing neo4j to perform a merge/update.
It includes options for version nodes to track history and changes to nodes when a model is updated.
Eigen Ingenuity
Installation
Install python (3.10+), then in the terminal run:
pip install python-eigen-ingenuity
All required third-party libraries will be installed automatically.
Getting Started
Begin by importing the module at the top of a script with
import eigeningenuity as eigen
To use this module, you must first set an Ingenuity server to query, and a datasource within the server.
For example, for a historian with Ingenuity instance "https://demo.eigen.co/" and datasource "Demo-influxdb",
server = eigen.EigenServer("https://demo.eigen.co/")
demo = eigen.get_historian("Demo-influxdb",server)
Alternatively, the Ingenuity instance can be set through the environment variable "EIGENSERVER":
import os
os.environ["EIGENSERVER"] = "https://demo.eigen.co/"
demo = eigen.get_historian("Demo-influxdb")
If the datasource of interest is the default datasource for the Ingenuity instance, it can also be omitted:
os.environ["EIGENSERVER"] = "https://demo.eigen.co/"
demo = eigen.get_historian()
With the datasource set, the historian data can be queried with functions such as,
demo.getInterpolatedRange(tag,start,end,points)
demo.getCurrentDataPoints(tag)
demo.listDataTags()
Where:
- tag is the name of the tag to query
- start is the epoch timestamp (ms) of the beginning of the query window
- end is the epoch timestamp (ms) of the end of the query window
- points is the number of points to be returned
Each data query returns a list in which each element holds the value, timestamp and status of a single data point.
To convert a Datetime (UTC or Local) to epoch, or vice-versa, you can use this tool: https://www.epochconverter.com/
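For scripted conversions the standard library is sufficient. The following is a minimal sketch and not part of the eigeningenuity API: the helper names to_epoch_ms and from_epoch_ms are hypothetical, and naive datetimes are assumed to be in UTC.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(dt: datetime) -> int:
    """Convert a datetime to whole milliseconds since the Unix epoch."""
    if dt.tzinfo is None:
        # Assumption: a naive datetime is interpreted as UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    # Integer division of timedeltas avoids floating point rounding.
    return (dt - EPOCH) // timedelta(milliseconds=1)


def from_epoch_ms(ms: int) -> datetime:
    """Convert milliseconds since the Unix epoch to a UTC datetime."""
    return EPOCH + timedelta(milliseconds=ms)


# Build a one hour query window in epoch milliseconds.
start = to_epoch_ms(datetime(2023, 11, 28, 10, 19, 1, 139000, tzinfo=timezone.utc))
end = start + 60 * 60 * 1000
```

The resulting start and end values can then be passed wherever an epoch timestamp in ms is expected.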
Historian
Data Format
Once the server and datasource have been configured, the historian data can be queried through the functions described in the Functions section.
These functions can query a single tag or multiple tags at once. For a tag that appears in Ingenuity as "datasource/tagname", the query is, for example:
datasource = eigen.get_historian("datasource")
tagdata = datasource.getCurrentDataPoints("tagname")
Functions offer several ways to return the data, selected with the "output" parameter:
- The raw response (output="raw")
- A preformatted python dict (default: output="json")
- A pandas dataframe (output="df")
Example
x = influx.getInterpolatedRange("DEMO_02TI301.PV","1 hour ago","now",3)
- Raw:
{'items': {'DEMO_02TI301.PV': [{'value': 38.0, 'timestamp': 1701166741139, 'status': 'OK'}, {'value': 37.5, 'timestamp': 1701168541139, 'status': 'OK'}, {'value': 38.0, 'timestamp': 1701170341139, 'status': 'OK'}]}, 'unknown': []}
- Json:
[{'value': 35.88444444444445, 'timestamp': 1701166983980, 'status': 'OK'}, {'value': 33.5, 'timestamp': 1701168783980, 'status': 'OK'}, {'value': 34.0, 'timestamp': 1701170583980, 'status': 'OK'}]
- Dataframe:
                         DEMO_02TI301.PV
2023-11-28 11:23:39.201             38.0
2023-11-28 10:53:39.201             36.0
2023-11-28 10:23:39.201             33.0
- CSV:
DEMO_02TI301.PV,37.1718,1701167341282,OK
DEMO_02TI301.PV,35.5,1701169141282,OK
DEMO_02TI301.PV,37.0,1701170941283,OK
The CSV output type allows for 2 additional optional parameters:
- multi-csv: Creates a separate csv for each tag queried, rather than placing them all in one. Also puts tag in filename rather than in row
- filepath: Specify a directory to write the csv files to
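As an illustration of working with the raw output shown in the example above, the sketch below pulls the points for one tag out of the response. The helper unpack_raw is hypothetical, and reading the 'unknown' key as a list of tags the historian did not recognise is an assumption based on its name.

```python
# Raw response copied from the example output above.
raw = {
    "items": {
        "DEMO_02TI301.PV": [
            {"value": 38.0, "timestamp": 1701166741139, "status": "OK"},
            {"value": 37.5, "timestamp": 1701168541139, "status": "OK"},
            {"value": 38.0, "timestamp": 1701170341139, "status": "OK"},
        ]
    },
    "unknown": [],
}


def unpack_raw(response: dict, tag: str) -> list:
    """Return the list of points for one tag from a raw response."""
    # Assumption: 'unknown' lists tags the historian could not resolve.
    if tag in response.get("unknown", []):
        raise KeyError(f"tag not recognised by the historian: {tag}")
    return response["items"][tag]


points = unpack_raw(raw, "DEMO_02TI301.PV")
values = [point["value"] for point in points]
timestamps = [point["timestamp"] for point in points]
```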
Query Multiple tags
If multiple tags are queried in a single request, the data is returned as a dictionary keyed by tag ID. Each entry keeps the same format that is returned when querying a single tag.
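A multi-tag result can therefore be processed tag by tag. The sketch below uses hand-written data in the json shape shown earlier. The second tag name, its values and the "BAD" status are hypothetical, as is the helper latest_ok_value.

```python
# Hand-written stand-in for a multi-tag result of a range query.
multi = {
    "DEMO_02TI301.PV": [
        {"value": 33.5, "timestamp": 1701168783980, "status": "OK"},
        {"value": 34.0, "timestamp": 1701170583980, "status": "OK"},
    ],
    "DEMO_02TI302.PV": [  # hypothetical second tag
        {"value": 21.5, "timestamp": 1701168783980, "status": "OK"},
        {"value": 0.0, "timestamp": 1701170583980, "status": "BAD"},
    ],
}


def latest_ok_value(points):
    """Return the value of the most recent point with status OK, or None."""
    ok_points = [p for p in points if p["status"] == "OK"]
    if not ok_points:
        return None
    return max(ok_points, key=lambda p: p["timestamp"])["value"]


# One summary value per tag, keyed the same way as the query result.
summary = {tag: latest_ok_value(points) for tag, points in multi.items()}
```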
Functions
General Functions:
Simple Functions to check server defaults
List Historians
Method: list_historians
Find all historians on the instance
from eigeningenuity import list_historians
list_historians(eigenserver)
Where:
- (Optional) eigenserver is the Ingenuity instance of interest (if omitted, the environment variable EIGENSERVER is used)
Returns a list of strings
Get Default Historian
Method: get_default_historian_name
Find the name of the default historian of the instance, if one exists
from eigeningenuity import get_default_historian_name
get_default_historian_name(eigenserver)
Where:
- (Optional) eigenserver is the Ingenuity instance of interest (if omitted, the environment variable EIGENSERVER is used)
Returns a string, or None
Read Functions
The following functions are designed to help the user pull and process data from historians into a python environment
Get Current Data Points
Method: getCurrentDataPoints
Find the most recent raw datapoint for each tag
demo.getCurrentDataPoints(tags,output)
Where:
- tags is a list of IDs of tags to query
- output (optional) See DATA FORMAT section
- multi-csv (optional) See DATA FORMAT section
- filepath (optional) See DATA FORMAT section
Returns one datapoint object per tag
Get Number of Points
Method: countPoints
Find the number of datapoints in the given time frame
demo.countPoints(tags, start, end, output)
Where:
- tags is a list of IDs of tags to query
- start is the datetime object (or epoch timestamp in ms) of the beginning of the query window
- end is the datetime object (or epoch timestamp in ms) of the end of the query window
- output (optional) See DATA FORMAT section
- multi-csv (optional) See DATA FORMAT section
- filepath (optional) See DATA FORMAT section
Returns one integer per tag
Get Interpolated Points in a Time Range
Method: getInterpolatedRange
Find a number of interpolated points of a tag, equally spaced over a set timeframe
demo.getInterpolatedRange(tags, start, end, count, output)
Where:
- tags is a list of IDs of the tags to query
- start is the datetime object (or epoch timestamp in ms) of the beginning of the query window
- end is the datetime object (or epoch timestamp in ms) of the end of the query window
- count is the total number of points to be returned
- output (optional) See DATA FORMAT section
- multi-csv (optional) See DATA FORMAT section
- filepath (optional) See DATA FORMAT section
Returns a list of count-many datapoints per tag
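To show what "equally spaced" means in practice, the sketch below computes the timestamps at which interpolated points would fall. It assumes both ends of the window are included, which is inferred from the raw example in the Data Format section (3 points over one hour, 30 minutes apart). The server may behave differently, and the function name is hypothetical.

```python
def evenly_spaced_timestamps(start_ms: int, end_ms: int, count: int) -> list:
    """Return `count` timestamps spread evenly from start_ms to end_ms inclusive."""
    if count < 1:
        raise ValueError("count must be at least 1")
    if count == 1:
        # Assumption: a single point is placed at the start of the window.
        return [start_ms]
    span = end_ms - start_ms
    return [start_ms + (span * i) // (count - 1) for i in range(count)]


# Reproduces the timestamps of the raw example: one hour, three points.
stamps = evenly_spaced_timestamps(1701166741139, 1701170341139, 3)
```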
Get Values at Given Times
Method: getInterpolatedPoints
Find datapoints at given timestamps
demo.getInterpolatedPoints(tags, timestamps, output)
Where:
- tags is a list of IDs of the tags to query
- timestamps is a list of timestamps at which to query data
- output (optional) See DATA FORMAT section
- multi-csv (optional) See DATA FORMAT section
- filepath (optional) See DATA FORMAT section
Returns a list of datapoints (one at each timestamp) per tag
Get Raw Points in a Time Range
Method: getRawDataPoints
Find the first n Raw datapoints from a time window
demo.getRawDataPoints(tags, start, end, count, output)
Where:
- tags is a list of IDs of the tags to query
- start is the datetime object (or epoch timestamp in ms) of the beginning of the query window
- end is the datetime object (or epoch timestamp in ms) of the end of the query window
- (Optional) count is the maximum number of raw datapoints to return. (default is 1000)
- output (optional) See DATA FORMAT section
- multi-csv (optional) See DATA FORMAT section
- filepath (optional) See DATA FORMAT section
Returns a list of count-many datapoints per tag
Get Aggregates for a Time Range
Method: getAggregates
Finds a set of aggregate values for tags over a timeframe
demo.getAggregates(tags, start, end, count, aggfields, output)
Where:
- tags is a list of IDs of the tags to query
- start is the datetime object (or epoch timestamp in ms) of the beginning of the query window
- end is the datetime object (or epoch timestamp in ms) of the end of the query window
- (Optional) count is the number of divisions to split the time window into (i.e. if time window is one day, and count is 2, we return separate sets of aggregate data for first and second half of day). omit for count=1
- (Optional) aggfields is a list of aggregate functions to calculate, a subset of ["min","max","avg","var","stddev","numgood","numbad"]. Leave blank to return all aggregates.
- output (optional) See DATA FORMAT section
- multi-csv (optional) See DATA FORMAT section
- filepath (optional) See DATA FORMAT section
Returns a list of count-many Aggregate Data Sets per tag
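The meaning of count and of the aggregate fields can be illustrated with a local calculation. This is only a sketch of the idea, not how the historian computes its results: all function names are hypothetical, points with a status other than "OK" are assumed to count as bad, and population variance is assumed for "var" and "stddev".

```python
from statistics import fmean, pstdev, pvariance


def split_window(start_ms, end_ms, count=1):
    """Split [start_ms, end_ms] into `count` equal sub-windows."""
    span = end_ms - start_ms
    edges = [start_ms + (span * i) // count for i in range(count + 1)]
    return list(zip(edges[:-1], edges[1:]))


def aggregate(points):
    """Compute the aggregate fields for one list of points."""
    good = [p["value"] for p in points if p["status"] == "OK"]
    result = {"numgood": len(good), "numbad": len(points) - len(good)}
    if good:
        result.update(min=min(good), max=max(good), avg=fmean(good),
                      var=pvariance(good), stddev=pstdev(good))
    return result


def aggregates_per_division(points, start_ms, end_ms, count=1):
    """Return one aggregate data set per division of the window."""
    windows = split_window(start_ms, end_ms, count)
    results = []
    for index, (lo, hi) in enumerate(windows):
        is_last = index == len(windows) - 1
        # Each division is half-open, except the last, which includes the end.
        selected = [p for p in points
                    if lo <= p["timestamp"] < hi or (is_last and p["timestamp"] == hi)]
        results.append(aggregate(selected))
    return results


DAY_MS = 86_400_000
sample = [  # hypothetical data
    {"value": 10.0, "timestamp": 0, "status": "OK"},
    {"value": 20.0, "timestamp": 3_600_000, "status": "OK"},
    {"value": 30.0, "timestamp": 50_000_000, "status": "OK"},
    {"value": 0.0, "timestamp": DAY_MS, "status": "BAD"},
]
halves = aggregates_per_division(sample, 0, DAY_MS, count=2)
```

With count=2 over one day, halves holds one aggregate data set for the first half of the day and one for the second.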
Get Aggregates on Intervals over a Time Range
Method: getAggregateInterval
A variation of getAggregates which finds aggregates on fixed length intervals dividing the overall window
demo.getAggregateInterval(tags, start, end, interval, aggfields, output)
Where:
- tags is a list of IDs of the tags to query
- start is the datetime object (or epoch timestamp in ms) of the beginning of the query window
- end is the datetime object (or epoch timestamp in ms) of the end of the query window
- (Optional) interval is the length of the sub-intervals over which aggregates are calculated, it accepts values such as ["1s","1m","1h","1d","1M","1y"] being 1 second, 1 minute, 1 hour etc. Default is whole time window.
- (Optional) aggfields is a list of aggregate functions to calculate, a subset of ["min","max","avg","var","stddev","numgood","numbad"]. Default is all Aggregates.
- output (optional) See DATA FORMAT section
- multi-csv (optional) See DATA FORMAT section
- filepath (optional) See DATA FORMAT section
Returns a list of Aggregate Data Sets (One per interval) per tag
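Note that the interval unit is case sensitive: "1m" is one minute while "1M" is one month. The sketch below converts the fixed-length units to milliseconds and estimates how many aggregate data sets a window would produce. The helper names are hypothetical, and accepting amounts other than 1 (such as "15m") is an assumption.

```python
import re

# Only units with a fixed length can be converted to milliseconds.
UNIT_MS = {"s": 1_000, "m": 60_000, "h": 3_600_000, "d": 86_400_000}


def interval_to_ms(interval: str) -> int:
    """Convert an interval such as '1h' to milliseconds."""
    match = re.fullmatch(r"(\d+)([smhdMy])", interval)
    if match is None:
        raise ValueError(f"unrecognised interval: {interval!r}")
    amount, unit = int(match.group(1)), match.group(2)
    if unit not in UNIT_MS:
        # Months ('M') and years ('y') vary in length, so they are rejected here.
        raise ValueError(f"interval {interval!r} has no fixed length")
    return amount * UNIT_MS[unit]


def interval_count(start_ms: int, end_ms: int, interval: str) -> int:
    """Number of sub-intervals needed to cover the window (the last may be partial)."""
    step = interval_to_ms(interval)
    return -(-(end_ms - start_ms) // step)  # ceiling division


hourly_sets = interval_count(0, 86_400_000, "1h")
```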
List Data Tags Matching Wildcard
Method: listDataTags
Find all tags in datasource, or all tags in datasource that match a search parameter
demo.listDataTags(match)
Where:
- (optional) match is the wildcard pattern to match tags against (e.g. DEMO* will match all tags beginning with DEMO, *DEMO* will match all tags containing DEMO, and *DEMO will match all tags ending with DEMO) (Leave blank to return all tags in historian)
Returns a list of strings
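The wildcard behaviour described above can be reproduced locally with the standard library's fnmatch module, for example to filter a tag list that has already been fetched. This is a sketch only: the tag names other than DEMO_02TI301.PV are hypothetical, and the server's own matching (for instance its case sensitivity) may differ.

```python
from fnmatch import fnmatchcase

tags = ["DEMO_02TI301.PV", "DEMO_02FI201.PV", "PLANT_DEMO_01.PV", "TEST_DEMO"]


def match_tags(tag_list, pattern="*"):
    """Return the tags that match a wildcard pattern (case sensitive)."""
    return [tag for tag in tag_list if fnmatchcase(tag, pattern)]


starts_with = match_tags(tags, "DEMO*")
contains = match_tags(tags, "*DEMO*")
ends_with = match_tags(tags, "*DEMO")
```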
Get Tag Metadata
Method: getMetaData
Find units, unitMultiplier and description of each tag
demo.getMetaData(tags, output)
Where:
- tags is a list of IDs of tags to query
- output (optional) Does Not Accept CSV. Otherwise, See DATA FORMAT section
Returns a dict with keys [units, unitMultiplier, description] per tag
Write Functions
The following functions are intended for users to update/create historian tags using data processed/loaded in python. They can only be run against Eigen's internal influx historians, not production systems.
Create Or Update Data Tag
Method: createDataTag
Creates a datatag with a specified ID, Unit type/label, and Description. You can use an existing tag name to update the metadata
demo.createDataTag(Name, Units, Description)
Where:
- Name is the unique ID/Identifier of the tag
- Units is the unit specifier of the data in the tag e.g. "m/s","Days" etc. (This will be shown on axis in ingenuity trends)
- Description is text/metadata describing the content/purpose of the tag (This will show up in search bar for ingenuity trends)
Returns a boolean representing success/failure to create tag
Write Data Points to Tag
Method: writeDataPoints
Writes sets of datapoints to the historian
from eigeningenuity.historian import DataPoint
dataPointList = []
point = DataPoint(value, timestamp, "OK")
dataPoint = {tagName: point}
dataPointList.append(dataPoint)
demo.writeDataPoints(dataPointList)
Where:
- value is the value of the datapoint at the timestamp
- timestamp is the datetime object (or epoch timestamp in ms) of the point
- "OK" is the status we give to a point that contains non-null data
Returns a boolean representing success/failure to write data
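When many readings need to be written, the list-of-dicts structure used above can be built in one step. The helper below is hypothetical and takes the point constructor as an argument so that the sketch runs without the library installed. In real use, DataPoint would be passed as make_point. The tag name "MY_TAG" is a placeholder.

```python
def build_point_list(tag_name, readings, make_point):
    """Build the list passed to writeDataPoints.

    readings is an iterable of (value, timestamp_ms) pairs and make_point is
    called as make_point(value, timestamp, status) for each of them.
    """
    return [{tag_name: make_point(value, timestamp, "OK")}
            for value, timestamp in readings]


readings = [(38.0, 1701166741139), (37.5, 1701168541139)]

# Stand-in factory that returns a plain tuple instead of a DataPoint.
point_list = build_point_list("MY_TAG", readings, lambda v, t, s: (v, t, s))
```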
Asset Model
Currently the asset model tools only support direct Cypher queries through the executeRawQuery method. More structured methods are planned.
Execute a Raw Cypher Query
Method: executeRawQuery
Executes a cypher query directly against our asset model
from eigeningenuity import get_assetmodel, EigenServer
demo = EigenServer("demo.eigen.co")
am = get_assetmodel(demo)
wells = am.executeRawQuery(query)
Where:
- query is a string containing a valid neo4j/cypher language query e.g. "Match (n:Well) return n limit 25"
Returns the json response from neo4j
Asset Model Builder
A tool for updating a Neo4j model using .csv and .cypher files as input
Running the tool
The tool is invoked from anywhere on the command line by executing:
assetmodel
Command Line options
-p - sets the default path for input files. The program will look first in the current folder, and then in the default folder if the file is not found locally. Output files are always created in the default folder
-dr - Set the frequency of progress update messages. No effect if -v present
-sf - The first query to process in a .csv file. No effect on cypher files. Default is 1
-d - specifies the target database, as defined in the config file. Can specify the DB name (as defined in config.ini), or a position in the list e.g. -d 0 will connect to the first database defined
-c - the name of the config file if not config.ini, or it's not in the default folder
-s - the separator used in the .csv files. Default is ";"
-v - run in verification mode. Queries are generated and displayed on screen, but are not processed
-sq - show queries on screen as they are processed. Has no effect if -v present
-sr - causes data from RETURN clauses to be displayed on screen. No effect if -v present
-wq - causes the processed queries to be written to the specified file. This is very useful to review all the queries that have been executed. It includes all the queries used to create and update Version nodes too. No effect if -v present
-wr - causes data from RETURN clauses to be written to the specified file. No effect if -v present
-f - list of input files. Can be a mix of .csv and .cypher files. Also supports .lst files, which are a simple text file containing a list of files to process (can also include more .lst files). See -o for information on the order the files are processed in
-q - list of Cypher queries explicitly entered on the command line. These are processed before any files specified with -f, unless the default order is overridden by the -o switch
-pre - list of Cypher queries explicitly entered on the command line. These are processed first, before anything else
-da - delete all nodes with the specified labels. More than one set of labels can be specified. The delete is performed AFTER any -pre queries. Using -da with no labels will delete all nodes, so use with caution! Examples:
- -da :Person :EI_VERSION:Pump deletes all nodes with a Person label, and then all nodes with both EI_VERSION and Pump labels. Note: the leading ':' is optional so -da Person will also delete all Person nodes
- -da (with no labels) deletes ALL nodes - be careful!
-post - list of Cypher queries explicitly entered on the command line. These are processed last, after everything else
-o - defines the order the file types are processed in. Accepts up to 5 chars: 'n' for node csv files, 'r' for relationship csv files, 'c' for cypher files, 'q' for command line queries and 'x' for any unspecified file type, processed in the order given. The default is qnrc. Examples:
- -o x processes all the inputs in the order listed
- -o c will process all .cypher files first (in the order listed), then others in the default order. This is the same as -o cqnr
- -o cx will process all .cypher files first, then others in the order listed
- -o nxr processes nodes, then cypher files and command line queries in the order listed, and finally relationship files
-nov - suppresses the creation and update of version nodes
-in - treat node columns in a relationship csv file as properties of the relationship. This allows a relationship to have a property that is the same as that used to identify a node. Without this qualifier, the system will report an 'Ambiguous data' error because it cannot determine if the csv file is intended to be a node file or a relationship file.
-sd - prevents the creation of a version node if none of the node properties or labels will be changed by the update
-ab - Update a property even if it is blank
-b - group queries into batches of the given size. Note: creation of version nodes is disabled in batch mode i.e. -nov is in effect
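The ordering rules of the -o option can be summarised in code. The sketch below is one reading of the description and examples given for -o, not the tool's actual implementation. The function names are hypothetical, and each input is assumed to have already been classified as 'n', 'r', 'c' or 'q'.

```python
DEFAULT_ORDER = "qnrc"


def expand_order(order: str = "") -> str:
    """Fill in the types missing from an -o value.

    If 'x' is present it stands for every unspecified type, so nothing is
    added. Otherwise the missing types follow in the default order.
    """
    if "x" in order:
        return order
    return order + "".join(kind for kind in DEFAULT_ORDER if kind not in order)


def arrange(inputs, order: str = ""):
    """Sort (kind, name) pairs into processing order for a given -o value."""
    expanded = expand_order(order)
    explicit = set(expanded) - {"x"}
    arranged = []
    for kind in expanded:
        if kind == "x":
            # Unspecified types keep the order in which they were listed.
            arranged += [item for item in inputs if item[0] not in explicit]
        else:
            arranged += [item for item in inputs if item[0] == kind]
    return arranged


inputs = [
    ("c", "Extended.cypher"),
    ("n", "People.csv"),
    ("q", "MATCH (n) RETURN n"),
    ("r", "Relations.csv"),
]
nodes_first = arrange(inputs, "nxr")
```

Here nodes_first places People.csv first, then the cypher file and the command line query in the order listed, and Relations.csv last.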
Examples
assetmodelbuilder -p models/Testing -d neo4j-test -f People.csv "Extended.cypher" -v
Looks in a folder models/Testing for the file People.csv and config.ini. It uses this csv file to generate queries that it simulates executing against the server configured with name neo4j-test in config.ini. However the database is not written to due to -v flag.
assetmodelbuilder -p models/Testing -d neo4j-test -q "MATCH (n) RETURN n" -sr
Looks in a folder models/Testing for the file config.ini. Then executes cypher command "MATCH (n) RETURN n" against the server configured with name neo4j-test in config.ini and returns the neo4j response to the console due to -sr flag
python src/assetmodelbuilder.py -p models/Testing -d 1 -f Rebuild.lst Versions.csv RelationVersions.csv -o x
Deletes all the nodes and recreates them, with Version nodes and updated Relationships.
The Config File
This file contains key information about the structure of the Asset Model. The layout is
[DEFAULT]
[Database]
#Define the database connections here. The name is used by the program to identify the db to connect to
DatabaseNames=Database0 Name,Database1 Name
URLs=Database0 URL,Database1 URL
passwords=Database0 password,Database1 password
[Model]
#The PrimaryIDProperty is the property that is used to uniquely identify it. It is created and managed
#by the model builder. The default value is uuid
PrimaryIDProperty=uuid
#The PrimaryProperty is the default property to use to match on when creating relationships
#This can be overridden by specifying the required property in the csv files
PrimaryProperty=node
#Specify any labels that are required by ALL nodes. This is a shortcut so that they are not needed in the .csv file
#Labels defined in the .csv file can be REMOVED by listing them here with a leading !
#In this example, any nodes with a Person label will have that removed, and all nodes will get a People label
RequiredLabels= !Person , People
#Similar to labels, list any properties that a node must have together with a default value. If a value is provided
#in the input file, that value is used rather than the default value here
#Also like labels, properties can be removed by using the ! prefix (no need to give a default)
#In this example, everyone will have a Nationality property equal to British, apart from those whose Nationality is in the input
#Everyone will have their Weight property removed. Phew!
RequiredProperties=Nationality=British , !Weight
#All nodes are timestamped when they are created or updated. Specify the name of the node property to be used
#for these using CreationTime and UpdateTime
CreationTime=CreationTime
UpdateTime=UpdatedTime
#Sometimes there are columns in the .csv files that have the wrong name. These can be changed using a mapping
#in the format of old_name=new_name. The new_name can then be used in the [Alias] section (see below)
#In the example, the .csv files use a mix of Node and node for the name of the node. Both of these are mapped to
#name, so that all the nodes in the model have a name property. name can then be used to create relationships, for example.
Mappings=node=name,Node=name
#To set a default data type, use DataType=. The given format will be used, unless a format is specified in the CSV header. For example, to treat data as strings use DataType=str
[Aliases]
#The Alias section defines how to map column headings onto meaning in the tool
#Nodes and Labels are used to define nodes in a 'Properties' type file
Labels=Label, Code
Nodes=Node
#FromNodes, ToNodes and Relationships are used to create relationships, in the obvious way
FromNodes=From, Start Node, Start, From Node, StartNode,Start Node
ToNodes=To,To Node,ToNode,End
#FromLabels and ToLabels are added to the From and To nodes in the obvious way to speed up Relationship matches
FromLabels=
ToLabels=
Relationships=Relation,Relationship
[Versions]
Versions=
FrozenLabels=Person
VersionLabels=!Current,Version
VersionPrefix=Version
VersionCounterProperty=VersionCount
VersionNumberProperty=VersionNumber
FirstVersionProperty=FirstVersion
LastVersionProperty=LastVersion
VersionRelationship=HasVersion
NextVersionRelationship=NextVersion
VersionValidFromProperty=ValidFrom
VersionValidToProperty=ValidTo
Leading and trailing spaces are ignored in all the entries in the file, so feel free to add spaces to improve readability.
For example, the provided aliases for FromNodes are "From", "Start Node", "Start", "From Node", "StartNode" and "Start Node"
Note: you can see "Start Node" is in the list twice - this is not a problem
The program will treat any column in the csv file with any of those headings as FromNode
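The alias lookup described above can be sketched with the standard library's configparser. This is an illustration of the behaviour rather than the tool's own code: the function names are hypothetical, and matching headings exactly (case sensitive) is an assumption.

```python
import configparser

# The [Aliases] entries from the layout above.
CONFIG_TEXT = """
[Aliases]
Labels=Label, Code
Nodes=Node
FromNodes=From, Start Node, Start, From Node, StartNode,Start Node
ToNodes=To,To Node,ToNode,End
Relationships=Relation,Relationship
"""

ROLES = ("Labels", "Nodes", "FromNodes", "ToNodes", "Relationships")


def load_aliases(text: str) -> dict:
    """Return {role: set of column headings} with surrounding spaces removed."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    aliases = {}
    for role in ROLES:
        raw = parser.get("Aliases", role, fallback="")
        # A set makes repeated entries such as "Start Node" harmless.
        aliases[role] = {item.strip() for item in raw.split(",") if item.strip()}
    return aliases


def role_for_heading(heading: str, aliases: dict):
    """Return the role a csv column heading maps to, or None."""
    for role, names in aliases.items():
        if heading.strip() in names:
            return role
    return None


aliases = load_aliases(CONFIG_TEXT)
start_role = role_for_heading(" Start Node ", aliases)
```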
Example Config File
[DEFAULT]
[Database]
DatabaseNames=lunappi,lunappd,local
URLs=https://demo.eigen.co/ei-applet/neo4j/query,https://demo.eigen.co/ei-applet/neo4j/query,bolt://localhost:7687
users=,,neo4j
passwords=,,neo4j4neo
[Model]
PrimaryProperty=name
RequiredLabels=MoviesDB
RequiredProperties=
PrimaryIDProperty=uuid
CreationTime=creationDate
UpdateTime=updatedDate
Mappings=
[Aliases]
Labels=
Nodes=name
FromNodes=
FromLabels=
ToNodes=
ToLabels=
Relationships=
[Versions]
Versions=
FrozenLabels=
VersionLabels=Version,!Current
VersionLabelPrefix=Version
VersionCounterProperty=versionCount
VersionNumberProperty=versionNumber
FirstVersionProperty=firstVersion
LastVersionProperty=lastVersion
VersionRelationship=has_version
NextVersionRelationship=next_version
VersionValidFromProperty=validFrom
VersionValidToProperty=validTo
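The [Database] section above holds parallel comma-separated lists, and the -d option selects one entry either by name or by position. The sketch below shows one way to read that structure with configparser. It is not the tool's own code: the function names are hypothetical, and treating an all-digit -d value as a position is an assumption.

```python
import configparser

# The [Database] section of the example config file.
CONFIG_TEXT = """
[Database]
DatabaseNames=lunappi,lunappd,local
URLs=https://demo.eigen.co/ei-applet/neo4j/query,https://demo.eigen.co/ei-applet/neo4j/query,bolt://localhost:7687
users=,,neo4j
passwords=,,neo4j4neo
"""

KEYS = ("DatabaseNames", "URLs", "users", "passwords")
FIELDS = ("name", "url", "user", "password")


def load_databases(text: str) -> list:
    """Turn the parallel lists into one dict per database."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["Database"]
    columns = [[item.strip() for item in section.get(key, "").split(",")]
               for key in KEYS]
    return [dict(zip(FIELDS, row)) for row in zip(*columns)]


def select_database(databases: list, selector: str) -> dict:
    """Resolve a -d value given as a position or as a database name."""
    if selector.isdigit():
        return databases[int(selector)]
    for database in databases:
        if database["name"] == selector:
            return database
    raise KeyError(f"no database named {selector!r}")


databases = load_databases(CONFIG_TEXT)
first = select_database(databases, "0")
local = select_database(databases, "local")
```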
Project structure
assetmodelbuilder/
├── README.md                      # Project general information
├── src/                           # All source code
│   ├── connectors                 # Folder for the database connector code
│   │   ├── boltconnector.py       # Connection for the Bolt protocol (i.e. neo4j)
│   │   ├── dbconnectors.py        # Connector manager
│   │   ├── genericconnector.py    # Super class for all connectors
│   │   └── httpsconnector.py      # Connector for https protocol
│   ├── queries                    # Folder to build and manage Cypher queries
│   │   ├── cypherbuilder.py       # Main class to build a query. Modified version of Cymple
│   │   ├── query.py               # A custom Neo4jQuery class
│   │   └── typedefs.py            # Defines the 'properties' type used by Cymple
│   ├── assetmodelbuilder.py       # Main program
│   ├── assetmodelutilities.py     # Miscellaneous utility functions
│   ├── configmanager.py           # Reads the config.ini file and manages the model configuration
│   ├── csvcontentsmanager.py      # Reads a csv file and provides the format and content
│   ├── filemanager.py             # Works out the file format of each input file and arranges them in process order
│   ├── fileprocessors.py          # Processors for each type of input file
│   ├── messages.py                # Formats and displays all messages to the user
│   └── versionmanager.py          # Create a new version of a node when it is updated
└── models/                        # Tests
    ├── default
    │   └── config.ini             # Contains default model parameters if not present in the model folder
    └── Testing                    # Test data files, such as input/output files, mocks, etc.
        ├── config.ini             # Model parameters for the Test model
        ├── DeleteAll.cypher       # Cypher to delete all the Family nodes
        ├── People.csv             # File containing some data defining People nodes
        ├── Relations.csv          # And some relationships between them
        ├── Extended Family.cypher # Some Cypher queries creating extended family members and relations
        ├── Versions.csv           # File with some updates to nodes to test Versioning
        ├── RelationVersions.csv   # File with some updates to Relations to test Versioning
        └── Rebuild.lst            # File listing all the input files. Use with -o x to process in the listed order
Coming soon
Eigen Ingenuity
- Integration of a new historian api that allows for cross-datasource requests, with more request options/methods.
- Improvements to Azure Authentication functionality, and client integration
Asset Model Builder
- Options for special processing. For example, an option to over-write existing nodes, rather than just update them
Planned
Eigen Ingenuity
- Integration with 365 Events and Power Automate
- Sphinx/Jupyter notebook with worked examples for all functionality
Asset Model Builder
- Options for special processing. For example, an option to over-write existing nodes, rather than just update them
- Options to remove nodes, relations and/or properties and labels from them
License
Apache License 2.0
Copyright 2022 Eigen Ltd.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.