Connect to your tabular model and perform operations programmatically

PyTabular

What is it?

PyTabular (python-tabular on PyPI) is a Python package that allows for programmatic execution on your tabular models! This is possible thanks to Pythonnet and Microsoft's .NET APIs on Azure Analysis Services. Currently, this build is tested and working on Windows only; help is needed to expand it to other operating systems. The package should have the DLL files included when you import it. See the documentation for details.

PyTabular is still considered alpha while I build out the proper tests and testing environments, so I can ensure some kind of stability in features. Please send bugs my way, preferably through the issues section on GitHub. I want to harden this project so many people can use it easily. I currently run pytest locally for Python 3.6 to 3.10, and run those tests against a local AAS and Gen2 model.

Getting Started

See the PyPI project for available versions.

python3 -m pip install python-tabular

In your Python environment, import pytabular and call the main Tabular class. The only parameter needed is a valid connection string.

import pytabular
model = pytabular.Tabular(CONNECTION_STR)
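What a valid connection string looks like depends on your server. As a rough sketch, assuming an Azure Analysis Services style endpoint and the commonly used `Data Source`, `Initial Catalog`, `User ID` and `Password` keys, a small helper could assemble one. `build_connection_string` is a hypothetical helper, not part of PyTabular, and the server and database values are placeholders.

```python
def build_connection_string(server, database, user=None, password=None):
    """Join key=value pairs into a semicolon-delimited connection string."""
    parts = {"Data Source": server, "Initial Catalog": database}
    if user is not None:
        parts["User ID"] = user
    if password is not None:
        parts["Password"] = password
    return ";".join(f"{key}={value}" for key, value in parts.items())


# Placeholder values; substitute your own server and database.
CONNECTION_STR = build_connection_string(
    "asazure://<region>.asazure.windows.net/<server>",
    "<database>",
)
```

This sketch does no quoting, so values that contain semicolons would need extra handling.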

I'm a big fan of logging. If you don't want any, just get the logger and disable it.

import pytabular
pytabular.logger.disabled = True
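If you would rather keep warnings and errors than silence everything, raising the log level is a gentler option. The sketch below uses a stand-in logger from the standard `logging` module; it assumes `pytabular.logger` behaves like a regular `logging.Logger`, and the logger name "PyTabular" is an assumption.

```python
import logging

# Stand-in for `pytabular.logger`; the name "PyTabular" is an assumption.
logger = logging.getLogger("PyTabular")

# Keep warnings and errors, drop info and debug messages.
logger.setLevel(logging.WARNING)
info_enabled = logger.isEnabledFor(logging.INFO)
warning_enabled = logger.isEnabledFor(logging.WARNING)

# The switch shown above: silence the logger completely.
logger.disabled = True
```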

You can query your models with the Query method of your Tabular class. DAX queries need the full DAX syntax (see the EVALUATE example below), and the result is returned as a pandas DataFrame. To return a single value, wrap your query in curly brackets: the method takes that single-cell table and returns just the individual value. You can also query your DMVs; see below for an example. See the PyTabular docs for Query.

#Run basic queries
DAX_QUERY = "EVALUATE TOPN(100, 'Table1')"
model.Query(DAX_QUERY) #returns pd.DataFrame()

#or...
DMV_QUERY = "select * from $SYSTEM.DISCOVER_TRACE_EVENT_CATEGORIES"
model.Query(DMV_QUERY) #returns pd.DataFrame()

#or...
SINGLE_VALUE_QUERY_EX = "EVALUATE {1}"
model.Query(SINGLE_VALUE_QUERY_EX) #returns 1

#or...
FILE_PATH = 'C:\\FILEPATHEXAMPLE\\file.dax' #or file.txt
model.Query(FILE_PATH) #Will return same logic as above, single values if possible else will return pd.DataFrame()
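The file path option can be pictured as a small dispatch step before the query runs. This is only a sketch of the idea, assuming the check is a plain `os.path.isfile()` call; `resolve_query` is a hypothetical name and not the library's own code.

```python
import os
import tempfile


def resolve_query(query_or_path):
    """Return the file contents if given an existing file, else the text itself."""
    if os.path.isfile(query_or_path):
        with open(query_or_path, encoding="utf-8") as handle:
            return handle.read()
    return query_or_path


# Demonstrate with a temporary .dax file.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "example.dax")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("EVALUATE {1}")
    from_file = resolve_query(path)

# Anything that is not an existing file is treated as the query text.
from_text = resolve_query("EVALUATE TOPN(100, 'Table1')")
```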

You can also explore your tables, partitions, and columns via the attributes of your Tabular class.

#Explore tables...
dir(model.Tables['Table Name'])

#Explore columns & partitions
dir(model.Tables['Table Name'].Partitions['Partition Name'])

#Only a few features right now, but check out the built in methods.
model.Tables['Table Name'].Refresh(Tracing = True)
#or
model.Tables['Table Name'].Partitions['Partition Name'].Refresh(Tracing = True)
#or
model.Tables['Table Name'].Partitions['Partition Name'].Last_Refresh()
#or
model.Tables['Table Name'].Row_Count()
#or
model.Tables['Table Name'].Columns['Column Name'].Distinct_Count()
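`dir()` also lists every dunder attribute, which can bury the useful names. A tiny filter helps when exploring; `public_members` is a hypothetical helper, and `FakeTable` is a stand-in so the sketch runs without a model.

```python
def public_members(obj):
    """List attribute names that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith("_")]


class FakeTable:
    """Stand-in for a table object; a real one exposes far more."""

    Name = "Table Name"

    def Row_Count(self):
        return 0


members = public_members(FakeTable())
```

With a live model, the same call would be `public_members(model.Tables['Table Name'])`.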

Use the Refresh method to handle refreshes on your model. It is synchronous and should be flexible enough to handle a variety of inputs. See the PyTabular docs for refreshing tables and partitions. The most basic way to refresh is to pass the table name as a string. The method will search for the table and output an exception if it is unable to find it. For partitions you will need a key-value pair, for example {'Table1':'Partition1'}. You can also map a key to a list to iterate through a group of partitions, for example {'Table1':['Partition1','Partition2']}. Rather than providing a string, you can also pass the actual class. See below for those examples. You can access the classes from the built-in attributes self.Tables and self.Partitions, or explore the .NET classes yourself in self.Model.Tables.

#You have a few options when refreshing. 
model.Refresh('Table Name')

#or...
model.Refresh(['Table1','Table2','Table3'])

#or...
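model.Refresh(model.Tables['Table Name']) #pass the table class itself

#or...
model.Refresh(model.Tables['Table Name'].Partitions['Partition Name']) #pass the partition class itself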

#or...
model.Refresh({'Table Name':'Partition Name'})

#or any kind of weird combination like
model.Refresh([{<Table Class>:<Partition Class>,'Table Name':['Partition1','Partition2']},'Table Name','Table Name2'])

#You can even run through the Tables & Partition Attributes
model.Tables['Table Name'].Refresh()

#or
model.Tables['Table Name'].Partitions['Partition Name'].Refresh()

#Default Tracing happens automatically, but can be removed by -- 
model.Refresh(['Table1','Table2'], trace = None)
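To make the accepted input shapes concrete, here is a sketch that flattens the name-based forms (string, list, and dict) into (table, partition) pairs. It illustrates the input grammar only and is not how PyTabular resolves inputs internally; `normalize_refresh_spec` is a hypothetical helper and it ignores the class-based inputs.

```python
def normalize_refresh_spec(spec):
    """Flatten a refresh spec into (table, partition) pairs.

    A partition of None means "refresh the whole table".
    """
    if isinstance(spec, str):
        return [(spec, None)]
    if isinstance(spec, dict):
        pairs = []
        for table, partitions in spec.items():
            if isinstance(partitions, str):
                partitions = [partitions]
            pairs.extend((table, partition) for partition in partitions)
        return pairs
    if isinstance(spec, (list, tuple)):
        pairs = []
        for item in spec:
            pairs.extend(normalize_refresh_spec(item))
        return pairs
    raise TypeError(f"Unsupported refresh spec: {spec!r}")


plan = normalize_refresh_spec(
    [{"Table1": ["Partition1", "Partition2"]}, "Table2", {"Table3": "Partition1"}]
)
```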

It's not uncommon to need to run some checks on specific tables, partitions, columns, etc.

#Get Row Count from model
model.Tables['Table Name'].Row_Count()

#Get Last Refresh time from a table
model.Tables['Table Name'].Last_Refresh()

#Get Distinct Count or Values from a Column
model.Tables['Table Name'].Columns['Column Name'].Distinct_Count()
#or
model.Tables['Table Name'].Columns['Column Name'].Values()
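One way to turn these values into an actual check is a staleness test on the last refresh time. The sketch below assumes `Last_Refresh()` gives you a `datetime`; `is_stale` is a hypothetical helper and the timestamps are made up for illustration.

```python
from datetime import datetime, timedelta


def is_stale(last_refresh, max_age, now=None):
    """Return True when the last refresh is older than the allowed age."""
    now = now or datetime.now()
    return now - last_refresh > max_age


# Made-up timestamps for illustration.
reference = datetime(2023, 1, 2, 12, 0)
stale = is_stale(datetime(2023, 1, 1, 6, 0), timedelta(hours=24), now=reference)
fresh = is_stale(datetime(2023, 1, 2, 6, 0), timedelta(hours=24), now=reference)
```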

Use Cases

If blank table, then refresh table.

This will use the function Return_Zero_Row_Tables and the method Refresh from the Tabular class.

import pytabular
model = pytabular.Tabular(CONNECTION_STR)
tables = pytabular.Return_Zero_Row_Tables(model)
if len(tables) > 0:
    model.Refresh(tables, Tracing = True) #Add a trace in there for some fun.
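Conceptually, the blank-table check boils down to filtering on row counts. A minimal sketch, with made-up counts standing in for what `Row_Count()` would return per table; `zero_row_tables` is hypothetical and not the library's function.

```python
def zero_row_tables(row_counts):
    """Return the names of tables whose row count is zero."""
    return [name for name, count in row_counts.items() if count == 0]


# Made-up row counts for illustration.
row_counts = {"Sales": 1200, "Customers": 0, "Dates": 0}
empty = zero_row_tables(row_counts)
```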

Sneak in a refresh.

This will use the method Is_Process and the method Refresh from the Tabular class. It checks the DMV to see whether any currently running jobs are classified as processing.

import pytabular
model = pytabular.Tabular(CONNECTION_STR)
if model.Is_Process():
    pass #do what you want if there is a refresh happening
else:
    model.Refresh(TABLES_OR_PARTITIONS_TO_REFRESH)
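If you would rather wait for the running job to finish than skip the refresh, a polling loop is one option. This is a sketch under the assumption that the probe is a zero-argument callable returning True while processing is under way; `wait_until_idle` is a hypothetical helper, and the fake probe below stands in for a live model.

```python
import time


def wait_until_idle(is_processing, timeout=60.0, interval=5.0, sleep=time.sleep):
    """Poll is_processing() until it returns False or the timeout elapses.

    Returns True if the model became idle, False if the wait timed out.
    """
    waited = 0.0
    while is_processing():
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval
    return True


# Fake probe: busy for the first two polls, then idle. No real sleeping.
states = iter([True, True, False])
idle = wait_until_idle(lambda: next(states), sleep=lambda seconds: None)
```

With a real model this could be called as `wait_until_idle(model.Is_Process)` before calling `model.Refresh(...)`.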

Show refresh times in model.

This will use the function Table_Last_Refresh_Times and the method Create_Table from the Tabular class. It searches through the model for all tables and partitions and pulls the 'RefreshedTime' property from each. The results are returned as a pandas DataFrame, which is then converted into an M expression used for a new table.

import pytabular
model = pytabular.Tabular(CONNECTION_STR)
df = pytabular.Table_Last_Refresh_Times(model, group_partition = False)
model.Create_Table(df, 'Refresh Times')

If BPA Violation, then revert deployment.

This uses a few things: first the BPA class, then the TE2 class, and finally the Analyze_BPA method. I did not want to reinvent the wheel, given the amazing work done with Tabular Editor and its BPA capabilities.

import pytabular
model = pytabular.Tabular(CONNECTION_STR)
TE2 = pytabular.Tabular_Editor() #Feel free to input your TE2 file path, or this will download it for you.
BPA = pytabular.BPA() #Feel free to input your own BPA file, or this will download it for you from: https://raw.githubusercontent.com/microsoft/Analysis-Services/master/BestPracticeRules/BPARules.json
results = model.Analyze_BPA(TE2.EXE,BPA.Location)

if len(results) > 0:
    pass #Revert deployment here!

Backup & Revert a Table.

USE WITH CAUTION, and obviously not in PROD. I have been experimenting with this concept, which I made for selfish reasons. It will probably get removed and kept in my own local version, but it is fun to work with. It uses two methods: Backup_Table and Revert_Table.

import pytabular
model = pytabular.Tabular(CONNECTION_STR)
model.Backup_Table('TableName') #This will back up the table with its surrounding items (columns,measures,relationships,roles,hierarchies,etc.) and will add a suffix of '_backup'
#-----------#
#Make any changes to your original table and then revert or delete backup as necessary
#-----------#
model.Revert_Table('TableName') #This will essentially replace your original with _backup

Loop through and query DAX files

Let's say you have multiple DAX queries you would like to store and run through as checks. The Query method on the Tabular class can also take file paths. It can really be any file type, as it's just checking os.path.isfile(), but I would suggest .dax or .txt. It will read the file and use its contents as the new Query_str argument.

import pytabular
model = pytabular.Tabular(CONNECTION_STR)
LIST_OF_FILE_PATHS = ['C:\\FilePath\\file1.dax','C:\\FilePath\\file1.txt','C:\\FilePath\\file2.dax','C:\\FilePath\\file2.txt']
for file_path in LIST_OF_FILE_PATHS:
    model.Query(file_path)
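Instead of maintaining the list by hand, you could collect the query files from a folder. A minimal sketch using `pathlib`; `find_query_files` is a hypothetical helper, and the temporary folder only exists to make the example runnable.

```python
import tempfile
from pathlib import Path


def find_query_files(folder, suffixes=(".dax", ".txt")):
    """Return sorted paths of files in folder that have a query-file suffix."""
    return sorted(
        str(path)
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in suffixes
    )


# Demonstrate with a temporary folder holding two query files and one other file.
with tempfile.TemporaryDirectory() as folder:
    for name in ("b.dax", "a.txt", "notes.md"):
        Path(folder, name).write_text("EVALUATE {1}", encoding="utf-8")
    found = [Path(path).name for path in find_query_files(folder)]
```

The returned list can be passed straight into the loop above in place of LIST_OF_FILE_PATHS.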

Advanced Refreshing with Pre and Post Checks

Maybe you are introducing new logic to a fact table, and you need to ensure that a measure checking last month's values never changes. To do that, you can take advantage of the Refresh_Check and Refresh_Check_Collection classes (sorry, I know the documentation stinks right now). Using those, you can build something that first checks the result of the measure, then refreshes, then checks the result of the measure again after the refresh, and lastly performs your desired check. In this case, the check is that the pre value matches the post value. If your pre value does not equal your post value after refreshing, the check fails and gives an assertion error in your logging.

from pytabular import Tabular
from pytabular.refresh import Refresh_Check, Refresh_Check_Collection

model = Tabular(CONNECTION_STR)

# This is our custom check that we want to run after refresh.
# Does the pre refresh value match the post refresh value.
def sum_of_sales_assertion(pre, post):
    return pre == post

# This is where we put it all together into the `Refresh_Check` class. Give it a name, give it a query to run, and give it the assertion you want to make.
sum_of_last_month_sales = Refresh_Check(
    'Last Month Sales',
    lambda: model.Query("EVALUATE {[Last Month Sales]}"),
    sum_of_sales_assertion
)

# Here we are adding it to a `Refresh_Check_Collection` because you can have more than one `Refresh_Check` to run.
all_refresh_check = Refresh_Check_Collection([sum_of_last_month_sales])

model.Refresh(
    'Fact Table Name',
    refresh_checks = all_refresh_check
)
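Strict equality can be too tight for floating point measures. Since the assertion is just a function of (pre, post) that returns a boolean, a tolerance-based variant is easy to sketch. `within_tolerance` is a hypothetical helper, and it assumes the query returns a numeric value.

```python
import math


def within_tolerance(rel_tol=0.001):
    """Build a (pre, post) assertion that allows a small relative drift."""

    def assertion(pre, post):
        return math.isclose(pre, post, rel_tol=rel_tol)

    return assertion


# Allow up to 1% drift between the pre and post refresh values.
check = within_tolerance(rel_tol=0.01)
unchanged = check(1000.0, 1005.0)
changed = check(1000.0, 1100.0)
```

`check` could then be passed to Refresh_Check in place of sum_of_sales_assertion.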

Contributing

See CONTRIBUTING.md

Download files

Source Distribution

python_tabular-0.1.9.2.tar.gz (2.3 MB)

Uploaded Source

Built Distribution

python_tabular-0.1.9.2-py3-none-any.whl (2.3 MB)

Uploaded Python 3
