perf event wrapper for python
Performance counters API for Python
A high-level abstraction API for Linux perf events with low overhead
Install from pip
sudo apt install g++ gcc swig libpfm4-dev python3-dev python3-pip
pip install performance-features
Install from source
git clone https://github.com/VitorRamos/performance_features.git
cd performance_features
sudo ./install.sh
Usage
List events
from profiler import *
print(get_supported_pmus())
print(get_supported_events())
Sampling events
from profiler import *
try:
    # Each inner list is passed as one entry of events_groups
    events = [['PERF_COUNT_HW_INSTRUCTIONS'],
              ['PERF_COUNT_HW_BRANCH_INSTRUCTIONS', 'PERF_COUNT_HW_BRANCH_MISSES'],
              ['PERF_COUNT_SW_PAGE_FAULTS']]
    # Profile the command "/bin/ls /"
    perf = Profiler(program_args=['/bin/ls', '/'], events_groups=events)
    data = perf.run(sample_period=0.01)
    print(data)
except RuntimeError as e:
    # Print the error message carried by the exception
    print(e.args[0])
How it works:
A C module creates the workload using Linux ptrace, so that we control when the application starts, and collects the event data with minimal overhead. The events are set up using the perf_event_open syscall through the perfmon library (libpfm4).
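To make the data collection step more concrete, here is a minimal sketch of decoding the buffer that the kernel returns when a group of perf events is read with only the PERF_FORMAT_GROUP read format: a 64-bit count `nr` followed by `nr` 64-bit values. This is an illustration and is not taken from the project's C module; the function name `parse_group_read` and the sample numbers are hypothetical.

```python
import struct


def parse_group_read(buf):
    """Decode a group read buffer laid out as: u64 nr, then nr u64 values.

    Assumes read_format was PERF_FORMAT_GROUP only (no time or id fields).
    """
    if len(buf) < 8:
        raise ValueError("buffer too short to contain the event count")
    # The kernel writes the values in native byte order.
    (nr,) = struct.unpack_from("=Q", buf, 0)
    expected = 8 * (1 + nr)
    if len(buf) < expected:
        raise ValueError(f"buffer too short: need {expected} bytes, got {len(buf)}")
    return list(struct.unpack_from(f"={nr}Q", buf, 8))


# Hypothetical buffer: a group of 2 events with made-up counts.
sample = struct.pack("=3Q", 2, 1000, 25)
print(parse_group_read(sample))
```

Reading a whole group in one call means the counts in that group come from the same read, which is why related events such as branch instructions and branch misses are usually placed together.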
What are the performance counters
Performance counters are special hardware registers available on most modern CPUs. These registers count the number of certain types of events, such as instructions executed, cache misses suffered, or branches mispredicted, without slowing down the kernel or applications. These registers can also trigger interrupts when a threshold number of events has passed, and can thus be used to profile the code that runs on that CPU.
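Raw counts are often combined into ratios. The sketch below computes a branch miss rate from two counts; the numbers are made up for illustration and the helper `ratio` is hypothetical, not part of this package.

```python
def ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    if denominator == 0:
        return None
    return numerator / denominator


# Hypothetical counts, not measured on any real workload.
counts = {
    "PERF_COUNT_HW_BRANCH_INSTRUCTIONS": 200_000,
    "PERF_COUNT_HW_BRANCH_MISSES": 5_000,
}

miss_rate = ratio(counts["PERF_COUNT_HW_BRANCH_MISSES"],
                  counts["PERF_COUNT_HW_BRANCH_INSTRUCTIONS"])
print(f"branch miss rate: {miss_rate:.2%}")
```

Returning None for a zero denominator keeps short sampling intervals, where no branch may have been counted, from raising an exception.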
Reading Performance counters
- Instructions
  - rdmsr: Reads the contents of a 64-bit model-specific register (MSR) specified in the ECX register into registers EDX:EAX. This instruction must be executed at privilege level 0 or in real-address mode.
  - rdpmc: Is slightly faster than the equivalent rdmsr instruction. rdpmc can also be configured to allow access to the counters from userspace, without being privileged.
- From userspace (Linux): The Linux Performance Counter subsystem provides an abstraction of these hardware capabilities. It provides per-task and per-CPU counters, counter groups, and event capabilities on top of those. It provides "virtual" 64-bit counters, regardless of the width of the underlying hardware counters. Performance counters are accessed via special file descriptors, one file descriptor per virtual counter used. The special file descriptor is opened via the perf_event_open() system call. These system calls do not use rdpmc, but rdpmc is not necessarily faster than other methods for reading event values.
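The idea of a "virtual" 64-bit counter on top of a narrower hardware register can be sketched as follows: each raw reading is compared with the previous one modulo the hardware width, and the difference is added to a 64-bit total. This is only an illustration of the principle, not the kernel's implementation; the class name `WideCounter` and the 8-bit width used in the example are hypothetical, and the sketch assumes the register wraps at most once between two readings.

```python
class WideCounter:
    """Accumulate readings of a narrow wrapping counter into a 64-bit total."""

    def __init__(self, hw_bits, initial_raw=0):
        self.mask = (1 << hw_bits) - 1          # largest raw hardware value
        self.last_raw = initial_raw & self.mask
        self.total = 0

    def update(self, raw):
        raw &= self.mask
        # Modular subtraction gives the right delta even after one wrap.
        delta = (raw - self.last_raw) & self.mask
        self.total = (self.total + delta) & 0xFFFFFFFFFFFFFFFF
        self.last_raw = raw
        return self.total


# Hypothetical 8-bit register: it goes from 250 past 255 and wraps to 4.
counter = WideCounter(hw_bits=8)
print(counter.update(250))
print(counter.update(4))
```

The second call reports 260 rather than 4, because the wrap from 250 to 4 is counted as 10 additional events.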
Source Distribution
Hashes for performance_features-0.2.4.tar.gz
Algorithm | Hash digest
---|---
SHA256 | b41377bd42e8cc1371aa9534a7288bdbde98f498d1fe987d9b8b690cbb9a58c8
MD5 | 34114727bea5770afc30675bce27b593
BLAKE2b-256 | 0979d893ab6b82980d903e75c2ed4fa6cd547705a0075c1763e030cef87c9046