
Calculating LRs for Collocated Tracks


Telcell

Telcell is a collection of scripts that can be used to determine the strength of the evidence that any pair of phones was used by the same person.

Requirements

  1. Python 3.10

Pre-run

  1. Make sure the requirements are installed
pip install -r requirements.txt

Tests

To run the tests do:

pip install -r test-requirements.txt
coverage run --branch --source telcell --module pytest --strict-markers tests/

Run

python example.py

The script example.py contains information and an example pipeline to run the library. It uses test data that is included in the repository and should produce output like:

DummyModel: [1.0, 1.0, 1.0, 1.0]
MeasurementPairClassifier: [1.0, 1.0, 1.0, 1.0]

Input data

At the moment only a CSV file can be used as input, but additional input sources can easily be added if necessary. The following columns are expected to be present in the CSV file:

    - owner
    - device
    - cellinfo.wgs84.lat
    - cellinfo.wgs84.lon
    - timestamp

Any additional columns are stored under the extra attribute of each resulting Measurement object.
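
For illustration, a minimal input file with the expected columns could look like the snippet below (the owner/device names and coordinates are made up; any real CSV with these column headers would work the same way):

```python
import csv
import io

# Hypothetical example input with the five expected columns.
CSV_TEXT = """\
owner,device,cellinfo.wgs84.lat,cellinfo.wgs84.lon,timestamp
alice,phone_a,52.3702,4.8952,2023-05-01T10:00:00+00:00
alice,phone_b,52.3703,4.8950,2023-05-01T10:00:30+00:00
"""

# Parse the CSV the same way any reader of this format would.
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
print(rows[0]["owner"])               # alice
print(rows[0]["cellinfo.wgs84.lat"])  # 52.3702
```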

After parsing, the data is stored as Track and Measurement objects. Each Track has an owner, a device and a sequence of Measurements. A Measurement consists of coords (a Point object), a timestamp (a datetime) and extra (a Mapping) for additional information.
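
The structure described above can be sketched roughly as follows. This is an illustrative sketch only, not telcell's actual class definitions; in particular, Point is simplified here to a named tuple:

```python
from collections import namedtuple
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Mapping, Sequence

# Simplified stand-in for the Point object mentioned above.
Point = namedtuple("Point", ["lat", "lon"])

@dataclass
class Measurement:
    coords: Point
    timestamp: datetime
    extra: Mapping  # any additional CSV columns end up here

@dataclass
class Track:
    owner: str
    device: str
    measurements: Sequence[Measurement] = field(default_factory=list)

# Build one measurement and attach it to a track.
m = Measurement(Point(52.3702, 4.8952),
                datetime(2023, 5, 1, 10, 0, tzinfo=timezone.utc),
                {"signal": "strong"})
track = Track("alice", "phone_a", [m])
print(track.measurements[0].coords.lat)  # 52.3702
```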

Processing

The next step is data processing, or crunching: the data is transformed into a format that can be used by the models.

Models

In addition to supporting custom models, telcell ships with a number of built-in models:

  • Dummy model: always returns an LR of 1.
  • MeasurementPairClassifier: generates colocated and dislocated training pairs from the tracks based on the time interval and the rarity of the location. Fits a Logistic Regression model and an ELUB bounded KDE calibrator. Returns LRs for the training data.
  • RarePairModel: uses coverage data (GPS locations with corresponding antennas) to fit coverage_models for each time interval bin. For the validation/case data, a registration pair is chosen from a pair of tracks such that the registrations fall within a specific time interval and the location of the second measurement is the rarest with respect to the background of track_b. For this pair the model/calibrator produces a calibrated score, which is subsequently used to calculate the LR.
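
As a sketch of the simplest case, a model like the dummy one above could look as follows. This is illustrative only; the method name predict_lr is an assumption here, not telcell's actual model interface:

```python
class DummyModel:
    """Toy model that assigns every pair of tracks an LR of 1,
    i.e. the evidence is neutral. Illustrative sketch only."""

    def predict_lr(self, track_a, track_b) -> float:
        # An LR of 1 means the observations are equally likely under
        # the same-person and different-person hypotheses.
        return 1.0

model = DummyModel()
print(model.predict_lr("track_a", "track_b"))  # 1.0
```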

Evaluation

Evaluation is done with the library lrbenchmark. A Setup object is created with the run_pipeline function, the models to be evaluated, the necessary parameters and the data itself. All different combinations will be evaluated, resulting in multiple LRs that can be used to determine whether the two phones were carried by the same person or not.
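
"All different combinations" can be pictured as enumerating every unordered pair of tracks, as in the sketch below (hypothetical identifiers, not lrbenchmark's actual API):

```python
from itertools import combinations

# Hypothetical device identifiers; in practice these would be Track objects.
tracks = ["phone_a", "phone_b", "phone_c", "phone_d"]

# Every unordered pair of distinct tracks is evaluated exactly once.
pairs = list(combinations(tracks, 2))
print(len(pairs))  # 6 pairs for 4 tracks
```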

Generating LRs

Instead of using the pipeline for evaluation, the generated LRs can also be written to file. Replace make_output_plots with write_lrs (with the required parameters) and the LRs will be written to file.
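
Such output could, for example, be a simple CSV with one LR per evaluated pair. The sketch below is illustrative only; the actual format produced by write_lrs may differ:

```python
import csv

# Hypothetical LRs for four evaluated pairs (cf. the example output above).
lrs = [1.0, 1.0, 1.0, 1.0]

with open("lrs.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["pair_index", "lr"])
    for i, lr in enumerate(lrs):
        writer.writerow([i, lr])

# Read the file back to confirm the round trip.
with open("lrs.csv") as f:
    rows = list(csv.DictReader(f))
print(rows[0]["lr"])  # 1.0
```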

Dashboards

The command to run the dashboard is PYTHONPATH=. streamlit run data_analysis/dashboard.py -- --file-name <PATH_TO_FILE>, executed from the repository root. Setting PYTHONPATH is necessary so that streamlit can find the imported files correctly. The dashboard contains two applications:

  1. Tracks and pairs, to visualize the registrations and pairs from a measurements.csv file (this can be either casework or validation measurements). The <PATH_TO_FILE> refers to the path of this measurements.csv; you can override the default path with a different one. The extra -- in the command above is necessary so that streamlit passes the arguments to the script instead of interpreting them itself.
  2. Travel speed, to visualize the travel speeds found in a measurements.csv file, that is provided via <PATH_TO_FILE>.
