A toolkit for keyword search and attack methods on user-provided datasets.

ROKSANA: Rewiring Of Keyword Search via Alteration of Network Architecture Toolkit

ROKSANA is a Python toolkit for performing keyword search and attack methods on user-provided datasets.

Features

  • Custom Datasets: Bring your own dataset using PyTorch Geometric's (PyG) dataset structure.
  • Search Methods: Choose from pre-defined keyword search methods.
  • Attack Methods: Utilize pre-defined attack methods or implement your own.
  • Result Handling: Save results to files in various formats.
  • Leaderboard Integration: Submit your results to the leaderboard.

Installation

pip install roksana

Preparing the Test Set

To evaluate the effectiveness of the search methods, you can prepare a search set consisting of query nodes and their corresponding gold sets. The gold set for each query consists of all nodes in the dataset that share the exact same feature vector as the query node.

Function: prepare_search_set

from roksana.datasets import prepare_search_set

# Assume 'data' is a torch_geometric.data.Data object
queries, gold_sets = prepare_search_set(data, percentage=0.1, seed=42)
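
The gold-set rule described above (every node whose feature vector is identical to the query's) can be illustrated without the library. The following is a minimal sketch, not ROKSANA's implementation: `build_search_set` is a hypothetical helper, features are plain Python lists rather than a `torch_geometric.data.Data` object, and it assumes the query node itself belongs to its own gold set.

```python
import random
from collections import defaultdict


def build_search_set(features, percentage=0.1, seed=42):
    """Sample query nodes and pair each one with its gold set.

    `features` is a list with one feature vector (a list) per node.
    Hypothetical helper for illustration only.
    """
    # Group node indices by their exact feature vector.
    groups = defaultdict(list)
    for node, vector in enumerate(features):
        groups[tuple(vector)].append(node)

    # Sample a fixed fraction of the nodes as queries, reproducibly.
    rng = random.Random(seed)
    num_queries = max(1, int(len(features) * percentage))
    queries = sorted(rng.sample(range(len(features)), num_queries))

    # The gold set of a query is every node with an identical feature vector.
    gold_sets = [groups[tuple(features[q])] for q in queries]
    return queries, gold_sets


features = [[1, 0, 1], [0, 1, 0], [1, 0, 1], [0, 0, 1]]
queries, gold_sets = build_search_set(features, percentage=0.5, seed=0)
for query, gold in zip(queries, gold_sets):
    print(f"query {query}: gold set {gold}")
```

Grouping by the tuple form of each vector keeps the lookup linear in the number of nodes, instead of comparing every query against every node.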

Attack Methods

ROKSANA provides a suite of attack methods for evaluating the robustness of your search algorithms. The package currently ships predefined attack methods that you can use out of the box or extend with your own implementations.

Available Attack Methods

  • random: Randomly adds or removes edges connected to the query node.
  • viking: Perturbs the feature vectors of the query node.
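
To make the idea behind the random attack concrete, here is a minimal sketch of flipping edges incident to a query node. It is an illustration under stated assumptions, not the code ROKSANA runs: `random_edge_attack` is a hypothetical function, the graph is an undirected edge set of `(min, max)` tuples, and each perturbation flips one edge (removes it if present, adds it otherwise).

```python
import random


def random_edge_attack(edges, num_nodes, query_node, perturbations=2, seed=0):
    """Flip up to `perturbations` edges incident to `query_node`.

    Returns the perturbed edge set and a log of the flips.
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    perturbed = set(edges)  # copy, so the caller's graph is left untouched
    candidates = [n for n in range(num_nodes) if n != query_node]

    # Pick distinct endpoints so that no flip undoes an earlier one.
    targets = rng.sample(candidates, min(perturbations, len(candidates)))

    log = []
    for other in targets:
        edge = (min(query_node, other), max(query_node, other))
        if edge in perturbed:
            perturbed.remove(edge)
            log.append(("removed", edge))
        else:
            perturbed.add(edge)
            log.append(("added", edge))
    return perturbed, log


edges = {(0, 1), (1, 2), (2, 3)}
attacked, log = random_edge_attack(edges, num_nodes=5, query_node=1, perturbations=2)
print(log)
```

The symmetric difference between `edges` and `attacked` has exactly one entry per perturbation, which gives a quick check that the budget was respected.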

Using Attack Methods

from roksana.datasets import load_dataset, prepare_test_set
from roksana.attack_methods import get_attack_method

# Load the Cora dataset
dataset = load_dataset(dataset_name='cora', root='data/')
data = dataset[0]

# Prepare the test set
queries, gold_sets = prepare_test_set(data, percentage=0.1, seed=123)

# Initialize an attack method ('random' is one of the methods listed above)
attack_method = get_attack_method('random', data=data, perturbations=2)

# Perform attacks on queries
for query_node in queries:
    attack_details = attack_method.attack(query_node=query_node, perturbations=2)
    print(f"Attack on Node {query_node}: {attack_details}")

Evaluation

The Evaluation module in ROKSANA provides tools to assess how attack strategies affect your search methods. Three key metrics (Hit@k, Recall@k, and Demotion Value) quantify how an attack changes the performance and reliability of your search algorithms.

Key Metrics

  1. Hit@k

    • Definition: Measures whether at least one relevant node (from the gold set) appears in the top-k retrieved nodes.
    • Interpretation: Higher values indicate better performance in retrieving relevant nodes within the top-k results.
  2. Recall@k

    • Definition: Quantifies the proportion of relevant nodes that are retrieved in the top-k results.
    • Interpretation: Higher values signify that a larger fraction of relevant nodes are captured within the top-k retrieved nodes.
  3. Demotion Value

    • Definition: Measures the change in the rank of a target node (typically the query node itself) before and after an attack.
    • Interpretation: Positive values indicate that the target node has been ranked lower post-attack, reflecting the attack's effectiveness in degrading its visibility.
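
The three definitions above translate directly into a few lines of Python. This is a sketch of the metrics as defined in this section, not the library's evaluation code: the function names are hypothetical, rankings are plain lists of node ids ordered best first, and the demotion value assumes the target node appears in both rankings.

```python
def hit_at_k(ranking, gold_set, k):
    """Return 1.0 if any gold node appears in the top-k results, else 0.0."""
    return 1.0 if set(ranking[:k]) & set(gold_set) else 0.0


def recall_at_k(ranking, gold_set, k):
    """Return the fraction of gold nodes that appear in the top-k results."""
    gold = set(gold_set)
    if not gold:
        return 0.0
    return len(set(ranking[:k]) & gold) / len(gold)


def demotion_value(ranking_before, ranking_after, target):
    """Return rank after minus rank before; positive means demoted."""
    return ranking_after.index(target) - ranking_before.index(target)


before = [7, 3, 9, 1, 4]  # ranking returned before the attack
after = [3, 9, 7, 1, 4]   # ranking returned after the attack
gold = [7, 4]

print(hit_at_k(before, gold, k=2), hit_at_k(after, gold, k=2))        # 1.0 0.0
print(recall_at_k(before, gold, k=2), recall_at_k(after, gold, k=2))  # 0.5 0.0
print(demotion_value(before, after, target=7))                        # 2
```

In this toy case the attack pushes node 7 from the first position to the third, so Hit@2 drops from 1.0 to 0.0 and the demotion value is 2.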

Using the Evaluation Module

Here's a step-by-step guide to evaluating the impact of an attack on a search method.

1. Load Dataset and Prepare Test Set

from roksana.datasets import load_dataset, prepare_test_set

# Load the Cora dataset
dataset = load_dataset(dataset_name='cora', root='data/')
data = dataset[0]

# Prepare the test set with 10% of nodes as queries
queries, gold_sets = prepare_test_set(data, percentage=0.1, seed=123)

Saving Evaluation Results

ROKSANA provides utility functions to save evaluation results in various formats, including JSON, CSV, and Pickle. These functions are located within the evaluation.utils module.

Available Functions

  • save_results_to_json(results: List[Dict[str, Any]], filepath: str) -> None
  • save_results_to_csv(results: List[Dict[str, Any]], filepath: str) -> None
  • save_results_to_pickle(results: List[Dict[str, Any]], filepath: str) -> None

Usage Example

from roksana.evaluation import save_results_to_csv, save_results_to_json, save_results_to_pickle

# Assuming 'results' is a list of dictionaries containing evaluation metrics
results = [
    {
        'query_node': 0,
        'k': 5,
        'Hit@k_before_attack': 1.0,
        'Hit@k_after_attack': 0.0,
        'Recall@k_before_attack': 0.5,
        'Recall@k_after_attack': 0.3,
        'Demotion_value': 2
    },
    # Add more results as needed
]

# Save results in different formats
save_results_to_csv(results, 'evaluation_results/results.csv')
save_results_to_json(results, 'evaluation_results/results.json')
save_results_to_pickle(results, 'evaluation_results/results.pkl')
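
For readers who want to see what such a list of result dictionaries looks like on disk, the following standard-library sketch writes the same structure to JSON and CSV. It is not the code behind ROKSANA's save functions: `write_json` and `write_csv` are hypothetical stand-ins, and creating the parent directory first is an assumption made here for convenience, since this section does not say whether the library does so.

```python
import csv
import json
import os
import tempfile


def _ensure_parent_dir(filepath):
    """Create the directory that will hold `filepath`, if it is missing."""
    parent = os.path.dirname(filepath)
    if parent:
        os.makedirs(parent, exist_ok=True)


def write_json(results, filepath):
    """Write a list of result dictionaries as a JSON array."""
    _ensure_parent_dir(filepath)
    with open(filepath, "w", encoding="utf-8") as f:
        json.dump(results, f, indent=2)


def write_csv(results, filepath):
    """Write a list of result dictionaries as CSV, one row per dictionary."""
    _ensure_parent_dir(filepath)
    # Use the union of keys, in first-seen order, as the header row.
    fieldnames = list(dict.fromkeys(key for row in results for key in row))
    with open(filepath, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(results)


results = [
    {"query_node": 0, "k": 5, "Demotion_value": 2},
    {"query_node": 1, "k": 5},  # a missing key becomes an empty CSV cell
]

with tempfile.TemporaryDirectory() as tmp:
    json_path = os.path.join(tmp, "evaluation_results", "results.json")
    csv_path = os.path.join(tmp, "evaluation_results", "results.csv")
    write_json(results, json_path)
    write_csv(results, csv_path)

    # Read both files back to confirm the round trip.
    with open(json_path, encoding="utf-8") as f:
        reloaded = json.load(f)
    with open(csv_path, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))

print(reloaded == results)
print(rows)
```

JSON preserves the numeric types of the metrics, while CSV reads everything back as strings, which matters if the saved results are reloaded for further analysis.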
