pygip.models.defense package

Subpackages

Submodules

pygip.models.defense.BackdoorWM module

class pygip.models.defense.BackdoorWM.BackdoorWM(dataset, attack_node_fraction, model_path=None, trigger_rate=0.01, l=20, target_label=0)[source]

Bases: BaseDefense

_abc_impl = <_abc_data object>
_load_model()[source]

Load a pre-trained model.

defend()[source]

Execute the backdoor watermark defense.

evaluate_model(model, features, labels)[source]

Evaluate model performance

inject_backdoor_trigger(data, trigger_rate=None, trigger_feat_val=0.99, l=None, target_label=None)[source]

Feature-based Trigger Injection
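
The signature suggests that a fraction trigger_rate of the nodes receives a trigger in which l feature dimensions are set to trigger_feat_val, and that those nodes are relabelled to target_label. A minimal plain-Python sketch of that idea follows. The helper name inject_feature_trigger, the list-based data layout, the seed argument, and the random choice of nodes and dimensions are assumptions for illustration, not the library's implementation.

```python
import random


def inject_feature_trigger(features, labels, trigger_rate=0.01, l=20,
                           trigger_feat_val=0.99, target_label=0, seed=0):
    """Return poisoned copies of features/labels plus the trigger node ids.

    Hypothetical helper: `features` is a list of per-node feature lists and
    `labels` is a list of ints. Nodes and dimensions are chosen at random.
    """
    rng = random.Random(seed)
    num_nodes = len(features)
    num_feats = len(features[0])
    # Poison at least one node so that tiny graphs still get a trigger.
    num_trigger = max(1, int(num_nodes * trigger_rate))
    trigger_nodes = sorted(rng.sample(range(num_nodes), num_trigger))
    # The same l feature dimensions form the trigger pattern on every node.
    trigger_dims = rng.sample(range(num_feats), min(l, num_feats))

    poisoned_features = [row[:] for row in features]  # leave the input intact
    poisoned_labels = labels[:]
    for node in trigger_nodes:
        for dim in trigger_dims:
            poisoned_features[node][dim] = trigger_feat_val
        poisoned_labels[node] = target_label
    return poisoned_features, poisoned_labels, trigger_nodes


features = [[0.0] * 30 for _ in range(200)]
labels = [1] * 200
new_features, new_labels, trigger_nodes = inject_feature_trigger(features, labels)
```

With 200 nodes and trigger_rate=0.01, two nodes are poisoned; the original features and labels lists are left unmodified.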

supported_api_types = {'dgl'}
train_target_model()[source]

Train the target model with backdoor injection.

verify_backdoor(model, trigger_nodes, target_label)[source]

Verify backdoor attack success rate

pygip.models.defense.ImperceptibleWM module

class pygip.models.defense.ImperceptibleWM.ImperceptibleWM(dataset, attack_node_fraction=0.3, model_path=None)[source]

Bases: BaseDefense

_abc_impl = <_abc_data object>
_load_model()[source]

Load pre-trained model.

_train_defense_model()[source]

This is an optional method.

_train_surrogate_model()[source]

This is an optional method.

_train_target_model()[source]

This is an optional method.

defend()[source]

Execute the defense mechanism.

supported_api_types = {'pyg'}
class pygip.models.defense.ImperceptibleWM.TriggerGenerator(*args: Any, **kwargs: Any)[source]

Bases: Module

forward(x, edge_index)[source]
pygip.models.defense.ImperceptibleWM.bi_level_optimization(target_model, generator, data, epochs=100, inner_steps=5)[source]
pygip.models.defense.ImperceptibleWM.calculate_metrics(model, data)[source]
pygip.models.defense.ImperceptibleWM.generate_trigger_graph(data, generator, target_model, num_triggers=50)[source]

pygip.models.defense.ImperceptibleWM2 module

class pygip.models.defense.ImperceptibleWM2.ImperceptibleWM2(dataset, attack_node_fraction=0.2, wm_node=50, target_label=None, N=50, M=5, epsilon1=1.0, epsilon2=0.5, epsilon3=1.0, owner_id=None, beta=0.001, T_acc=0.8)[source]

Bases: BaseDefense

_abc_impl = <_abc_data object>
_calculate_embedding_loss(f_theta, backdoor_graph)[source]

Calculate embedding loss for model training on backdoor graph. Computes cross-entropy loss for training the main model on the combined clean and trigger graph data.

Parameters:
  • f_theta (torch.nn.Module) – Main classification model

  • backdoor_graph (dgl.DGLGraph) – Combined graph with embedded watermark

Returns:

Embedding loss value

Return type:

torch.Tensor

_calculate_generation_loss_integrated(f_theta_s, f_g, V_p)[source]

Calculate integrated generation loss combining all generator objectives. Combines imperception, regulation, and trigger losses with respective weights to optimize the trigger generator network.

Parameters:
  • f_theta_s (torch.nn.Module) – Current state of the main model

  • f_g (torch.nn.Module) – Trigger generator network

  • V_p (torch.Tensor) – Poisoning node indices

Returns:

Combined generation loss

Return type:

torch.Tensor

_calculate_imperception_loss(trigger_features, V_p)[source]

Calculate imperception loss to make watermark features similar to clean features. Measures cosine similarity between trigger features and poisoning node features to ensure the watermark remains hidden.

Parameters:
  • trigger_features (torch.Tensor) – Generated trigger node features

  • V_p (torch.Tensor) – Poisoning node indices

Returns:

Imperception loss value

Return type:

torch.Tensor
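
One way to turn the cosine similarity described above into a loss is to penalise dissimilarity, so that the loss shrinks as trigger features align with the features of the poisoning nodes. The sketch below uses plain Python lists instead of tensors; the form 1 - mean cosine similarity and the helper names are assumptions, since the exact formula is not stated here.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen in this sketch for zero vectors
    return dot / (norm_u * norm_v)


def imperception_loss(trigger_features, clean_features, poison_nodes):
    """Assumed form: 1 - mean cosine similarity over all (trigger, poison) pairs."""
    sims = [cosine_similarity(t, clean_features[p])
            for t in trigger_features
            for p in poison_nodes]
    return 1.0 - sum(sims) / len(sims)


clean = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
aligned = imperception_loss([[2.0, 0.0]], clean, poison_nodes=[0])
orthogonal = imperception_loss([[0.0, 3.0]], clean, poison_nodes=[0])
```

A trigger feature that points in the same direction as the clean feature gives a loss near 0, and an orthogonal one gives a loss near 1.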

_calculate_regulation_loss(trigger_features)[source]

Calculate regulation loss based on owner ID signature. Enforces the trigger features to embed owner identification information using cross-entropy loss with the owner ID as target.

Parameters:

trigger_features (torch.Tensor) – Generated trigger node features

Returns:

Regulation loss value

Return type:

torch.Tensor

_calculate_trigger_loss(f_theta, trigger_features, trigger_graph)[source]

Calculate trigger loss for watermark effectiveness. Measures how well the model classifies trigger nodes to the target label, ensuring the watermark functions correctly.

Parameters:
  • f_theta (torch.nn.Module) – Main classification model

  • trigger_features (torch.Tensor) – Generated trigger node features

  • trigger_graph (dgl.DGLGraph) – Trigger graph structure

Returns:

Trigger loss value

Return type:

torch.Tensor

_construct_backdoor_graph(clean_graph, trigger_graph, V_p)[source]

Construct a backdoor graph by combining clean graph with trigger graph. Merges the original graph with the watermark trigger graph by adding connections between poisoning nodes and trigger nodes.

Parameters:
  • clean_graph (dgl.DGLGraph) – Original clean graph

  • trigger_graph (dgl.DGLGraph) – Generated trigger/watermark graph

  • V_p (torch.Tensor) – Poisoning node indices for connection

Returns:

Combined backdoor graph with embedded watermark

Return type:

dgl.DGLGraph

_create_temp_trigger_graph(trigger_features, edge_probs)[source]

Create a temporary trigger graph for loss calculation. Constructs a temporary graph structure using generated features and edge probabilities for intermediate computations.

Parameters:
  • trigger_features (torch.Tensor) – Generated trigger node features

  • edge_probs (torch.Tensor) – Edge existence probabilities

Returns:

Temporary trigger graph

Return type:

dgl.DGLGraph

_evaluate_model_on_graph(model, graph)[source]

Evaluate model performance on a specific graph. Computes classification metrics for the given model on the provided graph, handling different model architectures appropriately.

Parameters:
  • model (torch.nn.Module) – Model to evaluate

  • graph (dgl.DGLGraph) – Graph data for evaluation

Returns:

(accuracy, precision, recall, f1_score) metrics

Return type:

tuple of float

_evaluate_with_metrics(model, dataloader)[source]

Evaluate model performance using multiple classification metrics.

Parameters:
  • model (torch.nn.Module) – The neural network model to evaluate

  • dataloader (torch.utils.data.DataLoader) – DataLoader containing evaluation data

Returns:

(accuracy, precision, recall, f1_score) metrics

Return type:

tuple of float
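
For reference, the four metrics in the returned tuple can be computed from label lists as sketched below. Macro averaging over the observed labels is an assumption; the averaging mode used by the library is not stated here, and the function name is hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall, f1), macro-averaged over labels."""
    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


acc, prec, rec, f1 = classification_metrics([0, 0, 1, 1], [0, 1, 1, 1])
```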

_generate_trigger_graph(f_g, V_p)[source]

Generate a watermark trigger graph using the generator network. Creates trigger features and edges based on poisoning nodes and constructs a DGL graph for watermark embedding.

Parameters:
  • f_g (torch.nn.Module) – Trigger generator network

  • V_p (torch.Tensor) – Selected poisoning node indices

Returns:

The generated watermark trigger graph with features and labels

Return type:

dgl.DGLGraph

_inner_optimization(f_theta, f_g, V_p, optimizer)[source]

Execute the watermark embedding phase of bilevel optimization. Performs M iterations of model training on the backdoor graph to embed the watermark into the model parameters.

Parameters:
  • f_theta (torch.nn.Module) – Main classification model

  • f_g (torch.nn.Module) – Trigger generator network

  • V_p (torch.Tensor) – Poisoning node indices

  • optimizer (torch.optim.Optimizer) – Optimizer for model parameters

Returns:

Updated model with embedded watermark

Return type:

torch.nn.Module

_select_poisoning_nodes(clean_model)[source]

Select nodes for watermark poisoning based on model predictions. Uses the clean model’s confidence scores to identify high-confidence nodes across different labels for creating the watermark trigger.

Parameters:

clean_model (torch.nn.Module) – Pre-trained clean model used for node selection

Returns:

Tensor of selected node indices for poisoning

Return type:

torch.Tensor

_train_defense_model()[source]

Train the defense model with watermark embedding using bilevel optimization. Implements the complete bilevel optimization process alternating between watermark embedding and trigger generation phases.

Returns:

(trained_defense_model, trigger_generator)

Return type:

tuple
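
The alternating structure can be illustrated on a toy problem with two scalars standing in for the model parameters (theta) and the generator parameters (g). The sketch assumes that N counts outer rounds and M counts inner steps per round, matching the constructor defaults N=50 and M=5; the quadratic objectives, the learning rate, and the function name are invented for illustration and are unrelated to the real losses.

```python
def alternating_bilevel(N=50, M=5, lr=0.1, target=3.0):
    """Toy alternating optimisation over two scalars, theta and g."""
    theta, g = 0.0, 0.0
    for _ in range(N):
        # Embedding phase: M gradient steps on theta for the inner
        # objective (theta - g)^2, with the generator parameter g frozen.
        for _ in range(M):
            theta -= lr * 2.0 * (theta - g)
        # Generation phase: one gradient step on g for the outer objective
        # (g - target)^2 + 0.5 * (theta - g)^2, with theta frozen.
        g -= lr * (2.0 * (g - target) + (g - theta))
    return theta, g


theta, g = alternating_bilevel()
```

Both scalars approach the toy target of 3.0: each round the inner phase pulls theta toward the current g, and the outer phase then moves g toward the target while staying close to theta.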

_train_target_model()[source]

Train the target model on clean graph data. Creates and trains a GraphSAGE model on the original dataset without any watermark or defense mechanisms.

Returns:

Trained target model

Return type:

torch.nn.Module

defend()[source]

Execute the complete watermark defense strategy. Trains target model, applies watermark defense, and verifies ownership. Returns comprehensive evaluation metrics and ownership verification results.

Returns:

Dictionary containing attack metrics, defense metrics, ownership verification status, and trained generator

Return type:

dict

verify_ownership(suspicious_model)[source]

Verify ownership of a suspicious model using the watermark. Tests if the suspicious model correctly classifies the watermark trigger graph to determine if it contains the embedded watermark.

Parameters:

suspicious_model (torch.nn.Module) – Model to test for ownership verification

Returns:

(is_owner: bool, ownership_accuracy: float)

Return type:

tuple
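
The returned pair can be understood as an accuracy on the trigger nodes plus a thresholded decision. The sketch below assumes that the constructor argument T_acc (default 0.8) is the accuracy threshold and that the comparison is inclusive; both are assumptions, and the helper operates on plain label lists rather than on a model and a graph.

```python
def verify_ownership_sketch(predicted_labels, trigger_labels, T_acc=0.8):
    """Return (is_owner, ownership_accuracy) for predictions on trigger nodes."""
    if len(predicted_labels) != len(trigger_labels):
        raise ValueError("predictions and trigger labels must have equal length")
    correct = sum(1 for p, t in zip(predicted_labels, trigger_labels) if p == t)
    accuracy = correct / len(trigger_labels)
    return accuracy >= T_acc, accuracy


# Hypothetical predictions of a suspicious model on ten trigger nodes that
# were all watermarked with target label 3.
is_owner, ownership_accuracy = verify_ownership_sketch(
    [3, 3, 3, 3, 3, 3, 3, 3, 3, 1], [3] * 10)
```

Here nine of ten trigger nodes are classified as expected, so the accuracy of 0.9 clears the assumed threshold and is_owner is True.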

class pygip.models.defense.ImperceptibleWM2.TriggerGenerator(*args: Any, **kwargs: Any)[source]

Bases: Module

Generate watermark trigger features and edge probabilities using a GCN-based architecture.

This module constructs a small graph template and applies multiple GCN layers to produce node features that represent the watermark trigger. It also learns a function to generate edge probabilities between nodes using a neural edge generator.

Parameters:
  • feature_dim (int) – Dimension of node feature vectors.

  • hidden_dim (int, optional) – Dimension of hidden layers in GCN and edge generator. Default is 64.

  • output_nodes (int, optional) – Number of nodes in the generated trigger graph. Default is 50.

_create_template_graph()[source]

Create a small template DGL graph structure to serve as the base for GCN processing.

This function builds a fully connected undirected graph (with self-loops) consisting of up to 10 nodes. This graph serves as a structural template for generating watermark trigger node features.

Returns:

A small connected DGL graph with self-loops, moved to the appropriate device.

Return type:

dgl.DGLGraph

forward(clean_features, selected_nodes)[source]

Forward pass to generate trigger node features and edge probabilities.

Constructs a trigger graph by first computing a prototype feature from selected clean nodes, propagating it through GCN layers, and generating additional nodes and edge probabilities to match the required trigger size.

Parameters:
  • clean_features (torch.Tensor) – Feature matrix from the clean graph (shape: [num_nodes, feature_dim]).

  • selected_nodes (list[int] or torch.Tensor) – Indices of nodes selected for constructing the prototype vector.

Returns:

  • trigger_features (torch.Tensor) – Feature matrix of generated trigger nodes (shape: [output_nodes, feature_dim]).

  • edge_probs (torch.Tensor) – A 1D tensor containing probabilities for edges between node pairs (upper triangular, shape: [output_nodes * (output_nodes - 1) / 2]).
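
A flat vector of output_nodes * (output_nodes - 1) / 2 probabilities can be mapped back to node pairs by enumerating the upper triangle. The sketch below assumes row-major order, (0, 1), (0, 2), ..., which is not confirmed by this page; the helper name and the threshold option are also hypothetical.

```python
import random
from itertools import combinations


def edges_from_probs(edge_probs, output_nodes, threshold=None, seed=0):
    """Turn flat upper-triangular edge probabilities into a list of (i, j) edges.

    With a threshold, keep pairs whose probability is at least the threshold;
    otherwise sample each edge independently with its own probability.
    """
    pairs = list(combinations(range(output_nodes), 2))  # all i < j, row-major
    if len(edge_probs) != len(pairs):
        raise ValueError(f"expected {len(pairs)} probabilities, got {len(edge_probs)}")
    rng = random.Random(seed)
    edges = []
    for (i, j), p in zip(pairs, edge_probs):
        keep = p >= threshold if threshold is not None else rng.random() < p
        if keep:
            edges.append((i, j))
    return edges


probs = [0.9, 0.1, 0.8, 0.2, 0.7, 0.3]  # 4 * 3 / 2 = 6 entries for 4 nodes
edges = edges_from_probs(probs, output_nodes=4, threshold=0.5)
```

For the four-node example, thresholding at 0.5 keeps the pairs (0, 1), (0, 3), and (1, 3).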

pygip.models.defense.RandomWM module

class pygip.models.defense.RandomWM.RandomWM(dataset, attack_node_fraction=0.2, wm_node=50, pr=0.2, pg=0.2, attack_name=None)[source]

Bases: BaseDefense

A flexible defense implementation using watermarking to protect against model extraction attacks on graph neural networks.

This class combines the functionalities from the original watermark.py:
  • Generating watermark graphs

  • Training models on original and watermark graphs

  • Merging graphs for testing

  • Evaluating effectiveness against attacks

  • Dynamic selection of attack methods

_abc_impl = <_abc_data object>
_evaluate_attack_on_watermark(attack_model)[source]

Evaluate how well the attack model performs on the watermark graph.

Parameters:

attack_model (torch.nn.Module) – The model obtained from the attack

Returns:

Attack model’s accuracy on the watermark graph

Return type:

float

_evaluate_watermark(model)[source]

Evaluate watermark detection effectiveness.

Parameters:

model (torch.nn.Module) – The model to evaluate

Returns:

Watermark detection accuracy

Return type:

float

_generate_watermark_graph()[source]

Generate a watermark graph using the Erdős-Rényi random graph model.

Returns:

The generated watermark graph

Return type:

dgl.DGLGraph
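
In the Erdős-Rényi G(n, p) model, each of the n * (n - 1) / 2 possible undirected edges is included independently with probability p. A minimal standard-library sketch follows; it returns an edge list rather than a dgl.DGLGraph, and it does not assume which constructor argument (pr or pg) supplies the edge probability, so edge_prob is the sketch's own parameter.

```python
import random
from itertools import combinations


def erdos_renyi_edges(num_nodes=50, edge_prob=0.2, seed=0):
    """Sample the undirected edge list of a G(n, p) random graph."""
    rng = random.Random(seed)
    # Visit every unordered node pair once and keep it with probability p.
    return [(i, j) for i, j in combinations(range(num_nodes), 2)
            if rng.random() < edge_prob]


wm_edges = erdos_renyi_edges()
```

The resulting pairs could then be handed to a graph library to build the watermark graph and attach node features and labels.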

_get_attack_class(attack_name)[source]

Dynamically import and return the specified attack class.

Parameters:

attack_name (str) – Name of the attack class to import

Returns:

The requested attack class

Return type:

class
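
Dynamic class lookup of this kind is commonly written with importlib. The sketch below shows the general pattern only: the helper name load_class is hypothetical, and because the module path that holds the attack classes is not given on this page, a standard-library module stands in for it.

```python
import importlib


def load_class(module_path, class_name):
    """Import module_path and return the class named class_name from it."""
    module = importlib.import_module(module_path)
    try:
        cls = getattr(module, class_name)
    except AttributeError as exc:
        raise ValueError(f"{class_name!r} not found in {module_path!r}") from exc
    if not isinstance(cls, type):
        raise TypeError(f"{module_path}.{class_name} is not a class")
    return cls


# Stand-in lookup against the standard library.
counter_cls = load_class("collections", "Counter")
```

Raising a ValueError for an unknown name gives the caller a clearer message than the bare AttributeError would.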

_test_on_watermark(model, wm_dataloader)[source]

Test a model’s accuracy on the watermark graph.

Parameters:
  • model (torch.nn.Module) – The model to test

  • wm_dataloader (DataLoader) – DataLoader for the watermark graph

Returns:

Accuracy on the watermark graph

Return type:

float

_train_defense_model()[source]

Helper function for training a defense model with watermarking.

Returns:

The trained defense model with embedded watermark

Return type:

torch.nn.Module

_train_target_model()[source]

Helper function for training the target model on the original graph.

Returns:

The trained target model

Return type:

torch.nn.Module

defend(attack_name=None)[source]

Main defense workflow:
  1. Train a target model on the original graph

  2. Attack the target model to establish baseline vulnerability

  3. Train a defense model with watermarking

  4. Test the defense model against the same attack

  5. Print performance metrics

Parameters:

attack_name (str, optional) – Name of the attack class to use, overrides the one set in __init__

Returns:

Dictionary containing performance metrics

Return type:

dict

supported_api_types = {'dgl'}

pygip.models.defense.SurviveWM module

class pygip.models.defense.SurviveWM.SurviveWM(dataset, attack_node_fraction, model_path=None)[source]

Bases: BaseDefense

_abc_impl = <_abc_data object>
_load_model()[source]

Load a pre-trained model.

_to_cpu(tensor)[source]

Safely move tensor to CPU for NumPy operations

combine_with_trigger(base_graph, base_features, base_labels, trigger_data)[source]
compute_metrics(y_true, y_pred, y_score=None)[source]
defend()[source]

Execute the defense mechanism.

generate_key_graph(num_nodes=10, edge_prob=0.3)[source]
snn_loss(x, y, T=0.5)[source]
supported_api_types = {'dgl'}
train_with_snnl(model, graph, features, labels, train_mask, optimizer, T=0.5, alpha=0.1)[source]
verify_watermark(model, trigger_graph, trigger_labels)

pygip.models.defense.SurviveWM2 module

class pygip.models.defense.SurviveWM2.KeyInputOptimizer(training_dataset, key_inputs, T_opt=20)[source]

Bases: object

optimize()[source]
class pygip.models.defense.SurviveWM2.SAGEModel(input_dim, hidden_dim=64, num_classes=10, num_layers=3, dropout=0.1)[source]

Bases: Module

forward(x, edge_index, batch, return_embedding=False)[source]
class pygip.models.defense.SurviveWM2.SNNLLoss(temperature=1.0)[source]

Bases: Module

forward(embeddings, labels)[source]
Return type:

Tensor
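
SNNL in these names most likely refers to the soft nearest neighbor loss, which measures how entangled the classes are in embedding space. Under that assumption, a plain-Python sketch of the usual formulation is given below; the squared Euclidean distance, the eps stabiliser, and the averaging over points are choices of this sketch, not confirmed details of SNNLLoss.

```python
import math


def soft_nearest_neighbor_loss(embeddings, labels, temperature=1.0, eps=1e-12):
    """Mean over i of -log(same-class neighbour mass / all-neighbour mass)."""
    n = len(embeddings)
    total = 0.0
    for i in range(n):
        same_mass, all_mass = 0.0, 0.0
        for j in range(n):
            if i == j:
                continue  # a point is not its own neighbour
            dist2 = sum((a - b) ** 2 for a, b in zip(embeddings[i], embeddings[j]))
            weight = math.exp(-dist2 / temperature)
            all_mass += weight
            if labels[i] == labels[j]:
                same_mass += weight
        # eps keeps the logarithm finite when a point has no close same-class neighbour.
        total += -math.log((same_mass + eps) / (all_mass + eps))
    return total / n


points = [[0.0, 0.0], [0.0, 0.1], [5.0, 5.0], [5.0, 5.1]]
separated = soft_nearest_neighbor_loss(points, [0, 0, 1, 1])
entangled = soft_nearest_neighbor_loss(points, [0, 1, 0, 1])
```

The loss is close to zero when each point's nearest neighbours share its label, and grows when nearby points carry different labels.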

class pygip.models.defense.SurviveWM2.SurviveWM2(dataset, attack_node_fraction, model_path=None, alpha=0.1, num_layers=4, clean_epochs=200, wm_epochs=200, **kwargs)[source]

Bases: BaseDefense

_abc_impl = <_abc_data object>
_load_model()[source]

Load pre-trained model.

_train_defense_model(clean_model=None)[source]

This is an optional method.

_train_surrogate_model()[source]

This is an optional method.

_train_target_model()[source]

This is an optional method.

defend()[source]

Main defense workflow:
  1. Train a target model (clean)

  2. (Optional) Simulate an attack on the target model (if implemented)

  3. Train the defense (watermarked) model

  4. Evaluate the defense and print detailed metrics

Returns:

Dictionary containing performance metrics

Return type:

dict

class pygip.models.defense.SurviveWM2.WatermarkGenerator(training_dataset, num_watermark_samples=None)[source]

Bases: object

_get_avg_num_nodes()[source]
Return type:

int

_get_num_classes()[source]
Return type:

int

algorithm_1_key_input_topology_generation(N_t, N=None)[source]
Return type:

Data

generate_watermark_set_with_clean_model(clean_model)[source]
Return type:

List[Tuple[Data, int]]

pygip.models.defense.SurviveWM2.evaluate_clean_accuracy(model, test_data, batch_size=32)[source]
Return type:

float

pygip.models.defense.SurviveWM2.evaluate_watermark_effectiveness(model, key_inputs)[source]
Return type:

float

pygip.models.defense.SurviveWM2.train_clean_model(training_data, epochs=200, batch_size=32, num_layers=3)[source]
Return type:

SAGEModel

pygip.models.defense.SurviveWM2.train_watermarked_model_full(training_data, key_inputs, epochs=300, alpha=0.1, num_layers=4, hidden_dim=160, dropout=0.05, lr=0.001, snnl_temperature=1.0)[source]

pygip.models.defense.base module

class pygip.models.defense.base.BaseDefense(dataset, attack_node_fraction, device=None)[source]

Bases: ABC

_abc_impl = <_abc_data object>
_check_dataset_compatibility()[source]
_load_model()[source]

Load pre-trained model.

_train_defense_model()[source]

This is an optional method.

_train_surrogate_model()[source]

This is an optional method.

_train_target_model()[source]

This is an optional method.

abstract defend()[source]

Execute the defense mechanism.

supported_api_types = {}
supported_datasets = {}
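
The listing above implies a simple contract: a subclass declares the backends it supports and implements defend(). The sketch below mirrors that contract with a stand-in base class so that it runs without the library installed. The class names, the api_type attribute on the dataset, and the body of the compatibility check are assumptions; the real BaseDefense may behave differently.

```python
from abc import ABC, abstractmethod


class BaseDefenseSketch(ABC):
    """Stand-in for BaseDefense; not the library class."""

    supported_api_types = set()

    def __init__(self, dataset, attack_node_fraction):
        self.dataset = dataset
        self.attack_node_fraction = attack_node_fraction
        self._check_dataset_compatibility()

    def _check_dataset_compatibility(self):
        # Assumed behaviour: reject datasets built on an unsupported backend.
        api_type = getattr(self.dataset, "api_type", None)
        if self.supported_api_types and api_type not in self.supported_api_types:
            raise ValueError(f"unsupported API type: {api_type!r}")

    @abstractmethod
    def defend(self):
        """Execute the defense mechanism."""


class NoOpDefense(BaseDefenseSketch):
    supported_api_types = {"dgl"}

    def defend(self):
        return {"defended": True}


class FakeDataset:
    api_type = "dgl"  # hypothetical attribute naming the graph backend


result = NoOpDefense(FakeDataset(), attack_node_fraction=0.1).defend()
```

Because defend() is abstract, instantiating the base class directly raises a TypeError, which forces every concrete defense to provide its own workflow.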

Module contents

class pygip.models.defense.ATOM(dataset, attack_node_fraction=0)[source]

Bases: BaseDefense

_abc_impl = <_abc_data object>
_load_data_and_model(dataset, batch_size=16, seed=0, lamb=0)[source]
defend()[source]

Execute the defense mechanism.

supported_api_types = {'pyg'}
supported_datasets = {'CiteSeer', 'Cora', 'PubMed'}
class pygip.models.defense.BackdoorWM(dataset, attack_node_fraction, model_path=None, trigger_rate=0.01, l=20, target_label=0)[source]

Bases: BaseDefense

_abc_impl = <_abc_data object>
_load_model()[source]

Load a pre-trained model.

defend()[source]

Execute the backdoor watermark defense.

evaluate_model(model, features, labels)[source]

Evaluate model performance

inject_backdoor_trigger(data, trigger_rate=None, trigger_feat_val=0.99, l=None, target_label=None)[source]

Feature-based Trigger Injection

supported_api_types = {'dgl'}
train_target_model()[source]

Train the target model with backdoor injection.

verify_backdoor(model, trigger_nodes, target_label)[source]

Verify backdoor attack success rate

class pygip.models.defense.ImperceptibleWM(dataset, attack_node_fraction=0.3, model_path=None)[source]

Bases: BaseDefense

_abc_impl = <_abc_data object>
_load_model()[source]

Load pre-trained model.

_train_defense_model()[source]

This is an optional method.

_train_surrogate_model()[source]

This is an optional method.

_train_target_model()[source]

This is an optional method.

defend()[source]

Execute the defense mechanism.

supported_api_types = {'pyg'}
class pygip.models.defense.ImperceptibleWM2(dataset, attack_node_fraction=0.2, wm_node=50, target_label=None, N=50, M=5, epsilon1=1.0, epsilon2=0.5, epsilon3=1.0, owner_id=None, beta=0.001, T_acc=0.8)[source]

Bases: BaseDefense

_abc_impl = <_abc_data object>
_calculate_embedding_loss(f_theta, backdoor_graph)[source]

Calculate embedding loss for model training on backdoor graph. Computes cross-entropy loss for training the main model on the combined clean and trigger graph data.

Parameters:
  • f_theta (torch.nn.Module) – Main classification model

  • backdoor_graph (dgl.DGLGraph) – Combined graph with embedded watermark

Returns:

Embedding loss value

Return type:

torch.Tensor

_calculate_generation_loss_integrated(f_theta_s, f_g, V_p)[source]

Calculate integrated generation loss combining all generator objectives. Combines imperception, regulation, and trigger losses with respective weights to optimize the trigger generator network.

Parameters:
  • f_theta_s (torch.nn.Module) – Current state of the main model

  • f_g (torch.nn.Module) – Trigger generator network

  • V_p (torch.Tensor) – Poisoning node indices

Returns:

Combined generation loss

Return type:

torch.Tensor

_calculate_imperception_loss(trigger_features, V_p)[source]

Calculate imperception loss to make watermark features similar to clean features. Measures cosine similarity between trigger features and poisoning node features to ensure the watermark remains hidden.

Parameters:
  • trigger_features (torch.Tensor) – Generated trigger node features

  • V_p (torch.Tensor) – Poisoning node indices

Returns:

Imperception loss value

Return type:

torch.Tensor

_calculate_regulation_loss(trigger_features)[source]

Calculate regulation loss based on owner ID signature. Enforces the trigger features to embed owner identification information using cross-entropy loss with the owner ID as target.

Parameters:

trigger_features (torch.Tensor) – Generated trigger node features

Returns:

Regulation loss value

Return type:

torch.Tensor

_calculate_trigger_loss(f_theta, trigger_features, trigger_graph)[source]

Calculate trigger loss for watermark effectiveness. Measures how well the model classifies trigger nodes to the target label, ensuring the watermark functions correctly.

Parameters:
  • f_theta (torch.nn.Module) – Main classification model

  • trigger_features (torch.Tensor) – Generated trigger node features

  • trigger_graph (dgl.DGLGraph) – Trigger graph structure

Returns:

Trigger loss value

Return type:

torch.Tensor

_construct_backdoor_graph(clean_graph, trigger_graph, V_p)[source]

Construct a backdoor graph by combining clean graph with trigger graph. Merges the original graph with the watermark trigger graph by adding connections between poisoning nodes and trigger nodes.

Parameters:
  • clean_graph (dgl.DGLGraph) – Original clean graph

  • trigger_graph (dgl.DGLGraph) – Generated trigger/watermark graph

  • V_p (torch.Tensor) – Poisoning node indices for connection

Returns:

Combined backdoor graph with embedded watermark

Return type:

dgl.DGLGraph

_create_temp_trigger_graph(trigger_features, edge_probs)[source]

Create a temporary trigger graph for loss calculation. Constructs a temporary graph structure using generated features and edge probabilities for intermediate computations.

Parameters:
  • trigger_features (torch.Tensor) – Generated trigger node features

  • edge_probs (torch.Tensor) – Edge existence probabilities

Returns:

Temporary trigger graph

Return type:

dgl.DGLGraph

_evaluate_model_on_graph(model, graph)[source]

Evaluate model performance on a specific graph. Computes classification metrics for the given model on the provided graph, handling different model architectures appropriately.

Parameters:
  • model (torch.nn.Module) – Model to evaluate

  • graph (dgl.DGLGraph) – Graph data for evaluation

Returns:

(accuracy, precision, recall, f1_score) metrics

Return type:

tuple of float

_evaluate_with_metrics(model, dataloader)[source]

Evaluate model performance using multiple classification metrics.

Parameters:
  • model (torch.nn.Module) – The neural network model to evaluate

  • dataloader (torch.utils.data.DataLoader) – DataLoader containing evaluation data

Returns:

(accuracy, precision, recall, f1_score) metrics

Return type:

tuple of float

_generate_trigger_graph(f_g, V_p)[source]

Generate a watermark trigger graph using the generator network. Creates trigger features and edges based on poisoning nodes and constructs a DGL graph for watermark embedding.

Parameters:
  • f_g (torch.nn.Module) – Trigger generator network

  • V_p (torch.Tensor) – Selected poisoning node indices

Returns:

The generated watermark trigger graph with features and labels

Return type:

dgl.DGLGraph

_inner_optimization(f_theta, f_g, V_p, optimizer)[source]

Execute the watermark embedding phase of bilevel optimization. Performs M iterations of model training on the backdoor graph to embed the watermark into the model parameters.

Parameters:
  • f_theta (torch.nn.Module) – Main classification model

  • f_g (torch.nn.Module) – Trigger generator network

  • V_p (torch.Tensor) – Poisoning node indices

  • optimizer (torch.optim.Optimizer) – Optimizer for model parameters

Returns:

Updated model with embedded watermark

Return type:

torch.nn.Module

_select_poisoning_nodes(clean_model)[source]

Select nodes for watermark poisoning based on model predictions. Uses the clean model’s confidence scores to identify high-confidence nodes across different labels for creating the watermark trigger.

Parameters:

clean_model (torch.nn.Module) – Pre-trained clean model used for node selection

Returns:

Tensor of selected node indices for poisoning

Return type:

torch.Tensor

_train_defense_model()[source]

Train the defense model with watermark embedding using bilevel optimization. Implements the complete bilevel optimization process alternating between watermark embedding and trigger generation phases.

Returns:

(trained_defense_model, trigger_generator)

Return type:

tuple

_train_target_model()[source]

Train the target model on clean graph data. Creates and trains a GraphSAGE model on the original dataset without any watermark or defense mechanisms.

Returns:

Trained target model

Return type:

torch.nn.Module

defend()[source]

Execute the complete watermark defense strategy. Trains target model, applies watermark defense, and verifies ownership. Returns comprehensive evaluation metrics and ownership verification results.

Returns:

Dictionary containing attack metrics, defense metrics, ownership verification status, and trained generator

Return type:

dict

verify_ownership(suspicious_model)[source]

Verify ownership of a suspicious model using the watermark. Tests if the suspicious model correctly classifies the watermark trigger graph to determine if it contains the embedded watermark.

Parameters:

suspicious_model (torch.nn.Module) – Model to test for ownership verification

Returns:

(is_owner: bool, ownership_accuracy: float)

Return type:

tuple

class pygip.models.defense.RandomWM(dataset, attack_node_fraction=0.2, wm_node=50, pr=0.2, pg=0.2, attack_name=None)[source]

Bases: BaseDefense

A flexible defense implementation using watermarking to protect against model extraction attacks on graph neural networks.

This class combines the functionalities from the original watermark.py:
  • Generating watermark graphs

  • Training models on original and watermark graphs

  • Merging graphs for testing

  • Evaluating effectiveness against attacks

  • Dynamic selection of attack methods

_abc_impl = <_abc_data object>
_evaluate_attack_on_watermark(attack_model)[source]

Evaluate how well the attack model performs on the watermark graph.

Parameters:

attack_model (torch.nn.Module) – The model obtained from the attack

Returns:

Attack model’s accuracy on the watermark graph

Return type:

float

_evaluate_watermark(model)[source]

Evaluate watermark detection effectiveness.

Parameters:

model (torch.nn.Module) – The model to evaluate

Returns:

Watermark detection accuracy

Return type:

float

_generate_watermark_graph()[source]

Generate a watermark graph using the Erdős-Rényi random graph model.

Returns:

The generated watermark graph

Return type:

dgl.DGLGraph

_get_attack_class(attack_name)[source]

Dynamically import and return the specified attack class.

Parameters:

attack_name (str) – Name of the attack class to import

Returns:

The requested attack class

Return type:

class

_test_on_watermark(model, wm_dataloader)[source]

Test a model’s accuracy on the watermark graph.

Parameters:
  • model (torch.nn.Module) – The model to test

  • wm_dataloader (DataLoader) – DataLoader for the watermark graph

Returns:

Accuracy on the watermark graph

Return type:

float

_train_defense_model()[source]

Helper function for training a defense model with watermarking.

Returns:

The trained defense model with embedded watermark

Return type:

torch.nn.Module

_train_target_model()[source]

Helper function for training the target model on the original graph.

Returns:

The trained target model

Return type:

torch.nn.Module

defend(attack_name=None)[source]

Main defense workflow:
  1. Train a target model on the original graph

  2. Attack the target model to establish baseline vulnerability

  3. Train a defense model with watermarking

  4. Test the defense model against the same attack

  5. Print performance metrics

Parameters:

attack_name (str, optional) – Name of the attack class to use, overrides the one set in __init__

Returns:

Dictionary containing performance metrics

Return type:

dict

supported_api_types = {'dgl'}
class pygip.models.defense.SurviveWM(dataset, attack_node_fraction, model_path=None)[source]

Bases: BaseDefense

_abc_impl = <_abc_data object>
_load_model()[source]

Load a pre-trained model.

_to_cpu(tensor)[source]

Safely move tensor to CPU for NumPy operations

combine_with_trigger(base_graph, base_features, base_labels, trigger_data)[source]
compute_metrics(y_true, y_pred, y_score=None)[source]
defend()[source]

Execute the defense mechanism.

generate_key_graph(num_nodes=10, edge_prob=0.3)[source]
snn_loss(x, y, T=0.5)[source]
supported_api_types = {'dgl'}
train_with_snnl(model, graph, features, labels, train_mask, optimizer, T=0.5, alpha=0.1)[source]
verify_watermark(model, trigger_graph, trigger_labels)
class pygip.models.defense.SurviveWM2(dataset, attack_node_fraction, model_path=None, alpha=0.1, num_layers=4, clean_epochs=200, wm_epochs=200, **kwargs)[source]

Bases: BaseDefense

_abc_impl = <_abc_data object>
_load_model()[source]

Load pre-trained model.

_train_defense_model(clean_model=None)[source]

This is an optional method.

_train_surrogate_model()[source]

This is an optional method.

_train_target_model()[source]

This is an optional method.

defend()[source]

Main defense workflow:
  1. Train a target model (clean)

  2. (Optional) Simulate an attack on the target model (if implemented)

  3. Train the defense (watermarked) model

  4. Evaluate the defense and print detailed metrics

Returns:

Dictionary containing performance metrics

Return type:

dict