pygip.models.defense package¶
Subpackages¶
- pygip.models.defense.atom package
- Submodules
- pygip.models.defense.atom.ATOM module
ATOM
FusionGRU
GCN
Memory
PPOAgent
PolicyNetwork
SequencesDataset
StateTransformMLP
TargetGCN
average_pooling_with_neighbors()
average_pooling_with_neighbors_batch()
build_loaders()
collate_fn_no_pad()
compute_embedding_batch()
compute_returns_and_advantages()
custom_reward_function()
get_node_embedding()
get_one_hop_neighbors()
k_core_decomposition()
load_data_and_model()
precompute_all_node_embeddings()
precompute_simple_embeddings()
preprocess_sequences()
set_seed()
simple_embedding_batch()
split_and_adjust()
test_model()
train_gcn()
- Module contents
Submodules¶
pygip.models.defense.BackdoorWM module¶
- class pygip.models.defense.BackdoorWM.BackdoorWM(dataset, attack_node_fraction, model_path=None, trigger_rate=0.01, l=20, target_label=0)[source]¶
Bases:
BaseDefense
- _abc_impl = <_abc_data object>¶
- inject_backdoor_trigger(data, trigger_rate=None, trigger_feat_val=0.99, l=None, target_label=None)[source]¶
Feature-based Trigger Injection
- supported_api_types = {'dgl'}¶
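The idea behind feature-based trigger injection can be illustrated with a minimal, dependency-free sketch. It is not the BackdoorWM implementation: the helper name `inject_feature_trigger`, the use of plain Python lists instead of graph tensors, the `seed` argument, and the choice to share one set of `l` feature positions across all trigger nodes are assumptions made for illustration. Only the parameter names `trigger_rate`, `l`, `trigger_feat_val`, and `target_label` come from the signature above.

```python
import random


def inject_feature_trigger(features, labels, trigger_rate=0.01, l=20,
                           trigger_feat_val=0.99, target_label=0, seed=0):
    """Return poisoned copies of features/labels and the chosen node ids.

    Hypothetical helper: a fraction `trigger_rate` of nodes gets `l` feature
    positions overwritten with `trigger_feat_val` and is relabeled.
    """
    rng = random.Random(seed)
    num_nodes = len(features)
    num_feats = len(features[0])
    num_trigger = max(1, int(trigger_rate * num_nodes))
    trigger_nodes = sorted(rng.sample(range(num_nodes), num_trigger))
    # Assumption: the same feature positions form the trigger on every node.
    feat_ids = rng.sample(range(num_feats), min(l, num_feats))

    new_features = [row[:] for row in features]  # leave the inputs untouched
    new_labels = labels[:]
    for node in trigger_nodes:
        for feat in feat_ids:
            new_features[node][feat] = trigger_feat_val
        new_labels[node] = target_label
    return new_features, new_labels, trigger_nodes


features = [[0.0] * 30 for _ in range(200)]
labels = [1] * 200
poisoned_x, poisoned_y, nodes = inject_feature_trigger(features, labels)
```

A model trained on the poisoned copy should learn to map the trigger pattern to `target_label`, which is what later makes the watermark detectable.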
pygip.models.defense.ImperceptibleWM module¶
- class pygip.models.defense.ImperceptibleWM.ImperceptibleWM(dataset, attack_node_fraction=0.3, model_path=None)[source]¶
Bases:
BaseDefense
- _abc_impl = <_abc_data object>¶
- supported_api_types = {'pyg'}¶
- class pygip.models.defense.ImperceptibleWM.TriggerGenerator(*args: Any, **kwargs: Any)[source]¶
Bases:
Module
pygip.models.defense.ImperceptibleWM2 module¶
- class pygip.models.defense.ImperceptibleWM2.ImperceptibleWM2(dataset, attack_node_fraction=0.2, wm_node=50, target_label=None, N=50, M=5, epsilon1=1.0, epsilon2=0.5, epsilon3=1.0, owner_id=None, beta=0.001, T_acc=0.8)[source]¶
Bases:
BaseDefense
- _abc_impl = <_abc_data object>¶
- _calculate_embedding_loss(f_theta, backdoor_graph)[source]¶
Calculate embedding loss for model training on backdoor graph. Computes cross-entropy loss for training the main model on the combined clean and trigger graph data.
- Parameters:
f_theta (torch.nn.Module) – Main classification model
backdoor_graph (dgl.DGLGraph) – Combined graph with embedded watermark
- Returns:
Embedding loss value
- Return type:
torch.Tensor
- _calculate_generation_loss_integrated(f_theta_s, f_g, V_p)[source]¶
Calculate integrated generation loss combining all generator objectives. Combines imperception, regulation, and trigger losses with respective weights to optimize the trigger generator network.
- Parameters:
f_theta_s (torch.nn.Module) – Current state of the main model
f_g (torch.nn.Module) – Trigger generator network
V_p (torch.Tensor) – Poisoning node indices
- Returns:
Combined generation loss
- Return type:
torch.Tensor
- _calculate_imperception_loss(trigger_features, V_p)[source]¶
Calculate imperception loss to make watermark features similar to clean features. Measures cosine similarity between trigger features and poisoning node features to ensure the watermark remains hidden.
- Parameters:
trigger_features (torch.Tensor) – Generated trigger node features
V_p (torch.Tensor) – Poisoning node indices
- Returns:
Imperception loss value
- Return type:
torch.Tensor
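A cosine-similarity-based loss of this kind can be sketched in plain Python. This is an illustration under stated assumptions, not the library's code: the function names are hypothetical, features are plain lists rather than tensors, each trigger row is compared with the mean feature vector of the poisoning nodes, and the loss is taken as one minus the average similarity.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def imperception_loss(trigger_features, clean_features, poison_ids):
    """Hypothetical loss: 1 - mean cosine similarity to the poisoning nodes' mean feature."""
    dim = len(clean_features[0])
    mean_clean = [
        sum(clean_features[i][d] for i in poison_ids) / len(poison_ids)
        for d in range(dim)
    ]
    sims = [cosine_similarity(row, mean_clean) for row in trigger_features]
    return 1.0 - sum(sims) / len(sims)


clean = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
aligned = imperception_loss([[2.0, 0.0]], clean, [0, 1])
orthogonal = imperception_loss([[0.0, 3.0]], clean, [0, 1])
```

Under this formulation the loss is smallest when trigger features point in the same direction as the clean features they are meant to imitate.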
- _calculate_regulation_loss(trigger_features)[source]¶
Calculate regulation loss based on owner ID signature. Enforces the trigger features to embed owner identification information using cross-entropy loss with the owner ID as target.
- Parameters:
trigger_features (torch.Tensor) – Generated trigger node features
- Returns:
Regulation loss value
- Return type:
torch.Tensor
- _calculate_trigger_loss(f_theta, trigger_features, trigger_graph)[source]¶
Calculate trigger loss for watermark effectiveness. Measures how well the model classifies trigger nodes to the target label, ensuring the watermark functions correctly.
- Parameters:
f_theta (torch.nn.Module) – Main classification model
trigger_features (torch.Tensor) – Generated trigger node features
trigger_graph (dgl.DGLGraph) – Trigger graph structure
- Returns:
Trigger loss value
- Return type:
torch.Tensor
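To make the notion of "classifying trigger nodes to the target label" concrete, here is a small sketch of a cross-entropy computed against a single target label. It assumes the trigger loss is a mean cross-entropy over trigger-node logits; the function name and the list-based inputs are hypothetical, and the real method works on model outputs for a DGL graph.

```python
import math


def cross_entropy_to_target(logits, target_label):
    """Mean cross-entropy of each logit row against one shared target label."""
    losses = []
    for row in logits:
        peak = max(row)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in row))
        losses.append(log_sum_exp - row[target_label])
    return sum(losses) / len(losses)


uniform_loss = cross_entropy_to_target([[0.0, 0.0]], target_label=0)
confident_loss = cross_entropy_to_target([[10.0, -10.0]], target_label=0)
wrong_loss = cross_entropy_to_target([[10.0, -10.0]], target_label=1)
```

A low value means the model already assigns the trigger nodes to the target label; a high value means the watermark has not taken hold.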
- _construct_backdoor_graph(clean_graph, trigger_graph, V_p)[source]¶
Construct a backdoor graph by combining clean graph with trigger graph. Merges the original graph with the watermark trigger graph by adding connections between poisoning nodes and trigger nodes.
- Parameters:
clean_graph (dgl.DGLGraph) – Original clean graph
trigger_graph (dgl.DGLGraph) – Generated trigger/watermark graph
V_p (torch.Tensor) – Poisoning node indices for connection
- Returns:
Combined backdoor graph with embedded watermark
- Return type:
dgl.DGLGraph
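The merging step can be pictured with edge lists. The sketch below is only an illustration: it uses integer node ids and Python tuples instead of DGL graphs, the helper name `merge_graphs` is hypothetical, and the rule for attaching poisoning nodes to trigger nodes (round-robin, with edges in both directions) is an assumption, since the documentation does not state which pairs are connected.

```python
def merge_graphs(num_clean, clean_edges, num_trigger, trigger_edges, poison_ids):
    """Combine two graphs into one node-id space and link poisoning nodes to trigger nodes."""
    offset = num_clean  # trigger node k becomes node num_clean + k
    edges = list(clean_edges)
    edges += [(u + offset, v + offset) for u, v in trigger_edges]
    for k, poison_node in enumerate(poison_ids):
        trigger_node = offset + (k % num_trigger)  # assumed round-robin attachment
        edges.append((poison_node, trigger_node))
        edges.append((trigger_node, poison_node))
    return num_clean + num_trigger, edges


total_nodes, merged_edges = merge_graphs(
    num_clean=3, clean_edges=[(0, 1), (1, 2)],
    num_trigger=2, trigger_edges=[(0, 1)],
    poison_ids=[0, 2],
)
```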
- _create_temp_trigger_graph(trigger_features, edge_probs)[source]¶
Create a temporary trigger graph for loss calculation. Constructs a temporary graph structure using generated features and edge probabilities for intermediate computations.
- Parameters:
trigger_features (torch.Tensor) – Generated trigger node features
edge_probs (torch.Tensor) – Edge existence probabilities
- Returns:
Temporary trigger graph
- Return type:
dgl.DGLGraph
- _evaluate_model_on_graph(model, graph)[source]¶
Evaluate model performance on a specific graph. Computes classification metrics for the given model on the provided graph, handling different model architectures appropriately.
- Parameters:
model (torch.nn.Module) – Model to evaluate
graph (dgl.DGLGraph) – Graph data for evaluation
- Returns:
(accuracy, precision, recall, f1_score) metrics
- Return type:
tuple of float
- _evaluate_with_metrics(model, dataloader)[source]¶
Evaluate model performance using multiple classification metrics.
- Parameters:
model (torch.nn.Module) – The neural network model to evaluate
dataloader (torch.utils.data.DataLoader) – DataLoader containing evaluation data
- Returns:
(accuracy, precision, recall, f1_score) metrics
- Return type:
tuple of float
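The four returned metrics can be computed as in the following sketch. The averaging mode is not stated in the documentation, so macro averaging over the observed classes is assumed here, and the function name `classification_metrics` is hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall, f1), macro-averaged over observed classes."""
    pairs = list(zip(y_true, y_pred))
    classes = sorted(set(y_true) | set(y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(classes)
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


acc, prec, rec, f1 = classification_metrics([0, 0, 1, 1], [0, 1, 1, 1])
```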
- _generate_trigger_graph(f_g, V_p)[source]¶
Generate a watermark trigger graph using the generator network. Creates trigger features and edges based on poisoning nodes and constructs a DGL graph for watermark embedding.
- Parameters:
f_g (torch.nn.Module) – Trigger generator network
V_p (torch.Tensor) – Selected poisoning node indices
- Returns:
The generated watermark trigger graph with features and labels
- Return type:
dgl.DGLGraph
- _inner_optimization(f_theta, f_g, V_p, optimizer)[source]¶
Execute the watermark embedding phase of bilevel optimization. Performs M iterations of model training on the backdoor graph to embed the watermark into the model parameters.
- Parameters:
f_theta (torch.nn.Module) – Main classification model
f_g (torch.nn.Module) – Trigger generator network
V_p (torch.Tensor) – Poisoning node indices
optimizer (torch.optim.Optimizer) – Optimizer for model parameters
- Returns:
Updated model with embedded watermark
- Return type:
torch.nn.Module
- _select_poisoning_nodes(clean_model)[source]¶
Select nodes for watermark poisoning based on model predictions. Uses the clean model’s confidence scores to identify high-confidence nodes across different labels for creating the watermark trigger.
- Parameters:
clean_model (torch.nn.Module) – Pre-trained clean model used for node selection
- Returns:
Tensor of selected node indices for poisoning
- Return type:
torch.Tensor
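One way to pick high-confidence nodes across labels is to rank nodes within each predicted class and keep the top few. The sketch below assumes exactly that; the function name, the `per_label` argument, and the list-of-probabilities input are hypothetical and stand in for the clean model's softmax output.

```python
def select_high_confidence_nodes(probs, per_label=2):
    """Pick the `per_label` most confident nodes for every predicted label."""
    by_label = {}
    for node, row in enumerate(probs):
        label = max(range(len(row)), key=row.__getitem__)  # predicted class
        by_label.setdefault(label, []).append((row[label], node))

    selected = []
    for label in sorted(by_label):
        # Highest confidence first; ties broken by smaller node id.
        ranked = sorted(by_label[label], key=lambda item: (-item[0], item[1]))
        selected.extend(node for _, node in ranked[:per_label])
    return sorted(selected)


probs = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8], [0.55, 0.45], [0.3, 0.7]]
top_one = select_high_confidence_nodes(probs, per_label=1)
top_two = select_high_confidence_nodes(probs, per_label=2)
```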
- _train_defense_model()[source]¶
Train the defense model with watermark embedding using bilevel optimization. Implements the complete bilevel optimization process alternating between watermark embedding and trigger generation phases.
- Returns:
(trained_defense_model, trigger_generator)
- Return type:
tuple
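The alternating structure of bilevel optimization can be sketched as a nested loop. This is a skeleton only: it assumes the constructor argument `N` is the number of outer rounds and `M` the number of inner (embedding) steps per round, and the callback names `inner_step` and `outer_step` are hypothetical placeholders for the model update and the generator update.

```python
def bilevel_optimize(inner_step, outer_step, N=50, M=5):
    """Alternate M inner (model) updates with one outer (generator) update, N times."""
    for _ in range(N):
        for _ in range(M):
            inner_step()   # watermark embedding: update the classifier
        outer_step()       # trigger generation: update the generator


calls = {"inner": 0, "outer": 0}
bilevel_optimize(
    inner_step=lambda: calls.__setitem__("inner", calls["inner"] + 1),
    outer_step=lambda: calls.__setitem__("outer", calls["outer"] + 1),
    N=3,
    M=2,
)
```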
- _train_target_model()[source]¶
Train the target model on clean graph data. Creates and trains a GraphSAGE model on the original dataset without any watermark or defense mechanisms.
- Returns:
Trained target model
- Return type:
torch.nn.Module
- defend()[source]¶
Execute the complete watermark defense strategy. Trains target model, applies watermark defense, and verifies ownership. Returns comprehensive evaluation metrics and ownership verification results.
- Returns:
Dictionary containing attack metrics, defense metrics, ownership verification status, and trained generator
- Return type:
dict
- verify_ownership(suspicious_model)[source]¶
Verify ownership of a suspicious model using the watermark. Tests if the suspicious model correctly classifies the watermark trigger graph to determine if it contains the embedded watermark.
- Parameters:
suspicious_model (torch.nn.Module) – Model to test for ownership verification
- Returns:
(is_owner: bool, ownership_accuracy: float)
- Return type:
tuple
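The decision rule behind this kind of check can be sketched as a thresholded accuracy. The comparison against the constructor's `T_acc` value (default 0.8) and the use of `>=` are assumptions; the function below is a standalone illustration that takes predicted labels directly instead of a model.

```python
def verify_ownership_from_predictions(predictions, target_label, t_acc=0.8):
    """Return (is_owner, accuracy) for predictions made on watermark trigger nodes."""
    hits = sum(1 for p in predictions if p == target_label)
    accuracy = hits / len(predictions)
    return accuracy >= t_acc, accuracy


watermarked = verify_ownership_from_predictions([1] * 9 + [0], target_label=1)
independent = verify_ownership_from_predictions([1, 0, 0, 0], target_label=1)
```

A model that never saw the watermark should rarely map trigger nodes to the target label, so its accuracy is expected to fall below the threshold.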
- class pygip.models.defense.ImperceptibleWM2.TriggerGenerator(*args: Any, **kwargs: Any)[source]¶
Bases:
Module
Generate watermark trigger features and edge probabilities using a GCN-based architecture.
This module constructs a small graph template and applies multiple GCN layers to produce node features that represent the watermark trigger. It also learns a function to generate edge probabilities between nodes using a neural edge generator.
- Parameters:
feature_dim (int) – Dimension of node feature vectors.
hidden_dim (int, optional) – Dimension of hidden layers in GCN and edge generator. Default is 64.
output_nodes (int, optional) – Number of nodes in the generated trigger graph. Default is 50.
- _create_template_graph()[source]¶
Create a small template DGL graph structure to serve as the base for GCN processing.
This function builds a fully connected undirected graph (with self-loops) consisting of up to 10 nodes. This graph serves as a structural template for generating watermark trigger node features.
- Returns:
A small connected DGL graph with self-loops, moved to the appropriate device.
- Return type:
dgl.DGLGraph
- forward(clean_features, selected_nodes)[source]¶
Forward pass to generate trigger node features and edge probabilities.
Constructs a trigger graph by first computing a prototype feature from selected clean nodes, propagating it through GCN layers, and generating additional nodes and edge probabilities to match the required trigger size.
- Parameters:
clean_features (torch.Tensor) – Feature matrix from the clean graph (shape: [num_nodes, feature_dim]).
selected_nodes (list[int] or torch.Tensor) – Indices of nodes selected for constructing the prototype vector.
- Returns:
trigger_features (torch.Tensor) – Feature matrix of generated trigger nodes (shape: [output_nodes, feature_dim]).
edge_probs (torch.Tensor) – A 1D tensor containing probabilities for edges between node pairs (upper triangular, shape: [output_nodes * (output_nodes - 1) / 2]).
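The flat upper-triangular layout of `edge_probs` can be turned back into node pairs as in the sketch below. Two things are assumed rather than documented: that entries follow row-major order, i.e. (0, 1), (0, 2), ..., (1, 2), ..., and that an edge is kept when its probability exceeds 0.5. The helper name is hypothetical.

```python
from itertools import combinations


def edges_from_upper_triangular(edge_probs, num_nodes, threshold=0.5):
    """Map a flat vector of pair probabilities to a list of undirected edges (u < v)."""
    pairs = list(combinations(range(num_nodes), 2))  # assumed row-major pair order
    if len(edge_probs) != len(pairs):
        raise ValueError(
            f"expected {len(pairs)} probabilities for {num_nodes} nodes, "
            f"got {len(edge_probs)}"
        )
    return [pair for pair, prob in zip(pairs, edge_probs) if prob > threshold]


edges = edges_from_upper_triangular([0.9, 0.1, 0.6, 0.2, 0.7, 0.4], num_nodes=4)
pairs_for_default_size = len(list(combinations(range(50), 2)))
```

With the default `output_nodes` of 50, the vector has 50 * 49 / 2 = 1225 entries.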
pygip.models.defense.RandomWM module¶
- class pygip.models.defense.RandomWM.RandomWM(dataset, attack_node_fraction=0.2, wm_node=50, pr=0.2, pg=0.2, attack_name=None)[source]¶
Bases:
BaseDefense
A flexible defense implementation using watermarking to protect against model extraction attacks on graph neural networks.
This class combines the functionality of the original watermark.py:
- Generating watermark graphs
- Training models on original and watermark graphs
- Merging graphs for testing
- Evaluating effectiveness against attacks
- Dynamic selection of attack methods
- _abc_impl = <_abc_data object>¶
- _evaluate_attack_on_watermark(attack_model)[source]¶
Evaluate how well the attack model performs on the watermark graph.
- Parameters:
attack_model (torch.nn.Module) – The model obtained from the attack
- Returns:
Attack model’s accuracy on the watermark graph
- Return type:
float
- _evaluate_watermark(model)[source]¶
Evaluate watermark detection effectiveness.
- Parameters:
model (torch.nn.Module) – The model to evaluate
- Returns:
Watermark detection accuracy
- Return type:
float
- _generate_watermark_graph()[source]¶
Generate a watermark graph using the Erdős-Rényi random graph model.
- Returns:
The generated watermark graph
- Return type:
dgl.DGLGraph
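In the Erdős-Rényi G(n, p) model, every possible undirected edge is included independently with probability p. A minimal sketch follows; it returns a plain edge list rather than a DGL graph, and the function name and `seed` argument are hypothetical. The constructor's `wm_node` and `pg` arguments plausibly correspond to the node count and edge probability, but that mapping is an assumption.

```python
import random
from itertools import combinations


def erdos_renyi_edges(num_nodes, edge_prob, seed=0):
    """Sample the edges of a G(n, p) random graph as (u, v) pairs with u < v."""
    rng = random.Random(seed)
    return [
        (u, v)
        for u, v in combinations(range(num_nodes), 2)
        if rng.random() < edge_prob  # keep each candidate edge with probability p
    ]


sample = erdos_renyi_edges(num_nodes=50, edge_prob=0.2, seed=42)
empty = erdos_renyi_edges(num_nodes=10, edge_prob=0.0)
complete = erdos_renyi_edges(num_nodes=10, edge_prob=1.0)
```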
- _get_attack_class(attack_name)[source]¶
Dynamically import and return the specified attack class.
- Parameters:
attack_name (str) – Name of the attack class to import
- Returns:
The requested attack class
- Return type:
class
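Dynamic class lookup by name is commonly done with `importlib`. The sketch below shows the general pattern with a standard-library module; the helper name `load_class` is hypothetical, and which module RandomWM searches for attack classes is not stated in this documentation.

```python
import importlib


def load_class(module_name, class_name):
    """Import `module_name` and return the attribute called `class_name`."""
    module = importlib.import_module(module_name)
    try:
        return getattr(module, class_name)
    except AttributeError as exc:
        raise ValueError(
            f"{class_name!r} not found in module {module_name!r}"
        ) from exc


# Demonstrated on the standard library so the example runs anywhere.
ordered_dict_cls = load_class("collections", "OrderedDict")
```

Raising a descriptive error for unknown names makes a misspelled attack name easier to diagnose than a bare `AttributeError`.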
- _test_on_watermark(model, wm_dataloader)[source]¶
Test a model’s accuracy on the watermark graph.
- Parameters:
model (torch.nn.Module) – The model to test
wm_dataloader (DataLoader) – DataLoader for the watermark graph
- Returns:
Accuracy on the watermark graph
- Return type:
float
- _train_defense_model()[source]¶
Helper function for training a defense model with watermarking.
- Returns:
The trained defense model with embedded watermark
- Return type:
torch.nn.Module
- _train_target_model()[source]¶
Helper function for training the target model on the original graph.
- Returns:
The trained target model
- Return type:
torch.nn.Module
- defend(attack_name=None)[source]¶
Main defense workflow:
1. Train a target model on the original graph
2. Attack the target model to establish baseline vulnerability
3. Train a defense model with watermarking
4. Test the defense model against the same attack
5. Print performance metrics
- Parameters:
attack_name (str, optional) – Name of the attack class to use; overrides the one set in __init__.
- Returns:
Dictionary containing performance metrics
- Return type:
dict
- supported_api_types = {'dgl'}¶
pygip.models.defense.SurviveWM module¶
pygip.models.defense.SurviveWM2 module¶
- class pygip.models.defense.SurviveWM2.KeyInputOptimizer(training_dataset, key_inputs, T_opt=20)[source]¶
Bases:
object
- class pygip.models.defense.SurviveWM2.SAGEModel(input_dim, hidden_dim=64, num_classes=10, num_layers=3, dropout=0.1)[source]¶
Bases:
Module
- class pygip.models.defense.SurviveWM2.SurviveWM2(dataset, attack_node_fraction, model_path=None, alpha=0.1, num_layers=4, clean_epochs=200, wm_epochs=200, **kwargs)[source]¶
Bases:
BaseDefense
- _abc_impl = <_abc_data object>¶
- class pygip.models.defense.SurviveWM2.WatermarkGenerator(training_dataset, num_watermark_samples=None)[source]¶
Bases:
object
- pygip.models.defense.SurviveWM2.evaluate_clean_accuracy(model, test_data, batch_size=32)[source]¶
- Return type:
float
- pygip.models.defense.SurviveWM2.evaluate_watermark_effectiveness(model, key_inputs)[source]¶
- Return type:
float
pygip.models.defense.base module¶
Module contents¶
- class pygip.models.defense.ATOM(dataset, attack_node_fraction=0)[source]¶
Bases:
BaseDefense
- _abc_impl = <_abc_data object>¶
- supported_api_types = {'pyg'}¶
- supported_datasets = {'CiteSeer', 'Cora', 'PubMed'}¶
- class pygip.models.defense.BackdoorWM(dataset, attack_node_fraction, model_path=None, trigger_rate=0.01, l=20, target_label=0)[source]¶
Bases:
BaseDefense
- _abc_impl = <_abc_data object>¶
- inject_backdoor_trigger(data, trigger_rate=None, trigger_feat_val=0.99, l=None, target_label=None)[source]¶
Feature-based Trigger Injection
- supported_api_types = {'dgl'}¶
- class pygip.models.defense.ImperceptibleWM(dataset, attack_node_fraction=0.3, model_path=None)[source]¶
Bases:
BaseDefense
- _abc_impl = <_abc_data object>¶
- supported_api_types = {'pyg'}¶
- class pygip.models.defense.ImperceptibleWM2(dataset, attack_node_fraction=0.2, wm_node=50, target_label=None, N=50, M=5, epsilon1=1.0, epsilon2=0.5, epsilon3=1.0, owner_id=None, beta=0.001, T_acc=0.8)[source]¶
Bases:
BaseDefense
- _abc_impl = <_abc_data object>¶
- _calculate_embedding_loss(f_theta, backdoor_graph)[source]¶
Calculate embedding loss for model training on backdoor graph. Computes cross-entropy loss for training the main model on the combined clean and trigger graph data.
- Parameters:
f_theta (torch.nn.Module) – Main classification model
backdoor_graph (dgl.DGLGraph) – Combined graph with embedded watermark
- Returns:
Embedding loss value
- Return type:
torch.Tensor
- _calculate_generation_loss_integrated(f_theta_s, f_g, V_p)[source]¶
Calculate integrated generation loss combining all generator objectives. Combines imperception, regulation, and trigger losses with respective weights to optimize the trigger generator network.
- Parameters:
f_theta_s (torch.nn.Module) – Current state of the main model
f_g (torch.nn.Module) – Trigger generator network
V_p (torch.Tensor) – Poisoning node indices
- Returns:
Combined generation loss
- Return type:
torch.Tensor
- _calculate_imperception_loss(trigger_features, V_p)[source]¶
Calculate imperception loss to make watermark features similar to clean features. Measures cosine similarity between trigger features and poisoning node features to ensure the watermark remains hidden.
- Parameters:
trigger_features (torch.Tensor) – Generated trigger node features
V_p (torch.Tensor) – Poisoning node indices
- Returns:
Imperception loss value
- Return type:
torch.Tensor
- _calculate_regulation_loss(trigger_features)[source]¶
Calculate regulation loss based on owner ID signature. Enforces the trigger features to embed owner identification information using cross-entropy loss with the owner ID as target.
- Parameters:
trigger_features (torch.Tensor) – Generated trigger node features
- Returns:
Regulation loss value
- Return type:
torch.Tensor
- _calculate_trigger_loss(f_theta, trigger_features, trigger_graph)[source]¶
Calculate trigger loss for watermark effectiveness. Measures how well the model classifies trigger nodes to the target label, ensuring the watermark functions correctly.
- Parameters:
f_theta (torch.nn.Module) – Main classification model
trigger_features (torch.Tensor) – Generated trigger node features
trigger_graph (dgl.DGLGraph) – Trigger graph structure
- Returns:
Trigger loss value
- Return type:
torch.Tensor
- _construct_backdoor_graph(clean_graph, trigger_graph, V_p)[source]¶
Construct a backdoor graph by combining clean graph with trigger graph. Merges the original graph with the watermark trigger graph by adding connections between poisoning nodes and trigger nodes.
- Parameters:
clean_graph (dgl.DGLGraph) – Original clean graph
trigger_graph (dgl.DGLGraph) – Generated trigger/watermark graph
V_p (torch.Tensor) – Poisoning node indices for connection
- Returns:
Combined backdoor graph with embedded watermark
- Return type:
dgl.DGLGraph
- _create_temp_trigger_graph(trigger_features, edge_probs)[source]¶
Create a temporary trigger graph for loss calculation. Constructs a temporary graph structure using generated features and edge probabilities for intermediate computations.
- Parameters:
trigger_features (torch.Tensor) – Generated trigger node features
edge_probs (torch.Tensor) – Edge existence probabilities
- Returns:
Temporary trigger graph
- Return type:
dgl.DGLGraph
- _evaluate_model_on_graph(model, graph)[source]¶
Evaluate model performance on a specific graph. Computes classification metrics for the given model on the provided graph, handling different model architectures appropriately.
- Parameters:
model (torch.nn.Module) – Model to evaluate
graph (dgl.DGLGraph) – Graph data for evaluation
- Returns:
(accuracy, precision, recall, f1_score) metrics
- Return type:
tuple of float
- _evaluate_with_metrics(model, dataloader)[source]¶
Evaluate model performance using multiple classification metrics.
- Parameters:
model (torch.nn.Module) – The neural network model to evaluate
dataloader (torch.utils.data.DataLoader) – DataLoader containing evaluation data
- Returns:
(accuracy, precision, recall, f1_score) metrics
- Return type:
tuple of float
- _generate_trigger_graph(f_g, V_p)[source]¶
Generate a watermark trigger graph using the generator network. Creates trigger features and edges based on poisoning nodes and constructs a DGL graph for watermark embedding.
- Parameters:
f_g (torch.nn.Module) – Trigger generator network
V_p (torch.Tensor) – Selected poisoning node indices
- Returns:
The generated watermark trigger graph with features and labels
- Return type:
dgl.DGLGraph
- _inner_optimization(f_theta, f_g, V_p, optimizer)[source]¶
Execute the watermark embedding phase of bilevel optimization. Performs M iterations of model training on the backdoor graph to embed the watermark into the model parameters.
- Parameters:
f_theta (torch.nn.Module) – Main classification model
f_g (torch.nn.Module) – Trigger generator network
V_p (torch.Tensor) – Poisoning node indices
optimizer (torch.optim.Optimizer) – Optimizer for model parameters
- Returns:
Updated model with embedded watermark
- Return type:
torch.nn.Module
- _select_poisoning_nodes(clean_model)[source]¶
Select nodes for watermark poisoning based on model predictions. Uses the clean model’s confidence scores to identify high-confidence nodes across different labels for creating the watermark trigger.
- Parameters:
clean_model (torch.nn.Module) – Pre-trained clean model used for node selection
- Returns:
Tensor of selected node indices for poisoning
- Return type:
torch.Tensor
- _train_defense_model()[source]¶
Train the defense model with watermark embedding using bilevel optimization. Implements the complete bilevel optimization process alternating between watermark embedding and trigger generation phases.
- Returns:
(trained_defense_model, trigger_generator)
- Return type:
tuple
- _train_target_model()[source]¶
Train the target model on clean graph data. Creates and trains a GraphSAGE model on the original dataset without any watermark or defense mechanisms.
- Returns:
Trained target model
- Return type:
torch.nn.Module
- defend()[source]¶
Execute the complete watermark defense strategy. Trains target model, applies watermark defense, and verifies ownership. Returns comprehensive evaluation metrics and ownership verification results.
- Returns:
Dictionary containing attack metrics, defense metrics, ownership verification status, and trained generator
- Return type:
dict
- verify_ownership(suspicious_model)[source]¶
Verify ownership of a suspicious model using the watermark. Tests if the suspicious model correctly classifies the watermark trigger graph to determine if it contains the embedded watermark.
- Parameters:
suspicious_model (torch.nn.Module) – Model to test for ownership verification
- Returns:
(is_owner: bool, ownership_accuracy: float)
- Return type:
tuple
- class pygip.models.defense.RandomWM(dataset, attack_node_fraction=0.2, wm_node=50, pr=0.2, pg=0.2, attack_name=None)[source]¶
Bases:
BaseDefense
A flexible defense implementation using watermarking to protect against model extraction attacks on graph neural networks.
This class combines the functionality of the original watermark.py:
- Generating watermark graphs
- Training models on original and watermark graphs
- Merging graphs for testing
- Evaluating effectiveness against attacks
- Dynamic selection of attack methods
- _abc_impl = <_abc_data object>¶
- _evaluate_attack_on_watermark(attack_model)[source]¶
Evaluate how well the attack model performs on the watermark graph.
- Parameters:
attack_model (torch.nn.Module) – The model obtained from the attack
- Returns:
Attack model’s accuracy on the watermark graph
- Return type:
float
- _evaluate_watermark(model)[source]¶
Evaluate watermark detection effectiveness.
- Parameters:
model (torch.nn.Module) – The model to evaluate
- Returns:
Watermark detection accuracy
- Return type:
float
- _generate_watermark_graph()[source]¶
Generate a watermark graph using the Erdős-Rényi random graph model.
- Returns:
The generated watermark graph
- Return type:
dgl.DGLGraph
- _get_attack_class(attack_name)[source]¶
Dynamically import and return the specified attack class.
- Parameters:
attack_name (str) – Name of the attack class to import
- Returns:
The requested attack class
- Return type:
class
- _test_on_watermark(model, wm_dataloader)[source]¶
Test a model’s accuracy on the watermark graph.
- Parameters:
model (torch.nn.Module) – The model to test
wm_dataloader (DataLoader) – DataLoader for the watermark graph
- Returns:
Accuracy on the watermark graph
- Return type:
float
- _train_defense_model()[source]¶
Helper function for training a defense model with watermarking.
- Returns:
The trained defense model with embedded watermark
- Return type:
torch.nn.Module
- _train_target_model()[source]¶
Helper function for training the target model on the original graph.
- Returns:
The trained target model
- Return type:
torch.nn.Module
- defend(attack_name=None)[source]¶
Main defense workflow:
1. Train a target model on the original graph
2. Attack the target model to establish baseline vulnerability
3. Train a defense model with watermarking
4. Test the defense model against the same attack
5. Print performance metrics
- Parameters:
attack_name (str, optional) – Name of the attack class to use; overrides the one set in __init__.
- Returns:
Dictionary containing performance metrics
- Return type:
dict
- supported_api_types = {'dgl'}¶
- class pygip.models.defense.SurviveWM(dataset, attack_node_fraction, model_path=None)[source]¶
Bases:
BaseDefense
- _abc_impl = <_abc_data object>¶
- supported_api_types = {'dgl'}¶
- verify_watermark(model, trigger_graph, trigger_labels)¶
- class pygip.models.defense.SurviveWM2(dataset, attack_node_fraction, model_path=None, alpha=0.1, num_layers=4, clean_epochs=200, wm_epochs=200, **kwargs)[source]¶
Bases:
BaseDefense
- _abc_impl = <_abc_data object>¶