pygip.models.defense.Integrity¶

Classes

`BitFlipAttack`(model[, attack_type, bit])
`InductiveFingerprintGenerator`(model, dataset)
`MettackHelper`(graph, features, labels, ...)
`QueryBasedVerificationDefense`(dataset[, ...])
`TransductiveFingerprintGenerator`(model, dataset)

class pygip.models.defense.Integrity.BitFlipAttack(model, attack_type='random', bit=0)[source]¶

Bases: object

_get_target_params()[source]¶

_true_bit_flip(tensor, index=None, bit=0)[source]¶

apply()[source]¶
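The class carries no docstring, so the following is only a minimal sketch of what flipping one bit of a 32-bit float parameter looks like, using the standard library rather than tensors. The name `flip_float32_bit` and the convention that bit 0 is the least significant bit are assumptions for illustration, not part of pygip.

```python
import struct


def flip_float32_bit(value: float, bit: int = 0) -> float:
    """Flip one bit of the IEEE-754 float32 encoding of `value`.

    Assumption: bit 0 is the least significant mantissa bit and
    bit 31 is the sign bit.
    """
    if not 0 <= bit < 32:
        raise ValueError("bit must be in the range [0, 31]")
    # Reinterpret the float32 bytes as an unsigned 32-bit integer.
    (as_int,) = struct.unpack("<I", struct.pack("<f", value))
    flipped = as_int ^ (1 << bit)
    # Reinterpret the modified integer as a float32 again.
    (result,) = struct.unpack("<f", struct.pack("<I", flipped))
    return result
```

Flipping the sign bit of `1.0` gives `-1.0`, while flipping the top exponent bit turns it into infinity, which is why high-order flips are far more damaging than low-order ones.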

class pygip.models.defense.Integrity.InductiveFingerprintGenerator(model, dataset, shadow_graph=None, knowledge='limited', candidate_fraction=0.3, num_fingerprints=5, randomize=True, random_seed=None, device='cpu', perturb_fingerprints=False, perturb_budget=5)[source]¶

Bases: object

_generate_full()[source]¶: Implements full knowledge fingerprint generation (gradient-based). Based on Section 4.2.1 and 5.2 of Wu et al. (2023).

_generate_limited()[source]¶: Implements limited knowledge fingerprint generation (output-based). Based on Section 4.2.2 and 5.2 of Wu et al. (2023).

_get_features()[source]¶

_greedy_edge_perturbation_f(node_idx, perturb_budget)[source]¶: Full knowledge edge perturbation (Inductive-F). Increases fingerprint score using model gradients while preserving prediction.

_greedy_edge_perturbation_l(node_idx, perturb_budget)[source]¶: Limited knowledge edge perturbation (Inductive-L). Uses confidence margin (1 - confidence) as proxy for fingerprint sensitivity.
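The confidence margin mentioned above can be sketched as follows. This is an illustrative pure-Python version that assumes the model output for a node is a list of raw logits; the helper names `softmax` and `confidence_margin_score` are hypothetical and not taken from pygip.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confidence_margin_score(logits: Sequence[float]) -> float:
    """Return 1 - confidence, where confidence is the top class probability.

    A node close to the decision boundary has low confidence and
    therefore a high score.
    """
    return 1.0 - max(softmax(logits))
```

A node with logits `[0.0, 0.0]` scores 0.5, the maximum for two classes, while a confidently classified node scores close to 0.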

compute_fingerprint_score(node_idx, graph_override=None)[source]¶: Computes the fingerprint score for a given node according to knowledge mode. If graph_override is provided, scoring is done on that graph instead of shadow_graph.

generate_fingerprint_nodes()[source]¶

Step 3: Identifies and returns the top-k (num_fingerprints) nodes with the highest fingerprint scores from the candidate set. (Section 4.2.2)

Returns:: Indices of selected fingerprint nodes.
Return type:: List[int]
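The top-k selection step can be illustrated with a short sketch. It assumes scores are already available as a mapping from node index to score; `top_k_nodes` is a hypothetical helper, not the library's implementation.

```python
import heapq
from typing import Dict, List


def top_k_nodes(scores: Dict[int, float], k: int) -> List[int]:
    """Return the k node indices with the highest scores, best first."""
    return heapq.nlargest(k, scores, key=scores.get)


# Example: candidate nodes 3, 7, 11 and 20 with their fingerprint scores.
example_scores = {3: 0.1, 7: 0.9, 11: 0.5, 20: 0.7}
selected = top_k_nodes(example_scores, k=2)
```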

generate_fingerprints(method='full')[source]¶

Generate inductive fingerprints for model watermarking.

Parameters:: method (str) – ‘full’ for gradient-based or ‘limited’ for output-based
Returns:: List of fingerprints

get_candidate_nodes()[source]¶

greedy_edge_perturbation(node_idx, perturb_budget=5, knowledge='full')[source]¶

Dispatch to greedy edge perturbation strategy based on verifier knowledge level.

Parameters:

node_idx (int) – Fingerprint node index.
perturb_budget (int) – Number of edge perturbations allowed.
knowledge (str) – ‘full’ or ‘limited’

greedy_perturb_fingerprints(node_indices)[source]¶

Greedily perturbs each fingerprint node’s features (not edges) to increase its fingerprint score, without changing the predicted label.

For each node, for each feature dimension:
- Add or subtract a small epsilon.
- Accept change if predicted label stays the same and fingerprint score increases.
- Stop after perturb_budget attempts or no improvement.

Returns:: Indices of perturbed fingerprint nodes (features in shadow_graph are updated in-place).
Return type:: List[int]
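The loop described above can be sketched on a toy model. This is a simplified illustration under stated assumptions: the model is replaced by caller-supplied `predict` and `score` callables, one "attempt" is one trial change, and `epsilon` is a hypothetical step size. None of these names come from pygip.

```python
import math
from typing import Callable, List, Sequence


def greedy_feature_perturb(
    features: Sequence[float],
    predict: Callable[[List[float]], int],
    score: Callable[[List[float]], float],
    epsilon: float = 0.01,
    budget: int = 5,
) -> List[float]:
    """Greedily nudge features to raise `score` while keeping the label."""
    current = list(features)
    label = predict(current)
    best = score(current)
    attempts = 0
    improved = True
    while improved and attempts < budget:
        improved = False
        for dim in range(len(current)):
            for delta in (epsilon, -epsilon):
                if attempts >= budget:
                    break
                attempts += 1
                trial = list(current)
                trial[dim] += delta
                trial_score = score(trial)
                # Accept only label-preserving changes that raise the score.
                if predict(trial) == label and trial_score > best:
                    current, best, improved = trial, trial_score, True
                    break
    return current


# Toy model (assumption): the two features are used directly as logits.
def toy_predict(x: List[float]) -> int:
    return max(range(len(x)), key=x.__getitem__)


def toy_score(x: List[float]) -> float:
    peak = max(x)
    exps = [math.exp(v - peak) for v in x]
    return 1.0 - max(exps) / sum(exps)


original = [1.0, 0.0]
perturbed = greedy_feature_perturb(original, toy_predict, toy_score, 0.1, 5)
```

In this toy run the perturbed point moves toward the decision boundary, so its score rises while the predicted class stays the same.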

save_fingerprint_tuples(node_indices)[source]¶

class pygip.models.defense.Integrity.MettackHelper(graph, features, labels, train_mask, val_mask, test_mask, n_perturbations=5, device='cpu', max_perturbations=50, surrogate_epochs=30, candidate_sample_size=20)[source]¶

Bases: object

_apply_single_perturbation(graph, edge, action)[source]¶: Apply a single edge perturbation (add or remove) to the graph.

_apply_structure_attack()[source]¶: Runs the Mettack structure perturbation loop (bi-level optimization).
- At each step, modify the adjacency matrix (add/remove an edge).
- Select the perturbation that maximizes surrogate model loss on the validation nodes.
- Repeat up to n_perturbations times.

Returns a new DGLGraph with edges modified. (See Appendix A.2 in Wu et al.)
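The greedy outer loop of such a structure attack can be sketched without any graph library. This is a deliberately simplified illustration: it scores each candidate flip by calling a user-supplied `attack_loss` function and does not compute meta-gradients or retrain a surrogate, so it is not Mettack itself. Edges are modeled as frozensets of two node indices, and all names are hypothetical.

```python
from itertools import combinations
from typing import Callable, FrozenSet, Iterable, Set

Edge = FrozenSet[int]


def toggle(edges: FrozenSet[Edge], edge: Edge) -> FrozenSet[Edge]:
    """Remove the edge if present, otherwise add it."""
    return edges - {edge} if edge in edges else edges | {edge}


def greedy_structure_attack(
    edges: Iterable[Edge],
    num_nodes: int,
    attack_loss: Callable[[FrozenSet[Edge]], float],
    n_perturbations: int = 5,
) -> Set[Edge]:
    """Flip, one at a time, the node pair that most increases attack_loss."""
    current = frozenset(edges)
    candidates = [frozenset(pair) for pair in combinations(range(num_nodes), 2)]
    for _ in range(n_perturbations):
        if not candidates:
            break
        best = max(candidates, key=lambda e: attack_loss(toggle(current, e)))
        current = toggle(current, best)
        candidates.remove(best)  # never flip the same pair twice
    return set(current)


# Toy loss (assumption): the degree of node 0 stands in for surrogate loss.
def toy_loss(edge_set: Iterable[Edge]) -> float:
    return sum(1 for e in edge_set if 0 in e)


poisoned = greedy_structure_attack({frozenset({1, 2})}, 4, toy_loss, 2)
```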

_compute_accuracy(logits, labels)[source]¶: Helper function to compute accuracy.
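As an illustration of what an accuracy helper of this kind typically computes, here is a pure-Python sketch. It assumes `logits` is a list of per-class score lists and `labels` a list of integer class ids; the library itself presumably operates on tensors.

```python
from typing import List, Sequence


def compute_accuracy(logits: Sequence[Sequence[float]], labels: List[int]) -> float:
    """Fraction of rows whose arg-max class matches the label."""
    if not labels:
        return 0.0
    predictions = [max(range(len(row)), key=row.__getitem__) for row in logits]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)
```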

_compute_attack_loss(perturbed_graph)[source]¶: Compute the attack loss on a perturbed graph. This measures how much the surrogate model’s performance degrades. Uses proper bi-level optimization as in the original Mettack paper.

_create_val_mask_from_train(train_mask)[source]¶: Create a validation mask by taking a subset of training nodes. This is needed when the dataset doesn’t provide a validation mask.
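One way to derive a validation mask from a training mask is sketched below. The `val_fraction` parameter, the use of plain boolean lists, and the choice to sample uniformly at random are all assumptions for illustration; the source does not say how the subset is chosen or whether the chosen nodes are removed from the training mask.

```python
import random
from typing import List, Optional, Sequence


def val_mask_from_train(
    train_mask: Sequence[bool],
    val_fraction: float = 0.2,
    seed: Optional[int] = None,
) -> List[bool]:
    """Mark a random subset of the training nodes as validation nodes."""
    train_idx = [i for i, is_train in enumerate(train_mask) if is_train]
    if not train_idx:
        return [False] * len(train_mask)
    n_val = max(1, int(len(train_idx) * val_fraction))
    chosen = set(random.Random(seed).sample(train_idx, n_val))
    return [i in chosen for i in range(len(train_mask))]
```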

_evaluate(poisoned_graph)[source]¶: Evaluates GCN accuracy before/after poisoning, etc.

_get_candidate_edges()[source]¶: Generate candidate edges for perturbation. Includes both existing edges (for removal) and non-existing edges (for addition).
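Candidate generation of this kind can be sketched as follows. The example assumes an undirected graph given as a set of node-index pairs and tags each candidate with the action it would trigger. The `sample_size` argument mirrors the idea of `candidate_sample_size` in the constructor, but how the library actually samples is not stated in the source.

```python
import random
from itertools import combinations
from typing import Iterable, List, Optional, Tuple

Pair = Tuple[int, int]


def candidate_edges(
    edges: Iterable[Pair],
    num_nodes: int,
    sample_size: int = 20,
    seed: Optional[int] = None,
) -> List[Tuple[Pair, str]]:
    """Return (edge, action) candidates: 'remove' if present, else 'add'."""
    existing = {tuple(sorted(e)) for e in edges}
    all_pairs = set(combinations(range(num_nodes), 2))
    removals = [(e, "remove") for e in sorted(existing)]
    additions = [(e, "add") for e in sorted(all_pairs - existing)]
    pool = removals + additions
    # Sample a subset so each attack step evaluates a bounded number of flips.
    return random.Random(seed).sample(pool, min(sample_size, len(pool)))
```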

_train_surrogate()[source]¶: Trains a surrogate GCN on the clean graph. (Matches Wu et al., Section 6.1)

run()[source]¶

Main entrypoint to run the Mettack algorithm.

Returns:

poisoned_graph (DGLGraph) – The perturbed graph with edges changed.
metrics (dict) – Metrics for before/after attack, for evaluation.

class pygip.models.defense.Integrity.QueryBasedVerificationDefense(dataset, defense_ratio=0.1, model_path=None)[source]¶

Bases: BaseDefense

static _convert_detection_to_binary(detection_results)[source]¶: Convert detection results to binary classification format

_evaluate_accuracy(model, dataset)[source]¶

Evaluates test accuracy of the given model on the dataset.

Parameters:

model – Trained GCN model
dataset – Dataset object (provides features, labels, test_mask, graph)

Returns:: accuracy – Test accuracy, in the range 0-1.
Return type:: float

_evaluate_fingerprints(model, fingerprints)[source]¶

Checks if fingerprinted nodes have changed labels under the given model.

Parameters:

model – The model to evaluate.
fingerprints – List of (graph, node_id, label) tuples.

Returns:: results – Dictionary with keys ‘flipped’ (List[Tuple[node_id, old_label, new_label]]) and ‘flip_rate’ (float).
Return type:: dict
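The fingerprint check can be sketched as below. The sketch replaces the model with a caller-supplied `predict(graph, node_id)` callable, which is an assumption for illustration; the tuple layout and the result keys follow the description above.

```python
from typing import Any, Callable, Dict, List, Tuple

Fingerprint = Tuple[Any, int, int]  # (graph, node_id, label)


def evaluate_fingerprints(
    predict: Callable[[Any, int], int],
    fingerprints: List[Fingerprint],
) -> Dict[str, Any]:
    """Report which fingerprint nodes changed label and the flip rate."""
    flipped = []
    for graph, node_id, old_label in fingerprints:
        new_label = predict(graph, node_id)
        if new_label != old_label:
            flipped.append((node_id, old_label, new_label))
    flip_rate = len(flipped) / len(fingerprints) if fingerprints else 0.0
    return {"flipped": flipped, "flip_rate": flip_rate}


# Toy stand-in (assumption): each "graph" is a dict of node_id -> predicted label.
toy_fingerprints = [({0: 1}, 0, 1), ({3: 2}, 3, 0)]
results = evaluate_fingerprints(lambda g, n: g[n], toy_fingerprints)
```

A nonzero flip rate indicates that the queried model no longer reproduces the recorded fingerprint labels.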

_generate_fingerprints(model, mode='transductive', knowledge='full', k=5, **kwargs)[source]¶: Wrapper for fingerprint generation based on mode and knowledge level. Returns a list of fingerprints.

_get_features()[source]¶

_load_model(model_path)[source]¶: Load pre-trained model.

_random_edge_addition_poisoning(node_fraction=0.1, edges_per_node=5, random_seed=None)[source]¶

Poison a fraction of nodes by adding random edges.

Parameters:

dataset – Dataset object (DGL-based)
node_fraction – Fraction of nodes to poison (e.g., 0.1 = 10%)
edges_per_node – Number of random edges to add per poisoned node
random_seed – Optional seed

Returns:: poisoned_graph
Return type:: DGLGraph
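Random edge-addition poisoning can be illustrated on a plain edge set. This sketch assumes an undirected graph stored as sorted node-index pairs rather than a DGLGraph, and the function name is hypothetical. The parameter names mirror the signature above.

```python
import random
from typing import Iterable, Optional, Set, Tuple

Pair = Tuple[int, int]


def random_edge_addition(
    edges: Iterable[Pair],
    num_nodes: int,
    node_fraction: float = 0.1,
    edges_per_node: int = 5,
    random_seed: Optional[int] = None,
) -> Set[Pair]:
    """Return a copy of the edge set with random edges added to some nodes."""
    rng = random.Random(random_seed)
    poisoned = {tuple(sorted(e)) for e in edges}
    n_poison = min(num_nodes, max(1, int(num_nodes * node_fraction)))
    for u in rng.sample(range(num_nodes), n_poison):
        # Only connect to nodes that are not already neighbours of u.
        non_neighbours = [
            v for v in range(num_nodes)
            if v != u and tuple(sorted((u, v))) not in poisoned
        ]
        picks = rng.sample(non_neighbours, min(edges_per_node, len(non_neighbours)))
        for v in picks:
            poisoned.add(tuple(sorted((u, v))))
    return poisoned
```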

_retrain_poisoned_model(poisoned_graph, epochs=200)[source]¶

Retrain target GCN using the poisoned graph structure.

Parameters:

dataset – Original Dataset object (provides features, labels, masks)
poisoned_graph – DGLGraph (with new random edges added)
defense_class – The defense class to use for model training (e.g., QueryBasedVerificationDefense)
device – ‘cpu’ or ‘cuda’

Returns:: model – Trained GCN model

_run_attack(model, attack_type='mettack', knowledge='full', **kwargs)[source]¶

Run the specified attack on the model.

Returns:

poisoned_model – torch.nn.Module
metadata – dict with info about the attack

_train_target_model(epochs=200)[source]¶

Trains target GCN model according to protocol in Wu et al. (2023), Section 6.1 for graph node classification.

Returns:: model – The trained GCN model.
Return type:: torch.nn.Module

defend(fingerprint_mode='inductive', knowledge='full', attack_type='bitflip', k=5, num_trials=1, use_edge_perturbation=False, verbose=True, **kwargs)[source]¶: Execute the query-based verification defense.

evaluate_model(model, dataset)[source]¶: Evaluate model performance on downstream task

run_full_pipeline(attack_type='random', mode='transductive', knowledge='full', k=5, trials=1, **kwargs)[source]¶

Runs the full fingerprinting + attack + evaluation pipeline.

Parameters:

attack_type – ‘random’, ‘bitflip’, or ‘mettack’
mode – ‘transductive’ or ‘inductive’
knowledge – ‘full’ or ‘limited’
k – number of fingerprints
trials – number of repeated trials
kwargs – extra params for attack or fingerprinting

Prints per-trial results and summary statistics.

supported_api_types = {'dgl'}¶

supported_datasets = {}¶

train_target_model(metric_comp)[source]¶: Train the target model with defense mechanism.

verify_defense(model, fingerprints, attack_type, **kwargs)[source]¶: Verify defense effectiveness by running attack and checking fingerprints

class pygip.models.defense.Integrity.TransductiveFingerprintGenerator(model, dataset, candidate_fraction=0.3, random_seed=None, device='cpu', randomize=True)[source]¶

Bases: object

_get_features()[source]¶: Backend-agnostic feature getter (DGL or PyG).

compute_fingerprint_scores_full(candidate_nodes)[source]¶: Full-knowledge fingerprint scores (gradient-based).

compute_fingerprint_scores_limited(candidate_nodes)[source]¶: Limited-knowledge fingerprint scores (confidence margin).

generate_fingerprints(k=5, method='full')[source]¶

get_candidate_nodes()[source]¶: Randomly sample a subset of nodes as candidates.
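Candidate sampling of this kind can be sketched with the standard library. The function name and the rounding rule (at least one candidate, rounded down otherwise) are assumptions; `candidate_fraction` and `random_seed` mirror the constructor arguments.

```python
import random
from typing import List, Optional


def sample_candidate_nodes(
    num_nodes: int,
    candidate_fraction: float = 0.3,
    random_seed: Optional[int] = None,
) -> List[int]:
    """Sample a fraction of node indices uniformly without replacement."""
    if num_nodes <= 0:
        return []
    k = min(num_nodes, max(1, int(num_nodes * candidate_fraction)))
    return sorted(random.Random(random_seed).sample(range(num_nodes), k))
```

Passing the same `random_seed` yields the same candidate set, which keeps fingerprint generation reproducible across runs.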

select_top_fingerprints(scores, candidate_nodes, k, method='full')[source]¶: Selects top-k fingerprint nodes after filtering out extreme score outliers.
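The source does not state which outlier rule is used, so the following sketch swaps in a simple z-score filter purely for illustration: scores further than `z_max` population standard deviations from the mean are discarded before the top-k pick. The function name and the `z_max` parameter are hypothetical.

```python
import statistics
from typing import List, Sequence


def select_top_k_without_outliers(
    scores: Sequence[float],
    candidate_nodes: Sequence[int],
    k: int,
    z_max: float = 3.0,
) -> List[int]:
    """Drop extreme scores by z-score, then return the k best nodes."""
    if not scores:
        return []
    mean = statistics.mean(scores)
    std = statistics.pstdev(scores)
    kept = [
        (s, n)
        for s, n in zip(scores, candidate_nodes)
        if std == 0 or abs(s - mean) / std <= z_max
    ]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [node for _, node in kept[:k]]


# The score 100.0 is an extreme outlier and is filtered out before ranking.
picked = select_top_k_without_outliers(
    [0.1, 0.2, 0.3, 0.4, 100.0], [10, 11, 12, 13, 14], k=2, z_max=1.5
)
```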