Operations

This module provides a set of operations that are supported by KGX.

Clique Merge

class kgx.operations.clique_merge.CliqueMerge(prefix_prioritization_map: dict = None)

Bases: object

build_cliques(target_graph: networkx.classes.multidigraph.MultiDiGraph)

Builds a clique graph from same_as edges in target_graph.

Parameters

target_graph (networkx.MultiDiGraph) – A MultiDiGraph that contains nodes and edges

Returns

The clique graph with only same_as edges

Return type

networkx.Graph
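The grouping described above can be pictured as finding connected groups of nodes over same_as edges only. The following is a minimal sketch, not the KGX implementation: it assumes edges are available as (subject, predicate, object) triples, and the helper name group_same_as_cliques and the predicate string biolink:same_as are illustrative assumptions.

```python
from collections import defaultdict

def group_same_as_cliques(edges, same_as="biolink:same_as"):
    """Return a list of node-id sets connected by same_as edges (hypothetical helper)."""
    # Build an undirected adjacency map from same_as edges only;
    # every other predicate is ignored.
    adjacency = defaultdict(set)
    for subject, predicate, obj in edges:
        if predicate == same_as:
            adjacency[subject].add(obj)
            adjacency[obj].add(subject)

    cliques, seen = [], set()
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Depth-first walk that collects one connected group.
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        cliques.append(component)
    return cliques

edges = [
    ("HGNC:11603", "biolink:same_as", "NCBIGene:6932"),
    ("NCBIGene:6932", "biolink:same_as", "ENSEMBL:ENSG00000081059"),
    ("HGNC:11603", "biolink:related_to", "MONDO:0005002"),
]
cliques = group_same_as_cliques(edges)
```

In this sketch the related_to edge contributes nothing, so MONDO:0005002 does not appear in any clique.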

consolidate_edges() → networkx.classes.multidigraph.MultiDiGraph

Move all edges from nodes in a clique to the clique leader.

Returns

The target graph, with all edges from nodes in a clique moved to the clique leader

Return type

networkx.MultiDiGraph
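The idea of moving edges to a clique leader can be sketched as rewriting edge endpoints. This is an illustration under assumptions, not the KGX implementation: edges are modelled as (subject, predicate, object) triples, and the helper move_edges_to_leader and the leader_of mapping (clique member to leader) are hypothetical names. The sketch does not address what happens to the same_as edges themselves.

```python
def move_edges_to_leader(edges, leader_of):
    """Replace every clique member on an edge with its leader (hypothetical helper)."""
    moved = []
    for subject, predicate, obj in edges:
        # Nodes that are not in the mapping are kept unchanged.
        new_subject = leader_of.get(subject, subject)
        new_obj = leader_of.get(obj, obj)
        moved.append((new_subject, predicate, new_obj))
    return moved

leader_of = {"NCBIGene:6932": "HGNC:11603"}
edges = [("NCBIGene:6932", "biolink:related_to", "MONDO:0005002")]
moved = move_edges_to_leader(edges, leader_of)
```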

elect_leader()

Elect leader for each clique in a graph.

get_category_from_equivalence(node: str, attributes: dict) → str

Get category for a node based on its equivalent nodes in a graph.

Parameters
  • node (str) – Node identifier

  • attributes (dict) – Node’s attributes

Returns

Category for the node

Return type

str

get_leader_by_annotation(clique: list) → Tuple[Optional[str], Optional[str]]

Get leader by searching for a leader annotation property on any of the nodes in a given clique.

Parameters

clique (list) – A list of nodes from a clique

Returns

A tuple containing the node that has been elected as the leader, and the election strategy

Return type

tuple[Optional[str], Optional[str]]
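A minimal sketch of annotation-based election follows. It is not the KGX implementation: the property name clique_leader, the strategy label LEADER_ANNOTATION, the helper name, and the node_attributes mapping (node id to attribute dict) are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

def leader_by_annotation(
    clique: List[str],
    node_attributes: Dict[str, dict],
    flag: str = "clique_leader",  # assumed property name
) -> Tuple[Optional[str], Optional[str]]:
    """Return the first node carrying a truthy leader flag, plus a strategy label."""
    for node in clique:
        if node_attributes.get(node, {}).get(flag):
            return node, "LEADER_ANNOTATION"  # assumed label
    # No node in the clique is annotated as leader.
    return None, None

node_attributes = {
    "HGNC:11603": {"clique_leader": True},
    "NCBIGene:6932": {},
}
result = leader_by_annotation(["NCBIGene:6932", "HGNC:11603"], node_attributes)
```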

get_leader_by_prefix_priority(clique: list, prefix_priority_list: list) → Tuple[Optional[str], Optional[str]]

Get leader from clique based on a given prefix priority.

Parameters
  • clique (list) – A list of nodes that correspond to a clique

  • prefix_priority_list (list) – A list of prefixes in descending order of priority

Returns

A tuple containing the node that has been elected as the leader, and the election strategy

Return type

tuple[Optional[str], Optional[str]]
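Prefix-based election can be sketched as walking the priority list and returning the first clique member whose CURIE prefix matches. This is an illustration, not the KGX implementation: the helper name, the strategy label PREFIX_PRIORITIZATION, and the tie-break (alphabetical among nodes sharing a prefix) are assumptions.

```python
from typing import List, Optional, Tuple

def leader_by_prefix_priority(
    clique: List[str], prefix_priority_list: List[str]
) -> Tuple[Optional[str], Optional[str]]:
    """Pick the node whose prefix appears earliest in the priority list."""
    for prefix in prefix_priority_list:
        # The prefix is the part of the identifier before the first colon.
        matches = sorted(n for n in clique if n.split(":", 1)[0] == prefix)
        if matches:
            return matches[0], "PREFIX_PRIORITIZATION"  # assumed label
    # None of the prioritized prefixes occur in this clique.
    return None, None

clique = ["NCBIGene:6932", "HGNC:11603"]
result = leader_by_prefix_priority(clique, ["HGNC", "NCBIGene"])
```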

get_leader_by_sort(clique: list) → Tuple[Optional[str], Optional[str]]

Get leader from clique by sorting the node ID prefixes alphabetically and selecting the first.

Parameters

clique (list) – A list of nodes that correspond to a clique

Returns

A tuple containing the node that has been elected as the leader, and the election strategy

Return type

tuple[Optional[str], Optional[str]]
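Sort-based election can be sketched as taking the minimum over node identifiers keyed by prefix. This is an illustration, not the KGX implementation: the helper name, the strategy label ALPHABETICAL_SORT, the case-sensitive comparison, and the use of the full identifier as a tie-break are assumptions.

```python
from typing import List, Optional, Tuple

def leader_by_sort(clique: List[str]) -> Tuple[Optional[str], Optional[str]]:
    """Pick the node whose prefix sorts first alphabetically."""
    if not clique:
        return None, None
    # Sort key: prefix first, then the full identifier to break ties.
    leader = min(clique, key=lambda n: (n.split(":", 1)[0], n))
    return leader, "ALPHABETICAL_SORT"  # assumed label

clique = ["NCBIGene:6932", "HGNC:11603", "ENSEMBL:ENSG00000081059"]
result = leader_by_sort(clique)
```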

get_the_most_specific_category(categories: list) → Tuple[str, list]

Fetch the ancestors of every category in the given list. The category with the longest list of ancestors is considered the most specific.

Parameters

categories (list) – A list of categories

Returns

A tuple of the most specific category and a list of ancestors of that category

Return type

tuple[str, list]
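The "longest ancestor list wins" rule can be sketched with a lookup table. This is an illustration, not the KGX implementation: the helper name and the ancestors_of mapping are hypothetical, and the toy hierarchy below is made up for the example rather than taken from the Biolink Model.

```python
from typing import Dict, List, Optional, Tuple

def most_specific_category(
    categories: List[str], ancestors_of: Dict[str, List[str]]
) -> Tuple[Optional[str], List[str]]:
    """Return the category with the most ancestors, together with those ancestors."""
    best: Optional[str] = None
    best_ancestors: List[str] = []
    for category in categories:
        ancestors = ancestors_of.get(category, [])
        # A longer ancestor list means the category sits deeper in the hierarchy.
        if best is None or len(ancestors) > len(best_ancestors):
            best, best_ancestors = category, ancestors
    return best, best_ancestors

# Toy hierarchy (assumption): named thing > biological entity > gene.
toy_ancestors = {
    "named thing": [],
    "biological entity": ["named thing"],
    "gene": ["biological entity", "named thing"],
}
result = most_specific_category(
    ["named thing", "gene", "biological entity"], toy_ancestors
)
```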

update_categories(clique: list)

For a given clique, get the category of each node and validate it against the Biolink Model, mapping to a Biolink Model category where needed.

Example: if a node has gene as its category, this method also adds all ancestors of gene to the node's categories.

Parameters

clique (list) – A list of nodes from a clique

validate_categories(clique: list) → Tuple[str, list]

For nodes in a clique, validate the category for each node to make sure that all nodes in a clique are of the same type.

Parameters

clique (list) – A list of nodes from a clique

Returns

A tuple of clique category string and a list of invalid nodes

Return type

tuple[str, list]