Utilities

The utilities module includes all the utility methods used throughout KGX.

graph_utils

kgx.utils.graph_utils.curie_lookup(curie: str) → str[source]

Given a CURIE, find its label.

This method first does a lookup in predefined maps. If no label is found, it uses CurieLookupService to look for the CURIE in a set of preloaded ontologies.

Parameters

curie (str) – A CURIE

Returns

The label corresponding to the given CURIE

Return type

str
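The two-stage lookup described above can be sketched as follows. This is a minimal illustration and not KGX's implementation: the `PREDEFINED_LABELS` map, the `lookup_label` function, and the `fallback` callable standing in for CurieLookupService are all hypothetical names, and the data is made up.

```python
from typing import Callable, Optional

# Hypothetical predefined CURIE -> label map (illustrative data only).
PREDEFINED_LABELS = {
    "EX:part_of": "part of",
    "EX:has_part": "has part",
}


def lookup_label(curie: str, fallback: Callable[[str], Optional[str]]) -> Optional[str]:
    """Return a label from the predefined map, else ask the fallback."""
    label = PREDEFINED_LABELS.get(curie)
    if label is None:
        # Stand-in for a slower, ontology-backed lookup service.
        label = fallback(curie)
    return label


# A toy fallback backed by a small dict instead of preloaded ontologies.
toy_ontology = {"EX:0001": "example term"}

print(lookup_label("EX:part_of", toy_ontology.get))  # served by the map
print(lookup_label("EX:0001", toy_ontology.get))     # served by the fallback
```

Checking the cheap in-memory map first means the more expensive service is consulted only for CURIEs the map does not cover.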

kgx.utils.graph_utils.get_ancestors(graph: networkx.classes.multidigraph.MultiDiGraph, node: str, relations: List[str] = None) → List[str][source]

Return all ancestors of specified node, filtered by relations.

Parameters
  • graph (networkx.MultiDiGraph) – Graph to traverse

  • node (str) – node identifier

  • relations (List[str]) – list of relations

Returns

A list of ancestor nodes

Return type

List[str]

kgx.utils.graph_utils.get_category_via_superclass(graph: networkx.classes.multidigraph.MultiDiGraph, curie: str, load_ontology: bool = True) → Set[str][source]

Get the category for a given CURIE by tracing its superclasses through the subclass_of hierarchy and selecting the most appropriate category based on those superclasses.

Parameters
  • graph (networkx.MultiDiGraph) – Graph to traverse

  • curie (str) – Input CURIE

  • load_ontology (bool) – Determines whether to load ontology, based on CURIE prefix, or to simply rely on subclass_of hierarchy from graph

Returns

A set containing one or more categories for the given CURIE

Return type

Set[str]

kgx.utils.graph_utils.get_parents(graph: networkx.classes.multidigraph.MultiDiGraph, node: str, relations: List[str] = None) → List[str][source]

Return all direct parents of a specified node, filtered by relations.

Parameters
  • graph (networkx.MultiDiGraph) – Graph to traverse

  • node (str) – node identifier

  • relations (List[str]) – list of relations

Returns

A list of parent node(s)

Return type

List[str]
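The difference between get_parents (direct parents only) and get_ancestors (all transitive parents) can be illustrated with a small sketch. It is not KGX's code: to stay dependency-free it uses a plain list of triples instead of a networkx.MultiDiGraph, the names `EDGES`, `parents`, and `ancestors` are hypothetical, and it assumes that edges point from child to parent and that `relations=None` means "do not filter".

```python
from typing import List, Optional, Tuple

# Toy edge list of (subject, edge_label, object) triples. The assumption here
# is that an edge points from a child to its parent.
EDGES: List[Tuple[str, str, str]] = [
    ("EX:3", "subclass_of", "EX:2"),
    ("EX:2", "subclass_of", "EX:1"),
    ("EX:3", "part_of", "EX:9"),
]


def parents(edges, node: str, relations: Optional[List[str]] = None) -> List[str]:
    """Direct parents of node, keeping only edges whose label is in relations."""
    found: List[str] = []
    for subject, label, obj in edges:
        if subject == node and (relations is None or label in relations):
            if obj not in found:
                found.append(obj)
    return found


def ancestors(edges, node: str, relations: Optional[List[str]] = None) -> List[str]:
    """All nodes reachable by repeatedly following parent edges."""
    seen: List[str] = []
    stack = parents(edges, node, relations)
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guards against cycles and repeated paths
        seen.append(current)
        stack.extend(parents(edges, current, relations))
    return seen


print(parents(EDGES, "EX:3", ["subclass_of"]))
print(ancestors(EDGES, "EX:3", ["subclass_of"]))
```

With the `subclass_of` filter, the `part_of` edge to `EX:9` is ignored, so only the subclass chain is followed.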

kgx_utils

kgx.utils.kgx_utils.camelcase_to_sentencecase(s: str) → str[source]

Convert CamelCase to sentence case.

Parameters

s (str) – Input string in CamelCase

Returns

The input string converted to sentence case

Return type

str
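As a rough illustration of the conversion, the sketch below splits a CamelCase string at each interior capital letter. The function name `camel_to_sentence` is hypothetical, and lowercasing the result is an assumption of this sketch; KGX's own implementation may treat edge cases differently.

```python
import re


def camel_to_sentence(s: str) -> str:
    """Insert a space before each interior capital, then lowercase."""
    # (?<!^) skips the first character; (?=[A-Z]) matches just before a capital.
    return re.sub(r"(?<!^)(?=[A-Z])", " ", s).lower()


print(camel_to_sentence("NamedThing"))
```

A regex this simple splits every capital, so acronyms inside a name would be broken into single letters.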

kgx.utils.kgx_utils.contract(uri) → str[source]

Contract a URI to a CURIE. The candidate CURIEs are sorted to ensure that the same one is chosen every time.

Parameters

uri (Union[rdflib.term.URIRef, str]) – A URI

Returns

The CURIE

Return type

str
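Why sorting matters can be seen in a minimal sketch where two prefixes share one namespace. The prefix map, the `contract_uri` name, and returning `None` when nothing matches are all assumptions made for illustration; this is not KGX's implementation.

```python
from typing import Dict, Optional

# Hypothetical prefix map; two prefixes deliberately share a namespace.
PREFIX_MAP: Dict[str, str] = {
    "EXB": "http://example.org/vocab/",
    "EXA": "http://example.org/vocab/",
    "OTHER": "http://example.org/other/",
}


def contract_uri(uri, prefix_map: Dict[str, str] = PREFIX_MAP) -> Optional[str]:
    """Return a CURIE for uri, or None when no namespace matches."""
    uri = str(uri)  # accepts plain strings and URIRef-like objects
    candidates = [
        f"{prefix}:{uri[len(namespace):]}"
        for prefix, namespace in prefix_map.items()
        if uri.startswith(namespace)
    ]
    # Sorting makes the choice deterministic when several prefixes match.
    return sorted(candidates)[0] if candidates else None


print(contract_uri("http://example.org/vocab/123"))
```

Without the sort, the result would depend on the order in which the prefixes happen to be stored.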

kgx.utils.kgx_utils.generate_edge_key(s: str, edge_label: str, o: str) → str[source]

Generates an edge key based on a given subject, edge_label and object.

Parameters
  • s (str) – Subject

  • edge_label (str) – Edge label

  • o (str) – Object

Returns

Edge key as a string

Return type

str
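An edge key lets several edges between the same pair of nodes be told apart by their label. The sketch below shows the idea; the `make_edge_key` name and the `-` separator are assumptions of this example, not a statement of the format KGX actually produces.

```python
def make_edge_key(s: str, edge_label: str, o: str) -> str:
    """Join subject, edge label and object into one string key."""
    # The "-" separator is an assumption made for this sketch.
    return f"{s}-{edge_label}-{o}"


# Two edges between the same nodes get distinct keys because their labels differ.
edge_store = {}
edge_store[make_edge_key("EX:1", "related_to", "EX:2")] = {"weight": 1}
edge_store[make_edge_key("EX:1", "subclass_of", "EX:2")] = {"weight": 2}
print(sorted(edge_store))
```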

Get a BioLink Model mapping for a given category.

Parameters

category (str) – A category for which there is a mapping in BioLink Model

Returns

A BioLink Model class corresponding to category

Return type

str

kgx.utils.kgx_utils.get_cache(maxsize=10000)[source]

Get an instance of cachetools.cache

Parameters

maxsize (int) – The max size for the cache (10000, by default)

Returns

An instance of cachetools.cache

Return type

cachetools.cache

kgx.utils.kgx_utils.get_curie_lookup_service()[source]

Get an instance of kgx.curie_lookup_service.CurieLookupService

Returns

An instance of CurieLookupService

Return type

kgx.curie_lookup_service.CurieLookupService

kgx.utils.kgx_utils.get_toolkit() → bmt.Toolkit[source]

Get an instance of bmt.Toolkit. If no instance is defined yet, one is instantiated and returned.

Returns

an instance of bmt.Toolkit

Return type

bmt.Toolkit
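The "create on first use, then reuse" behaviour described above is a lazy module-level singleton. A minimal sketch of the pattern follows; `StubToolkit` and `get_stub_toolkit` are hypothetical stand-ins so that the example runs without bmt installed.

```python
from typing import Optional


class StubToolkit:
    """Stand-in for an expensive-to-build object such as bmt.Toolkit."""

    instances_created = 0

    def __init__(self) -> None:
        StubToolkit.instances_created += 1


# Module-level slot that holds the shared instance once it exists.
_toolkit: Optional[StubToolkit] = None


def get_stub_toolkit() -> StubToolkit:
    """Create the shared instance on first use, then keep returning it."""
    global _toolkit
    if _toolkit is None:
        _toolkit = StubToolkit()
    return _toolkit


first = get_stub_toolkit()
second = get_stub_toolkit()
print(first is second)
```

Callers never construct the object themselves, so the cost of building it is paid at most once per process.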

kgx.utils.kgx_utils.make_curie(uri) → str[source]

Convert a given URI into a CURIE. This method tries to handle the http and https ambiguity in URI contraction.

Warning

This is a temporary solution and will be deprecated in the near future.
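One way to handle the http/https ambiguity is to retry the contraction with the other scheme when the first attempt fails. The sketch below assumes exactly that strategy; the prefix map and the names `try_contract` and `to_curie` are hypothetical, and returning the URI unchanged on failure is also an assumption.

```python
from typing import Optional

# Hypothetical prefix map that only knows the http form of the namespace.
PREFIXES = {"EX": "http://example.org/vocab/"}


def try_contract(uri: str) -> Optional[str]:
    """Return a CURIE if a known namespace matches, else None."""
    for prefix, namespace in sorted(PREFIXES.items()):
        if uri.startswith(namespace):
            return f"{prefix}:{uri[len(namespace):]}"
    return None


def to_curie(uri) -> str:
    """Contract uri, retrying with the other scheme before giving up."""
    uri = str(uri)
    curie = try_contract(uri)
    if curie is None:
        if uri.startswith("https://"):
            swapped = "http://" + uri[len("https://"):]
        elif uri.startswith("http://"):
            swapped = "https://" + uri[len("http://"):]
        else:
            swapped = uri
        curie = try_contract(swapped)
    # Returning the URI unchanged on failure is an assumption of this sketch.
    return curie if curie is not None else uri


print(to_curie("https://example.org/vocab/42"))
```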

kgx.utils.kgx_utils.sentencecase_to_snakecase(s: str) → str[source]

Convert sentence case to snake_case.

Parameters

s (str) – Input string in sentence case

Returns

The input string converted to snake_case

Return type

str

kgx.utils.kgx_utils.snakecase_to_sentencecase(s: str) → str[source]

Convert snake_case to sentence case.

Parameters

s (str) – Input string in snake_case

Returns

The input string converted to sentence case

Return type

str
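The two conversions above are inverses of each other for simple lowercase names. A minimal sketch, using the hypothetical names `sentence_to_snake` and `snake_to_sentence` and assuming the output is always lowercased:

```python
def sentence_to_snake(s: str) -> str:
    """Replace runs of whitespace with underscores and lowercase."""
    return "_".join(s.split()).lower()


def snake_to_sentence(s: str) -> str:
    """Replace underscores with spaces and lowercase."""
    return s.replace("_", " ").lower()


print(sentence_to_snake("related to"))
print(snake_to_sentence("related_to"))
```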

model_utils

TODO: add methods for ensuring that other biolink model specifications hold, such as that all required properties are present with the correct multiplicity, and that all identifiers are CURIEs.

kgx.utils.model_utils.make_valid_types(G: networkx.classes.multidigraph.MultiDiGraph) → None[source]

Ensures that all the nodes have valid categories, and that all edges have valid edge labels.

Nodes will be deleted if they have no name and have no valid categories. If a node has no valid category but does have a name then its category will be set to the default category “named thing”.

Edges with invalid edge labels will have their edge label set to the default value “related_to”.
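The rules above can be sketched on plain dictionaries. This is an illustration only: the real function operates on a networkx.MultiDiGraph, while here `apply_type_rules`, the attribute names, and the two sets of valid values are hypothetical stand-ins. The sketch also drops edges that touch a deleted node, mirroring what removing a node from a networkx graph does.

```python
from typing import Dict, List, Set

# Hypothetical stand-ins for the valid values (illustrative only).
VALID_CATEGORIES: Set[str] = {"named thing", "gene", "disease"}
VALID_EDGE_LABELS: Set[str] = {"related_to", "subclass_of"}


def apply_type_rules(nodes: Dict[str, dict], edges: List[dict]) -> None:
    """Apply the node and edge rules in place."""
    for node_id in list(nodes):  # copy the keys, since nodes may be deleted
        data = nodes[node_id]
        has_valid = any(c in VALID_CATEGORIES for c in data.get("category", []))
        if has_valid:
            continue
        if data.get("name"):
            data["category"] = ["named thing"]  # default category
        else:
            del nodes[node_id]  # no name and no valid category
    # Drop edges whose endpoints no longer exist.
    edges[:] = [e for e in edges if e["subject"] in nodes and e["object"] in nodes]
    for edge in edges:
        if edge.get("edge_label") not in VALID_EDGE_LABELS:
            edge["edge_label"] = "related_to"  # default edge label


nodes = {
    "EX:1": {"name": "a", "category": ["gene"]},
    "EX:2": {"name": "b", "category": ["bogus"]},
    "EX:3": {"category": ["bogus"]},
}
edges = [
    {"subject": "EX:1", "object": "EX:2", "edge_label": "weird"},
    {"subject": "EX:1", "object": "EX:3", "edge_label": "related_to"},
]
apply_type_rules(nodes, edges)
print(sorted(nodes), edges)
```

Here `EX:2` keeps its place with the default category because it has a name, while the unnamed `EX:3` is removed together with the edge pointing to it.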

rdf_utils

kgx.utils.rdf_utils.infer_category(iri: rdflib.term.URIRef, rdfgraph: rdflib.graph.Graph) → List[str][source]

Infer category for a given iri by traversing rdfgraph.

Parameters
  • iri (rdflib.term.URIRef) – IRI

  • rdfgraph (rdflib.Graph) – A graph to traverse

Returns

A list of categories corresponding to the given IRI

Return type

List[str]

kgx.utils.rdf_utils.process_iri(iri: Union[str, rdflib.term.URIRef]) → str[source]

Casts iri to a string and checks whether it maps to any predefined value. If so, returns that value; otherwise, converts the IRI to a CURIE and returns it.

Parameters

iri (Union[str, URIRef]) – IRI to process; can be a string or a rdflib.term.URIRef

Returns

A string corresponding to the IRI

Return type

str