opfython.core

The core package is the foundation of opfython. It holds the parent classes that define the basic structure of the library and provide the variables and methods that other modules build on.

A core package for all common opfython modules.

class opfython.core.Heap(size: Optional[int] = 1, policy: Optional[str] = 'min')

A standard implementation of a Heap structure.

__init__(self, size: Optional[int] = 1, policy: Optional[str] = 'min')

Initialization method.

Parameters
  • size – Maximum size of the heap.

  • policy – Heap’s policy (min or max).

property color(self)

List of nodes’ colors.

property cost(self)

List of nodes’ costs.

dad(self, i: int)

Gathers the position of the node’s dad.

Parameters

i – Node’s position.

Returns

The position of the node’s dad.

Return type

(int)

go_down(self, i: int)

Goes down in the heap.

Parameters

i – Position to be achieved.

go_up(self, i: int)

Goes up in the heap.

Parameters

i – Position to be achieved.

insert(self, p: int)

Inserts a new node into the heap.

Parameters

p – Node’s value to be inserted.

Returns

Boolean indicating whether insertion was performed correctly.

Return type

(bool)
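The interplay between size, policy, go_up and the boolean returned by insert can be illustrated with a minimal standalone sketch. It does not reproduce opfython's implementation: the class name BoundedHeap, the helper _before, and the choice to order node identifiers by a separate cost list are assumptions made for illustration only.

```python
from typing import List


class BoundedHeap:
    """Hypothetical fixed-capacity heap of node ids ordered by cost."""

    def __init__(self, size: int = 1, policy: str = "min") -> None:
        self.size = size
        self.policy = policy
        self.cost: List[float] = [0.0] * size  # cost[p] is the cost of node p
        self.p: List[int] = []                 # node ids stored in heap order

    def _before(self, a: int, b: int) -> bool:
        """True if node a should sit above node b under the policy."""
        if self.policy == "min":
            return self.cost[a] < self.cost[b]
        return self.cost[a] > self.cost[b]

    def is_full(self) -> bool:
        return len(self.p) == self.size

    def go_up(self, i: int) -> None:
        """Swaps the entry at position i with its parent until order holds."""
        while i > 0:
            parent = (i - 1) // 2
            if not self._before(self.p[i], self.p[parent]):
                break
            self.p[i], self.p[parent] = self.p[parent], self.p[i]
            i = parent

    def insert(self, p: int) -> bool:
        """Appends node p and restores heap order; False if the heap is full."""
        if self.is_full():
            return False
        self.p.append(p)
        self.go_up(len(self.p) - 1)
        return True


heap = BoundedHeap(size=3, policy="min")
heap.cost = [5.0, 1.0, 3.0]
results = [heap.insert(p) for p in (0, 1, 2)]  # three successful insertions
overflow = heap.insert(0)                      # rejected: the heap is full
```

In this sketch a full heap rejects the insertion and reports it through the return value instead of raising, which matches the documented boolean return type.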

is_empty(self)

Checks if the heap is empty.

Returns

A boolean indicating whether the heap is empty.

Return type

(bool)

is_full(self)

Checks if the heap is full.

Returns

A boolean indicating whether the heap is full.

Return type

(bool)

property last(self)

Last element identifier.

left_son(self, i: int)

Gathers the position of the node’s left son.

Parameters

i – Node’s position.

Returns

The position of the node’s left son.

Return type

(int)

property p(self)

List of nodes’ values.

property policy(self)

Policy that rules the heap.

property pos(self)

List of nodes’ positioning markers.

remove(self)

Removes a node from the heap.

Returns

The removed node value.

Return type

(int)
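Removal from a binary heap usually takes the top entry, moves the last entry to the top, and sinks it until order is restored. The sketch below shows that pattern for a min policy on a plain list of values. The function names mirror the documented methods, but the bodies, the use of a bare list, and the IndexError on an empty heap are assumptions rather than opfython's actual behavior.

```python
from typing import List


def go_down(values: List[float], i: int) -> None:
    """Sinks the entry at position i until both children are not smaller."""
    n = len(values)
    while True:
        left, right, best = 2 * i + 1, 2 * i + 2, i
        if left < n and values[left] < values[best]:
            best = left
        if right < n and values[right] < values[best]:
            best = right
        if best == i:
            return
        values[i], values[best] = values[best], values[i]
        i = best


def remove(values: List[float]) -> float:
    """Removes and returns the smallest value of a min-heap stored in a list."""
    if not values:
        # Assumption of this sketch: an empty heap is treated as an error.
        raise IndexError("remove from an empty heap")
    top = values[0]
    last = values.pop()
    if values:
        values[0] = last
        go_down(values, 0)
    return top


heap = [1.0, 3.0, 2.0, 7.0]  # already a valid min-heap
removed = [remove(heap) for _ in range(4)]
```

Repeated removal drains the heap in ascending order, which is the property the OPF algorithms rely on when they process nodes by cost.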

right_son(self, i: int)

Gathers the position of the node’s right son.

Parameters

i – Node’s position.

Returns

The position of the node’s right son.

Return type

(int)
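The position helpers dad, left_son and right_son map naturally onto the conventional zero-based array layout of a binary heap. This page does not state the exact formulas, so the arithmetic below is an assumption based on that common layout, written as free functions for brevity.

```python
def dad(i: int) -> int:
    """Position of the parent of the node stored at position i."""
    # For the root (i == 0) this yields -1, so callers should stop at 0.
    return (i - 1) // 2


def left_son(i: int) -> int:
    """Position of the left child of the node stored at position i."""
    return 2 * i + 1


def right_son(i: int) -> int:
    """Position of the right child of the node stored at position i."""
    return 2 * i + 2


# Going down to either child and back up returns to the starting position.
round_trips = [
    dad(left_son(i)) == i and dad(right_son(i)) == i for i in range(5)
]
```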

property size(self)

Maximum size of the heap.

update(self, p: int, cost: float)

Updates a node with a new value.

Parameters
  • p – Node’s position.

  • cost – Node’s cost.

class opfython.core.Node(idx: Optional[int] = 0, label: Optional[int] = 0, features: Optional[numpy.array] = None)

A Node class is used as the lowest structure level in the OPF workflow.

__init__(self, idx: Optional[int] = 0, label: Optional[int] = 0, features: Optional[numpy.array] = None)

Initialization method.

Parameters
  • idx – The node’s identifier.

  • label – The node’s label.

  • features – An array of features.

property adjacency(self)

Adjacent nodes.

property cluster_label(self)

Node’s cluster assignment identifier.

property cost(self)

Node’s cost.

property density(self)

Node’s density.

property features(self)

N-dimensional array of features.

property idx(self)

Node’s index.

property label(self)

Node’s label (true label).

property n_plateaus(self)

Number of adjacent nodes on plateaus.

property pred(self)

Identifier of the predecessor node.

property predicted_label(self)

Node’s predicted label.

property radius(self)

Maximum distance among the k-nearest neighbors.
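As a rough illustration of what this property holds, the radius can be computed as the largest of the k smallest distances from a node to the other samples. The function knn_radius below is hypothetical, and it uses the plain Euclidean distance for simplicity, whereas the OPF class on this page defaults to 'log_squared_euclidean'.

```python
import math
from typing import Sequence


def knn_radius(
    query: Sequence[float], others: Sequence[Sequence[float]], k: int
) -> float:
    """Largest distance from query to any of its k nearest samples."""
    if k < 1 or k > len(others):
        raise ValueError("k must be between 1 and the number of samples")
    distances = sorted(math.dist(query, other) for other in others)
    return distances[k - 1]


samples = [(3.0, 4.0), (1.0, 0.0), (0.0, 2.0), (6.0, 8.0)]
radius = knn_radius((0.0, 0.0), samples, k=2)
```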

property relevant(self)

Whether the node is relevant or not.

property root(self)

Cluster’s root node identifier.

property status(self)

Whether the node is a prototype or not.

class opfython.core.OPF(distance: Optional[str] = 'log_squared_euclidean', pre_computed_distance: Optional[str] = None)

A basic class to define all common OPF-related methods.

References

J. P. Papa, A. X. Falcão and C. T. N. Suzuki. LibOPF: A library for the design of optimum-path forest classifiers (2015).

__init__(self, distance: Optional[str] = 'log_squared_euclidean', pre_computed_distance: Optional[str] = None)

Initialization method.

Parameters
  • distance – An indicator of the distance metric to be used.

  • pre_computed_distance – A pre-computed distance file for feeding into OPF.

_read_distances(self, file_name: str)

Reads the distance between nodes from a pre-defined file.

Parameters

file_name – File to be loaded.

property distance(self)

Distance metric to be used.

property distance_fn(self)

Distance function to be used.

abstract fit(self, X: numpy.array, Y: numpy.array)

Fits data in the classifier.

It should be directly implemented in OPF child classes.

Parameters
  • X – Array of features.

  • Y – Array of labels.

load(self, file_name: str)

Loads the object from a pickle encoding.

Parameters

file_name – Pickle’s file path to be loaded.

property pre_computed_distance(self)

Whether OPF should use a pre-computed distance or not.

property pre_distances(self)

Pre-computed distance matrix.

abstract predict(self, X: numpy.array)

Predicts new data using the pre-trained classifier.

It should be directly implemented in OPF child classes.

Parameters

X – Array of features.

Returns

A list of predictions for each record of the data.

Return type

(List[int])
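The abstract fit and predict pair follows the usual abstract-base-class pattern: the parent declares the interface and every child supplies the behavior. The sketch below shows that pattern with Python's abc module. BaseClassifier and NearestSampleClassifier are hypothetical names, the nearest-sample rule is a stand-in rather than an OPF algorithm, and plain sequences replace numpy arrays to keep the example self-contained.

```python
import math
from abc import ABC, abstractmethod
from typing import List, Sequence


class BaseClassifier(ABC):
    """Hypothetical parent declaring the interface children must implement."""

    @abstractmethod
    def fit(self, X: Sequence[Sequence[float]], Y: Sequence[int]) -> None:
        """Fits data in the classifier."""

    @abstractmethod
    def predict(self, X: Sequence[Sequence[float]]) -> List[int]:
        """Predicts one label per record."""


class NearestSampleClassifier(BaseClassifier):
    """Stand-in child: labels each record like its closest training sample."""

    def fit(self, X: Sequence[Sequence[float]], Y: Sequence[int]) -> None:
        self._X = [tuple(x) for x in X]
        self._Y = list(Y)

    def predict(self, X: Sequence[Sequence[float]]) -> List[int]:
        predictions = []
        for x in X:
            nearest = min(
                range(len(self._X)), key=lambda j: math.dist(x, self._X[j])
            )
            predictions.append(self._Y[nearest])
        return predictions


clf = NearestSampleClassifier()
clf.fit([(0.0, 0.0), (10.0, 10.0)], [1, 2])
predictions = clf.predict([(1.0, 1.0), (9.0, 9.0)])
```

Instantiating the parent directly raises TypeError, which is how the abc module enforces that fit and predict are implemented in child classes.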

save(self, file_name: str)

Saves the object to a pickle encoding.

Parameters

file_name – File’s name to be saved.
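A save and load round trip through pickle might look like the sketch below. SketchModel is a hypothetical class, and pickling only the instance dictionary is a choice made for this example; the actual methods may serialize the object differently.

```python
import os
import pickle
import tempfile


class SketchModel:
    """Hypothetical model that persists its attributes with pickle."""

    def __init__(self) -> None:
        self.trained = False

    def save(self, file_name: str) -> None:
        """Writes the object's attributes to a pickle file."""
        with open(file_name, "wb") as f:
            pickle.dump(self.__dict__, f)

    def load(self, file_name: str) -> None:
        """Restores the attributes stored in a pickle file into this object."""
        with open(file_name, "rb") as f:
            self.__dict__.update(pickle.load(f))


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.pkl")
    original = SketchModel()
    original.trained = True
    original.save(path)

    restored = SketchModel()
    restored.load(path)
```

Pickle files can execute arbitrary code when loaded, so only files from a trusted source should be passed to a method like load.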

property subgraph(self)

Subgraph’s instance.

class opfython.core.Subgraph(X: Optional[numpy.array] = None, Y: Optional[numpy.array] = None, I: Optional[numpy.array] = None, from_file: Optional[bool] = None)

A Subgraph class is used as a collection of Nodes and the basic structure to work with OPF.

__init__(self, X: Optional[numpy.array] = None, Y: Optional[numpy.array] = None, I: Optional[numpy.array] = None, from_file: Optional[bool] = None)

Initialization method.

Parameters
  • X – Array of features.

  • Y – Array of labels.

  • I – Array of indexes.

  • from_file – Whether Subgraph should be directly created from a file.

_build(self, X: numpy.array, Y: numpy.array, I: numpy.array)

This method serves as the object building process.

One can define here several commands that do not necessarily need to run in the initialization method.

Parameters
  • X – Features array.

  • Y – Labels array.

  • I – Indexes array.

_load(self, file_path: str)

Loads and parses a dataframe from a file.

Parameters

file_path – File to be loaded.

Returns

Arrays holding the features and labels.

Return type

(Tuple[np.array, np.array])

destroy_arcs(self)

Destroys the arcs present in the subgraph.

property idx_nodes(self)

Ordered list of node indexes.

mark_nodes(self, i: int)

Marks a node and its whole path as relevant.

Parameters

i – An identifier of the node to start the marking.
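Marking a node and its whole path can be pictured as walking the chain of predecessors back to the root. The sketch below assumes that each node stores its predecessor in pred and that a sentinel value marks a root. SketchNode, NIL and mark_path are hypothetical names, not part of the library.

```python
from dataclasses import dataclass
from typing import List

NIL = -1  # assumed sentinel meaning "no predecessor"


@dataclass
class SketchNode:
    """Hypothetical node holding only the fields this example needs."""

    pred: int = NIL
    relevant: bool = False


def mark_path(nodes: List[SketchNode], i: int) -> None:
    """Marks node i and every predecessor up to the root as relevant."""
    while i != NIL:
        nodes[i].relevant = True
        i = nodes[i].pred


# Node 0 is a root; the path 2 -> 1 -> 0 gets marked, node 3 is left untouched.
nodes = [SketchNode(), SketchNode(pred=0), SketchNode(pred=1), SketchNode(pred=0)]
mark_path(nodes, 2)
flags = [node.relevant for node in nodes]
```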

property n_features(self)

Number of features.

property n_nodes(self)

Number of nodes.

property nodes(self)

List of nodes that belong to the Subgraph.

reset(self)

Resets the subgraph predecessors and arcs.

property trained(self)

Indicates whether the subgraph is trained.