opfython.models¶
This package defines every OPF-based machine learning technique, from Supervised OPF to Unsupervised OPF, so you can pick whichever suits your needs.
A modeling package for all common opfython modules.
- class opfython.models.KNNSupervisedOPF(max_k: Optional[int] = 1, distance: Optional[str] = 'log_squared_euclidean', pre_computed_distance: Optional[str] = None)¶
Bases: opfython.core.OPF
A KNNSupervisedOPF, which implements the supervised version of the OPF classifier with a KNN subgraph.
References
J. P. Papa and A. X. Falcão. A Learning Algorithm for the Optimum-Path Forest Classifier. Graph-Based Representations in Pattern Recognition (2009).
- __init__(self, max_k: Optional[int] = 1, distance: Optional[str] = 'log_squared_euclidean', pre_computed_distance: Optional[str] = None)¶
Initialization method.
- Parameters
max_k – Maximum k value for cutting the subgraph.
distance – An indicator of the distance metric to be used.
pre_computed_distance – A pre-computed distance file for feeding into OPF.
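The default metric name, log_squared_euclidean, suggests the logarithm of a squared Euclidean distance. A minimal sketch of such a metric follows. The `1 +` offset and the absence of any scaling constant are assumptions made here, the helper names are illustrative, and the library's own implementation may differ.

```python
import math


def squared_euclidean(x, y):
    """Sum of squared coordinate differences between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum((a - b) ** 2 for a, b in zip(x, y))


def log_squared_euclidean(x, y):
    """log(1 + squared Euclidean distance); the offset maps identical points to 0."""
    # Assumption: no extra scaling factor is applied to the logarithm.
    return math.log1p(squared_euclidean(x, y))
```

The logarithm compresses large distances, so a few far-away samples weigh less on arc costs than they would under the plain squared distance.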
- _clustering(self, force_prototype: Optional[bool] = False)¶
Clusters the subgraph.
- Parameters
force_prototype – Whether clustering should force each class to have at least one prototype.
- _learn(self, X_train: numpy.array, Y_train: numpy.array, I_train: numpy.array, X_val: numpy.array, Y_val: numpy.array, I_val: numpy.array)¶
Learns the best k value over the validation set.
- Parameters
X_train – Array of training features.
Y_train – Array of training labels.
I_train – Array of training indexes.
X_val – Array of validation features.
Y_val – Array of validation labels.
I_val – Array of validation indexes.
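The selection described above can be pictured as a search over k = 1..max_k that keeps the value with the best validation accuracy. The sketch below uses a plain majority-vote k-nearest-neighbour classifier as a stand-in for the OPF classifier, so `knn_predict` and `learn_best_k` are hypothetical helpers rather than library functions.

```python
from collections import Counter


def knn_predict(X_train, Y_train, x, k):
    """Majority vote among the k training samples closest to x (stand-in classifier)."""
    order = sorted(
        range(len(X_train)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(X_train[i], x)),
    )
    votes = Counter(Y_train[i] for i in order[:k])
    # Break ties by the smallest label so the result is deterministic.
    return min(votes, key=lambda label: (-votes[label], label))


def learn_best_k(X_train, Y_train, X_val, Y_val, max_k):
    """Try k = 1..max_k and return (best_k, best_accuracy) on the validation set."""
    best_k, best_acc = 1, -1.0
    for k in range(1, min(max_k, len(X_train)) + 1):
        hits = sum(
            knn_predict(X_train, Y_train, x, k) == y for x, y in zip(X_val, Y_val)
        )
        acc = hits / len(Y_val)
        if acc > best_acc:  # strict '>' keeps the smallest k among ties
            best_k, best_acc = k, acc
    return best_k, best_acc
```

Keeping the smallest k among ties is a design choice of this sketch; it favours the sparsest graph that reaches the best validation score.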
- fit(self, X_train: numpy.array, Y_train: numpy.array, X_val: numpy.array, Y_val: numpy.array, I_train: Optional[numpy.array] = None, I_val: Optional[numpy.array] = None)¶
Fits data in the classifier.
- Parameters
X_train – Array of training features.
Y_train – Array of training labels.
X_val – Array of validation features.
Y_val – Array of validation labels.
I_train – Array of training indexes.
I_val – Array of validation indexes.
- property max_k(self)¶
Maximum k value for cutting the subgraph.
- predict(self, X_test: numpy.array, I_test: Optional[numpy.array] = None)¶
Predicts new data using the pre-trained classifier.
- Parameters
X_test – Array of features.
I_test – Array of indexes.
- Returns
A list of predictions for each record of the data.
- Return type
(List[int])
- class opfython.models.SupervisedOPF(distance: Optional[str] = 'log_squared_euclidean', pre_computed_distance: Optional[str] = None)¶
Bases: opfython.core.OPF
A SupervisedOPF, which implements the supervised version of the OPF classifier.
References
J. P. Papa, A. X. Falcão and C. T. N. Suzuki. Supervised Pattern Classification based on Optimum-Path Forest. International Journal of Imaging Systems and Technology (2009).
- __init__(self, distance: Optional[str] = 'log_squared_euclidean', pre_computed_distance: Optional[str] = None)¶
Initialization method.
- Parameters
distance – An indicator of the distance metric to be used.
pre_computed_distance – A pre-computed distance file for feeding into OPF.
- _find_prototypes(self)¶
Finds prototype nodes using the Minimum Spanning Tree (MST) approach.
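In the OPF literature, the MST approach marks as prototypes the samples that sit on an MST edge connecting two different classes. The following is a minimal pure-Python sketch of that idea using Prim's algorithm on a complete graph with squared Euclidean weights. The function name `find_prototypes` and the choice of metric are assumptions for illustration, not the library's internals.

```python
def find_prototypes(X, Y):
    """Return indices of samples joined by an MST edge to a sample of another class."""
    n = len(X)
    if n == 0:
        return set()

    def dist(i, j):
        return sum((a - b) ** 2 for a, b in zip(X[i], X[j]))

    in_tree = [False] * n
    cost = [float("inf")] * n  # cheapest known edge linking each node to the tree
    parent = [None] * n        # other endpoint of that edge
    cost[0] = 0.0
    prototypes = set()

    for _ in range(n):
        # Prim's algorithm: pick the cheapest node not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: cost[i])
        in_tree[u] = True
        p = parent[u]
        # An MST edge whose endpoints carry different labels marks both as prototypes.
        if p is not None and Y[p] != Y[u]:
            prototypes.update((p, u))
        # Relax the edges from the newly added node to the remaining nodes.
        for v in range(n):
            if not in_tree[v]:
                d = dist(u, v)
                if d < cost[v]:
                    cost[v], parent[v] = d, u
    return prototypes
```

This O(n²) version is adequate for small training sets; a heap-based variant would be preferable for larger ones.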
- fit(self, X_train: numpy.array, Y_train: numpy.array, I_train: Optional[numpy.array] = None)¶
Fits data in the classifier.
- Parameters
X_train – Array of training features.
Y_train – Array of training labels.
I_train – Array of training indexes.
- learn(self, X_train: numpy.array, Y_train: numpy.array, X_val: numpy.array, Y_val: numpy.array, n_iterations: Optional[int] = 10)¶
Learns the best classifier over a validation set.
- Parameters
X_train – Array of training features.
Y_train – Array of training labels.
X_val – Array of validation features.
Y_val – Array of validation labels.
n_iterations – Number of iterations.
- predict(self, X_val: numpy.array, I_val: Optional[numpy.array] = None)¶
Predicts new data using the pre-trained classifier.
- Parameters
X_val – Array of validation or test features.
I_val – Array of validation or test indexes.
- Returns
A list of predictions for each record of the data.
- Return type
(List[int])
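Taken together, fit and predict support the usual train-then-evaluate loop. The helper below is a sketch that relies only on the two signatures documented above. `holdout_accuracy` and the `NearestSampleStub` stand-in are hypothetical names, introduced so the block runs without the library installed; in practice one would pass `opfython.models.SupervisedOPF()` and NumPy arrays instead.

```python
def holdout_accuracy(clf, X_train, Y_train, X_test, Y_test):
    """Fit clf on the training split and return its accuracy on the test split."""
    clf.fit(X_train, Y_train)
    preds = clf.predict(X_test)
    hits = sum(int(p == y) for p, y in zip(preds, Y_test))
    return hits / len(Y_test)


class NearestSampleStub:
    """Hypothetical stand-in exposing the same fit/predict interface (1-NN)."""

    def fit(self, X_train, Y_train, I_train=None):
        self._X, self._Y = list(X_train), list(Y_train)

    def predict(self, X_val, I_val=None):
        def closest(x):
            return min(
                range(len(self._X)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(self._X[i], x)),
            )

        return [self._Y[closest(x)] for x in X_val]


acc = holdout_accuracy(
    NearestSampleStub(),
    [[0.0, 0.0], [1.0, 1.0], [8.0, 8.0], [9.0, 9.0]], [1, 1, 2, 2],
    [[0.5, 0.2], [8.5, 9.1]], [1, 2],
)
print(f"accuracy: {acc:.2f}")
```

Because the helper only calls fit and predict, the same function can compare any classifier that follows this interface.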
- prune(self, X_train: numpy.array, Y_train: numpy.array, X_val: numpy.array, Y_val: numpy.array, n_iterations: Optional[int] = 10)¶
Prunes a classifier over a validation set.
- Parameters
X_train – Array of training features.
Y_train – Array of training labels.
X_val – Array of validation features.
Y_val – Array of validation labels.
n_iterations – Maximum number of iterations.
- class opfython.models.UnsupervisedOPF(min_k: Optional[int] = 1, max_k: Optional[int] = 1, distance: Optional[str] = 'log_squared_euclidean', pre_computed_distance: Optional[str] = None)¶
Bases: opfython.core.OPF
An UnsupervisedOPF, which implements the unsupervised version of the OPF classifier.
References
L. M. Rocha, F. A. M. Cappabianco, A. X. Falcão. Data clustering as an optimum-path forest problem with applications in image analysis. International Journal of Imaging Systems and Technology (2009).
- __init__(self, min_k: Optional[int] = 1, max_k: Optional[int] = 1, distance: Optional[str] = 'log_squared_euclidean', pre_computed_distance: Optional[str] = None)¶
Initialization method.
- Parameters
min_k – Minimum k value for cutting the subgraph.
max_k – Maximum k value for cutting the subgraph.
distance – An indicator of the distance metric to be used.
pre_computed_distance – A pre-computed distance file for feeding into OPF.
- _best_minimum_cut(self, min_k: int, max_k: int)¶
Performs a minimum cut on the subgraph using the best k value.
- Parameters
min_k – Minimum value of k.
max_k – Maximum value of k.
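One way to read "best k" here is the k in [min_k, max_k] whose clustering yields the smallest cut value. The sketch below captures only that search loop. `best_k_by_cut` and its `cut_for_k` callback are hypothetical names; the callback is assumed to cluster the graph with k neighbours and return the resulting cut.

```python
def best_k_by_cut(cut_for_k, min_k, max_k):
    """Return (k, cut) for the k in [min_k, max_k] with the smallest cut value."""
    if min_k > max_k:
        raise ValueError("min_k must not exceed max_k")
    best_k, best_cut = None, float("inf")
    for k in range(min_k, max_k + 1):
        cut = cut_for_k(k)  # assumed to re-cluster with k neighbours and score it
        if cut < best_cut:  # strict '<' keeps the smallest k among ties
            best_k, best_cut = k, cut
    return best_k, best_cut
```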
- _clustering(self, n_neighbours: int)¶
Clusters the subgraph using a k value (number of neighbours).
- Parameters
n_neighbours – Number of neighbours to be used.
- _normalized_cut(self, n_neighbours: int)¶
Performs a normalized cut over the subgraph using a k value (number of neighbours).
- Parameters
n_neighbours – Number of neighbours to be used.
- Returns
The value of the normalized cut.
- Return type
(float)
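One common formulation of a normalized cut over a clustered k-NN graph sums, for each cluster, the weight of arcs leaving the cluster divided by the total weight of arcs starting in it, with arc weight taken as the inverse of the distance. The sketch below follows that formulation. The function name, the adjacency-list input, and the `dist` callback are assumptions for illustration.

```python
def normalized_cut(neighbours, labels, dist):
    """Sum over clusters of external / (internal + external) arc weight.

    neighbours[i] lists the graph neighbours of sample i, labels[i] is its
    cluster, and dist(i, j) returns a non-negative distance between samples.
    """
    internal = {}  # weight of arcs that stay inside each cluster
    external = {}  # weight of arcs that leave each cluster
    for i, adj in enumerate(neighbours):
        internal.setdefault(labels[i], 0.0)
        external.setdefault(labels[i], 0.0)
        for j in adj:
            d = dist(i, j)
            if d <= 0.0:
                continue  # skip zero-distance arcs to avoid dividing by zero
            w = 1.0 / d
            if labels[i] == labels[j]:
                internal[labels[i]] += w
            else:
                external[labels[i]] += w
    cut = 0.0
    for label in internal:
        total = internal[label] + external[label]
        if total > 0.0:
            cut += external[label] / total
    return cut
```

A value of 0 means no arc crosses a cluster boundary, and the sum can never exceed the number of clusters.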
- fit(self, X_train: numpy.array, Y_train: Optional[numpy.array] = None, I_train: Optional[numpy.array] = None)¶
Fits data in the classifier.
- Parameters
X_train – Array of training features.
Y_train – Array of training labels.
I_train – Array of training indexes.
- property max_k(self)¶
Maximum k value for cutting the subgraph.
- property min_k(self)¶
Minimum k value for cutting the subgraph.
- predict(self, X_val: numpy.array, I_val: Optional[numpy.array] = None)¶
Predicts new data using the pre-trained classifier.
- Parameters
X_val – Array of validation features.
I_val – Array of validation indexes.
- Returns
A list of predictions for each record of the data.
- Return type
(List[int])
- propagate_labels(self)¶
Runs through the clusters and propagates the cluster roots' labels to the samples.
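Label propagation of this kind can be sketched with a predecessor map: each sample follows its predecessors up to the root of its tree and takes that root's label. The names `find_root` and `propagate_root_labels`, and the list-plus-dictionary representation, are hypothetical and chosen only for illustration.

```python
def find_root(pred, i):
    """Follow predecessor links from sample i until a root (pred is None) is reached."""
    while pred[i] is not None:
        i = pred[i]
    return i


def propagate_root_labels(pred, root_labels):
    """Assign every sample the label stored for the root of its tree.

    pred[i] is the predecessor of sample i (None for roots) and
    root_labels maps each root index to its label.
    """
    return [root_labels[find_root(pred, i)] for i in range(len(pred))]
```

For example, with `pred = [None, 0, 1, None, 3]` and `root_labels = {0: "a", 3: "b"}`, the first three samples receive "a" and the last two receive "b".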