LEAD

This module implements the LEAD algorithm [1].

References

[1] Yu-Feng Li, Shao-Bo Wang and Zhi-Hua Zhou. Graph Quality Judgement: A Large Margin Expedition. In: Proceedings of the 25th International Joint Conference on Artificial Intelligence (IJCAI'16), New York, NY, 2016.
[2] R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang, and C.-J. Lin. LIBLINEAR: A library for large linear classification. Journal of Machine Learning Research 9 (2008), 1871-1874.
License:
MIT
class s3l.data_quality.LEAD.LEAD(C1=1.0, C2=0.01)

Bases: s3l.base.TransductiveEstimatorwithGraph

Parameters:
  • C1 (float (default=1.0)) – weight for the hinge loss of labeled instances. It was set to 1 in the paper [1].
  • C2 (float (default=0.01)) – weight for the hinge loss of unlabeled instances. It was set to 0.01 in the paper [1].
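
To illustrate how the two weights trade off, here is a minimal sketch of a weighted hinge loss. It is not the library's implementation: the helper name weighted_hinge_loss is hypothetical, the regularization term is omitted, and the unlabeled term is assumed to take the S3VM form, where the margin of an unlabeled instance is the absolute value of its decision score.

```python
def hinge(margin):
    """Standard hinge loss max(0, 1 - margin)."""
    return max(0.0, 1.0 - margin)


def weighted_hinge_loss(scores, labels, labeled_idxs, unlabeled_idxs,
                        C1=1.0, C2=0.01):
    """Hypothetical helper: weighted sum of hinge losses.

    scores -- real-valued decision values, one per instance
    labels -- +1/-1 for labeled instances (other entries are ignored)
    """
    # Labeled instances: the margin is y_i * f(x_i).
    labeled = sum(hinge(labels[i] * scores[i]) for i in labeled_idxs)
    # Unlabeled instances (assumption): S3VM-style margin |f(x_j)|, i.e. the
    # hinge loss under the better of the two possible label assignments.
    unlabeled = sum(hinge(abs(scores[j])) for j in unlabeled_idxs)
    return C1 * labeled + C2 * unlabeled


scores = [2.0, -0.5, 0.2]   # toy decision values
labels = [1, -1, 0]         # the third instance is unlabeled
loss = weighted_hinge_loss(scores, labels,
                           labeled_idxs=[0, 1], unlabeled_idxs=[2])
```

With the default weights, a margin violation on a labeled instance costs 100 times more than the same violation on an unlabeled instance.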

Examples

>>> from s3l.data_quality.LEAD import LEAD
>>> from s3l.datasets import data_manipulate, base
>>> X, y = base.load_covtype(True)
>>> W = base.load_graph_covtype(True)
>>> _, test_idxs, labeled_idxs, unlabeled_idxs = \
...     data_manipulate.inductive_split(X=X, y=y)
>>> lead = LEAD(C1=1.0, C2=0.01)
>>> lead.fit(X, y, labeled_idxs, W)
>>> lead.predict(unlabeled_idxs)
[1,-1,-1,1,1...,1]

References

LEAD implements the LEAD algorithm in [1].

LEAD employs the Python version of liblinear [2] (available at http://www.csie.ntu.edu.tw/~cjlin/liblinear/).

fit(gssl_value, label, l_ind, W)

Given the predictions from graph-based semi-supervised learning (GSSL), the fit method trains a large-margin model that judges the quality of those predictions.

Parameters:
  • gssl_value (array-like) – a matrix of size n * T, where n is the number of instances and T is the number of graphs that GSSL takes. Each row is the set of predictive values for one instance.
  • label (array-like) – a binary column vector of length n. Each element is +1 or -1 for labeled instances. For unlabeled instances, the entries can be used to compute accuracy if the ground truth is available.
  • l_ind (array-like) – a row vector of length l, where l is the number of labeled instances. Each element is the index of a labeled instance.
  • W (matrix) – affinity matrix in sparse form. The labeled instances should occupy the top-left corner.
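
The shape requirements above can be checked before calling fit. The following is a minimal sketch using plain Python lists; check_fit_inputs is a hypothetical helper that is not part of s3l, and it does not validate W.

```python
def check_fit_inputs(gssl_value, label, l_ind):
    """Hypothetical helper: validate the shapes described for fit.

    Returns (n, T), the number of instances and the number of graphs.
    """
    n = len(gssl_value)
    if n == 0:
        raise ValueError("gssl_value must contain at least one instance")
    T = len(gssl_value[0])
    # Every row must hold one predictive value per graph.
    if any(len(row) != T for row in gssl_value):
        raise ValueError("all rows of gssl_value must have length T")
    if len(label) != n:
        raise ValueError("label must have length n")
    for i in l_ind:
        if not 0 <= i < n:
            raise ValueError("l_ind contains an out-of-range index")
        # Labeled instances must carry a +1 or -1 label.
        if label[i] not in (1, -1):
            raise ValueError("labeled instances must be +1 or -1")
    return n, T


# Toy data: 4 instances, predictions from 2 graphs, first 2 labeled.
gssl_value = [[0.9, 0.7], [-0.8, -0.6], [0.1, -0.3], [-0.4, 0.2]]
label = [1, -1, 0, 0]
l_ind = [0, 1]
shape = check_fit_inputs(gssl_value, label, l_ind)
```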
predict(u_ind, baseline_pred=None)

The predict method replaces unsafe predictions with baseline_pred to improve safeness.

Parameters:
  • u_ind (array-like) – a row vector of length u, where u is the number of unlabeled instances. Each element is the index of an unlabeled instance.
  • baseline_pred (array-like) – a column vector of length n. Each element is the baseline predictive result for the corresponding instance. LEAD replaces the result of S3VM with this value if the instance lies within the margin of S3VM.
Returns:

pred – the predicted labels of all instances, labeled and unlabeled. For labeled instances the prediction is consistent with the true label.

Return type:

a column vector with length n. Each element is a prediction for the corresponding instance.
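
The replacement rule described for baseline_pred can be sketched as follows. This is an illustration under stated assumptions, not the library's code: replace_unsafe is a hypothetical helper, decision_values stands for the real-valued S3VM outputs, and an instance is assumed to be inside the margin when the absolute value of its output is below 1.

```python
def replace_unsafe(decision_values, baseline_pred, u_ind, margin=1.0):
    """Hypothetical helper: fall back to the baseline inside the margin."""
    # Start from the sign of the large-margin decision value.
    pred = [1 if v >= 0 else -1 for v in decision_values]
    for i in u_ind:
        # Assumption: |f(x_i)| < margin marks an unsafe prediction.
        if abs(decision_values[i]) < margin:
            pred[i] = baseline_pred[i]
    return pred


decision_values = [2.3, -1.7, 0.4, -0.2, 1.5]
baseline_pred = [1, -1, -1, 1, -1]
u_ind = [2, 3, 4]            # the first two instances are labeled
safe_pred = replace_unsafe(decision_values, baseline_pred, u_ind)
```

Here instances 2 and 3 fall inside the margin and take the baseline labels, while instance 4 keeps its own prediction.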

set_params(param)

Parameter setting function.

Parameters:
  • param (dict) – parameter names and their corresponding values, in the form {'name': value}.