package noether
Type Members
- case class AUC(metric: AUCMetric, samples: Int = 100) extends Aggregator[Prediction[Boolean, Double], MetricCurve, Double] with Product with Serializable
Compute the "Area Under the Curve" for a collection of predictions. Uses the Trapezoid method to compute the area.
Internally a linspace is defined using the given number of samples. Each point in the linspace represents a threshold which is used to build a confusion matrix. The area is then defined on this list of confusion matrices.
The AUCMetric given to the aggregator selects the function to apply to each confusion matrix before the area is computed.
- metric
Which function to apply on the confusion matrix.
- samples
Number of samples to use for the curve definition.
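The threshold sweep and trapezoid rule described above can be sketched as follows. This is a minimal, self-contained illustration of the ROC case and not the library's implementation: `Pred`, `rocPoint` and `rocAuc` are hypothetical names, and treating a score strictly greater than the threshold as a positive prediction is an assumption.

```scala
// Hypothetical stand-in for Prediction[Boolean, Double]; not the library type.
final case class Pred(actual: Boolean, score: Double)

// ROC point (false positive rate, true positive rate) at one threshold.
// Assumption: a score strictly greater than the threshold predicts "true".
def rocPoint(preds: Seq[Pred], threshold: Double): (Double, Double) = {
  val tp = preds.count(p => p.actual && p.score > threshold).toDouble
  val fp = preds.count(p => !p.actual && p.score > threshold).toDouble
  val pos = preds.count(_.actual).toDouble
  val neg = preds.size - pos
  (if (neg == 0) 0.0 else fp / neg, if (pos == 0) 0.0 else tp / pos)
}

// Area under the ROC curve using the trapezoid rule.
def rocAuc(preds: Seq[Pred], samples: Int = 100): Double = {
  // Evenly spaced thresholds from 1.0 down to 0.0, so that the
  // false positive rate (the x axis) never decreases along the sweep.
  val thresholds = (samples to 0 by -1).map(_.toDouble / samples)
  val points = thresholds.map(rocPoint(preds, _))
  points.sliding(2).collect { case Seq((x0, y0), (x1, y1)) =>
    (x1 - x0) * (y0 + y1) / 2.0
  }.sum
}
```

A larger `samples` value gives a finer threshold grid, at the cost of one confusion matrix per threshold.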
- sealed trait AUCMetric extends AnyRef
Which function to apply on the list of confusion matrices prior to the AUC calculation.
- case class BinaryConfusionMatrix(threshold: Double = 0.5) extends Aggregator[Prediction[Boolean, Double], Map[(Int, Int), Long], DenseMatrix[Long]] with Product with Serializable
Special Case for a Binary Confusion Matrix to make it easier to compose with other binary aggregators
- threshold
Threshold to apply on predictions
- final case class CalibrationHistogram(lowerBound: Double = 0.0, upperBound: Double = 1.0, numBuckets: Int = 10) extends Aggregator[Prediction[Double, Double], Map[Double, (Double, Double, Long)], List[CalibrationHistogramBucket]] with Product with Serializable
Split predictions into Tensorflow Model Analysis compatible CalibrationHistogramBucket buckets.
If a prediction is less than the lower bound, it belongs to the bucket (-inf, lower bound). If it is greater than or equal to the upper bound, it belongs to the bucket [upper bound, inf).
- lowerBound
Left boundary, inclusive
- upperBound
Right boundary, exclusive
- numBuckets
Number of buckets in the histogram
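The bucketing rule can be illustrated with a short sketch. It is an assumption-laden example rather than the library's code: `bucketFor` is a hypothetical helper, and it assumes `numBuckets` equal-width buckets between the bounds, with the underflow and overflow buckets counted separately.

```scala
// Returns the [lower, upper) range of the bucket a prediction falls into.
// Assumption: numBuckets equal-width buckets span [lowerBound, upperBound),
// plus one underflow and one overflow bucket outside that range.
def bucketFor(
    prediction: Double,
    lowerBound: Double = 0.0,
    upperBound: Double = 1.0,
    numBuckets: Int = 10
): (Double, Double) = {
  val width = (upperBound - lowerBound) / numBuckets
  if (prediction < lowerBound) (Double.NegativeInfinity, lowerBound)
  else if (prediction >= upperBound) (upperBound, Double.PositiveInfinity)
  else {
    // Clamp the index to guard against floating point rounding at the top edge.
    val i = math.min(((prediction - lowerBound) / width).toInt, numBuckets - 1)
    (lowerBound + i * width, lowerBound + (i + 1) * width)
  }
}
```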
- final case class CalibrationHistogramBucket(lowerThresholdInclusive: Double, upperThresholdExclusive: Double, numPredictions: Double, sumLabels: Double, sumPredictions: Double) extends Product with Serializable
Histogram bucket.
- lowerThresholdInclusive
Lower bound on bucket, inclusive
- upperThresholdExclusive
Upper bound on bucket, exclusive
- numPredictions
Number of predictions in this bucket
- sumLabels
Sum of label values for this bucket
- sumPredictions
Sum of prediction values for this bucket
- final case class ClassificationReport(threshold: Double = 0.5, beta: Double = 1.0) extends Aggregator[Prediction[Boolean, Double], Map[(Int, Int), Long], Report] with Product with Serializable
Generate a Classification Report for a collection of binary predictions. The output of this aggregator will be a Report object.
- threshold
Threshold to apply to get the predictions.
- beta
Beta parameter used in the f-score calculation.
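To show how the beta parameter weights precision against recall, here is a small sketch of the standard F-beta formula computed from binary confusion counts. `Counts`, `precision`, `recall` and `fScore` are hypothetical names for illustration, and returning zero for an empty denominator is an assumption, not documented library behavior.

```scala
// Hypothetical container for binary confusion counts.
final case class Counts(tp: Long, fp: Long, fn: Long, tn: Long)

def precision(c: Counts): Double =
  if (c.tp + c.fp == 0) 0.0 else c.tp.toDouble / (c.tp + c.fp)

def recall(c: Counts): Double =
  if (c.tp + c.fn == 0) 0.0 else c.tp.toDouble / (c.tp + c.fn)

// F-beta = (1 + beta^2) * P * R / (beta^2 * P + R).
// beta > 1 favors recall, beta < 1 favors precision.
def fScore(c: Counts, beta: Double = 1.0): Double = {
  val p = precision(c)
  val r = recall(c)
  val b2 = beta * beta
  val denom = b2 * p + r
  if (denom == 0.0) 0.0 else (1 + b2) * p * r / denom
}
```

With `beta = 1.0` this reduces to the harmonic mean of precision and recall.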
-
final
case class
ConfusionMatrix(labels: Seq[Int]) extends Aggregator[Prediction[Int, Int], Map[(Int, Int), Long], DenseMatrix[Long]] with Product with Serializable
Generic Confusion Matrix Aggregator for any dimension. Thresholds must be applied to make a prediction prior to using this aggregator.
- labels
List of possible label values
- case class Curve(metric: AUCMetric, samples: Int = 100) extends Aggregator[Prediction[Boolean, Double], MetricCurve, MetricCurvePoints] with Product with Serializable
Compute a series of points for a collection of predictions.
Internally a linspace is defined using the given number of samples. Each point in the linspace represents a threshold which is used to build a confusion matrix. The (x, y) coordinates of each point on the curve are then returned.
The AUCMetric given to the aggregator selects the function to apply to each confusion matrix when computing the curve points.
- metric
Which function to apply on the confusion matrix.
- samples
Number of samples to use for the curve definition.
- case class MeanAveragePrecision[T]() extends Aggregator[RankingPrediction[T], (Double, Long), Double] with Product with Serializable
Returns the mean average precision (MAP) of all the predictions. If a query has an empty ground truth set, the average precision will be zero.
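As an illustration of the metric, the sketch below computes average precision per query and then the mean over queries. It is not the library's implementation: `averagePrecision` and `meanAveragePrecision` are hypothetical helpers, and normalizing by the size of the ground truth set is an assumption (a common convention).

```scala
// Average precision for one query: the mean of precision values taken at
// each rank where a relevant item appears.
// Assumption: the sum is normalized by the size of the ground truth set.
def averagePrecision[T](relevant: Set[T], ranked: Seq[T]): Double =
  if (relevant.isEmpty) 0.0 // empty ground truth yields zero, as documented
  else {
    var hits = 0
    var sum = 0.0
    ranked.zipWithIndex.foreach { case (item, i) =>
      if (relevant.contains(item)) {
        hits += 1
        sum += hits.toDouble / (i + 1)
      }
    }
    sum / relevant.size
  }

// Mean of the per-query average precision values.
def meanAveragePrecision[T](queries: Seq[(Set[T], Seq[T])]): Double =
  if (queries.isEmpty) 0.0
  else queries.map { case (rel, ranked) => averagePrecision(rel, ranked) }.sum / queries.size
```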
- case class MetricCurve(cm: Array[Map[(Int, Int), Long]]) extends Serializable with Product
- case class MetricCurvePoint(x: Double, y: Double) extends Serializable with Product
- case class MetricCurvePoints(points: Array[MetricCurvePoint]) extends Serializable with Product
- final case class MultiClassificationReport(labels: Seq[Int], beta: Double = 1.0) extends Aggregator[Prediction[Int, Int], Map[(Int, Int), Long], Map[Int, Report]] with Product with Serializable
Generate a Classification Report for a collection of multiclass predictions. A report is generated for each class by treating the predictions as binary of either "class" or "not class". The output of this aggregator will be a map of classes and their Report objects.
- labels
List of possible label values.
- beta
Beta parameter used in the f-score calculation.
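The "class" versus "not class" reduction can be sketched as follows. This is an illustrative example only: `BinaryCounts`, `oneVsRest` and `perClass` are hypothetical names, and the input is assumed to be plain (actual, predicted) pairs rather than the library's types.

```scala
// Hypothetical container for the binary counts of a single class.
final case class BinaryCounts(tp: Long, fp: Long, fn: Long, tn: Long)

// Collapse multiclass (actual, predicted) pairs into "label" vs "not label".
def oneVsRest(pairs: Seq[(Int, Int)], label: Int): BinaryCounts = {
  val tp = pairs.count { case (a, p) => a == label && p == label }
  val fp = pairs.count { case (a, p) => a != label && p == label }
  val fn = pairs.count { case (a, p) => a == label && p != label }
  val tn = pairs.count { case (a, p) => a != label && p != label }
  BinaryCounts(tp, fp, fn, tn)
}

// One set of binary counts per class, from which a per-class report can be derived.
def perClass(pairs: Seq[(Int, Int)], labels: Seq[Int]): Map[Int, BinaryCounts] =
  labels.map(l => l -> oneVsRest(pairs, l)).toMap
```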
- case class NdcgAtK[T](k: Int) extends Aggregator[RankingPrediction[T], (Double, Long), Double] with Product with Serializable
Compute the average NDCG value of all the predictions, truncated at ranking position k. The discounted cumulative gain at position k is computed as $\sum_{i=1}^{k} (2^{\text{relevance of the } i\text{th item}} - 1) / \log(i + 1)$, and the NDCG is obtained by dividing it by the DCG value of the ground truth set. In the current implementation, the relevance value is binary. If a query has an empty ground truth set, zero will be used as the NDCG.
See the following paper for detail:
IR evaluation methods for retrieving highly relevant documents. K. Jarvelin and J. Kekalainen
- k
Position at which the NDCG is truncated; must be positive.
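The formula above, specialized to binary relevance, can be sketched for a single query as shown below. This is a hedged illustration and not the library's code: `ndcgAtK` is a hypothetical helper, the natural logarithm is an assumed base (the base cancels in the ratio), and the ideal DCG is assumed to place all relevant items first.

```scala
// NDCG at k for one query with binary relevance: a relevant item at
// one-based position i contributes (2^1 - 1) / log(i + 1) = 1 / log(i + 1).
def ndcgAtK[T](relevant: Set[T], ranked: Seq[T], k: Int): Double = {
  require(k > 0, "k must be positive")
  if (relevant.isEmpty) 0.0 // empty ground truth yields zero, as documented
  else {
    // i is zero-based here, so one-based position i + 1 uses log(i + 2).
    def discount(i: Int): Double = 1.0 / math.log(i + 2.0)
    val dcg = ranked.take(k).zipWithIndex.collect {
      case (item, i) if relevant.contains(item) => discount(i)
    }.sum
    // Ideal ordering: every relevant item ranked first, truncated at k.
    val idealDcg = (0 until math.min(k, relevant.size)).map(discount).sum
    dcg / idealDcg
  }
}
```

Averaging this value over all queries gives the aggregate described above.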
- case class PrecisionAtK[T](k: Int) extends Aggregator[RankingPrediction[T], (Double, Long), Double] with Product with Serializable
Compute the average precision of all the predictions, truncated at ranking position k.
If the ranking algorithm returns only n results for a prediction, where n is less than k, the precision value is still computed as #(relevant items retrieved) / k. This formula also applies when the size of the ground truth set is less than k.
If a prediction has an empty ground truth set, zero will be used as the precision.
See the following paper for detail:
IR evaluation methods for retrieving highly relevant documents. K. Jarvelin and J. Kekalainen
- k
Position at which the precision is truncated; must be positive.
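A minimal sketch of the per-query rule described above, dividing by k even when fewer than k results are returned. `precisionAtK` is a hypothetical helper written for illustration, not the library's API.

```scala
// Precision at k for one query: relevant items among the top k results,
// always divided by k (even if fewer than k results were returned).
def precisionAtK[T](relevant: Set[T], ranked: Seq[T], k: Int): Double = {
  require(k > 0, "k must be positive")
  if (relevant.isEmpty) 0.0 // empty ground truth yields zero, as documented
  else ranked.take(k).count(relevant.contains).toDouble / k
}
```

For example, two returned results with one relevant item and `k = 4` give 1 / 4, not 1 / 2.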
- final case class Prediction[L, S](actual: L, predicted: S) extends Product with Serializable
Generic Prediction Object used by most aggregators
- L
Type of the Real Value
- S
Type of the Predicted Value
- actual
Real value for this entry, commonly called the label.
- predicted
Predicted value. Can be a class or a score depending on the aggregator.
- type RankingPrediction[T] = Prediction[Array[T], Array[T]]
- final case class Report(mcc: Double, fscore: Double, precision: Double, recall: Double, accuracy: Double, fpr: Double) extends Product with Serializable
Classification Report
- mcc
- fscore
- precision
- recall
- accuracy
- fpr
Value Members
- object AggregatorExample
- object ErrorRateSummary extends Aggregator[Prediction[Int, List[Double]], (Double, Long), Double] with Product with Serializable
Measurement of what percentage of values were predicted incorrectly.
- object LogLoss extends Aggregator[Prediction[Int, List[Double]], (Double, Long), Double] with Product with Serializable
- object PR extends AUCMetric with Product with Serializable
- object ROC extends AUCMetric with Product with Serializable