Class StrictMetricsEvaluator


  • public class StrictMetricsEvaluator
    extends java.lang.Object
    Evaluates an Expression on a DataFile to test whether all rows in the file match.

    This evaluation is strict: it returns true if all rows in a file must match the expression. For example, if a file's ts column has min X and max Y, this evaluator will return true for ts < Y+1 but not for ts < Y-1.

    Files are passed to eval(ContentFile), which returns true if every row in the file must match the expression and false if the file may contain rows that do not match.
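    The strict rule can be illustrated with a minimal, self-contained sketch. It does not use the Iceberg API: StrictBoundsSketch and its methods are hypothetical names, and the column bounds are passed in directly as long values rather than read from a file's metrics. It also assumes the bounds are present and exact, which a real evaluator cannot take for granted.

```java
// Hypothetical illustration only; not part of the Iceberg API.
final class StrictBoundsSketch {

    // Strict test for "col < literal": every row matches only if the
    // largest value in the file (the upper bound) is below the literal.
    static boolean allRowsLessThan(long upperBound, long literal) {
        return upperBound < literal;
    }

    // Strict test for "col > literal": every row matches only if the
    // smallest value in the file (the lower bound) is above the literal.
    static boolean allRowsGreaterThan(long lowerBound, long literal) {
        return lowerBound > literal;
    }

    public static void main(String[] args) {
        long min = 10L; // plays the role of X
        long max = 20L; // plays the role of Y

        System.out.println(allRowsLessThan(max, max + 1));    // true:  ts < Y+1
        System.out.println(allRowsLessThan(max, max - 1));    // false: ts < Y-1
        System.out.println(allRowsGreaterThan(min, min - 1)); // true:  ts > X-1
    }
}
```

    A false result here means only that the bounds cannot prove every row matches; the file may still contain matching rows.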

    Because of how ORC compares statistics, for float/double columns in ORC files, if the first value in a file is NaN, the file's metrics report NaN for both the upper and lower bound even though the column may contain non-NaN data. In some scenarios, an explicit check for NaN is therefore necessary to avoid including files that may contain rows that don't match.
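    The following sketch shows why such a guard matters. It is a simplified illustration, not Iceberg's actual code: NaNBoundsSketch and its methods are hypothetical names, bounds are passed in as plain double values, and the comparison is assumed to use a total ordering such as Double.compare, which sorts NaN above every other value.

```java
// Hypothetical illustration only; not part of the Iceberg API.
final class NaNBoundsSketch {

    // Strict test for "col > literal" with no NaN handling. Under
    // Double.compare, a NaN lower bound compares greater than any literal,
    // so this wrongly claims that every row matches.
    static boolean unguardedAllRowsGreaterThan(double lowerBound, double literal) {
        return Double.compare(lowerBound, literal) > 0;
    }

    // Same test with an explicit NaN check: a NaN bound carries no usable
    // range information, so the strict answer has to be false.
    static boolean guardedAllRowsGreaterThan(double lowerBound, double literal) {
        if (Double.isNaN(lowerBound)) {
            return false;
        }
        return Double.compare(lowerBound, literal) > 0;
    }

    public static void main(String[] args) {
        double nanBound = Double.NaN; // e.g. bound reported when the first value is NaN

        System.out.println(unguardedAllRowsGreaterThan(nanBound, 5.0)); // true (incorrect)
        System.out.println(guardedAllRowsGreaterThan(nanBound, 5.0));   // false
        System.out.println(guardedAllRowsGreaterThan(6.0, 5.0));        // true
    }
}
```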

    • Constructor Detail

      • StrictMetricsEvaluator

        public StrictMetricsEvaluator(Schema schema,
                                      Expression unbound)
      • StrictMetricsEvaluator

        public StrictMetricsEvaluator(Schema schema,
                                      Expression unbound,
                                      boolean caseSensitive)
    • Method Detail

      • eval

        public boolean eval(ContentFile<?> file)
        Test whether all records within the file match the expression.
        Parameters:
        file - a data file
        Returns:
        false if the file may contain any row that doesn't match the expression, true otherwise.