Package org.apache.iceberg

Interface Summary
Interface: Description
Accessor<T>
AddedRowsScanTask: A scan task for inserts generated by adding a data file to the table.
AppendFiles: API for appending new files in a table.
BatchScan: API for configuring a batch scan.
BlobMetadata: Metadata about a statistics or indices blob.
ChangelogScanTask: A changelog scan task.
ClientPool<C,E extends java.lang.Exception>
ClientPool.Action<R,C,E extends java.lang.Exception>
CombinedScanTask: A scan task made of several ranges from files.
ContentFile<F>: Superinterface of DataFile and DeleteFile that exposes common methods.
ContentScanTask<F extends ContentFile<F>>: A scan task over a range of bytes in a content file.
DataFile: Interface for data files listed in a table manifest.
DataTask: A task that returns data as rows instead of where to read data.
DeletedDataFileScanTask: A scan task for deletes generated by removing a data file from the table.
DeletedRowsScanTask: A scan task for deletes generated by adding delete files to the table.
DeleteFile: Interface for delete files listed in a table delete manifest.
DeleteFiles: API for deleting files from a table.
ExpireSnapshots: API for removing old snapshots from a table.
FileScanTask: A scan task over a range of bytes in a single data file.
GenericPartitionStatisticsFile
HasTableOperations: Used to expose a table's TableOperations.
HistoryEntry: Table history entry.
IncrementalAppendScan: API for configuring an incremental table scan for append-only snapshots.
IncrementalChangelogScan: API for configuring a scan for table changes.
IncrementalScan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>: API for configuring an incremental scan.
LockManager: An interface for locking, used to ensure commit isolation.
ManageSnapshots: API for managing snapshots.
ManifestFile: Represents a manifest file that can be scanned to find files in a table.
ManifestFile.PartitionFieldSummary: Summarizes the values of one partition field stored in a manifest file.
MergeableScanTask<ThisT>: A scan task that can potentially be merged with other scan tasks.
MetadataUpdate: Represents a change to table or view metadata.
MetricsModes.MetricsMode: A metrics calculation mode.
MetricsUtil.ReadableMetricColDefinition.MetricFunction
MetricsUtil.ReadableMetricColDefinition.TypeFunction
OverwriteFiles: API for overwriting files in a table.
PartitionScanTask: A scan task for data within a particular partition.
PartitionStatisticsFile: Represents a partition statistics file that can be used to read table data more efficiently.
PendingUpdate<T>: API for table metadata changes.
PositionDeletesScanTask: A ScanTask for position delete files.
ReplacePartitions: API for overwriting files in a table by partition.
ReplaceSortOrder: API for replacing table sort order with a newly created order.
RewriteFiles: API for replacing files in a table.
RewriteManifests: API for rewriting manifests for a table.
RowDelta: API for encoding row-level changes to a table.
Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>: Scan objects are immutable and can be shared between threads.
ScanTask: A scan task.
ScanTaskGroup<T extends ScanTask>: A scan task that may include partial input files, multiple input files, or both.
Snapshot: A snapshot of the data in a table at a point in time.
SnapshotUpdate<ThisT>: API for table changes that produce snapshots.
SortOrderBuilder<R>: Methods for building a sort order.
SplittableScanTask<ThisT>: A scan task that can be split into smaller scan tasks.
StatisticsFile: Represents a statistics file in the Puffin format that can be used to read table data more efficiently.
StructLike: Interface for accessing data by position in a schema.
Table: Represents a table.
TableOperations: SPI interface to abstract table metadata access and updates.
Tables: Generic interface for creating and loading a table implementation.
TableScan: API for configuring a table scan.
Transaction: A transaction for performing multiple updates to a table.
UpdateLocation: API for setting a table's or view's base location.
UpdatePartitionSpec: API for partition spec evolution.
UpdatePartitionStatistics: API for updating partition statistics files in a table.
UpdateProperties: API for updating table properties.
UpdateRequirement: Represents a requirement for a MetadataUpdate.
UpdateSchema: API for schema evolution.
UpdateStatistics: API for updating statistics files in a table.

Class Summary
Class: Description
Accessors: Position2Accessor and Position3Accessor here are an optimization.
AllDataFilesTable: A Table implementation that exposes a table's valid data files as rows.
AllDataFilesTable.AllDataFilesTableScan
AllDeleteFilesTable: A Table implementation that exposes its valid delete files as rows.
AllDeleteFilesTable.AllDeleteFilesTableScan
AllEntriesTable: A Table implementation that exposes a table's manifest entries as rows, for both delete and data files.
AllFilesTable: A Table implementation that exposes its valid files as rows.
AllFilesTable.AllFilesTableScan
AllManifestsTable: A Table implementation that exposes a table's valid manifest files as rows.
AllManifestsTable.AllManifestsTableScan
BaseCombinedScanTask
BaseFileScanTask
BaseMetadataTable: Base class for metadata tables.
BaseMetastoreCatalog
BaseMetastoreTableOperations
BaseOverwriteFiles
BaseReplacePartitions
BaseReplaceSortOrder
BaseRewriteManifests
BaseScanTaskGroup<T extends ScanTask>
BaseTable: Base Table implementation.
BaseTransaction
CachingCatalog: Class that wraps an Iceberg Catalog to cache tables.
CatalogProperties
CatalogUtil
ChangelogUtil
ClientPoolImpl<C,E extends java.lang.Exception>
DataFiles
DataFiles.Builder
DataFilesTable: A Table implementation that exposes a table's data files as rows.
DataFilesTable.DataFilesTableScan
DataOperations: Data operations that produce snapshots.
DataTableScan
DeleteFilesTable: A Table implementation that exposes a table's delete files as rows.
DeleteFilesTable.DeleteFilesTableScan
DoubleFieldMetrics: Iceberg internally tracked field level metrics, used by Parquet and ORC writers only.
DoubleFieldMetrics.Builder
EnvironmentContext
FieldMetrics<T>: Iceberg internally tracked field level metrics.
FileMetadata
FileMetadata.Builder
Files
FileScanTaskParser
FilesTable: A Table implementation that exposes a table's files as rows.
FilesTable.FilesTableScan
FindFiles
FindFiles.Builder
FloatFieldMetrics: Iceberg internally tracked field level metrics, used by Parquet and ORC writers only.
FloatFieldMetrics.Builder
GenericBlobMetadata
GenericManifestFile
GenericManifestFile.CopyBuilder
GenericPartitionFieldSummary
GenericStatisticsFile
GuavaClasses
HistoryTable: A Table implementation that exposes a table's history as rows.
IcebergBuild: Loads iceberg-version.properties with build information.
LocationProviders
ManifestEntriesTable: A Table implementation that exposes a table's manifest entries as rows, for both delete and data files.
ManifestFiles
ManifestReader<F extends ContentFile<F>>: Base reader for data and delete manifest files.
ManifestsTable: A Table implementation that exposes a table's manifest files as rows.
ManifestWriter<F extends ContentFile<F>>: Writer for manifest files.
MetadataColumns
MetadataLogEntriesTable
MetadataTableUtils
MetadataUpdate.AddPartitionSpec
MetadataUpdate.AddSchema
MetadataUpdate.AddSnapshot
MetadataUpdate.AddSortOrder
MetadataUpdate.AddViewVersion
MetadataUpdate.AssignUUID
MetadataUpdate.RemovePartitionStatistics
MetadataUpdate.RemoveProperties
MetadataUpdate.RemoveSnapshot
MetadataUpdate.RemoveSnapshotRef
MetadataUpdate.RemoveStatistics
MetadataUpdate.SetCurrentSchema
MetadataUpdate.SetCurrentViewVersion
MetadataUpdate.SetDefaultPartitionSpec
MetadataUpdate.SetDefaultSortOrder
MetadataUpdate.SetLocation
MetadataUpdate.SetPartitionStatistics
MetadataUpdate.SetProperties
MetadataUpdate.SetSnapshotRef
MetadataUpdate.SetStatistics
MetadataUpdate.UpgradeFormatVersion
MetadataUpdateParser
Metrics: Iceberg file format metrics.
MetricsConfig
MetricsModes: This class defines different metrics modes, which allow users to control the collection of value_counts, null_value_counts, nan_value_counts, lower_bounds, and upper_bounds for different columns in metadata.
MetricsModes.Counts: Under this mode, only value_counts, null_value_counts, and nan_value_counts are persisted.
MetricsModes.Full: Under this mode, value_counts, null_value_counts, nan_value_counts, and full lower_bounds and upper_bounds are persisted.
MetricsModes.None: Under this mode, value_counts, null_value_counts, nan_value_counts, lower_bounds, and upper_bounds are not persisted.
MetricsModes.Truncate: Under this mode, value_counts, null_value_counts, nan_value_counts, and truncated lower_bounds and upper_bounds are persisted.
MetricsUtil
MetricsUtil.ReadableColMetricsStruct: A struct of readable metric values for a primitive column.
MetricsUtil.ReadableMetricColDefinition: Fixed definition of a readable metric column, i.e. a mapping of a raw metric to a readable metric.
MetricsUtil.ReadableMetricsStruct: A struct consisting of all MetricsUtil.ReadableColMetricsStruct for all primitive columns of the table.
MicroBatches
MicroBatches.MicroBatch
MicroBatches.MicroBatchBuilder
PartitionData
PartitionField: Represents a single field in a PartitionSpec.
Partitioning
PartitionKey: A struct of partition values.
PartitionSpec: Represents how to produce partition data for a table.
PartitionSpec.Builder: Used to create valid partition specs.
PartitionSpecParser
PartitionsTable: A Table implementation that exposes a table's partitions as rows.
PartitionStatisticsFileParser
PositionDeletesTable: A Table implementation whose Scan provides PositionDeletesScanTask, for reading of position delete files.
PositionDeletesTable.PositionDeletesBatchScan
ReachableFileUtil
RefsTable: A Table implementation that exposes a table's known snapshot references as rows.
RollingManifestWriter<F extends ContentFile<F>>: As opposed to ManifestWriter, a rolling writer could produce multiple manifest files.
ScanSummary
ScanSummary.Builder
ScanSummary.PartitionMetrics
Schema: The schema of a data table.
SchemaParser
SerializableTable: A read-only serializable table that can be sent to other nodes in a cluster.
SerializableTable.SerializableMetadataTable
SetLocation
SetPartitionStatistics
SetStatistics
SingleValueParser
SnapshotIdGeneratorUtil
SnapshotManager
SnapshotParser
SnapshotRef
SnapshotRef.Builder
SnapshotRefParser
SnapshotScan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>: This is a common base class to share code between different BaseScan implementations that handle scans of a particular snapshot.
SnapshotsTable: A Table implementation that exposes a table's known snapshots as rows.
SnapshotSummary
SnapshotSummary.Builder
SortField: A field in a SortOrder.
SortKey: A struct of flattened sort field values.
SortOrder: A sort order that defines how data and delete files should be ordered in a table.
SortOrder.Builder: A builder used to create valid sort orders.
SortOrderComparators
SortOrderParser
SparkDistributedDataScan: A batch data scan that can utilize Spark cluster resources for planning.
StaticTableOperations: TableOperations implementation that provides access to metadata for a Table at some point in time, using a table metadata location.
StatisticsFileParser
StreamingDelete: Delete implementation that avoids loading full manifests in memory.
SystemConfigs: Configuration properties that are controlled by Java system properties or environment variables.
SystemConfigs.ConfigEntry<T>
SystemProperties: Deprecated. Use SystemConfigs instead; will be removed in 2.0.0.
TableMetadata: Metadata for a table.
TableMetadata.Builder
TableMetadata.MetadataLogEntry
TableMetadata.SnapshotLogEntry
TableMetadataParser
TableProperties
Transactions
UnboundPartitionSpec
UnboundSortOrder
UpdateRequirement.AssertCurrentSchemaID
UpdateRequirement.AssertDefaultSortOrderID
UpdateRequirement.AssertDefaultSpecID
UpdateRequirement.AssertLastAssignedFieldId
UpdateRequirement.AssertLastAssignedPartitionId
UpdateRequirement.AssertRefSnapshotID
UpdateRequirement.AssertTableDoesNotExist
UpdateRequirement.AssertTableUUID
UpdateRequirement.AssertViewUUID
UpdateRequirementParser
UpdateRequirements
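
The four MetricsModes entries in the class summary (None, Counts, Truncate, Full) differ only in which column metrics they persist. As a rough illustration, the sketch below restates those descriptions as a lookup. The names MetricsModeSketch, Mode, and persistedMetrics are hypothetical and invented for this example; they are not part of org.apache.iceberg, and the sketch does not show how Iceberg implements the modes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: this is NOT the org.apache.iceberg.MetricsModes API.
// It restates, as data, which metrics each mode persists according to the summary.
class MetricsModeSketch {

    // Invented enum mirroring the four modes named in the class summary.
    enum Mode { NONE, COUNTS, TRUNCATE, FULL }

    // The three count metrics named in the mode descriptions.
    static final List<String> COUNT_METRICS =
        List.of("value_counts", "null_value_counts", "nan_value_counts");

    // Returns labels for the metrics persisted under the given mode.
    static List<String> persistedMetrics(Mode mode) {
        List<String> persisted = new ArrayList<>();
        if (mode == Mode.NONE) {
            return persisted; // nothing is persisted
        }
        persisted.addAll(COUNT_METRICS); // COUNTS, TRUNCATE, and FULL all keep the counts
        if (mode == Mode.TRUNCATE) {
            persisted.add("lower_bounds (truncated)");
            persisted.add("upper_bounds (truncated)");
        } else if (mode == Mode.FULL) {
            persisted.add("lower_bounds (full)");
            persisted.add("upper_bounds (full)");
        }
        return persisted;
    }

    public static void main(String[] args) {
        // Print the persisted metrics for every mode.
        for (Mode mode : Mode.values()) {
            System.out.println(mode + " -> " + persistedMetrics(mode));
        }
    }
}
```

Running main prints one line per mode, which makes it easy to compare the modes side by side; for the actual classes and their configuration, consult the MetricsModes and MetricsConfig Javadoc.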

Enum Summary
Enum: Description
BaseMetastoreTableOperations.CommitStatus
ChangelogOperation: An enum representing possible operations in a changelog.
DistributionMode: Enum of supported write distribution modes; it defines the write behavior of a batch or streaming job.
FileContent: Content type stored in a file, one of DATA, POSITION_DELETES, or EQUALITY_DELETES.
FileFormat: Enum of supported file formats.
IsolationLevel: An isolation level in a table.
ManifestContent: Content type stored in a manifest file, either DATA or DELETES.
ManifestReader.FileType
MetadataTableType
NullOrder
PlanningMode
RewriteJobOrder: Enum of supported rewrite job orders; it defines the order in which the file groups should be written.
RowLevelOperationMode: Iceberg supports two ways to modify records in a table: copy-on-write and merge-on-read.
SortDirection
TableMetadataParser.Codec