Class AwsProperties

  • All Implemented Interfaces:
    java.io.Serializable

    public class AwsProperties
    extends java.lang.Object
    implements java.io.Serializable
    See Also:
    Serialized Form
    • Field Detail

      • S3FILEIO_SSE_TYPE

        public static final java.lang.String S3FILEIO_SSE_TYPE
        Type of S3 server-side encryption to use; defaults to S3FILEIO_SSE_TYPE_NONE.

        For more details: https://docs.aws.amazon.com/AmazonS3/latest/dev/serv-side-encryption.html

        See Also:
        Constant Field Values
      • S3FILEIO_SSE_TYPE_NONE

        public static final java.lang.String S3FILEIO_SSE_TYPE_NONE
        No server side encryption.
        See Also:
        Constant Field Values
      • S3FILEIO_SSE_TYPE_KMS

        public static final java.lang.String S3FILEIO_SSE_TYPE_KMS
        S3 SSE-KMS encryption.

        For more details: https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingKMSEncryption.html

        See Also:
        Constant Field Values
      • S3FILEIO_SSE_TYPE_S3

        public static final java.lang.String S3FILEIO_SSE_TYPE_S3
        S3 SSE-S3 encryption.

        For more details: https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingServerSideEncryption.html

        See Also:
        Constant Field Values
      • S3FILEIO_SSE_TYPE_CUSTOM

        public static final java.lang.String S3FILEIO_SSE_TYPE_CUSTOM
        S3 SSE-C encryption.

        For more details: https://docs.aws.amazon.com/AmazonS3/latest/dev/ServerSideEncryptionCustomerKeys.html

        See Also:
        Constant Field Values
      • S3FILEIO_SSE_KEY

        public static final java.lang.String S3FILEIO_SSE_KEY
        If the S3 encryption type is SSE-KMS, the input is a KMS key ID or ARN; if this property is not set, the default key "aws/s3" is used. If the encryption type is SSE-C, the input is a custom base-64 AES256 symmetric key.
        See Also:
        Constant Field Values
      • S3FILEIO_SSE_MD5

        public static final java.lang.String S3FILEIO_SSE_MD5
        If S3 encryption type is SSE-C, input is the base-64 MD5 digest of the secret key. This MD5 must be explicitly passed in by the caller to ensure key integrity.
        See Also:
        Constant Field Values
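
The two SSE-C inputs described above (a base-64 AES256 key and the base-64 MD5 digest of that key) can be derived from each other with the standard library alone. The following is a minimal sketch, not the library's own code: the class and method names are hypothetical, and it assumes the digest is taken over the raw 32 key bytes rather than over the base-64 text.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical helper, for illustration only.
final class SseCustomerKeyDigest {
  private SseCustomerKeyDigest() {}

  /** Returns the base-64 encoding of the MD5 digest of the given bytes. */
  static String md5Base64(byte[] data) {
    try {
      MessageDigest md5 = MessageDigest.getInstance("MD5");
      return Base64.getEncoder().encodeToString(md5.digest(data));
    } catch (NoSuchAlgorithmException e) {
      // Every Java platform implementation is required to provide MD5.
      throw new IllegalStateException("MD5 is not available", e);
    }
  }

  /** Derives the MD5 value from a base-64 encoded 256-bit key. */
  static String md5ForBase64Key(String base64Key) {
    byte[] rawKey = Base64.getDecoder().decode(base64Key);
    if (rawKey.length != 32) {
      throw new IllegalArgumentException(
          "An AES256 key must be 32 bytes, got " + rawKey.length);
    }
    // Assumption: the digest covers the raw key bytes, not the base-64 string.
    return md5Base64(rawKey);
  }
}
```

The result of `md5ForBase64Key` is the kind of value that would be supplied for S3FILEIO_SSE_MD5 alongside the key supplied for S3FILEIO_SSE_KEY.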
      • GLUE_CATALOG_ID

        public static final java.lang.String GLUE_CATALOG_ID
        The ID of the Glue Data Catalog where the tables reside. If none is provided, Glue automatically uses the caller's AWS account ID by default.

        For more details, see https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-catalog-databases.html

        See Also:
        Constant Field Values
      • GLUE_ACCOUNT_ID

        public static final java.lang.String GLUE_ACCOUNT_ID
        The account ID used in a Glue resource ARN, e.g. arn:aws:glue:us-east-1:100000000000:table/db1/table1
        See Also:
        Constant Field Values
      • GLUE_CATALOG_SKIP_ARCHIVE

        public static final java.lang.String GLUE_CATALOG_SKIP_ARCHIVE
        Whether Glue should skip archiving the old table version when a commit creates a new version. By default, Glue archives every old table version after an UpdateTable call, but the number of archived table versions is capped by a default limit (which can be increased). For streaming use cases with many commits, setting this value to true is recommended.
        See Also:
        Constant Field Values
      • GLUE_CATALOG_SKIP_ARCHIVE_DEFAULT

        public static final boolean GLUE_CATALOG_SKIP_ARCHIVE_DEFAULT
        See Also:
        Constant Field Values
      • GLUE_CATALOG_SKIP_NAME_VALIDATION

        public static final java.lang.String GLUE_CATALOG_SKIP_NAME_VALIDATION
        Whether Glue should skip name validation. It is recommended to follow the Glue best practices in https://docs.aws.amazon.com/athena/latest/ug/glue-best-practices.html to make sure operations are Hive compatible. This option exists only for users with existing conventions that use non-standard characters. When database name and table name validation is skipped, there is no guarantee that all downstream systems will support the names.
        See Also:
        Constant Field Values
      • GLUE_CATALOG_SKIP_NAME_VALIDATION_DEFAULT

        public static final boolean GLUE_CATALOG_SKIP_NAME_VALIDATION_DEFAULT
        See Also:
        Constant Field Values
      • GLUE_LAKEFORMATION_ENABLED

        public static final java.lang.String GLUE_LAKEFORMATION_ENABLED
        If set, GlueCatalog will use Lake Formation for access control. For more credential vending details, see: https://docs.aws.amazon.com/lake-formation/latest/dg/api-overview.html. If enabled, the AwsClientFactory implementation must be LakeFormationAwsClientFactory or any class that extends it.
        See Also:
        Constant Field Values
      • GLUE_LAKEFORMATION_ENABLED_DEFAULT

        public static final boolean GLUE_LAKEFORMATION_ENABLED_DEFAULT
        See Also:
        Constant Field Values
      • S3FILEIO_MULTIPART_UPLOAD_THREADS

        public static final java.lang.String S3FILEIO_MULTIPART_UPLOAD_THREADS
        Number of threads to use for uploading parts to S3 (shared pool across all output streams); defaults to Runtime.availableProcessors().
        See Also:
        Constant Field Values
      • S3FILEIO_MULTIPART_SIZE

        public static final java.lang.String S3FILEIO_MULTIPART_SIZE
        The size of a single part for multipart upload requests in bytes (default: 32MB). Based on the S3 requirement, the part size must be at least 5MB. To ensure performance of the reader and writer, the part size must be less than 2GB.

        For more details, see https://docs.aws.amazon.com/AmazonS3/latest/dev/qfacts.html

        See Also:
        Constant Field Values
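
The bounds in the description above (32MB default, 5MB minimum, less than 2GB) can be expressed as a small range check. This is a sketch under stated assumptions rather than the class's actual validation: the helper name is hypothetical, and "MB" and "GB" are assumed to mean binary units (1024 * 1024 bytes and 2^31 bytes).

```java
// Hypothetical helper, for illustration only.
final class MultipartSizeCheck {
  // Assumption: MB means 1024 * 1024 bytes.
  static final int MIN_PART_SIZE_BYTES = 5 * 1024 * 1024;
  static final int DEFAULT_PART_SIZE_BYTES = 32 * 1024 * 1024;
  // Assumption: "less than 2GB" means strictly below 2^31 bytes.
  static final long UPPER_BOUND_BYTES = 1L << 31;

  private MultipartSizeCheck() {}

  /** Parses a part size in bytes, falling back to the default when unset. */
  static int parsePartSize(String value) {
    if (value == null) {
      return DEFAULT_PART_SIZE_BYTES;
    }
    long size = Long.parseLong(value.trim());
    if (size < MIN_PART_SIZE_BYTES) {
      throw new IllegalArgumentException(
          "Part size " + size + " is below the 5MB minimum");
    }
    if (size >= UPPER_BOUND_BYTES) {
      throw new IllegalArgumentException(
          "Part size " + size + " must be less than 2GB");
    }
    // Safe cast: the value is known to fit in an int at this point.
    return (int) size;
  }
}
```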
      • S3FILEIO_MULTIPART_SIZE_DEFAULT

        public static final int S3FILEIO_MULTIPART_SIZE_DEFAULT
        See Also:
        Constant Field Values
      • S3FILEIO_MULTIPART_SIZE_MIN

        public static final int S3FILEIO_MULTIPART_SIZE_MIN
        See Also:
        Constant Field Values
      • S3FILEIO_MULTIPART_THRESHOLD_FACTOR

        public static final java.lang.String S3FILEIO_MULTIPART_THRESHOLD_FACTOR
        The threshold expressed as a factor times the multipart size at which to switch from uploading using a single put object request to uploading using multipart upload (default: 1.5).
        See Also:
        Constant Field Values
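
As a worked example of the threshold factor: with the documented defaults (a 32MB part size and a factor of 1.5), the switch to multipart upload happens at 48MB. The sketch below only illustrates that arithmetic. The names are hypothetical, and whether the real writer switches at or just above the threshold is an assumption here.

```java
// Hypothetical helper, for illustration only.
final class MultipartThreshold {
  private MultipartThreshold() {}

  /** Threshold in bytes: the part size multiplied by the factor. */
  static long thresholdBytes(int partSizeBytes, double factor) {
    return (long) (partSizeBytes * factor);
  }

  /**
   * Assumption: multipart upload is used once the number of bytes written
   * reaches the threshold; the exact boundary in the real writer may differ.
   */
  static boolean useMultipart(long bytesWritten, int partSizeBytes, double factor) {
    return bytesWritten >= thresholdBytes(partSizeBytes, factor);
  }
}
```

For instance, `thresholdBytes(32 * 1024 * 1024, 1.5)` evaluates to 50331648 bytes, which is 48MB.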
      • S3FILEIO_MULTIPART_THRESHOLD_FACTOR_DEFAULT

        public static final double S3FILEIO_MULTIPART_THRESHOLD_FACTOR_DEFAULT
        See Also:
        Constant Field Values
      • S3FILEIO_STAGING_DIRECTORY

        public static final java.lang.String S3FILEIO_STAGING_DIRECTORY
        Location to put staging files for upload to S3; defaults to the temp directory set in java.io.tmpdir.
        See Also:
        Constant Field Values
      • S3FILEIO_ACL

        public static final java.lang.String S3FILEIO_ACL
        Used to configure canned access control list (ACL) for S3 client to use during write. If not set, ACL will not be set for requests.

        The input must be one of ObjectCannedACL, such as 'public-read-write'. For more details: https://docs.aws.amazon.com/AmazonS3/latest/dev/acl-overview.html

        See Also:
        Constant Field Values
      • S3FILEIO_ENDPOINT

        public static final java.lang.String S3FILEIO_ENDPOINT
        Configure an alternative endpoint of the S3 service for S3FileIO to access.

        This could be used to use S3FileIO with any s3-compatible object storage service that has a different endpoint, or access a private S3 endpoint in a virtual private cloud.

        See Also:
        Constant Field Values
      • S3FILEIO_PATH_STYLE_ACCESS

        public static final java.lang.String S3FILEIO_PATH_STYLE_ACCESS
        If set to true, requests to S3FileIO will use path-style access; otherwise, virtual-hosted-style access will be used.

        For more details: https://docs.aws.amazon.com/AmazonS3/latest/userguide/VirtualHosting.html

        See Also:
        Constant Field Values
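
The difference between the two addressing styles is where the bucket name appears in the request URL. The following sketch builds both forms for a standard regional AWS endpoint. It is an illustration only: the class name, bucket, key and region are hypothetical, and a custom endpoint would change the host part.

```java
// Hypothetical helper, for illustration only.
final class S3AddressingStyle {
  private S3AddressingStyle() {}

  /** Builds an object URL against the standard regional S3 endpoint. */
  static String objectUrl(String bucket, String key, String region, boolean pathStyle) {
    String host = "s3." + region + ".amazonaws.com";
    if (pathStyle) {
      // Path-style: the bucket is the first path segment.
      return "https://" + host + "/" + bucket + "/" + key;
    }
    // Virtual-hosted-style: the bucket is part of the host name.
    return "https://" + bucket + "." + host + "/" + key;
  }
}
```

Path-style access is commonly needed for S3-compatible services that do not resolve per-bucket host names.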
      • S3FILEIO_PATH_STYLE_ACCESS_DEFAULT

        public static final boolean S3FILEIO_PATH_STYLE_ACCESS_DEFAULT
        See Also:
        Constant Field Values
      • S3FILEIO_ACCESS_KEY_ID

        public static final java.lang.String S3FILEIO_ACCESS_KEY_ID
        Configure the static access key ID used to access S3FileIO.

        When set, the default client factory will use the basic or session credentials provided instead of reading the default credential chain to create S3 access credentials. If S3FILEIO_SESSION_TOKEN is set, session credentials are used; otherwise, basic credentials are used.

        See Also:
        Constant Field Values
      • S3FILEIO_SECRET_ACCESS_KEY

        public static final java.lang.String S3FILEIO_SECRET_ACCESS_KEY
        Configure the static secret access key used to access S3FileIO.

        When set, the default client factory will use the basic or session credentials provided instead of reading the default credential chain to create S3 access credentials. If S3FILEIO_SESSION_TOKEN is set, session credentials are used; otherwise, basic credentials are used.

        See Also:
        Constant Field Values
      • S3FILEIO_SESSION_TOKEN

        public static final java.lang.String S3FILEIO_SESSION_TOKEN
        Configure the static session token used to access S3FileIO.

        When set, the default client factory will use the session credentials provided instead of reading the default credential chain to create S3 access credentials.

        See Also:
        Constant Field Values
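
The selection rule described for the three static credential properties can be summarized as a small decision function. This is a sketch of the documented behavior, not the client factory's code: the type names are hypothetical, and treating a missing access key ID or secret access key as "fall back to the default chain" is an assumption.

```java
// Hypothetical helper, for illustration only.
final class StaticCredentialChoice {
  enum Kind { DEFAULT_CHAIN, BASIC, SESSION }

  private StaticCredentialChoice() {}

  static Kind choose(String accessKeyId, String secretAccessKey, String sessionToken) {
    // Assumption: both the key ID and the secret must be present
    // before static credentials are used at all.
    if (accessKeyId == null || secretAccessKey == null) {
      return Kind.DEFAULT_CHAIN;
    }
    // A session token upgrades basic credentials to session credentials.
    return sessionToken == null ? Kind.BASIC : Kind.SESSION;
  }
}
```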
      • S3_USE_ARN_REGION_ENABLED

        public static final java.lang.String S3_USE_ARN_REGION_ENABLED
        Enable to allow S3FileIO to make cross-region calls to the region specified in the ARN of an access point.

        By default, attempting to use an access point in a different region will throw an exception. When enabled, this property allows using access points in other regions.

        For more details see: https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/services/s3/S3Configuration.html#useArnRegionEnabled--

        See Also:
        Constant Field Values
      • S3_USE_ARN_REGION_ENABLED_DEFAULT

        public static final boolean S3_USE_ARN_REGION_ENABLED_DEFAULT
        See Also:
        Constant Field Values
      • S3_CHECKSUM_ENABLED

        public static final java.lang.String S3_CHECKSUM_ENABLED
        Enables eTag checks for S3 PUT and MULTIPART upload requests.
        See Also:
        Constant Field Values
      • S3_CHECKSUM_ENABLED_DEFAULT

        public static final boolean S3_CHECKSUM_ENABLED_DEFAULT
        See Also:
        Constant Field Values
      • S3FILEIO_DELETE_BATCH_SIZE

        public static final java.lang.String S3FILEIO_DELETE_BATCH_SIZE
        Configure the batch size used when deleting multiple files from a given S3 bucket.
        See Also:
        Constant Field Values
      • S3FILEIO_DELETE_BATCH_SIZE_DEFAULT

        public static final int S3FILEIO_DELETE_BATCH_SIZE_DEFAULT
        Default batch size used when deleting files.

        Refer to https://github.com/apache/hadoop/commit/56dee667707926f3796c7757be1a133a362f05c9 for more details on why this value was chosen.

        See Also:
        Constant Field Values
      • S3FILEIO_DELETE_BATCH_SIZE_MAX

        public static final int S3FILEIO_DELETE_BATCH_SIZE_MAX
        Max possible batch size for deletion. Currently, a max of 1000 keys can be deleted in one batch. https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObjects.html
        See Also:
        Constant Field Values
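
Because one batch can hold at most 1000 keys, a bulk delete has to be split into several requests. The sketch below shows one way to partition a list of keys under that limit. The class and method names are hypothetical, and the actual S3FileIO batching logic may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
final class DeleteBatches {
  // Stated limit: at most 1000 keys per batch.
  static final int MAX_BATCH_SIZE = 1000;

  private DeleteBatches() {}

  /** Splits keys into consecutive batches of at most batchSize elements. */
  static <T> List<List<T>> partition(List<T> keys, int batchSize) {
    if (batchSize < 1 || batchSize > MAX_BATCH_SIZE) {
      throw new IllegalArgumentException(
          "Batch size must be between 1 and " + MAX_BATCH_SIZE + ", got " + batchSize);
    }
    List<List<T>> batches = new ArrayList<>();
    for (int start = 0; start < keys.size(); start += batchSize) {
      int end = Math.min(start + batchSize, keys.size());
      // Copy the view so each batch is independent of the input list.
      batches.add(new ArrayList<>(keys.subList(start, end)));
    }
    return batches;
  }
}
```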
      • DYNAMODB_ENDPOINT

        public static final java.lang.String DYNAMODB_ENDPOINT
        Configure an alternative endpoint of the DynamoDB service to access.
        See Also:
        Constant Field Values
      • DYNAMODB_TABLE_NAME_DEFAULT

        public static final java.lang.String DYNAMODB_TABLE_NAME_DEFAULT
        See Also:
        Constant Field Values
      • CLIENT_ASSUME_ROLE_ARN

        public static final java.lang.String CLIENT_ASSUME_ROLE_ARN
        Used by AssumeRoleAwsClientFactory. If set, all AWS clients will assume a role of the given ARN, instead of using the default credential chain.
        See Also:
        Constant Field Values
      • CLIENT_ASSUME_ROLE_TAGS_PREFIX

        public static final java.lang.String CLIENT_ASSUME_ROLE_TAGS_PREFIX
        Used by AssumeRoleAwsClientFactory to pass a list of session tags. Each session tag consists of a key name and an associated value.
        See Also:
        Constant Field Values
      • CLIENT_ASSUME_ROLE_TIMEOUT_SEC

        public static final java.lang.String CLIENT_ASSUME_ROLE_TIMEOUT_SEC
        Used by AssumeRoleAwsClientFactory. The timeout of the assume-role session in seconds; defaults to 1 hour. At the end of the timeout, a new set of role session credentials is fetched through an STS client.
        See Also:
        Constant Field Values
      • CLIENT_ASSUME_ROLE_TIMEOUT_SEC_DEFAULT

        public static final int CLIENT_ASSUME_ROLE_TIMEOUT_SEC_DEFAULT
        See Also:
        Constant Field Values
      • CLIENT_ASSUME_ROLE_EXTERNAL_ID

        public static final java.lang.String CLIENT_ASSUME_ROLE_EXTERNAL_ID
        Used by AssumeRoleAwsClientFactory. Optional external ID used to assume an IAM role.

        For more details, see https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_create_for-user_externalid.html

        See Also:
        Constant Field Values
      • CLIENT_ASSUME_ROLE_REGION

        public static final java.lang.String CLIENT_ASSUME_ROLE_REGION
        Used by AssumeRoleAwsClientFactory. If set, all AWS clients except STS client will use the given region instead of the default region chain.

        The value must be one of Region, such as 'us-east-1'. For more details, see https://docs.aws.amazon.com/general/latest/gr/rande.html

        See Also:
        Constant Field Values
      • HTTP_CLIENT_TYPE

        public static final java.lang.String HTTP_CLIENT_TYPE
        The type of SdkHttpClient implementation used by AwsClientFactory. If set, all AWS clients will use the specified HTTP client. If not set, HTTP_CLIENT_TYPE_DEFAULT will be used. For the specific types supported, see HTTP_CLIENT_TYPE_* defined below.
        See Also:
        Constant Field Values
      • HTTP_CLIENT_TYPE_DEFAULT

        public static final java.lang.String HTTP_CLIENT_TYPE_DEFAULT
        See Also:
        Constant Field Values
      • S3_WRITE_TAGS_PREFIX

        public static final java.lang.String S3_WRITE_TAGS_PREFIX
        Used by S3FileIO to tag objects when writing. To set, we can pass a catalog property.

        For more details, see https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-tagging.html

        Example: s3.write.tags.my_key=my_val

        See Also:
        Constant Field Values
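
Prefix-style properties such as the one in the example above carry the tag key in the property name and the tag value in the property value. The sketch below shows how such entries could be collected into a key-value map. It is an illustration, not the library's parsing code: the helper is hypothetical, and the prefix "s3.write.tags." is inferred from the example line rather than read from the constant.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, for illustration only.
final class PrefixedProperties {
  private PrefixedProperties() {}

  /** Returns entries whose key starts with prefix, with the prefix removed. */
  static Map<String, String> stripPrefix(Map<String, String> properties, String prefix) {
    Map<String, String> result = new TreeMap<>();
    for (Map.Entry<String, String> entry : properties.entrySet()) {
      String name = entry.getKey();
      // Skip unrelated keys and a bare prefix with no tag key after it.
      if (name.startsWith(prefix) && name.length() > prefix.length()) {
        result.put(name.substring(prefix.length()), entry.getValue());
      }
    }
    return result;
  }
}
```

With the property `s3.write.tags.my_key=my_val`, calling `stripPrefix(properties, "s3.write.tags.")` yields a map with the single entry `my_key` to `my_val`.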
      • S3_DELETE_TAGS_PREFIX

        public static final java.lang.String S3_DELETE_TAGS_PREFIX
        Used by S3FileIO to tag objects when deleting. When this config is set, objects are tagged with the configured key-value pairs before deletion. This is considered a soft-delete, because users are able to configure tag-based object lifecycle policy at bucket level to transition objects to different tiers.

        For more details, see https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-lifecycle-mgmt.html

        Example: s3.delete.tags.my_key=my_val

        See Also:
        Constant Field Values
      • S3FILEIO_DELETE_THREADS

        public static final java.lang.String S3FILEIO_DELETE_THREADS
        Number of threads to use for adding delete tags to S3 objects; defaults to Runtime.availableProcessors().
        See Also:
        Constant Field Values
      • S3_DELETE_ENABLED

        public static final java.lang.String S3_DELETE_ENABLED
        Determines if S3FileIO deletes the object when io.delete() is called, default to true. Once disabled, users are expected to set tags through S3_DELETE_TAGS_PREFIX and manage deleted files through S3 lifecycle policy.
        See Also:
        Constant Field Values
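
The interaction between delete tags and the delete flag can be read as a two-step plan: tag first when delete tags are configured, then remove the object only when deletion is enabled. The sketch below models that reading. The names are hypothetical and the steps are returned as labels instead of issuing real S3 requests.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical model of the described behavior, for illustration only.
final class DeletePlan {
  enum Step { TAG_OBJECT, DELETE_OBJECT }

  private DeletePlan() {}

  /** Lists the steps a delete call would take for the given settings. */
  static List<Step> plan(Map<String, String> deleteTags, boolean deleteEnabled) {
    List<Step> steps = new ArrayList<>();
    if (!deleteTags.isEmpty()) {
      // Tags are applied before any deletion, enabling lifecycle rules.
      steps.add(Step.TAG_OBJECT);
    }
    if (deleteEnabled) {
      steps.add(Step.DELETE_OBJECT);
    }
    return steps;
  }
}
```

With deletion disabled and at least one delete tag configured, the plan contains only the tagging step, which corresponds to the soft-delete setup managed through an S3 lifecycle policy.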
      • S3_DELETE_ENABLED_DEFAULT

        public static final boolean S3_DELETE_ENABLED_DEFAULT
        See Also:
        Constant Field Values
      • S3_ACCESS_POINTS_PREFIX

        public static final java.lang.String S3_ACCESS_POINTS_PREFIX
        Used by S3FileIO, prefix used for bucket access point configuration. To set, we can pass a catalog property.

        For more details, see https://aws.amazon.com/s3/features/access-points/

        Example: s3.access-points.my-bucket=access-point

        See Also:
        Constant Field Values
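
An access point entry maps a bucket name, taken from the property name, to the access point given as the value. The sketch below looks up the access point for a bucket. It is an assumption-laden illustration: the class is hypothetical, the prefix "s3.access-points." is inferred from the example line, and falling back to the bucket name when no entry exists is assumed rather than documented here.

```java
import java.util.Map;

// Hypothetical helper, for illustration only.
final class AccessPointLookup {
  // Inferred from the example property "s3.access-points.my-bucket".
  static final String PREFIX = "s3.access-points.";

  private AccessPointLookup() {}

  /** Returns the configured access point, or the bucket itself if none is set. */
  static String resolve(Map<String, String> properties, String bucket) {
    // Assumption: buckets without an entry are addressed directly.
    return properties.getOrDefault(PREFIX + bucket, bucket);
  }
}
```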
    • Constructor Detail

      • AwsProperties

        public AwsProperties()
      • AwsProperties

        public AwsProperties​(java.util.Map<java.lang.String,​java.lang.String> properties)
    • Method Detail

      • s3FileIoSseType

        public java.lang.String s3FileIoSseType()
      • setS3FileIoSseType

        public void setS3FileIoSseType​(java.lang.String sseType)
      • s3FileIoSseKey

        public java.lang.String s3FileIoSseKey()
      • s3FileIoDeleteBatchSize

        public int s3FileIoDeleteBatchSize()
      • setS3FileIoDeleteBatchSize

        public void setS3FileIoDeleteBatchSize​(int deleteBatchSize)
      • setS3FileIoSseKey

        public void setS3FileIoSseKey​(java.lang.String sseKey)
      • s3FileIoSseMd5

        public java.lang.String s3FileIoSseMd5()
      • setS3FileIoSseMd5

        public void setS3FileIoSseMd5​(java.lang.String sseMd5)
      • glueCatalogId

        public java.lang.String glueCatalogId()
      • setGlueCatalogId

        public void setGlueCatalogId​(java.lang.String id)
      • glueCatalogSkipArchive

        public boolean glueCatalogSkipArchive()
      • setGlueCatalogSkipArchive

        public void setGlueCatalogSkipArchive​(boolean skipArchive)
      • glueCatalogSkipNameValidation

        public boolean glueCatalogSkipNameValidation()
      • setGlueCatalogSkipNameValidation

        public void setGlueCatalogSkipNameValidation​(boolean glueCatalogSkipNameValidation)
      • glueLakeFormationEnabled

        public boolean glueLakeFormationEnabled()
      • setGlueLakeFormationEnabled

        public void setGlueLakeFormationEnabled​(boolean glueLakeFormationEnabled)
      • s3FileIoMultipartUploadThreads

        public int s3FileIoMultipartUploadThreads()
      • setS3FileIoMultipartUploadThreads

        public void setS3FileIoMultipartUploadThreads​(int threads)
      • s3FileIoMultiPartSize

        public int s3FileIoMultiPartSize()
      • setS3FileIoMultiPartSize

        public void setS3FileIoMultiPartSize​(int size)
      • s3FileIOMultipartThresholdFactor

        public double s3FileIOMultipartThresholdFactor()
      • setS3FileIoMultipartThresholdFactor

        public void setS3FileIoMultipartThresholdFactor​(double factor)
      • s3fileIoStagingDirectory

        public java.lang.String s3fileIoStagingDirectory()
      • setS3fileIoStagingDirectory

        public void setS3fileIoStagingDirectory​(java.lang.String directory)
      • s3FileIoAcl

        public software.amazon.awssdk.services.s3.model.ObjectCannedACL s3FileIoAcl()
      • setS3FileIoAcl

        public void setS3FileIoAcl​(software.amazon.awssdk.services.s3.model.ObjectCannedACL acl)
      • dynamoDbTableName

        public java.lang.String dynamoDbTableName()
      • setDynamoDbTableName

        public void setDynamoDbTableName​(java.lang.String name)
      • isS3ChecksumEnabled

        public boolean isS3ChecksumEnabled()
      • setS3ChecksumEnabled

        public void setS3ChecksumEnabled​(boolean eTagCheckEnabled)
      • s3WriteTags

        public java.util.Set<software.amazon.awssdk.services.s3.model.Tag> s3WriteTags()
      • s3DeleteTags

        public java.util.Set<software.amazon.awssdk.services.s3.model.Tag> s3DeleteTags()
      • s3FileIoDeleteThreads

        public int s3FileIoDeleteThreads()
      • setS3FileIoDeleteThreads

        public void setS3FileIoDeleteThreads​(int threads)
      • isS3DeleteEnabled

        public boolean isS3DeleteEnabled()
      • setS3DeleteEnabled

        public void setS3DeleteEnabled​(boolean s3DeleteEnabled)
      • s3BucketToAccessPointMapping

        public java.util.Map<java.lang.String,​java.lang.String> s3BucketToAccessPointMapping()