The CDK Construct Library for AWS::Glue
This module is part of the AWS Cloud Development Kit project.
Database
A Database is a logical grouping of Tables in the Glue Catalog.
new glue.Database(stack, 'MyDatabase', {
databaseName: 'my_database'
});
By default, an S3 bucket is created and the Database is stored under s3://<bucket-name>/, but you can manually specify another location:
new glue.Database(stack, 'MyDatabase', {
databaseName: 'my_database',
locationUri: 's3://explicit-bucket/some-path/'
});
Table
A Glue table describes a table of data in S3: its structure (column names and types), the location of its data (S3 objects with a common prefix in an S3 bucket), and the format of the files (JSON, Avro, Parquet, etc.):
new glue.Table(stack, 'MyTable', {
database: myDatabase,
tableName: 'my_table',
columns: [{
name: 'col1',
type: glue.Schema.string,
}, {
name: 'col2',
type: glue.Schema.array(glue.Schema.string),
comment: 'col2 is an array of strings' // comment is optional
}],
dataFormat: glue.DataFormat.Json
});
By default, an S3 bucket will be created to store the table's data, but you can manually pass the bucket and s3Prefix:
new glue.Table(stack, 'MyTable', {
bucket: myBucket,
s3Prefix: 'my-table/'
...
});
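As a rough illustration of how a bucket and prefix combine into a data location, the sketch below joins them into an S3 URI. The helper `tableLocation` is hypothetical and not part of the glue module; it only assumes that the table's objects share the prefix `s3://<bucket-name>/<s3Prefix>`.

```typescript
// Hypothetical helper (not a glue API): builds the S3 URI under which a
// table's objects are assumed to live, given a bucket name and a prefix.
function tableLocation(bucketName: string, s3Prefix: string = ''): string {
  // Strip leading slashes so the URI never contains "//" after the bucket.
  const prefix = s3Prefix.replace(/^\/+/, '');
  return `s3://${bucketName}/${prefix}`;
}

// Illustrative values only.
const location = tableLocation('my-bucket', 'my-table/');
console.log(location);
```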
Partitions
To improve query performance, a table can specify partitionKeys on which data is stored and queried separately. For example, you might partition a table by year and month to optimize queries based on a time window:
new glue.Table(stack, 'MyTable', {
database: myDatabase,
tableName: 'my_table',
columns: [{
name: 'col1',
type: glue.Schema.string
}],
partitionKeys: [{
name: 'year',
type: glue.Schema.smallint
}, {
name: 'month',
type: glue.Schema.smallint
}],
dataFormat: glue.DataFormat.Json
});
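To make "stored separately" concrete, here is a minimal sketch of one common layout, Hive-style `key=value` prefixes, for the year and month keys above. This layout is an assumption for illustration; the helper `partitionPrefix` is hypothetical and not part of the glue module.

```typescript
// Hypothetical helper (not a glue API): builds a Hive-style object prefix
// for one partition, assuming a key=value directory layout.
function partitionPrefix(
  tablePrefix: string,
  values: Array<[string, string | number]>,
): string {
  // Each partition key contributes one "key=value/" path segment, in order.
  const segments = values.map(([key, value]) => `${key}=${value}/`);
  return tablePrefix + segments.join('');
}

// Illustrative values: objects for May 2019 would share this prefix.
const may2019 = partitionPrefix('my-table/', [['year', 2019], ['month', 5]]);
console.log(may2019);
```

With such a layout, a query restricted to one month only needs to read objects under a single prefix.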
Encryption
You can enable encryption on a Table's data:
- Unencrypted - files are not encrypted. The default encryption setting.
- S3Managed - Server-side encryption (SSE-S3) with an Amazon S3-managed key.
new glue.Table(stack, 'MyTable', {
encryption: glue.TableEncryption.S3Managed
...
});
- Kms - Server-side encryption (SSE-KMS) with an AWS KMS Key managed by the account owner.
// KMS key is created automatically
new glue.Table(stack, 'MyTable', {
encryption: glue.TableEncryption.Kms
...
});
// with an explicit KMS key
new glue.Table(stack, 'MyTable', {
encryption: glue.TableEncryption.Kms,
encryptionKey: new kms.EncryptionKey(stack, 'MyKey')
...
});
- KmsManaged - Server-side encryption (SSE-KMS), like Kms, except with an AWS KMS Key managed by the AWS Key Management Service.
new glue.Table(stack, 'MyTable', {
encryption: glue.TableEncryption.KmsManaged
...
});
- ClientSideKms - Client-side encryption (CSE-KMS) with an AWS KMS Key managed by the account owner.
// KMS key is created automatically
new glue.Table(stack, 'MyTable', {
encryption: glue.TableEncryption.ClientSideKms
...
});
// with an explicit KMS key
new glue.Table(stack, 'MyTable', {
encryption: glue.TableEncryption.ClientSideKms,
encryptionKey: new kms.EncryptionKey(stack, 'MyKey')
...
});
Note: you cannot provide a Bucket when creating the Table if you wish to use server-side encryption (Kms, KmsManaged or S3Managed).
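The rule in the note can be sketched as a small check. The enum below is a stand-in that mirrors the option names listed above; its string values and the helper `bucketAllowed` are hypothetical and are not the glue module's own definitions.

```typescript
// Stand-in for the encryption options described above (values are illustrative).
enum TableEncryption {
  Unencrypted = 'Unencrypted',
  S3Managed = 'S3Managed',
  Kms = 'Kms',
  KmsManaged = 'KmsManaged',
  ClientSideKms = 'ClientSideKms',
}

// The three server-side modes named in the note.
const SERVER_SIDE_MODES = new Set<TableEncryption>([
  TableEncryption.S3Managed,
  TableEncryption.Kms,
  TableEncryption.KmsManaged,
]);

// Hypothetical helper: returns true when an explicit bucket may be combined
// with the given encryption mode, following the note above.
function bucketAllowed(encryption: TableEncryption): boolean {
  return !SERVER_SIDE_MODES.has(encryption);
}

console.log(bucketAllowed(TableEncryption.Kms));           // server-side mode
console.log(bucketAllowed(TableEncryption.ClientSideKms)); // client-side mode
```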
Types
A table's schema is a collection of columns, each of which has a name and a type. Types are recursive structures, consisting of primitive and complex types:
new glue.Table(stack, 'MyTable', {
columns: [{
name: 'primitive_column',
type: glue.Schema.string
}, {
name: 'array_column',
type: glue.Schema.array(glue.Schema.integer),
comment: 'array<integer>'
}, {
name: 'map_column',
type: glue.Schema.map(
glue.Schema.string,
glue.Schema.timestamp),
comment: 'map<string,timestamp>'
}, {
name: 'struct_column',
type: glue.Schema.struct([{
name: 'nested_column',
type: glue.Schema.date,
comment: 'nested comment'
}]),
comment: "struct<nested_column:date COMMENT 'nested comment'>"
}],
...
Primitive
Numeric:
bigint
float
integer
smallint
tinyint
Date and Time:
date
timestamp
String Types:
string
decimal
char
varchar
Misc:
boolean
binary
Complex
- array - array of some other type.
- map - map of some primitive key type to any value type.
- struct - nested structure containing individually named and typed columns.
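Because types are recursive, a type string such as the struct comment shown earlier can be produced by a simple recursive walk. The following is a minimal sketch using a hypothetical `ColumnType` model; it is not how glue.Schema is implemented, only an illustration of the recursion.

```typescript
// Hypothetical model of a recursive column type (not the glue.Schema API).
type ColumnType =
  | { kind: 'primitive'; name: string }
  | { kind: 'array'; item: ColumnType }
  | { kind: 'map'; key: ColumnType; value: ColumnType }
  | { kind: 'struct'; fields: Field[] };

interface Field {
  name: string;
  type: ColumnType;
  comment?: string;
}

// Renders a type as a string, recursing into the element, key/value and
// field types of the complex kinds.
function render(type: ColumnType): string {
  switch (type.kind) {
    case 'primitive':
      return type.name;
    case 'array':
      return `array<${render(type.item)}>`;
    case 'map':
      return `map<${render(type.key)},${render(type.value)}>`;
    case 'struct': {
      const fields = type.fields.map((f) => {
        const comment = f.comment === undefined ? '' : ` COMMENT '${f.comment}'`;
        return `${f.name}:${render(f.type)}${comment}`;
      });
      return `struct<${fields.join(',')}>`;
    }
  }
}

const primitive = (name: string): ColumnType => ({ kind: 'primitive', name });

// Mirrors the struct_column example: one nested date column with a comment.
const structColumn: ColumnType = {
  kind: 'struct',
  fields: [{ name: 'nested_column', type: primitive('date'), comment: 'nested comment' }],
};
console.log(render(structColumn));
```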