Pass the Amazon Web Services AWS Certified Data Engineer Data-Engineer-Associate Questions and answers with ExamsMirror

Questions # 71:

A data engineer is launching an Amazon EMR cluster. The data that the data engineer needs to load into the new cluster is currently in an Amazon S3 bucket. The data engineer needs to ensure that data is encrypted both at rest and in transit.

The data that is in the S3 bucket is encrypted by an AWS Key Management Service (AWS KMS) key. The data engineer has an Amazon S3 path that has a Privacy Enhanced Mail (PEM) file.

Which solution will meet these requirements?

Options:

Create an Amazon EMR security configuration. Specify the appropriate AWS KMS key for at-rest encryption for the S3 bucket. Create a second security configuration. Specify the Amazon S3 path of the PEM file for in-transit encryption. Create the EMR cluster, and attach both security configurations to the cluster.

Create an Amazon EMR security configuration. Specify the appropriate AWS KMS key for local disk encryption for the S3 bucket. Specify the Amazon S3 path of the PEM file for in-transit encryption. Use the security configuration during EMR cluster creation.

Create an Amazon EMR security configuration. Specify the appropriate AWS KMS key for at-rest encryption for the S3 bucket. Specify the Amazon S3 path of the PEM file for in-transit encryption. Use the security configuration during EMR cluster creation.

Create an Amazon EMR security configuration. Specify the appropriate AWS KMS key for at-rest encryption for the S3 bucket. Specify the Amazon S3 path of the PEM file for in-transit encryption. Create the EMR cluster, and attach the security configuration to the cluster.
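For illustration only, the sketch below builds one EMR security configuration document that carries both the S3 at-rest setting and the in-transit setting. The KMS key ARN and the S3 path are hypothetical placeholders, not values from the question, and the sketch assumes the certificate material is stored at the given S3 path in the form EMR expects.

```python
import json


def build_emr_security_configuration(kms_key_arn: str, pem_s3_path: str) -> dict:
    """Return a single security configuration with at-rest and in-transit settings."""
    return {
        "EncryptionConfiguration": {
            "EnableAtRestEncryption": True,
            "EnableInTransitEncryption": True,
            "AtRestEncryptionConfiguration": {
                # SSE-KMS for data that EMR reads from and writes to S3.
                "S3EncryptionConfiguration": {
                    "EncryptionMode": "SSE-KMS",
                    "AwsKmsKey": kms_key_arn,
                }
            },
            "InTransitEncryptionConfiguration": {
                # PEM certificates referenced by their S3 location.
                "TLSCertificateConfiguration": {
                    "CertificateProviderType": "PEM",
                    "S3Object": pem_s3_path,
                }
            },
        }
    }


# Hypothetical values, used only to make the sketch runnable.
config = build_emr_security_configuration(
    kms_key_arn="arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID",
    pem_s3_path="s3://example-bucket/certs/my-certs.zip",
)
print(json.dumps(config, indent=2))
```

A document like this would typically be registered under a name and then referenced by that name when the cluster is created.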

Questions # 72:

Two developers are working on separate application releases. The developers have created feature branches named Branch A and Branch B by using a GitHub repository's master branch as the source.

The developer for Branch A deployed code to the production system. The code for Branch B will merge into a master branch in the following week's scheduled application release.

Which command should the developer for Branch B run before the developer raises a pull request to the master branch?

Options:

git diff branchB master

git commit -m

git pull master

git rebase master

git fetch -b master

Questions # 73:

A company has a data pipeline that uses an Amazon RDS instance, AWS Glue jobs, and an Amazon S3 bucket. The RDS instance and AWS Glue jobs run in a private subnet of a VPC and in the same security group.

A user made a change to the security group that prevents the AWS Glue jobs from connecting to the RDS instance. After the change, the security group contains a single rule that allows inbound SSH traffic from a specific IP address.

The company must resolve the connectivity issue.

Which solution will meet this requirement?

Options:

Add an inbound rule that allows all TCP traffic on all TCP ports. Set the security group as the source.

Add an inbound rule that allows all TCP traffic on all UDP ports. Set the private IP address of the RDS instance as the source.

Add an inbound rule that allows all TCP traffic on all TCP ports. Set the DNS name of the RDS instance as the source.

Replace the source of the existing SSH rule with the private IP address of the RDS instance. Create an outbound rule with the same source, destination, and protocol as the inbound SSH rule.
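As a rough sketch of what a self-referencing inbound rule looks like, the helper below builds the request parameters for a rule that allows all TCP ports with the security group itself as the source. The function name and the group ID are hypothetical; the dictionary follows the shape that the EC2 AuthorizeSecurityGroupIngress operation accepts, and no AWS call is made here.

```python
def self_referencing_tcp_rule(security_group_id: str) -> dict:
    """Build ingress parameters that allow all TCP traffic from the same group."""
    return {
        "GroupId": security_group_id,
        "IpPermissions": [
            {
                "IpProtocol": "tcp",
                "FromPort": 0,
                "ToPort": 65535,
                # The source is the group itself, so any resource in the
                # group can reach any other resource in the group over TCP.
                "UserIdGroupPairs": [{"GroupId": security_group_id}],
            }
        ],
    }


# Hypothetical security group ID.
params = self_referencing_tcp_rule("sg-0123456789abcdef0")
print(params)
```

Because the source is a group rather than an IP address, the rule keeps working when the private IP addresses of the Glue job network interfaces or the RDS instance change.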

Questions # 74:

A company uses Amazon S3 to store data and Amazon QuickSight to create visualizations.

The company has an S3 bucket in an AWS account named Hub-Account. The S3 bucket is encrypted by an AWS Key Management Service (AWS KMS) key. The company's QuickSight instance is in a separate account named BI-Account.

The company updates the S3 bucket policy to grant access to the QuickSight service role. The company wants to enable cross-account access to allow QuickSight to interact with the S3 bucket.

Which combination of steps will meet this requirement? (Select TWO.)

Options:

Use the existing AWS KMS key to encrypt connections from QuickSight to the S3 bucket.

Add the S3 bucket as a resource that the QuickSight service role can access.

Use AWS Resource Access Manager (AWS RAM) to share the S3 bucket with the BI-Account account.

Add an IAM policy to the QuickSight service role to give QuickSight access to the KMS key that encrypts the S3 bucket.

Add the KMS key as a resource that the QuickSight service role can access.

Answer

D, E

Explanation

Problem Analysis:

The company needs cross-account access to allow QuickSight in BI-Account to interact with an S3 bucket in Hub-Account.

The bucket is encrypted with an AWS KMS key.

Appropriate permissions must be set for both S3 access and KMS decryption.

Key Considerations:

QuickSight requires IAM permissions to access S3 data and decrypt files using the KMS key.

Both S3 and KMS permissions need to be properly configured across accounts.

Solution Analysis:

Option A: Use Existing KMS Key for Encryption

AWS KMS keys encrypt data at rest, not network connections, so this step does not give QuickSight any access. QuickSight still needs permission to decrypt objects with the existing key.

Option B: Add S3 Bucket to QuickSight Role

Granting S3 bucket access to the QuickSight service role is necessary for cross-account access.

Option C: AWS RAM for Bucket Sharing

AWS RAM is not required; bucket policies and IAM roles suffice for granting cross-account access.

Option D: IAM Policy for KMS Access

QuickSight’s service role in BI-Account needs explicit permissions to use the KMS key for decryption.

Option E: Add KMS Key as Resource for Role

The KMS key must explicitly list the QuickSight role as an entity that can access it.

Implementation Steps:

S3 Bucket Policy in Hub-Account: Add a policy to the S3 bucket granting the QuickSight service role access:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam:::role/service-role/QuickSightRole" },
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::/*"
    }
  ]
}

KMS Key Policy in Hub-Account: Add permissions for the QuickSight role:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam:::role/service-role/QuickSightRole" },
      "Action": [
        "kms:Decrypt",
        "kms:DescribeKey"
      ],
      "Resource": "*"
    }
  ]
}

IAM Policy for QuickSight Role in BI-Account: Attach the following policy to the QuickSight service role:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "kms:Decrypt"
      ],
      "Resource": [
        "arn:aws:s3:::/*",
        "arn:aws:kms:::key/"
      ]
    }
  ]
}

References: Setting Up Cross-Account S3 Access, AWS KMS Key Policy Examples, Amazon QuickSight Cross-Account Access
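The three policy documents in the implementation steps can also be generated from a handful of inputs, which makes it easier to keep the role, bucket, and key ARNs consistent. The sketch below is one possible way to do that. All account IDs, the Region, the bucket name, and the key ID are hypothetical placeholders, and the role policy is split into one statement per service as a small variation on the combined statement shown in the steps.

```python
import json


def build_cross_account_policies(bi_account_id, hub_account_id, region,
                                 bucket_name, key_id,
                                 role_path="service-role/QuickSightRole"):
    """Build the bucket policy, key policy, and role policy as dicts.

    Every argument is a placeholder supplied by the caller.
    """
    role_arn = f"arn:aws:iam::{bi_account_id}:role/{role_path}"
    objects_arn = f"arn:aws:s3:::{bucket_name}/*"
    key_arn = f"arn:aws:kms:{region}:{hub_account_id}:key/{key_id}"

    # Hub-Account: bucket policy that names the QuickSight role as principal.
    bucket_policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": role_arn},
            "Action": "s3:GetObject",
            "Resource": objects_arn,
        }],
    }
    # Hub-Account: key policy statement that lets the role decrypt.
    key_policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": role_arn},
            "Action": ["kms:Decrypt", "kms:DescribeKey"],
            "Resource": "*",
        }],
    }
    # BI-Account: identity policy attached to the QuickSight role.
    role_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "s3:GetObject", "Resource": objects_arn},
            {"Effect": "Allow", "Action": "kms:Decrypt", "Resource": key_arn},
        ],
    }
    return {
        "bucket_policy": bucket_policy,
        "key_policy": key_policy,
        "role_policy": role_policy,
    }


# Hypothetical identifiers, used only to make the sketch runnable.
policies = build_cross_account_policies(
    bi_account_id="111111111111",
    hub_account_id="222222222222",
    region="us-east-1",
    bucket_name="example-hub-bucket",
    key_id="EXAMPLE-KEY-ID",
)
for name, document in policies.items():
    print(name)
    print(json.dumps(document, indent=2))
```

Serializing with json.dumps also guards against hand-editing mistakes such as an unclosed array or a missing comma.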

Questions # 75:

A company needs to implement a new inventory management system that provides near real-time updates and visibility across all AWS Regions. The new solution must provide centralized access control over data access and permissions. The company has a separate inventory management team assigned to each Region. Each inventory management team needs to update inventory levels.

A data engineer must implement Amazon Redshift data sharing with write capabilities. The solution must follow the principle of least privilege.

Which solution will meet these requirements with the LEAST operational overhead?

Options:

Configure a single Redshift datashare from the company's headquarters that provides read-only access for all Regions. Configure a separate AWS Glue ETL job to update data for each Region.

Configure three Regional Redshift datashares that provide full write access. Allow full self-managed access controls.

Configure a single Redshift datashare from the company's headquarters that has selective write permissions for inventory. Set up Regional namespace controls.

Configure separate Redshift datashares for multiple table types that provide full write access. Distribute the datashares across all Regional clusters. Allow self-managed Regional schema permissions.

Questions # 76:

A company uploads .csv files to an Amazon S3 bucket. The company's data platform team has set up an AWS Glue crawler to perform data discovery and to create the tables and schemas.

An AWS Glue job writes processed data from the tables to an Amazon Redshift database. The AWS Glue job handles column mapping and creates the Amazon Redshift tables in the Redshift database appropriately.

If the company reruns the AWS Glue job for any reason, duplicate records are introduced into the Amazon Redshift tables. The company needs a solution that will update the Redshift tables without duplicates.

Which solution will meet these requirements?

Options:

Modify the AWS Glue job to copy the rows into a staging Redshift table. Add SQL commands to update the existing rows with new values from the staging Redshift table.

Modify the AWS Glue job to load the previously inserted data into a MySQL database. Perform an upsert operation in the MySQL database. Copy the results to the Amazon Redshift tables.

Use Apache Spark's DataFrame dropDuplicates() API to eliminate duplicates. Write the data to the Redshift tables.

Use the AWS Glue ResolveChoice built-in transform to select the value of the column from the most recent record.
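To make the staging-table idea from the first option more concrete, the sketch below generates the SQL for one common merge variant: delete the target rows that also appear in the staging table, then insert everything from staging. This is a delete-then-insert substitute for an explicit UPDATE, not the wording of the option itself. The table and column names are hypothetical, and the helper assumes that identifiers come from trusted configuration rather than user input.

```python
from typing import List


def build_staging_merge_sql(target: str, staging: str, key_columns: List[str]) -> List[str]:
    """Return SQL statements that merge a staging table into a target table."""
    join = " AND ".join(f"{target}.{col} = {staging}.{col}" for col in key_columns)
    return [
        "BEGIN;",
        # Remove rows that the rerun is about to load again.
        f"DELETE FROM {target} USING {staging} WHERE {join};",
        # Load the fresh copy of every staged row.
        f"INSERT INTO {target} SELECT * FROM {staging};",
        f"DROP TABLE {staging};",
        "END;",
    ]


# Hypothetical table and key names.
statements = build_staging_merge_sql("sales", "sales_staging", ["order_id"])
print("\n".join(statements))
```

Running the statements in one transaction means a rerun of the job replaces matching rows instead of appending a second copy.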

Questions # 77:

A company's data engineer needs to optimize the performance of table SQL queries. The company stores data in an Amazon Redshift cluster. The data engineer cannot increase the size of the cluster because of budget constraints.

The company stores the data in multiple tables and loads the data by using the EVEN distribution style. Some tables are hundreds of gigabytes in size. Other tables are less than 10 MB in size.

Which solution will meet these requirements?

Options:

Keep using the EVEN distribution style for all tables. Specify primary and foreign keys for all tables.

Use the ALL distribution style for large tables. Specify primary and foreign keys for all tables.

Use the ALL distribution style for rarely updated small tables. Specify primary and foreign keys for all tables.

Specify a combination of distribution, sort, and partition keys for all tables.

Questions # 78:

A mobile gaming company wants to capture data from its gaming app. The company wants to make the data available to three internal consumers of the data. The data records are approximately 20 KB in size.

The company wants to achieve optimal throughput from each device that runs the gaming app. Additionally, the company wants to develop an application to process data streams. The stream-processing application must have dedicated throughput for each internal consumer.

Which solution will meet these requirements?

Options:

Configure the mobile app to call the PutRecords API operation to send data to Amazon Kinesis Data Streams. Use the enhanced fan-out feature with a stream for each internal consumer.

Configure the mobile app to call the PutRecordBatch API operation to send data to Amazon Data Firehose. Submit an AWS Support case to turn on dedicated throughput for the company's AWS account. Allow each internal consumer to access the stream.

Configure the mobile app to use the Amazon Kinesis Producer Library (KPL) to send data to Amazon Data Firehose. Use the enhanced fan-out feature with a stream for each internal consumer.

Configure the mobile app to call the PutRecords API operation to send data to Amazon Kinesis Data Streams. Host the stream-processing application for each internal consumer on Amazon EC2 instances. Configure auto scaling for the EC2 instances.
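A quick back-of-the-envelope calculation shows why dedicated throughput matters with three consumers. The sketch below uses the per-shard limits for Kinesis Data Streams (1 MB/s for writes, 2 MB/s for reads shared by all standard consumers, and 2 MB/s per consumer with enhanced fan-out) and the 20 KB record size from the question. It treats 1 MB as 1,000 KB for simplicity, which is an approximation.

```python
RECORD_KB = 20                 # record size from the question
SHARD_WRITE_KB_PER_S = 1000    # about 1 MB/s write limit per shard
SHARD_READ_KB_PER_S = 2000     # about 2 MB/s read limit per shard
CONSUMERS = 3                  # internal consumers from the question

# Records per second that one shard can ingest at this record size.
write_records_per_s = SHARD_WRITE_KB_PER_S // RECORD_KB

# Standard consumers split the shard's read throughput among themselves.
shared_records_per_consumer = SHARD_READ_KB_PER_S / CONSUMERS / RECORD_KB

# With enhanced fan-out, each consumer gets its own read throughput per shard.
dedicated_records_per_consumer = SHARD_READ_KB_PER_S / RECORD_KB

print(f"write: {write_records_per_s} records/s per shard")
print(f"shared read: {shared_records_per_consumer:.1f} records/s per consumer")
print(f"dedicated read: {dedicated_records_per_consumer:.1f} records/s per consumer")
```

Under these assumptions a shard takes in about 50 records per second, and each of the three consumers can read roughly 33 records per second when sharing versus 100 with dedicated throughput.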

Questions # 79:

A company is using Amazon Redshift to build a data warehouse solution. The company is loading hundreds of files into a fact table that is in a Redshift cluster.

The company wants the data warehouse solution to achieve the greatest possible throughput. The solution must use cluster resources optimally when the company loads data into the fact table.

Which solution will meet these requirements?

Options:

Use multiple COPY commands to load the data into the Redshift cluster.

Use S3DistCp to load multiple files into Hadoop Distributed File System (HDFS). Use an HDFS connector to ingest the data into the Redshift cluster.

Use a number of INSERT statements equal to the number of Redshift cluster nodes. Load the data in parallel into each node.

Use a single COPY command to load the data into the Redshift cluster.
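As an illustration of loading many files with one statement, the sketch below builds a single COPY command that points at an S3 key prefix, so that Redshift can split the files under that prefix across the cluster's slices. The table name, prefix, and IAM role ARN are hypothetical, and the CSV format clause is an assumption because the question does not state the file format.

```python
def build_copy_statement(table: str, s3_prefix: str, iam_role_arn: str) -> str:
    """Return one COPY statement that loads every file under an S3 prefix."""
    return (
        f"COPY {table}\n"
        f"FROM '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "FORMAT AS CSV;"  # assumed format; adjust to the real files
    )


# Hypothetical names, used only to make the sketch runnable.
copy_sql = build_copy_statement(
    table="sales_fact",
    s3_prefix="s3://example-bucket/sales/part-",
    iam_role_arn="arn:aws:iam::111122223333:role/ExampleRedshiftLoadRole",
)
print(copy_sql)
```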

Questions # 80:

A company stores sensitive data in an Amazon Redshift table. The company needs to give specific users the ability to access the sensitive data. The company must not create duplication in the data.

Customer support users must be able to see the last four characters of the sensitive data. Audit users must be able to see the full value of the sensitive data. No other users can have the ability to access the sensitive information.

Which solution will meet these requirements?

Options:

Create a dynamic data masking policy to allow access based on each user role. Create IAM roles that have specific access permissions. Attach the masking policy to the column that contains sensitive data.

Enable metadata security on the Redshift cluster. Create IAM users and IAM roles for the customer support users and the audit users. Grant the IAM users and IAM roles permissions to view the metadata in the Redshift cluster.

Create a row-level security policy to allow access based on each user role. Create IAM roles that have specific access permissions. Attach the security policy to the table.

Create an AWS Glue job to redact the sensitive data and to load the data into a new Redshift table.
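The access rules in this question can be modeled in a few lines of plain Python. The sketch below is not Redshift code. It only simulates the intended behavior: one stored value, a full view for audit users, a last-four view for customer support users, and nothing for anyone else. The role names, the mask character, and the sample value are hypothetical.

```python
from typing import Optional


def mask_all_but_last_four(value: str, mask_char: str = "X") -> str:
    """Replace every character except the last four with mask_char."""
    if len(value) <= 4:
        # Design choice for this sketch: short values are returned unchanged.
        return value
    return mask_char * (len(value) - 4) + value[-4:]


def view_sensitive_value(value: str, role: str) -> Optional[str]:
    """Return what a given role would see for one stored value."""
    if role == "audit":
        return value
    if role == "customer_support":
        return mask_all_but_last_four(value)
    return None  # every other role sees nothing


stored = "1234567890123456"  # hypothetical sensitive value
for role in ("audit", "customer_support", "analyst"):
    print(role, view_sensitive_value(stored, role))
```

The value is stored once and only the view differs by role, which is what keeps this pattern free of duplicated data.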