[80% Off] Google: Professional Data Engineer on Google Cloud Platform
Duration: 3.0 hours
Best Practice Tests for Professional Data Engineer on Google Cloud Platform Certification 2021
A Professional Data Engineer enables data-driven decision making by collecting, transforming, and publishing data. A Data Engineer should be able to design, build, operationalize, secure, and monitor data processing systems with a particular emphasis on security and compliance; scalability and efficiency; reliability and fidelity; and flexibility and portability. A Data Engineer should also be able to leverage, deploy, and continuously train pre-existing machine learning models.
The Professional Data Engineer exam assesses your ability to:
Design data processing systems
Build and operationalize data processing systems
Operationalize machine learning models
Ensure solution quality
Design Data Processing Systems
Select the Relevant Storage Technologies: The considerations for this area include mapping storage systems to business needs, data modeling, distributed systems, and tradeoffs involving transactions, throughput, and latency;
Design Data Pipelines: The focus for this subsection includes data visualization & publishing and batch & streaming data (Cloud Dataproc, Cloud Dataflow, Cloud Pub/Sub, the Hadoop ecosystem, Apache Spark, Apache Beam, and Apache Kafka). It also covers online versus batch prediction and job orchestration & automation;
Design Data Processing Solutions: This topic covers your expertise in resource planning, distributed systems usage, choice of infrastructure, hybrid Cloud & edge computing, and system availability & fault tolerance. You should also know about the architecture options, including message queues, message brokers, service-oriented architecture, middleware, and serverless functions;
Migrate Data Processing & Data Warehousing: This section includes validating migrations, migration from on-premises to Cloud, and awareness of the current state & how to migrate designs to the future state.
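A core idea behind the pipeline-design topics above is windowing: grouping a stream of timestamped events into bounded chunks so that streaming data can be processed like batches. The sketch below is a minimal stdlib illustration of the fixed-window assignment that Cloud Dataflow and Apache Beam perform; the function name and event data are illustrative, not part of the Beam API.

```python
from collections import defaultdict

def fixed_windows(events, window_size):
    """Group (timestamp, value) events into fixed, non-overlapping windows.

    This mirrors the idea behind Beam's FixedWindows: each event is assigned
    to the window [start, start + window_size) that contains its timestamp.
    """
    windows = defaultdict(list)
    for ts, value in events:
        window_start = (ts // window_size) * window_size
        windows[window_start].append(value)
    return dict(windows)

# Events as (timestamp_seconds, value) pairs.
events = [(3, "a"), (7, "b"), (12, "c"), (14, "d"), (21, "e")]
print(fixed_windows(events, 10))
# {0: ['a', 'b'], 10: ['c', 'd'], 20: ['e']}
```

In a real Dataflow job the runner also has to decide when a window is complete (watermarks, late data, triggers), which is where much of the exam's streaming material lives.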
Build & Operationalize Data Processing Systems
Build & Operationalize Storage Systems: This part will require the students’ skills and competence in the effective usage of managed services, including Cloud Spanner, Cloud Bigtable, BigQuery, Cloud SQL, Cloud Memorystore, Cloud Datastore, and Cloud Storage. It also covers their skills in managing the data lifecycle and balancing storage performance against cost;
Build & Operationalize Pipelines: This module requires that learners demonstrate competence in data cleansing, transformation, batch & streaming processing, data import & acquisition, and integration with new data sources;
Build & Operationalize Processing Infrastructure: The considerations for this subject area include provisioning resources, adjusting pipeline, monitoring pipeline, and testing & quality control.
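The testing & quality-control point above is easiest to see with a concrete transform. A typical Dataflow/Beam step body is a pure function, which means it can be unit-tested without any cluster or runner. The record shape, field names, and cleansing rules below are hypothetical, chosen only to illustrate the pattern.

```python
def cleanse_record(record):
    """Normalize a raw record: trim whitespace, lower-case the email,
    and drop rows that have no user id.

    Keeping the transform a pure function (dict in, dict out) makes it
    trivially testable outside the pipeline.
    """
    if not record.get("user_id"):
        return None  # quality control: reject records missing the key field
    return {
        "user_id": record["user_id"].strip(),
        "email": record.get("email", "").strip().lower(),
    }

# Quality-control checks run as plain unit tests, no infrastructure required.
assert cleanse_record({"user_id": " u1 ", "email": " A@B.COM "}) == {
    "user_id": "u1",
    "email": "a@b.com",
}
assert cleanse_record({"email": "x@y.com"}) is None
print("all checks passed")
```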
Operationalize ML Models
Leverage Pre-Built Machine Learning Models as a Service: It covers your knowledge and skills in customizing machine learning APIs, including AutoML Natural Language and AutoML Vision. It also covers conversational experiences, such as Dialogflow, as well as machine learning APIs, including the Speech API and Vision API;
Deploy Machine Learning Pipelines: This objective requires your competence in ingesting relevant data, continuous evaluation, and retraining of ML models (Kubeflow, BigQuery ML, Cloud Machine Learning Engine, and Spark MLlib);
Select the Relevant Training & Serving Infrastructure: The considerations for this topic include distributed versus single-machine training, hardware accelerators (such as TPUs and GPUs), and edge compute usage;
Measure, Troubleshoot & Monitor Machine Learning Models: The focus of this subtopic includes the effects of dependencies on machine learning models. It will also measure the examinees’ understanding of machine learning terminology, such as features, labels, regression, classification, models, recommendation, evaluation metrics, and supervised & unsupervised learning. Moreover, it will also assess their knowledge of common sources of error, such as incorrect assumptions about the data.
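Of the evaluation metrics named above, precision and recall come up constantly for classification models. A minimal sketch, assuming binary labels where 1 marks the positive class (the sample data is made up for illustration):

```python
def precision_recall(y_true, y_pred):
    """Compute precision and recall for binary labels (1 = positive).

    precision = TP / (TP + FP): of everything predicted positive, how much was right.
    recall    = TP / (TP + FN): of everything actually positive, how much was found.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 1, 1, 0, 0, 1]
print(precision_recall(y_true, y_pred))  # (0.75, 0.75)
```

Exam questions in this area often hinge on knowing which metric to optimize: recall when missing a positive is costly, precision when false alarms are costly.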
Ensure Solution Quality
Design for Compliance & Security: The considerations for this topic include identity & access management, such as Cloud IAM. You should also know about data security (including key management and encryption) and privacy assurance (such as the Data Loss Prevention API). This part also covers the skills needed in legal compliance, including the Health Insurance Portability & Accountability Act, FedRAMP, the Children’s Online Privacy Protection Act, and the General Data Protection Regulation;
Ensure Efficiency & Scalability: Candidates will be required to demonstrate their ability to build and run test suites and to monitor pipelines, including with Stackdriver. It also focuses on their skills related to assessing, improving, and troubleshooting data processing infrastructure and data representations. This area will also require that test takers demonstrate the capacity to resize and autoscale resources;
Ensure Fidelity & Reliability: Applicants should be able to carry out data preparation & quality control (such as with Cloud Dataprep), verify and monitor pipelines, and plan, execute, and stress-test data recovery (including rerunning failed jobs, fault tolerance, and performing retrospective re-analysis). Besides that, they should be able to choose between ACID, idempotent, and eventually consistent requirements;
Ensure Portability & Flexibility: The considerations for this domain include designing for application and data portability, including data residency requirements and multi-cloud. It also covers data staging, discovery, and cataloging, as well as mapping to current and future business requirements.
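The reliability items above lean heavily on idempotency: if rerunning a failed job can redeliver a message, the processing step must be safe to repeat. A minimal sketch of the dedup-by-id pattern, with a made-up payment ledger standing in for real pipeline state:

```python
def apply_once(ledger, message_id, amount):
    """Apply a payment at most once, keyed by message id.

    Rerunning a failed job (or an at-least-once delivery system such as
    Pub/Sub) may deliver the same message twice; recording processed ids
    makes the operation idempotent, so retries are always safe.
    """
    if message_id in ledger["seen"]:
        return ledger["balance"]  # duplicate delivery: no-op
    ledger["seen"].add(message_id)
    ledger["balance"] += amount
    return ledger["balance"]

ledger = {"balance": 0, "seen": set()}
apply_once(ledger, "m1", 50)
apply_once(ledger, "m2", 25)
apply_once(ledger, "m1", 50)  # retry of m1: ignored
print(ledger["balance"])  # 75
```

This is the tradeoff the exam objective points at: a non-idempotent step needs exactly-once delivery guarantees, while an idempotent step tolerates the cheaper at-least-once semantics.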