Before you get to the answers below, here are some facts about the exam.
Exam Name – GCP Big Data and Machine Learning Assessment.
Total Questions and Time Limit – 30 Questions and there is no time limit.
Q.1 – Cloud IoT is a set of fully managed and integrated services that allows organizations to easily and securely connect, manage, and collect data from devices across the globe at a large scale. Knowing this, what stage of big data processing does Cloud IoT belong in?
(A) Availability, security, and preferred locations
(B) Customer 360 analysis and log analysis
(C) Massively parallel databases and multiple data marts
(D) Machine learning APIs and Tensorflow
(A) Ease of use and speed
(B) Idle clusters and scaling inflexibility
(C) Integration and customization
(D) High CPU cores and GPUs
(A) Process queries written in structured query language (SQL).
(B) Scale without downtime.
(C) Perform the transformations in “extract, transform, and load (ETL).”
(D) Develop apps faster and easier with cloud backend services.
Q.5 – Cloud Dataflow is a fully managed service for transforming and enriching data in stream (real time) and batch (historical) modes with equal reliability and expressiveness. Knowing this, where does Cloud Dataflow fit in the big data processing model?
(A) Cloud Pub/Sub
(B) Cloud Engine
(D) Data Warehouse
(A) Eliminates the need to buy, build, and operate computing hardware.
(B) Getting queries answered rapidly over very large data sets.
(C) Accelerates development for batch and streaming data processing pipelines.
(D) Allows for fast SQL queries on structured data.
Q.8 – The Cloud Dataproc approach allows organizations to use Hadoop/Spark/Hive/Pig when needed. It takes on average only 90 seconds between the moment resources are requested and a job can be submitted. What makes this possible?
(A) The absence of management and maintenance.
(B) The separation of storage and compute.
(C) The configuration of jobs and workflows.
(D) The use of queries and containers.
(A) Ingest and Process
(B) Storage and Analyze
(C) Apps and Devices
(D) Ingest and Storage
(A) scalable, batch
(B) pre-trained, tailored
(C) storage, process
(D) virtual machine, dataset
(A) Predictive datasets
(B) Market datasets
(C) QA datasets
(D) Training datasets
(A) Multiple data marts are inefficient: they are complex and costly, and they make data difficult to use.
(B) Organizations that want to take advantage of machine learning need to centralize their data with a managed data store that can consolidate structured and semi-structured data.
(C) Accepting that most devices can theoretically be connected to a network, building and managing such networks in a global, secure way—and then getting data out of them for analysis—is complex and difficult for organizations.
(D) Organizations find it difficult to stay ahead when they continuously have to accommodate new data sources and more data without sacrificing efficiency.
(A) Ease of use and implementation
(D) Reduces OPEX
(A) Cloud Dataproc allows organizations to scale data storage and ensures accessibility without compromising security.
(B) Cloud Dataproc allows organizations to transform and enrich data in stream and batch modes.
(C) Cloud Dataproc allows organizations to easily use MapReduce, Pig, Hive, and Spark to process data before storing it, and it helps organizations interactively analyze data with Spark and Hive.
(D) Cloud Dataproc allows organizations to ingest event streams from anywhere, at any scale, for simple, reliable, real-time stream analytics.
(A) Query > Database > Table with Data
(B) Query > Table with Data > Database
(C) Table with Data > Database > Query
(D) Table with Data > Query > Database
(A) Cloud Dataflow takes a query and runs information from the database through it to produce tables of data organized according to the requirements of the original query.
(B) Cloud Dataflow reads data in and can apply filtering, grouping, comparing, joining, or aggregation.
(C) Cloud Dataflow relies on a large database to store and analyze data processing pipelines, performing transforms resulting in predictive analytics that can be leveraged to optimize business decisions.
(D) Cloud Dataflow relies on training data to enable machine learning that can then read multiple streams of data and perform transforms that produce resulting output data.
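The filtering, grouping, and aggregation named in option (B) above can be illustrated with a small plain-Python sketch. This is not the Cloud Dataflow or Apache Beam API; the `run_pipeline` function, the event records, and the `min_amount` threshold are hypothetical names chosen only for illustration.

```python
from collections import defaultdict


def run_pipeline(events, min_amount):
    """Filter events, group them by user, and sum the amounts per user."""
    # Filter: keep only events at or above the (hypothetical) threshold.
    kept = (e for e in events if e["amount"] >= min_amount)

    # Group and aggregate: accumulate a total amount for each user.
    totals = defaultdict(int)
    for event in kept:
        totals[event["user"]] += event["amount"]
    return dict(totals)


# Hypothetical input records standing in for data read from a source.
events = [
    {"user": "a", "amount": 5},
    {"user": "b", "amount": 1},
    {"user": "a", "amount": 7},
    {"user": "b", "amount": 10},
]

result = run_pipeline(events, min_amount=5)
print(result)
```

A managed service would run the same kind of steps over batch or streaming input at scale; the sketch only shows the shape of the transforms.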
(A) Cloud Machine Learning Engine makes it easy to build sophisticated, large-scale machine learning models across a broad set of scenarios.
(B) Cloud Video Intelligence API makes videos searchable and discoverable by extracting metadata, identifying key nouns, and annotating the content of the video.
(C) Cloud Translation API provides a simple programmatic interface for translating an arbitrary string into any supported language.
(D) Cloud Job Discovery provides a highly intuitive job search that anticipates what job seekers are looking for and surfaces targeted recommendations that help them discover new opportunities.
(A) Cloud Dataproc runs clusters ephemerally; in other words, only when needed.
(B) Cloud Dataproc runs clusters indefinitely, cutting down on wasted time typically spent on spinning up resources.
(C) Cloud Dataproc charges at a per-minute rate for each cluster, reducing costs.
(D) Cloud Dataproc attaches storage or hard drives to each node of the cluster.
(A) BigQuery isolates data for machine learning.
(B) BigQuery connects globally distributed industrial devices into a single network that can be managed efficiently.
(C) BigQuery lets data analysts run data processing pipelines to do transforms on incoming streaming data.
(D) BigQuery improves analytics, lowers warehousing costs, and includes connectivity to other GCP products.
(A) “I need access to near real-time reports, even if the data is speculative or sampled.”
(B) “I need to know what customers are doing right now, and I need to find out using my existing Hadoop tools.”
(C) “I need to transfer my data from my on-premises solutions to the cloud.”
(D) “I wish I could run queries to organize my batch data.”
Q.21 – An organization’s analysts use Spark Shell. However, their IT department is concerned about the increase in usage and how to scale their cluster, which is running in Standalone mode. How does Cloud Dataproc help?
(A) Cloud Dataproc consolidates data marts into datasets and provides the ability to simply manage all datasets.
(B) Cloud Dataproc can act as a landing zone for log data at a low cost.
(C) Cloud Dataproc supports Spark and can create clusters that scale for speed and mitigate any single point of failure.
(D) Cloud Dataproc enables you to convert audio to text by applying neural network models in an easy-to-use API.
(A) Almost NoOps, with downtime-free upgrades and maintenance
(B) ZeroOps, with governance or maintenance required
(C) No queries
(D) Empty space storage
(A) Queries, machine learning, and compute
(B) VMs, network, and Non-SQL
(C) Availability, throughput, and latency
(D) Sources, sinks, and transforms
(A) Is focused on enabling computers to recognize patterns in data—without humans telling the computer how to recognize the patterns.
(B) Is how you retrieve information from a database.
(C) Is a service to help capture data and rapidly pass massive numbers of messages between other big data tools.
(D) Is a tool for developing and executing a wide range of data processing patterns on very large datasets.
(A) Publisher applications can send messages to a subscriber.
(B) Subscriber applications can subscribe to a topic to receive the message when the subscriber is ready.
(C) Publisher applications can receive messages from a topic.
(D) Subscriber applications can send messages on a topic directly to publisher applications.
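The publisher, topic, and subscriber roles in the options above can be sketched with a tiny in-memory model. This is an illustrative toy, not the Cloud Pub/Sub client library; the `Broker` class, its method names, and the topic and subscriber names are all assumptions made for the example.

```python
from collections import defaultdict, deque


class Broker:
    """Toy in-memory broker: publishers write to topics, subscribers pull."""

    def __init__(self):
        # topic name -> {subscriber name -> queue of pending messages}
        self._queues = defaultdict(dict)

    def subscribe(self, topic, subscriber):
        # Register a subscriber on a topic with its own message queue.
        self._queues[topic].setdefault(subscriber, deque())

    def publish(self, topic, message):
        # A publisher sends to the topic, never directly to a subscriber.
        for queue in self._queues[topic].values():
            queue.append(message)

    def pull(self, topic, subscriber):
        # A subscriber receives the next message when it is ready,
        # or None if nothing is pending or it never subscribed.
        queue = self._queues.get(topic, {}).get(subscriber)
        return queue.popleft() if queue else None


broker = Broker()
broker.subscribe("sensors", "dashboard")
broker.publish("sensors", "temp=21")
broker.publish("sensors", "temp=22")

first = broker.pull("sensors", "dashboard")
print(first)
```

The point of the design is the decoupling: the publisher only knows the topic name, and each subscriber pulls from its own queue at its own pace.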
(A) Cloud Pub/Sub helps capture data and rapidly pass massive numbers of messages securely between other Google Cloud Platform big data tools and other software applications.
(B) Cloud Pub/Sub offers a solution for analyzing big data and can open the door for other Google Cloud Platform big data tools.
(C) Cloud Pub/Sub allows organizations to access their data anywhere, anytime as an innovative storage solution in the cloud, acting as a repository of data collected by other Google Cloud Platform big data tools.
(D) Cloud Pub/Sub takes the existing data processing pipeline and processes it alongside an incoming stream of input data, performs transforms on that data to gain useful or actionable insights, and produces resulting output data.
(A) It connects globally distributed industrial devices into a single network that can be managed efficiently and serves as a new data source for an organization’s analytic systems to support improved operational efficiency.
(B) It has a deep interoperability with business intelligence (BI) tools, allowing it to connect multiple devices around the globe through the tools themselves.
(C) It is fully managed, with downtime-free upgrades and maintenance and seamless scaling, and it provides the benefits of operating on almost NoOps.
(D) It uses the Spark Machine Learning Libraries (MLlib) to run classification algorithms on very large datasets, relying on cloud-based machines where Spark can be installed and customized.
(A) Cloud IoT enables growth through a better user experience that can increase usage and adoption of a product.
(B) Cloud IoT facilitates machine learning with actionable insights by processing and analyzing data in real time.
(C) Cloud IoT provides compute power across the globe at nearline locations.
(D) Cloud IoT scales with big data workloads so that organizations can collect more data from more devices.
(A) Reduces risk
(B) Massively parallel databases
(C) Optimizes cost
(A) Cloud Pub/Sub stores data and ensures accessibility without compromising security.
(B) Cloud Pub/Sub analyzes data to capture insights to be used for more informed decision making.
(C) Cloud Pub/Sub processes queries, running them against a database of data to produce tables of the results.
(D) Cloud Pub/Sub ingests event streams from anywhere, at any scale, for simple, reliable, real-time stream analytics.
About GCP Big Data and Machine Learning Assessment
Google conducts this assessment through the Skillshop platform. It tests your understanding of how Big Data and Machine Learning are used in Google Cloud Platform, covering BigQuery, Cloud Dataflow, Cloud Pub/Sub, Cloud Dataproc, machine learning, and the Internet of Things (IoT).
The assessment contains 30 questions and has no time limit. You cannot review or edit questions that you have already answered. You need to score 80% or above to pass. If you fail on your first try, you can retake the exam after 24 hours.
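As a quick check of what the 80% pass mark means for a 30-question assessment, the short sketch below computes the minimum number of correct answers. The function name `min_correct` is hypothetical, and the calculation assumes every question carries equal weight, which the source does not state.

```python
def min_correct(total_questions, pass_percent):
    """Smallest number of correct answers that reaches the pass percentage."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return (total_questions * pass_percent + 99) // 100


needed = min_correct(30, 80)
print(needed)  # 24
```

Under that equal-weight assumption, you can miss at most 6 of the 30 questions and still pass.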