What aspects are included in BigQuery ML preprocessing?
What does feature representation include?
What does feature construction include?
What two types of feature preprocessing does BigQuery ML support?
Automatic and manual.
When does BigQuery ML automatic preprocessing occur?
Automatic preprocessing occurs during training.
BigQuery ML provides the TRANSFORM clause for you to define custom preprocessing using the manual preprocessing functions.
True
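To make the TRANSFORM clause concrete, here is a hypothetical CREATE MODEL statement held in a Python string (the dataset, table, and column names are made up for illustration; in a real project the query would be submitted with the BigQuery client library).

```python
# Hypothetical BigQuery ML query using the TRANSFORM clause for manual
# preprocessing. All table and column names are illustrative only.
create_model_sql = """
CREATE OR REPLACE MODEL mydataset.taxi_fare_model
TRANSFORM(
  ML.FEATURE_CROSS(STRUCT(pickup_zone, dropoff_zone)) AS route,
  ML.QUANTILE_BUCKETIZE(trip_distance, 10) OVER() AS distance_bucket,
  fare_amount  -- the label passes through untouched
)
OPTIONS(model_type='linear_reg', input_label_cols=['fare_amount']) AS
SELECT pickup_zone, dropoff_zone, trip_distance, fare_amount
FROM mydataset.taxi_trips
"""

print("TRANSFORM" in create_model_sql)
```

Because the preprocessing lives inside the model definition, BigQuery ML applies the same transforms automatically at prediction time.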
How is BigQuery data processing realized?
List some of the advanced feature engineering preprocessing functions in BigQuery ML.
ML.FEATURE_CROSS, ML.BUCKETIZE, ML.QUANTILE_BUCKETIZE, ML.POLYNOMIAL_EXPAND, and ML.NGRAMS.
When does Memorization work in ML?
Memorization works when you have so much data that, for any single grid cell within your input space, the distribution of data is statistically significant.
Feature crosses are about memorization.
True
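A feature cross gives each combination of categorical values its own "cell" that the model can memorize. Here is a stdlib-only sketch of the hashed-cross idea behind ML.FEATURE_CROSS and TensorFlow's crossed columns (the bucket count is an arbitrary choice for illustration):

```python
import hashlib

def feature_cross(values, num_buckets=1000):
    """Map a tuple of categorical values to a single sparse bucket index.

    Concatenate the values, hash deterministically, and take the result
    modulo the bucket count -- the standard hashed feature-cross trick.
    """
    key = "_x_".join(str(v) for v in values)
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_buckets

# Each (day_of_week, hour) pair lands in its own cell, which the model
# can learn a separate weight for -- i.e., memorize.
bucket = feature_cross(("Tue", 17))
print(0 <= bucket < 1000)  # True
```

The cross is useful precisely when enough data falls into each cell, which is why feature crosses are described as memorization.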
What are the benefits of sparse models?
What is used for windowed functions in BigQuery ML?
For these types of time-windowed features, you will use Apache Beam to build batch and streaming data pipelines.
The tf.data API enables you to build complex input pipelines from simple, reusable pieces.
True
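Since tf.data may not be available here, the composability idea can be sketched with plain generators: small reusable stages (a map stage, a batch stage) chained into one input pipeline, analogous to `tf.data.Dataset.from_tensor_slices(...).map(fn).batch(n)`.

```python
# Stdlib sketch of the tf.data idea: build a complex input pipeline by
# composing simple, reusable pieces.
def map_fn(dataset, fn):
    """Apply fn to every example (analogous to Dataset.map)."""
    for example in dataset:
        yield fn(example)

def batch(dataset, size):
    """Group examples into fixed-size batches (analogous to Dataset.batch)."""
    buf = []
    for example in dataset:
        buf.append(example)
        if len(buf) == size:
            yield buf
            buf = []
    if buf:
        yield buf  # final partial batch

pipeline = batch(map_fn(range(10), lambda x: x * 2), size=4)
print(list(pipeline))  # [[0, 2, 4, 6], [8, 10, 12, 14], [16, 18]]
```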
What is a complementary technology to Apache Beam?
Cloud Dataflow.
What do you need to do to implement a data processing pipeline?
To implement a data processing pipeline, you write your code using the Apache Beam APIs and then deploy the code to Cloud Dataflow.
The Apache Beam SDK comes with a variety of connectors that enable Dataflow to read from many data sources, including text files in Google Cloud Storage or file systems, and even from real-time streaming sources such as Google Cloud Pub/Sub or Kafka.
True
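The Beam programming model is a chain of transforms from a source to a sink. Since the Apache Beam SDK may not be installed here, this is a stdlib sketch of that shape; in real Beam it would read like `p | ReadFromText(...) | beam.Map(parse) | beam.Filter(pred) | WriteToBigQuery(...)`, and the same code would be deployed unchanged to Cloud Dataflow.

```python
# Stdlib sketch of the Beam pipeline shape: source -> transforms -> sink.
class Pipeline:
    def __init__(self, source):
        self.data = list(source)       # stands in for a Read connector

    def map(self, fn):                 # analogous to beam.Map
        self.data = [fn(x) for x in self.data]
        return self

    def filter(self, pred):            # analogous to beam.Filter
        self.data = [x for x in self.data if pred(x)]
        return self

    def to_sink(self):                 # stands in for a Write connector
        return self.data

rows = (Pipeline(["3.2", "-1.0", "7.5"])
        .map(float)
        .filter(lambda f: f > 0)
        .to_sink())
print(rows)  # [3.2, 7.5]
```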
What are contention issues?
Contention issues arise when multiple servers try to acquire a lock on the same file concurrently.
What is one key advantage of preprocessing your features using Apache Beam?
The same code you use to preprocess features in training and evaluation can also be used in serving.
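The train/serve reuse point can be sketched in a few lines: keep the preprocessing in one function and call it from both the (batch) training path and the (single-request) serving path, so no skew can creep in. The feature names below are made up for illustration.

```python
# One preprocessing function, shared by training and serving.
def preprocess(row):
    return {
        "distance_km": row["distance_m"] / 1000.0,
        "is_weekend": row["day"] in ("Sat", "Sun"),
    }

# Training path (e.g. inside a Beam pipeline over historical rows)...
train_features = preprocess({"distance_m": 4200, "day": "Sat"})

# ...and serving path (a single live request) call the identical code,
# so the model sees features computed the same way in both phases.
serve_features = preprocess({"distance_m": 4200, "day": "Sat"})

print(train_features == serve_features)  # True
```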
What is the purpose of a Cloud Dataflow connector?
Connectors allow you to output the results of a pipeline to a specific data sink like Bigtable, Google Cloud Storage, flat file, BigQuery, and more.
What are ways you could do feature engineering within TensorFlow?
Within TensorFlow itself using feature columns, or by wrapping the feature dictionary and adding arbitrary TensorFlow code.
Why do you need to use TensorFlow code when wrapping the feature dictionary?
Because this code is executed as part of the model function, it must be TensorFlow code so that it becomes part of the TensorFlow graph.
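The feature-dictionary wrapping can be sketched as follows. In a real model the arithmetic would be TensorFlow ops so it joins the graph; plain Python floats stand in here, and the feature names are invented for the example.

```python
# Sketch: a function that wraps the feature dictionary and adds an
# engineered feature. Inside a model function this would run in both
# training and serving.
def add_engineered(features):
    features["euclidean"] = (
        (features["pickup_lat"] - features["dropoff_lat"]) ** 2
        + (features["pickup_lon"] - features["dropoff_lon"]) ** 2
    ) ** 0.5
    return features

features = add_engineered({
    "pickup_lat": 40.7, "dropoff_lat": 40.7,
    "pickup_lon": -74.0, "dropoff_lon": -73.9,
})
print(round(features["euclidean"], 4))  # 0.1
```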
What is the limit of feature pre-processing in TensorFlow?
The limitation is that preprocessing can be applied only to a single input at a time.
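The single-input limitation is easiest to see with mean-centering: a stateless per-example function can subtract a mean, but the mean itself must come from a separate full-dataset analysis pass done beforehand (automating that analyze-then-transform split is what TensorFlow Transform is for). The numbers below are illustrative.

```python
values = [10.0, 12.0, 14.0, 16.0]

# Full-dataset analysis pass -- NOT possible inside per-example
# preprocessing, which only ever sees one input at a time:
mean = sum(values) / len(values)  # 13.0

# Per-example transform pass: stateless once the constant is baked in.
def scale(x, mean=mean):
    return x - mean

print([scale(v) for v in values])  # [-3.0, -1.0, 1.0, 3.0]
```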
TensorFlow models (sequence models are an exception) tend to be stateless.
True
How does TensorFlow Transform work?