Feature Store Flashcards

1
Q

Online feature store

A

Low-latency key-value store that holds the latest values of precomputed features

Online = latest
Offline = historical

2
Q

Offline feature store

A

Holds all historical values of features to be used for training and batch inference
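
The online/offline split can be sketched with a toy in-memory store. This is only an illustration: `ToyFeatureStore` and the feature name `txn_count_7d` are hypothetical and not part of any real feature store API.

```python
from collections import defaultdict


class ToyFeatureStore:
    """Hypothetical in-memory stand-in for an online/offline store pair."""

    def __init__(self):
        self.online = {}                   # (entity_key, feature) -> (ts, latest value)
        self.offline = defaultdict(list)   # (entity_key, feature) -> [(ts, value), ...]

    def write(self, entity_key, feature, ts, value):
        key = (entity_key, feature)
        # Offline store: keep every value ever written (history).
        self.offline[key].append((ts, value))
        # Online store: keep only the most recent value.
        current = self.online.get(key)
        if current is None or ts >= current[0]:
            self.online[key] = (ts, value)

    def get_online(self, entity_key, feature):
        return self.online[(entity_key, feature)][1]

    def get_history(self, entity_key, feature):
        return sorted(self.offline[(entity_key, feature)])


store = ToyFeatureStore()
store.write("user_1", "txn_count_7d", ts=1, value=3)
store.write("user_1", "txn_count_7d", ts=2, value=5)
print(store.get_online("user_1", "txn_count_7d"))   # 5
print(store.get_history("user_1", "txn_count_7d"))  # [(1, 3), (2, 5)]
```

Serving reads `get_online`; training and batch inference read `get_history`.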

3
Q

Batch inference

A

Also known as offline inference
Generates many predictions all at once
Example: Netflix rec system. If recommendations are generated in batch each night, a user who has just signed up will not see personally tailored recommendations until the next batch run.

4
Q

Online inference

A

Also known as real-time inference or dynamic inference
Generates a prediction in real time upon request
Can generate predictions for never-before-seen data (new users)
Example: DoorDash estimated time of delivery. Not a batch job that ran the night before!
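
A minimal sketch of the batch vs online difference, assuming a made-up `predict` function in place of a trained model. All names and numbers here are hypothetical.

```python
def predict(features):
    # Hypothetical stand-in for a trained model: a fixed linear score.
    return 2.0 * features["orders_last_30d"] + 1.0


# Batch (offline) inference: score every known user at once and store the results.
known_users = {"alice": {"orders_last_30d": 4}, "bob": {"orders_last_30d": 1}}
nightly_predictions = {user: predict(f) for user, f in known_users.items()}


def serve_batch(user):
    # Only a lookup: users who signed up after the job ran have no prediction.
    return nightly_predictions.get(user)


def serve_online(user, features):
    # The prediction is computed at request time, so new users are covered.
    return predict(features)


print(serve_batch("alice"))                           # 9.0
print(serve_batch("carol"))                           # None
print(serve_online("carol", {"orders_last_30d": 0}))  # 1.0
```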

5
Q

Materialization

A

Process of precomputing feature data by executing a feature pipeline and publishing the results to the online and offline feature stores
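
As a rough sketch of that process: run a feature pipeline over raw events, then publish the result to both stores. The event schema, `feature_pipeline`, and `materialize` are assumptions made for illustration, not Tecton's API.

```python
from collections import defaultdict

raw_events = [
    {"user_id": "u1", "ts": 1, "amount": 10.0},
    {"user_id": "u1", "ts": 2, "amount": 30.0},
    {"user_id": "u2", "ts": 2, "amount": 5.0},
]


def feature_pipeline(events, as_of):
    """Total spend per user over events with ts <= as_of."""
    totals = defaultdict(float)
    for event in events:
        if event["ts"] <= as_of:
            totals[event["user_id"]] += event["amount"]
    return dict(totals)


online_store = {}    # user_id -> latest value only
offline_store = []   # append-only (user_id, as_of, value) rows


def materialize(as_of):
    # Precompute the feature, then publish to both stores.
    for user_id, value in feature_pipeline(raw_events, as_of).items():
        offline_store.append((user_id, as_of, value))
        online_store[user_id] = value


materialize(as_of=1)
materialize(as_of=2)
print(online_store)   # {'u1': 40.0, 'u2': 5.0}
print(offline_store)  # [('u1', 1, 10.0), ('u1', 2, 40.0), ('u2', 2, 5.0)]
```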

6
Q

What is a Tecton feature view?

A

A feature view defines one or more features whose values are generated when the feature view’s transformations run

7
Q

Tecton entity

A

A collection of join keys used when multiple features are joined together

8
Q

What are the benefits of a feature store?

A

Uses one feature definition for training and serving
Reuse features across models (feature discovery)
Manage feature lineage and versioning
Orchestration of feature compute
Storage of features

9
Q

What triggers rematerialization of feature values?

A

Changes to pipelines (transforms and entities)

10
Q

Feature Store vs Feature Platform

A

Feature stores typically store and serve features (meaning they have an API for low latency retrieval)

A feature platform also includes things like defining, testing, orchestrating, monitoring, and managing features

11
Q

Tecton workspace

A

Cloud environment to which a Tecton feature repo is applied; applying the repo updates the workspace configuration

Live workspaces are intended for serving
Development workspaces do not materialize feature data

12
Q

Spine

A

DataFrame whose rows (typically join keys plus a timestamp) identify the feature data to be read from the offline store
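
A minimal sketch of reading feature data for a spine, using plain Python lists instead of a DataFrame. It assumes each spine row carries a join key (`user_id`) and a timestamp, and that the lookup returns the latest feature value at or before that timestamp. The data and the helper `point_in_time_value` are hypothetical.

```python
from bisect import bisect_right

# Offline store: per join key, (timestamp, value) pairs sorted by timestamp.
offline = {
    "u1": [(1, 10.0), (5, 40.0)],
    "u2": [(3, 5.0)],
}

# Spine: one row per example, saying which entity and which point in time.
spine = [
    {"user_id": "u1", "ts": 4},
    {"user_id": "u1", "ts": 6},
    {"user_id": "u2", "ts": 2},
]


def point_in_time_value(history, ts):
    """Latest value with timestamp <= ts, or None if none exists yet."""
    timestamps = [t for t, _ in history]
    i = bisect_right(timestamps, ts)
    return history[i - 1][1] if i > 0 else None


training_rows = [
    {**row, "total_spend": point_in_time_value(offline.get(row["user_id"], []), row["ts"])}
    for row in spine
]
for row in training_rows:
    print(row)
# {'user_id': 'u1', 'ts': 4, 'total_spend': 10.0}
# {'user_id': 'u1', 'ts': 6, 'total_spend': 40.0}
# {'user_id': 'u2', 'ts': 2, 'total_spend': None}
```

Looking up values as of each row's timestamp keeps later feature values from leaking into training examples.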

13
Q

Batch Feature View

A

Reads from a batch data source and materializes features on a schedule

14
Q

Stream Feature Views

A

Transforms a stream data source and materializes features in near real time

15
Q

On Demand Feature Views

A

Request-time transformations on batch, stream, or request data sources
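
A request-time transformation can be sketched as a function that runs when features are fetched, combining request data with a precomputed feature. This is plain Python, not Tecton's API; `online_store`, `amount_vs_average`, and `get_features` are hypothetical names.

```python
# Precomputed (materialized) feature, as it might sit in an online store.
online_store = {"u1": {"avg_txn_amount_30d": 25.0}}


def amount_vs_average(request, precomputed):
    """Ratio of the requested amount to the user's stored average."""
    avg = precomputed["avg_txn_amount_30d"]
    return request["amount"] / avg if avg else None


def get_features(user_id, request):
    # The transformation runs at request time; its result is never stored.
    precomputed = online_store[user_id]
    return {**precomputed, "amount_vs_average": amount_vs_average(request, precomputed)}


print(get_features("u1", {"amount": 100.0}))
# {'avg_txn_amount_30d': 25.0, 'amount_vs_average': 4.0}
```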

16
Q

Data Lake

A

Storage for raw data kept in its original form, often unstructured

17
Q

Data Warehouse

A

Structured data that has been processed to serve business needs (BigQuery, Snowflake, Redshift)

18
Q

What is the offline store used for?

A

Model training and batch inference

19
Q

Analytics stacks

A

Tecton uses the term to mean the system that ingests, transforms, stores, and analyzes data from multiple sources. Not sure this is a widely used term.