On AI and feature stores
satya - 3/29/2024, 1:57:47 PM
How are feature stores used in GenAI?
satya - 4/1/2024, 9:27:35 PM
A good video on general-purpose ML feature stores
satya - 4/1/2024, 9:35:11 PM
What are features: the ML perspective and the naming
- X: the input columns
- Y: one or more outputs, given X
- X is called the feature set (the list of columns)
- Y is sometimes called the labels or outputs
- You train a model by passing it (X, Y) pairs
- The trained model is then represented by a set of learned parameters, which can make predictions without the original training X and Y
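To make the naming concrete, here is a minimal sketch. The data is made up, and the closed-form least-squares fit is only a stand-in for whatever training procedure a real model would use. The point is that after training, only the parameters are kept.

```python
# Toy illustration of X (features), Y (labels), and learned parameters.
# All data and function names here are hypothetical.

def train(X, Y):
    """Fit y = w * x + b by ordinary least squares on a single feature column."""
    n = len(X)
    mean_x = sum(X) / n
    mean_y = sum(Y) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y))
    var_x = sum((x - mean_x) ** 2 for x in X)
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return {"w": w, "b": b}  # the "model" is just these parameters


def predict(params, x):
    """Inference needs only the parameters and a new input value."""
    return params["w"] * x + params["b"]


X = [1.0, 2.0, 3.0, 4.0]  # feature set (one column)
Y = [3.0, 5.0, 7.0, 9.0]  # labels, generated here as y = 2x + 1
params = train(X, Y)
del X, Y                   # the training data is no longer needed

print(params)
print(predict(params, 10.0))
```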
satya - 4/1/2024, 9:37:48 PM
Then what is a feature store
- It is a database of features that can participate in a variety of experiments, supplying the "X".
- The word "Entity" refers to the table that keeps the columns of a given set of "X"
- There can be many such feature tables, each table having many features
- During training, you take a subset of these features to train a model
- It also has a role during inference (real-time APIs)
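A rough sketch of that idea, using plain Python dictionaries as a stand-in for a real feature store. The table names, feature names, and the `training_rows` helper are all hypothetical.

```python
# Hypothetical in-memory feature store: one table per entity, keyed by an id.
feature_store = {
    "user": {
        101: {"age": 34, "orders_30d": 5, "avg_basket": 42.5},
        102: {"age": 27, "orders_30d": 1, "avg_basket": 18.0},
    },
    "merchant": {
        "m1": {"rating": 4.6, "category": "grocery"},
    },
}


def training_rows(store, entity, feature_names, ids):
    """Pick a subset of feature columns from one entity table, in the given id order."""
    table = store[entity]
    return [[table[i][name] for name in feature_names] for i in ids]


# An experiment that only needs two of the three user features.
X = training_rows(feature_store, "user", ["age", "orders_30d"], [101, 102])
print(X)  # [[34, 5], [27, 1]]
```

A different experiment could ask for a different subset of columns from the same tables without re-deriving the features.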
satya - 4/1/2024, 9:39:12 PM
Ingestion into the feature store
- Can use any ETL tool, including Spark
- You can store the features in S3, HDFS, or a warehouse, all of which are suitable for batch access
- Or in a store that is optimized for both real-time and batch access, such as Azure Data Explorer (ADX)
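The ingestion path can be sketched as a tiny extract-transform-load step. This is only an illustration in plain Python. The raw events, the feature names, and the two "stores" (a CSV buffer standing in for S3/HDFS/warehouse, a dictionary standing in for a low-latency store) are assumptions, not any product's API.

```python
import csv
import io
from collections import defaultdict

# Hypothetical raw events, standing in for a source system.
raw_events = [
    {"user_id": "u1", "amount": "20.0"},
    {"user_id": "u1", "amount": "30.0"},
    {"user_id": "u2", "amount": "5.0"},
]

FIELDS = ["user_id", "order_count", "total_spend"]


def transform(events):
    """Aggregate raw events into one feature row per user."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for event in events:
        totals[event["user_id"]] += float(event["amount"])
        counts[event["user_id"]] += 1
    return [
        {"user_id": u, "order_count": counts[u], "total_spend": totals[u]}
        for u in sorted(totals)
    ]


def load_offline(rows, fileobj):
    """Batch-friendly copy: CSV output (stand-in for S3, HDFS, or a warehouse)."""
    writer = csv.DictWriter(fileobj, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


def load_online(rows):
    """Key-value copy for fast lookup by id (stand-in for a real-time store)."""
    return {row["user_id"]: row for row in rows}


rows = transform(raw_events)
offline = io.StringIO()
load_offline(rows, offline)
online = load_online(rows)

print(offline.getvalue())
print(online["u1"])
```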
satya - 4/1/2024, 9:40:23 PM
Real-time use of a feature store
- Given a key such as a user id,
- You can ask the feature store for a subset of the features
- And feed them to a model that was trained on those features to get back an answer
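A minimal sketch of that request flow. The online store contents, the model weights, and the function names are all made up for illustration. A real deployment would call an actual feature store client and a trained model.

```python
# Hypothetical online store: feature rows keyed by user id.
online_store = {
    "u1": {"order_count": 2, "total_spend": 50.0, "age": 34},
    "u2": {"order_count": 1, "total_spend": 5.0, "age": 27},
}

# The model was (hypothetically) trained on exactly these features, in this order.
MODEL_FEATURES = ["order_count", "total_spend"]
MODEL_WEIGHTS = {"order_count": 0.5, "total_spend": 0.01}  # made-up parameters
MODEL_BIAS = -1.0


def get_online_features(user_id, feature_names):
    """Look up only the features the model needs for one user."""
    row = online_store[user_id]
    return [row[name] for name in feature_names]


def score(features):
    """Linear model: bias plus weighted sum of the feature values."""
    weighted = sum(MODEL_WEIGHTS[name] * value
                   for name, value in zip(MODEL_FEATURES, features))
    return MODEL_BIAS + weighted


def handle_request(user_id):
    """What a real-time API handler would do: fetch features, then run the model."""
    return score(get_online_features(user_id, MODEL_FEATURES))


print(handle_request("u1"))
```

Note that the request carries only the user id. The features come from the store, so the caller does not need to know how they were computed.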
satya - 4/1/2024, 9:43:38 PM
Other components
- Metadata that describes all features, for discovery and cataloging
- A reconciliation service that keeps data consistent when it gets duplicated, whether across source copies or between real-time and batch copies
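Both components can be sketched in a few lines. The catalog entries and the `search` and `reconcile` helpers are hypothetical. The reconciliation here only detects mismatches between two copies, which is the simplest part of what such a service would do.

```python
# Hypothetical metadata catalog: one record per feature.
catalog = [
    {"name": "order_count", "entity": "user", "dtype": "int",
     "description": "Orders in the last 30 days"},
    {"name": "total_spend", "entity": "user", "dtype": "float",
     "description": "Total spend in the last 30 days"},
    {"name": "rating", "entity": "merchant", "dtype": "float",
     "description": "Average customer rating"},
]


def search(catalog, keyword):
    """Discovery: find feature names whose name or description mentions the keyword."""
    keyword = keyword.lower()
    return [f["name"] for f in catalog
            if keyword in f["name"].lower() or keyword in f["description"].lower()]


def reconcile(offline, online):
    """Return (key, feature) pairs that differ or are missing between two copies."""
    mismatches = []
    for key in sorted(set(offline) | set(online)):
        off_row = offline.get(key, {})
        on_row = online.get(key, {})
        for feature in sorted(set(off_row) | set(on_row)):
            if off_row.get(feature) != on_row.get(feature):
                mismatches.append((key, feature))
    return mismatches


offline_copy = {"u1": {"order_count": 2}, "u2": {"order_count": 1}}
online_copy = {"u1": {"order_count": 2}, "u2": {"order_count": 3}}

print(search(catalog, "spend"))               # ['total_spend']
print(reconcile(offline_copy, online_copy))   # [('u2', 'order_count')]
```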
satya - 4/1/2024, 9:49:23 PM
Medium: A deeper article by the same folks behind the video above
satya - 4/1/2024, 9:50:19 PM
Feature stores for ML: Medium collection of articles
satya - 4/1/2024, 9:53:56 PM
How is it similar to any relational database?
- At the end of the day it is a set of tables and columns
- Albeit specifically designed to hold cleaned-up data suitable for ML training and ML prediction (inference)
- Needs to allow for ETL ingestion and consumption
- Needs to allow API access
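To see the resemblance, here is a sketch that uses Python's built-in `sqlite3` module as the "feature store". The table name, columns, and helper functions are hypothetical. Real feature stores add much more (versioning, point-in-time lookups, low-latency serving), so treat this only as the tables-and-columns core.

```python
import sqlite3

# An in-memory relational table playing the role of a feature table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_features ("
    "user_id TEXT PRIMARY KEY, order_count INTEGER, total_spend REAL)"
)

# ETL-style ingestion: bulk insert of already-cleaned feature rows.
conn.executemany(
    "INSERT INTO user_features VALUES (?, ?, ?)",
    [("u1", 2, 50.0), ("u2", 1, 5.0)],
)
conn.commit()


def batch_read(conn):
    """Batch consumption for training: all rows, selected columns."""
    return conn.execute(
        "SELECT order_count, total_spend FROM user_features ORDER BY user_id"
    ).fetchall()


def get_features(conn, user_id):
    """API-style access for inference: one row by key."""
    return conn.execute(
        "SELECT order_count, total_spend FROM user_features WHERE user_id = ?",
        (user_id,),
    ).fetchone()


print(batch_read(conn))          # [(2, 50.0), (1, 5.0)]
print(get_features(conn, "u1"))  # (2, 50.0)
```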
satya - 4/1/2024, 10:09:41 PM
Feature stores
Michelangelo, Uber
Feast
Hopsworks
Databricks Feature Store
Google Vertex AI
satya - 4/1/2024, 10:14:04 PM
Feature stores and LLMs
satya - 4/1/2024, 10:27:30 PM
LLM Feature store
satya - 4/2/2024, 7:21:00 AM
RAG and LLMs: from the Hopsworks feature store folks
satya - 4/2/2024, 7:21:47 AM
What is a feature store: A LinkedIn guide