In episode 89 of The Gradient Podcast, Daniel Bashir speaks to Shreya Shankar.
Shreya is a computer scientist pursuing her PhD in databases at UC Berkeley. Her research interest is in building end-to-end systems for people to develop production-grade machine learning applications. She was previously the first ML engineer at Viaduct, did research at Google Brain, and worked as a software engineer at Facebook. She graduated from Stanford with a B.S. and M.S. in computer science with concentrations in systems and artificial intelligence. At Stanford, she helped run SHE++, an organization that helps empower underrepresented minorities in technology.
Have suggestions for future podcast guests (or other feedback)? Let us know here or reach us at editor@thegradient.pub
Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter
Outline:
- (00:00) Intro
- (02:22) Shreya’s background and journey into ML / MLOps
  - (04:51) ML advances in 2013-2016
  - (05:45) Shift in Stanford undergrad class ecosystems, accessibility of deep learning research
- (09:10) Why Shreya left her job as an ML engineer
  - (13:30) How Shreya became interested in databases, data quality in ML
- (14:50) Daniel complains about things
- (16:00) What makes ML engineering uniquely difficult
- (16:50) Being a “historian of the craft” of ML engineering
- (22:25) Levels of abstraction, what ML engineers do/don’t have to think about
- (24:16) Observability for Production ML Pipelines
  - (28:30) Metrics for real-time ML systems
  - (31:20) Proposed solutions
- (34:00) Moving Fast with Broken Data
  - (34:25) Existing data validation measures and where they fall short
  - (36:31) Partition summarization for data validation
  - (38:30) Small data and quantitative statistics for data cleaning
- (40:25) Streaming ML Evaluation
  - (40:45) What makes a metric actionable
  - (42:15) Differences in streaming ML vs. batch ML
  - (45:45) Delayed and incomplete labels
- (49:23) Operationalizing Machine Learning
  - (49:55) The difficult life of an ML engineer
  - (53:00) Best practices, tools, pain points
  - (55:56) Pitfalls in current MLOps tools
- (1:00:30) LLMOps / FMOps
- (1:07:10) Thoughts on ML Engineering, MLE through the lens of data engineering
- (1:10:42) Building products, user expectations for AI products
- (1:15:50) Outro
Links: