Day 10/21

December 16, 2023 at 10:31 PM

- **MLOps with k8s - twiml (Page 16 /31)**
- Steps to consider: data acquisition, preprocessing, experiment management, model development, deployment, and monitoring (reporting).
- For ML at scale, focus on eliminating the *incidental* complexity.
- *incidental* complexity of machine learning ⇒ getting access to data, setting up servers, and scaling model training and inference.
- As opposed to its *intrinsic* complexity ⇒ identifying the right model, selecting the right features, and tuning the model’s performance to meet business goals.

- Key requirements
- **Multi-tenancy:** Dedicating a group of hardware to a specific team is inefficient; instead, create a shared environment for concurrent projects.
- **Elasticity:** The hardware should expand/shrink based on the requirements of the workload.
- **Immediacy:** Data scientists should have self-service access to it.
- **Programmability:** APIs to enable automated provisioning and maximise utilisation.
- The cloud does meet the above requirements; however, latency and economics can be optimised significantly on-prem. To learn more about a hybrid approach, watch “How Dukaan moved from cloud to on-prem” - Asli Engineering [link](https://www.youtube.com/watch?v=vFxQyZX84Ro)
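
A rough sketch of the elasticity requirement above: the function applies the proportional scaling rule documented for the Kubernetes Horizontal Pod Autoscaler, `ceil(current_replicas * current_metric / target_metric)`, and leaves out details such as its tolerance band. The function name, the replica bounds, and the example numbers are assumptions for illustration, not something from the ebook.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Scale the replica count in proportion to observed load.

    Simplified version of the proportional rule used by the Kubernetes
    Horizontal Pod Autoscaler; min/max bounds are illustrative defaults.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # Clamp so the workload never scales to zero or beyond the shared budget.
    return max(min_replicas, min(max_replicas, raw))
```

For example, 4 replicas running at 90% CPU against a 60% target gives `desired_replicas(4, 90, 60) == 6`, while the same 4 replicas at 30% would shrink to 2.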

- Containers and K8s
- K8s hierarchy - a declarative system
- Cluster: master → multiple worker nodes, each running a kubelet (agent)
- Kubeflow is one of the options that utilises K8s to deliver mlops capabilities.
- Other solutions: [TWIML Solutions Guide](https://twimlai.com/solutions/)
- Containers are, in general, ephemeral in nature
- Volumes (available only as long as the pod exists), Persistent Volumes (lifecycle managed by the cluster, independent of any pod)
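
To make "a declarative system" a bit more concrete, here is a minimal sketch of the idea: you state the desired state, and a control loop works out the actions needed to move the actual state towards it. This is a toy model, not how the Kubernetes controllers are actually implemented; `reconcile`, the workload names, and the action tuples are all hypothetical.

```python
def reconcile(desired, actual):
    """Return the actions that move `actual` towards `desired`.

    Both arguments map a workload name to a replica count.
    Toy illustration of a declarative control loop, not a real K8s API.
    """
    actions = []
    for name, want in sorted(desired.items()):
        have = actual.get(name, 0)
        if have < want:
            actions.append(("start", name, want - have))
        elif have > want:
            actions.append(("stop", name, have - want))
    # Anything still running that is no longer declared gets stopped.
    for name in sorted(set(actual) - set(desired)):
        if actual[name] > 0:
            actions.append(("stop", name, actual[name]))
    return actions
```

Running the loop repeatedly is what keeps the system converged: once actual matches desired, `reconcile` returns an empty list and nothing happens.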

![Untitled](https://prod-files-secure.s3.us-west-2.amazonaws.com/d2df9e4d-9311-4c0c-9701-1e0536a3aba8/d49fc8f2-e8f8-4d09-aafc-dc57f87b24ea/Untitled.png)

pg 17

- CSI (Container Storage Interface) and other extension points - custom resources, operators, scheduler extensions, CNI (Container Network Interface), device plugins
- **Exercise idea:** Containerise the training and inference part of a simple machine learning use case and orchestrate the process using K8s.
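
A possible starting point for the exercise, assuming the "simple machine learning use case" is a one-variable linear regression written in plain Python (no ML library). The idea would be that a training container calls `train` and `save_model` to write the model to a shared volume, and an inference container calls `load_model` and `predict`. All function names and the JSON model format are my own assumptions.

```python
import json


def train(points):
    """Fit y = slope * x + intercept by ordinary least squares."""
    if not points:
        raise ValueError("need at least one (x, y) pair")
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points)
    if var_x == 0:
        raise ValueError("need at least two distinct x values")
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = cov_xy / var_x
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def save_model(model, path):
    """Training step: persist the model, e.g. to a Persistent Volume."""
    with open(path, "w") as f:
        json.dump(model, f)


def load_model(path):
    """Inference step: read the model written by the training step."""
    with open(path) as f:
        return json.load(f)


def predict(model, x):
    """Apply the fitted line to a single input."""
    return model["slope"] * x + model["intercept"]
```

Keeping training and inference as separate functions that only share a file on disk is what makes it easy to put them in two different containers later.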

**Extras:**

1. Read: What if the load balancer goes down? [Saurabh Dashora on X](https://twitter.com/ProgressiveCod2/status/1735561521869283339)
    1. Remove a single point of failure using a floating IP and active-passive switchover.
2. Completed AWS LI assessment: [LinkedIn](https://www.linkedin.com/skill-assessments/Amazon%20Web%20Services%20(AWS)/quiz-intro/)
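
The floating IP and active-passive switchover from the first item can be sketched as a small simulation: two load balancers share one floating IP, and the passive one takes it over after a few consecutive failed health checks on the active one. The class name, node names, and the threshold of 3 missed heartbeats are assumptions for illustration; real setups rely on dedicated failover tooling rather than code like this.

```python
class ActivePassivePair:
    """Two load balancers sharing one floating IP; only the active one serves."""

    def __init__(self, floating_ip, active, passive, max_missed=3):
        self.floating_ip = floating_ip
        self.active = active
        self.passive = passive
        self.max_missed = max_missed  # illustrative threshold
        self.missed = 0

    def heartbeat(self, active_is_healthy):
        """Record one health check and return the node holding the floating IP."""
        if active_is_healthy:
            self.missed = 0
        else:
            self.missed += 1
            if self.missed >= self.max_missed:
                # Switchover: the passive node takes over the floating IP.
                self.active, self.passive = self.passive, self.active
                self.missed = 0
        return self.active
```

Clients keep talking to the same floating IP throughout; only the node behind it changes.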

**Retro**

- Progress >>>> Feelings: You don’t have to “feel like” doing the thing, but if you know deep inside that it is good for you in the long or short run, “just embrace the pain” and do it anyway, or else you will have to endure the pain of regret. Choose your pain wisely.
