MLOps
MLOps (a compound of “machine learning” and “operations”), a subset of ModelOps, is a practice for collaboration and communication between data scientists and operations professionals to help manage the production ML (or deep learning) lifecycle.[1] Similar to the DevOps or DataOps approaches, MLOps looks to increase automation and improve the quality of production ML while also focusing on business and regulatory requirements. Although MLOps started as a set of best practices, it is slowly evolving into an independent approach to ML lifecycle management. MLOps applies to the entire lifecycle, from integrating with model generation (software development lifecycle, continuous integration/continuous delivery), orchestration, and deployment, to health, diagnostics, governance, and business metrics. According to Gartner, MLOps is a subset of ModelOps: MLOps is focused on the operationalization of ML models, while ModelOps covers the operationalization of all types of AI models.[2]
History
The challenges of the ongoing use of machine learning in applications were highlighted in a 2015 paper titled "Hidden Technical Debt in Machine Learning Systems".[3]
The predicted growth in machine learning includes an estimated doubling of ML pilots and implementations from 2017 to 2018, and again from 2018 to 2020.[4] Spending on machine learning is estimated to reach $57.6 billion by 2021, a compound annual growth rate (CAGR) of 50.1%.[5]
Reports show that a majority (up to 88%) of corporate AI initiatives struggle to move beyond test stages. However, organizations that actually put AI and machine learning into production saw profit margin increases of 3-15%.[6]
In 2018, following a presentation about ML productionization from Google,[7] MLOps[8] and approaches to it began to gain traction among AI/ML experts, companies, and technology journalists as a solution that can address the complexity and growth of machine learning in businesses.[9][10][11][12][13][14][15][16][17]
Various companies have started MLOps practices, e.g., ModelOp[18] and WWT.[19]
Architecture
There are a number of barriers that prevent organizations from successfully implementing ML across the enterprise, including difficulties with:[20]
- Deployment and automation
- Reproducibility of models and predictions[21]
- Diagnostics[22]
- Governance and regulatory compliance[23]
- Scalability[24]
- Collaboration[25]
- Business uses[26]
- Monitoring and management[27]
A standard practice such as MLOps takes each of these areas into account, which can help enterprises optimize workflows and avoid issues during implementation.
A common architecture of an MLOps system includes data science platforms where models are constructed and analytical engines where computations are performed, with the MLOps tool orchestrating the movement of machine learning models, data, and outcomes between the systems.[28]
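The division of roles in this architecture can be illustrated with a minimal Python sketch. It is not taken from any particular MLOps product: the class names `DataSciencePlatform`, `AnalyticalEngine`, and `Orchestrator`, the toy linear "model", and the log format are all hypothetical, chosen only to show an orchestrator moving a model from the place it is built to the place it is run and recording the outcomes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ModelArtifact:
    """A trained model as handed over by a (hypothetical) data science platform."""
    name: str
    version: int
    predict: Callable[[float], float]


class DataSciencePlatform:
    """Stand-in for the system where models are constructed."""

    def build_model(self, name: str, version: int, slope: float) -> ModelArtifact:
        # A trivial linear "model"; real training would happen here.
        return ModelArtifact(name, version, lambda x: slope * x)


class AnalyticalEngine:
    """Stand-in for the system where computations are performed."""

    def __init__(self) -> None:
        self.deployed: Dict[str, ModelArtifact] = {}

    def deploy(self, model: ModelArtifact) -> None:
        # Deploying a new version replaces the previous one under the same name.
        self.deployed[model.name] = model

    def score(self, name: str, inputs: List[float]) -> List[float]:
        model = self.deployed[name]
        return [model.predict(x) for x in inputs]


@dataclass
class Orchestrator:
    """Hypothetical MLOps tool: moves models, data, and outcomes between systems."""
    platform: DataSciencePlatform
    engine: AnalyticalEngine
    log: List[dict] = field(default_factory=list)

    def release(self, name: str, version: int, slope: float) -> ModelArtifact:
        # Move a model from the data science platform to the analytical engine.
        model = self.platform.build_model(name, version, slope)
        self.engine.deploy(model)
        self.log.append({"event": "deployed", "model": name, "version": version})
        return model

    def run(self, name: str, inputs: List[float]) -> List[float]:
        # Send data to the engine and record which model version produced the outcomes.
        outcomes = self.engine.score(name, inputs)
        version = self.engine.deployed[name].version
        self.log.append(
            {"event": "scored", "model": name, "version": version, "n": len(inputs)}
        )
        return outcomes


orchestrator = Orchestrator(DataSciencePlatform(), AnalyticalEngine())
orchestrator.release("demand", version=1, slope=2.0)
outcomes = orchestrator.run("demand", [1.0, 2.5])
```

Keeping the log in the orchestrator rather than in either system is one possible design choice: it gives a single place to answer which model version produced which outcomes, which bears on the reproducibility, governance, and monitoring concerns listed above.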
See also
- AIOps, a similarly named but different concept: the use of AI (ML) in IT and operations.
References
- Talagala, Nisha. "Why MLOps (and not just ML) is your Business' New Competitive Frontier". AITrends. AITrends. Retrieved 30 January 2018.
- Vashisth, Shubhangi; Brethenoux, Erick; Choudhary, Farhan; Hare, Jim. "Use Gartner's 3-Stage MLOps Framework to Successfully Operationalize Machine Learning Projects". Gartner. Gartner. Retrieved 30 October 2020.
- Sculley, D.; Holt, Gary; Golovin, Daniel; Davydov, Eugene; Phillips, Todd; Ebner, Dietmar; Chaudhary, Vinay; Young, Michael; Crespo, Jean-Francois; Dennison, Dan (7 December 2015). "Hidden Technical Debt in Machine Learning Systems" (PDF). NIPS Proceedings (2015). Retrieved 14 November 2017.
- Sallomi, Paul; Lee, Paul. "Deloitte Technology, Media and Telecommunications Predictions 2018" (PDF). Deloitte. Deloitte. Retrieved 13 October 2017.
- Minonne, Andrea; Schubmel, David; George, Jebin; Piña, Jeronimo; Danqing Cai, Jessie; Leung, Jonathan; Dimitrov, Lubomir; Ranjan, Manish; Daquila, Marianne; Kumar, Megha; Iwamoto, Naoko; Anand, Nikhil; Carnelley, Philip; Membrila, Roberto; Chaturvedi, Swati; Manabe, Takashi; Vavra, Thomas; Zhang, Xiao-Fei; Zhong, Zhenshan. "Worldwide Semiannual Artificial Intelligence Systems Spending Guide". IDC. Retrieved 25 September 2017.
- Bughin, Jacques; Hazan, Eric; Ramaswamy, Sree; Chui, Michael; Allas, Tera; Dahlström, Peter; Henke, Nicolaus; Trench, Monica. "Artificial Intelligence The Next Digital Frontier?". McKinsey. McKinsey Global Institute. Retrieved 1 June 2017.
- Sato, Kaz. "What is ML Ops? Best Practices for Devops for ML". YouTube. YouTube. Retrieved 19 July 2020.
- "What is MLOps?". Algomox. Algomox. Retrieved 25 November 2020.
- G, Doug. "MLOps Silicon Valley". Meetup. Meetup. Retrieved 2 February 2018.
- Bridgwater, Adrian. "Should every business function have an Ops extension?". Tech HQ. Tech HQ. Retrieved 13 April 2018.
- Royyuru, Avinash. "How to build AI culture: go through the curve of enlightenment". Medium. Hackernoon. Retrieved 28 April 2018.
- Talagala, Nisha. "Why MLOps (and not just ML) is your Business' New Competitive Frontier". AITrends. AITrends. Retrieved 30 January 2018.
- Simon, Julien. "MLOps with serverless architectures (October 2018)". LinkedIn SlideShare. Julien Simon. Retrieved 23 October 2018.
- Saucedo, Alejandro. "Scalable Data Science/Machine Learning: The State of DataOps / MLOps in 2018". MachineLearning.AI. Alejandro Saucedo. Retrieved 9 September 2018.
- Talagala, Nisha. "Operational Machine Learning: Seven Considerations for Successful MLOps". KDNuggets. KDNuggets. Retrieved 1 April 2018.
- Banks, Erink. "BD Podcast Ep 34 – Putting AI to Work with MLOps Powered by ParallelM". Big Data Beard. Big Data Beard. Retrieved 17 July 2018.
- Sato, Kaz. "What is ML Ops? Solutions and best practices for applying DevOps to production ML services". Artificial Intelligence Conference. O'Reilly. Retrieved 10 October 2018.
- "ModelOps RFP". ModelOps: The ModelOps and MLOps Resource Hub. Retrieved 30 October 2020.
- "Getting Started With MLOps: For Data Scientists". WWT. Retrieved 27 January 2021.
- Walsh, Nick. "The Rise of Quant-Oriented Devs & The Need for Standardized MLOps". Slides. Nick Walsh. Retrieved 1 January 2018.
- Warden, Pete. "The Machine Learning Reproducibility Crisis". Pete Warden's Blog. Pete Warden. Retrieved 19 March 2018.
- Warden, Pete. "The Machine Learning Reproducibility Crisis". Pete Warden's Blog. Pete Warden. Retrieved 10 March 2018.
- Vaughan, Jack. "Machine learning algorithms meet data governance". SearchDataManagement. TechTarget. Retrieved 1 September 2017.
- Lorica, Ben. "How to train and deploy deep learning at scale". O'Reilly. O'Reilly. Retrieved 15 March 2018.
- Garda, Natalie. "IoT and Machine Learning: Why Collaboration is Key". IoT Tech Expo. Encore Media Group. Retrieved 12 October 2017.
- Manyika, James. "What's now and next in analytics, AI, and automation". McKinsey. McKinsey Global Institute. Retrieved 1 May 2017.
- Haviv, Yaron. "MLOps Challenges, Solutions and Future Trends". Iguazio. Iguazio. Retrieved 19 February 2020.
- Walsh, Nick. "The Rise of Quant-Oriented Devs & The Need for Standardized MLOps". Slides. Nick Walsh. Retrieved 1 January 2018.