Machine learning operations offer agility, spur innovation

Many organizations have adopted machine learning (ML) in a piecemeal fashion, building or buying ad hoc models, algorithms, tools, or services to accomplish specific goals. This approach was necessary as companies learned about the capabilities of ML and as the technology matured, but it also has created a hodge-podge of siloed, manual, and nonstandardized processes and components within organizations. This can lead, in turn, to inefficient, cumbersome services that fail to deliver on their promised value—or that stall innovation entirely. 

As businesses look to scale ML applications across the enterprise, they need to better automate and standardize tools, processes, and workflows. They need to build and deploy ML models quickly, spending less time manually training and monitoring models and more time on value-driving, revenue-generating innovation. Developers need access to the data that will power their ML models, to work across lines of business, and to collaborate transparently on the same tech stack. In other words, businesses need to adopt best practices for machine learning operations (MLOps): a set of software development practices that keep ML models running effectively and with agility.

The main function of MLOps is to automate the more repeatable steps in the ML workflows of data scientists and ML engineers, from model development and training to model deployment and operation (model serving). Automating these steps creates agility for businesses and better experiences for users and end customers, increasing the speed, power, and reliability of ML. These automated processes can also mitigate risk and free developers from rote tasks, allowing them to spend more time on innovation. This all contributes to the bottom line: a 2021 global study by McKinsey found that companies that successfully scale AI can add as much as 20 percent to their earnings before interest and taxes (EBIT). 

“It’s not uncommon for companies with sophisticated ML capabilities to incubate different ML tools in individual pockets of the business,” says Vincent David, senior director for machine learning at Capital One. “But often you start seeing parallels—ML systems doing similar things, but with a slightly different twist. The companies that are figuring out how to make the most of their investments in ML are unifying and supercharging their best ML capabilities to create standardized, foundational tools and platforms that everyone can use — and ultimately create differentiated value in the market.” 

In practice, MLOps requires close collaboration between data scientists, ML engineers, and site reliability engineers (SREs) to ensure consistent reproducibility, monitoring, and maintenance of ML models. Over the last several years, Capital One has developed MLOps best practices that apply across industries: balancing user needs, adopting a common, cloud-based technology stack and foundational platforms, leveraging open-source tools, and ensuring the right level of accessibility and governance for both data and models.

Understand different users’ different needs

ML applications generally have two main types of users—technical experts (data scientists and ML engineers) and nontechnical experts (business analysts)—and it’s important to strike a balance between their different needs. Technical experts often prefer complete freedom to use all tools available to build models for their intended use cases. Nontechnical experts, on the other hand, need user-friendly tools that enable them to access the data they need to create value in their own workflows.

To build consistent processes and workflows while satisfying both groups, David recommends meeting with the application design team and subject matter experts across a breadth of use cases. “We look at specific cases to understand the issues, so users get what they need to benefit their work, specifically, but also the company generally,” he says. “The key is figuring out how to create the right capabilities while balancing the various stakeholder and business needs within the enterprise.”

Adopt a common technology stack 

Collaboration among development teams—critical for successful MLOps—can be difficult and time-consuming if these teams are not using the same technology stack. A unified tech stack allows developers to standardize, reusing components, features, and tools across models like Lego bricks. “That makes it easier to combine related capabilities so developers don’t waste time switching from one model or system to another,” says David. 

A cloud-native stack—built to take advantage of the cloud model of distributed computing—allows developers to self-service infrastructure on demand, continually leveraging new capabilities and introducing new services. Capital One’s decision to go all-in on the public cloud has had a notable impact on developer efficiency and speed. Code releases to production now happen much more rapidly, and ML platforms and models are reusable across the broader enterprise.

Save time with open-source ML tools 

Open-source ML tools (code and programs freely available for anyone to use and adapt) are core ingredients in creating a strong cloud foundation and unified tech stack. Using existing open-source tools means the business does not need to devote precious technical resources to reinventing the wheel, quickening the pace at which teams can build and deploy models. 

To complement its use of open-source tools and packages, David says, Capital One also develops and releases its own tools. For example, to manage streams of dynamic data too large to manually monitor, Capital One built an open-source data profiling tool that uses ML to detect and protect sensitive data like bank account and credit card numbers. Additionally, Capital One recently released the open-source library rubicon-ml, which helps capture and store model training and execution information in a repeatable and searchable way. Releasing its own tools as open source ensures that Capital One builds ML capabilities that are flexible and repurposable (by others, as well as across its own businesses) and allows the company to connect with and contribute to the open-source community.
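To illustrate the kind of repeatable, searchable experiment logging a library like rubicon-ml enables, here is a minimal standard-library sketch. The class and field names below are hypothetical stand-ins, not rubicon-ml's actual API: each training run is stored as a record of its parameters and metrics so that results can later be queried and reproduced.

```python
import time
import uuid

class ExperimentLog:
    """Toy experiment tracker: records the parameters and metrics of each
    training run so results can be searched and reproduced later."""

    def __init__(self):
        self.experiments = []

    def log_experiment(self, params, metrics):
        """Store one training run and return its unique identifier."""
        record = {
            "id": str(uuid.uuid4()),
            "timestamp": time.time(),
            "params": params,
            "metrics": metrics,
        }
        self.experiments.append(record)
        return record["id"]

    def search(self, **params):
        """Return all experiments whose parameters match the given values."""
        return [e for e in self.experiments
                if all(e["params"].get(k) == v for k, v in params.items())]

log = ExperimentLog()
log.log_experiment({"model": "xgboost", "max_depth": 6}, {"auc": 0.91})
log.log_experiment({"model": "xgboost", "max_depth": 8}, {"auc": 0.93})

# Find the best-performing run among all experiments with a given model type.
best = max(log.search(model="xgboost"), key=lambda e: e["metrics"]["auc"])
print(best["params"])  # → {'model': 'xgboost', 'max_depth': 8}
```

A real tracking library adds durable storage, schema versioning, and UI tooling on top of this pattern, but the core idea is the same: capture every run in a structured, queryable form rather than in ad hoc notebooks.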

Enable data accessibility while prioritizing governance 

A typical ML system includes a production environment (processing data in real time) and an analytical environment (a store of data with which users can work). For many organizations, the lag time between these environments is a significant pain point, and when data scientists and engineers need access to near-real-time data from the production environment, it’s important to set up appropriate controls.

ML developers thus need to ensure integration and access to both environments without compromising governance integrity. “In an ideal world, the organization would establish a seamless integration between production data stores and analytical environments that can enforce all the controls and governance frameworks that the data scientists, engineers, and other stakeholders involved in the model governance process need,” says David. 

Governing and managing the ML models themselves is equally important. As a machine learns and as input data changes, models tend to drift, which traditionally requires engineers to monitor and correct for that drift. MLOps practices, by contrast, help automate the management and training of models and workflows. An organization adopting MLOps could determine for each ML use case what needs to be monitored, how often, and how much drift to allow before retraining is required. It can then configure tools to automatically detect triggers and retrain models at an appropriate cadence.
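The monitor-and-retrain loop described above can be sketched in a few lines. This is an illustrative example, not Capital One's implementation: the drift metric (mean shift scaled by the baseline's standard deviation) and the tolerance threshold are assumptions an organization would tune per use case.

```python
from statistics import mean, stdev

def drift_score(baseline, recent):
    """Shift in the mean of recent values, scaled by the baseline's spread."""
    return abs(mean(recent) - mean(baseline)) / stdev(baseline)

def needs_retraining(baseline, recent, tolerance=2.0):
    """True when recent data has drifted beyond the allowed tolerance,
    signaling that an automated retraining job should be triggered."""
    return drift_score(baseline, recent) > tolerance

# Feature values the model was trained on, and two batches of live data.
baseline = [10.0, 11.0, 9.5, 10.5, 10.0, 9.8]
stable   = [10.2, 9.9, 10.1, 10.4]
shifted  = [15.0, 16.2, 15.5, 14.8]

print(needs_retraining(baseline, stable))   # → False
print(needs_retraining(baseline, shifted))  # → True
```

In practice a scheduler would run a check like this at the cadence chosen for each use case and kick off a retraining pipeline whenever it returns true, replacing the manual monitoring the article describes.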

In the early days of ML, companies took pride in their ability to develop new and bespoke solutions for different parts of the business. But now companies seeking to scale ML in a well-governed, nimble way have to account for continuous updates to data sources, ML models, features, pipelines, and many other aspects of the ML model lifecycle. With its potential to offer standardized, reproducible, and adaptable processes across large-scale ML environments, MLOps could unlock the future of enterprise machine learning.

This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by MIT Technology Review’s editorial staff.
