Machine learning operations offer agility, spur innovation

Many organizations have adopted machine learning (ML) in a piecemeal fashion, building or buying ad hoc models, algorithms, tools, or services to accomplish specific goals. This approach was necessary as companies learned about the capabilities of ML and as the technology matured, but it has also created a hodgepodge of siloed, manual, and nonstandardized processes and components within organizations. This can lead, in turn, to inefficient, cumbersome services that fail to deliver on their promised value, or that stall innovation entirely.

As businesses look to scale ML applications across the enterprise, they need to better automate and standardize tools, processes, and workflows. They need to build and deploy ML models quickly, spending less time manually training and monitoring models and more time on value-driving, revenue-generating innovation. Developers need access to the data that will power their ML models, to work across lines of business, and to collaborate transparently on the same tech stack. In other words, businesses need to adopt best practices for machine learning operations (MLOps): a set of software development practices that keep ML models running effectively and with agility.

The main function of MLOps is to automate the more repeatable steps in the ML workflows of data scientists and ML engineers, from model development and training to model deployment and operation (model serving). Automating these steps creates agility for businesses and better experiences for users and end customers, increasing the speed, power, and reliability of ML. These automated processes can also mitigate risk and free developers from rote tasks, allowing them to spend more time on innovation. This all contributes to the bottom line: a 2021 global study by McKinsey found that companies that successfully scale AI can add as much as 20 percent to their earnings before interest and taxes (EBIT). 

“It’s not uncommon for companies with sophisticated ML capabilities to incubate different ML tools in individual pockets of the business,” says Vincent David, senior director for machine learning at Capital One. “But often you start seeing parallels: ML systems doing similar things, but with a slightly different twist. The companies that are figuring out how to make the most of their investments in ML are unifying and supercharging their best ML capabilities to create standardized, foundational tools and platforms that everyone can use, and ultimately create differentiated value in the market.”

In practice, MLOps requires close collaboration between data scientists, ML engineers, and site reliability engineers (SREs) to ensure consistent reproducibility, monitoring, and maintenance of ML models. Over the last several years, Capital One has developed MLOps best practices that apply across industries: balancing user needs, adopting a common, cloud-based technology stack and foundational platforms, leveraging open-source tools, and ensuring the right level of accessibility and governance for both data and models.

Understand different users’ different needs

ML applications generally have two main types of users—technical experts (data scientists and ML engineers) and nontechnical experts (business analysts)—and it’s important to strike a balance between their different needs. Technical experts often prefer complete freedom to use all tools available to build models for their intended use cases. Nontechnical experts, on the other hand, need user-friendly tools that enable them to access the data they need to create value in their own workflows.

To build consistent processes and workflows while satisfying both groups, David recommends meeting with the application design team and subject matter experts across a breadth of use cases. “We look at specific cases to understand the issues, so users get what they need to benefit their work, specifically, but also the company generally,” he says. “The key is figuring out how to create the right capabilities while balancing the various stakeholder and business needs within the enterprise.”

Adopt a common technology stack 

Collaboration among development teams—critical for successful MLOps—can be difficult and time-consuming if these teams are not using the same technology stack. A unified tech stack allows developers to standardize, reusing components, features, and tools across models like Lego bricks. “That makes it easier to combine related capabilities so developers don’t waste time switching from one model or system to another,” says David. 

A cloud-native stack—built to take advantage of the cloud model of distributed computing—allows developers to self-service infrastructure on demand, continually leveraging new capabilities and introducing new services. Capital One’s decision to go all-in on the public cloud has had a notable impact on developer efficiency and speed. Code releases to production now happen much more rapidly, and ML platforms and models are reusable across the broader enterprise.

Save time with open-source ML tools 

Open-source ML tools (code and programs freely available for anyone to use and adapt) are core ingredients in creating a strong cloud foundation and unified tech stack. Using existing open-source tools means the business does not need to devote precious technical resources to reinventing the wheel, which quickens the pace at which teams can build and deploy models.

To complement its use of open-source tools and packages, David says, Capital One also develops and releases its own tools. For example, to manage streams of dynamic data too large to manually monitor, Capital One built an open-source data profiling tool that uses ML to detect and protect sensitive data like bank account and credit card numbers. Additionally, Capital One recently released the open-source library rubicon-ml, which helps capture and store model training and execution information in a repeatable and searchable way. Releasing its own tools as open source ensures that Capital One builds ML capabilities that are flexible and repurposable (by others, as well as across its own businesses) and allows the company to connect with and contribute to the open-source community.
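To make the idea of capturing training information "in a repeatable and searchable way" concrete, here is a minimal sketch in plain Python. It is not rubicon-ml's API; the names RunRecord, log_run, and find_runs, and the JSON-lines file layout, are hypothetical and chosen only for illustration.

```python
import json
import time
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class RunRecord:
    """Hypothetical record of one training run (not a rubicon-ml type)."""
    model_name: str
    params: dict
    metrics: dict
    tags: list = field(default_factory=list)
    timestamp: float = field(default_factory=time.time)


def log_run(record, log_path):
    """Append one run as a single JSON line so the history is never overwritten."""
    with Path(log_path).open("a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(record), sort_keys=True) + "\n")


def find_runs(log_path, tag):
    """Return every logged run (as a dict) that carries the given tag."""
    path = Path(log_path)
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as f:
        runs = [json.loads(line) for line in f if line.strip()]
    return [run for run in runs if tag in run["tags"]]
```

A team might call log_run(RunRecord("demo-model", {"lr": 0.01}, {"auc": 0.92}, tags=["candidate"]), "runs.jsonl") at the end of each training job and later call find_runs("runs.jsonl", "candidate") to compare candidates. An append-only log is one simple way to keep the record repeatable; a production tool would likely add versioning, storage backends, and access controls.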

Enable data accessibility while prioritizing governance 

A typical ML system includes a production environment (processing data in real time) and an analytical environment (a store of data with which users can work). For many organizations, the lag time between these environments is a significant pain point. When data scientists and engineers need access to near-real-time data from the production environment, it’s important to set up appropriate controls.

ML developers thus need to ensure integration and access to both environments without compromising governance integrity. “In an ideal world, the organization would establish a seamless integration between production data stores and analytical environments that can enforce all the controls and governance frameworks that the data scientists, engineers, and other stakeholders involved in the model governance process need,” says David. 

Governing and managing the ML models themselves is equally important. As a machine learns and as input data changes, models tend to drift, which traditionally requires engineers to monitor and correct for that drift. MLOps practices, by contrast, help automate the management and training of models and workflows. An organization adopting MLOps could determine for each ML use case what needs to be monitored, how often, and how much drift to allow before retraining is required. It can then configure tools to automatically detect triggers and retrain models at an appropriate cadence.
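The per-use-case decision described above (what to monitor, and how much drift to allow before retraining) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Capital One's method: it measures drift with the population stability index (PSI), one common choice among many, and the DriftPolicy class, the needs_retraining helper, and the threshold value used in the usage note are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass
class DriftPolicy:
    """Hypothetical per-use-case policy: which feature to watch and how much drift to allow."""
    feature: str
    psi_threshold: float  # retraining is triggered when PSI exceeds this value
    bins: int = 10


def psi(reference: Sequence[float], current: Sequence[float], bins: int = 10) -> float:
    """Population stability index of `current` against `reference`.

    Bin edges are taken from the reference data; values outside that range
    are clamped into the first or last bin.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all reference values are equal

    def fractions(values: Sequence[float]) -> list:
        counts = [0] * bins
        for v in values:
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # A small floor keeps empty bins from producing log(0).
        return [max(c / len(values), 1e-6) for c in counts]

    ref, cur = fractions(reference), fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


def needs_retraining(policy: DriftPolicy, reference: Sequence[float],
                     current: Sequence[float]) -> bool:
    """True when measured drift exceeds what the policy allows."""
    return psi(reference, current, policy.bins) > policy.psi_threshold
```

A scheduled job could evaluate needs_retraining(DriftPolicy("transaction_amount", psi_threshold=0.2), training_values, recent_values) at whatever cadence the use case calls for and start a retraining pipeline when it returns True. The feature name and the 0.2 threshold are illustrative values only; each team would set its own.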

In the early days of ML, companies took pride in their ability to develop new and bespoke solutions for different parts of the business. But now companies seeking to scale ML in a well-governed, nimble way have to account for continuous updates to data sources, ML models, features, pipelines, and many other aspects of the ML model lifecycle. With its potential to offer standardized, reproducible, and adaptable processes across large-scale ML environments, MLOps could unlock the future of enterprise machine learning.

This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by MIT Technology Review’s editorial staff.
