What is Machine Learning (ML)? Pragmatically, most can agree that it’s about getting computers to learn autonomously over time by feeding them data. But how machine learning is applied in practice varies by industry. For this reason, ActiveState recently undertook a project to survey the state of Machine Learning by interviewing leaders in a range of industries, including:
- Atlassian in the Software industry, whose key business driver was to understand customer churn
- State Auto in the Insurance industry, whose key business driver was to create new revenue streams
- Hydro Quebec in the Utilities industry, whose key business driver was to better predict mechanical failures in order to reduce downtime
- GSI Technology in the Hardware industry, whose key business driver was to create a visual search or recommendation engine on a chip
The idea is to highlight both technical and business challenges, with the goal of educating organizations that are currently considering the adoption of ML. While each company’s experiences were unique, some commonalities were shared by all, including:
- Resourcing – The dearth of data scientists was universally decried. To close the gap, companies resorted to hiring new grads and/or encouraging existing personnel to diversify their skills.
- Data Volume – ML requires huge volumes of both training and test data in order to create and prove out a robust model. All of the companies interviewed either generated their own data or were able to source enough of it.
- Data Preparation – Data preparation is always a significant portion of the work, and many of the companies volunteered that finding the right data to work with was also an issue. Once the right data was found, labeling, cleansing, normalizing, indexing, mapping, etc. required anywhere from 50% to 80% of the project’s time and effort.
- Model Creation – While proofs of concept were easy to throw together in just a few weeks, it took time to find the best algorithm. Trade-offs between the best-fitting algorithm and the best-performing algorithm were common.
- Operationalization – Moving models into production remains a significant hurdle to realizing value from ML initiatives. Packaging the model for production and scaling it out were cited as key concerns.
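To make the Data Volume and Data Preparation points more concrete, here is a minimal Python sketch of three of the steps named above: cleansing, normalizing, and holding out test data. It is illustrative only and is not drawn from any of the interviewed companies. The records, the field names, and the helper functions `clean`, `min_max_normalize`, and `train_test_split` are all hypothetical, and real projects would typically lean on a data library rather than hand-written helpers.

```python
import random


def clean(rows):
    """Cleansing: drop records that contain a missing (None) value."""
    return [r for r in rows if all(v is not None for v in r)]


def min_max_normalize(rows):
    """Normalizing: rescale each feature column to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        tuple(
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(row, lows, highs)
        )
        for row in rows
    ]


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle reproducibly, then hold out a fraction of rows for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical records: (monthly_logins, open_support_tickets)
raw = [(12, 1), (40, 0), (None, 3), (7, 5), (25, 2), (3, None), (31, 1)]

prepared = min_max_normalize(clean(raw))
train, test = train_test_split(prepared, test_fraction=0.2)
```

Even in this toy case, two of the seven records are discarded before any modeling can start, which hints at why preparation consumes so much of a project and why a large initial volume of data matters.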
Universally, the biggest business hurdle lay in transforming the company into a data-driven business at all levels, which can be a significant culture change for organizations that previously operated with top-down decision making. The key to overcoming this hurdle is for data science teams to deliver enough significant results to encourage continued investment. Unfortunately, the ML productivity J-curve works against them: a company’s investment in ML can actually decrease productivity at the start of its initiative, before later delivering a sharp increase in benefits.
ActiveState has compiled the experiences of each of these companies at the forefront of the Artificial Intelligence revolution into a book called “Building a Data Driven Culture with Machine Learning (ML), Industry Use Cases from Leading Companies.” We’ve been working on Python builds since 1999 and have witnessed the mass adoption of Python for ML. It’s important to us to help enterprises on their ML journey reap the benefits of Python.
Get the complete 8-chapter book of industry use cases.