Meet Mike Shores, Senior Director of Data Science and designer of DnA’s model building protocol. Because Vista is a matrix organization, data scientists in Mike’s chapter are working on anything from product search to pricing strategy to customer care. Here he shares the philosophy powering model performance measurement and why accuracy isn’t always king.
Building a machine learning (ML) model is one thing. Real-world impact is another. Between these two benchmarks is a creative balancing act of technical and human factors. So, what should data scientists keep top of mind when it comes to evaluating their models?
It’s all about the customer…
Sound obvious? Nailing a customer pain point is far more nuanced in practice. For example, the Vista website makes product recommendations. Straightforward, right? But there’s a lot to interrogate to build the right model. Asking the right questions is how you ensure a rounded, pragmatic process that delivers the best outcomes. It becomes the signature of team success.
- What problem are we trying to solve? Sit with your customers before building anything to understand their pain points and issues to solve. Sidebar: DnA has internal and external Vista customers to serve.
- What real-world constraints need to be considered? They can be technical (your model needs to score quickly enough to keep the website fast) or practical (an external customer has a strong interest in holiday cards, but probably not in mid-July). Or both.
- What value will this model drive? How much will improving model accuracy by 1% enhance a customer’s experience? If you improve a process by 2%, what’s the ROI on time spent doing that work? Data scientists are always in high demand. What’s the ‘opportunity cost’ of putting a new project on the back burner to iterate the ‘old’ experience for your customer?
- How will your customer trust the final model? If they aren’t comfortable using it, the value is lost. Plan how to support the uptake. At DnA, that means explaining how your model works, sharing model interpretability metrics and pressure-testing recommendations with actual users.
But it’s also part science…
Be deliberate about how you develop and use each tool at your disposal.
- How can the way you select data help or hurt you? A bad sampling strategy can make a good model look bad or a bad model look good. For instance, at Vista, our product demand is sensitive to holiday periods, so we factor that in and often measure that time bracket separately.
- What’s the right evaluation metric? And at what level? It’s crucial to select the standard evaluation metric that fits the use case. Then, set a minimally acceptable level of error given the performance of your baseline process. Consider whether errors need to be evaluated with more granularity, by the region or industry of your customer, for example.
- How will you document this? Having a consistent formula to record why/what/how means colleagues can quickly understand someone else’s work. At DnA, we use one template: we all know where key info is, and it speeds up knowledge sharing.
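To make the sampling and metric questions above concrete, here is a minimal sketch of measuring error per segment against a baseline. It is an illustration under assumptions, not DnA's actual protocol: the choice of mean absolute error, the segment labels, the `max_error_ratio` acceptance rule and every number in it are hypothetical.

```python
from collections import defaultdict


def mean_absolute_error(pairs):
    """Mean absolute error over (actual, predicted) pairs."""
    return sum(abs(actual - predicted) for actual, predicted in pairs) / len(pairs)


def evaluate_by_segment(rows, max_error_ratio=1.0):
    """Compare model error with baseline error, one segment at a time.

    rows: iterable of dicts with keys 'segment', 'actual', 'model', 'baseline'.
    max_error_ratio: hypothetical acceptance rule. The model passes a segment
    when its MAE is at most this multiple of the baseline MAE.
    """
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["segment"]].append(row)

    report = {}
    for segment, items in grouped.items():
        model_mae = mean_absolute_error([(r["actual"], r["model"]) for r in items])
        baseline_mae = mean_absolute_error([(r["actual"], r["baseline"]) for r in items])
        report[segment] = {
            "model_mae": model_mae,
            "baseline_mae": baseline_mae,
            "acceptable": model_mae <= max_error_ratio * baseline_mae,
        }
    return report


# Made-up illustrative numbers, not Vista data.
rows = [
    {"segment": "holiday", "actual": 100, "model": 80, "baseline": 90},
    {"segment": "holiday", "actual": 120, "model": 90, "baseline": 110},
    {"segment": "regular", "actual": 50, "model": 48, "baseline": 40},
    {"segment": "regular", "actual": 60, "model": 61, "baseline": 50},
]
report = evaluate_by_segment(rows)
for segment, result in report.items():
    print(segment, result)
```

In this toy data the model beats the baseline comfortably in the regular period and loses to it in the holiday period. A single pooled number would blur the two together, which is the reason to measure a sensitive time bracket separately.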
…and part art
But it does pose a sticky question: how do you encode subjective considerations? The DnA answer is practical – test rather than debate. It’s faster. And if it doesn’t work, a misstep is quickly undone. Some guardrails:
- Will this move the needle? The technical gains might be sound but will improving this model benefit the organization in a holistic way? Use your expert voice to call it.
- Is there institutional knowledge to capture? Lean on the collective experience of your product team to assess whether new variables meaningfully impact business goals. And don’t worry about your knowledge gaps. At DnA, we recruit fearlessly honest people who say what they do and don’t know. It’s how we create a safe psychological space where sharing and asking for help is the norm.
- The most powerful tool of all? An ML model should never be a black box: you need to convey how it operates and validate this with the team. The explanation doesn’t need to be flawless. Use examples: you’ll build trust, help model adoption, and empower other teams.
Customer care: a case in point
Here’s how we handle our philosophy of balance on the ground. DnA was tapped as a partner to improve Vista’s ‘Customer CARE’ team forecasts.
- Question 1: Why aren’t today’s forecasts working? Before we could accept the challenge, we had to understand what was wrong. Feedback was wide-ranging. When the forecast was inaccurate, it led to understaffing and a negative customer experience, but scheduling too many CARE specialists is costly and lowers employee engagement. The forecast was time-consuming to put together. It didn’t account for new trends and company initiatives.
- Question 2: How do you use the forecast? Our partners in Customer CARE looked at the forecast primarily by country and language to make near- and long-term decisions about weekly staffing schedules and when to hire team members.
- Question 3: How far out do you make your decisions? We figured out that forecasts with two-week and one-quarter lead times worked best. Same-day or one-day lead times were impractical – people appreciate notice of schedules and changes.
It took just three questions to cover an expanse of necessary ground:
- Vista is customer-centric: We wanted to staff Customer CARE optimally, cutting hold times for Vista customers while keeping costs efficient.
- Science: We needed to balance our sensitivity to outlying data points and, chiefly, to bias in both directions: underestimating customer calls (leading to wait times) and overestimating calls (resulting in an overstaffed, underemployed customer service team).
- Art: We know that more customers reach out to us when we run promotions. But promotions are dynamic – we don’t know when future offers will occur. So, we had to be creative, devising assumptions and forecasting the number of ensuing customer contacts.
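The two-sided bias described under "Science" can be made measurable. The sketch below is one possible way to do it, not the method the CARE project used: it reports the signed bias of a forecast and an average cost that weighs under-forecasting and over-forecasting differently. The weights `under_weight` and `over_weight` and all of the numbers are hypothetical.

```python
def forecast_bias(actuals, forecasts):
    """Mean signed error: positive means over-forecasting on average."""
    return sum(f - a for a, f in zip(actuals, forecasts)) / len(actuals)


def asymmetric_cost(actuals, forecasts, under_weight=2.0, over_weight=1.0):
    """Average cost that penalizes under- and over-forecasting differently.

    under_weight and over_weight are hypothetical costs per missed or surplus
    contact. They are assumptions for illustration only.
    """
    total = 0.0
    for actual, forecast in zip(actuals, forecasts):
        error = forecast - actual
        if error < 0:
            total += under_weight * -error  # too few contacts forecast: understaffing
        else:
            total += over_weight * error  # too many contacts forecast: overstaffing
    return total / len(actuals)


# Made-up illustrative contact volumes, not Vista data.
actuals = [100, 120, 80, 150]
low_forecast = [90, 110, 70, 140]    # consistently 10 contacts too low
high_forecast = [110, 130, 90, 160]  # consistently 10 contacts too high

print(forecast_bias(actuals, low_forecast), asymmetric_cost(actuals, low_forecast))
print(forecast_bias(actuals, high_forecast), asymmetric_cost(actuals, high_forecast))
```

Both forecasts miss by the same absolute amount, so a symmetric metric would rank them equally. With these assumed weights the low forecast costs twice as much as the high one. How far to tilt the weights is a business call to make with the team that staffs the schedule.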
How did it turn out? It’s a win for everyone.
- Forecast errors are reduced by 30% globally.
- Our solution is scalable. And forecast generation time is now a few hours, not days.
- Vista customers and employees are experiencing a cascade of benefits in the real world – better service levels and less ad-hoc rescheduling of shifts included.
Thoughtful, applied creativity is the lifeblood of both Vista and DnA. It’s as vital to our balanced approach to data modelling as it is to the design solutions we conjure for small businesses.