Get an inside look into how Vista Tech and Data is transforming the customer experience with machine learning. Jack Lin, our Data Scientist, discusses the cutting-edge techniques used to create personalized experiences through the Next Best Action (NBA) engine.
At Vista, we are committed to providing our customers with the best possible user experience throughout their journey with us. There’s nothing quite like seeing a big smile on our customers’ faces when they have an “Aha!” moment and think, “This is exactly what I need right now,” whether it’s from our site or marketing outreach.
We believe that personalization plays a key role in achieving this mission, which is why we have funded the Next Best Action (NBA) project. NBA enables us to better understand our customers’ needs and preferences through the lens of data science and to deliver tailored experiences that are relevant and meaningful to everyone.
Whether it’s the treatment we offer, the actions we take, or the content we provide, it all comes down to creating an engaging and satisfying experience for our customers. The term “treatment” here refers to how we interact with customers, actions refer to the steps we take to meet their needs, and content refers to the information we provide to help them achieve their goals.
Our goal is always to provide a personalized and valuable experience that keeps customers coming back!
The Challenges with Traditional Ways of Personalization
For years, marketing treatments have been formulaic, built around A/B-testing-style campaigns, which makes the entire process heavily reliant on those formulas. In the larger scheme of things, the whole approach becomes too restrictive and never realizes its full potential for customers.
Personalization is the future. And at Vista, we plan to focus precisely on that.
The modern segment of customers wants an alternative. An alternative that can cater to their wide range of preferences.
The traditional method of deploying machine learning models for one-off use is slow to adapt to complex environments where all the variables constantly change. It requires not only constant human involvement but also additional hours of work to keep the models performing.
How NBA is Managing to Provide a Breakthrough
At the heart of NBA (Next Best Action) is a modern learning technique called the contextual multi-armed bandit, a branch of RL (Reinforcement Learning) and an emerging approach in data science.
A contextual bandit is a problem-solving framework in which a learner (Agent) is placed in a situation (State), takes an action, and receives feedback (Reward) from the environment. Positive feedback acts as a reward, and negative feedback acts as a penalty.
How does NBA help? Well, it allows the agent to adapt to different situations (states) and choose the best action (Action) to optimize the desired outcome (Reward).
In the NBA world, putting email marketing into this framework is quite simple (a code sketch follows the mapping below):
- Algorithm > An Agent
- Email Creative > Action
- Engagement/Purchase > Reward
- Opt-out/ No response > Penalty
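To make this mapping concrete, here is a minimal sketch of the feedback loop in Python. Everything in it, the class name, the candidate creatives, and the reward values, is an illustrative assumption rather than our production code; a real agent would score each creative with a contextual model instead of the simple per-context averages shown here.

```python
import random
from collections import defaultdict

class EmailBanditAgent:
    """Toy contextual bandit: picks an email creative (action) for a customer
    context (state) and learns from the observed reward."""

    def __init__(self, creatives, epsilon=0.1):
        self.creatives = creatives          # the arms: candidate email creatives
        self.epsilon = epsilon              # exploration rate
        self.value = defaultdict(float)     # running reward estimate per (context, creative)
        self.count = defaultdict(int)

    def choose(self, context):
        # Explore with probability epsilon, otherwise exploit the best-known creative.
        if random.random() < self.epsilon:
            return random.choice(self.creatives)
        return max(self.creatives, key=lambda c: self.value[(context, c)])

    def learn(self, context, creative, reward):
        # Incrementally update the reward estimate for this context/creative pair.
        key = (context, creative)
        self.count[key] += 1
        self.value[key] += (reward - self.value[key]) / self.count[key]

# Reward mapping from the list above (values are illustrative):
# purchase -> +1.0, engagement (open/click) -> +0.2, opt-out -> -1.0, no response -> 0.0
agent = EmailBanditAgent(creatives=["replenishment", "new_product", "discount"])
creative = agent.choose(context="frequent_buyer")
agent.learn(context="frequent_buyer", creative=creative, reward=0.2)
```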
The NBA algorithm will then open up the possibility of including numerous actions depending on which outcome the customer desires.
This model not only adapts to the ever-changing dynamics but also acts as a smart assistant that learns and grows by collecting rich data and implementing new strategies automatically. This also helps NBA make informed decisions since the data is always up-to-date and accurate.
Use Case: NBA in Email Marketing
Email marketing has always been an effective tool for reaching out to potential customers. However, not knowing who wants “what” and “when” can make the entire process ineffective. By design, traditional email rules work on pre-defined triggers. These triggers send emails to anyone and everyone without gauging how engaged each set of users actually is.
With NBA, the goal is to reach the right audience in the right way.
Take a friendly replenishment reminder, for example. It’s a simple message, but it is hard to improve on, and it can easily burden our customers: frequent, active buyers in particular are usually eligible for more than one program as well as other daily email treatments.
The relevance is lost, and the customer is disengaged. This is where NBA bridges the gap, optimizing not only WHAT to send but also WHEN to send the email treatment.
Addressing WHAT and WHEN with uplift modeling
The concept of uplift modeling is understanding the incremental value of doing something versus doing nothing at all.
In this context, it tells us whether we should send an email to a given customer or not. The incremental value here is the impact the email would have if sent, i.e., how much it increases the likelihood of the customer engaging with the company or buying the product.
Based on this input, NBA sets a threshold and determines how much value emailing a particular customer would bring. If the incremental value is significant enough, it is kept as the baseline for the next best decision.
Incremental Value = P(R | Treatment) − P(R | No Action)
Ultimately, the model translates the “WHAT” and “WHEN” into a single modeling setup: the default treatment is no action, meaning no email is sent, and what to send, and when, depends on the incremental value it brings to the customer.
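As an illustration, here is a minimal sketch of how that formula could be turned into code using a two-model (“T-learner”) setup, which is one common way to implement uplift modeling. The column names, the classifier choice, and the example threshold are assumptions for the sketch, not the exact Vista implementation.

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier

def incremental_value(history: pd.DataFrame, features: list[str]) -> pd.Series:
    """Estimate P(R | Treatment) - P(R | No Action) per customer.

    `history` is assumed to hold one row per customer with feature columns,
    a binary `treated` flag (email sent or not) and a binary `responded` label.
    """
    treated = history[history["treated"] == 1]
    control = history[history["treated"] == 0]

    # One response model per group: the classic two-model uplift approach.
    model_t = GradientBoostingClassifier().fit(treated[features], treated["responded"])
    model_c = GradientBoostingClassifier().fit(control[features], control["responded"])

    p_treatment = model_t.predict_proba(history[features])[:, 1]   # P(R | Treatment)
    p_no_action = model_c.predict_proba(history[features])[:, 1]   # P(R | No Action)
    return pd.Series(p_treatment - p_no_action, index=history.index)

# Only email customers whose estimated uplift clears the threshold;
# for everyone else, "no action" is the next best action.
# uplift = incremental_value(history, features)
# send_email = uplift > 0.02   # threshold is an assumed example value
```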
Optimizing NBA Engine for Better Performance
Addressing data bias
In RL (Reinforcement Learning), models are constantly updated on new data. With a massive history of logged treatments, the biggest pitfall is that each data set is likely to carry treatment bias, because some treatments end up being served far more often than others.
To minimize this, we built a propensity model and re-weighted the samples using IPW (Inverse Propensity Weighting).
For example, a record with a propensity of 0.4 is weighted as 2.5 (1/0.4), while another with a lower propensity of 0.2 receives a higher weight of 5 (1/0.2). This is the IPW process that removes the treatment bias.
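A minimal sketch of that weighting step, assuming a propensity model has already been fitted; the column names and the clipping bounds are illustrative assumptions.

```python
import numpy as np
import pandas as pd

def add_ipw_weights(df: pd.DataFrame, propensity_model, features: list[str]) -> pd.DataFrame:
    """Weight each logged record by the inverse probability of the treatment
    it actually received, so over-served treatments stop dominating the data."""
    p_treat = propensity_model.predict_proba(df[features])[:, 1]       # P(treated | features)
    p_observed = np.where(df["treated"] == 1, p_treat, 1.0 - p_treat)  # prob. of the observed treatment
    p_observed = np.clip(p_observed, 0.05, 0.95)                       # assumed bounds to avoid exploding weights
    df = df.copy()
    df["ipw_weight"] = 1.0 / p_observed   # e.g. propensity 0.4 -> weight 2.5, 0.2 -> weight 5.0
    return df
```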
Exploration and exploitation of new opportunities
This RL setup provides a continuous feedback loop: exploring the environment, searching for alternatives, and uncovering new possibilities while exploiting current knowledge to lock in early success.
We used the epsilon-greedy algorithm to balance exploration and exploitation, with the epsilon value controlling the exploration rate: an epsilon of 0.1 means 10% exploration and 90% exploitation. The more data we acquire over time and the better the model we train, the lower we can set epsilon.
Ultimately, we went through three stages to perfect our NBA engine. First, we started with exploration, learning through randomization. Then, in the second stage, we exploited that knowledge while the model kept learning.
Eventually, we reached the point of maximum optimization. This is what we call the stable stage: there is not much left to learn until a new action is introduced or customer behavior shifts.
In the end, we maintained a small epsilon of 0.1 to stay adaptive and ready for change.
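These three stages can be read as a schedule for epsilon: start with heavy exploration, decay it as the engine learns, and floor it at 0.1 so it never stops exploring entirely. A minimal sketch, where the starting value and decay rate are assumed example numbers:

```python
import random

def epsilon_for_round(t: int, start: float = 1.0, decay: float = 0.995, floor: float = 0.1) -> float:
    """Exploration rate for round t: decays from `start` toward the `floor`."""
    return max(floor, start * (decay ** t))

def epsilon_greedy_choice(q_values: dict, epsilon: float):
    """Pick a random action with probability epsilon, else the best-known one."""
    if random.random() < epsilon:
        return random.choice(list(q_values))
    return max(q_values, key=q_values.get)

# Round 0 is pure exploration; by the "stable stage" epsilon has settled at 0.1.
actions = {"replenishment": 0.12, "new_product": 0.08, "no_email": 0.05}
choice = epsilon_greedy_choice(actions, epsilon_for_round(t=1000))
```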
Treatment assignment with random control
Once the exploration and exploitation strategies are in place, the final step is to randomly select a percentage of customers as the control group to power the uplift model. This group serves as the benchmark to evaluate the treatment impact.
By comparing the behavior of this group with that of the group receiving the treatment, the model can estimate the incremental value between the two groups and, thus, the impact of the treatment.
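A minimal sketch of that holdout step, assuming a 10% control fraction (the actual percentage is not stated here) and a customer-level outcomes table with a binary responded column:

```python
import numpy as np
import pandas as pd

def assign_holdout(customers: pd.DataFrame, control_fraction: float = 0.10, seed: int = 42) -> pd.DataFrame:
    """Randomly flag a fraction of customers as the no-treatment control group."""
    rng = np.random.default_rng(seed)
    customers = customers.copy()
    customers["is_control"] = rng.random(len(customers)) < control_fraction
    return customers

# Measured impact = response rate of treated customers minus the control benchmark.
# treated_rate = outcomes.loc[~outcomes["is_control"], "responded"].mean()
# control_rate = outcomes.loc[outcomes["is_control"], "responded"].mean()
# estimated_impact = treated_rate - control_rate
```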
Putting It All Together: The Complete NBA Architecture
Putting all the modules together, this is our architecture diagram.
Data Set: The dataset comes from the program’s communication history collected under the RL setup, a self-sufficient framework for acquiring the data we need. The program is live and executed daily, and the computation covers millions of records spanning most of the Vista markets.
Feature Creation: We created features to feed into the NBA engine, mainly from customer profile data, transaction data, email engagements, and site activities. Aggregations covered the most common statistics, generated through Deep Feature Synthesis, an algorithm that automatically creates features across sets of relational data.
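The open-source featuretools library implements Deep Feature Synthesis; as an illustration of how such features can be generated from customer and transaction tables, here is a minimal sketch. All dataframe and column names are assumed examples, not our actual schema.

```python
import pandas as pd
import featuretools as ft

# Toy relational data: one row per customer, many transactions per customer.
customers_df = pd.DataFrame({"customer_id": [1, 2], "region": ["EU", "NA"]})
transactions_df = pd.DataFrame({
    "transaction_id": [10, 11, 12],
    "customer_id": [1, 1, 2],
    "transaction_time": pd.to_datetime(["2023-01-05", "2023-02-01", "2023-01-20"]),
    "amount": [25.0, 40.0, 15.0],
})

es = ft.EntitySet(id="nba_features")
es = es.add_dataframe(dataframe_name="customers", dataframe=customers_df,
                      index="customer_id")
es = es.add_dataframe(dataframe_name="transactions", dataframe=transactions_df,
                      index="transaction_id", time_index="transaction_time")
es = es.add_relationship("customers", "customer_id", "transactions", "customer_id")

# Deep Feature Synthesis: automatically aggregates child rows up to the customer
# level, e.g. MEAN(transactions.amount), COUNT(transactions).
feature_matrix, feature_defs = ft.dfs(
    entityset=es,
    target_dataframe_name="customers",
    agg_primitives=["mean", "sum", "count", "max"],
)
```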
ModelOps: We built a distributed modeling pipeline to perform parallel processing, track experiments, log parameters, and store artifacts to support continuous improvement for each iteration of NBA engine refinement.
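No specific tool is named here; as one common option for this kind of tracking, MLflow covers parameter logging, metric tracking, and artifact storage. A minimal sketch, with the experiment name, parameters, and metric values as placeholder assumptions:

```python
import mlflow

mlflow.set_experiment("nba-engine-refinement")   # illustrative experiment name

with mlflow.start_run(run_name="epsilon-0.1-iteration"):
    # Parameters of this training iteration.
    mlflow.log_param("epsilon", 0.1)
    mlflow.log_param("agg_primitives", "mean,sum,count,max")

    # ... train the bandit / uplift models here ...

    # Metrics and artifacts, so each refinement of the engine is reproducible.
    mlflow.log_metric("uplift_auc", 0.71)                # placeholder value
    mlflow.log_artifact("model_artifacts/engine.pkl")    # path is illustrative; file must exist locally
```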
Business Impact
With the NBA engine now in production, we are seeing promising results across all domains, including uplift in customer engagement.
It has achieved its goal of selecting the right email treatment for each customer while reducing the burden of overflowing inboxes, resulting in higher customer satisfaction and financial gains.
Moreover, it has also helped the marketing operations team move from a rule-based setup to an automated, model-driven implementation, another big win in process improvement.
Where We’re Heading Next
Next-Best-Action’s success has clearly demonstrated the power of combining online learning and machine learning in the contextual multi-armed bandit setting. With this engine, we can automatically adapt to our customers’ needs and behavior and provide personalized content.
This is just the beginning of the revolution in personalized customer experience powered by data and innovation. The future holds much more.
Considering a career change to work in a data-driven organization? Discover exciting career opportunities with Vista DnA.