A marketing analytics summer internship at Venmo
by Tom Vladeck, MBA Intern
Hi, my name is Thomas Vladeck. I’m in the middle of getting my MBA from Wharton and this summer I had a marketing analytics internship at Venmo. Prior to Wharton I did a variety of things related to climate policy, so although I majored in math a long time ago, this is a bit of a new thing for me. I loved the three classes in marketing and market research I took in my first year, and since Wharton is heavily quantitative, I felt ready to take on a new challenge.
Picking my project
My first step was to pick my project. I was hired with the understanding that (a) I’d do some sort of technical market research project and (b) I’d mostly manage myself. During my interviews I put together a plan I’d follow for defining the project, performing the analysis, and distributing my findings. My first task was to get acquainted with the team and figure out what everyone wanted to know - and which projects would be the highest-value ones I could work on.
Eventually we settled on trying to understand what types of people were not getting value out of Venmo and churning out. This would happen in two stages: first, I’d figure out which users were no longer using Venmo, and second, I’d correlate that with other things we could observe about those users to come up with a general finding.
A diversion into product survey data
But first! I got sidetracked. While getting acquainted with the data we stored in various places, I started looking into the surveys we send our users. Like most companies, we keep track of our Net Promoter Score. We also ask a bunch of follow-up questions about how our product is performing, such as “Do you find it easy to find the right person to pay on Venmo?” I noticed that we were sending out three different versions of the survey, with different sets of questions. Our PMs had a lot of questions, and no good way to sort out which ones they should be asking.
Ta Da!! I had just learned a tool to do this, and wasn’t going to let the opportunity go to waste. A tool called factor analysis can help marketers interpret the information they are getting from their surveys, and redesign them to ask fewer questions.
Factor analysis works on the following principle: we can observe only how our users answer the questions we ask them, but we can’t directly observe the “factors” that are important to them. The process uses fancy math to infer what factors are driving different answers to questions.
As it turned out, we were asking a lot of redundant questions; for example, we were asking five questions that began “I feel that it is easy…”. The graphs below are one output of the factor analysis (called a scree plot), and show how much variation in survey responses is accounted for by each factor - and clearly some factors are far more important than others. This meant that we could reduce the number of questions we ask and get the same amount of information.
Based on the factor analysis, we were able to reduce the number of questions we ask from 23 to just 8, and combine our three regular product feedback surveys into just one.
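To make the scree plot idea concrete, here is a minimal sketch in Python rather than R. It is not the analysis we actually ran: the survey responses are made up, and it only computes the eigenvalues of the question correlation matrix (the numbers a scree plot displays), not a full factor analysis. The function names are hypothetical, and a real analysis would lean on a linear algebra library instead of the hand-rolled power iteration used here to keep the example dependency-free.

```python
import math


def correlation_matrix(responses):
    """Pearson correlation between every pair of questions (columns)."""
    n_rows, n_cols = len(responses), len(responses[0])
    means = [sum(row[j] for row in responses) / n_rows for j in range(n_cols)]
    centered = [[row[j] - means[j] for j in range(n_cols)] for row in responses]
    norms = [math.sqrt(sum(row[j] ** 2 for row in centered)) for j in range(n_cols)]
    return [
        [sum(row[i] * row[j] for row in centered) / (norms[i] * norms[j])
         for j in range(n_cols)]
        for i in range(n_cols)
    ]


def eigenvalues(matrix, iterations=1000):
    """Eigenvalues of a symmetric positive semi-definite matrix, largest first,
    found by power iteration with deflation."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    found = []
    for _factor in range(n):
        start = [float(i + 1) for i in range(n)]  # arbitrary starting vector
        norm = math.sqrt(sum(x * x for x in start))
        v = [x / norm for x in start]
        for _step in range(iterations):
            w = [sum(a[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # nothing left to extract
                break
            v = [x / norm for x in w]
        # Rayleigh quotient of the unit vector v approximates the eigenvalue.
        lam = sum(v[i] * a[i][j] * v[j] for i in range(n) for j in range(n))
        found.append(lam)
        # Deflate: remove the component just found before looking for the next.
        for i in range(n):
            for j in range(n):
                a[i][j] -= lam * v[i] * v[j]
    return found


# Made-up answers (1-5 scale) from six respondents to four questions.
# Questions 1-3 are deliberately near-duplicates; question 4 is different.
responses = [
    [5, 5, 4, 2],
    [4, 4, 4, 5],
    [2, 2, 1, 3],
    [1, 1, 2, 4],
    [3, 3, 3, 1],
    [5, 4, 5, 3],
]

eigs = eigenvalues(correlation_matrix(responses))
shares = [e / sum(eigs) for e in eigs]  # the heights a scree plot would show
for rank, share in enumerate(shares, start=1):
    print(f"factor {rank}: {share:.1%} of variance")
```

With redundant questions like these, the first factor soaks up most of the variance and the last ones contribute almost nothing, which is the pattern that tells you questions can be dropped.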
Back to customer analytics
With that little mini-project out of the way, I turned back to the task of thinking about retention and churn at Venmo. Apps like Venmo have a much tougher time calculating churn than do subscription services like Dropbox or cable TV. In “contractual” settings, you can observe churn directly when people cancel their subscription; by contrast, if someone doesn’t use Venmo for a while, there’s a chance they just haven’t been in the right place or mood and will come back.
I had planned to dive right into this sexy stochastic model that would put a probability on each user being “alive” but I was urged by our GM, Mike Vaughan, to start simpler. The sexy stochastic model would come later.
The first thing he suggested doing was creating a “transition matrix.” Like most every other tech company, we measure our “monthly actives” - the number of users that show up in a given month. But we weren’t measuring how much “turnover” there was in our active user base. Were active users staying active? Were inactive users becoming active and vice versa? There was no way to tell.
A transition matrix has every user fitting into one of these cells:
This can give more detail on what our retention looks like. For a given monthly active number, more turnover is better: it means the actives are drawn from a larger pool of people, so more users overall are still using Venmo.
With Mike Cohen - my strategy mentor - I wrote some R code that combed through our database to track individual users to fill out this matrix. Sure enough, in addition to high retention, we also had high turnover. Since it was worth keeping track of, I worked with JT Glaze - data engineer impresario - to sketch out the Python code that will ultimately feed into a Looker dashboard for the company to keep track of.
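For a rough idea of what filling out a transition matrix involves, here is a small Python sketch. It is not the code we wrote: the user IDs are made up, the function name is hypothetical, and it assumes the simplest version of the matrix, with two states (active, inactive) across two consecutive months.

```python
from collections import Counter

STATES = ("active", "inactive")


def transition_matrix(all_users, active_last_month, active_this_month):
    """Count users in each (state last month, state this month) cell."""
    counts = Counter()
    for user in all_users:
        before = "active" if user in active_last_month else "inactive"
        after = "active" if user in active_this_month else "inactive"
        counts[(before, after)] += 1
    # Return every cell, including empty ones, so the matrix is always 2x2.
    return {(b, a): counts[(b, a)] for b in STATES for a in STATES}


# Made-up example: six users, three monthly actives in each month.
all_users = {"u1", "u2", "u3", "u4", "u5", "u6"}
active_last_month = {"u1", "u2", "u3"}
active_this_month = {"u2", "u3", "u4"}

matrix = transition_matrix(all_users, active_last_month, active_this_month)
for (before, after), count in matrix.items():
    print(f"{before:>8} -> {after:<8} {count}")
```

In this toy example the monthly active number is 3 in both months, so the headline metric looks flat, yet one user lapsed and another came back. That movement is exactly what the matrix exposes.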
The sexy stochastic model
With our transition matrix presented to the team, I turned my attention back to modeling our user base. It turns out that many of the good folks back at Wharton have spent a lot of time thinking about churn rates in a non-contractual setting, and have developed some models and R code to work it out.
The model that seemed to apply best is called the Pareto/NBD model. Roughly speaking, it assumes that every Venmo user, while they’re alive, has a constant probability of using the app on any given day - but those probabilities are different for every user, and they vary according to a prior distribution. Similarly, the model assumes that users have a constant probability of churning out each day, and again, these probabilities are different for each user, and vary according to a prior distribution.
The beauty of this model is that it only needs a few pieces of information for each user (how long they’ve used Venmo, when they last used it, and how frequently they use it), and there is a really useful R package that will do most of the heavy lifting (although we did have to patch some functions).
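To show what those assumptions look like in practice, here is a minimal Python simulation of users who behave the way the model says they should. It is a sketch, not our fitting code and not the R package: the parameter values are invented, the function names are hypothetical, and it follows the standard continuous-time formulation (Poisson usage while alive, exponential lifetimes, gamma-distributed rates across users). For simplicity it starts every user's clock at time zero instead of at their first transaction.

```python
import random


def simulate_user(rng, r, alpha, s, beta, observation_days):
    """Simulate one user and return the summary the model needs."""
    # Each user draws their own usage rate and dropout rate (per day).
    usage_rate = rng.gammavariate(r, 1.0 / alpha)    # shape r, rate alpha
    dropout_rate = rng.gammavariate(s, 1.0 / beta)   # shape s, rate beta
    lifetime = rng.expovariate(dropout_rate)         # days until churn
    horizon = min(lifetime, observation_days)

    # Transactions arrive as a Poisson process until churn or end of window.
    times = []
    t = rng.expovariate(usage_rate)
    while t <= horizon:
        times.append(t)
        t += rng.expovariate(usage_rate)

    return {
        "frequency": len(times),                  # how often they used it
        "recency": times[-1] if times else 0.0,   # when they last used it
        "age": observation_days,                  # how long we observed them
        "alive": lifetime > observation_days,     # hidden in real data
    }


def simulate_cohort(seed, n_users, **params):
    rng = random.Random(seed)
    return [simulate_user(rng, **params) for _ in range(n_users)]


# Invented parameters: about 0.1 uses per day and a 0.005 daily dropout rate
# on average, observed over 90 days.
cohort = simulate_cohort(
    seed=42, n_users=1000, r=2.0, alpha=20.0, s=1.5, beta=300.0,
    observation_days=90,
)
share_alive = sum(u["alive"] for u in cohort) / len(cohort)
print(f"share still alive after 90 days: {share_alive:.1%}")
```

The "alive" flag is only visible because this is a simulation. With real users you see frequency, recency, and age, and the model's job is to turn those three numbers into a probability that the user is still alive.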
With tons of data and a state-of-the-art model at our disposal, we plugged and chugged, and… bummer:
As you can see here, the model is substantially underpredicting our holdout data. We scratched our heads, and we ended up finding the culprit: clumpiness! (No, really, that’s the term). It basically means that our users don’t have a constant probability of using the app every day. Some weeks you’re with friends sharing payments left and right, other weeks you’re heads down at the library studying for finals and barely going out.
Can we measure clumpiness? Of course we can! Some more good folks at Wharton have come up with a measure based on the familiar notion of entropy. When we calculated it for our users, we found that a huge number of our users were “clumpy.”
So, basically, our users are binge-users of Venmo. This jibed well with the data that our user base was turning over significantly. It also meant that a critical assumption of the Pareto/NBD model was violated. On the bright side, we learned something new about our users.
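Here is a small Python sketch of what an entropy-based clumpiness score can look like. It is an assumption on my part about the general shape of such a measure, not necessarily the exact formula we used: it takes the gaps between a user's active days (including the gap before the first and after the last), scales them to sum to one, and compares their entropy to the maximum possible. The function name and the example days are made up.

```python
import math


def clumpiness(event_days, window_days):
    """Score in [0, 1]: 0 for perfectly even spacing, higher for clumpier use.

    event_days are distinct day numbers between 1 and window_days.
    """
    n = len(event_days)
    if n == 0:
        raise ValueError("need at least one event")
    days = sorted(event_days)
    if len(set(days)) != n or days[0] < 1 or days[-1] > window_days:
        raise ValueError("event days must be distinct and inside the window")

    # Gaps between consecutive events, padded with the window edges,
    # then scaled so that they sum to 1.
    boundaries = [0] + days + [window_days + 1]
    gaps = [(b - a) / (window_days + 1) for a, b in zip(boundaries, boundaries[1:])]

    # Sum of g*log(g) is minus the entropy; dividing by log(n + 1) normalizes
    # by the entropy of n + 1 equal gaps.
    negative_entropy = sum(g * math.log(g) for g in gaps)
    return 1 + negative_entropy / math.log(n + 1)


# Two made-up users, each active on two days in a 29-day window.
steady_user = clumpiness([10, 20], window_days=29)   # evenly spread out
binge_user = clumpiness([1, 2], window_days=29)      # back-to-back days
print(f"steady: {steady_user:.2f}, binge: {binge_user:.2f}")
```

Both users have the same frequency, which is why a model that only looks at how often someone shows up cannot tell them apart.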
With the stochastic modeling route closed off until someone (or we) make inroads into extending customer lifetime value calculations to customers with hidden states (in a way that’s computationally feasible), we turned to segmenting our user base, including clumpiness as a segmenting variable.
You may be wondering, “what is segmentation?” Basically, it’s an attempt to classify your users into different types. You may know of “soccer moms” and “NASCAR dads” from the political arena. Same thing. For example, some people on Venmo are in college and use it when they go out with their friends; others are older and use it only for rent. We found these archetypal users by using a technique called model-based clustering.
In addition to working in R, I really enjoyed a few other things about this summer. The first is pairing. Venmo has a ton of pairing rooms where you can sit next to a teammate and work off the same computer. My mentor Mike and I spent countless hours working together on problems - sometimes as simple as going through an academic paper or writing an email - and we were at least five times as productive as we would have been individually.
The other thing I really enjoyed about Venmo is demo days. Every other Friday the various Product, Engineering, and Design teams demo what they’ve been working on. Over the summer I demoed a few times. At first I doubted that people would be interested in this quant-heavy marketing stuff, but I was very pleasantly surprised to find that people really enjoyed it.
Finally, Venmo was just a plain fun place to work at. I mean, I was so into playing dodgeball that I basically threw my arm out:
I had such a great summer that I even wrangled my way into continuing to work on projects during the school year! Although I’ll miss being in the office every day, I’m excited that I’ll get the opportunity to continue scratching my statistical-modeling itches.