
Our guests today are Brian Krebs and Anthony Cross. Brian is the founder and CEO of MetricWorks, and Anthony is the former Head of UA and Data Science at Big Fish Games. In today’s conversation they offer very interesting perspectives on why SKAdNetwork isn’t enough – and what it needs to be supplemented by in order for mobile measurement to truly reflect the value of marketing efforts.

In today’s conversation, we reflect on why measurement in a world where marketing is influenced by multiple variables has no clear and easy answers – certainly none as seemingly simple as the solutions that deterministic measurement offered. The solution then is to embrace the multiple variables involved – and use an approach that is a mix of SKAdNetwork, econometric modeling and intelligent experimentation.

I’m thrilled about this episode for how Brian and Anthony deconstruct the many nuances of truly measuring marketing impact, and offer what might be the smartest way to approach an inherently messy paradigm.






ABOUT: Brian Krebs  | Anthony Cross | MetricWorks




ABOUT ROCKETSHIP HQ: Website | LinkedIn  | Twitter | YouTube


KEY HIGHLIGHTS

🌓 Why last touch attribution does not give the whole picture

🏋️‍♂️ Incrementality is THE criterion for marketing effectiveness

🤏 The limitations of SKAdNetwork signals for measurement

🧮 Top statistics-based approaches to measuring incrementality that work

🍼 How statistical models rely on IDFAs to be useful

🧨 No-IDFA is going to blow up the measurement landscape

📚 Big Fish used multiple sources of data in a custom dashboard to solve for attribution

💡 How Big Fish used data science to get marketing insights

🦋 Using data science to compare LTV curves through a gaming app launch cycle to amp up ROAS

◀️ Reverse engineering LTV curves to inform ad spend decisions

🗺️ Mapping in-app events to monetisation and retention opportunities

💸 Spend needs to tie back to business, and LTV is the only way to do that

📍 Where to apply econometric modelling in marketing

🎨 How to use media mix models for marketing

🏗️ How to build models that calculate impact and outcomes

💯 Micro and macro level factors to consider for accurate predictions

🍃 How seasonality and trends impact lift

⚙️ Tweaking factors to generate insights about what-if scenarios

👋 The journey from deterministic to probabilistic has already begun

🧪 Testing successful campaigns with interrupted time series can give counterintuitive results

💪 How the post-IDFA landscape is strengthening the case for incrementality testing

🔻 The opportunity cost for interrupted time series experiments

⚖️ The careful balance in ITS experiments

🔎 Why models are necessary alongside experiments for real validation 

🧱 How to construct an econometric model by choosing the right variables

📅 Why daily data is better than weekly data for training models in spite of the increased noise

📈 The value of seasonality and trends encoding in models

🔮 The key to good prediction for granularity in each event 

🔬 Understanding counterfactual experiments and setting baselines

🥇 Why incrementality testing is the gold standard now but not post-IDFA

👬🏻 The two-fold team buy-ins for testing

🤝 Why transparent partners are very important to a UA team

🌌 Why experiments are tied to a certain level of scale

🚼 Why single-channel apps don’t have to worry about incrementality until they scale

👌 The tried and tested approach to learning how to model

KEY QUOTES

Why incrementality testing needs IDFAs

Take incrementality testing, for example: without a big pool of IDFAs to take that audience and randomly split it between a control group and a treatment group, you are not able to run the randomised controlled trial that is the basis of incrementality testing.

Finding answers to what-if scenarios

What if all this spend, all the impressions, all the clicks, all the marketing activity from a particular media source, or a particular country, or a particular creative, and any combination of those at any granularity, is gone? Then the model tells us: well, this is what would happen, your overall installs would decrease by 1,000. Or your overall day-7 revenue would decrease by $100. That represents the incrementality.

The difficulty of running interrupted time series tests

It’s always hard when you’ve got a campaign running, it’s clearly making you money, and then you turn around and say: “Look, we’re gonna stop this for a month because you want to do some testing!”

Weighing the opportunity cost vs the information gain

You have to be able to quantify that balance, so that marketers can make the best decisions for them: 

  • So what is the information gain if I run this experiment? 
  • And what is the likely opportunity cost based on how much money that I’m historically running through that set of traffic? 

Using multiple, interconnected models for best results

The naive approach is training one model with all those factors to predict installs, and then, whatever that model says the install incrementality is, you allocate the revenue and all the downstream events proportionally. That is not a good way to do it, because some media partners, campaigns, or countries may be giving you a lot of incremental install lift, but give you almost no revenue or other downstream events.

So you want to look at training separate models for each individual event and cohort day.

Zeroing in on incrementality

Some marketing intervention happens at some specific time period, and you measure the actual observed impact of that intervention. And you measure the expected impact if the intervention had never happened; we call that the counterfactual. That could be based on a control group: you pause German traffic, but you don’t do anything to Italian traffic; Italian traffic would be the control group.

Or it could be synthetic, where you use something called causal inference to synthetically build a counterfactual, or a baseline, in lieu of a real control group. And you simply compare the actual observed effect against the expected effect without the intervention, the baseline. And that’s your incrementality right there.

The minimum scale for calculating lift

When you are a new app—which is probably the only type of title that would be out there buying on a single channel—then more than likely you don’t have significant organic demand you have to worry about.

FULL TRANSCRIPT BELOW

Shamanth: I’m very excited to welcome Brian Krebs and Anthony Cross to The Mobile User Acquisition Show. Brian, Anthony, welcome to the show.

Brian: Excellent, glad to be here. 

Anthony: Thank you, Shamanth.

Shamanth: Absolutely thrilled to have you guys. 

Because certainly, we’re going to talk about something that’s top of mind for a lot of marketers. And you guys have very interesting insights and approaches to thinking about measuring in a paradigm where precise measurement may not be possible at all. And that’s a shift that I think is going to be very seismic for everyone in marketing. And I’m thrilled to have you guys just speak about what I think is a very unique and interesting approach to everything that’s going to happen.

To start off, as we’ve known, and as certainly we’ve talked about on the show before: SKAdNetwork is not enough in and of itself for measurement. And I think one reason for that is just that it’s a last-touch approach, and that can be problematic in very many ways. 

So what do you see, Brian, as some of the key approaches that can supplement SKAdNetwork?

Brian: Yeah, that’s a great question. And I definitely agree with your premise: last touch attribution has a bunch of pitfalls. I know that your viewers probably understand a lot of those already. 

I think they can be summed up, as it pertains to what we’re talking about today, in really just one theme, which is that they’re unable—just intrinsically—to measure incrementality. And I just believe that measuring incrementality is the only way to align measurement with business value. And that’s absolutely what measurement should be: measuring marketing effectiveness. And there is no marketing effectiveness without an alignment with business value.

As far as the second point you made: absolutely, SKAdNetwork is not enough. The interesting thing is SKAdNetwork is not only hamstrung by the fact that it implicitly carries with it the inherent flaws of last touch attribution, but it’s also a limited last touch attribution signal itself.

What I mean by that is it’s a last touch signal that by its nature, by design, is meant to protect user privacy. That manifests itself in the limit of 100 campaign IDs, in the limit of 64 conversion values, and so on and so forth.

So what we have seen is a variety of ways that are already out there today. Some aren’t very prevalent within performance marketing, even though they’re prevalent elsewhere, mostly in brand marketing. Things like incrementality testing, things like time series or interrupted time series experimentation, and certainly econometric modelling. Implementations of econometrics have been explored before, like media mix modelling, which we actually found does not apply properly to the mobile measurement problem.

But what a lot of people don’t know, even if you’ve heard of some of those approaches to measurement, is that while they can all measure incrementality—so they have that big leg up on what we see today, as well as what SKAdNetwork will be—there’s one big problem: some of those rely on IDFA.

Take incrementality testing, for example: without a big pool of IDFAs to take that audience and randomly split it between a control group and a treatment group, you are not able to run the randomised controlled trial that is the basis of incrementality testing. So much of the landscape changes in terms of the available measurement approaches, as soon as IDFA is completely deprecated.
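To make the mechanics concrete, here is a minimal sketch of the randomised split Brian describes, assuming you still have a pool of IDFAs and can observe an outcome such as D7 revenue per device after the campaign runs. The function and variable names are illustrative, not any vendor’s API.

```python
import random


def split_audience(idfas, treatment_share=0.5, seed=42):
    """Randomly split a pool of IDFAs into a treatment group (served ads) and a
    control group (held out, or shown placebo/ghost ads)."""
    rng = random.Random(seed)
    shuffled = list(idfas)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * treatment_share)
    return shuffled[:cut], shuffled[cut:]


def incremental_lift(outcomes, treatment, control):
    """Per-user incremental lift once the campaign has run, where `outcomes`
    maps IDFA -> observed KPI (e.g. D7 revenue). The control group's average
    is the baseline of what would have happened without the ads."""
    treated_avg = sum(outcomes.get(u, 0.0) for u in treatment) / len(treatment)
    control_avg = sum(outcomes.get(u, 0.0) for u in control) / len(control)
    return treated_avg - control_avg
```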

Shamanth: Yeah, right. And as you pointed out, Brian, a lot of these approaches are prevalent, but just haven’t been applied to mobile, just because you had the IDFAs available to make deterministic decisions. 

So Anthony, you led the UA and data science functions at Big Fish Games, and you did this in a pre-iOS 14 world, leading a lot of the data science efforts at the time. This was in a fully deterministic world. What were some of the key types of analysis that you drove at the time, and that you found impactful, that you couldn’t attain just via the deterministic measurement that MMPs offered?

Anthony: Yes, Shamanth. That’s a really good question. The first thing that I did want to point out is that MMPs are obviously a source of attribution data that everyone uses, but at Big Fish Games, we were pretty fortunate in that we actually had a custom built UA platform that took the data feed from our MMPs. But also, we had direct API feeds from other major advertising partners, and also had a direct data stream coming from our games. 

So just the first point to make is that it was always hard to learn a single version of the truth from a single source. So in our case, we had the MMP data, we had the advertising company data that we work with, as well as having our internal game data coming through. And really one of the things we did is triangulate those resources to get to a level of truth. And that obviously involved data science to assist with that as well. 

But to answer your specific question about the types of analysis that our data science team did at Big Fish; I’ve a couple of examples to share. 

One example is predicting LTV curves. We had a new game that launched—a quick plug for it: it was a great game called EverMerge and launched back in May. And as anyone that’s launched a new game will know, you typically go through a soft launch period where you look at LTV curves there, but those curves typically are pretty different when you actually launch. Then you have a launch period, you have a golden cohort come in and typically fantastic LTV curves there. And then after a month or so you get into more of a steady state and your LTV curves change again. 

So one area that was critically important for us is having a data science team that was doing pretty regular cohort reviews of LTVs, through soft launch through that golden cohort period, and then afterwards. And constantly updating that LTV view of what revenue was looking like. And obviously, then applying that to the cost of buying to make sure that we were achieving a certain level of ROAS. So that was one critical area of data science with launching a game.
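As a rough illustration of those cohort reviews, here is a minimal pandas sketch that builds cumulative LTV curves per launch phase from user-level install and revenue data. The column names and phase labels are assumptions, not Big Fish’s actual schema.

```python
import pandas as pd


def ltv_curves(revenue_events: pd.DataFrame, installs: pd.DataFrame) -> pd.DataFrame:
    """Build cumulative LTV curves per launch phase.

    Assumed inputs:
      installs:       one row per user -> columns [user_id, install_date, phase],
                      where phase is e.g. 'soft_launch', 'golden_cohort', 'steady_state'
      revenue_events: one row per purchase -> columns [user_id, event_date, revenue]
    """
    df = revenue_events.merge(installs, on="user_id")
    df["day"] = (df["event_date"] - df["install_date"]).dt.days

    # Average cumulative revenue per installed user, by days since install.
    cohort_sizes = installs.groupby("phase")["user_id"].nunique()
    daily = df.groupby(["phase", "day"])["revenue"].sum().unstack("phase").fillna(0)
    return daily.cumsum().div(cohort_sizes, axis=1)  # index: day, columns: phase

# Example: compare D30 LTV across phases to refresh ROAS targets
# ltv = ltv_curves(revenue_events, installs); print(ltv.loc[30])
```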

Another example of this—I’m sure a lot of folks out there use Facebook—is that we used Facebook AEO campaigns; that was one of our main vehicles for using Facebook. Another example of data science is that we actually used the data science team to identify the specific events within a game that should be optimised around for those AEO campaigns.

So what we’re looking to do there typically is, to say, what events within a game map well to people being either good monetizers or having really great retention rates, or whatever it is you’re trying to optimise for. For example, if you have a match 3 game, what level of progression do you want a person to get to in that match 3 game for you to be able to optimise around? So just some examples there.
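One hedged way to approach that event selection is to score candidate events by how strongly reaching them early predicts later monetisation or retention. The sketch below assumes a simple per-user table and is illustrative only, not the process Big Fish actually ran.

```python
import pandas as pd


def score_optimisation_events(users: pd.DataFrame, candidate_events: list[str]) -> pd.DataFrame:
    """Rank candidate in-game events as AEO-style optimisation targets.

    Assumed input: one row per user, with a boolean column per candidate event
    (did the user reach it by some early day, e.g. D3) and outcome columns
    'is_payer_d30' and 'retained_d30'.
    """
    rows = []
    for event in candidate_events:
        reached = users[users[event]]
        rows.append({
            "event": event,
            "reach_rate": users[event].mean(),                    # enough volume to train on?
            "payer_rate_if_reached": reached["is_payer_d30"].mean(),
            "retention_if_reached": reached["retained_d30"].mean(),
        })
    # A good optimisation event fires often enough to give the network signal,
    # yet is strongly associated with the monetisation/retention outcome you care about.
    return pd.DataFrame(rows).sort_values("payer_rate_if_reached", ascending=False)
```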

Brian: Anthony, I was just wondering about a couple of examples that you just talked about, how do you see the type of data science that you’re performing in order to solve these problems changing when IDFA is gone?

Anthony: Yeah, that’s a really good question, Brian. Just taking the two examples I shared in turn, you’re always going to want to predict LTV; that doesn’t change at all. That is your beginning point about tying back spend to business value. That’s a pretty critical thing you have to do, running a UA team. So that doesn’t change and the data flow for LTV curves, again, in our case came directly from the game itself. So that is going to still be a big need. I think there is a really interesting question around Facebook and how well they’re going to be able to even offer AEO as a campaign. But obviously, if they can’t, then all that goes away.

Shamanth: Right, and I imagine some of what you did back then has parallels with what might be required going forward, Anthony, because it’s still going to be important to notice and realise what events correspond to revenue and LTV, because those will inform the conversion value strategy going forward, if not your AEO strategy going forward.

Anthony: Yep, you’re exactly right. It’s parallel work or similar work.

Shamanth: Yeah. And if we were to look ahead by a few months, and even preparing now, for a few months from now, Brian, you spoke of alternatives to SKAdNetwork, or what SKAdNetwork needs to be supplemented by. You spoke of a couple of approaches within that, when you talk about econometric modelling. I understand this identifies and quantifies certain inputs and certain outputs in marketing cases; those should be fairly clear as to what they are. But I’m curious, where has this been applied? And what have some of the results been of this broad approach?

Brian: That’s a great question. Econometrics was obviously formed to answer questions about the economy, but the methodology is applied in a bunch of different places, certainly even in marketing, though mostly within brand marketing: consumer packaged goods type spaces, and even the larger firms within the e-commerce space.

And what they’re mostly using in terms of econometrics is the implementation of econometrics that we call media mix model or marketing mix modelling. So they’re used for slightly different things—actually, you could really say wildly different things—than the needs of marketers in terms of attribution or marketing measurement. What you can imagine them being is a model or an equation really—because that’s all a model is—that is describing a bunch of factors to try to understand or predict some sort of outcome. 

So definitely, in terms of the economy, a lot of researchers are looking at questions like, what are the factors involved in predicting unemployment rates, and all these different things, certainly, even in terms of what we’re looking at right now: how does COVID-19 and case rates and things like that actually impact unemployment and GDP and other areas of the economy. 

So what you’re really looking at is you’re trying to model the factors that could affect this outcome, and what the exact impacts of those factors are on the outcome. 

As an example, when applied to mobile measurement, the factors could include things like seasonality. We see a lot of cases where day of week plays a big role in your overall KPIs, your lift, in terms of revenue, but also certainly volume and retention, based on when the user installs or the day the user performs some sort of activity. There are often spikes in lift on weekends rather than weekdays. You see seasonality in terms of the month of the year play a role, certainly as you get towards the end of the year, towards the holidays. For most countries, there are a lot of spikes involved there. So there’s some seasonality.

There are also just plain trends, and a specific type of trend that we call a local trend, where overall the app might be on its way up in terms of installs, in terms of revenue, all kinds of different KPIs. But there might be little local dips, where some bug in the app was pushed and, all of a sudden, that created a local trend that was actually a decrease in overall lift. So we add those things too.

And obviously, in terms of marketing, a lot of the factors revolve around your marketing activity: your spend, your impressions, clicks, even. And once all those factors, those variables, are input into this model, and the effects that those particular factors have on the outcome, whether that is installs, revenue, retention, LTV, ROAS are modelled, you can actually do a lot of things with that model. 

One of the things that we do is ask:

what if all this spend, all the impressions, all the clicks, all the marketing activity from a particular media source, or a particular country, or a particular creative, and any combination of those at any granularity, is gone? Then the model tells us: well, this is what would happen, your overall installs would decrease by 1,000. Or your overall day-7 revenue would decrease by $100. That represents the incrementality.

What you can do with that is run this analysis using that model in order to determine incrementality across any granularity and report that in a very, very similar way, essentially the same way that last touch is reported by the MMPs. There’s just different allocations.
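Here is a toy version of that what-if analysis, under the assumption that you already have some model fitted on daily marketing inputs. A plain linear regression on synthetic spend columns stands in for the much richer structural model Brian describes: zero out a slice of traffic, re-predict, and read the difference as that slice’s incrementality.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy daily design matrix: one column per (channel, country) spend series.
rng = np.random.default_rng(0)
columns = ["facebook_US_spend", "unity_US_spend", "facebook_DE_spend"]
X = rng.gamma(shape=2.0, scale=500.0, size=(120, len(columns)))          # 120 days of spend
installs = X @ np.array([0.8, 0.3, 0.5]) + rng.normal(0, 200, 120) + 1500  # plus an organic base

model = LinearRegression().fit(X, installs)


def incrementality(model, X, zero_cols, columns):
    """Predicted KPI with traffic as-is, minus the prediction with the chosen
    channel/geo columns zeroed out. The difference is the modelled lift that
    this slice of marketing is responsible for."""
    X_cf = X.copy()
    for col in zero_cols:
        X_cf[:, columns.index(col)] = 0.0
    return (model.predict(X) - model.predict(X_cf)).sum()


print(incrementality(model, X, ["facebook_US_spend"], columns))
```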

Shamanth: Yeah, and these sound like powerful approaches to understanding very complex systems, where you cannot boil down causality to one single cause. And certainly that’s true of economies. That’s true, certainly of marketing, just because there are so many factors in play. 

Anthony, I would be curious to hear if this is something you guys thought about or factored in, as you guys worked on games with multiple channels, and multimillion dollar budgets, where you ostensibly had deterministic, clear ROAS numbers, but you still couldn’t be entirely certain if they were adding incremental value.

Anthony: It’s really interesting. There’s so much that is going to be changing in the world of UA over the next six months. 

I think the deterministic approach is so compelling, and it’s such a strong way of thinking about things: you do X, you get Y. You have the data so you can associate the two things together. 

I think the reality, especially with LAT and the pretty significant increase in people who have turned LAT on over the last year or so, is that purely deterministic methods of tracking have already started to become less deterministic, and probably more probabilistic, over time. It’s just that we haven’t acknowledged that, because we’ve still had our dashboards and data that keep telling us everything looks fine. But I think the reality under the covers is that the signal is becoming a little bit less precise. And obviously, with iOS 14 it will continue to do so.

So I think that there’s going to be some pretty big transitions. And one of the things that I was really keen on at Big Fish, in a world where there’s a lot of uncertainty, is this notion that you have to try a lot of different things. It’s like a horse race: you just don’t know in advance which horse is going to win. So you want to back as many horses as you can, and see which one of those then gets across the finish line. So that was the approach I was taking: at least try a lot of different things.

Shamanth: Certainly, and in a deterministic world, I don’t think there was a lot of pressure to truly push for incrementality. Certainly, I think there were approaches that did involve incrementality. I think the most common one is the interrupted time series: turn off traffic in a certain geo at a certain point in time and see what happens. Was that something you guys thought about? Is that something you would recommend doing going forward with iOS 14?

Anthony: It’s always hard when you’ve got a campaign running, it’s clearly making you money, and then you turn around and say: “Look, we’re gonna stop this for a month because you want to do some testing!” 

Shamanth: Yeah. 

Anthony: That doesn’t go very well with business; understandably so. And again, prior to iOS 14, it was like why on earth would you do this? So that was always the challenge for us. We discussed it, but we didn’t do it because of the business impact—especially when you’re just launching a new game, again you’ve gone global, and it’s going crazy—why on earth would you do that. So we discussed it, we didn’t actually do it. 

But I think, at some point, you have to bite the bullet and do the testing. And maybe you choose a geography where you’re not getting—you wouldn’t choose US or a tier one; maybe choose a tier two or tier three to do the testing. And the business impact is much less. I think you have to allocate some budget, and take away that budget and just test.

Brian: I definitely agree with that. 

I think there are techniques that we’ve been adapting slowly. You’re both absolutely right about the fact that we’ve been a little bit lulled to sleep by the comforts and hidden dangers of last touch attribution; it was just so easy and simple and understandable. And this idea of a deprecation event is definitely acting as an inflection point, kind of waking us all up. Maybe this is the time to try to measure incrementality.

It had been happening very, very slowly already, in terms of incrementality testing. Facebook has its uplift tool, and some individual channels or marketing partners were starting to evolve some tooling around measuring uplift using incrementality testing concepts.

I think the issue there is that many of the incrementality testing techniques people were using were just flawed. In general, I think it’s still the gold standard for measuring incrementality, but without that big pool of IDFAs, you’re kind of dead in the water. And that big pool of IDFAs has to be opted in across enough apps so you can reach them.

But it was expensive, for the most part, until this advent of ghost ads. It was very expensive to show public service announcement type ads or placebo ads; you still have to buy them. So ghost ads and things came up, but you have to have a lot of trust in your marketing partner to go with the ghost ads route. 

So given some of the problems that we faced in incrementality testing, and the fact that it’s basically just gone once IDFA is gone, interrupted time series is kind of the last great way. It’s very similar to incrementality testing in that it’s well adopted in areas like pharmaceuticals and medicine for very good reasons; areas where you have to be very, very critical in determining causality. So this becomes one of the last bastions of being able to understand causation in terms of marketing, because it doesn’t require IDFAs.

The thing there is there’s an opportunity cost when you go to pause something. An ITS experiment, or interrupted time series experiment, requires an intervention or an interruption. It doesn’t technically have to be a pause of traffic; it could be a decrease in budget, even an increase in budget, but you just gain less information about incrementality that way. Pausing definitely is the best, but it’s this balance.

And the keys, I think, are twofold, based on what we’ve seen in our research. One is that

you have to be able to quantify that balance, so that marketers can make the best decisions for them:

  • So what is the information gain if I run this experiment? 
  • And what is the likely opportunity cost based on how much money that I’m historically running through that set of traffic? 

And that balance between information gained and the opportunity cost is really critical to quantify, so the marketer can make the best decisions for them. 
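The episode doesn’t spell out how to quantify that balance, so the sketch below is just one illustrative framing under stated assumptions: the opportunity cost is approximated by the slice’s recent profit run-rate times the test length, and the information gain by how much the experiment is expected to tighten the uncertainty around that slice’s incrementality. None of these functions reflect MetricWorks’ actual methodology.

```python
def opportunity_cost(daily_profit_history, test_days):
    """Rough profit forgone if this traffic slice is paused for `test_days`,
    using the last four weeks as the run-rate (an assumption)."""
    recent = daily_profit_history[-28:]
    return sum(recent) / len(recent) * test_days


def information_gain(prior_ci_width, expected_posterior_ci_width, monthly_spend):
    """Crude proxy: how much the experiment is expected to shrink the uncertainty
    band around this slice's incrementality, valued against the spend it governs."""
    return (prior_ci_width - expected_posterior_ci_width) * monthly_spend


def should_run_test(daily_profit_history, test_days, prior_ci, posterior_ci, monthly_spend):
    gain = information_gain(prior_ci, posterior_ci, monthly_spend)
    cost = opportunity_cost(daily_profit_history, test_days)
    return gain > cost, gain, cost

# e.g. pausing a geo for 14 days that nets ~$2k/day, hoping to cut the
# incrementality uncertainty on $300k/month of spend from ±40% to ±10%:
# should_run_test([2000.0] * 60, 14, 0.40, 0.10, 300_000)
```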

The other thing is you have to have the econometric modelling side by side, or some sort of modelling side by side with this sort of experimentation. Because otherwise, all these experiments, no matter how they’re designed, have one flaw, which is in terms of scope. You’re only measuring the traffic that you’re running the experiment on. 

With an econometric model, or some sort of model, alongside this testing or experimentation, you’re able to potentially validate many different areas of traffic by running a single test. Because that model is modelling so many different interactions between these various channels, marketing partners, campaigns, geos, and all that, you’re able to run one test and validate many different areas of traffic all at once. So we’ve found that it’s absolutely critical to have this two-pronged approach, rather than just testing or experimentation, or just econometric modelling. They’re both much less than the sum of their parts unless they’re together.

Shamanth: Certainly. And out of curiosity about the approach that involves econometric modelling: there are inherently so many variables involved in such a complex system. Is it easy to come up with a model like that? If somebody wanted to come up with an econometric model for their game, or their portfolio of apps, how does one go about constructing such a model? Is it basically identifying all the variables that could be causes of revenue or retention, or whatever outcomes you are looking for? And how do you identify which variables are the most impactful?

Brian: Yeah, it’s a great question. So it all kind of starts with the model. And based on our research and the live beta testing that we’re doing right now with many of the largest gaming studios in the world, I can say that we settled on Bayesian structural time series models.

If you are going to try this in-house, I definitely recommend that type of model, in order to have many different types of components within the model, including a matrix of spend. There are a couple of other things that you want to take a look at.

  • Number one, you should be using daily data. It introduces less correlation, or it allows for decorrelation; that said, it does introduce more noise. But using weekly data, which is the tradition within media mix modelling, is impossible for marketing measurement, specifically because you would need so many weeks of data in order to have enough observations to train a proper model. And in terms of marketing measurement, you just can’t wait that long to make decisions. So you want to use daily data. 
  • There’s another thing as far as the structure of the model: definitely include seasonality. For almost everyone that we’ve looked at so far, there’s some sort of seasonality to their data. Look at day of week, for sure; month of year, for sure; holidays; all those things can be encoded in the model. Definitely include trends, including local linear trends. [A minimal sketch of such a model follows below.]
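Brian names Bayesian structural time series; a fully Bayesian implementation (PyMC, or the bsts/CausalImpact family in R) is more involved, so the sketch below uses the closely related UnobservedComponents model from statsmodels to show the ingredients he lists: daily data, a local linear trend, day-of-week seasonality, and spend regressors. The data and column names are synthetic stand-ins.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.structural import UnobservedComponents

# Stand-in daily data: the KPI being modelled plus spend regressors per channel.
rng = np.random.default_rng(1)
idx = pd.date_range("2020-06-01", periods=180, freq="D")
spend = pd.DataFrame(
    rng.gamma(2.0, 400.0, size=(180, 2)), index=idx, columns=["facebook_spend", "unity_spend"]
)
dow_effect = np.tile([1.2, 1.0, 1.0, 1.0, 1.0, 1.4, 1.5], 26)[:180]   # weekend spikes
installs = (1000 + 0.6 * spend["facebook_spend"] + 0.2 * spend["unity_spend"]) * dow_effect

model = UnobservedComponents(
    endog=installs,
    level="local linear trend",   # absorbs launch ramps, plateaus, local dips (bugs, COVID-style shocks)
    seasonal=7,                   # day-of-week seasonality; add freq_seasonal for yearly effects
    exog=spend,                   # the marketing factors whose incrementality we want
)
result = model.fit(disp=False)
print(result.summary())           # the exog coefficients are the modelled lift per unit of spend
```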

And there’s another big thing here. As marketers, we like to look at our marketing data in terms of cohorted KPIs, cohorted by install date. That all goes away with SKAdNetwork because, by design, it protects user privacy: you don’t know when the user installed.

In order to keep that type of analysis alive, when you look into econometric modelling, you need to train separate models per cohorted KPI. What I mean by that is

the naive approach is training one model with all those factors to predict installs, and then, whatever that model says the install incrementality is, you allocate the revenue and all the downstream events proportionally. That is not a good way to do it, because some media partners, campaigns, or countries may be giving you a lot of incremental install lift, but give you almost no revenue or other downstream events. So you want to look at training separate models for each individual event and cohort day.

What I mean by that is there’s one model for installs. There’s a separate model for revenue day 0; a completely separate model for revenue day 1. While I’m calling these separate models, there are some techniques involved in order to share information between those models, allowing those models to talk to each other that makes the entire process much more efficient. But I think those are the basics in order to ensure that you’re able to measure at any granularity, but continue also measuring your KPIs cohorted by install date.
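Continuing the sketch from above (and reusing its synthetic spend, installs, idx and rng), the per-KPI setup Brian describes might look like the loop below: one structural model per cohorted KPI, so a channel that lifts installs but not day-7 revenue is measured as such. The information-sharing between models that he mentions is not shown here.

```python
# One model per cohorted KPI: installs, revenue D0, revenue D7, ...
# (the revenue columns are synthetic stand-ins for your own cohorted event data)
kpis = pd.DataFrame({
    "installs": installs,
    "revenue_d0": 0.05 * installs * rng.uniform(0.8, 1.2, 180),
    "revenue_d7": 0.12 * installs * rng.uniform(0.8, 1.2, 180),
}, index=idx)

fitted = {}
for kpi in kpis.columns:
    m = UnobservedComponents(kpis[kpi], level="local linear trend", seasonal=7, exog=spend)
    fitted[kpi] = m.fit(disp=False)

# Each model carries its own spend coefficients, i.e. its own incrementality view.
for kpi, res in fitted.items():
    print(kpi)
    print(res.params.round(3))   # the beta.* terms are the per-channel spend coefficients
```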

Shamanth: Sure, so you’re identifying potential variables and you’re building daily models, based on the system that you’ve described, which was based on Bayesian structural time series models. Do these account for variables that were not a part of the original model? Which is to say, for instance, COVID happens. COVID would not have been a part of anyone’s modelling exercises, so how does the model account for the pandemic?

Brian: That’s a great question. So I mentioned trends before; that’s absolutely a component that you need in your model, because there are sometimes overall trends.

Anthony, you talked about a soft launch into a full launch; if the app is doing well, of course, there’s this overall upward trend, until you hit some sort of peak and then it probably plateaus, when you’re talking about revenue, installs, all those things, right? So you want trends to be a component in your structural time series model.

That said, there’s also something else you can do, which is adding something called a local linear trend. And that’s what we’re doing, where it accounts for local trends. Rather than these long term trends, you might see that across all your traffic, for a specific title, there was a big decrease in value, incremental value generated, whether that’s lift in terms of installs, or revenue or retention, or whatever it is. And interestingly, during this pandemic, there was actually, for many games, an increase in lift. But these local linear trends, even for the unexpected event, can see that locally in a short time period, there was actually across all your traffic, this big spike or this big decrease, and it’s able to account for that within the model.

Shamanth: Gotcha, that certainly makes sense. And again, I think it’s worth remembering that this isn’t meant to be a precise model. By the nature of itself, it’s imprecise because it is attempting to model an imprecise world. And that certainly makes sense. 

You did speak about running experiments in this paradigm. Those could be interrupted time series experiments, but those could be independent of it. What would examples of such experiments be that aren’t interrupted time series experiments?

Brian: Yeah, good question. So interrupted time series is pretty basic;

some marketing intervention happens at some specific time period, and you measure the actual observed impact of that intervention. And you measure the expected impact if the intervention had never happened; we call that the counterfactual. That could be based on a control group: you pause German traffic, but you don’t do anything to Italian traffic; Italian traffic would be the control group. 

Or it could be synthetic, where you use something called causal inference to synthetically build a counterfactual, or a baseline, in lieu of a real control group. And you simply compare the actual observed effect against the expected effect without the intervention, the baseline. And that’s your incrementality right there. 
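A minimal sketch of that German/Italian setup: learn how the interrupted geo normally tracks the control geo in the pre-period, project that relationship through the intervention window as the counterfactual, and take the gap as incrementality. Real implementations typically lean on causal-inference tooling (the CausalImpact family, for instance); this toy regression only illustrates the idea.

```python
import numpy as np
from sklearn.linear_model import LinearRegression


def its_incrementality(treated, control, intervention_day):
    """Interrupted time series with a control geo.

    treated: daily KPI (e.g. installs) for the geo whose traffic was paused (Germany)
    control: the same KPI for an untouched geo (Italy)
    intervention_day: index of the first day of the pause
    """
    treated, control = np.asarray(treated, float), np.asarray(control, float)
    pre_X = control[:intervention_day].reshape(-1, 1)
    pre_y = treated[:intervention_day]

    # Learn how the treated geo normally tracks the control geo...
    model = LinearRegression().fit(pre_X, pre_y)

    # ...then project that relationship through the intervention window to get the
    # counterfactual: what Germany would have done had nothing changed.
    post_X = control[intervention_day:].reshape(-1, 1)
    counterfactual = model.predict(post_X)
    observed = treated[intervention_day:]

    # Negative for a pause: the drop is the value the paused traffic was contributing.
    return (observed - counterfactual).sum()
```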

So they’re pretty straightforward. There’s some data science techniques that you need, especially if there’s no reasonable control group. But outside of that, for future-proof experiment designs, we researched quite a few. There’s not many in terms of experiment designs that don’t require IDFA. The best that we have right now is definitely incrementality testing. I talked about some of the potential foibles and flaws in that sort of experimentation, but I personally still believe it’s probably the gold standard for measuring incrementality. The problem is, it’ll be gone in a few months. So without IDFAs, I personally believe that interrupted time series experiments is really one of the only opportunities left to truly test your incrementality.

Shamanth: And Anthony, some of these approaches via econometric modelling, or even interrupted time series: all of these involve data science skills and approaches that are very different from what would have happened in the past. How easy would it be for UA teams and marketing teams to get buy-in on approaches like this from internal teams going forward?

Anthony: So I think that there’s actually two sets of buy-ins you need: 

  • One is, as you say, from internal teams. One of the things that I was doing at Big Fish was actually just to have lots of conversations with different partners. And just to get different points of view, then again have people on our internal data science team, be part of those discussions. And again, there’s always some backwards and forwards, we had really good, experienced data scientists, who were able to have conversations, for example, with people on Brian’s team, and scientist to scientist kind of go head-to-head and get into the models of how they work. And so that’s one part of it.

    And actually, you want a partner who is willing to kind of open up the kimono a little bit and share some of how their secret sauce works. Your point’s very valid; you do need your own internal data science team to have confidence in the partners you’re working with. It can’t be just some blackbox, secret sauce stuff, where people just say: “Hey, it works, trust us!” You have to have that additional level of comfort there. So that’s some of the internal team stuff: you have to get them on board and bring them along and make them part of the process. 
  • I also think there’s a really important set of executive conversations you have to have. The team has to realise that this is a major shift that’s happening in UA. UA performance and efficacy are going to be different from the past. 

Again, because none of us have a true crystal ball that can predict with accuracy what is going to happen in the future, it’s important that we experiment, that we try different things, and there’ll be some cost to that. It’s going to mean working with a couple of different partners; some of them will work out, some won’t. And that will mean wasted time and money. But that is an expectation that is really important to set, because no one knows in advance what is going to be the optimal way of doing things.

Brian: I really liked your points about not trusting a partner if they’re not willing to open up the black box. Because traditionally, the interesting thing is, we all had this really easy methodology of last touch. It’s so simplistic; there’s no black box. 

When all of a sudden, LAT came out and there’s this probabilistic modelling done by the MMPs, which they’ve invested a lot of time and money into, at this point, to do fingerprinting, basically. And that’s a complete black box for upwards of like 30% of your users potentially. And it’s still kind of hiding behind the guise of this simplicity of last touch attribution; nobody really called them out on it. 

So I think it’s critical moving forward, there is no guise or anything to hide behind. Now, we all understand that with last touch mostly gone, except for this really limited SKAdNetwork signal, there is no simplicity, and no even feigning of simplicity. So we understand this complex problem. Any partner you work with has to be willing to completely open up the models to your teams; down to model decompositions, error rates. There should be no reason for them to hide behind some sort of blackbox approach, because otherwise they could literally just be making up the incrementality measurement results.

Shamanth: Yeah, certainly. Definitely, I think it’s important to know and understand what the statistical techniques they’re using will be. And is there a certain level of scale at which these approaches become meaningful? For instance, if there’s a small indie developer that’s on one channel, has $300 a day—or even like $100 a day—on one campaign, one ad set, one creative on Facebook? Do you think they should even bother about any of these things? Or is there a certain level of scale at which these approaches of econometric modelling, or really understanding incrementality, and all of this becomes meaningful?

Anthony: I think if you are doing things with one channel and a limited budget, the sort of conversation we’re having now, in my view, is a lot less applicable. And the reason for that is that a lot of the decisions you have to make as a UA leader are about how to allocate a large budget across multiple channels. And if you’re just sticking to one channel, let’s say it’s Facebook or Google, one of the standard ones, you don’t need to worry about that allocation decision. And also, assuming that that channel is a self-attributing network, like Facebook is, then there’s very little validation data. You have the dashboards there; that’s where you live. You probably don’t even need an MMP. You’re buying Facebook, you’re living in Facebook, that data is the data, and that’s kind of it. So you don’t need to worry about data science and all the other stuff that we’ve been talking about, is my view. 

Shamanth: Yeah, yeah.

Brian: Yeah, I actually completely agree with your view, Anthony. The interesting thing is, I think it all boils down to incrementality. What is incrementality? It becomes important because of two things. 

  • Number one, you have significant organic demand. And now you’re worried about possible cannibalization any time you spend on media. 
  • The other is if you have a wide spread of media—we all know most of these media partners these days have a very, very high overlapping audience almost 100% for the most part. Because of that overlap in audience, when you spend with another partner, another campaign, in another geo, on another publisher app even, for the SDK networks, you’re not guaranteed incrementality even over and above the rest of your paid marketing. Let alone your organic demand. 

When you are a new app—which is probably the only type of title that would be out there buying on a single channel—then more than likely you don’t have significant organic demand you have to worry about.

And you’re not worried about the audience overlap in terms of incrementality, either. So I completely agree, this last touch signal—even though it’s very, very limited, in terms of SKAdNetwork—is probably going to be enough for you until you do extend out, or you gain some traction in order for your organic demand to be significant enough that incrementality actually matters.

Shamanth: Yeah, that certainly makes sense, especially if you are a small indie developer with limited budgets. For a larger developer that has more substantial budgets and is looking to actively explore some of these solutions, how should they start thinking about it? What skill sets or statistical models should they look for? Or say somebody is the head of marketing or head of UA, and they’re like: “I want to educate myself on this. Either I can actually build the models myself, or I can have an informed conversation with somebody that is going to be building these models.” How should they go about it?

Anthony: Yeah, I’ll speak to it just because it was a process that I went through at Big Fish. I think that there’s a number of different sources of information to seek out. 

  • One, I had a lot of conversations with our partners at Facebook, at Apple, at Google, and got their input into what was going on. 
  • Second of all, there’s actually a lot of really good sources. I mean, RocketShip HQ is a really good source of information. MobileDevMemo is another one. There are really good articles, good podcasts, just go out and listen to those and educate yourself about different points of view, different opinions. 

One of the things that I did through that—especially listening to different podcasts—was identifying people that I actually thought had some good points of view. I would then go to LinkedIn, find the company, find the people, and contact those people directly. 

That may well be how I found you, Brian: there was a lot of good LinkedIn stuff that you guys were doing, and I was like: “Man, this guy knows what he’s talking about; I want to connect with you.” So I probably talked to at least five or six different partners through that kind of podcast-and-LinkedIn, direct-contact, getting-in-touch approach. 

So those are some of the things that I did. I think just being broadly educated, and then probably getting five, six conversations with different partners: What’s your model; what is your team? Are you willing to open up the kimono and show us what you’re doing? So that was the approach that I took.

Brian: I completely agree with everything you just said. It’s so ideal to actually explore what people are saying out there. LinkedIn is a great resource. 

I’ll answer this question in a slightly different way. Anthony, you mentioned something earlier in the conversation about how companies would potentially go about adopting something like this, or at least exploring it. And there’s a couple of hurdles; you mentioned one of them is at the executive level within that internal decision-making paradigm. I think that probably that is critical, even in the context of what we’re talking about right here. In terms of getting information and understanding what these various options are, and how you would implement them in-house, or how you would find a partner, if you wanted to go that route. 

The first step may be actually to understand what is the benefit of a solution or similar techniques to what we’re talking about right now. The ideal way, at least that I’ve seen, is for there to be some buy-in internally, often at the executive level, and then sort of a confirmation from the data science team; very similar to what you discussed earlier, Anthony. 

So I think that the key here is understanding the drawbacks of last touch and the additional limitations of a signal like SKAdNetwork that’s just so critically hampered by its design of protecting user privacy. And then understand what is incrementality to your marketing organisation. If you’re just operating on Facebook; there’s not a lot of organic demand, probably there’s not a lot for you. 

But if incrementality is potentially critical to your organisation, the main thing we hear is: “Well, yeah, but last touch… at least it’s understandable.” That’s true. But what if you lost all of your KPIs? That’s the easiest way to think about it. The KPIs that you work with right now: what if they were all gone and replaced by just a couple of approximations, which is usually what you can get out of that 64 IDs of the conversion value, and not understand when users installed anymore as well, so cohorting by install date is also gone. 

Is it better to not only understand your incrementality—rather than just understanding whatever happened to get the last touch, which thus far in our research and actual live experimentation is very, very random for the most part. It definitely differs depending on how your marketing mix is structured; how your title performs. But in general, there’s a large, large discrepancy—above 50% in some cases—as far as deviation goes, when you compare our incrementality-type of signal or some incrementality signal, with last touch reported by the MMPs right now. So far in our experience, it can get to around a 30% deviation, meaning it’s 70% accurate in determining or measuring incrementality. But honestly, that’s just luck, for the most part, because it’s very far off in most cases. 

The other thing is: are you advertising on self-attributing networks right now? This particular area is a bit more short term, because when SKAdNetwork is the only solution, the sort of self-attributing paradigm we have right now goes away. Thank God, because it’s a completely ridiculous way of measuring. But right now we’re seeing, based on incrementality alone, that the self-attributing networks are far over-attributed in terms of last touch. Facebook and Google especially, because they’re the biggest players with the biggest volume, so that just makes sense. 

But this was one of our early hypotheses: if you’re running on Facebook and Google significantly, you may want to take a look at incrementality sooner rather than later. Simply because the installs being attributed and revenue being attributed to those self-attributing networks right now, is probably far overestimating what the actual incremental value is. 

I will say that that also differs by the app, by the app developer; a lot of it depends on exactly how you’re using Facebook and Google, and how many other marketing partners you’re running with. But for sure, we’re seeing about an average of 50% decrease overall. That’s obviously quite significant.

Shamanth: Certainly. And I think the first step to moving forward is to understand what the challenges of the current paradigm are, and how much money people might be leaving on the table, if they don’t really look at what the incremental lift through their marketing efforts is. 

This has certainly been incredibly instructive, especially because this is a whole new set of concepts that we all need to wrap our heads around to really progress in a world where all of the KPIs we have now just won’t exist. So thank you for taking the time to be on The Mobile User Acquisition Show. 

Brian, Anthony, how can folks find out more about you guys and what you guys do?

Brian: So yeah, MetricWorks is the website. There’s a blog page on that website, where we post a lot of information about these topics, about other topics. But for sure, we go into pretty great technical detail; very practical information there as well. And you can certainly find me on LinkedIn, Brian Krebs, those are the primary ways to contact us.

Anthony: I’ve really enjoyed the conversation. I always like talking to Brian as well. So for me, I’m on LinkedIn: Anthony Cross. And I’ve really developed a passion around UA; definitely got a passion for gaming; and then all the stuff happening with iOS 14. It’s fascinating; it’s a really interesting space to be.

Shamanth: Thank you for taking the time to talk to us today.

Brian: Thank you. It’s been a pleasure. This was great. This is probably the deepest dive yet that I’ve seen into these topics. So definitely excited to see the end result. But Anthony, always a pleasure, man. Really appreciate you coming on board with this.

Anthony: I welcome the opportunity. Thank you.

A REQUEST BEFORE YOU GO

I have a very important favor to ask, which as those of you who know me know I don’t do often. If you get any pleasure or inspiration from this episode, could you PLEASE leave a review on your favorite podcasting platform – be it iTunes, Overcast, Spotify or wherever you get your podcast fix. This podcast is very much a labor of love – and each episode takes many many hours to put together. When you write a review, it will not only be a great deal of encouragement to us, but it will also support getting the word out about the Mobile User Acquisition Show.

Constructive criticism and suggestions for improvement are welcome, whether on podcasting platforms – or by email to shamanth at rocketshiphq.com. We read all reviews & I want to make this podcast better.

Thank you – and I look forward to seeing you with the next episode!

WANT TO SCALE PROFITABLY IN A GENERATIVE AI WORLD?

Get our free newsletter. The Mobile User Acquisition Show is a show by practitioners, for practitioners, featuring insights from the bleeding-edge of growth. Our guests are some of the smartest folks we know that are on the hardest problems in growth.