Marketing Mix Modeling Explained – With R


Marketing mix modeling (MMM) is a process used to quantify the effects of different advertising mediums, i.e. media, on sales. It is also used to optimize how the budget is spent across these mediums. The popular method of choice is multiple regression analysis. The model also takes into account other variables such as pricing, distribution points, and competitor tactics. This article will explain the mathematics behind MMM by starting with a simple model and then adding complexities. I’ll also incorporate R code so you can immediately reproduce the results.

Start Simple:
Let’s assume there is only one advertising variable that affects sales. This simple model is usually defined as:

Sales = Base + b·Advertising

There are two aspects to this model: (1) It is linear and (2) The Base is a constant. This is OK for now as we’ll add more complexity later. However, I can quickly tell you that Base can include other variables to make it non-constant. The non-linearity part will be introduced in a future blog post.

A sample R code can be:

sales <- c(37, 89, 82, 58, 110, 77, 103, 78, 95, 106, 98, 96, 68, 96, 157, 198, 145, 132, 96, 135)
ad <- c(6, 27, 0, 0, 20, 0, 20, 0, 0, 18, 9, 0, 0, 0, 13, 25, 0, 15, 0, 0)

modFit.0 <- lm(sales~ad)
summary(modFit.0)

This model has an R² of 0.184 so there is much work to be done.

Complexity 1: The Adstock Case
The model above assumes that advertising in week t will only affect sales in that same week. This is wrong and will cause the advertising effect to be undervalued. Simply put, past ads can (and usually do) affect present and future sales. This carryover effect of advertising can be controlled for with the adstock transformation, which I covered in a previous blog post.

Our model now becomes:

Sales = Base + b·f(Advertising|α)

where f() is the adstock transformation function for the Advertising variable given an adstock rate of α. Other functional forms besides adstock can be incorporated here as well. Also notice that the order of observations matters for adstocking to take place.

With an adstock rate of 50% the R code is:

sales <- c(37, 89, 82, 58, 110, 77, 103, 78, 95, 106, 98, 96, 68, 96, 157, 198, 145, 132, 96, 135)
ad <- c(6, 27, 0, 0, 20, 0, 20, 0, 0, 18, 9, 0, 0, 0, 13, 25, 0, 15, 0, 0)

# stats::filter is called explicitly because dplyr::filter masks it when dplyr is loaded
ad.adstock <- as.numeric(stats::filter(x=ad, filter=.50, method="recursive"))

modFit.1 <- lm(sales~ad.adstock)
summary(modFit.1)

Notice how we improved R² from 0.184 to 0.252.
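
To make the transformation concrete, here is a minimal sketch of what the recursive filter computes: each week's adstock is that week's advertising plus the adstock rate times the previous week's adstock. The helper name adstock_loop is hypothetical (it does not appear in the code above) and exists only to compare an explicit loop against stats::filter.

```r
# Hypothetical helper: adstock written as an explicit loop.
# adstock[t] = ad[t] + rate * adstock[t - 1], starting from zero carryover.
adstock_loop <- function(x, rate) {
  out <- numeric(length(x))
  carry <- 0
  for (t in seq_along(x)) {
    out[t] <- x[t] + rate * carry
    carry <- out[t]
  }
  out
}

ad <- c(6, 27, 0, 0, 20, 0, 20, 0, 0, 18, 9, 0, 0, 0, 13, 25, 0, 15, 0, 0)

manual  <- adstock_loop(ad, 0.5)
builtin <- as.numeric(stats::filter(x = ad, filter = 0.5, method = "recursive"))

# Both approaches should give the same series
all.equal(manual, builtin)

# First four weeks: 6, 30 (= 27 + 0.5 * 6), 15, 7.5
head(manual, 4)
```

The weeks with zero advertising still carry a decaying share of earlier advertising, which is why the order of observations matters.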

Complexity 2: More Advertising Variables
It should be clear by now that I have been using advertising mediums and advertising variables interchangeably. From a modeling perspective, Advertising can be a paid media channel like TV, radio or banner ads, a non-paid media variable like social impressions or word-of-mouth, or a marketing campaign. When adding more variables, however, their units of measure need not be the same. Many measures can be used, including TRPs, GRPs, impressions or spend; I listed them in order of preference when available. Regardless of the unit of measure, in a statistical model they are all called advertising variables and our model formulation becomes:

Sales = Base + ∑_{i=1}^{n} b_i·f(Advertising_i|α_i)

where n is the number of advertising variables and f() is the adstock transformation function for Advertising_i with an adstock of α_i, i.e. each advertising variable has its own alpha rate.

The R code for two advertising variables with adstock rates of 30% is:

sales <- c(37, 89, 82, 58, 110, 77, 103, 78, 95, 106, 98, 96, 68, 96, 157, 198, 145, 132, 96, 135)
ad1 <- c(6, 27, 0, 0, 20, 0, 20, 0, 0, 18, 9, 0, 0, 0, 13, 25, 0, 15, 0, 0)
ad2 <- c(3, 0, 4, 0, 5, 0, 0, 0, 8, 0, 0, 5, 0, 11, 16, 11, 5, 0, 0, 15)

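# stats::filter is called explicitly because dplyr::filter masks it when dplyr is loaded
ad1.adstock <- as.numeric(stats::filter(x=ad1, filter=.3, method="recursive"))
ad2.adstock <- as.numeric(stats::filter(x=ad2, filter=.3, method="recursive"))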

modFit.2 <- lm(sales~ad1.adstock+ad2.adstock)
summary(modFit.2)

Now, our model is even stronger with R² of 0.769.

Complexity 3: Changing Base & Other Variables
So far we assumed the Base to be a constant, i.e. an intercept. I often get asked the question of how to make the Base non-constant. The simple answer is Base includes more than just the intercept. If you notice an increasing trend in Sales then part of modeling is to create a trend variable. This trend variable gets added to the base. Seasonal variables also sometimes get added to the Base. Finally, there is the idea of distribution points.

Distribution points account for the number of outlets (stores or online) where the product in question is sold. If a retailer, for example, doubles its number of stores, then we would expect its sales to increase not because of marketing but simply because more stores are available. Marketing plays a role, of course, but I think you get the point.

Finally, pricing & promotions are of prime importance. They too are variables to add to the model. However, these variables aren’t part of the base. Due to their complexity, I’ll leave their discussion to a future blog post.

Hence, our current model is now of the form:

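Sales = a₀ + a₁·Trend + a₂·Distribution + ∑_{i=1}^{n} b_i·f(Advertising_i|α_i)

The R code below adds a linear trend. The sample data has no distribution variable, so the a₂·Distribution term is left out: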

sales <- c(37, 89, 82, 58, 110, 77, 103, 78, 95, 106, 98, 96, 68, 96, 157, 198, 145, 132, 96, 135)
ad1 <- c(6, 27, 0, 0, 20, 0, 20, 0, 0, 18, 9, 0, 0, 0, 13, 25, 0, 15, 0, 0)
ad2 <- c(3, 0, 4, 0, 5, 0, 0, 0, 8, 0, 0, 5, 0, 11, 16, 11, 5, 0, 0, 15)
trend <- 1:20

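# stats::filter is called explicitly because dplyr::filter masks it when dplyr is loaded
ad1.adstock <- as.numeric(stats::filter(x=ad1, filter=.3, method="recursive"))
ad2.adstock <- as.numeric(stats::filter(x=ad2, filter=.3, method="recursive"))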

modFit.3 <- lm(sales~trend+ad1.adstock+ad2.adstock)
summary(modFit.3)

Our final model’s R² is 0.940.

Business Implications & Contributions
Aside from the statistical fit of our model, clients always ask about the business implications. This is usually referred to as sales lift or uplift due to marketing, a.k.a. the contribution. The contribution in our model is the product of the adstocked advertising variable and its coefficient:

Contribution_i = b_i·f(Advertising_i|α_i)
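
As a minimal sketch of this formula, the code below refits the trend model from the previous section and splits the fitted sales into a base (intercept plus trend) and one contribution per advertising variable. The object names fit and contrib are illustrative choices, and treating intercept plus trend as the base follows the description of the Base given earlier.

```r
sales <- c(37, 89, 82, 58, 110, 77, 103, 78, 95, 106, 98, 96, 68, 96, 157, 198, 145, 132, 96, 135)
ad1 <- c(6, 27, 0, 0, 20, 0, 20, 0, 0, 18, 9, 0, 0, 0, 13, 25, 0, 15, 0, 0)
ad2 <- c(3, 0, 4, 0, 5, 0, 0, 0, 8, 0, 0, 5, 0, 11, 16, 11, 5, 0, 0, 15)
trend <- 1:20

ad1.adstock <- as.numeric(stats::filter(x = ad1, filter = 0.3, method = "recursive"))
ad2.adstock <- as.numeric(stats::filter(x = ad2, filter = 0.3, method = "recursive"))

fit <- lm(sales ~ trend + ad1.adstock + ad2.adstock)
b <- coef(fit)

# Weekly decomposition: base = intercept + trend, contribution_i = b_i * adstocked ad_i
contrib <- data.frame(
  base = unname(b["(Intercept)"]) + unname(b["trend"]) * trend,
  ad1  = unname(b["ad1.adstock"]) * ad1.adstock,
  ad2  = unname(b["ad2.adstock"]) * ad2.adstock
)

# The pieces add back up to the model's fitted sales
all.equal(unname(rowSums(contrib)), unname(fitted(fit)))

# Share of total fitted sales attributed to each component, in percent
round(100 * colSums(contrib) / sum(contrib), 1)
```

The columns of contrib are the layers of the stacked chart commonly shown in MMM presentations: base at the bottom, then each advertising contribution on top.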

Final Remarks & a Challenge:
You can see now that Marketing Mix Modeling is a business term for regression analysis on transformed variables. Any decent data scientist or statistician can do the job. However, it is important to note that the mix in Marketing Mix refers to the different mediums, media, campaigns, or variables and their effects on sales. This is in contrast to mixed effects models, which measure the effect of one variable on many different levels, with DMA-level modeling as an example. Mixed effects models can be used instead of multiple regression analysis when dealing with multiple geographies, like DMAs, but the word “mixed” refers to different things in the two terms, which I thought was worth calling out.

The challenge that faces all statistical analyses is data, as it is 80% of the work. While that can be taken care of by data personnel, there is still one challenge that escapes many: what adstock rate should each advertising variable get? This is harder than it sounds and it goes beyond basic statistics. Modelers don’t only have to worry about a particular adstock rate being statistically valid; they also have to choose among different adstock rates that yield different contributions, all of which are statistically valid as well. One reason for this is that the ultimate consumer of MMM results is a human. The model that makes the “most sense” – however that is defined – can trump the most accurate model. HBR has a good article about this problem. My recommendation for such scenarios is to track the model’s fit statistics at each decision point in the modeling process. The modeler or data scientist can then show the decision maker that choosing a higher contribution will make R² drop from 90% to 70%, and leave the final decision to the business users.
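
One way to track fit statistics against adstock choices is sketched below. It is only an illustration under stated assumptions: a plain grid search over rates from 0 to 0.9 is my choice for the example, not a claim about how rates should be selected, and the names adstock, grid and best are hypothetical. For every pair of rates it records R² and the total contribution of each advertising variable, so the trade-off can be shown to a decision maker.

```r
sales <- c(37, 89, 82, 58, 110, 77, 103, 78, 95, 106, 98, 96, 68, 96, 157, 198, 145, 132, 96, 135)
ad1 <- c(6, 27, 0, 0, 20, 0, 20, 0, 0, 18, 9, 0, 0, 0, 13, 25, 0, 15, 0, 0)
ad2 <- c(3, 0, 4, 0, 5, 0, 0, 0, 8, 0, 0, 5, 0, 11, 16, 11, 5, 0, 0, 15)
trend <- 1:20

# Hypothetical wrapper around the recursive filter used throughout the post
adstock <- function(x, rate) {
  as.numeric(stats::filter(x = x, filter = rate, method = "recursive"))
}

# Assumed search grid: every combination of rates 0, 0.1, ..., 0.9
grid <- expand.grid(rate1 = seq(0, 0.9, by = 0.1),
                    rate2 = seq(0, 0.9, by = 0.1))
grid$r2 <- NA_real_
grid$contrib1 <- NA_real_
grid$contrib2 <- NA_real_

for (k in seq_len(nrow(grid))) {
  a1 <- adstock(ad1, grid$rate1[k])
  a2 <- adstock(ad2, grid$rate2[k])
  fit <- lm(sales ~ trend + a1 + a2)
  grid$r2[k] <- summary(fit)$r.squared
  # Total contribution over all weeks: sum of b_i * adstocked advertising
  grid$contrib1[k] <- sum(coef(fit)["a1"] * a1)
  grid$contrib2[k] <- sum(coef(fit)["a2"] * a2)
}

# Candidate models ranked by fit, with the contribution each one implies
best <- grid[order(-grid$r2), ]
head(best)
```

Sorting by contrib1 or contrib2 instead of r2 shows how much fit is given up when a higher contribution is preferred.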

49 thoughts on “Marketing Mix Modeling Explained – With R”

Pingback: Advertising Adstock – Concept & Formula | Gabriel Mohanna's Blog
Tom Logan September 9, 2014 at 1:38 am

One of the best blogs ever… I will certainly promote your site; it is truly easy to read and very valuable. Thank you for taking time out of your busy calendar to enlighten us “commoners” about the world of marketing effectiveness.

Reply ↓
yuchen0908 December 8, 2014 at 11:31 pm

Hello Gabriel, I’m wondering how I can put a diminishing returns effect into this marketing mix model. Can you advise a little on how to take this effect into consideration? Cheers, Yu.

Reply ↓
Jeff January 28, 2015 at 4:20 pm

I echo others’ sentiments, great stuff! I have considered but never actually applied market mix modeling. I was curious about 2 things, if you don’t mind:

1) How do you deal with the situation where you have multiple stores / branches / geographic units? Do you typically combine all the data and leave the store variable as a factor (fixed effect)?
2) Have you considered that if you are going to settle on a linear model, it might be better to use ARIMAX or GLS to take into account the correlated errors?

thanks!

Reply ↓
moku February 2, 2015 at 9:04 am

Hey Gabriel,

I’ve read through your blog and it has been very helpful in explaining the Marketing Mix Modeling process.

I have three years of MMM data and it is structured monthly instead of weekly. At first I thought this was OK, but after reading about the adstock transformation I’m not sure it would work as well for monthly data as for weekly data. What is your opinion?

Reply ↓
1. AnalyticsArtist Post author February 2, 2015 at 5:28 pm
  
  Adstock is not necessarily a weekly idea. It’ll work for monthly data, but you’ll have a lower adstock value with monthly data than you would with weekly data.
  
  Reply ↓
moku February 2, 2015 at 9:09 am

I notice in your MMM example you create your trend by simply doing “trend <- 1:20” in R. Is that the best way to create a trend? I was wondering about using stl() decomposition in R and then using the decomposed trend variable in the model. What are your thoughts about doing something like that? Should a trend variable only increase in increments of 1 like your blog suggests, or should the trend fit the sales data more like a linear model would? My thought is that if you used the decomposed trend variable you should just fix the coefficient to 1?

Reply ↓
1. AnalyticsArtist Post author February 2, 2015 at 5:48 pm
  There are four components in time-series analysis:
  - Level
  - Trend
  - Seasonal
  - Irregular
  Not every component is always present; some data exhibit only one or two of them. The Australian Bureau of Statistics has a good review of this: The Basics of Time Series Analysis.
  
  The example I showed was based on simulated data that included just a simple trend, hence the variable I created was coded as a linear trend. In general, however, the trend is always linear unless you detect a trend reversal. stl() decomposition will show you the other components if they exist.
  Reply ↓
moku February 2, 2015 at 9:12 am

Hey Gabriel,

I was wondering if you could explain more about the non-constant base. Of course the intercept in a linear model is the base sales when the other predictors are zero but how do you factor in the trend to get the non-constant base? How would you plot that in the typical stacked line chart you see used in MMM presentations? How do you calculate the incremental % for your predictors/media mix variables? Are you then able to extrapolate those calculations to ROI calculations?

Reply ↓
Pingback: Advertising Diminishing Returns & Saturation | Gabriel Mohanna's Blog
Tom Logan March 9, 2015 at 7:36 am

Hi Gabriel,

Would it be possible for you to write a blog post on how to apply linear or non-linear mixed models in a marketing context? That is, if you wish to forecast a store’s revenue based on the store’s advertising budget, but also take into account the various products’ revenue and their respective marketing budgets. How will this work in practice? Would you be able to forecast product revenue and store revenue at the same time?

Reply ↓
Devansh April 11, 2015 at 6:28 am

Hi Gabriel

I am looking for a way to iterate over different values for adstock and lag effects automatically, without having to change the values manually. Is there any package that can help me achieve this, or do I have to use a loop for it? Also, can you suggest any particular package in R that might be useful in helping me write the code for Market Mix Modelling?

Reply ↓
Brian September 6, 2015 at 6:35 pm

What a great blog! I have been interested in market mix models but never worked in an area where they were needed / possible. I am wondering how the data typically looks when you build one? Is there a single time series for an entire company / brand or instead are there multiple series combined in one data set (to represent stores or regions or other units based either on geography or business units)?

Reply ↓
1. AnalyticsArtist Post author February 11, 2016 at 1:25 pm
  
  Data usually comes from different sources. Sales data is from within the company’s warehouse. Advertising data is either managed by the marketing agency or available from the marketer within the company. Other data, like macroeconomic variables or Twitter or search trends, could be available from public sources.
  
  Reply ↓
D8Amonk February 10, 2016 at 11:21 am

Hi Gabriel,

Excellent post and very informative. I am trying to run the adstock transformation code you have here and on your Git but both are returning errors:

> ad.adstock <- as.numeric(filter(x=ad, filter=.50, method="recursive"))
Error in filter_(.data, .dots = lazyeval::lazy_dots(…)) :
argument ".data" is missing, with no default

Do you know why this would be the case?

Reply ↓
1. AnalyticsArtist Post author February 11, 2016 at 1:37 pm
  
  What version of R are you using? You can use the “version” command to find out. Paste it here.
  
  Reply ↓
  1. D8Amonk February 11, 2016 at 2:26 pm
    
    platform x86_64-w64-mingw32
    arch x86_64
    os mingw32
    system x86_64, mingw32
    status
    major 3
    minor 2.3
    year 2015
    month 12
    day 10
    svn rev 69752
    language R
    version.string R version 3.2.3 (2015-12-10)
    nickname Wooden Christmas-Tree
  2. D8Amonk March 9, 2016 at 1:39 pm
    
    Any ideas?
2. Ambar Nag November 3, 2016 at 11:27 pm
  
  Hi D8Amonk – the error is because you are using dplyr which also has a filter function. Try using stats::filter instead of filter.
  
  Reply ↓
Brian February 19, 2016 at 11:54 am

How very cool! I am wondering if you happen to know, once a model is built…then what? I see presentations where they decompose sales by “would have happened regardless of marketing”, direct mail, radio etc. How do you get that from the model?

Reply ↓
1. Tom Logan February 27, 2016 at 6:54 pm
  
  Since this is a regression model, you can decompose the output one term at a time. Example:
  sales = 2 + 0.5*ad
  
  and assuming that you have two values for ad = [2,4].
  If you are constructing a graph with a y axis and an x axis,
  your first line will be 2: go to the y axis, find the point 2 and draw a horizontal line, also called your baseline.
  
  Then your next line will be
  2 + 0.5*2 = 3
  2 + 0.5*4 = 4
  
  This second line sits on top of the baseline.
  
  So to summarize:
  The idea here is that if you have an equation:
  
  y = 2 + 2*x1 + 4*x2 + 6*x3
  
  Step 1: create the graph for the intercept, i.e. 2
  Step 2: create a graph for the intercept plus the first variable, i.e. 2 + 2*x1
  Step 3: create a third graph for the intercept plus the first and second variables, i.e. 2 + 2*x1 + 4*x2, and so on for x3
  
  Each graph is stacked on top of the previous one.
  
  Hope it helps
  
  Reply ↓
bubul April 11, 2016 at 1:08 am

Dear Gabriel. awesome post with such simplification.

I am new to market mix modeling. I am wondering

1. At present trying to plot effect of channel spend in total sales (contribution) over a period of time. How can I plot -> contribution of channel on sales. Mathematics: for Wk1: Sales = base + b1*ad1 + b2*ad2….

2. Optimize budget allocation across channels to get a pre-determined sales (Y)

Reply ↓
1. AnalyticsArtist Post author May 29, 2016 at 1:42 pm
  - Objective Function: Max b1*ad1 + b2*ad2 + …
  - Subject To:
    sum(ad1+ad2+…) = total budget
    Sales = base + b1*ad1 + b2*ad2 + …
  - Variable to Change: ad1, ad2 …
  Please note that ad1 and ad2 are vectors throughout time.
  I would love to have shown you this in R but I haven’t done that before.

Reply ↓

raptorly May 5, 2016 at 9:25 am

Gabe, thank you for this amazing blog! What is your take on adding autoregressive terms into the modeling equation for base? Are we missing any carryover component by not including those autoregressive terms? (e.g., lagged versions of sales as independent variables)

Reply ↓

AnalyticsArtist Post author May 29, 2016 at 1:24 pm

Please add them. Just note that the definition for the advertising coefficient will change.

Reply ↓

Mark Aitkin May 31, 2016 at 11:25 pm

thanks for this handy blog post!

this might be handy for you, or there’s probably a better way to do this, i’m still learning 🙂

# x =seq 1:nrows(data)… trend
# y1 = sales
# y2 = adstockadvertising
# ylab = “label”
# title = “main title”

superplot <- function(x, y1, y2, ylab, title){
  # left axis: sales as a red line
  plot(x, y1, type="l", col="red", ylab = ylab)
  # overlay media as blue bars on a second (right-hand) axis
  par(new=TRUE)
  plot(x, y2, type="h", col="blue", main = title, xaxt="n", yaxt="n", xlab= "", ylab="")
  axis(4)
  mtext("Media", side=4, line=3)
  # legend uses the ylab argument instead of a hard-coded series name
  legend("topleft", col=c("red","blue"), lty=1, legend=c(ylab, "Media"))
}

superplot(trend, sales, ad1, "sale", "ad1")

Reply ↓

Tim September 10, 2016 at 1:10 pm

Great post! Question though: in your vectors “ad”, “ad1”, and “ad2”, what are your units? Are you using dollars spent, GRP, TRP, or something else?

Reply ↓

AnalyticsArtist Post author September 10, 2016 at 1:12 pm

I didn’t specify because you can use any of them.

Reply ↓
1. Tim September 10, 2016 at 1:16 pm
  
  Ok, thanks for the response! One more question. For the trend, you used “1:20”. However, how would you decompose for seasonality, in addition to that linear trend? Do you have any examples of that? Thanks again!

Reply ↓

Puspa November 17, 2016 at 12:36 am

Hello, Mr.Gabriel.
I am a new reader of your blog. Thank you for your explanation of MMM. I want to ask: the advertising that you use in this explanation, is it GRP?
Thank you,
Best regards,

Reply ↓

AnalyticsArtist Post author November 17, 2016 at 12:53 pm

Yes. You can use GRPs, impressions, or in some rare instances where these measures aren’t available, spend.

Reply ↓

Mikkel Petersen November 18, 2016 at 4:06 am

Hi Gabriel,
First of all I want to say how much I appreciate this blog and I have been using it for inspiration for my work.

I am building a forecasting process to predict Baseline Demand and uplifts from Marketing/Promotional activities and have a few questions I hope you can help me with.

1. Do you recommend to model the effect of activities in separate models or in the same model and get coefficients from there?

2. Do you use Linear models or Non-linear models (I use promotional counts that shows 0 in periods with no promotions)

3. I sometimes experience that either my marketing or promotional coefficients become negative. Would you exclude negative coefficients in a model (and eventually conclude that they actually contribute negatively to sales?)

Reply ↓

AnalyticsArtist Post author November 18, 2016 at 1:53 pm

1. Do you recommend to model the effect of activities in separate models or in the same model and get coefficients from there?
Same model

2. Do you use Linear models or Non-linear models (I use promotional counts that shows 0 in periods with no promotions)
The Marketing Mix model is linear in its coefficients, but I use nonlinear optimization to derive the model coefficients as well as the transformation parameters. Using 0 for no promotions is the correct assumption.

3. I sometimes experience that either my marketing or promotional coefficients become negative. Would you exclude negative coefficients in a model (and eventually conclude that they actually contribute negatively to sales?)
Concluding that marketing contributes negatively to sales will not go over well with marketing. It really depends on how accurate your model is. A coefficient can switch signs if you don’t include all the other variables that ought to be included. Are you including macro variables? Competitor effects? Price?

Reply ↓
1. Mikkel Petersen November 21, 2016 at 8:53 am
  
  Thanks for being so responsive!
  I just have a follow up question for 2. and 3.
  
  2. What do you mean by using a non-linear optimization to derive the model coefficient when your MMM model is linear in coefficient?
  (re adstock transformations those I have handled separately)
  
  3. Yes, that’s a good point. Right now only internal marketing and promotion data are used. I suspect that using a multiple linear regression model with 51 weekly seasonal dummy variables somehow “cannibalizes” the effect of some of the explanatory variables, as some of them have a lot of activity around recurring seasonal events.
  Is there a way to handle this issue or share more of the seasonal impact with other variables?

Jim Tenny January 12, 2017 at 7:43 pm

This is great stuff. Why did you stop posting?! 🙂

One thing I was curious about: what happens when you have multiple markets, such as DMAs? Do you stack the data on top of each other? If so, how do you get decomposition by DMA? I don’t assume you consider the markets fixed effects and create lots of interaction variables, do you?

Reply ↓

Jim Tenny January 24, 2017 at 7:14 am

Anyone?

Reply ↓

Jim Tenny January 17, 2017 at 11:44 am

And as a follow-up, I wonder if we could use a random effects model (e.g. proc mixed) for this purpose? Can we still decompose the markets though (have different estimates per market and the different decomposition of the media effects – for optimization purposes)?

Reply ↓

florian1981 July 5, 2017 at 8:47 am

Hi,
thanks for your great post.
One question: what about the predictor “trend”? What does it stand for and why is it necessary?

Reply ↓

A September 5, 2017 at 6:31 am

Hello Gabriel,

may we expect sth more about non-linear baseline? 🙂

Reply ↓

AnalyticsArtist Post author September 5, 2017 at 1:46 pm

The base is a combination of multiple variables such as intercept, macro trends, seasonalities and a few other non-player variables. So it may look linear but in actuality it is linear per variable.

Reply ↓

MMM Analyst October 29, 2017 at 11:17 pm

Excellent article.

Do you have any thoughts on how to get the estimate of the base effect in a multiplicative (log-log) model specification? In other words, how to represent a multiplicative model via the stacked chart that usually accompanies any marketing mix model.

Thanks

Reply ↓

AnalyticsArtist Post author March 16, 2018 at 4:09 pm

I’ve seen a few but they don’t stand up to mathematical rigour. All of them are trying to represent a multiplicative model as additive values. That just doesn’t work.

Reply ↓

John Gagliano June 12, 2018 at 8:57 pm

Hi Gabriel,
In your view, how should one include pricing promotions in a Market Mix Model? There is a very high granularity (at the SKU level) when it comes to pricing promotions.

Thanks

Reply ↓

AnalyticsArtist Post author June 12, 2018 at 9:16 pm

You can do the modeling at SKU level. If you are looking for a product group modeling then price = sales/units for that group.

Reply ↓

Paulo June 9, 2020 at 8:46 am

I have seen many MMM using Bayesian DLM, because in dynamic models the parameters might change with time. How would it be possible using R?

Reply ↓

AnalyticsArtist Post author June 9, 2020 at 11:48 am

Try using RBugs.

Reply ↓

Pingback: Demystifying the Manipulation of Marketing Mix Modeling (MMM) Results – MMM Labs

Gabriel Mohanna's Blog

Analytics Artist
