Friday, 5 July 2013

RFM - A Precursor to Data Mining

RFM was initially utilized by marketers in the B-to-C space - specifically in industries like Cataloging, Insurance, Retail Banking, Telecommunications and others. There are a number of scoring approaches that can be used with RFM. We'll take a look at three:

RFM - Basic Ranking
RFM - Within Parent Cell Ranking
RFM - Weighted Cell Ranking

Each approach has experienced proponents that argue one over the other. The point is to start somewhere and experiment to find the one that works best for your company and your customer base. Let's look at a few examples.

RFM - Basic Ranking

This approach involves scoring customers based on each RFM factor separately. It begins with sorting your customers based on Recency, i.e., the number of days or months since their last purchase. Once sorted in ascending order (most recent purchasers at the top), the customers are then split into quintiles, or five equal groups. The customers in the top quintile represent the 20% of your customers that most recently purchased from you, and they receive a Recency score of 5; the bottom quintile receives a score of 1.

This process is then undertaken for Frequency and Monetary as well. Each customer is in one of the five cells for R, F, and M.

Experience tells us that the best prospects for an upcoming campaign are those customers that are in Quintile 5 for each factor - those customers that have purchased most recently, most frequently and have spent the most money. In fact, a common approach to creating an aggregated score is to concatenate the individual RFM scores together resulting in 125 cells (5x5x5).

A customer's score can range from 555 being the highest, to 111 being the lowest.
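The basic ranking steps can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names, the ten sample customers and their values are all hypothetical, and ties are broken by sort order, so customers with identical values can land in adjacent quintiles.

```python
def quintile_scores(values, higher_is_better=True):
    """Score each value from 1 (worst fifth) to 5 (best fifth)."""
    n = len(values)
    # Order customer positions from worst to best for this factor.
    order = sorted(range(n), key=lambda i: values[i],
                   reverse=not higher_is_better)
    scores = [0] * n
    for rank, idx in enumerate(order):
        scores[idx] = rank * 5 // n + 1  # five equal-sized groups, 1..5
    return scores


def rfm_cells(ids, recency_days, frequency, monetary):
    """Concatenate the three independent quintile scores into a cell code."""
    r = quintile_scores(recency_days, higher_is_better=False)  # fewer days is better
    f = quintile_scores(frequency)
    m = quintile_scores(monetary)
    return {cid: f"{r[i]}{f[i]}{m[i]}" for i, cid in enumerate(ids)}


# Hypothetical sample data (not from the article).
ids = ["c01", "c02", "c03", "c04", "c05", "c06", "c07", "c08", "c09", "c10"]
recency_days = [3, 200, 20, 35, 50, 70, 90, 120, 10, 365]
frequency = [20, 2, 12, 10, 8, 6, 5, 3, 15, 1]
monetary = [2000.0, 1500.0, 1200.0, 900.0, 700.0, 500.0, 400.0, 250.0, 120.0, 40.0]

cells = rfm_cells(ids, recency_days, frequency, monetary)
for cid in ids:
    print(cid, cells[cid])
```

In this sample, customer c01 is the best on all three factors and lands in cell 555, while c02 bought long ago and rarely but spent a lot, so it lands in cell 115.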

RFM - Within Parent Cell Ranking

This approach is advocated by Arthur Middleton Hughes - one of the biggest proponents of RFM analysis. It begins like the one above, i.e., all customers are initially grouped into 5 cells based on Recency. The next step takes the customers in a given Recency cell - say cell number 5 - and ranks them by Frequency into five sub-cells (51 through 55). Then the customers in each RF cell, such as 55, are ranked by Monetary value in the same way (551 through 555).
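The nested sorting can be sketched as follows. This is only an illustration of the idea as described here, not Hughes's own procedure in detail: the helper names and the randomly generated customer file are assumptions made for the example.

```python
import random
from collections import Counter


def split_into_fifths(items, key, higher_is_better=True):
    """Group items into five equal-sized bins; 1 = worst fifth, 5 = best fifth."""
    ordered = sorted(items, key=key, reverse=not higher_is_better)
    n = len(ordered)
    groups = {score: [] for score in range(1, 6)}
    for rank, item in enumerate(ordered):
        groups[rank * 5 // n + 1].append(item)
    return groups


def nested_rfm(customers):
    """Rank by Recency, then Frequency within each R cell, then Monetary within each RF cell."""
    cells = {}
    by_r = split_into_fifths(customers, key=lambda c: c["recency_days"],
                             higher_is_better=False)  # fewer days is better
    for r, r_group in by_r.items():
        by_f = split_into_fifths(r_group, key=lambda c: c["frequency"])
        for f, f_group in by_f.items():
            by_m = split_into_fifths(f_group, key=lambda c: c["monetary"])
            for m, m_group in by_m.items():
                for c in m_group:
                    cells[c["id"]] = f"{r}{f}{m}"
    return cells


# Hypothetical customer file of 250 records, generated with a fixed seed.
rng = random.Random(42)
customers = [
    {"id": i,
     "recency_days": rng.randint(1, 720),
     "frequency": rng.randint(1, 30),
     "monetary": round(rng.uniform(10, 2000), 2)}
    for i in range(250)
]

cells = nested_rfm(customers)
cell_sizes = Counter(cells.values())
print(len(cell_sizes), "cells, sizes:", set(cell_sizes.values()))
```

One consequence of nesting is that every cell ends up the same size (up to rounding), whereas independently ranked factors can leave some of the 125 cells nearly empty.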

RFM - Weighted Cell Ranking

Weightings used by RFM practitioners vary. For example, some advocate adding the three RFM scores together - thus giving equal weight to each factor. In that case, scores can range from 15 (5+5+5) to 3 (1+1+1). Another weighting arrangement often used is 3xR + 2xF + 1xM. In this case, scores can range from 30 (15+10+5) to 6 (3+2+1).
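A short sketch makes the score ranges easy to check. The function name is hypothetical; the two weightings are the ones mentioned above. Note that under the 3/2/1 weighting the lowest possible score is 3+2+1 = 6.

```python
def weighted_score(r, f, m, weights=(1, 1, 1)):
    """Combine the three 1-5 scores into one number using the given weights."""
    w_r, w_f, w_m = weights
    return w_r * r + w_f * f + w_m * m


# Equal weighting: simply add the three scores.
equal_range = (weighted_score(1, 1, 1), weighted_score(5, 5, 5))

# 3xR + 2xF + 1xM: Recency counts three times as much as Monetary.
weights_321 = (3, 2, 1)
weighted_range = (weighted_score(1, 1, 1, weights_321),
                  weighted_score(5, 5, 5, weights_321))

print("equal weights:", equal_range)    # (3, 15)
print("3-2-1 weights:", weighted_range)  # (6, 30)
```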

So which to use? In reality, there are many other permutations of approaches that are being used today. Best-practice marketing analytics requires a fine mix of mathematical and statistical science, creativity and experimentation. Bottom line, test multiple scoring methods to see which works best for your unique customer base.

Establishing a Score Threshold

After a test or production campaign, you will find that some of the cells were profitable while some were not. Let's turn to a case study to see how you can establish a threshold that will help maximize your profitability. This study comes from Professor Charlotte Mason of the Kenan-Flagler Business School and utilizes a real-life marketing study performed by The BookBinders Book Club (Source: Recency, Frequency and Monetary (RFM) Analysis, Professor Charlotte Mason, Kenan-Flagler Business School, University of North Carolina, 2003).

BookBinders is a specialty book seller that utilizes multiple marketing channels. BookBinders traditionally did mass marketing and wanted to test the power of RFM. To do so, they initially did a random mailing to 50,000 customers. The customers were mailed an offer to purchase The Art History of Florence. Response data was captured and a "post-RFM" analysis was completed. This "post analysis" was done by freezing the files of the 50,000 test customers prior to the actual test offer. This way, the test campaign itself did not affect the analysis: had the files not been frozen, the actual buyers among the 50,000 test subjects would have been coded as the most recent purchasers. The results firmly support the use of RFM as a highly effective segmentation approach.

Purchased the book = yes; months since last purchase = 8.61; total # purchases = 5.22; dollars spent = 234.30
Purchased the book = no; months since last purchase = 12.73; total # purchases = 3.76; dollars spent = 205.74

Customers that purchased the book were, on average, more recent purchasers, more frequent purchasers and had spent more with BookBinders than those who did not.

The response rate for the top decile (18%) was twice the response rate associated with the 5th decile (9%).

Results from this test were then used by BookBinders to identify which of their remaining customers should receive the same mailing. BookBinders used a breakeven response rate calculation to determine the appropriate RFM cells to mail.

The following cost information was used as input:

Cost per Mail-piece $0.50

Selling Price $18.00

BookBinders Book Cost $9.00

Shipping Costs $3.00

Breakeven is achieved when the cost of the mailing is equal to the net profit from a sale. In this case:

Breakeven Response Rate = cost to mail the offer / net profit from a single sale

= $0.50 / ($18.00 - $9.00 - $3.00)

= $0.50 / $6.00

= 8.3%

So, according to the test offer, profit can be obtained by mailing to cells that exhibited a response rate of greater than 8.3%.
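The breakeven calculation and the resulting mail/no-mail decision can be sketched like this. The cost inputs are the BookBinders figures given above; the function name and the per-cell response rates are hypothetical and only illustrate how the threshold would be applied.

```python
def breakeven_response_rate(cost_per_piece, selling_price, product_cost, shipping_cost):
    """Response rate at which mailing cost equals net profit from the sales it produces."""
    net_profit_per_sale = selling_price - product_cost - shipping_cost
    if net_profit_per_sale <= 0:
        raise ValueError("each sale must earn a positive net profit")
    return cost_per_piece / net_profit_per_sale


rate = breakeven_response_rate(cost_per_piece=0.50, selling_price=18.00,
                               product_cost=9.00, shipping_cost=3.00)
print(f"breakeven response rate: {rate:.1%}")  # 8.3%

# Hypothetical test-mailing response rates by RFM cell (not from the case study).
test_response = {"555": 0.18, "443": 0.10, "332": 0.083, "211": 0.04}

# Mail only the cells whose test response rate beat breakeven.
cells_to_mail = sorted(cell for cell, rr in test_response.items() if rr > rate)
print(cells_to_mail)
```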

By mailing only the cells above breakeven, RFM dramatically improved profitability: it captured 71% of buyers (3,214/4,522) while mailing only about 45% of the customers (22,731/50,000). And the return on marketing expenditures using RFM was more than eight times (69.7/8.5) that of a mass mailing.
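The two return figures can be reproduced from the numbers already given. The article does not spell out its formula, so the sketch below assumes that return on marketing expenditure means (net profit from sales - mailing cost) / mailing cost, with $6.00 net profit per sale and $0.50 per mail-piece; under that assumption the results match 8.5% and 69.7%.

```python
COST_PER_PIECE = 0.50        # from the case study
NET_PROFIT_PER_SALE = 6.00   # $18.00 - $9.00 - $3.00


def marketing_return(pieces_mailed, buyers):
    """Assumed definition: (net profit from sales - mailing cost) / mailing cost."""
    mailing_cost = pieces_mailed * COST_PER_PIECE
    net_profit = buyers * NET_PROFIT_PER_SALE
    return (net_profit - mailing_cost) / mailing_cost


mass_return = marketing_return(pieces_mailed=50_000, buyers=4_522)
rfm_return = marketing_return(pieces_mailed=22_731, buyers=3_214)

print(f"mass mailing: {mass_return:.1%}")
print(f"RFM mailing:  {rfm_return:.1%}")
print(f"improvement:  {rfm_return / mass_return:.1f}x")
```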

Number of Cells and Cell Size Considerations

As previously mentioned, RFM was initially utilized by companies that operated in the B-to-C marketplace and generally possessed a very large number of customers. The idea of generating 125 cells using quintiles for R, F and M has been a very good practice as an initial modeling effort. But what if you are a B-to-B marketer with relatively fewer customers? Or, what if you are a B-to-C marketer with an extremely large file with millions of customers? The answer is to use the same approach that is used in data mining -- be flexible and experiment.

Establishing a minimum test cell size is a good place to start. Arthur Hughes recommends the following formula:

Test Cell Size = 4 / Breakeven Response Rate.

The Breakeven Response Rate was addressed above in the BookBinders case study. The number "4" is a constant that Hughes has found works successfully based on many studies he has performed. BookBinders' Breakeven Response Rate was 8.3%. Using the above formula, you would need a minimum of 48 customers in each cell (4/0.083). BookBinders actually had 400 customers per cell (50,000/125), so they had more than adequate comfort in the significance of their test. In reality, BookBinders could have created as many as 1,041 cells (50,000/48) if they were comfortable using the minimum of 48 per cell. As an example, they could have used deciles as opposed to quintiles and established 1,000 cells (10 x 10 x 10). The more cells, the finer the analysis, but the law of diminishing returns applies.

The same formula guides the choice of cell count for small files. If your Breakeven Response Rate is 3%, your minimum cell size would be 133 customers (4/0.03). Therefore, if you have 12,000 customers you could have about 90 cells (12,000/133). As such, a 5 x 5 x 4 (100 cells) or a 5 x 4 x 4 (80 cells) approach may be appropriate.
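The cell-size arithmetic is easy to script. In this sketch the function names are hypothetical, and fractional customers are rounded down to match the figures quoted in this section; rounding up would give a slightly more conservative minimum.

```python
def min_test_cell_size(breakeven_rate, constant=4):
    """Hughes's rule of thumb: constant / breakeven response rate, rounded down."""
    return int(constant / breakeven_rate)


def max_cell_count(total_customers, breakeven_rate):
    """Largest number of cells that still leaves the minimum size in each cell."""
    return total_customers // min_test_cell_size(breakeven_rate)


# BookBinders test: 50,000 customers, 8.3% breakeven.
print(min_test_cell_size(0.083), max_cell_count(50_000, 0.083))  # 48 1041

# Smaller file: 12,000 customers, 3% breakeven.
print(min_test_cell_size(0.03), max_cell_count(12_000, 0.03))    # 133 90
```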

Conclusions

RFM, BI and data mining are all part of an evolutionary path that is common to many marketing organizations. While RFM has been practiced for over 40 years, it still holds great value for many organizations. Its merits include:

- Simplicity - easy to understand and implement

- Relatively low cost

- Proven ROI

- Data requirements are relatively low, both in the variables required and in the number of records

- Once utilized, it sets up a broader foundation (from an infrastructure and business case perspective) to undertake more sophisticated data mining efforts

RFM's challenges include:

- Contact fatigue can be a problem for the higher scoring customers. A high-level cross-campaign communication strategy can help prevent this.

- Your lowest scoring customers may never hear from you. Again, a cross-campaign communications plan should ensure that all of your customers are communicated with periodically to ensure low scoring customers are given the opportunity to meet their potential. Also, data mining and the prediction of customer lifetime value can help address this shortcoming.

- RFM includes only three variables. Data mining typically finds RFM-based variables to be quite important in response models. But there are additional variables that data mining typically uses (e.g., detailed transaction, demographic and firmographic data) that help produce improved results. Moreover, data mining techniques can also increase response rates via the development of richer segment/cell profiles that can be used to vary offer content and incentives.

As stated before, successful marketing efforts require analytics and experimentation. RFM has proven itself as an effective approach to predicting response and improving profitability. It can be an important stage in your company's evolution in marketing analytics.



Source: http://ezinearticles.com/?RFM---A-Precursor-to-Data-Mining&id=1962283
