Market Basket Analysis

Market Basket Analysis is one of the key techniques used by large retailers to uncover associations between items. It works by looking for combinations of items that occur together frequently in transactions. To put it another way, it allows retailers to identify relationships between the items that people buy.

Association Rules

Association rules are widely used to analyze retail basket or transaction data. The aim is to identify strong rules in the data, using measures of interestingness such as support, confidence, and lift.

An example of Association Rules

Assume there are 100 customers

10 of them bought milk, 8 bought butter and 6 bought both of them.

bought milk => bought butter

support = P(Milk & Butter) = 6/100 = 0.06

confidence = support/P(Milk) = 0.06/0.10 = 0.60

lift = confidence/P(Butter) = 0.60/0.08 = 7.5
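
The arithmetic for the rule bought milk => bought butter can be checked with a short Python sketch. It uses the usual definitions, in which the confidence of milk => butter divides the joint support by P(Milk) and the lift divides that confidence by P(Butter). The function name rule_metrics and its parameter names are illustrative assumptions, not part of any library.

```python
from fractions import Fraction


def rule_metrics(n_total, n_antecedent, n_consequent, n_both):
    """Return (support, confidence, lift) for the rule antecedent => consequent.

    All arguments are customer counts. Fractions keep the arithmetic exact.
    """
    support = Fraction(n_both, n_total)          # P(antecedent & consequent)
    confidence = Fraction(n_both, n_antecedent)  # P(consequent | antecedent)
    lift = confidence / Fraction(n_consequent, n_total)
    return support, confidence, lift


# 100 customers: 10 bought milk, 8 bought butter, 6 bought both.
milk_support, milk_confidence, milk_lift = rule_metrics(100, 10, 8, 6)
print(float(milk_support), float(milk_confidence), float(milk_lift))
```

A lift well above 1 suggests that butter is bought with milk far more often than it would be if the two purchases were independent.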

The set of items a customer buys is referred to as an item set, and market basket analysis seeks to find relationships between purchases.

Typically the relationship will be in the form of a rule:

IF {beer, no bar meal} THEN {crisps}.

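In this formulation, the probability that a customer will buy beer without a bar meal (i.e. that the antecedent is true) is referred to as the support for the rule; other treatments, as in the milk and butter example above, define support as the probability that the antecedent and the consequent occur together. The conditional probability that a customer will purchase crisps, given that they bought beer without a bar meal, is referred to as the confidence.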

The algorithms for performing market basket analysis are fairly straightforward (Berry and Linoff is a reasonable introductory resource for this). The complexities mainly arise in exploiting taxonomies, avoiding combinatorial explosions (a supermarket may stock 10,000 or more line items), and dealing with the large amounts of transaction data that may be available.
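
To illustrate how the combinatorial explosion can be contained, here is a minimal sketch of a level-wise, Apriori-style search for frequent itemsets. It is one common approach and not necessarily the one the text has in mind. The function name frequent_itemsets, the sample baskets, and the 0.4 threshold are all assumptions made for illustration. The key idea is that a larger itemset is only counted if every one of its smaller subsets was already frequent.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)
    # Level 1 candidates: every single item seen in the data.
    current = [frozenset([i]) for i in sorted({i for b in baskets for i in b})]
    result = {}
    k = 1
    while current:
        # Count how many baskets contain each candidate itemset.
        counts = {c: sum(1 for b in baskets if c <= b) for c in current}
        frequent = {c: cnt / n for c, cnt in counts.items()
                    if cnt / n >= min_support}
        result.update(frequent)
        # Join frequent k-itemsets into (k+1)-itemset candidates, then prune
        # any candidate that has an infrequent k-item subset.
        k += 1
        candidates = {a | b for a in frequent for b in frequent
                      if len(a | b) == k}
        current = [c for c in candidates
                   if all(frozenset(s) in frequent
                          for s in combinations(c, k - 1))]
    return result


# Hypothetical baskets, for illustration only.
sample_baskets = [
    {"beer", "crisps"},
    {"beer", "crisps", "bread"},
    {"beer", "bread"},
    {"milk", "bread"},
    {"beer", "crisps", "milk"},
]
found = frequent_itemsets(sample_baskets, min_support=0.4)
for itemset, supp in sorted(found.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), supp)
```

With these sample baskets the triple {beer, crisps, bread} is never counted, because the pair {crisps, bread} is already below the threshold. With thousands of line items, this kind of pruning is what keeps the search tractable.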

A major difficulty is that a large number of the rules found may be trivial for anyone familiar with the business. Although the volume of data has been reduced, we are still asking the user to find a needle in a haystack. Requiring rules to have a high minimum support level and high confidence level risks missing any exploitable result we might have found. One partial solution to this problem is differential market basket analysis, as described below.

In retailing, most purchases are made on impulse. Market basket analysis gives clues as to what a customer might have bought if the idea had occurred to them.

As a first step, therefore, market basket analysis can be used in deciding the location and promotion of goods inside a store. If, as has been observed, purchasers of Barbie dolls are more likely to buy candy, then high-margin candy can be placed near the Barbie doll display. Customers who would have bought candy with their Barbie dolls had they thought of it will now be suitably tempted.

But this is only the first level of analysis. Differential market basket analysis can find interesting results and can also eliminate the problem of a potentially high volume of trivial results.

In differential analysis, we compare results between different stores, between customers in different demographic groups, between different days of the week, different seasons of the year, etc.

If we observe that a rule holds in one store, but not in any other (or does not hold in one store, but holds in all others), then we know that there is something interesting about that store. Perhaps its clientele are different, or perhaps it has organized its displays in a novel and more lucrative way. Investigating such differences may yield useful insights which will improve company sales.
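
The store-by-store comparison described above can be sketched as follows. This is a simplified illustration under stated assumptions: the store names, the transactions, the 0.6 confidence threshold, and the helper names rule_confidence, rule_holds_by_store, and odd_ones_out are all hypothetical. A rule is treated as holding in a store when its confidence there reaches the threshold, and a store is flagged when its outcome differs from that of every other store.

```python
def rule_confidence(transactions, antecedent, consequent):
    """Confidence of antecedent => consequent, or None if never triggered."""
    antecedent, consequent = set(antecedent), set(consequent)
    triggered = [t for t in transactions if antecedent <= set(t)]
    if not triggered:
        return None
    hits = sum(1 for t in triggered if consequent <= set(t))
    return hits / len(triggered)


def rule_holds_by_store(by_store, antecedent, consequent, min_confidence):
    """Map each store to True/False: does the rule reach min_confidence?"""
    holds = {}
    for store, transactions in by_store.items():
        conf = rule_confidence(transactions, antecedent, consequent)
        holds[store] = conf is not None and conf >= min_confidence
    return holds


def odd_ones_out(holds):
    """Stores whose outcome differs from all others (needs 3+ stores)."""
    outcomes = list(holds.values())
    if len(outcomes) < 3:
        return []
    return [s for s, h in holds.items() if outcomes.count(h) == 1]


# Hypothetical per-store transactions, for illustration only.
by_store = {
    "north": [{"dolls", "candy"}, {"dolls", "candy"}, {"dolls"}, {"milk"}],
    "south": [{"dolls"}, {"dolls", "milk"}, {"candy"}],
    "east": [{"dolls"}, {"dolls", "candy"}, {"dolls"}, {"bread"}],
}
holds = rule_holds_by_store(by_store, {"dolls"}, {"candy"}, min_confidence=0.6)
print(holds)
print(odd_ones_out(holds))
```

The same pattern applies to other groupings, such as demographic groups, days of the week, or seasons: partition the transactions, evaluate the rule in each partition, and look at where the outcome differs.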

An itemset is the set of items a customer buys at the same time. A relationship between itemsets is typically stated as a logic rule like IF {bread, peanut butter} THEN {jelly}. An itemset can range from no items (the empty set, though, is usually ignored) to all items in the data set.

The support count is a count of how often the itemset appears in the transaction database. The support is how often the itemset appears, stated as a probability. For example, if the support count is 21 out of a possible 1,000 transactions, then the support is 21/1,000 or 0.021.

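The confidence is the conditional probability that the consequent is purchased given that the antecedent was purchased. For the rule above, it is the probability that a basket containing bread and peanut butter also contains jelly.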
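
These three quantities can be computed directly from a list of transactions. The sketch below is a minimal illustration. The transactions are made up, and the function names support_count, support, and confidence are assumptions chosen to mirror the terms in the text.

```python
def support_count(transactions, itemset):
    """Number of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t))


def support(transactions, itemset):
    """Support count expressed as a fraction of all transactions."""
    return support_count(transactions, itemset) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """P(consequent | antecedent), or None if the antecedent never occurs."""
    n_antecedent = support_count(transactions, antecedent)
    if n_antecedent == 0:
        return None
    n_both = support_count(transactions, set(antecedent) | set(consequent))
    return n_both / n_antecedent


# Hypothetical transactions, for illustration only.
pbj_transactions = [
    {"bread", "peanut butter", "jelly"},
    {"bread", "peanut butter"},
    {"bread", "jelly"},
    {"milk"},
    {"bread", "peanut butter", "jelly", "milk"},
]
antecedent = {"bread", "peanut butter"}
print(support_count(pbj_transactions, antecedent))
print(support(pbj_transactions, antecedent))
print(confidence(pbj_transactions, antecedent, {"jelly"}))
```

Here three of the five baskets contain both bread and peanut butter, and two of those three also contain jelly.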
