@howardrotterdam
Created March 30, 2025 08:58
What's Association Rule in machine learning?

Grok3:

In machine learning, association rule mining (also called association rule learning) is a technique for uncovering relationships or patterns between variables in large datasets, typically expressed as "if, then" statements. It's widely applied in data mining, especially for market basket analysis, where the goal is to find items that frequently occur together in transactions.

An association rule is expressed as A → B, where:

  • A (the antecedent) is an item or set of items, and
  • B (the consequent) is another item or set of items.
  • The rule suggests that if A occurs, then B is likely to occur as well.

Key Metrics

Association rules are evaluated using three main metrics:

  1. Support: The percentage of transactions in the dataset that contain both A and B. It measures how frequently the rule applies.
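    • Formula: Support(A → B) = P(A ∪ B) = (Number of transactions with A and B) / (Total transactions). Here A ∪ B denotes the combined itemset, so P(A ∪ B) is the fraction of transactions containing every item in both A and B, not the probability of "A or B".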
  2. Confidence: The probability that B occurs given that A has occurred. It measures the reliability of the rule.
    • Formula: Confidence(A → B) = P(B|A) = (Number of transactions with A and B) / (Number of transactions with A)
  3. Lift: The ratio of the observed support to the expected support if A and B were independent. It indicates the strength of the rule over random co-occurrence.
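    • Formula: Lift(A → B) = Confidence(A → B) / Support(B) = Support(A → B) / (Support(A) * Support(B))
    • Lift > 1: A and B occur together more often than expected under independence (positive association); Lift = 1: A and B are independent; Lift < 1: they occur together less often than expected (negative association).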
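
To make the three formulas concrete, here is a minimal Python sketch that computes them directly from a list of transactions. The toy transactions and the function names (support, confidence, lift) are made up for illustration and do not come from any particular library.

```python
# Hypothetical toy transactions, invented purely for illustration.
transactions = [
    {"milk", "bread", "butter"},
    {"milk", "bread"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"milk", "bread", "butter"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(a, b, transactions):
    """Support of the combined itemset A ∪ B divided by the support of A."""
    return support(set(a) | set(b), transactions) / support(a, transactions)


def lift(a, b, transactions):
    """Confidence of A → B divided by the support of B."""
    return confidence(a, b, transactions) / support(b, transactions)


A, B = {"milk", "bread"}, {"butter"}
print(f"support    = {support(A | B, transactions):.3f}")
print(f"confidence = {confidence(A, B, transactions):.3f}")
print(f"lift       = {lift(A, B, transactions):.3f}")
```

In this toy data the lift for {milk, bread} → {butter} comes out below 1, because butter is already in most baskets; a rule can have reasonably high confidence and still be no better than the baseline.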

Example

Imagine a grocery store dataset:

  • Rule: {Milk, Bread} → {Butter}
  • Support: 10% of all transactions contain Milk, Bread, and Butter.
  • Confidence: 70% of transactions with Milk and Bread also include Butter.
  • Lift: 1.5, meaning shoppers who buy Milk and Bread are 1.5 times as likely to buy Butter as shoppers overall.
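
The three numbers in this example also pin down two quantities that are not stated: how often Milk and Bread appear together, and how often Butter appears at all. The short sketch below derives them from the metric definitions; the variable names are illustrative only.

```python
# Figures taken from the grocery example above.
support_rule = 0.10     # Support({Milk, Bread} → {Butter})
confidence_rule = 0.70  # Confidence({Milk, Bread} → {Butter})
lift_rule = 1.5         # Lift({Milk, Bread} → {Butter})

# Confidence = Support(rule) / Support(antecedent), so:
support_antecedent = support_rule / confidence_rule
# Lift = Confidence / Support(consequent), so:
support_consequent = confidence_rule / lift_rule

print(f"Support(Milk, Bread) = {support_antecedent:.1%}")  # 14.3%
print(f"Support(Butter)      = {support_consequent:.1%}")  # 46.7%

# Consistency check against the second form of the lift formula.
recomputed_lift = support_rule / (support_antecedent * support_consequent)
print(f"Recomputed lift      = {recomputed_lift:.2f}")     # 1.50
```

So in this example roughly 47% of all baskets contain Butter, and that share rises to 70% among baskets that already contain Milk and Bread.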

Common Algorithm

The most well-known algorithm for generating association rules is Apriori. It works by:

  1. Identifying frequent itemsets (combinations of items that meet a minimum support threshold).
  2. Generating rules from those itemsets that meet a minimum confidence threshold.
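
The two steps above can be sketched in plain Python as follows. This is a simplified, unoptimized illustration of the Apriori idea, not a reference implementation: the function names (apriori, generate_rules), the toy transactions, and the thresholds are all assumptions made for the example. The pruning step relies on the fact that an itemset can only be frequent if all of its subsets are frequent.

```python
from itertools import combinations

# Hypothetical toy transactions, invented purely for illustration.
transactions = [
    {"milk", "bread", "butter"},
    {"milk", "bread"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"milk", "bread", "butter"},
]


def apriori(transactions, min_support):
    """Step 1: return {itemset: support} for every frequent itemset."""
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)

    def sup(itemset):
        return sum(itemset <= b for b in baskets) / n

    # Start with single items that meet the support threshold.
    singles = {frozenset([item]) for b in baskets for item in b}
    current = {s: sup(s) for s in singles}
    current = {s: v for s, v in current.items() if v >= min_support}

    frequent = {}
    k = 1
    while current:
        frequent.update(current)
        k += 1
        # Join: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune: drop candidates that have an infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        current = {c: sup(c) for c in candidates}
        current = {c: v for c, v in current.items() if v >= min_support}
    return frequent


def generate_rules(frequent, min_confidence):
    """Step 2: return (antecedent, consequent, support, confidence) tuples."""
    rules = []
    for itemset, s in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(itemset, r):
                a = frozenset(antecedent)
                # Every subset of a frequent itemset is itself in `frequent`.
                conf = s / frequent[a]
                if conf >= min_confidence:
                    rules.append((a, itemset - a, s, conf))
    return rules


frequent = apriori(transactions, min_support=0.6)
rules = generate_rules(frequent, min_confidence=0.7)
for a, b, s, c in rules:
    print(f"{set(a)} -> {set(b)}  support={s:.2f}  confidence={c:.2f}")
```

With these thresholds the three-item set {milk, bread, butter} is dropped in step 1 (its support is 0.4), so only rules between pairs of items survive. Production code would more likely use an existing library than a hand-rolled loop like this one.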

Applications

  • Retail: Recommending products (e.g., "People who buy diapers also buy baby wipes").
  • Healthcare: Discovering symptom-disease relationships.
  • Web Usage Mining: Analyzing user behavior on websites.

In essence, association rules help reveal hidden patterns in data, making them a powerful tool for decision-making and predictive analytics!
