Support for Item Hierarchies
Often an item hierarchy is available for datasets used for association rule mining. For example, in a supermarket dataset, items like "bread" and "bagel" might belong to the item group (category) "baked goods."
We provide support to use the item hierarchy to aggregate items to different group levels, to produce multi-level transactions and to filter spurious associations mined from multi-level transactions.
## S4 method for signature 'itemMatrix'
aggregate(x, by)

## S4 method for signature 'itemsets'
aggregate(x, by)

## S4 method for signature 'rules'
aggregate(x, by)

addAggregate(x, by, postfix = "*")
filterAggregate(x)
x | a transactions, itemsets or rules object.
by | name of a field (hierarchy level) available in itemInfo, or a vector of character strings (factor) of the same length as the items in x.
postfix | characters added to mark group-level items.
Transactions can store item hierarchies as additional columns in the itemInfo data.frame ("labels" is reserved for the item labels).
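As a minimal sketch of how a hierarchy column might be attached, the following builds a small transactions object and adds a group level to itemInfo. The toy baskets, the column name `category`, and the item-to-group mapping are assumptions made for illustration; they are not part of the package.

```r
library(arules)

# Toy baskets (hypothetical data, used only for illustration).
trans <- as(list(
  c("bread", "bagel", "milk"),
  c("bagel", "butter"),
  c("milk")
), "transactions")

# Hypothetical item -> group mapping. It is looked up by item label so that
# the result lines up with itemLabels(trans) whatever the item order is.
category <- c(bagel = "baked goods", bread = "baked goods",
              butter = "dairy", milk = "dairy")

# Add the hierarchy level as an extra itemInfo column ("labels" stays as is).
itemInfo(trans)$category <- factor(unname(category[itemLabels(trans)]))

itemInfo(trans)
```

The new column can then be referred to by name, for example as the `by` argument of aggregate().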
Aggregation: To perform analysis at a group level of the item hierarchy, aggregate() produces a new object with items aggregated to a given group level. A group-level item is present if one or more of the items in the group are present in the original object. If rules are aggregated and the aggregation would lead to the same aggregated group item in the lhs and in the rhs, then that group item is removed from the lhs. Rules or itemsets that are no longer unique after the aggregation are also removed. Note also that the quality measures are not applicable to the new rules and thus are removed. If these measures are required, then aggregate the transactions before mining rules.
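The rule that a group-level item is present when at least one of its member items is present can be illustrated in base R on a plain logical matrix. This is only a sketch of the semantics under toy data; it is not how aggregate() is implemented, and the matrix and grouping below are assumptions.

```r
# Binary incidence matrix: rows are transactions, columns are items.
items <- c("bagel", "bread", "butter", "milk")
m <- matrix(c(TRUE,  TRUE,  FALSE, TRUE,
              TRUE,  FALSE, TRUE,  FALSE,
              FALSE, FALSE, FALSE, TRUE),
            nrow = 3, byrow = TRUE, dimnames = list(NULL, items))

# Hypothetical group of each item, in the same order as the columns of m.
group <- factor(c("baked goods", "baked goods", "dairy", "dairy"))

# One column per group: TRUE where any member item is present in the row.
m_group <- sapply(levels(group), function(g) {
  rowSums(m[, group == g, drop = FALSE]) > 0
})

m_group
```

The third transaction contains only "milk", so it has the "dairy" group item but not "baked goods".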
Multi-level analysis: To analyze relationships between individual items and item groups, addAggregate() creates a new transactions object that contains both the original items and group-level items (marked with a given postfix). In association rule mining, all items are handled the same, which means that mining produces a large number of rules of the type

item A => group of item A

with a confidence of 1. The same happens with itemsets. filterAggregate() can be used to filter these spurious rules or itemsets.
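Why a rule of the type item A => group of item A always has confidence 1 follows directly from the definition confidence(A => B) = support(A and B) / support(A). The base R sketch below works through this on a toy incidence matrix; the data, the grouping, and the helper `confidence()` are hypothetical and only mimic the effect of adding group-level items with the postfix "*".

```r
# Binary incidence matrix: rows are transactions, columns are items.
items <- c("bagel", "bread", "butter", "milk")
m <- matrix(c(TRUE,  TRUE,  FALSE, TRUE,
              TRUE,  FALSE, TRUE,  FALSE,
              FALSE, FALSE, FALSE, TRUE),
            nrow = 3, byrow = TRUE, dimnames = list(NULL, items))

# Hypothetical group of each item, in the same order as the columns of m.
group <- c("baked goods", "baked goods", "dairy", "dairy")
groups <- unique(group)

# Group-level columns: present if any member item is present.
m_group <- sapply(groups, function(g) {
  rowSums(m[, group == g, drop = FALSE]) > 0
})
colnames(m_group) <- paste0(groups, "*")  # mark group items with a postfix

# Multi-level data: original items plus group-level items.
m_multi <- cbind(m, m_group)

# confidence(lhs => rhs) = support(lhs and rhs) / support(lhs)
confidence <- function(lhs, rhs) {
  mean(m_multi[, lhs] & m_multi[, rhs]) / mean(m_multi[, lhs])
}

conf_item_to_group <- confidence("bagel", "baked goods*")
conf_item_to_item  <- confidence("bagel", "milk")
```

Because the group item is present in every transaction that contains one of its members, `conf_item_to_group` is 1 by construction, whereas a rule between two ordinary items such as `conf_item_to_item` can take any value.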
aggregate() returns an object of the same class as x, encoded with a number of items equal to the number of unique values in by. Note that for associations (itemsets and rules) the number of associations in the returned set will most likely be reduced, since several associations might map to the same aggregated association and aggregate() returns a unique set. If several associations map to a single aggregated association, then the quality measures of one of the original associations are chosen at random.
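The relationship between the returned object and by can be checked on a small example. The sketch below passes by as a vector with one group label per item; the toy baskets and the grouping are assumptions for illustration only.

```r
library(arules)

# Toy baskets (hypothetical data, used only for illustration).
trans <- as(list(
  c("bread", "bagel", "milk"),
  c("bagel", "butter"),
  c("milk")
), "transactions")

# Hypothetical grouping: one label per item, aligned with itemLabels(trans).
category <- c(bagel = "baked goods", bread = "baked goods",
              butter = "dairy", milk = "dairy")
by <- unname(category[itemLabels(trans)])

trans_agg <- aggregate(trans, by = by)

class(trans_agg)        # same class as the input
itemLabels(trans_agg)   # one item per unique value in by
```

Here four items collapse into two group-level items, while the number of transactions is unchanged.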
addAggregate() returns a new transactions object with the original items and the group-level items added. filterAggregate() returns a new rules object with the spurious rules removed.
Michael Hahsler
data("Groceries")
Groceries

## Groceries contains a hierarchy stored in itemInfo
head(itemInfo(Groceries))

## aggregate by level2: items will become labels at level2
## Note that the number of items is therefore reduced to 55
Groceries_level2 <- aggregate(Groceries, by = "level2")
Groceries_level2
head(itemInfo(Groceries_level2)) ## labels are alphabetically sorted!

## compare original and aggregated transactions
inspect(head(Groceries, 2))
inspect(head(Groceries_level2, 2))

## create labels manually (organize items by the first letter)
mylevels <- toupper(substr(itemLabels(Groceries), 1, 1))
head(mylevels)

Groceries_alpha <- aggregate(Groceries, by = mylevels)
Groceries_alpha
inspect(head(Groceries_alpha, 2))

## aggregate rules
## Note: you could also directly mine rules from aggregated transactions to
## get support, confidence and lift
rules <- apriori(Groceries, parameter = list(supp = 0.005, conf = 0.5))
rules
inspect(rules[1])

rules_level2 <- aggregate(rules, by = "level2")
inspect(rules_level2[1])

## mine multi-level rules:
## (1) add aggregate items. These items are followed by a *
Groceries_multilevel <- addAggregate(Groceries, "level2")
summary(Groceries_multilevel)
inspect(head(Groceries_multilevel))

rules <- apriori(Groceries_multilevel,
  parameter = list(support = 0.01, conf = .9))
inspect(head(rules, by = "lift"))
## this contains many spurious rules of type 'item X => aggregate of item X'
## with a confidence of 1 and high lift.

## (2) filter spurious rules resulting from the aggregation
rules <- filterAggregate(rules)
inspect(head(rules, by = "lift"))