Data Mining And Market Basket Implementation

Khawar Islam

Data Mining

Data mining is the process of discovering hidden patterns in data that were previously unknown and are non-trivial. The patterns found by mining are also used for prediction.

Points of Study

The objective is clear, but the output is not known in advance.
The term "dimension" refers to the number of columns in the dataset.
The data is previously unknown.
The result is non-trivial.
The pattern should be hidden.

Job Scope

Data analyst
Data scientist
Business intelligence

Software Tools

Weka: research tool
RapidMiner: professional tool
SPSS by IBM: data pre-processing
Excel skills: find and replace
CSV file: data set
R programming: especially for data science

Major techniques

These are the major techniques used in data mining. Raw data is first extracted and passed through steps such as data cleaning and data pre-processing to construct useful datasets, which are then used for prediction.

Market basket/Frequent pattern Analysis.
Classification (model and prediction).
Health analyses.
Clustering (fraud detection).
Time series prediction.
Constructing software systems.

Now we move on to our first data mining technique, market basket analysis, and implement it using a binary database example.

Market Basket/Frequent Pattern

Market basket analysis is frequently used in:

Supermarket promotions.
Web pages.
Arrangement of items.

Pattern

A collection of more than one item is called an item set.

Unique basket + items inside = transaction
Association rule = tells the association between two items

Example

Now we will consider an example in which T holds the transaction IDs of the dataset, which are always unique. Items are the contents of each transaction (row), identified by its ID.

Transaction Database

Solution

First, we list all the items present in the database; here they are A, B, C, D and E. Each unique item across the six rows is counted only once. In the row with ID 1 we have A, B, D and E but not C, so we note these four items in our list. In the row with ID 2 we see B and E, which are already in the list, so we add only C.

Now we construct the table: the items A, B, C, D and E are written in the first row, and the transaction IDs are written in the first column. Suppose the table starts out empty.
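The item-collection step described here can be sketched in Python. This is a minimal illustration, not the author's implementation: the `transactions` dictionary is hypothetical sample data in which only rows 1 and 2 follow the text (the other four rows are invented), and the function name `unique_items` is an assumption.

```python
# Hypothetical sample data: rows 1 and 2 follow the article's description,
# rows 3 to 6 are invented purely for illustration.
transactions = {
    1: ["A", "B", "D", "E"],
    2: ["B", "C", "E"],
    3: ["A", "B", "D", "E"],
    4: ["A", "B", "C", "E"],
    5: ["A", "B", "C", "D", "E"],
    6: ["B", "C", "D"],
}


def unique_items(transactions):
    """Return each distinct item once, in order of first appearance."""
    seen = []
    for tid in sorted(transactions):       # walk the rows in ID order
        for item in transactions[tid]:
            if item not in seen:           # skip items already noted
                seen.append(item)
    return seen


items = unique_items(transactions)
print(items)          # ['A', 'B', 'D', 'E', 'C']
print(sorted(items))  # ['A', 'B', 'C', 'D', 'E'], the table header
```

Row 1 contributes A, B, D and E, and row 2 adds only C, which mirrors the walk-through. Sorting the result gives the header row of the table.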

Item A is present in the first row, so what is its transaction ID? It is 1, so we write 1 in the first cell of column A. Next we check B: it is also present in transaction ID 1, so we write 1 in the first cell of column B. Item C is the interesting case. C is not present in transaction ID 1, so we look at the row with ID 2. C is present there, so we write 2 in the first cell of column C. The process continues in the same way for the remaining items and rows until the table is complete.
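The table-filling procedure can be sketched as follows. It is a hedged illustration under stated assumptions: the `transactions` dictionary is hypothetical sample data (only rows 1 and 2 follow the text), and the function name `build_item_table` is made up for this example. Each item maps to the list of transaction IDs that contain it, in the order the IDs are encountered.

```python
# Hypothetical sample data: rows 1 and 2 follow the article's description,
# rows 3 to 6 are invented purely for illustration.
transactions = {
    1: ["A", "B", "D", "E"],
    2: ["B", "C", "E"],
    3: ["A", "B", "D", "E"],
    4: ["A", "B", "C", "E"],
    5: ["A", "B", "C", "D", "E"],
    6: ["B", "C", "D"],
}


def build_item_table(transactions):
    """Map each item to the transaction IDs in which it appears."""
    table = {}
    for tid in sorted(transactions):            # rows in ID order
        for item in transactions[tid]:
            # Append this ID to the item's column, creating it if needed.
            table.setdefault(item, []).append(tid)
    return dict(sorted(table.items()))          # columns in A..E order


table = build_item_table(transactions)
for item, tids in table.items():
    print(item, tids)
# A [1, 3, 4, 5]
# B [1, 2, 3, 4, 5, 6]
# C [2, 4, 5, 6]
# D [1, 3, 5, 6]
# E [1, 2, 3, 4, 5]
```

The first entry in column C is 2, because C is absent from transaction 1 and first appears in transaction 2, as in the walk-through.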

Question

Now solve this question: given the following table, construct the transactions and items table.

Hint: Build the table in the same way as in the example.