Which is an efficient method for temporal multil-label association mining in distributrd environment for horizontally partitioned data?

Apriori algorithm successfully finds the frequent elements from the database. But as the dimensionality of the database increase with the number of items then:

• More search space is needed and I/O cost will increase.

• Number of database scan is increased thus candidate generation will increase results in increase in computational cost.

Therefore many variations have been takes place in the Apriori algorithm to minimize the above limitations arises due to increase in size of database. These subsequently proposed algorithms adopt similar database scan level by level as in Apriori algorithm, while the methods of candidate generation and pruning, support counting and candidate representation may differ. The algorithms improve the Apriori algorithms by:

• Reduce passes of transaction database scans

• Shrink number of candidates

• Facilitate support counting of candidates

Direct Hashing and Pruning (DHP):

It is absorbed that reducing the candidate items from the database is one of the important task for increasing the efficiency. Thus a DHP technique was proposed [7] to reduce the number of candidates in the early passes C# for k 9 1 and thus the size of database. In this method, support is counted by mapping the items from the candidate list into the buckets which is divided according to support known as Hash table structure. As the new item set is encountered if item exist earlier then increase the bucket count else insert into new bucket. Thus in the end the bucket whose support count is less the minimum support is removed from the candidate set. In this way it reduce the generation of candidate sets in the earlier stages but as the level increase the size of bucket also increase thus difficult to manage hash table as well candidate set.

Partitioning Algorithm:

Partitioning algorithm is based to find the frequent elements on the basis partitioning

of database in n parts. It overcomes the memory problem for large database which do not fit into main memory because small parts of database easily fit into main memory. This algorithm divides into two passes,

1 In the first pass whole database is divided into n number of parts.

2 Each partitioned database is loaded into main memory one by one and local frequent elements are found.

3 Combine the all locally frequent elements and make it globally candidate set.

4 Find the globally frequent elements from this candidate set.

It should be noted that if the minimum support for transactions in whole database is

min_sup then the minimum support for partitioned transactions is min-sup number of transaction in that partition. A local frequent itemset may or may not be frequent with respect to the entire database thus any itemset which is potentially frequent must include in any one of the frequent partition

Bharat Gupta (2011), A BETTER APPROACH TO MINE FREQUENT ITEM SETS USING APRIORI AND FP-TREE APPROACH, COMPUTER SCIENCE AND ENGINEERING DEPARTMENT, THAPAR UNIVERSITY, PATIALA

What is fastest Algorithm to find best fit hyper-sphere covering m points in n dimensional dataspace?

What are the latest developments in multilevel association rule mining?

Has anyone used machine learning technique to find association rules?

Which are the most popular techniques to optimize association rules?

Has anybody used statistical learning for pre-copy prediction ( or dirty page prediction) in virtual machine live memory migration?

Which is appropriate dataset for privacy preserving stream data classification?

Is there any known development or implementation of function encryption in privacy preserving data mining?

How can one filter uninteresting rules in multilevel association mining processes?

How to learn more about SPSS and its Application?

Can I base on reverse DNA sequences to perform alignment, convert to amino acids and GenBank submission?

Baseline drift in HPLC? What causes this?

Text-Communication from the M1 Hand Area using BCI—and then there is Elon Musk?

Handling Missing Data and Building a Predictive Model with Incomplete Information ?

Has anyone applied Python in the field of textile engineering for data analysis, automation, or smart textiles?

How can I use the cif data obtained from rietveld refinement extracted via gsas2, for microstructural analysis using ETEX software?

How are iso-frequency contours plotted?

Do you know best mines of western part of Afghanistan?

Hello all, Looking for international reviewer to review Ph.D thesis in wireless sensor network.Can anybody help?