1. INTRODUCTION

1.1 Synopsis

This project, entitled “Study on Value-Added Service in Mobile Telecom Based on Association Rules”, is extracted from the proceedings of the 10th ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing.

With the continuous development of information technology, it is being applied to managing data storage in more and more fields. However, in the face of an ever-increasing volume of data, how to build a good database management system still needs further exploration. Data mining not only makes data storage efficient but can also extract hidden, useful information. Current Internet technology and its growing demand necessitate the development of more advanced data mining technologies to interpret the information and knowledge in data distributed all over the world. In the 21st century this demand continues to grow.

Data mining can discover interesting patterns or relationships that describe the data, and can predict or classify behavior with a model built from the available data. In other words, it is an interdisciplinary field with the general goal of predicting outcomes and uncovering relationships in data. It uses automated tools that employ sophisticated algorithms to discover mainly hidden patterns, associations, anomalies, and/or structure in large amounts of data stored in data warehouses or other information repositories, and to filter the necessary information out of these large datasets.

Association rule mining refers to discovering association relationships among different attributes. In telecommunications sales, data mining can be used to analyze which sales pairings are optimal and reasonable. Association rule mining can uncover relationships between commodities, such as commodities that are often bought together.

The telecommunications industry is a typical data-intensive industry, and with the deepening of telecom reform, competition is becoming increasingly fierce. Compared with other industries, the telecommunications industry holds more user data, which can help people analyze the data accurately and obtain useful knowledge. To win the competition, operators should find more business opportunities and provide users with better service. As a result, data warehousing and data mining have important value in the telecommunications industry.

Data mining is the task of discovering interesting patterns from large amounts of data, where the data can be stored in databases, data warehouses, or other information repositories. It is a young interdisciplinary field, drawing from areas such as database systems, data warehousing, statistics, machine learning, data visualization, information retrieval, and high-performance computing. Other contributing areas include neural networks, pattern recognition, spatial data analysis, image databases, signal processing, and many application fields, such as business, economics, and bioinformatics.

The overall knowledge discovery process includes data cleaning, data integration, data selection, data transformation, data mining, pattern evaluation, and knowledge presentation. Since different users can be interested in different kinds of knowledge, data mining should cover a wide spectrum of data analysis and knowledge discovery tasks, including data characterization, discrimination, association, classification, clustering, trend and deviation analysis, and similarity analysis.
These tasks may use the same database in different ways and require the development of numerous data mining techniques.

Data mining functionality includes the discovery of concept or class descriptions, association, classification, prediction, clustering, trend analysis, deviation analysis, and similarity analysis. Characterization and discrimination are forms of data summarization. Data mining systems can be classified according to the kinds of databases mined, the kinds of knowledge mined, the techniques used, or the applications adapted. Data mining itself can be classified into descriptive data mining and predictive data mining. Concept description is the most basic form of descriptive data mining. It describes a given set of task-relevant data in a concise, summarizing manner, presenting general properties of the data.

Efficient and effective data mining in large databases poses numerous requirements and great challenges to researchers and developers. The issues involved include data mining methodology, user interaction, performance and scalability, and the processing of a large variety of data types. Other issues include the exploration of data mining applications and their social impacts.

1.2 Apriori Algorithm For Frequent Itemsets

Finding frequent itemsets in transaction databases has been demonstrated to be useful in several business applications. Many algorithms have been proposed to find frequent itemsets in a very large database. However, there is no published implementation that outperforms every other implementation on every database with every support threshold. In general, many implementations are based on the two main algorithms: Apriori and frequent pattern growth (FP-growth).

The Apriori algorithm discovers the frequent itemsets in a very large database through a series of iterations. In each iteration it must generate candidate itemsets, compute their support, and prune the candidates down to the frequent itemsets.
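
The generate, count, and prune iteration described above can be sketched in a few lines of Python. This is only a minimal, sequential illustration under stated assumptions: transactions are given as collections of item labels, the support threshold is an absolute transaction count, and the names `apriori`, `transactions`, and `min_support` are chosen here for illustration. It is not the trie-based or parallel implementation discussed in this report, and the service labels in the usage lines are hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset.

    Assumptions (not from the source): transactions is an iterable of
    iterables of hashable items, and min_support is an absolute count.
    """
    transactions = [frozenset(t) for t in transactions]

    # First pass: count single items and keep the frequent ones.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Generate: join the frequent (k-1)-itemsets with themselves.
        prev = list(current)
        candidates = set()
        for i in range(len(prev)):
            for j in range(i + 1, len(prev)):
                union = prev[i] | prev[j]
                if len(union) != k:
                    continue
                # Prune: every (k-1)-subset must itself be frequent.
                if all(frozenset(sub) in current
                       for sub in combinations(union, k - 1)):
                    candidates.add(union)

        # Count: one scan of the data computes each candidate's support.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        # Keep only the candidates that meet the support threshold.
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1

    return frequent


# Hypothetical value-added services bought together by five subscribers.
transactions = [
    {"sms", "mms"},
    {"sms", "mms", "data"},
    {"sms", "data"},
    {"mms", "data"},
    {"sms", "mms", "data"},
]
result = apriori(transactions, min_support=3)
```

With this toy input and a threshold of three transactions, each single service and each pair of services is frequent, while the three-service combination appears in only two transactions and is dropped in the third iteration. The candidate scan here is a plain nested loop chosen for readability; faster implementations replace it with a hash tree or trie.
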
The FP-growth algorithm discovers frequent itemsets without the time-consuming candidate generation process that is critical for the Apriori algorithm. Although the FP-growth algorithm outperforms the Apriori algorithm in most cases, several refinements of the Apriori algorithm have been made to speed up frequent itemset mining.

This paper implements a parallel Apriori algorithm based on Bodon’s work and analyzes its performance on a parallel computer. We adopted Bodon’s implementation for parallel computing because it uses the trie data structure and outperforms the other implementations, which use a hash tree. The rest of the paper is organized as follows. It first introduces related work on frequent itemset mining. We then present our implementation for parallel computing on frequent itemset mining. Finally, we present the experimental results of our implementation on a symmetric multiprocessing computer.

1.3 Key Issues Of Apriori Algorithm

The Apriori algorithm is an influential algorithm for mining frequent itemsets for Boolean association rules.

1. Key Concepts

• Frequent itemsets: the sets of items that have minimum support (denoted by Li for the frequent i-itemsets).
• Apriori property: any subset of a frequent itemset must be frequent.
• Join operation: to find Lk, a set of candidate k-itemsets is generated by joining L(k-1) with itself.

2. Methods to Improve Apriori’s Efficiency

• Hash-based itemset counting: a k-itemset whose corresponding hashing bucket count is below the threshold cannot be frequent.
• Transaction reduction: a transaction that does not contain any frequent k-itemset is useless in subsequent scans.
• Partitioning: any itemset that is potentially frequent in DB must be frequent in at least one of the partitions of DB.
• Sampling: mining on a subset of the given data with a lower support threshold, plus a method to determine the completeness.
• Dynamic itemset counting: add new candidate itemsets only when all of their subsets are estimated to be frequent.

1.4 Data Mining Concepts And Techniques

Mining Different Kinds of Knowledge in Databases

Since different users can be interested in different kinds of knowledge, data mining should cover a wide spectrum of data analysis and knowledge discovery tasks, including data characterization, discrimination, association, classification, clustering, trend and deviation analysis, and similarity analysis. These tasks may use the same database in different ways and require the development of numerous data mining techniques.

Data Mining Query Languages and Data Mining

Relational query languages (such as SQL) allow users to pose queries for data retrieval. In a similar way, high-level data mining query languages should let users describe data mining tasks by facilitating the specification of the relevant sets of data for analysis, the domain knowledge, the kinds of knowledge to be mined, and the conditions and constraints to be enforced on the discovered patterns. Such a language should be integrated with a database or data warehouse query language, and optimized for efficient and flexible data mining.

Database Technology

Database technology has evolved from primitive file processing to the development of database management systems with query and transaction processing. Further progress has led to the increasing demand for efficient and effective data analysis and data understanding tools.