Data Mining Assignment Help, Homework Help, Data Mining Project Help

INTRODUCTION

Data mining involves the use of sophisticated data analysis tools to discover previously unknown, valid patterns and relationships in large data sets. These tools can include statistical models, mathematical algorithms, and machine learning techniques (algorithms that improve their performance automatically through experience, such as neural networks or decision trees). Consequently, data mining consists of more than collecting and managing data; it also includes analysis and prediction.

Data mining can be performed on data represented in quantitative, textual, or multimedia forms. Data mining applications can use a variety of parameters to examine the data. They include association (patterns where one event is connected to another event, such as purchasing a pen and purchasing paper), sequence or path analysis (patterns where one event leads to another event, such as the birth of a child and purchasing diapers), classification (identification of new patterns, such as coincidences between duct tape purchases and plastic sheeting purchases), clustering (finding and visually documenting groups of previously unknown facts, such as geographic location and brand preferences), and forecasting (discovering patterns from which one can make reasonable predictions regarding future activities, such as the prediction that people who join an athletic club may take exercise classes).
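As an illustration of the association parameter, the pen-and-paper example can be sketched with two common measures, support and confidence. The baskets below are invented for the example, and the helper names `support` and `confidence` are assumptions of this sketch, not part of any particular tool.

```python
# Hypothetical transactions: each set is one customer's basket (invented data).
transactions = [
    {"pen", "paper"},
    {"pen", "paper", "stapler"},
    {"pen"},
    {"paper", "ink"},
    {"pen", "paper", "ink"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Estimated share of baskets with the antecedent that also hold the consequent."""
    both = support(set(antecedent) | set(consequent), baskets)
    return both / support(antecedent, baskets)


pen_paper_support = support({"pen", "paper"}, transactions)
pen_to_paper_confidence = confidence({"pen"}, {"paper"}, transactions)

print(f"support(pen, paper) = {pen_paper_support:.2f}")
print(f"confidence(pen -> paper) = {pen_to_paper_confidence:.2f}")
```

In this toy data, three of the five baskets contain both items, and three of the four baskets containing a pen also contain paper.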

As an application, compared with other data analysis applications, such as structured queries (used in many commercial databases) or statistical analysis software, data mining represents a difference of kind rather than degree. Many simpler analytical tools use a verification-based approach, where the user develops a hypothesis and then tests the data to prove or disprove it. For example, a user might hypothesize that a customer who buys a hammer will also buy a box of nails. The effectiveness of this approach can be limited by the creativity of the user in developing hypotheses, as well as by the structure of the software being used. In contrast, data mining uses a discovery approach, in which algorithms can be used to examine several multidimensional data relationships simultaneously, identifying those that are unique or frequently represented. For example, a hardware store may compare its customers' tool purchases with home ownership, type of automobile driven, age, occupation, income, and/or distance between residence and the store. As a result of its complex capabilities, two precursors are important for a successful data mining exercise: a clear formulation of the problem to be solved, and access to the relevant data.
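The contrast between the two approaches can be sketched roughly as follows. The function `verify` tests a single hypothesis supplied by the user (hammer buyers also buy nails), while `discover` counts every item pair and reports the frequent ones without being told what to look for. The baskets, the function names, and the `min_count` threshold are all assumptions made for this illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical hardware-store baskets (invented data).
baskets = [
    {"hammer", "nails"},
    {"hammer", "nails", "tape"},
    {"hammer", "gloves"},
    {"paint", "brush"},
    {"paint", "brush", "tape"},
    {"nails", "paint", "brush"},
]


def verify(antecedent, consequent, data):
    """Verification: share of baskets with `antecedent` that also hold `consequent`."""
    with_antecedent = [b for b in data if antecedent in b]
    if not with_antecedent:
        return 0.0
    return sum(consequent in b for b in with_antecedent) / len(with_antecedent)


def discover(data, min_count=2):
    """Discovery: count every item pair, keep those seen at least min_count times."""
    counts = Counter()
    for basket in data:
        # Sort so each pair is always recorded in the same order.
        counts.update(combinations(sorted(basket), 2))
    return {pair: n for pair, n in counts.items() if n >= min_count}


hammer_nails_rate = verify("hammer", "nails", baskets)
frequent_pairs = discover(baskets)

print(f"hammer buyers who also bought nails: {hammer_nails_rate:.2f}")
print(frequent_pairs)
```

The discovery pass surfaces the paint-and-brush pairing even though nobody hypothesized it, which is the difference the paragraph above describes. Real discovery algorithms examine many more dimensions than item pairs; this sketch only shows the idea.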

Reflecting this conceptualization of data mining, some observers consider data mining to be just one step in a larger process known as knowledge discovery in databases (KDD). Other steps in the KDD process, in progressive order, include data cleaning, data integration, data selection, data transformation, (data mining), pattern evaluation, and knowledge presentation.
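The order of the KDD steps can be sketched as a chain of small functions, one per step. Everything here is a toy assumption: the records, the field names, the age bands, and the `min_count` threshold are invented, and each function is only a stand-in for what would be a substantial task in practice.

```python
from collections import Counter

# Two hypothetical sources; all records and field names are invented.
store_sales = [
    {"customer": 1, "item": "hammer"},
    {"customer": 2, "item": "nails"},
    {"customer": 3, "item": None},  # incomplete record
    {"customer": 4, "item": "hammer"},
]
customer_profiles = {1: {"age": 34}, 2: {"age": 61}, 4: {"age": 38}}


def clean(rows):
    """Data cleaning: drop records with a missing item."""
    return [r for r in rows if r["item"] is not None]


def integrate(rows, profiles):
    """Data integration: attach profile fields to each sale."""
    return [{**r, **profiles[r["customer"]]} for r in rows if r["customer"] in profiles]


def select(rows):
    """Data selection: keep only the fields relevant to the question."""
    return [{"item": r["item"], "age": r["age"]} for r in rows]


def transform(rows):
    """Data transformation: replace raw age with a coarse age band."""
    return [
        {"item": r["item"], "band": "under 50" if r["age"] < 50 else "50+"}
        for r in rows
    ]


def mine(rows):
    """Data mining: count (age band, item) combinations."""
    return Counter((r["band"], r["item"]) for r in rows)


def evaluate(patterns, min_count=2):
    """Pattern evaluation: keep patterns seen at least min_count times."""
    return {p: n for p, n in patterns.items() if n >= min_count}


def present(patterns):
    """Knowledge presentation: render patterns as readable lines."""
    return [
        f"{band} buyers bought {item} {n} times"
        for (band, item), n in sorted(patterns.items())
    ]


cleaned = clean(store_sales)
merged = integrate(cleaned, customer_profiles)
report = present(evaluate(mine(transform(select(merged)))))
print("\n".join(report))
```

Only the mining step looks for patterns; the steps before it prepare the data and the steps after it decide which patterns are worth reporting and how.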

A number of advances in technology and business processes have contributed to a growing interest in data mining in both the public and private sectors. Some of these changes include the growth of computer networks, which can be used to connect databases; the development of enhanced search-related techniques such as neural networks and advanced algorithms; the spread of the client/server computing model, allowing users to access centralized data resources from the desktop; and an increased ability to combine data from disparate sources into a single searchable source.

In addition to these improved data management tools, the increased availability of information and the decreasing costs of storing it have also played a role. Over the past several years there has been a rapid increase in the volume of information collected and stored, with some observers suggesting that the quantity of the world's data approximately doubles every year. At the same time, the costs of data storage have decreased significantly, from dollars per megabyte to pennies per megabyte.

Data mining has become increasingly common in both the public and private sectors. Organizations use data mining as a tool to survey customer information, reduce fraud and waste, and assist in medical research. However, the proliferation of data mining has raised some implementation and oversight issues as well. These include concerns about the quality of the data being analyzed, the interoperability of databases and software between agencies, and potential infringements on privacy.

There are also concerns that the limitations of data mining are being overlooked as agencies work to emphasize their homeland security initiatives.

Main goal:

- Study statistical tools useful for managerial decision making.

– Most management problems involve some degree of uncertainty.

– People have poor intuitive judgment of uncertainty.

– IT revolution... abundance of available quantitative information.

– Data mining: information from large databases

– market segmentation and targeting

– stock market data collection