Data mining refers to a broadly defined set of techniques for finding meaningful patterns, or information, in large amounts of raw data.

At a very high level, data mining is performed in the following stages (note that the terminology and the steps taken in the data mining process vary by practitioner):

1. Data collection: gathering the input data you intend to analyze

2. Data scrubbing: removing incomplete records and filling in missing values where appropriate

3. Pre-testing: determining which variables might be important for inclusion during the analysis stage

4. Analysis/Training: analyzing the input data to look for patterns

5. Model building: drawing conclusions from the analysis phase and determining a mathematical model to be applied to future sets of input data

6. Application: applying the model to new data sets to find meaningful patterns
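As a rough illustration of stage 2 above (data scrubbing), here is a minimal Python sketch. The function name `scrub`, the field names, and the choice of mean-filling are all assumptions made for the example, and the records are made up; real projects often use other strategies for missing values.

```python
from statistics import mean


def scrub(records, required, fill):
    """Drop records missing a required field; mean-fill the other gaps.

    `required` and `fill` are lists of field names (hypothetical here).
    """
    # Keep only records where every required field is present.
    kept = [r for r in records if all(r.get(k) is not None for k in required)]
    # Copy each record so the caller's input data is not modified.
    cleaned = [dict(r) for r in kept]
    for field in fill:
        present = [r[field] for r in cleaned if r.get(field) is not None]
        if not present:
            continue  # no observed values to estimate a fill value from
        fill_value = mean(present)
        for r in cleaned:
            if r.get(field) is None:
                r[field] = fill_value
    return cleaned


# Made-up records, for illustration only.
raw = [
    {"age": 30, "income": 50000, "label": "yes"},
    {"age": None, "income": 42000, "label": "no"},
    {"age": 50, "income": None, "label": None},
]
clean = scrub(raw, required=["label"], fill=["age", "income"])
```

Here the third record is dropped because it has no label, and the missing age in the second record is filled with the mean of the ages that remain.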

Data mining can be used to classify or cluster data into groups or to predict likely future outcomes based upon a set of input variables/data.

Common data mining techniques and tools include, for example:

a. decision tree learning

b. Bayesian classification

c. neural networks

During the analysis phase (sometimes also called the training phase), it is customary to set aside some of the input data so that it can be used to cross-validate and test the model. This step helps avoid "over-fitting" the model to the original data set used to train it, which would make the model less applicable to real-world data.
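Setting data aside can be as simple as shuffling the records and cutting the list into three parts. The sketch below is one way to do it in Python; the function name `split_data`, the 60/20/20 proportions, and the fixed seed are assumptions for the example, not a standard.

```python
import random


def split_data(records, train=0.6, validation=0.2, seed=0):
    """Shuffle records and split them into train / validation / test lists."""
    if not 0 < train < 1 or not 0 <= validation < 1 or train + validation >= 1:
        raise ValueError("fractions must leave room for a test set")
    shuffled = list(records)
    # A seeded generator makes the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train)
    n_val = int(len(shuffled) * validation)
    return (
        shuffled[:n_train],                 # used to fit the model
        shuffled[n_train:n_train + n_val],  # used to tune and compare models
        shuffled[n_train + n_val:],         # held back for the final check
    )


train_set, validation_set, test_set = split_data(range(100))
```

The model only ever sees `train_set` during training, so its performance on the other two parts gives a hint of how well it generalizes.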

Wiki User

15y ago

Related Questions

What are the seminar topics related to data mining?

Here are some interesting seminar topics related to data mining:

1. Introduction to Data Mining Techniques: an overview of fundamental techniques such as classification, clustering, regression, and association rule mining.

2. Applications of Data Mining in Healthcare: how data mining is transforming patient care, disease prediction, and medical research.

3. Big Data and Data Mining: integrating data mining with big data tools to extract valuable insights.

4. Data Mining in E-commerce: techniques for customer behavior analysis and recommendation systems.

5. Machine Learning in Data Mining: the role of machine learning algorithms in enhancing data mining processes.

6. Data Mining for Fraud Detection: using data mining to identify fraudulent activities in banking and finance.


What are some examples of data mining techniques?

Although there are a number of data mining techniques, three are among the most commonly used: decision trees, artificial neural networks, and the nearest-neighbour method. Each of these techniques analyzes data in a different way.
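To give a feel for the nearest-neighbour method, here is a minimal Python sketch that labels a new point with the label of the closest known point. The function name, the Euclidean distance measure, and the sample points are assumptions chosen for the example; practical versions usually look at several neighbours and scale the features first.

```python
import math


def nearest_neighbour(training, query):
    """Return the label of the training point closest to `query`.

    `training` is a list of (point, label) pairs; distance is Euclidean.
    """
    _, label = min(training, key=lambda item: math.dist(item[0], query))
    return label


# Made-up labelled points, for illustration only.
training = [
    ((1.0, 1.0), "small"),
    ((1.2, 0.8), "small"),
    ((8.0, 9.0), "large"),
    ((9.0, 8.5), "large"),
]
prediction = nearest_neighbour(training, (2.0, 1.5))
```

With these points, the query `(2.0, 1.5)` lies closest to `(1.0, 1.0)`, so it is labelled `"small"`.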


What is the purpose of data mining services?

Data mining is the application of computational techniques to obtain useful information from large data sets. Applied to different situations, it can reveal patterns and valuable insights. Examples of data mining applications include fraud detection, customer behaviour analysis, and customer retention.


Functions of a data warehouse?

A data warehouse functions as a repository for all the data held by an organisation. Its main functions are to reduce the cost of data storage, to facilitate data mining, and to make it easier to back up data at an organisational level.


What is directed data mining and undirected data mining in database?

Directed data mining involves using predefined goals or objectives to guide the analysis and modeling of data. In contrast, undirected data mining aims to discover patterns or relationships in data without specifying a particular outcome in advance. Directed data mining is typically used for tasks such as classification and regression, while undirected data mining techniques include clustering and anomaly detection.
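As an example of the undirected case, the Python sketch below groups one-dimensional values into clusters using a basic k-means loop. No target outcome is supplied; the groups come from the data alone. The function name `k_means_1d`, the starting centroids, and the sample values are assumptions made for the example.

```python
def k_means_1d(values, centroids, iterations=10):
    """Cluster 1-D values around the given starting centroids (basic k-means)."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        # Assignment step: put each value in the cluster of its nearest centroid.
        for v in values:
            idx = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[idx].append(v)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Made-up values, for illustration only.
values = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centroids, clusters = k_means_1d(values, centroids=[0.0, 5.0])
```

A directed method such as a classifier would instead need each value to come with a known label to learn from.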


What is web mining?

Web mining is the application of data mining techniques to discover patterns from the Web. According to the analysis target, web mining can be divided into three types: Web usage mining, Web content mining, and Web structure mining.


What are the key differences between supervised and unsupervised data mining techniques and how do they impact the outcomes of the analysis?

Supervised data mining techniques require labeled data for training, while unsupervised techniques do not. Supervised methods are used for prediction and classification tasks, while unsupervised methods are used for clustering and pattern recognition. The choice of technique impacts the accuracy and interpretability of the analysis results.


What is data reduction in terms of data mining?

Data reduction in data mining refers to the process of reducing the volume of data under consideration. This can involve techniques such as feature selection, dimensionality reduction, or sampling to simplify the dataset and make it more manageable for analysis. By reducing the data, analysts can focus on the most relevant information and improve the efficiency of their data mining process.
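Two of these ideas, sampling and a very simple form of feature selection, can be sketched in a few lines of Python. The function names, the variance threshold, and the sample rows are assumptions for the example; dropping constant columns is only one of many possible selection rules.

```python
import random
from statistics import pvariance


def sample_rows(rows, fraction, seed=0):
    """Return a random subset containing roughly `fraction` of the rows."""
    k = max(1, int(len(rows) * fraction))
    return random.Random(seed).sample(rows, k)


def drop_low_variance(rows, threshold=0.0):
    """Remove columns whose variance is not above `threshold`.

    Returns the reduced rows and the indices of the columns that were kept.
    """
    columns = list(zip(*rows))
    keep = [i for i, col in enumerate(columns) if pvariance(col) > threshold]
    return [[row[i] for i in keep] for row in rows], keep


# Made-up rows, for illustration only; the middle column never changes.
rows = [
    [1.0, 5.0, 0.0],
    [2.0, 5.0, 1.0],
    [3.0, 5.0, 0.0],
    [4.0, 5.0, 1.0],
]
reduced, kept_columns = drop_low_variance(rows)
subset = sample_rows(rows, fraction=0.5)
```

The constant middle column carries no information for telling rows apart, so it is dropped, and the sample halves the number of rows to analyze.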


Characteristics of data mining?

Characteristics of data mining


Distinguish between Data mining and text mining?

Data mining looks for patterns in data in general, typically structured records, while text mining applies similar techniques to unstructured text.