
By Diana Muñoz, 5 days ago


CRISP-DM

The process begins with understanding the data by collecting it from various sources and assessing its quality. This phase includes descriptive statistics and statistical analysis to identify key attributes and their correlations.


Deployment

If deployment is in scope, the implementation should be described
Final report or a software component

Develop concrete actions based on the results of the data mining models.

User guide

Maintenance.

Monitoring

Evaluation

The overall process should be reviewed.
Results are checked against the defined business objectives
Visualization of model and metrics
Metrics are also visualized to illustrate the results

Visualizing the trained models means that, e.g., a decision tree is shown in order to explain and evaluate the model

Defining metrics
Use metrics to evaluate the quality of trained models
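The following sketch shows one common way to define such metrics for a classification with a binary target attribute: accuracy, precision, recall, and F1 derived from the confusion matrix. The label lists `y_true` and `y_pred` are made-up illustration data, and the choice of these four metrics is an assumption, not something the map prescribes.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Hypothetical true labels and model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Which metric matters most depends on the business objectives defined earlier, for example recall when missed positives are costly.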

Modeling

Any data mining technique can be used
Select the modeling technique
Training data

Building test and training sets
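Building the two sets can be sketched as a seeded shuffle followed by a cut. The function name `train_test_split`, the 25% test fraction, and the seed are assumptions chosen for illustration; a real project might instead need a stratified or time-based split.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = list(rows)                  # leave the input untouched
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical data set: 20 record ids.
rows = list(range(20))
train, test = train_test_split(rows)
print(len(train), len(test))
```

Keeping the test set untouched during training is what makes the later evaluation an honest estimate of model quality.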

Derived attributes have to be constructed

Data preparation

Inclusion and exclusion criteria
Describing input and output data

Cleaning data

Data quality problems are handled in this step

Data selection

Transformation
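As a rough sketch of the cleaning and transformation steps, the snippet below fills missing values with the attribute median and then rescales each attribute to [0, 1]. The records, the attribute names, and the choice of median imputation and min-max scaling are all assumptions made for illustration; other strategies may fit a given data set better.

```python
import statistics

# Hypothetical raw records; None marks a missing value.
records = [
    {"age": 23, "income": 21000},
    {"age": None, "income": 34000},
    {"age": 41, "income": None},
    {"age": 52, "income": 52000},
]

def impute_median(rows, key):
    """Cleaning: replace missing values of one attribute with its median."""
    observed = [r[key] for r in rows if r[key] is not None]
    fill = statistics.median(observed)
    return [{**r, key: fill if r[key] is None else r[key]} for r in rows]

def min_max_scale(rows, key):
    """Transformation: rescale one attribute to the range [0, 1]."""
    values = [r[key] for r in rows]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid division by zero for constant attributes
    return [{**r, key: (r[key] - lo) / span} for r in rows]

cleaned = impute_median(impute_median(records, "age"), "income")
prepared = min_max_scale(min_max_scale(cleaned, "age"), "income")
print(prepared)
```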

Data understanding

Collecting data from data sources
Checking the data quality
Exploring and describing it

Name the concrete data sources and explain where the data was collected from during the data understanding phase

Descriptive statistics

The most important method of data understanding

Statistical analysis and determining attributes and their correlations.
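A minimal sketch of this step, using only the Python standard library: summary statistics per attribute and the Pearson correlation between two attributes. The attributes `age` and `income` and their values are made-up sample data, and Pearson correlation is just one possible measure of association.

```python
import statistics

# Hypothetical sample: two numeric attributes observed on the same records.
age = [23, 35, 41, 52, 29, 60]
income = [21000, 34000, 40000, 52000, 27000, 61000]

def describe(values):
    """Return basic descriptive statistics for one numeric attribute."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long attributes."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

age_summary = describe(age)
age_income_corr = pearson(age, income)
print(age_summary)
print(round(age_income_corr, 3))
```

A coefficient close to 1 or -1 flags attribute pairs that carry largely the same information, which is useful input for the later data selection.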

In data mining terms, it is a classification task with a binary target attribute.

Business Understanding

Textual description of the business goals and of why data mining is useful for the specific use cases. The concrete data mining goal is specified in a structured manner.
Required project plan

Data mining goal

Data mining success criteria

Data mining type