CRISP-DM

Business Understanding

Textual description of the business goals and of why data mining is useful for the specific use cases; the concrete data mining goal is specified in a structured manner

Required project plan

Data mining goal

Data mining type

Data mining success criteria

Data understanding

Collecting data from data sources

Exploring and describing it

Statistical analysis to determine the attributes and their correlations

In data mining terms, the task is a classification with a binary target attribute

Name the concrete data sources and explain where the data was collected from during the data understanding phase

Descriptive statistics

The most important method of data understanding

Checking the data quality
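
The data understanding steps above (descriptive statistics, correlations, a first data quality check) can be sketched roughly as follows. This is only a minimal illustration using the Python standard library; the toy records, the attribute names (age, income, churned) and the helper functions describe and pearson are assumptions made for the example and do not come from a concrete project.

```python
import statistics

# Hypothetical toy records: (age, income, churned), with a binary target.
rows = [
    (25, 30000.0, 1),
    (40, 52000.0, 0),
    (31, None, 1),      # missing income: a data quality issue
    (58, 61000.0, 0),
    (46, 58000.0, 0),
]

def describe(values):
    """Return count, missing count, mean, median and standard deviation."""
    present = [v for v in values if v is not None]
    return {
        "count": len(present),
        "missing": len(values) - len(present),
        "mean": statistics.mean(present),
        "median": statistics.median(present),
        "stdev": statistics.stdev(present),
    }

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

ages = [r[0] for r in rows]
incomes = [r[1] for r in rows]
target = [r[2] for r in rows]

age_summary = describe(ages)          # descriptive statistics per attribute
income_summary = describe(incomes)    # also reports the missing value
age_target_corr = pearson(ages, target)  # correlation with the binary target
```

In this toy data the missing count flags the quality problem, and the sign of the correlation hints at how an attribute relates to the target.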

Data preparation

Inclusion and exclusion criteria

Describing input and output data

Transformation

Data selection

Cleaning data

Poor data quality can be handled in this phase
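
As one possible illustration of the preparation steps above, the following sketch applies exclusion criteria and then fills missing values. The records, the plausibility range for age and the choice of median imputation are assumptions for the example; other cleaning strategies may fit a real project better.

```python
import statistics

# Hypothetical raw records; None marks a missing value.
raw = [
    {"age": 25, "income": 30000.0, "churned": 1},
    {"age": 40, "income": None, "churned": 0},
    {"age": -3, "income": 41000.0, "churned": 0},     # implausible age
    {"age": 58, "income": 61000.0, "churned": None},  # missing target
    {"age": 46, "income": 58000.0, "churned": 0},
]

def clean(records):
    """Apply exclusion criteria, then impute missing income with the median."""
    # Exclusion: drop rows without a target or with an implausible age.
    kept = [r for r in records
            if r["churned"] is not None and 0 <= r["age"] <= 120]
    known = [r["income"] for r in kept if r["income"] is not None]
    median_income = statistics.median(known)
    # Transformation: fill gaps; rows are copied so the input stays unchanged.
    return [
        dict(r, income=r["income"] if r["income"] is not None else median_income)
        for r in kept
    ]

prepared = clean(raw)
```

Documenting which rows were excluded and why keeps the inclusion and exclusion criteria traceable.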

Modeling

Select the modeling technique

Training data

Derived attributes have to be constructed

Building test and training sets

All data mining techniques can be used
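
Building test and training sets can be sketched as a simple shuffled holdout split. The function name train_test_split, the 25 % test fraction and the fixed seed are assumptions chosen for this example; a real project might prefer a stratified split or cross-validation.

```python
import random

def train_test_split(records, test_fraction=0.25, seed=42):
    """Shuffle a copy of the records and split off a held-out test set."""
    shuffled = list(records)
    # A seeded generator makes the split reproducible.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical records with a binary label.
records = [{"id": i, "label": i % 2} for i in range(20)]
train, test = train_test_split(records)
```

The model is then trained only on train, while test is kept aside for the evaluation phase.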

Evaluation

Defining metrics

Use metrics to evaluate the quality of trained models

Visualization of model and metrics

Metrics are also visualized to illustrate the results

Visualizing the trained models means that, e.g., a decision tree is shown to explain and evaluate the models

The process should be reviewed in general.

Results are checked against the defined business objectives
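
For a classification with a binary target, the metrics mentioned in this phase are often derived from a confusion matrix. The sketch below computes accuracy, precision, recall and F1 from scratch; the label vectors and the recall threshold standing in for a business objective are hypothetical values for illustration only.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for a binary classifier."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Hypothetical test-set labels and model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
scores = binary_metrics(y_true, y_pred)

# Hypothetical success criterion from the business understanding phase.
MIN_RECALL = 0.7
meets_objective = scores["recall"] >= MIN_RECALL
```

Comparing the scores with a threshold fixed during business understanding is one way to check results against the defined objectives.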

Deployment

If deployment is in scope, the implementation should be described

Final report or a software component

Develop concrete actions based on the results of the data mining models

User guide

Monitoring

Maintenance