CRISP-DM
Deployment
If deployment is in scope, the implementation should be described
Final report or a software component
Develop concrete actions based on the results of the data mining models.
User guide
Maintenance
Monitoring
Evaluation
The overall process should be reviewed.
Results are checked against the defined business objectives
Visualization of model and metrics
Metrics are also visualized to illustrate the results
Visualizing the trained models means that, e.g., a decision tree is shown to explain and evaluate the model
Defining metrics
Use metrics to evaluate the quality of trained models
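As an illustration of the metrics item above, the following is a minimal sketch of common quality metrics for a binary classifier, computed from the confusion-matrix counts. The names binary_metrics and BinaryMetrics, the choice of accuracy, precision, recall and F1, and the convention of returning 0.0 when a denominator is zero are assumptions made for this example; the outline does not prescribe specific metrics.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    """Container for the example metrics (hypothetical name)."""
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels.

    Assumption: a metric whose denominator is zero is reported as 0.0.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return BinaryMetrics(accuracy, precision, recall, f1)


if __name__ == "__main__":
    print(binary_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0]))
```

Which metric matters most depends on the data mining success criteria defined during business understanding; for example, recall is usually weighted more heavily when missed positives are costly.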
Modeling
All data mining techniques can be used
Select the modeling technique
Training data
Building test and training sets
Derived attributes have to be constructed
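Building test and training sets, as listed above, can be sketched as a seeded random split. This is only one possible approach under stated assumptions: train_test_split here is a hypothetical helper written for this example (not the scikit-learn function of the same name), and the 25 % test fraction and the fixed seed are illustrative defaults, not values taken from the outline.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=42):
    """Split rows into (train, test) using a reproducible shuffle.

    Assumptions: rows is any sequence of records; at least one record
    is always placed in the test set.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")

    shuffled = list(rows)                  # copy so the input is not mutated
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility

    n_test = max(1, round(len(shuffled) * test_fraction))
    test = shuffled[:n_test]
    train = shuffled[n_test:]
    return train, test


if __name__ == "__main__":
    train, test = train_test_split(list(range(20)))
    print(len(train), len(test))
```

Fixing the seed keeps the split reproducible between runs, which makes later evaluation results comparable. For a strongly imbalanced target attribute, a stratified split would be a reasonable alternative.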
Data preparation
Inclusion and exclusion criteria
Describing input and output data
Cleaning data
Data quality problems can be handled here
Data selection
Transformation
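The cleaning and transformation steps above can be illustrated with two small helpers. This is a sketch under assumptions, not a prescribed procedure: missing values are assumed to be encoded as None, median imputation and min-max scaling are chosen only as common examples, and the function names impute_median and min_max_scale are hypothetical.

```python
import statistics


def impute_median(values):
    """Replace missing entries (None) with the median of the present ones."""
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("cannot impute a column with no observed values")
    median = statistics.median(present)
    return [median if v is None else v for v in values]


def min_max_scale(values):
    """Linearly rescale values to the range [0, 1].

    Assumption: a constant column is mapped to all zeros.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


if __name__ == "__main__":
    cleaned = impute_median([10, None, 30])
    print(cleaned, min_max_scale(cleaned))
```

In a real project, the imputation value and the scaling bounds would normally be derived from the training set only and then reused for the test set, so that no information leaks from the test data.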
Data understanding
Collecting data from data sources
Checking the data quality
Exploring and describing the data
Name the concrete data sources and explain where the data was collected from during the data understanding phase
Descriptive statistics
The most important method of data understanding
Statistical analysis to determine the attributes and their correlations
In data mining terms, it is a classification with a binary target attribute
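The descriptive statistics and correlation analysis named above can be sketched with the Python standard library. The helper names describe and pearson are assumptions for this example, and the Pearson coefficient is used only as one common correlation measure; the outline does not name a specific one.

```python
import math
import statistics


def describe(values):
    """Return basic descriptive statistics for a numeric attribute."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


def pearson(xs, ys):
    """Pearson correlation coefficient between two numeric attributes."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with >= 2 values")
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant attribute")
    return cov / (spread_x * spread_y)


if __name__ == "__main__":
    print(describe([1, 2, 3, 4, 5]))
    print(pearson([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))
```

When the target attribute is binary and encoded as 0/1, applying pearson to a numeric attribute and the target yields the point-biserial correlation, which gives a first impression of how strongly that attribute relates to the class.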
Business understanding
Textual description of the business goals and of why data mining is useful for the specific use cases; the concrete data mining goal is specified in a structured manner
Required project plan
Data mining goal
Data mining success criteria
Data mining type