CRISP-DM
Business Understanding
Textual description of the business goals and of why data mining is useful for the specific use cases; the concrete data mining goal is specified in a structured manner
Required project plan
Data mining goal
Data mining type
Data mining success criteria
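The structured goal listed above (goal, type, success criteria) could be recorded as a small typed record. This is only a minimal sketch: the class name `DataMiningGoal`, its field names, and all example values (including the recall threshold) are hypothetical and not taken from the notes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataMiningGoal:
    """Structured record of the data mining goal (hypothetical layout)."""
    goal: str              # what the model should achieve
    mining_type: str       # e.g. classification, regression, clustering
    success_criteria: str  # how success is measured


# Illustrative values only; the threshold is an assumption.
example_goal = DataMiningGoal(
    goal="Predict whether a case belongs to the positive class",
    mining_type="classification (binary target attribute)",
    success_criteria="recall of at least 0.70 on held-out test data",
)

print(example_goal)
```

Writing the goal down in this form makes it straightforward to check the evaluation results against it later.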
Data Understanding
Collecting data from data sources
Exploring and describing it
Statistical analysis and determining attributes and their correlations.
In data mining terms, the task is a classification with a binary target attribute.
Name the concrete data sources and explain where the data was collected from during the data understanding phase
Descriptive statistics
The most important method of data understanding
Checking the data quality
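The data understanding steps above (descriptive statistics, correlations between attributes, and a data quality check) can be sketched with the Python standard library. The toy records, the attribute names `age`, `income`, `target`, and the helper names `describe` and `pearson` are assumptions made for illustration; a real project would more likely use a data-frame library.

```python
import math
import statistics

# Hypothetical toy records; None marks a missing value.
records = [
    {"age": 25, "income": 30000.0, "target": 0},
    {"age": 32, "income": 42000.0, "target": 0},
    {"age": 47, "income": None, "target": 1},
    {"age": 51, "income": 68000.0, "target": 1},
    {"age": 38, "income": 50000.0, "target": 0},
]


def describe(values):
    """Descriptive statistics plus a simple quality indicator (missing count)."""
    present = [v for v in values if v is not None]
    return {
        "count": len(present),
        "missing": len(values) - len(present),
        "mean": statistics.mean(present),
        "stdev": statistics.stdev(present),
        "min": min(present),
        "max": max(present),
    }


def pearson(xs, ys):
    """Pearson correlation over the pairs where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    mx = statistics.mean([x for x, _ in pairs])
    my = statistics.mean([y for _, y in pairs])
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    sx = math.sqrt(sum((x - mx) ** 2 for x, _ in pairs))
    sy = math.sqrt(sum((y - my) ** 2 for _, y in pairs))
    return cov / (sx * sy)


ages = [r["age"] for r in records]
incomes = [r["income"] for r in records]
targets = [r["target"] for r in records]

income_stats = describe(incomes)
age_income_corr = pearson(ages, incomes)
age_target_corr = pearson(ages, targets)

print(income_stats)
print("corr(age, income):", round(age_income_corr, 3))
print("corr(age, target):", round(age_target_corr, 3))
```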
Data Preparation
Inclusion and exclusion criteria
Describing input and output data
Transformation
Data selection
Cleaning data
Poor data quality can be handled in this phase
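As an illustration of the preparation steps above, the following sketch applies exclusion criteria and then cleans the remaining rows. The specific rules (dropping rows with a missing target or an implausible age, filling missing income with the median) are assumptions chosen for the example, as are the attribute names and the helper `clean`.

```python
import statistics

# Hypothetical raw rows with typical quality problems.
raw = [
    {"age": 25, "income": 30000.0, "target": 0},
    {"age": 47, "income": None, "target": 1},        # missing input value
    {"age": -3, "income": 41000.0, "target": 0},     # implausible age
    {"age": 51, "income": 68000.0, "target": None},  # missing target
    {"age": 38, "income": 50000.0, "target": 0},
]


def clean(rows):
    """Apply exclusion criteria, then impute missing income with the median."""
    # Exclusion criteria (assumed): target must be known, age must be plausible.
    kept = [
        dict(r) for r in rows
        if r["target"] is not None and 0 <= r["age"] <= 120
    ]
    # Cleaning (assumed): fill missing income with the median of known values.
    known = [r["income"] for r in kept if r["income"] is not None]
    median_income = statistics.median(known)
    for r in kept:
        if r["income"] is None:
            r["income"] = median_income
    return kept


cleaned = clean(raw)
for row in cleaned:
    print(row)
```

Copying each row with `dict(r)` leaves the raw data untouched, so the effect of the preparation step can still be compared against the original.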
Modeling
Select the modeling technique
Training data
Derived attributes have to be constructed
Building test and training sets
All data mining techniques can be used
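Constructing a derived attribute and building the training and test sets might look like the sketch below. The derived attribute `income_per_year_of_age`, the 25% test fraction, the fixed seed, and the helper names are hypothetical choices for illustration, not part of the notes.

```python
import random

# Hypothetical prepared rows with a unique id each.
rows = [
    {"id": i, "age": 20 + 5 * i, "income": 25000.0 + 4000.0 * i, "target": i % 2}
    for i in range(8)
]


def add_derived(row):
    """Return a copy of the row with one derived attribute added."""
    out = dict(row)
    out["income_per_year_of_age"] = row["income"] / row["age"]
    return out


def train_test_split(data, test_fraction=0.25, seed=42):
    """Shuffle reproducibly and split into (train, test)."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


prepared = [add_derived(r) for r in rows]
train, test = train_test_split(prepared)

print("train ids:", sorted(r["id"] for r in train))
print("test ids: ", sorted(r["id"] for r in test))
```

Keeping the test set separate from the data used for training is what makes the later evaluation metrics meaningful.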
Evaluation
Defining metrics
Use metrics to evaluate the quality of trained models
Visualization of model and metrics
Metrics are also visualized to illustrate the results
Visualizing the trained models means that, e.g., a decision tree is shown in order to explain and evaluate the model
The process should be reviewed in general.
Results are checked against the defined business objectives
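For a binary classification, the evaluation steps above can be sketched as follows: compute metrics from true and predicted labels, then check the result against the success criterion. The label vectors, the choice of metrics, and the recall threshold of 0.70 are assumptions for illustration only.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def evaluate(y_true, y_pred):
    """Return accuracy, precision, recall and F1 as a dictionary."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical test-set labels and model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

scores = evaluate(y_true, y_pred)

# Hypothetical success criterion from the business understanding phase.
REQUIRED_RECALL = 0.70
meets_goal = scores["recall"] >= REQUIRED_RECALL

print(scores)
print("meets success criterion:", meets_goal)
```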
Deployment
If deployment is in scope, the implementation should be described
Final report or a software component
Develop concrete actions based on the results of the data mining models
User guide
Monitoring
Maintenance