"Making valid decisions using valid data"
Assessment Decisions in Context
Garbage in, garbage out.
The data you collect needs to be valid so that you have valid information to make a valid decision.
Valid data is accurate, reliable, robust, and complete
Validity relates to reliability
Reliability means a test is consistent in the way it is graded and in the material it covers.
How do you gather data in valid ways?
Scope of assessment, valid instruments, gather data, make a decision.
IDEA
6 Principles of IDEA
Free Appropriate Public Education (FAPE)
Non-Discriminatory Evaluation
Zero reject
Least Restrictive Environment (LRE)
Procedural Safeguards
Parental Participation
The Metrics of Assessment
Central Tendency
Mean
The mean, or average, helps you see how the class is performing overall. It is not the best measure of how all of your students are doing because you could have an average score that not a single one of your students received.
x̄ = (Σx)/n
Median
The middle data point when the scores are placed in order.
Mode
The most common test score. This helps you see where scores cluster in your class because it shows the scores that occur most often in your data set.
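A minimal sketch of the three measures above, using Python's standard `statistics` module. The score list is made up for illustration; it is chosen so that the mean is a score no student actually received.

```python
from statistics import mean, median, multimode

# Hypothetical class scores, invented for this example.
scores = [60, 70, 70, 85, 95]

class_mean = mean(scores)        # sum of the scores divided by how many there are
class_median = median(scores)    # middle score once the list is sorted
class_modes = multimode(scores)  # list of the most frequent score(s)

print("mean:", class_mean)
print("median:", class_median)
print("mode(s):", class_modes)
print("did any student score the mean?", class_mean in scores)
```

`multimode` returns a list because a data set can have more than one most common score.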
Measures of Dispersion
Range
Range is the difference between the extreme scores (highest–lowest).
Variance
"It takes each score difference from the mean (x-mean). But the total of all of the differences added together would be zero, because some would be negative and some positive and they would cancel each other out. Therefore we square them, which makes them all positive and then add them together. In the equation below, we take the first value (x1) and subtract the mean and square the difference. Then the same with the second value, continuing to the nth value. Then we divide by n."
Variance = [(x1 - x̄)^2 + (x2 - x̄)^2 + ... + (xn - x̄)^2] / n
Standard Deviation
The SD is simply the square root of the variance. In the variance you had to square the differences so the negative and positive ones would not cancel out, which leaves the result in squared units. Taking the square root puts the measure back into the same units as the original scores.
A score can be described by how many SDs it falls above or below the mean.
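The steps for range, variance, and SD can be worked through by hand in a few lines of Python. This is only a sketch with made-up scores. It divides by n, as the notes describe (the population variance); some textbooks divide by n - 1 when the scores are a sample.

```python
from math import sqrt
from statistics import mean, pstdev, pvariance

# Hypothetical class scores, invented for this example.
scores = [60, 70, 70, 85, 95]

score_range = max(scores) - min(scores)  # highest minus lowest

m = mean(scores)
differences = [x - m for x in scores]     # these always add up to zero
squared = [d ** 2 for d in differences]   # squaring removes the negative signs
variance = sum(squared) / len(scores)     # divide by n
sd = sqrt(variance)                       # back to the original score units

print("range:", score_range)
print("sum of differences:", sum(differences))
print("variance:", variance)
print("SD:", round(sd, 2))

# Cross-check against the standard library's population versions.
print(pvariance(scores), round(pstdev(scores), 2))
```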
Frequency Distributions
Correlation: when two or more things have some connection with each other. They do not have to cause each other, but they can.
Causation: when one thing directly makes another happen. Things that cause each other will correlate, but a correlation by itself does not show causation.
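One common way to put a number on a correlation is the Pearson correlation coefficient r, which runs from -1 to 1. The notes do not name a specific coefficient, so this is an assumed illustration, and the data (hours studied and test scores) are made up.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of the products of each pair's distances from their means.
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / sqrt(ss_x * ss_y)

# Hypothetical data, invented for this example.
hours_studied = [1, 2, 3, 4, 5]
test_scores = [55, 60, 70, 75, 90]

r = pearson_r(hours_studied, test_scores)
print("r =", round(r, 2))
```

A value of r close to 1 says the two lists rise together. It does not say that one causes the other.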
Standard Error of Measurement
SEM takes into account the error that will inevitably happen when it comes to assessments. There will always be some error, whether it is random or a measurement error, and you have to take it into account when interpreting a score.
SEM = SD × √(1 - r), where r is the reliability of the test
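A small worked example of the SEM formula, SEM = SD × √(1 - r). The SD of 15, the reliability of 0.91, and the observed score of 100 are assumed values picked for easy arithmetic, not figures from any particular test.

```python
from math import sqrt

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - r), where r is the test's reliability (0 to 1)."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    return sd * sqrt(1 - reliability)

# Assumed values for illustration only.
sem = standard_error_of_measurement(sd=15, reliability=0.91)

observed_score = 100
low = observed_score - sem   # one SEM below the observed score
high = observed_score + sem  # one SEM above the observed score

print("SEM:", round(sem, 2))
print("score band:", round(low, 1), "to", round(high, 1))
```

The more reliable the test (r closer to 1), the smaller the SEM and the narrower the band around the observed score.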
Developing norm-referenced tests
Standardization: The process of making something the same by predetermined rules and expectations.
Norm Referenced Tests
How to create them: Define the purpose, develop a pool of test items, pilot test the items, revise the items and testing procedures as needed, administer the test to a norm group.
Formal assessments
Sources and types of Assessment Data
RIOT
Records
Anecdotal records
Interviews
Interview parents, teachers, the child, etc.
Help to confirm findings from other sources.
Observations
Systematic and Nonsystematic
Tests
Validity and Reliability
Informal
interviews, teacher made tests, curriculum based assessments, and portfolios
Formal
Norm-referenced tests, such as the WISC-IV and K-TEA
Formal tests are more standardized.
Purpose of tests: Evaluate learning, Grading, Evaluating if objectives are met, and Evaluating instruction.
Scores used in testing and other measures
Scores are just a quantitative representation of a performance that could also be described qualitatively.
Raw Scores: the number of correct answers on a test.
Other scores that we use come from the raw score. The raw score is the base from which we find new ways to interpret the data.
Derived scores (often called standard scores too) are converted raw scores that have been adjusted and are reported relative to the populations they were taken from.
Scores can be compared when converted (standard score, T score, scaled score, V-scaled score, and Z score).
Standard Score: Mean: 100 SD:15
T Score: Mean: 50 SD: 10
Scaled Score: Mean: 10 SD: 3
V-scaled Score: Mean: 15 SD: 3
Z Score: Mean: 0 SD: 1
Converting Scores
Z Score: (x - Mean) / SD
New Score: (Z × SD) + Mean
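The two conversion formulas can be chained to move a score from one scale to another: convert to a Z score first, then to the new scale. A minimal sketch using the means and SDs listed above; the dictionary and function names are made up for this example.

```python
# (mean, SD) for each score type, taken from the list in these notes.
SCALES = {
    "standard": (100, 15),
    "t": (50, 10),
    "scaled": (10, 3),
    "v_scaled": (15, 3),
    "z": (0, 1),
}

def to_z(score, mean, sd):
    """Z Score = (x - Mean) / SD"""
    return (score - mean) / sd

def from_z(z, mean, sd):
    """New Score = (Z * SD) + Mean"""
    return z * sd + mean

def convert(score, source, target):
    """Convert a score between two of the scales in SCALES."""
    z = to_z(score, *SCALES[source])
    return from_z(z, *SCALES[target])

print(convert(115, "standard", "t"))      # standard score 115 as a T score
print(convert(7, "scaled", "standard"))   # scaled score 7 as a standard score
print(convert(65, "t", "z"))              # T score 65 as a Z score
```

A standard score of 115 is one SD above the mean, so it lands one SD above the mean on every other scale as well.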
The Standard Curve
In a standard curve you can see that there are 8 sections. The first section is less than 1%. The second is 2%. The third is 14%. The fourth and fifth are each 34%. The sixth is 14%. The seventh is 2%, and the eighth is less than 1%.
If a student's test scores fall in either of the 34% parts on the standard curve then they are considered average.
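The percentages for the eight sections can be checked with Python's `statistics.NormalDist`, cutting the curve at -3, -2, -1, 0, 1, 2, and 3 SDs from the mean. This is a sketch only; the `is_average` helper is a made-up name that applies the rule above (within one SD of the mean on either side).

```python
from statistics import NormalDist

curve = NormalDist(mu=0, sigma=1)  # standard curve in Z-score units

# Cut points between the eight sections, in SDs from the mean.
cuts = [-3, -2, -1, 0, 1, 2, 3]

# Cumulative area at each cut, with 0 and 1 added for the two open tails.
edges = [0.0] + [curve.cdf(z) for z in cuts] + [1.0]

# Area of each section is the gap between neighboring cumulative areas.
areas = [upper - lower for lower, upper in zip(edges, edges[1:])]
percents = [round(a * 100, 1) for a in areas]
print(percents)

def is_average(z):
    """True when a Z score falls in one of the two middle 34% sections."""
    return -1 <= z <= 1

print(is_average(0.5), is_average(-1.8))
```

The rounded values line up with the less than 1%, 2%, 14%, 34%, 34%, 14%, 2%, less than 1% pattern.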
Validity
Content
Measures what content/curriculum was taught. These would be tests that are given within the class. For example, after a math unit you are given a math test.
Criterion
These are tests that are related to the state standards. The test items should be correlated to the state standards. For example, the Praxis.
Construct
This is the theory behind the test. There is no certain outcome that one must obtain. An example of this test would be an IQ test.
Factors affecting validity are...
Test needs to be reliable
Is the test actually measuring the desired skill?
A test cannot be more valid than it is reliable.
Method of measurement
Enabling states and behaviors (sensory abilities, language, prerequisite skills, motivation)
Opportunity to acquire skills/behaviors
Norms
Differential item effectiveness
Rules to give valid tests...
Use multiple examiners
Multiple measures
Records, interviews, observations, tests
Multiple times and places