Making Valid Decisions using Valid Data
What is norming and what is standardization?
Standardization is the process of making a test uniform or setting it in line with a specific standard.
Standardization is one way to support the validity of a test, because if the test is equal or uniform on all levels, differences in scores come from the test takers rather than from how the test was given.
Norming is the process of trying the test out on a norming group. The norming group should include all ages, genders, cultures, religions, etc. that make up the population of people who will be taking the test.
This norming group is one way to validate a test. If a student is asked to take a test but is not represented in the norming group, the test will not produce valid results.
Bro. Cloward talked about the little Croatian boy who needed to take the test. He could not take the test because he was not represented in the norming group, and therefore his results would have been invalid.
How does Reliability influence Validity?
Reliability influences validity because a test must produce consistent (in other words reliable), trustworthy results for the test to be valid. We need reliability to achieve validity!
What are the different types of Reliability?
Stability
Stable over time. Results must be reliable over time, measured using test-retest, and informal end isn't stable.
Internal
Consistency inside of itself, reliable across items; measured with methods such as split-half, alternate forms, and coefficient alpha.
Inter-Rater
Reliable across examiners: results are the same no matter who the examiner is. It is measured by inter-rater comparisons.
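The three types above can each be put into a number. Here is a minimal Python sketch, assuming scores are stored in plain lists. The function names (pearson_r, coefficient_alpha, percent_agreement) and all of the scores are made up for illustration and do not come from the Salvia text or from class. Percent agreement is only one simple way to make an inter-rater comparison.

```python
from statistics import mean, pvariance


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def coefficient_alpha(rows):
    """Coefficient alpha; rows holds one list of item scores per student."""
    k = len(rows[0])  # number of items on the test
    item_variances = sum(pvariance(item) for item in zip(*rows))
    total_variance = pvariance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - item_variances / total_variance)


def percent_agreement(rater_a, rater_b):
    """Share of students on whom two examiners gave the same score."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


# Made-up scores for illustration only.
first_testing = [10, 12, 14, 16]
second_testing = [11, 13, 15, 17]
stability = pearson_r(first_testing, second_testing)  # test-retest

item_scores = [[1, 1], [2, 2], [3, 3]]  # 3 students, 2 items
internal = coefficient_alpha(item_scores)

inter_rater = percent_agreement([1, 2, 3, 4], [1, 2, 3, 5])
```

In this toy data every student keeps the same rank on the retest and both items agree perfectly, so the stability and internal values come out at essentially 1, while the two examiners agree on three of four students.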
What is Reliability?
Oxford Dictionary defines Reliability as "The degree to which the result of a measurement, calculation, or specification can be depended on to be accurate."
The Salvia textbook defines Reliability as "To the extent that we can generalize from a particular set of observations that those observations are truthful."
I define reliability as being able to be trusted: trustworthy information.
Factors that affect Validity
What are the rules for Validity?
Use multiple examiners
Use multiple measures; an example of this is RIOT (Review, Interview, Observe, Test)
Use multiple times and places
Reliability: you must have reliability to be valid
Watch for methods of measurement, enabling states and behaviors, administration errors, and norms
What are the 3 C's of Validity?
Criterion
Comparing the test to a standard
Measures each test question to the standards. Think Praxis exams.
Construct
Comparing the test to a theory
The test and the theory behind it. Think IQ tests or personality tests.
Content
Comparing the test to the content taught
Not all tests have to do this. Measures curriculum. Think midterms.
What do I need to know for validity?
Who
Who is performing the test, who is being tested, and what are the disabilities of the person taking the test?
What
What is the content that is being tested?
Where
Where is the test taking place? Will there be any distractions?
When
What is the time and day of the test? Will it be in the morning, afternoon, or evening?
Why
Why is this person being tested and how is this person going to be tested?
What does Reliability have to do with Validity?
"Reliability sets the upper limit of a test's validity, so reliability is a necessary but not sufficient condition for valid measurement. Thus all valid tests are reliable, but not all reliable tests are valid."
I like to think of this as my friends who are reliable but not valid because they don't know everything. On the other hand, a certified doctor is more valid because she is able to provide accurate data for my condition.
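The "upper limit" idea can be shown with a number. This is a minimal sketch, assuming the classical test theory result that the correlation between a test and a criterion cannot exceed the square root of the product of their reliabilities. The function name max_validity and the 0.81 reliability are made up for illustration.

```python
import math


def max_validity(test_reliability, criterion_reliability=1.0):
    """Largest possible test-criterion correlation under classical test theory."""
    return math.sqrt(test_reliability * criterion_reliability)


# A test with reliability 0.81, checked against a perfectly reliable criterion.
ceiling = max_validity(0.81)
```

Here the ceiling works out to about 0.9, and a test with zero reliability has a ceiling of zero. A high ceiling does not guarantee validity; it only says how high validity could possibly go.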
What are the 5 Methods for Validating Test Inferences?
Evidence related to test content
"Test content refers to the themes, wording, and format of items, tasks, or questions on a test, as well as the guidelines for procedures regarding administration and scoring."
Evidence related to internal structure
"Internal structure refers to the number of dimensions or components within a domain that are represented on the test. For example if a test developer theorized that there were several components of intelligence, one would rightly expect that the resulting test would contain several components of intelligence."
Evidence of the relationships between the test and other performances
"The relationship to other performances refers to the accuracy with which test scores predict performance on the same type of test or other similar tests."
Evidence of convergent and discriminant power
"Convergent power refers to a test's ability to produce scores similar to those produced by other tests of the same ability and skills. Discriminant power refers to a test's ability to produce scores different from those produced by other tests of a different ability or skill."
Evidence of the consequences of testing
"Tests are administered with the expectation that some benefit will be realized either to the test taker or to the organization requiring the test."
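Of the five methods above, convergent and discriminant power is the easiest to check with numbers. This is a minimal sketch, assuming that "similar scores" is judged with a Pearson correlation. The test names and all of the scores are made up for illustration.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


# Made-up scores for the same five students on three tests.
new_reading_test = [10, 12, 14, 16, 18]
established_reading_test = [11, 13, 14, 17, 19]  # same skill
math_test = [15, 11, 16, 12, 15]                 # different skill

# Convergent power: should be high (same ability measured twice).
convergent_r = pearson_r(new_reading_test, established_reading_test)

# Discriminant power: should be low (different abilities).
discriminant_r = pearson_r(new_reading_test, math_test)
```

With this toy data the new reading test tracks the established reading test closely and is nearly unrelated to the math test, which is the pattern a valid reading test should show.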
Why is Validity such a big deal anyway?
If a test is not found to be valid, the results from the test are worthless and unreliable.
This is such a big deal because the creation of good tests takes years and thousands of dollars. After all of that time prepping and planning, you better hope the test is valid or all of that effort is wasted!
What are some synonyms for validity?
Alignment, Congruence, "True" as in true to standard
As a data reader, it is important to ask the right question and to define the test's scope and sequence, the data that I gather, and the decision that I am making based on the data set.
The 3 types of Validity are: criterion, construct, content
What if the teacher is playing hero and tries to give a student extra time on a timed test? Does that make the test invalid?
"If an accommodation undermines what I am trying to measure, it makes the test invalid." - Bro. Cloward
Why is validity important in education testing?
Validity is important because if we are testing a student and the test is not valid, the data collected from the test will also not be valid. If the test results are not valid, the test is a bad test and should not be used.
Part of the reason why we want tests to be valid is that test making takes many years and is very expensive. If a test has taken 3 years to produce and cost 100 thousand dollars, we better hope that test is valid and will produce valid, reliable results.
First we should define Validity
Oxford Dictionary defines validity as "The quality of being logically or factually sound, soundness or cogency."
The Salvia text defines validity as "The extent to which a test measures what its authors or users claim it to measure; specifically, test validity concerns the appropriateness of the inferences that can be made on the basis of test results."
The Making Valid Decisions textbook defines validity as "Valid data that is accurate, reliable, robust, and complete. Accurate data answers the right questions and without errors. Reliable data is consistent across time, examiners, and internally. Complete data has no major holes in the data that might make the data lopsided toward one perspective over another. All four criteria must be present to call the data valid."
My definition of valid data is "Data that tests on its intended content area. If a test is intended to test on reading comprehension, it should be testing reading comprehension."
I do want to add one little saying from Brother Cloward. He said we will use the acronym VdVo which stands for Valid data, Valid decision. Essentially, if you know that the data is valid because it comes from a valid source, the decision that you make regarding the test results will be valid as well.
"Validity has everything to do with the congruence to the standard." - Bro. Cloward. In other words, to be valid it must correlate or align with the standard.
This is why, when I am in my practicums, it is such a big deal to align my lesson plan to a state standard: doing so makes sure that my lesson plan meets and aligns with the state standard I chose.