Definition
While designing a test, it is essential for it to be valid. Validity can be defined in different ways:
“It is the extent to which a test measures what it is supposed to measure.”
“Validity is the subjective judgment made on the basis of experience and empirical indicators.”
Validity is “the agreement between a test score or measure and the quality it is believed to measure” (Kaplan and Saccuzzo 2001).
In simple words, validity refers to the meaningfulness of the test. This meaningfulness works at two levels. At the level of design, the test should be constructed according to the requirements. At the level of context, the test should be used for the specific purpose for which it was designed. We cannot use a mathematics test to test the writing skills of the learner; such a test is used outside its context and will not be valid.
There are various kinds and aspects of validity. They are discussed here according to their relevance and importance.
1. Face Validity
In fact, face validity is not a scientific aspect of validity. It refers to how the test appears to the general public, test-takers and other lay or non-specialist people. Face validity means the test should have certain characteristics, namely those which people expect of such a test. They include proper printing, a governing body, an appropriate manner of test taking, subjective/essay-type questions, etc. Thus a mathematics test will not be considered valid on its face if it has no numerical questions, because numerical questions are what lay people expect from a mathematics test.
A test that lacks such features may be rejected by test takers, and the governing body may face criticism. Face validity counts for little in the opinion of the expert, but a test should look like what a common person thinks a test is. So face validity becomes an issue for the test designer and paper setter.
2. Predictive Validity
Predictive validity refers to the future performance and success of the learner. This aspect of validity ensures that the test provides valid information about the learner's future performance, so it covers testing of all those situations or skills which the learner will encounter or perform in the future. Examples of tests that must have predictive validity are entry tests and selection tests.
Language aptitude tests should have predictive validity because they test present skills for future performance. Proficiency tests also rely on predictive validity. Although diagnostic and achievement tests involve other types of validity, they should also have predictive validity, as they too are linked with the future performance of the learner.
Predictive validity is calculated through statistical correlations. The validity coefficient is usually calculated by comparing success in the test with success on the job. In this way the validity of the test is checked and improved for future tests. A coefficient value of 0.6 is considered high, which indicates that not all tests have predictive validity.
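As a rough illustration, the validity coefficient described above can be computed as a Pearson correlation between test scores and a later criterion measure. The scores below are invented for the sketch, not real data:

```python
from math import sqrt

def validity_coefficient(test_scores, criterion_scores):
    """Pearson correlation between test scores and a later
    criterion measure (e.g. job-performance ratings)."""
    n = len(test_scores)
    mean_x = sum(test_scores) / n
    mean_y = sum(criterion_scores) / n
    cov = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(test_scores, criterion_scores))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in test_scores))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in criterion_scores))
    return cov / (sd_x * sd_y)

# Hypothetical data: entry-test scores and later job ratings.
entry_test = [55, 62, 70, 48, 80, 66]
job_rating = [3.1, 3.4, 4.0, 2.8, 4.5, 3.6]
r = validity_coefficient(entry_test, job_rating)
```

A value of `r` near 1.0 would suggest the test predicts the criterion well; a value well below 0.6 would suggest weak predictive validity, as noted above.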
3. Concurrent Validity
There is no major difference between predictive validity and concurrent validity except time. Predictive validity is related to the future, while concurrent validity is related to the present: when the future becomes the present, predictive validity becomes concurrent validity. Instead of comparing a test with future or job performance, we compare two tests, usually taken simultaneously, one typically written and the other oral. This is used to limit criterion-related errors. The most suitable situation is the comparison of a new test with an already established test or criterion to find its validity and meaningfulness.
So a test having concurrent validity will show its validity in a given field. Concurrent validity is a statistical measure which requires a quantifiable criterion. Although not all criteria are quantifiable, the statistical approach assumes that they are. The coefficient of validity is used to compare the two tests.
4. Content Validity
This is the aspect that appeals to the expert. It concerns the extent to which the test represents the content from which it is constructed.
Content validity is required in achievement tests. They represent a content/syllabus and should be constituted from that given syllabus and content. Similarly, diagnostic tests should also have content validity, because they seek out certain deficiencies of the learner within a given set of skills or a syllabus. The chief examiner, advisor, etc. can check that the test represents the content which it is going to test.
Teaching materials should have their own validation, i.e. predictive and construct validity; otherwise the content validity of the test will not be fruitful. So the teaching of speaking skills should involve materials and examples from native speakers which teach the student appropriate speaking/spoken skills.
Content validity can also be employed in proficiency tests. Learners will have to perform in certain situations, so the test can be a representative sample of those skills. The areas to be tested can be specified before the exam, just like the syllabus of an achievement test, although this involves guesswork compared with tests for which we have syllabuses.
5. Construct Validity
A construct is a theory, usually a psychological one, which explains a certain mental process, say learning. It will thus say how the learner learns the language, what factors are involved, what the nature of language is, and so on.
On the basis of such a theory, a test is constructed to evaluate certain factors/indicators of the learner's language in order to measure his ability. Thus the test will have construct validity if it represents the aspects of the particular theory on which it is based.
A point should be kept in mind here: a construct may be wrong. A test having construct validity will then become meaningless because of the false theory on which it is constituted; the problem lies with the construct, not with the test. Materials and syllabuses should also be evaluated on the basis of construct validity, to know whether they represent and teach the language according to the theory of language and language learning.
Questions relating to each type of validity evidence are presented below.

Content
1. Do the evaluation criteria address any extraneous content?
2. Do the evaluation criteria of the test address all aspects of the intended content?
3. Is there any content addressed in the task that should be evaluated through the test, but is not?

Construct
1. Are all of the important facets of the intended construct evaluated through the scoring criteria?
2. Are any of the evaluation criteria irrelevant to the construct of interest?

Criterion (Predictive + Concurrent)
1. How do the scoring criteria reflect competencies that would suggest success on future or related performances?
2. What are the important components of the future or related performance that may be evaluated through the use of the assessment instrument?
3. How do the scoring criteria measure the important components of the future or related performance?
4. Are there any facets of the future or related performance that are not reflected in the scoring criteria?
Validity is a pre-test concern. We should develop tests in such a manner that they are valid and meaningful. In this regard a three-step approach can be helpful.
1. First, clearly state the purpose and objectives of the assessment.
2. Next, develop scoring criteria that address each objective.
3. If one of the objectives is not represented in the score categories, then the rubric is unlikely to provide the evidence necessary to examine the given objective. If some of the scoring criteria are not related to the objectives, then, once again, the appropriateness of the assessment and the test is in question.
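The three-step check above can be sketched as a simple coverage test. The objective and criterion names here are invented for illustration:

```python
# Hypothetical assessment objectives and, for each scoring criterion,
# the objectives it claims to address.
objectives = {"grammar", "vocabulary", "coherence"}
criteria = {
    "criterion_1": {"grammar"},
    "criterion_2": {"vocabulary", "spelling"},
}

addressed = set().union(*criteria.values())
missing = objectives - addressed     # objectives no criterion covers (step 3, first case)
extraneous = addressed - objectives  # criterion content tied to no objective (second case)

print("missing:", missing)
print("extraneous:", extraneous)
```

Here the check flags `coherence` as an objective the rubric never scores, and `spelling` as scoring content unrelated to any stated objective; either finding puts the validity of the assessment in question.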
Sources of Invalidity
Validity can suffer due to various factors, some of which are discussed below.
1. Lack of reliability indicates that the test is not valid, although the contrary may also be true: a test may be reliable and consistent in its results yet meaningless and irrelevant in a certain context. Reliability should be checked in order to prevent invalidity.
2. Content and construct under-representation is a situation in which important aspects of the content and construct are not included in the test. The results are then unlikely to reveal the student's true abilities within the construct or content that the test was meant to measure.
3. Content and construct over-representation is a situation in which aspects of the content and construct are represented in the test in excess, that is, irrelevant parts are also included in the test. This can be divided further into two situations:
1. In one, the over-representation makes the test easier: the learner or test-taker can pick up clues from the test to solve some problems, and this guessing increases invalidity.
2. In the other, the test becomes more difficult and the student cannot score appropriately. This is not the student's fault; the construction of the test affects him and he cannot perform well.