Chapter 1. Measurement Assessment and Ev

CHAPTER 1.

MEASUREMENT, ASSESSMENT, AND EVALUATION

Measurement, assessment, and evaluation mean very different things, and yet
most of my students were unable to adequately explain the differences.
The definitions for each are:
Test: A method to determine a student's ability to complete certain tasks or
demonstrate mastery of a skill or knowledge of content. Examples include multiple-choice
tests and weekly spelling tests. While the term is commonly used interchangeably
with assessment, or even evaluation, it can be distinguished by the fact that a test is
one form of assessment.
Assessment: The process of gathering information to monitor progress and make
educational decisions if necessary. As noted in my definition of test, an assessment
may include a test, but also includes methods such as observations, interviews,
behavior monitoring, etc.
Evaluation: Procedures used to determine whether the subject (e.g., a student) meets
preset criteria, such as those for qualifying for special education services. Evaluation uses
assessment (remember that an assessment may be a test) to make a determination
of qualification in accordance with predetermined criteria.
Measurement, beyond its general definition, refers to the set of procedures and the
principles for how to use the procedures in educational tests and assessments. Some
of the basic principles of measurement in educational evaluations would be raw
scores, percentile ranks, derived scores, standard scores, etc.
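The score types named above can be illustrated with a short sketch. This is a minimal example under stated assumptions, not a prescribed procedure: the class scores are invented, the function names are hypothetical, the percentile rank uses one common convention (scores below plus half of the tied scores), and the standard score is taken to be the z-score, with the T-score (mean 50, standard deviation 10) as one example of a derived score.

```python
from statistics import mean, pstdev


def percentile_rank(scores, x):
    """Percent of scores below x, counting half of the scores tied with x.

    This is one common convention; other definitions exist.
    """
    below = sum(1 for s in scores if s < x)
    ties = sum(1 for s in scores if s == x)
    return 100.0 * (below + 0.5 * ties) / len(scores)


def z_score(scores, x):
    """Standard score: distance of x from the group mean in SD units."""
    return (x - mean(scores)) / pstdev(scores)


def t_score(scores, x):
    """Derived score rescaled to a mean of 50 and an SD of 10."""
    return 50 + 10 * z_score(scores, x)


# Hypothetical raw scores for a class of five students.
raw_scores = [10, 20, 30, 40, 50]

for raw in raw_scores:
    print(raw,
          round(percentile_rank(raw_scores, raw), 1),
          round(z_score(raw_scores, raw), 2),
          round(t_score(raw_scores, raw), 1))
```

A raw score on its own says little; the percentile rank and standard scores place it relative to the group, which is what makes scores from different tests comparable.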

Figure 1- Assessment, measurement and testing, adapted from Lynch (2001)
The purpose of this representation is to show the relationship between superordinate
and subordinate concepts and the area of overlap between them. Thus, evaluation
includes measurement when decisions are made on the basis of
information from quantitative methods. And measurement includes testing when
decision-making is done through the use of “a specific sample of behavior” (Bachman
1990). However, the process of decision-making is by no means restricted to the use
of quantitative methods, as the area not covered by the measurement circle shows. Also,
tests are not the only means of measuring individuals’ characteristics, as there are other
types of measurement than tests, for example, measuring an individual’s language
proficiency by living with that person for a long time.
Bachman (1990) has represented the relationship in a somewhat different way. The
goal has been to extend the model to include not only language testing but also
language teaching, language learning and language research domains. Figure 2
depicts this extended view of the relationship among evaluation, measurement and
testing. The areas numbered from 1 to 5 show the various forms of this relationship.
Area 1- Evaluation not involving either tests or measures; for example, the use of
qualitative descriptions of student performance for diagnosing learning
problems.
Area 2- A non-test measure for evaluation; for example, teacher ranking used for
assigning grades.
Area 3- A test used for purposes of evaluation; for example, the use of an
achievement test to determine student progress.
Area 4- Non-evaluative use of tests and measures for research purposes; for example,
the use of a proficiency test as a criterion in second language acquisition
research.
Area 5- Non-evaluative non-test; for example, assigning code numbers to subjects in
second language research according to native language.
After reviewing the conceptualizations and schematic representations proposed by
Bachman (1990) and Lynch (2001), an attempt will be made to more clearly locate
alternative assessment methods in relation to traditional testing methods in order to
help language teachers to make intelligent and insightful choices to assess their
students. Some points are notable about the adapted model. First, unlike in Bachman’s
model, language research purposes are not dealt with in it. This is because
language teachers’ immediate needs do not concern the use of tests or assessment
procedures for research purposes. Rather, they need to enhance their assessment
choices to arrive at a sounder judgment about their students.
Secondly, all assessment procedures, whether traditional or alternative, serve the
function of decision-making and are all subordinated under the term evaluation. Thus,
it would be much better to deal with them as alternatives in assessment (Brown and
Hudson 1998) – available choices for the language teacher – rather than labeling
some of them as normal and others as eccentric. Such a distinction makes the new
developments seem inaccessible only because they are said to be so, hence our use of more
descriptive terms instead of labels which evoke vague feelings. Note that
all alternatives in assessment have to meet their respective requirements for
reliability and validity if teachers are to come to sound judgments (Lynch 2001).
Figure 3- Alternatives in Assessment; decision-making in educational settings
As Figure 3 shows, tests constitute only a small set of options, among a wide range
of other options, for a language teacher to make decisions about students. The
judgment emanating from a test is not necessarily more valid or reliable than the one
deriving from qualitative procedures, since both must meet reliability and validity
criteria to be considered informed decisions. The area circumscribed within
quantitative decision-making is relatively small and represents a specific choice made
by the teacher at a particular time in the course, while the vast area outside, which
covers all non-measurement qualitative assessment procedures, represents the wider
range of procedures and their general nature. This means that the qualitative
approaches, which result in descriptions of individuals, as contrasted with
quantitative approaches, which result in numbers, can go hand in hand with the
teaching and learning experiences in the class, and they can reveal more subtle shades
of students’ proficiency. This in turn can lead to more illuminating insight about future
progress and attainment of goals. However, the options discussed above are not a
matter of either-or (traditional vs. alternative assessment); rather, the language
teacher is free to choose the one alternative (among alternatives in assessment)
which best suits the particular moment in a particular class for particular students.
What is Assessment?
To many teachers (and students), “assessment” simply means giving students
tests and assigning them grades. This conception of assessment is not only limited,
but also limiting (see section below on Assessment versus grading). It fails to take
into account both the utility of assessment and its importance in the teaching/learning
process. In the most general sense, assessment is the process of making a judgment
or measurement of worth of an entity (e.g., person, process, or program).
Educational assessment involves gathering and evaluating data arising from
planned learning activities or programs. This form of assessment is often referred to
as evaluation (see section below on Assessment versus Evaluation).
Learner assessment represents a particular type of educational assessment normally
conducted by teachers and designed to serve several related purposes (Brissenden
and Slater, n.d.). These purposes include:
• motivating and directing learning
• providing feedback to students on their performance
• providing feedback on instruction and/or the curriculum
• ensuring standards of progression are met
Learner assessment is best conceived as a form of two-way communication in
which feedback on the educational process or product is provided to its key
stakeholders (McAlpine, 2002). Specifically, learner assessment involves
communication to teachers (feedback on teaching); students (feedback on learning);
curriculum designers (feedback on curriculum) and administrators (feedback on use
of resources).
For teachers and curriculum/course designers, carefully constructed learner
assessment techniques can help determine whether or not the stated goals are
being achieved. According to Brissenden and Slater (n.d.), classroom assessment can
help teachers answer the following specific questions:
• To what extent are my students achieving the stated goals?
• How should I allocate class time for the current topic?
• Can I teach this topic in a more efficient or effective way?
• What parts of this course/unit are my students finding most valuable?
• How will I change this course/unit the next time I teach it?
• Which grades do I assign my students?
For students, learner assessment answers a different set of questions
(Brissenden and Slater, n.d.):
• Do I know what my instructor thinks is most important?
• Am I mastering the course content?
• How can I improve the way I study in this course?
• What grade am I earning in this course?
Types and Approaches to Assessment

Numerous terms are used to describe different types and approaches to learner
assessment. Although somewhat arbitrary, it is useful to think of these various terms as
representing dichotomous poles (McAlpine, 2002).
Formative <---------------------------------> Summative
Informal <---------------------------------> Formal
Continuous <----------------------------------> Final
Process <---------------------------------> Product
Divergent <---------------------------------> Convergent
Formative vs. Summative Assessment

Formative assessment is designed to assist the learning process by providing feedback
to the learner, which can be used to identify strengths and weaknesses and hence
improve future performance. Formative assessment is most appropriate where the
results are to be used internally by those involved in the learning process (students,
teachers, curriculum developers).
Summative assessment is used primarily to make decisions for grading or to determine
readiness for progression. Typically summative assessment occurs at the end of an
educational activity and is designed to judge the learner’s overall performance. In
addition to providing the basis for grade assignment, summative assessment is used
to communicate students’ abilities to external stakeholders, e.g., administrators and
employers.
Informal vs. Formal Assessment

With informal assessment, the judgments are integrated with other tasks, e.g.,
lecturer feedback on the answer to a question or preceptor feedback provided while
performing a bedside procedure. Informal assessment is most often used to provide
formative feedback. As such, it tends to be less threatening and thus less stressful to
the student. However, informal feedback is prone to high subjectivity or bias.
Formal assessment occurs when students are aware that the task that they are doing
is for assessment purposes, e.g., a written examination or OSCE. Most formal
assessments also are summative in nature and thus tend to have greater motivational
impact and are associated with increased stress. Given their role in decision-making,
formal assessments should be held to higher standards of reliability and validity than
informal assessments.
Continuous vs. Final Assessment

Continuous assessment occurs throughout a learning experience (intermittent is
probably a more realistic term). Continuous assessment is most appropriate when
student and/or instructor knowledge of progress or achievement is needed to
determine the subsequent progression or sequence of activities. Continuous
assessment provides both students and teachers with the information needed to
improve teaching and learning in process. Obviously, continuous assessment involves
increased effort for both teacher and student.
Final (or terminal) assessment is that which takes place only at the end of a learning
activity. It is most appropriate when learning can only be assessed as a complete
whole rather than as constituent parts. Typically, final assessment is used for
summative decision-making. Obviously, due to its timing, final assessment cannot
be used for formative purposes.
Process vs. Product Assessment

Process assessment focuses on the steps or procedures underlying a particular ability
or task, e.g., the cognitive steps in performing a mathematical operation or the
procedure involved in analyzing a blood sample. Because it provides more detailed
information, process assessment is most useful when a student is learning
a new skill and for providing formative feedback to assist in improving performance.
Product assessment focuses on evaluating the result or outcome of a process. Using
the above examples, we would focus on the answer to the math computation or the
accuracy of the blood test results. Product assessment is most appropriate for
documenting proficiency or competency in a given skill, i.e., for summative purposes.
In general, product assessments are easier to create than process assessments,
requiring only a specification of the attributes of the final product.
Divergent vs. Convergent Assessment

Divergent assessments are those for which a range of answers or solutions might be
considered correct. Examples include essay tests, and solutions to the typical types
of indeterminate problems posed in PBL. Divergent assessments tend to be more
authentic and most appropriate in evaluating higher cognitive skills. However, these
types of assessment are often time consuming to evaluate and the resulting
judgments often exhibit poor reliability.
A convergent assessment has only one correct response (per item). Objective test
items are the best example and demonstrate the value of this approach in assessing
knowledge. Obviously, convergent assessments are easier to evaluate or score than
divergent assessments. Unfortunately, this “ease of use” often leads to the
widespread application of this approach even when contrary to good assessment
practices. Specifically, the familiarity and ease with which convergent assessment
tools can be applied leads to two common evaluation fallacies: the Fallacy of False
Quantification (the tendency to focus on what’s easiest to measure) and the Law of
the Instrument Fallacy (molding the evaluation problem to fit the tool).
Assessment versus Evaluation
Authentic Assessment
An assessment is authentic when it involves students in tasks that are
worthwhile, significant, and meaningful. Such assessments look and feel like learning
activities, not traditional tests. They involve higher-order thinking skills and the
coordination of a broad range of knowledge.
Authentic assessments may involve such varied activities as oral interviews,
group problem-solving tasks, or the creation of writing portfolios. But in their design,
structure, and grading.
In the past, educators have recognized three main purposes for assessment:
• Accountability. Are we getting value for the money we spend on education?
• Monitoring. How well are we doing? As individuals, a class, a school, a district,
a state, a nation?
• Placement. Which students should be assigned to special programs,
promoted, remediated, admitted to college?
To this list, authentic assessment advocates add another purpose:
• Modelling. What do we want teachers to teach and how? What do we want
students to learn and how?
