ESOL Online

Professional Reading

A Glossary of Assessment Terminology

By Dorothy Brown, Margaret Kitchen and Breda Matthews

Accreditation

A process for ensuring that providers have the capacity (including management of quality) to assess unit standards.
Accountability
The responsibility we have as teachers to be able to explain the rationale behind our assessment techniques and results to students, parents and institutions.
Achievement
How well students achieve against a set of objectives.
Achievement objectives
Broad statements of what students are expected to learn, showing progression and continuity in learning for the years of compulsory schooling and beyond.
Administration of test
The complete process of giving a test including giving out and collection of papers and invigilation or conducting an oral interview.
Aptitude
The extent to which an individual possesses specific language-learning ability. Research is unclear about the existence of a general aptitude variable.
Assessment
Measuring our students' performance in any one of many different ways, diagnosing the problems and measuring the progress students make, collecting and judging evidence of achievement or competence.
Benchmark
Standard against which something is judged.
Bias
Bias is a prejudice or lack of fairness towards groups or individuals. It can occur in a variety of forms, for example:
  • gender bias, where an assessment topic or format favours one gender;
  • ethnic bias, where a topic or format favours or discriminates against an ethnic group;
  • individual bias, where a topic or format favours individuals who have a special interest in or background knowledge of a topic which carries heavy weight in the assessment, or where an individual lacks such interest or knowledge;
  • linguistic bias.
Competence (in language)
The knowledge of the language system.
Competency based assessment
Assessment, based upon defined criteria, in which a particular level of performance is set which candidates must reach if they are to be judged as "competent".
Context (also conditions and special notes)
In a Unit Standard these define the conditions in which the tasks should be performed, e.g., use of dictionaries, time allowance, etc.
Correction code
A code often used by teachers to signal student errors in written work.
Counselling
Individual meetings between students and teachers to discuss assessment results, usually in relation to self-assessment.
Criterion/criteria
Descriptions of what our students should be able to do with the language (see Clemson & Clemson, 1991).
Criterion-referencing
Using descriptions of what our students should be able to do with the language in order to determine the pass score in a test or informal assessment.
Curriculum levels (in the national curriculum statement)
The curriculum levels refer to the levels of achievement objectives (usually eight) in the national curriculum statements.
Descriptor
A definition of a level of performance in a level or band scale.
Diagnostic questionnaire
A learner test used to find out what our students' problem areas with the language are. It is usually given at the beginning of a course.
Diagnostic test/assessment
A type of test used to find out what our students' problem areas with the language are. Many progress tests have a diagnostic element (Clemson & Clemson, 1991).
Discrete item test format
A test format in which there are usually many items requiring short answers. Items can test grammar, structure, vocabulary, semantics or phonology.
Elements
These describe specific learning outcomes. They are the competencies/achievements which must be demonstrated for the achievement of unit standard credits. In a unit standard a series of statements for each performance criterion provides answers or identifies the performance expected of students.
Entry/placement test
A test which will indicate at which level a learner will learn most effectively in the case of different levels or streams.
Evaluation
Consideration of all the factors that influence the learning process such as syllabus objectives, course design, materials, methodology, teachers and assessment.
Evidence
In a unit standard this means the measure that performance will be tested against. Typically it is a series of statements for each performance criterion that provides answers or identifies the performance expected of students.
Examination
A formal summative or proficiency test usually administered by an institution. It is often associated with tests supplied by examination boards.
Exemplars
These are samples of student work, often annotated, that illustrate levels of achievement. These could be examples of written work, or designed tasks, or recordings of oral or musical work. Care needs to be taken to ensure that they are used as examples and not as goals in themselves.
Formal assessment
Tests given under conditions which ensure the assessment of individual performance in any given area.
Formats
Test formats are the tasks and activities which students are required to do (e.g. multiple choice).
Formative assessment
A type of assessment which feeds back into learning and gives the learner information on his/her progress throughout a course thus helping him/her to be a more efficient learner.
Grade
A way of expressing overall results using a number or letter.
High stakes assessment
Important decisions are made on the basis of this assessment.
Holistic rating scale
A scale in which different activities are included over several bands to produce a multi-activity scale.
Impression mark
A number or letter given by a teacher to students' work as a result of informal observation without using a rating scale.
Informal assessment
A system for observation and collection of data about students' performance under normal classroom conditions.
Integrative test format
A type of test format which involves the use of more than one skill by students and which is open-ended involving communication and interaction.
Interlanguage
The type of language produced by second language learners in the process of learning a new language.
Internal assessment
The process of making a judgment (or a series of judgments) about a student's performance on specific school-based tasks or tests which are integrated into the learning programme, as opposed to assessment on the basis of performance in an external examination.
Inter-rater reliability
A way of describing to what extent different raters or teachers assess performance in a test in the same way.
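The glossary does not name a particular statistic for inter-rater reliability. As a rough sketch, two commonly used ones are simple percent agreement and Cohen's kappa, which adjusts agreement for chance. The grades below and the names `rater_a` and `rater_b` are invented for illustration.

```python
from collections import Counter

# Hypothetical grades given by two teachers to the same ten scripts.
rater_a = ["A", "B", "B", "C", "A", "B", "C", "C", "B", "A"]
rater_b = ["A", "B", "C", "C", "A", "B", "C", "B", "B", "A"]

n = len(rater_a)

# Observed agreement: proportion of scripts given the same grade by both raters.
agreement = sum(a == b for a, b in zip(rater_a, rater_b)) / n

# Agreement expected by chance, from how often each rater uses each grade.
count_a, count_b = Counter(rater_a), Counter(rater_b)
expected = sum(count_a[g] * count_b[g] for g in count_a) / (n * n)

# Cohen's kappa: agreement beyond chance, scaled so that 1.0 is perfect agreement.
kappa = (agreement - expected) / (1 - expected)

print(f"agreement = {agreement:.2f}, kappa = {kappa:.2f}")
```

A kappa noticeably lower than the raw agreement suggests that some of the agreement could have arisen by chance.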
Interlocutor
A teacher or other trained person who during a test acts as the person with whom the student or candidate interacts in order to complete a speaking task.
Intra-rater reliability
The consistency of one rater e.g., marking before and after lunch.
Item
An individual question in a test which requires the student or candidate to produce an answer.
Item bank
A relatively large accessible collection of test items with known properties used for the construction of equivalent or alternate forms of frequently administered standardised tests (see Davies, A. et al. 1999).
Judgment statements
In unit standards, examples of the level of performance that must be reached to obtain each element. They usually consist of a series of statements for each performance criterion that define the standard to be reached.
Learner training
Ways of helping learners to find strategies to learn more effectively; these strategies should suit their individual learning style.
Learner diary-language development book/file
Record of students' learning experiences containing what they have done in class, the progress they have made and any problems they have.
Learning outcomes
See elements.
Learning strategies
Ways of organising learning which help students to learn more effectively.
Learning styles/preferences
Different ways of learning which learners employ to achieve their objectives.
Linguistic factors
Aspects affecting assessment which are strictly to do with the language in all four skills.
Lockstep
Situation in which all students in a class are engaged in the same activity at the same time, all progressing through a task at the same rate.
Mean
The average score.
Median
The middle score when a set of scores is arranged in order.
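As an illustration of how the mean and the median can differ, here is a minimal sketch using Python's standard `statistics` module. The scores are invented; one unusually high score pulls the mean above the median.

```python
from statistics import mean, median

# Hypothetical raw scores for a class of seven students (illustrative only).
scores = [12, 15, 15, 18, 20, 22, 38]

class_mean = mean(scores)      # sum of the scores divided by the number of scores
class_median = median(scores)  # middle score once the scores are sorted

print(class_mean, class_median)
```

Here the mean is 20 and the median is 18, so the "typical" student is better described by the median.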
Moderation
A quality assurance process that ensures providers make fair, valid and consistent decisions.
Monitoring
Observing and making of informal assessments of what is happening in the classroom during learning activities. The process of continually evaluating students' performance or checking that the aims of particular instructional activities have been achieved.
Negotiated syllabus
Students' needs and learning preferences are taken into account during a course; these needs will have been discussed by teachers and students together.
Non-linguistic factors
Aspects affecting assessment which are not to do with language per se but are more connected to other factors such as attitude, working within groups and co-operation.
Normal distribution
A distribution of scores that rises and falls gradually from a single peak. It forms a symmetrical bell-shaped curve.
Norm-referencing
Listing students in order of test results and passing them or failing them according to their position on the list.
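The difference between norm-referencing and criterion-referencing (defined earlier in this glossary) can be sketched as follows. The student names, scores, number of passes and pass score are all hypothetical, and ties in the ranking are not handled.

```python
def norm_referenced(results, n_pass):
    """Pass the top n_pass students by rank, whatever their actual scores."""
    ranked = sorted(results, key=results.get, reverse=True)
    return set(ranked[:n_pass])


def criterion_referenced(results, pass_score):
    """Pass every student who reaches the pre-set pass score."""
    return {name for name, score in results.items() if score >= pass_score}


# Hypothetical test results (percentages).
results = {"Ana": 82, "Ben": 74, "Chen": 67, "Dina": 55, "Eli": 41}

passed_by_rank = norm_referenced(results, n_pass=2)
passed_by_criterion = criterion_referenced(results, pass_score=60)

print(sorted(passed_by_rank))       # position on the list decides
print(sorted(passed_by_criterion))  # the described standard decides
```

With these figures Chen passes under criterion-referencing but fails under norm-referencing, because only the top two places count.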
Objective marking
Where only one answer is possible and this is given in an answer key that is interpreted by all markers in the same way.
Open-ended test format
Test format that requires no specific response, but which is open to interpretation.
Paralinguistic
Aspects of communication which are outside the scope of the spoken word such as gestures or expressions.
Peer assessment
Where students assess one another during class activities.
Peer editing
Where checking of students' written work and correction of mistakes is carried out by other students.
Peer monitoring/assessment
Where observation and assessments of what is happening in the classroom during learning activities is carried out for students by their class colleagues.
Performance
Observable language behaviour and use.
How our students did in a formal or informal assessment procedure regardless of their actual competence; performance may be lower than competence.
Performance criteria
These are statements against which the attainment of elements is judged. In the National Certificate of Educational Achievement (NCEA), performance criteria typically contain a number of benchmarks for assessing students' work that refer to what the students should be able to do.
Portfolio
A collection of a variety of types of evidence for example assignments, projects, reports, writings and test results which are personal to the learner.
Practicality
All aspects concerning tests which affect the ways a test will be implemented.
Proficiency
The degree of skill with which a student can use language.
Proficiency tests
Type of test which aims to describe what a student is capable of doing in a foreign language. It is usually supplied by external examination boards and is often used to decide whether a candidate has learned enough of the target language to attend a certain course or do a certain job. Some proficiency tests have been standardised for worldwide use, for example the American TOEFL test and the British/Australian IELTS test.
Profile
Written description of student's performance or progress; often used in reporting assessment results – a means of recording outcomes of education for a particular student. Some tests such as IELTS and most proficiency scales such as Australian ESL scales for schools, provide information on learners in terms of profiles.
Progress questionnaire
Learner questionnaire in which students reflect on their own progress over a given period of study.
Progress tests
Type of test which aims to find out how well students have grasped the learning objectives over a particular period of time such as a month, a term or a year or over a number of course modules.
Purpose
In the NCEA, refers to who should achieve the unit standard.
Range
In a unit standard concerns issues such as the type or number of tasks or language features or defines important points of the unit standard.
Rating
Assessing student performance using pre-established scales; usually assessing spoken performance.
Raw score
Number of correct answers obtained by a student in a test; from this score the final grade is often calculated.
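A minimal sketch of obtaining a raw score from an objectively marked test and converting it to a percentage. The answer key and the student's responses are invented, and the conversion to a percentage is only one possible way of deriving a final result.

```python
# Hypothetical answer key and one student's responses to a five-item test.
answer_key = ["b", "a", "d", "c", "a"]
responses = ["b", "a", "c", "c", "a"]

# Raw score: the number of responses that match the key.
raw_score = sum(given == correct for given, correct in zip(responses, answer_key))

# One simple derived result: the raw score as a percentage of the items.
percentage = 100 * raw_score / len(answer_key)

print(raw_score, percentage)
```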
Recognition test
A test in which students recognise and match words.
Reconstruction test
A test in which students reconstruct the language themselves, with no matching.
Record of achievement
A profile prepared at the time a student finishes secondary school. It usually consists of an assessment of the student's level of achievement in relation to the achievement objectives and the development of the essential skills of the New Zealand curriculum, a section on the student's performance in national examinations and qualifications, a section on personal qualities, and a summary of involvement in school activities.
Reliability
A test is reliable if it produces the same result twice with no learning or teaching between the two occasions. Reliability can be affected by:
  • scope of test items
  • conditions of administration
  • clarity of instructions
  • personal factors: motivation, illness etc
  • reliability of markers – objective or subjective test
Note: Reliable tests are not necessarily valid.

The consistency of any form of assessment which means that under the same conditions and with the same student performance the assessment procedure would produce the same results.

Report
Document which describes students' progress and performance.
Reporting
The process of communicating assessment results to students, their parents and the institution, usually through written reports.
Rubric
Instructions in a test or any classroom activity which indicate to the candidate or student what he/she has to do to complete any given task.
Sample
The amount of language and content from syllabus plans or teaching records which a test or any classroom activity elicits.
Scales, levels, bands, benchmarks, competencies, profiles, standards
Assessment and reporting of language performance using descriptors that define an increasing level of language ability.
Scanning
Reading a text quickly in order to obtain specific information.
Self-assessment
Assessment carried out by students themselves designed to measure their own performance and progress.
Self-check activity
Where students check their own performance by completing an exercise and then looking at their own results.
Self-editing
Where on completion of a written piece of work, students go through the piece and check for any mistakes.
Self-monitoring
Where students correct their own speech production either at the time of speaking or when listening to a recorded sample of their performance.
Skimming
Reading a text quickly in order to obtain a general idea of the content area.
Standard deviation
A measure of how widely scores are spread around the mean.
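To illustrate, the sketch below compares two invented classes with the same mean but very different spreads. It uses `pstdev` from Python's standard `statistics` module, which treats the scores as the whole group rather than a sample; that choice is an assumption made for this example.

```python
from statistics import mean, pstdev

# Two hypothetical classes, both with a mean score of 50.
class_a = [48, 50, 50, 52]  # scores clustered close to the mean
class_b = [20, 40, 60, 80]  # scores spread far from the mean

spread_a = pstdev(class_a)
spread_b = pstdev(class_b)

print(mean(class_a), round(spread_a, 2))
print(mean(class_b), round(spread_b, 2))
```

The larger standard deviation for the second class shows that its scores vary much more, even though the averages are identical.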
Standardisation
Agreement between raters of student performance on the meaning and interpretation of criteria used for assessment.
Standards based assessment
This assesses the performance of a learner in relation to set standards.
Subjective marking
Where the mark or grade given to a performance depends on somebody's opinion or judgment such as in all speaking tests.
Subjectivity and objectivity
See Clemson & Clemson, 1991.
Summative assessment
Type of assessment which aims to measure students' performance at the end of a period of study.
Summative test
A type of test usually administered at the end of courses; often used as a way of deciding whether students move to a higher level or not or obtain a particular certificate or not.
Test format
The way a test is set out. The task types used to elicit language from the students.
Test
A task or situation planned specifically for assessment of students' achievement. Tests can include:
  • nationally standardised instruments prepared by professional test developers and sold by commercial distributors;
  • external examinations;
  • short sets of items devised by a teacher for classroom use on a single occasion.
Transition point assessment
Assessment carried out at key transition points (for example, at the beginning of year 7, or the beginning of year 9) to enable schools to assess the relative performance of their students against national standards.
Validity
The test measures what it is intended to measure.
  1. Face validity: e.g., culturally relevant to students
  2. Content validity: relates to what was actually taught or what really needs to be assessed and not to what is only easiest to assess
  3. Predictive validity: the extent to which the test is able to predict some criterion behaviour
  4. Concurrent validity: the extent to which a test correlates with other (already validated) tests
  5. Construct validity: the extent to which a test proves or disproves a hypothesis or psychological construct of language.
  6. Consequential validity: consequences, washback, the teaching that happens as a result.
Washback effect
The influence of tests or examinations on the teaching and learning leading up to the assessment.
Weighting
The relative importance of different skills and language which is assigned in the assessment process.
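As a sketch of how weighting might work in practice, the example below combines four skill scores into one overall result. The skills, weights and scores are all hypothetical; the only requirement assumed here is that the weights add up to 1.

```python
# Hypothetical weights for each skill (they must add up to 1).
weights = {"reading": 0.3, "writing": 0.3, "listening": 0.2, "speaking": 0.2}

# Hypothetical percentage scores for one student.
scores = {"reading": 70, "writing": 60, "listening": 80, "speaking": 50}

# Weighted overall score: each skill contributes in proportion to its weight.
overall = sum(weights[skill] * scores[skill] for skill in weights)

print(round(overall, 1))
```

Changing the weights changes the overall result without any change in the student's performance, which is why the weighting should be decided and made known before the assessment.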