The session started with Everett having us all describe the best test (in the widest sense of the word) we ever had, and then determine what the characteristics of that test were. It’s an exercise I recommend for all educators.
The common characteristics of all the best tests we described in that forum were: formative, challenging, performance based, requiring courage, and providing feedback that helped us. Now consider many of the assessments we set as teachers, and the mismatches between our beliefs (perhaps described through those “best test” characteristics) and our assessment practices. I know, as I see frustrated students and teachers wade through textbook tests with few of these “best test” characteristics (less so now, I hope).
Everett defines the purpose of all assessment as causing an increase in achievement. This I see as a challenge. What I so often observe is tests used as summative evaluations with little direct feedback (other than a grade) going to students. How is that causing an increase in achievement? Simply put, I don’t think it does.
Everett talked about a task needing around three of the five “assessment” criteria listed below to be useful and have an impact on student learning.
- Impact: “looking at the audience – was the writing informative or persuasive?”
- Craftsmanship: “is the writing – or maths problem – clear (organised, punctuated and spelt correctly)?”
- Behaviours define excellence (preparation): “how well is the paper researched?”
- Sophistication: “creativity in the process”
- Accuracy: “is it right?”
If I think about what dominates our instruction time (and thereby our assessment time, if the two are consistent) in writing, or in subjects that require writing our thoughts (e.g. history), it’s often craftsmanship first, followed by accuracy and preparation (or the same three in a different order).
One of the expectations we at Elsternwick have set ourselves in 2012 is to have four of the ten writing sessions over a fortnight focused on authorial voice, or IMPACT. I’ll let you know if our results improve, as Kline suggests they always do when you stress impact.
Everett also disputed the notion of teachers constantly developing rubrics for various assessment pieces, saying it was unrealistic and that the reliability of teacher judgements using the rubric tool is questionable.
He introduced us to the term “inter-rater reliability” and suggested that one group in the States had worked on this and had a framework well worth considering. He advocates that teachers collect multiple samples against the various criteria used in the rubric framework to ensure the reliability of teacher judgement. He suggests that using the one rubric framework proven to have inter-rater reliability is part of the assessment puzzle. Another key component is the idea of an anthology of work for each student that, over time, allows us to measure increasing sophistication, which is one of the criteria used by the group in the States.
I’m looking to trial some of the criteria from the group in the States in assessing our Inquiry-based units of work with our senior Years 5/6 students in 2012.
So, as you can see, Everett questions some of the practices we currently advocate teachers follow (building rubrics for each assessment task), the assessment criteria we use (Impact), and, I think, the priorities we have in instruction. That’s a lot for one session!