Jérôme of 2interpreters, Michelle (Interpreter Diaries) and I have been involved in a discussion on how to evaluate interpreter exams. A really tricky business, as any of you who have been on an exam jury will know. Jérôme published a really interesting reflection on final exams, and Michelle and I responded; you can read the post here.
We have now arrived at the even trickier subject of quality in interpreting, and this is where I felt I needed to write a post, not just continue the comments. Clearly, what exam jurors are after is some type of high-quality interpreting; this is presumably also what accreditation jurors or peer assessors are looking for. But what is it?
Michelle mentions two early studies, one by Hildegund Bühler (a questionnaire study with interpreters as respondents) and one by Ingrid Kurz (a questionnaire study with interpreting users as respondents). These two have since been followed up by Cornelia Zwischenberger, again with interpreters as respondents. While we are on the subject of questionnaire studies, it should also be mentioned that AIIC commissioned a study by Peter Moser on user expectations, and that SCIC regularly surveys its users' expectations. Bühler and Kurz more or less conclude that an interpretation is good when it serves its purpose, and that different contexts have different requirements (I'm summarizing very roughly here).
As both Michelle and Jérôme point out in their comments, there is a flood of articles on quality, and many studies have been made in the area, but I'm not sure we have actually come up with anything more conclusive than Bühler and Kurz did. However, I would like to draw your attention to something I have found most interesting in research on quality. Barbara Moser-Mercer was also mentioned in the comments; she published an article in 2009 in which she challenges the use of surveys for determining quality. That article seems much inspired by the work done in Spain by Ángela Collados Aís and her research team ECIS in Granada. Unfortunately, the team publishes only in Spanish and German, so I had to go there to understand what they do, but it was worth every bit of it. Extremely interesting research. I also have to compliment them on how I was received as a guest: Emilia Iglesias Fernández made me feel like royalty, and all the other researchers in the unit were extremely welcoming and accommodating. But here's the interesting thing:
For the past 10 years they have been researching how users of interpretation perceive and understand the categories most commonly used in surveys to assess interpreting. Ever since Bühler, these categories have typically been: native accent, pleasant voice, fluency of delivery, logical cohesion, consistency, completeness, correct grammar, correct terminology, and appropriate style. If I remember correctly, Peter Moser's study, for instance, showed that experienced users of interpretation reported caring more about correct terminology and fluency than about pleasant voice or native accent.
In their experiments they have been tweaking interpreted speeches so that the exact same speech could be presented with or without native accent, with or without intonation, at high speed or low speed, and so forth. Different user groups first rated how important the different categories were, and were then asked to rate different speeches, each tweaked for a certain feature. When you do that, it turns out that the exact same speech delivered with a native accent gets higher quality scores (e.g. it is rated as using more correct terminology or more correct grammar) than the speech with a non-native accent. And the same goes for intonation, speed and so forth.
So it seems (and they argue this very strongly) that features rated as unimportant (such as accent) affect how users perceive the important features (such as correct terminology).
In interpreting research there is, of course, also a lot of error analysis going on, and many studies base their evaluation of the interpretations studied on error analysis. One problem with that is exactly the one Jérôme points out: maybe the interpretation actually got better because of something the researcher/assessor perceived as an error. Omissions are a typical category where this is difficult to judge. I have also just gotten results with my holistic scales where the interpreter I perceived as "much better" (gut feeling only) got much worse scores. When I started analyzing my results, it struck me that one reason could very well be that that interpreter omitted more, so that, in comparison with the source text, there are more "holes" or "faults" or whatever you would like to call them.
When it comes to exams, Jérôme claims that not much has been done in terms of research on exams and exam assessment. I have not checked that, but my impression is that Jérôme is right. I cannot remember reading about quality assessment of examinees. I know that entrance exams and aptitude tests have been studied, but final exams… Please enlighten me.
Another thing Jérôme points out, which is really a pet subject of mine but where there seems to be very little consensus, at least in the environments where I have worked, is the training of the exam juror or peer reviewer. Now, I don't mean to say that there are no courses in how to be an interpreting exam juror; of course there are. But what I mean (and Jérôme too, I think) is that people evaluating interpreting do not get together and discuss what they believe is or is not good interpreting. You could, for instance, organize a training event before an exam where jurors get together to discuss the criteria and how they understand them, and also listen to examples and discuss those. I'm sure this happens somewhere, but I have not come across it so far.
What’s your take on this? Have I left out any important studies or perspectives? Do you have any other suggestions?
Bühler, Hildegund. 1986. "Linguistic (semantic) and extra-linguistic (pragmatic) criteria for the evaluation of conference interpretation and interpreters". Multilingua 5(4): 231–235.
Collados Aís, Ángela. 1998. La evaluación de la calidad en interpretación simultánea: La importancia de la comunicación no verbal. Granada: Editorial Comares.
Kurz, Ingrid. 1993. “Conference interpretation: Expectations of different user groups”. The Interpreters’ Newsletter 5: 13–21. (http://www.openstarts.units.it/dspace/handle/10077/4908)
Moser, Peter. 1995. “Survey on expectations of users of conference interpretation”. (http://aiic.net/community/attachments/ViewAttachment.cfm/a525p736-918.pdf?&filename=a525p736-918.pdf&page_id=736)
Moser-Mercer, Barbara. 2009. "Construct-ing Quality". In Gyde Hansen, Andrew Chesterman & Heidrun Gerzymisch-Arbogast (eds.), Efforts and Models in Interpreting and Translation Research: A Tribute to Daniel Gile, 143–156. Amsterdam & Philadelphia: John Benjamins.
Zwischenberger, Cornelia. 2011. Qualität und Rollenbilder beim simultanen Konferenzdolmetschen. PhD thesis, University of Vienna.