Research on Quality in interpreting

Jérôme, one of the 2interpreters, Michelle (Interpreter Diaries) and I have been involved in a discussion on how to evaluate interpreter exams. It is a really tricky business, as any of you who have been on an exam jury will know. Jérôme published a really interesting reflection on final exams, and Michelle and I responded; you can read the post here.

We have now arrived at the even trickier subject of quality in interpreting, and this is where I felt I needed to write a post rather than just continue the comments. Clearly, what exam jurors are after is some type of high-quality interpreting, and this is supposedly also what accreditation jurors or peer-assessors are looking for. But what is it?

Michelle mentions two early studies, one by Hildegund Bühler (a questionnaire study with interpreters as respondents) and one by Ingrid Kurz (a questionnaire study with interpreting users as respondents). These two have recently been followed up by Cornelia Zwischenberger with a newer study, again with interpreters as respondents. While we are talking about questionnaire studies, it should also be mentioned that AIIC commissioned a study by Peter Moser on user expectations and that SCIC regularly surveys its users' expectations. Bühler and Kurz more or less conclude that interpreting is good when it serves its purposes and that different contexts have different requirements (I'm summing up really heavily here).

As both Michelle and Jérôme point out in their comments, there is a flood of articles on quality, and there are many studies in the area, but I'm not sure we have actually come up with anything more conclusive than Bühler and Kurz did. However, I would like to draw your attention to something that I have found most interesting in research on quality. Barbara Moser-Mercer was also mentioned in the comments, and she published an article in 2009 in which she challenges the use of surveys for determining quality. This seems very much inspired by the work that has been done in Spain by Angela Collados-Aís and her research team ECIS in Granada. Unfortunately, she only publishes in Spanish and German, so I had to go there to understand what she does, but it was worth every bit of it. Extremely interesting research. I also have to compliment them on how I was received as a guest: Emilia Iglesias-Fernandez made me feel like royalty, and all the other researchers in the unit were extremely welcoming and accommodating. But here's the interesting thing:

For the past 10 years they have been researching how users of interpretation perceive and understand the categories most commonly used in surveys to assess interpreting. Since Bühler, these categories have typically been: native accent, pleasant voice, fluency of delivery, logical cohesion, consistency, completeness, correct grammar, correct terminology and appropriate style. If I remember correctly, Peter Moser's study, for instance, showed that experienced users of interpretation reported that they cared more about correct terminology and fluency than about pleasant voice or native accent.

In their experiments they have been tweaking interpreted speeches so that the exact same speech would be delivered with or without a native accent, with or without intonation, at high or low speed and so forth. Different user groups first rated how important the different categories were, and were then asked to rate different speeches, tweaked for certain features. When you do that, it turns out that the exact same speech with a native accent gets a higher score for quality (i.e. is rated as using more correct terminology or correct grammar) than the speech with a non-native accent. And the same goes for intonation, speed and so forth.

So it seems (and this is very strongly argued) that features that are not rated as important (such as accent) affect how users perceive the features they do rate as important (such as correct terminology).
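To make that spillover concrete, here is a toy sketch in Python. It is not the Granada team's method or data: the rating scale, the `spillover` factor, the function name `perceived_ratings` and all the scores are assumptions of mine, invented purely for illustration. It just shows how two renditions with identical terminology and grammar could end up with different terminology ratings when the impression of the accent leaks into the other criteria.

```python
# Toy model with hypothetical numbers; NOT the ECIS team's method or data.
SCALE_MIN, SCALE_MAX = 1.0, 5.0   # assumed 1-5 rating scale
MIDPOINT = 3.0                    # assumed neutral impression


def perceived_ratings(actual, halo_feature, spillover):
    """Return the ratings a listener reports when their impression of
    `halo_feature` leaks into every other criterion."""
    # Positive when the halo feature is above neutral, negative when below.
    bias = spillover * (actual[halo_feature] - MIDPOINT)
    perceived = {}
    for criterion, score in actual.items():
        if criterion != halo_feature:
            score = score + bias
        # Keep the reported rating on the scale.
        perceived[criterion] = min(SCALE_MAX, max(SCALE_MIN, score))
    return perceived


# The same interpreted speech twice: identical content, only the accent differs.
native = {"native accent": 5.0, "correct terminology": 4.0, "correct grammar": 4.0}
non_native = {"native accent": 2.0, "correct terminology": 4.0, "correct grammar": 4.0}

rated_native = perceived_ratings(native, "native accent", spillover=0.25)
rated_non_native = perceived_ratings(non_native, "native accent", spillover=0.25)

print(rated_native["correct terminology"])      # 4.5
print(rated_non_native["correct terminology"])  # 3.75
```

In this made-up example the terminology is equally correct in both versions, yet the reported terminology rating differs by three quarters of a point, simply because of the accent.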

In interpreting research there is of course also a lot of error analysis going on, and many studies base their evaluation of the interpretations used on error analysis. One problem with that is exactly the one that Jérôme points out: maybe the interpretation actually got better because of something that the researcher/assessor perceived as an error. Omissions are a typical category where it's difficult to judge that. I have also just gotten results with my holistic scales where the interpreter that I perceived as "much better" (only a gut feeling) got much worse scores. When I started analyzing my results, I found that one reason for this could very well be that this interpreter omitted more, so that, in comparison with the source text, there are more "holes" or "faults" or whatever you would like to call them.

When it comes to exams, Jérôme claims that not much has been done in terms of research on exams and exam assessment. I have not checked that, but my impression is that Jérôme is right. I cannot remember reading about quality assessment of examinees. I know that entrance exams and aptitude tests are studied, but final exams… Please enlighten me.

Another thing that Jérôme points out, and which is really a pet subject of mine, but where there seems to be very little consensus, at least in the environments where I have been, is the training of the exam juror or the peer-reviewer. Now, I don't mean to say that there are no courses in how to be an interpreting exam juror; of course there are. But what I mean (and Jérôme too, I think) is that people evaluating interpreting do not get together and discuss what they believe is good interpreting or not. You could, for instance, organize a training event before an exam where jurors get together to discuss the criteria and how they understand them, and also listen to examples and discuss them. I'm sure this happens somewhere, but I have not come across it so far.

What’s your take on this? Have I left out any important studies or perspectives? Do you have any other suggestions?

  1. I just realized comments are more appropriate than TW for longer texts 😉

    I had the same idea – organizing a short training on what to focus on, how to assess the performance etc. But more importantly, I would organize such trainings and draw up a plan for all trainers to stick to. At least at our faculty, students sometimes complain that they don't get the same assessment criteria from all trainers, and that sometimes it feels more like a matter of personal liking or not … If the same happens in final exams, it leaves students with a bitter taste, not really knowing what went wrong (especially if during the year everything seemed ok). So, I agree, more work needs to be done in this area and schools should opt for a more systematic approach in defining the assessment criteria.

    (PS: Not sure if they are important, but before joining the trainees at our faculty I read a couple of interesting papers other than the ones cited in your post and the already mentioned thesis (well, I skimmed through the parts that were of interest):
    Hartley, Mason et al.: Peer- and Self-assessment in Conference Interpreting Training > has a great appendix with an assessment sheet – couldn’t get more objective!
    Grbič Nadja: Constructing Interpreting Quality > a more theoretical take on the subject of quality
    Magdalena Bartlomiejczyk: Interpreter Quality as perceived by Trainee Interpreters, self-evaluation

    Hope I am not too long. Btw: compliments for your new page, have to say it again, I love it!!)

  2. Thanks very much for this post, Elisabet. Ever since I found out you were doing research on quality, I’ve been meaning to ask you to write a post giving an overview of the field – and now it looks like I’ve got it! I particularly appreciate the bibliography (with Jana’s additions). I’ll add some of those papers to my summer reading list ;).

    On the issues you address above, I have to say that I agree that the various criteria can influence each other, and that surveys only record what respondents *believe* they consider important. It’s very interesting to see that researchers have managed to tease out evidence that those beliefs are not always as accurate as respondents might think.

    Along these same lines: I told my students not too long ago that in my experience, communication is worth a lot more than assessors will generally admit. To explain: assessment guidelines may weigh “communication skills” as 25% of the total score, but a good communicator will subconsciously be rated by the listener as being better in the other areas (accuracy, fluency, appropriate use of terminology etc.), which means that it will ultimately end up worth much more than the nominal 25%. Conversely, a student who gives an extremely accurate rendition but doesn’t “communicate” very well will probably be given a lower score on accuracy (and other criteria, for that matter) for the same subconscious reasons.

    As for omissions, I’ve never been a big fan of omission analysis, and I personally don’t consider all omissions as errors. Often, as we know, omission is used as a coping tactic, and serves to help the interpreter render the message better when under duress. As Dick Fleming told my students when he came to visit, “It’s quality, not quantity”.

    To illustrate: recently, I was working with a very experienced staff colleague who had to do an extremely dense speech full of complex arguments lasting about 20 minutes. Listening to her, I realized that she was probably omitting about 15-20% of the detail, but she was making such good choices that the full message, in particular the argumentation, was coming across loud and clear, and the listener wasn’t missing any of the essence of the original. That takes true skill. If an omission analysis were to grade her performance as worse than someone who had all the detail but presented it in a way that was not easily grasped by the listener, then I’m sorry, but I just couldn’t agree with that.

    Which brings me to the question of exam assessment: if an examination board simply uses the number of omissions as a criterion for assessment, then they are doing the student interpreter a disservice. There is so much more to it than that.

    Of course, there are the dreaded “contre-sens” or errors of meaning, and I have heard more than one juror say “two contre-sens and they’re out”, which is pretty harsh, although I can see their point (you can’t go around saying the opposite of the speaker and expect to get away with it for long). But that’s not the same as an omission.

    The idea of training jurors is an interesting one. I’ve never seen it done at more than an informal level (say, hearing jurors start the exam day with a brief discussion of “where to set the bar”). Training trainers is equally important, if not more so, since they should definitely be singing from the same song sheet when offering guidance to students. Nothing worse for a student than having one trainer tell you to stick closer to the original and the next one telling you to take more distance. I know it happens, unfortunately.

    All this, and we still haven’t talked about the famous “gut feeling” that guides so much of what we do, both as trainers and as assessors. I’ve been thinking lately about how to deal with that, and I’ve decided that “faute de mieux”, it’s probably best to try to train students to develop the same “gut feeling” as their trainers. That way, they may still not know exactly what it is they are doing right or wrong, or be able to measure or quantify it in any objective way, but at least they will still pass the exams in Brussels ;). What do you think?

  4. Hello,

  5. Bühler and Kurz are right that “interpreting is good when it serves its purposes and that different contexts have different requirements.” For example, it’s wrong to assess community interpreting or court interpreting by the same standards and criteria as conference interpreting. It’s a pity that in the typical discussions about quality, people don’t specify they’re only talking about CONFERENCE interpreting, and then usually only the simultaneous mode. In the part of Canada where I come from, a different kind of test, called CILISAT, is used for community interpreting. It’s based on what psychologists call ‘propositional analysis’ and uses POSITIVE scoring for successfully communicated segments and not negative scoring for faults. I learnt from French translation teacher Daniel Guadec that negative scoring is discouraging for students.

    I also agree with The Interpreter Diaries that omissions aren’t necessarily faults. I used to tell my students, “If the speaker gives three examples, one is usually enough.” In fact consecutive conference interpretations should be SHORTER than the originals because they take up extra time, so something has to give.

  6. I have been thinking of becoming a court interpreter, but I cannot find any information anywhere about what to do or where to start. I speak Spanish and I feel that being a court interpreter would be a great career for me, but I need to learn more about it. It has been a struggle for me because, as I mentioned above, I can't find anyone in Phoenix, AZ who could answer my questions or walk me through the process.

    What I really want to know is: what schools should I look into? How long would it take to get certified or learn enough to begin working as an interpreter? How difficult is it to get hired? What do people like and dislike about the profession? What is the working environment like? What is the stress level? What kind of qualities do you need to do well in the position? Any comments would be appreciated.

    I am currently working at a law firm as a paralegal; I began working here without any experience at all. I really like it and have considered law school, but the truth is I do not want to go to school for that long and have such a stressful career. I don't think I am passionate enough to become an attorney, and I don't feel like all that loan debt and all those years in school are worth it to me. I have always liked translating for others, and I like the idea of not having as much stress and still being able to be in court while helping others when there is a language barrier.

    If there is anyone out there that could help me, please feel free to contact me via email.

    Thank you very much!

