Test Adaptation: Methods and Pitfalls
Summary
TL;DR: Kurt's talk on test adaptation emphasizes the linguistic and cultural understanding that fair assessment requires. Drawing on his experience with several international assessments, he stresses that test adaptation goes beyond mere translation: it requires a deep understanding of the constructs being measured and of the cultural contexts of test-takers. Kurt also addresses the ethical side of test adaptation, such as the need to obtain permission before using copyrighted tests. He outlines methodologies for effective adaptation and highlights the ongoing need for rigorous research to improve the quality and fairness of adapted tests worldwide.
Takeaways
- 🌍 Test adaptation involves more than translation; cultural context is vital.
- 📝 Ethical considerations, including copyright, are essential in test usage.
- 🔍 Back translation may not be the best method for ensuring quality.
- 📊 Rigorous research is needed to ensure the reliability of adapted tests.
- 🌐 Language differences can introduce bias in test outcomes.
- 🤝 Collaboration with local experts enhances adaptation processes.
- 📚 Clear guidelines help in maintaining fairness across international assessments.
- 💡 Understanding local education systems is crucial for effective test adaptation.
- 🎯 Adapted tests should maintain validity and construct equivalence.
- 🎷 Test adaptations should account for cultural variations in response styles.
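The reliability point in the list above can be made concrete with an internal-consistency check, which the talk notes is often the only reliability analysis that gets run. What follows is a minimal sketch and not something shown in the talk: it computes Cronbach's alpha from a matrix of item scores. The function name `cronbach_alpha` and both response matrices are hypothetical illustration data, not results from any real test.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical 0/1 item scores (rows = test-takers, columns = items).
source_version = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
adapted_version = [
    [1, 0, 1, 0],
    [0, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 1],
]

print("alpha, source version: ", round(cronbach_alpha(source_version), 3))
print("alpha, adapted version:", round(cronbach_alpha(adapted_version), 3))
```

A noticeably lower alpha for the adapted form would be one signal that items stopped hanging together after adaptation. As the talk points out, internal consistency alone is not enough; test-retest and other evidence are also needed.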
Timeline
- 00:00:00 - 00:05:00
The speaker introduces Kurt, the director of a testing center, who discusses test adaptation. He plans to cover historical perspectives, current practices, and future considerations, focusing on the implications for language minorities and individuals with disabilities in assessments.
- 00:05:00 - 00:10:00
Kurt shares his early experiences in test adaptation, having interned at ETS in the 1970s. He emphasizes the need for adaptation rather than mere translation, highlighting the complexities involved in cross-cultural testing as a burgeoning field, especially with multinational corporations aiming for global assessments.
- 00:10:00 - 00:15:00
He discusses the necessity of obtaining permission for using measures, citing issues of copyright, especially after an incident involving a retired professor demanding payments from students. Kurt underscores the professional courtesy of seeking permission, even when not legally required.
- 00:15:00 - 00:20:00
The talk explores the pros and cons of adapting tests, including cost-effectiveness, the need for international standardized measures, and psychological considerations. He stresses that adapting tests can present validity and fairness issues, especially when differing cultural contexts influence content comprehension.
- 00:20:00 - 00:25:00
Kurt discusses cultural variability in performance assessments and presents examples of test adaptations across countries, illustrating that seemingly trivial differences can yield significant variations in outcomes. He addresses the importance of understanding cultural context when comparing test performances across different demographics.
- 00:25:00 - 00:32:55
He concludes by emphasizing the need for rigorous research on adapted measures and the potential pitfalls involved in equating assessments across languages. Despite the financial allure for testing companies, he urges caution and thorough investigation into the validity and applicability of adapted tests.
Video Q&A
What is test adaptation?
Test adaptation refers to modifying assessments to fit different languages and cultures, ensuring they are relevant and fair across various demographic groups.
Why is cultural understanding important in test adaptation?
Cultural understanding is crucial because it affects how test items are perceived and answered, influencing the fairness and validity of the assessment.
What are some challenges in test adaptation?
Challenges include differences in education systems, cultural biases, translation errors, and copyright issues.
What is the difference between translation and adaptation?
Translation focuses solely on converting language, whereas adaptation considers cultural context and regional relevance.
What skills are required for effective test adaptation?
Skills include fluency in languages, understanding of cultural nuances, and psychometric principles.
Why should permission be obtained for test adaptation?
Obtaining permission respects copyright laws and acknowledges the intellectual property of original test developers.
What is 'back translation' in test adaptation?
Back translation is a process where a translated test is re-translated back to the original language to check for consistency and accuracy.
What role do ethics play in test adaptation?
Ethics ensure that adaptations respect the rights of original developers and promote fairness in testing practices across cultures.
What are some methodologies for adapting tests?
Methodologies include double translation, pilot testing, and cross-cultural reviews to ensure the adapted test's equivalence.
How can adaptations affect test outcomes?
Adaptations can introduce biases or discrepancies that affect the reliability and validity of scores across different cultures and languages.
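One way to see such discrepancies is to compare item difficulty across language versions, echoing the talk's example of a vocabulary subtest whose easy-to-hard ordering broke down after a word-for-word translation. The sketch below is an illustration under stated assumptions, not the speaker's method: it computes each item's proportion correct in two versions and the Spearman rank correlation between the two difficulty orderings. All function names and the response data are hypothetical.

```python
from math import sqrt


def proportion_correct(responses):
    """Per-item proportion correct for 0/1 responses (rows = test-takers)."""
    n = len(responses)
    k = len(responses[0])
    return [sum(row[i] for row in responses) / n for i in range(k)]


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for j in range(pos, end + 1):
            ranks[order[j]] = shared
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def difficulty_rank_agreement(source, target):
    """Spearman correlation between item difficulty orderings of two versions."""
    return pearson(
        average_ranks(proportion_correct(source)),
        average_ranks(proportion_correct(target)),
    )


# Hypothetical data: items ordered easy-to-hard in the source version,
# but that ordering is reversed in the adapted version.
source = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
adapted = [[1, 1, 1], [0, 1, 1], [0, 0, 1], [0, 0, 0]]

print("source difficulties: ", proportion_correct(source))
print("adapted difficulties:", proportion_correct(adapted))
print("rank agreement:", round(difficulty_rank_agreement(source, adapted), 3))
```

A low or negative rank agreement suggests the adapted items no longer run from easy to hard in the same order. It does not say why: as the talk notes for DIF analyses, group differences and translation differences are confounded in this kind of comparison.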
More from the Video
- 00:00:10and I have the pleasure of introducing
- 00:00:12Kurt Geisinger, who's the director of
- 00:00:14the Buros Center for Testing and the
- 00:00:17W. C. Meierhenry Distinguished Professor
- 00:00:19at the University of Nebraska to talk to
- 00:00:22us about test adaptation thanks Kurt
- 00:00:25let me thank Amy both for the invitation
- 00:00:28and the introduction ETS and MHS and
- 00:00:33when I was first asked to do the talk I
- 00:00:35was asked whether I was going to talk
- 00:00:37about past present or future and I told
- 00:00:40her past and that was just based on my
- 00:00:42age at the time but but I think I'm
- 00:00:44going to talk about all three and I
- 00:00:47decided to do this most of the work I've
- 00:00:50done in recent years is on fairness
- 00:00:51either for language minorities or for
- 00:00:54people with disabilities and that was my
- 00:00:57initial thought of what I would talk
- 00:00:58about but instead I've decided to talk
- 00:01:00about test adaptation, in part because the
- 00:01:05International Test Commission is meeting
- 00:01:07this summer in Montreal and I'm trying to
- 00:01:08drum up support for that, and in terms
- 00:01:12of history, as Neil started talking
- 00:01:15about when he was in this room I think I
- 00:01:18beat him because I was an intern at ETS
- 00:01:20in 1975 and we had a meeting in here and
- 00:01:24interestingly there's another person who
- 00:01:27was an intern with me that same year I
- 00:01:30was here and it was amazing that they
- 00:01:31allowed Linda Cook to come when she was
- 00:01:34only in junior high school at the time
- 00:01:38but I also was in this room I was in
- 00:01:42with Warren Willingham and then he hired
- 00:01:44me back the next year as a research
- 00:01:46associate and I joined the GRE technical
- 00:01:51advisory committee in 1993 and in this
- 00:01:55room they they brought me in a day early
- 00:01:58so they had a session honoring
- 00:02:00Willingham when he was stepping down as
- 00:02:02an officer of the organization and that
- 00:02:05was in this room as well so it does have
- 00:02:07some history now I also have a history
- 00:02:11with Division 5 of APA. I first
- 00:02:14joined the executive committee of the
- 00:02:16division in 1993, 25 years ago, as well
- 00:02:19and based on discussions I had with Jim
- 00:02:25Butcher who was then the assessment
- 00:02:27coordinator or whatever it's called
- 00:02:29he was a head of the assessment group he
- 00:02:32and I got into a discussion and he asked
- 00:02:33me to do a paper for psych assessment
- 00:02:36which he was editing at the time and I
- 00:02:38did it on the topic of test translation
- 00:02:41and it's the paper that I've done that I
- 00:02:45probably shouldn't have done because I
- 00:02:47had very little history in that topic at the
- 00:02:49time but I was a foreign language major
- 00:02:51as an undergraduate I knew something
- 00:02:53about translation and so that's that's
- 00:02:57what I'm gonna talk about and I'm gonna
- 00:02:59say that some of the logic I'm gonna use
- 00:03:01toward the end of the presentation is
- 00:03:03what I would call journalistic logic I'm
- 00:03:05using examples to make points that runs
- 00:03:08counter, probably, to the typical method of
- 00:03:11Division 5 but I do tell you that
- 00:03:16second to fifth is when ITC is meeting
- 00:03:18in Montreal and it's at the same time as
- 00:03:20a Jazz Festival there in case you love
- 00:03:23jazz so again I'm going to give a very
- 00:03:27quick overview of test adaptation we use
- 00:03:29the word adaptation rather than
- 00:03:31translation because it's more than
- 00:03:33language you have to involve other
- 00:03:35things other than language and I think I
- 00:03:37will make that point very clearly to you
- 00:03:39with some examples and and what I'm a
- 00:03:42little different about is I've actually
- 00:03:45done some work now on adaptation from
- 00:03:48one language and culture to another of
- 00:03:50some performance assessments through
- 00:03:52OECD and I'm going to show you just how
- 00:03:56difficult it is and and it fits exactly
- 00:03:59in with Neil's comments over the
- 00:04:02difficulty of equating or even linking
- 00:04:04at some point because you start
- 00:04:06questioning whether they're the same
- 00:04:07measures by the time you've done some of
- 00:04:09that why are we doing this kind of work
- 00:04:14well first off there are lots of testing
- 00:04:17companies right now that realize that
- 00:04:18the world has shrunk and there are multi
- 00:04:20cult multinational corporations who want
- 00:04:23to administer the same tests all over
- 00:04:25the world we want to make international
- 00:04:28comparisons
- 00:04:30I think our psychological science is
- 00:04:32getting a little stronger that allows us
- 00:04:34to do that in some cases we've
- 00:04:36recognized the differences between etic
- 00:04:39and emic kinds of measures and there are
- 00:04:42a lot of fiscal and pragmatic reasons
- 00:04:44why this it may be cheaper and easier to
- 00:04:46adapt the test than it is to to build a
- 00:04:49new one now let me give you two
- 00:04:52precursors and can't read the citation
- 00:04:56there but I got quoted in a science
- 00:04:59article this past year I was called in
- 00:05:04the interview
- 00:05:04and so forth but it turns out there was
- 00:05:07a retired professor in California who
- 00:05:10had built a test well he go to a survey
- 00:05:14in the sense of whether you take your
- 00:05:16medications the way you're supposed to
- 00:05:18and it's mostly used by insurance
- 00:05:20companies as part of their tests of
- 00:05:23drugs which they have to do to get
- 00:05:25approval and it turns out after he
- 00:05:28retired he got very sporadic about his
- 00:05:32answering of emails and letters and
- 00:05:34things like that and he had copyrighted
- 00:05:37these scales and he makes it very clear
- 00:05:39that they're very expensive to use
- 00:05:41because after all he's selling them to
- 00:05:43insurance companies but a bunch of grad
- 00:05:45students wrote and said can I use this
- 00:05:47for my master's thesis or doctoral
- 00:05:49dissertation he didn't answer them and
- 00:05:52they went ahead and used it anyhow and
- 00:05:54he then sent them bills for upwards of
- 00:05:57$20,000 each and the question was is
- 00:06:02that appropriate and it's a very complex
- 00:06:05question it's not a simple question
- 00:06:06because it is a copyrighted thing; they
- 00:06:09just used it without permission and it
- 00:06:12was his right to do so now he has since
- 00:06:14adjusted and decided that when companies
- 00:06:16use it he's gonna have one rate, and
- 00:06:18for students and so forth he's going to
- 00:06:20use another rate but nevertheless this
- 00:06:22is an important issue because there are
- 00:06:24a lot of people that translate tests or
- 00:06:27adapt tests that they don't have the
- 00:06:29right to do so and they don't ask for it
- 00:06:31so so I start off by saying if I if a
- 00:06:36measure is copyrighted and published you
- 00:06:38need to get that permission first and
- 00:06:40even if it's not copyrighted you
- 00:06:43probably should write the authors and
- 00:06:46get and at least inform them that you're
- 00:06:48planning to do that that I mean that's
- 00:06:50just common courtesy I think
- 00:06:52professional courtesy now why would you
- 00:06:55do it and I've listed pros and cons here
- 00:06:57and in the interest of science I'm gonna
- 00:06:58go through this really fast that these
- 00:07:01are established measures
- 00:07:03that makes sense they're cost-effective
- 00:07:05and cheaper as I said globalization
- 00:07:08necessitates cross-culturally
- 00:07:10appropriate measures to fulfill the
- 00:07:12needs to compare, evaluate, select, et
- 00:07:14cetera guidelines and best practice
- 00:07:16research offer more options to test
- 00:07:18users to make informed decisions and to
- 00:07:20reduce negative outcomes the cons are
- 00:07:22that there can be copyright issues and
- 00:07:24country membership requirements
- 00:07:26and I think ETS has dealt with some of
- 00:07:28those copyright issues I know of over
- 00:07:30the years that they you have to ask you
- 00:07:34the benefits justify the efforts and is
- 00:07:37there a real need to have the same
- 00:07:39measure in a different language that
- 00:07:41fairness and validity of scores for
- 00:07:42target populations and use must be
- 00:07:44normed on the on the target demographic
- 00:07:46issues and translated assessments even
- 00:07:50with careful adaptation still introduce
- 00:07:52additional negative psychometric and
- 00:07:54cross-cultural issues and one of the
- 00:07:56ways that I learned this is I was
- 00:07:59an expert witness in two court cases in
- 00:08:01Canada about 25 years ago and the witness
- 00:08:04on the other side was John Conger who
- 00:08:06some of you know and what happened is
- 00:08:10all their tests were built in English
- 00:08:11but then they had to translate them into
- 00:08:13French because they are a bilingual
- 00:08:15country and the French students are the
- 00:08:18French candidates did about 3 percent
- 00:08:21worse than the English-speaking
- 00:08:23candidates and when I asked why I was
- 00:08:25told it was because the French
- 00:08:27schools are not as good as the English
- 00:08:28schools and later on was told that I was
- 00:08:31indeed right that it was a translation
- 00:08:33issue that that they that questions made
- 00:08:36more sense in English the way they were
- 00:08:37written first and then they were
- 00:08:38translated they didn't do as well so
- 00:08:40essentially it was a built-in bias
- 00:08:42against the French-speaking candidates
- 00:08:47but we do have OECD, which is the publisher
- 00:08:50of PISA and a bunch of the other surveys
- 00:08:52and they have done some in recent years
- 00:08:55on critical thinking
- 00:08:56and economics at the higher ed level
- 00:08:58which people are not as familiar with
- 00:08:59and and I've worked on the critical
- 00:09:01thinking one which is what I'm going to
- 00:09:02give you examples of people want to make
- 00:09:05comparisons and I'm gonna make the
- 00:09:07argument that some of those comparisons
- 00:09:09are less sophisticated than we'd like to
- 00:09:12think so what are the skills you need to
- 00:09:16adapt a measure? Well, certainly you need
- 00:09:17to be fluent in both languages you need
- 00:09:19a comprehensive understanding of the
- 00:09:21constructs being assessed you need a
- 00:09:24thorough understanding of both cultures
- 00:09:26you have to have some ability to work on
- 00:09:28testing measures there are skills
- 00:09:30involved in writing items and so forth
- 00:09:32and I'm gonna tell you I gave a keynote
- 00:09:34at the Mexican National Academy of
- 00:09:37assessment a couple of years ago and I
- 00:09:40learned they've taken a very different
- 00:09:42model than the United States has they
- 00:09:45have some 89 languages that are from
- 00:09:48indigenous people that make up only 5.4
- 00:09:52percent of the population but they have
- 00:09:54schools representing about 20 different
- 00:09:56indigenous languages and the decision
- 00:09:59they've made rather than the United
- 00:10:01States is that all indigenous people
- 00:10:04will be taught in their own language and
- 00:10:06tested in their own language so their
- 00:10:08national assessment group has to
- 00:10:11translate all their tests to about 20
- 00:10:13languages besides Spanish and and
- 00:10:18instruction and in fact of those 20
- 00:10:20languages 10 of them didn't even have a
- 00:10:22written language so the first thing the
- 00:10:24Mexican government had to do was to
- 00:10:26develop those written
- 00:10:28languages before they could even decide
- 00:10:31that they were going to test in them or
- 00:10:35instruct in them so it's a very different
- 00:10:38model and it's a model that I'm actually
- 00:10:41very comfortable with and I think if we
- 00:10:43were going to build a wall maybe we
- 00:10:44ought to do it the other way keep us
- 00:10:47from going to Mexico but I also know
- 00:10:50that in South Africa they have 11
- 00:10:54official languages so when they build
- 00:10:57the test they have to build it in those
- 00:10:5811 languages right off the bat now what
- 00:11:05do we want in a translation and
- 00:11:07adaptation well the idea initially
- 00:11:09anyhow was that item difficulty should be
- 00:11:11the same within reason across languages
- 00:11:13that sociolinguistic nuances should be
- 00:11:15removed or avoided content relevance
- 00:11:18that access should be comparable across
- 00:11:20cultures the construct relevance and
- 00:11:22validity should be constant we should
- 00:11:25focus on the defined objectives and the
- 00:11:27purpose that formatting, appearance, and
- 00:11:30comparable tasks should be the same and
- 00:11:32to avoid really bad practices now to
- 00:11:36give you a sense of this the first study
- 00:11:38I did in this regard was with a graduate
- 00:11:40student many years ago who studied the
- 00:11:42EIWA, which is the Wechsler Adult
- 00:11:44Intelligence Scale; the initial form was
- 00:11:47translated into Spanish in Puerto
- 00:11:50Rico and for example you may know in
- 00:11:53giving the WAIS the first test you
- 00:11:56usually gave was the vocabulary and they
- 00:11:59go from easy to hard and that decides
- 00:12:01what you're gonna do. Well, with the
- 00:12:03initial version of the EIWA they
- 00:12:06simply translated the English
- 00:12:08words into Spanish and there was no
- 00:12:11longer any reasonable rank ordering of
- 00:12:13difficulty because once you've done the
- 00:12:14translation but that's how it was it was
- 00:12:16just the same words in the different
- 00:12:18language. That's, in my mind,
- 00:12:20unbelievably bad practice and and I'm
- 00:12:24gonna give you, Ron Hambleton has two
- 00:12:26examples that he uses frequently; one of
- 00:12:29these comes from PISA's fourth grade
- 00:12:32science test and the question asked is
- 00:12:35why do ducks swim so well the students
- 00:12:39that do the best on that are the Swedes
- 00:12:42in the world and it turns out when you
- 00:12:44translate webbed feet which is the right
- 00:12:47answer in English in Swedish that's
- 00:12:50swimming feet now there's also another
- 00:12:55question that he's often used that was
- 00:12:58there's a technique which I'm going to
- 00:12:59talk about in a minute, called back
- 00:13:01translation where you translate it to
- 00:13:03new language and you back translate to
- 00:13:05see how it looks and how comparable it
- 00:13:07is and it was essentially an analogy
- 00:13:10question that was out of sight :
- 00:13:13out of mind translated back that comes
- 00:13:17to blind and insane so you can see this
- 00:13:24is not everybody in test construction
- 00:13:27knows that test construction is both art
- 00:13:28and science and like Neil was just
- 00:13:31talking about the science part of it I'm
- 00:13:32going to talk more about the art part of
- 00:13:34it because that's that's what we're
- 00:13:35talking about I mean among the
- 00:13:37translation processes you can have a
- 00:13:39simple translation which is what a lot
- 00:13:40of tests like the EIWA used initially
- 00:13:43they can have adaptation with checks and
- 00:13:46that's where usually you do this kind of
- 00:13:48a back translation just to decide what to
- 00:13:50do and I just looked up back translation;
- 00:13:54it appears it was first developed as a
- 00:13:56technique by Brislin in 1970 so it's
- 00:13:59been around for a while there are people
- 00:14:01when I edited the handbook of assessment
- 00:14:06psychology there are people that had
- 00:14:09chapters in there, like Butcher, who
- 00:14:10still say that back translation is the
- 00:14:12state-of-the-art most people would
- 00:14:14disagree with that now simply because if
- 00:14:18you're a translator and you know you're
- 00:14:20going to be evaluated by the quality of
- 00:14:22your translation what happens is you
- 00:14:25translate the question not to be optimal
- 00:14:28in the target language but to be
- 00:14:30optimally translated back to the
- 00:14:32original language and those are two very
- 00:14:34different things okay so so so back
- 00:14:38translation has some problems in the
- 00:14:41article that I wrote in psych assessment
- 00:14:43I argued that those skills that I listed
- 00:14:45are unlikely to be found well in one
- 00:14:47person, and so you need committee
- 00:14:48approaches to doing this this has to be
- 00:14:50done by more than one person and Kadriye
- 00:14:54Ercikan, who is also a vice president here
- 00:14:56at ETS, wrote a chapter for my
- 00:14:58handbook on concurrent ways of doing
- 00:15:01this, which is what OECD is trying to
- 00:15:03do this where you build the tests in the
- 00:15:06same in different languages at the same
- 00:15:08time basically it doesn't work for
- 00:15:13pre-existing measures which a lot of the
- 00:15:14test translation work is done on
- 00:15:17measures that achieve a certain amount
- 00:15:20of notoriety, typically in the
- 00:15:22initial language usually English but in
- 00:15:25this concurrent model what happens is
- 00:15:28you develop two forms at the same time
- 00:15:30you have groups working together that
- 00:15:33they work with a shell it's malleable so
- 00:15:36that they can change it as they go now
- 00:15:38if you can imagine two committees doing
- 00:15:41that that's not so hard but if you start
- 00:15:43thinking about Mexico when you think
- 00:15:45about 89 committees doing that it's it's
- 00:15:47unimaginable in my mind you know or even
- 00:15:5011 perhaps in South Africa so it's very
- 00:15:53difficult once you get more than two now
- 00:15:57one of the things we forget about is
- 00:15:58culture and culture has a big impact
- 00:16:01especially when you get into personality
- 00:16:03variables and things like that but the
- 00:16:04examples I'm going to give
- 00:16:05journalistically in performance
- 00:16:08assessments I would argue that there's a
- 00:16:09lot of cultural issues that affect those
- 00:16:12responses to we heard an earlier talk
- 00:16:16that length is one of the big
- 00:16:18characteristics that assess the quality
- 00:16:20of essays well length might be a very
- 00:16:23culturally dependent kind of variable as
- 00:16:24an example so if you're going across
- 00:16:27languages or cultures you might find big
- 00:16:29differences and as someone who's
- 00:16:31traveled to a variety of
- 00:16:32english-speaking countries including
- 00:16:34South Africa I will tell you there are
- 00:16:36big cultural differences even as you
- 00:16:38start going across some of those
- 00:16:41countries so in this 1994 article that I
- 00:16:48wrote I listed steps for for adapting a
- 00:16:50measure and I'm gonna go through them
- 00:16:53really fast
- 00:16:54first, you translate and adapt the measure;
- 00:16:56that sounds like it should be the whole
- 00:16:57thing. Then you review the translated
- 00:17:00measure; three, you revise that measure based
- 00:17:03on comments from the review; then you
- 00:17:05pilot, that's a small scale testing; then
- 00:17:08field test, standardize scores, perform
- 00:17:12validation research as appropriate
- 00:17:14develop a manual and other documents for
- 00:17:16users of the assessment train users and
- 00:17:18collect reactions from users. Now, I
- 00:17:22know that Ercikan and Lyons-Thomas,
- 00:17:25who were the people who wrote the
- 00:17:26chapter for my my handbook had some
- 00:17:29other steps and I'm not going to go
- 00:17:30through them but but shortly after I
- 00:17:33wrote that article which was really one
- 00:17:35of the first things on how to how to
- 00:17:38adapt measures Hambleton and Patsula
- 00:17:40suggested that I left a few things out
- 00:17:43which included hiring the appropriate
- 00:17:45translators ensuring construct
- 00:17:47equivalence and that's something Barbara
- 00:17:49Byrne has written about
- 00:17:50and I would encourage you to take a look
- 00:17:53at her work and then even to decide
- 00:17:56whether or not to adapt or to build
- 00:17:58anew and whether to link scores
- 00:18:00across and I'm going to come back to
- 00:18:02that and in another article I've written
- 00:18:04I've pointed out that I think there are
- 00:18:06real scoring issues that have to be
- 00:18:08addressed across versions and so I think
- 00:18:12there are lots of different things we
- 00:18:13could add to that
- 00:18:14it certainly wasn't something handed
- 00:18:16down on a tablet and that
- 00:18:20this is where quantitative and
- 00:18:22qualitative clearly get involved you
- 00:18:26have to have reviews of the assessment
- 00:18:27for usability reviews of the instrument
- 00:18:30for comparability pre tests with
- 00:18:32relevant individuals timing and and we
- 00:18:35know that culturally there are huge
- 00:18:36differences in terms of people's
- 00:18:38consideration of time and how important
- 00:18:41time is suitability of instructions and
- 00:18:44questions about the appropriateness of
- 00:18:46certain items and so forth Guillermo
- 00:18:50Solano-Flores with whom I've worked on
- 00:18:52some of the projects we're talking about
- 00:18:54here has defined something called test
- 00:18:57translation error the lack of
- 00:18:59equivalence between the source language
- 00:19:01version and the target language version
- 00:19:02of test items due to the nature of
- 00:19:06languages it's possible that an adapted
- 00:19:08form of an assessment does not capture
- 00:19:09or transfer all nuances, and the psychometric
- 00:19:12consequence is that the adapted
- 00:19:16version potentially tests different
- 00:19:17constructs than the original form or tests
- 00:19:20them slightly differently so what kinds
- 00:19:24of research is needed after adaptation
- 00:19:27certainly you need to check reliability
- 00:19:29in a variety of different ways
- 00:19:31because it's so easy we frequently only
- 00:19:33do internal consistency anymore but I
- 00:19:36think test-retest and other things
- 00:19:38to note whether it's a state or a trait
- 00:19:40for example are also important item
- 00:19:43analysis important factor analysis of
- 00:19:46items SEM analyses and that's what
- 00:19:48Barbara pushes and then secondarily
- 00:19:51there I've got the SEM Fairness analyses
- 00:19:54although one of my former students Steve
- 00:19:58Sireci who many of you know he's talked
- 00:20:01a lot he's actually used to do workshops
- 00:20:03on using DIF in adapted and translated
- 00:20:06measures but more recently has come up
- 00:20:09with the idea it's probably not
- 00:20:10appropriate to do DIF analyses
- 00:20:12across versions because what you're
- 00:20:15doing is you're confounding two
- 00:20:17variables with no ability to separate
- 00:20:19them you have group differences and
- 00:20:21translation differences and those are
- 00:20:23completely and totally confounded so you
- 00:20:25can't separate them he and Swaminathan
- 00:20:29have written that up looking at norms
- 00:20:32and and then there's the possibility of
- 00:20:33linking and I should note that Linda and
- 00:20:36Bill Angoff, I think Linda Cook and Bill
- 00:20:38Angoff, did probably the best-known
- 00:20:41linking study I think on the Spanish
- 00:20:44version of the SAT to the English
- 00:20:45version back maybe 20 years ago or so,
- 00:20:4825 now. Beyond validity, this came up
- 00:20:52earlier there's the term of utility and
- 00:20:54usefulness and and my favorite example
- 00:20:57of this is the Canadian SAT now there's
- 00:21:00probably only one or two people that
- 00:21:01even know
- 00:21:02wasn't in this room but in the in the
- 00:21:0660s the Ontario Institute for Studies in
- 00:21:10Education decided they were going to
- 00:21:12build an SAT and ETS sent up a lot
- 00:21:15of consultants to work with them and
- 00:21:17they built a very nice Canadian SAT and
- 00:21:20when they were all done the Canadian
- 00:21:24government decided students shouldn't
- 00:21:26pay for it the university should pay for
- 00:21:28it if they want so all the costs were
- 00:21:30going to be distributed to the
- 00:21:32universities and at that point they
- 00:21:35decided no one wanted to use it and so
- 00:21:37it went away so all the development
- 00:21:38costs were for naught and and that's in
- 00:21:41my mind the classic case of poor utility
- 00:21:43and poor planning but there are there
- 00:21:47are other cases you have to decide is it
- 00:21:48really worth doing this from a whole
- 00:21:50variety of purposes and then the
- 00:21:52question does it make sense to equate or
- 00:21:54link tests across languages and perhaps
- 00:21:57if the questions are really similar and
- 00:21:59you have a lot of other information that
- 00:22:02you know about it might make sense
- 00:22:06slides get better okay
- 00:22:08I thought it was making it easier for
- 00:22:10people to sleep but let's see the
- 00:22:14decision demands really very high level
- 00:22:18of tests and psychometric equivalence
- 00:22:20you must be convinced that those tests
- 00:22:23are really highly comparable and most
- 00:22:25equating designs have much more
- 00:22:27rigorous requirements as Neil just
- 00:22:29explained then we have in adaptation
- 00:22:31studies and there's a good article in
- 00:22:35Educational Measurement: Issues and Practice entitled
- 00:22:38Problems and Issues in Linking
- 00:22:40Assessments Across Languages by Sireci
- 00:22:41and others. What we need to know when
- 00:22:45we're adapting a measure is one are the
- 00:22:49constructs equivalent you need to know
- 00:22:51that even before you get into the
- 00:22:53measure
- 00:22:53itself then we have these same
- 00:22:55constructs in different cultures are
- 00:22:56they equally meaningful in different
- 00:22:58cultures then are the tests equivalent
- 00:23:01in those different cultures and are the
- 00:23:03testing conditions and so forth the same
- 00:23:06and all those really need to be
- 00:23:07established Creuset noted that
- 00:23:10adaptation errors are the most prevalent
- 00:23:12source of DIF in international
- 00:23:14assessments and he said that we know
- 00:23:17that even state to state there are some
- 00:23:18curricular differences but when you go
- 00:23:20across countries there are huge
- 00:23:21curricular differences then there are
- 00:23:24cultural biases and translation errors
- 00:23:26all of which cause postural issues now
- 00:23:32the International Test Commission bill
- 00:23:33is famous really for its test
- 00:23:36adaptation guidelines that came out a
- 00:23:38few years ago in their second edition
- 00:23:40and these are to promote good practice
- 00:23:43in test adaptation you may know that
- 00:23:45ITC the International Test Commission
- 00:23:47was developed initially because of
- 00:23:50European countries as Europe became
- 00:23:53the European Union you can now
- 00:23:59move easily across countries and so
- 00:24:00forth so that people need to take tests
- 00:24:02in different languages and so they've
- 00:24:05put out really simple
- 00:24:08easy-to-understand guidelines and
- 00:24:10they're all freely available and
- 00:24:12downloadable they now have like six sets
- 00:24:13of guidelines this is to ensure a level
- 00:24:17playing field for testing across
- 00:24:18national boundaries and to provide a
- 00:24:20mechanism whereby test users can observe
- 00:24:23their duty of care to the public without
- 00:24:25regard to national boundaries I do
- 00:24:28believe documentation is important and
- 00:24:31that's one of those things that has
- 00:24:33increasingly become difficult to find in
- 00:24:35the test
- 00:24:35lots of tests don't have manuals anymore
- 00:24:38I know Wayne Camara told me a few years
- 00:24:41ago that the College Board decided well
- 00:24:43we're doing the research but we don't
- 00:24:44have to pull it all together into a
- 00:24:46single book you know I think users need
- 00:24:50information that's easily available and
- 00:24:53so forth
- 00:24:55now I'm gonna get into the adaptation
- 00:24:58issue some of you may know the critical
- 00:25:01thinking component of AHELO which
- 00:25:05stands for let's see Assessment
- 00:25:09of Higher Education Learning Outcomes in
- 00:25:12English it's not as well known as PISA
- 00:25:14and so forth
- 00:25:16it used the CLA the Collegiate
- 00:25:20Learning Assessment which you also may
- 00:25:21know is an outcomes assessment measure
- 00:25:24used by some 1,300 colleges in the
- 00:25:26United States it's now the CLA+ it
- 00:25:30was based actually on a GRE model in a
- 00:25:32sense it's a performance assessment
- 00:25:34where you read three or four pages of
- 00:25:36material and then you write an essay and
- 00:25:39it is it used to be scored it isn't
- 00:25:42anymore but it used to be scored in
- 00:25:43English by the GRE's automated
- 00:25:49assessment now Buros was hired to
- 00:25:54translate this into a variety of
- 00:25:56languages or to work with national
- 00:25:58committees South Korea Slovakia Egypt
- 00:26:00Colombia and so forth and a few other
- 00:26:02countries now it's an essay you read
- 00:26:05this problem the problem that they used
- 00:26:07internationally was that there are two
- 00:26:11lakes there's a river between them you
- 00:26:13want to harness water power as it goes
- 00:26:17from one lake to the other across the
- 00:26:18river but there's an endangered fish
- 00:26:20that lives in
- 00:26:21the river and so there's no right answer to
- 00:26:24this but the thought is you have to
- 00:26:25write an essay that describes you're
- 00:26:27sensitive to the fish and you understand
- 00:26:30the need for power and things like that
- 00:26:32and I should note it's a company and
- 00:26:34it's a for-profit company that wants to
- 00:26:37harness the power so we work with
- 00:26:42these different countries with teams of
- 00:26:44people in the country to work on those
- 00:26:46translations now Slovakia as an example
- 00:26:48was a Western country used to be part of
- 00:26:51Czechoslovakia they're a NATO country
- 00:26:53there were almost no problems there at
- 00:26:56all it translated very easily into their
- 00:26:58language it makes sense to them and so
- 00:27:00forth
- 00:27:00now Kuwait Richard Eggleston had done
- 00:27:03the translation there the year before
- 00:27:05and they have no rivers no lakes the
- 00:27:10students don't know anything about water
- 00:27:12power and so the way the problem was
- 00:27:15changed was this became a seagoing fish
- 00:27:19and they were trying to harness ocean
- 00:27:20power now that starts changing the
- 00:27:23question it's now it's introducing a new
- 00:27:27concept as opposed to a concept that
- 00:27:29people may know about now Colombia was
- 00:27:33another country and Willie
- 00:27:35Solano-Flores worked with us on this one and
- 00:27:39years ago I knew that when we talked
- 00:27:41about turning the GRE into Spanish we
- 00:27:44were told we'd need at least three
- 00:27:45different versions of Spanish and indeed
- 00:27:47Colombia needed a different version from
- 00:27:49Mexico that had already translated it
- 00:27:52and it was mostly the same
- 00:27:56but just different words being inserted
- 00:28:00now then we get to South Korea South
- 00:28:03Korea said they had a great adaptation a
- 00:28:07great translation but our analysis of
- 00:28:10the data showed that it made no sense
- 00:28:11was almost random data it looked like
- 00:28:14and yet so I was charged to find out why
- 00:28:18this was not working
- 00:28:19and it just so happened I had a doctoral
- 00:28:21student who was like me an ETS intern by the
- 00:28:22name of HyeSun Lee and she is now done
- 00:28:26and she teaches in the California State
- 00:28:27University system and she read it and
- 00:28:31said this doesn't make any sense because
- 00:28:33there's no power companies in South
- 00:28:36Korea the government supplies the power
- 00:28:38and it isn't something that
- 00:28:41people pay for in the same kind of way
- 00:28:43that they do in the United States so
- 00:28:46they had to change the question and once
- 00:28:50we changed it to the government suddenly
- 00:28:52the data came out much better so they
- 00:28:55had done more of a literal
- 00:28:57translation of that component and it
- 00:28:59just didn't work now then we get to
- 00:29:03Egypt and I did this one with Willie
- 00:29:07Solano by the way and we had an Arabic
- 00:29:10version already based on Kuwait but we
- 00:29:13were told Kuwait spoke high Arabic and
- 00:29:15Egypt spoke low Arabic I'm not sure
- 00:29:17about that difference but that happens
- 00:29:20in a lot of countries and so we knew
- 00:29:22we had to at least do that now obviously
- 00:29:24unlike Kuwait they have the Nile running
- 00:29:27right through Cairo so they know
- 00:29:29rivers and lakes and we actually did
- 00:29:32think-alouds much as you would do with
- 00:29:34students with disabilities and we
- 00:29:36watched two people it's interesting to
- 00:29:38watch people doing this in Arabic
- 00:29:39when you don't speak it but we got
- 00:29:41translations back and the biggest
- 00:29:44problem they said is their power is also
- 00:29:47provided by the government but they said
- 00:29:50no one in the government would ever ask
- 00:29:52for our input as a consultant it just
- 00:29:55would never happen the government
- 00:29:57believes it knows all the answers and
- 00:29:58basically the bottom line is and I want
- 00:30:03to just mention that we did this at the
- 00:30:05tail end of the revolution and there was
- 00:30:07gunfire in the background while we were
- 00:30:09doing this and it was and you know to
- 00:30:12get into our hotel
- 00:30:14we had to have dog sniffers on the car
- 00:30:17and stuff like that it was really quite
- 00:30:19fascinating so their solution was to
- 00:30:23make this into the United States that
- 00:30:25you're a consultant to a company in the United
- 00:30:27States doing this because they thought
- 00:30:28in the United States people might
- 00:30:30actually ask questions
- 00:30:32in the interest of time I'm not going
- 00:30:35to go through these methodological
- 00:30:36issues I'm told that I'm getting really
- 00:30:39short but our experience looking at some
- 00:30:42adapted measures is that some users try
- 00:30:46to translate the validation from English
- 00:30:48and just say it's the same he's done
- 00:30:51that for example we just believe the
- 00:30:53validation research is the same as it was
- 00:30:55in English sometimes they do it in a
- 00:30:58couple of countries and then they assume
- 00:30:59well if it's true in Mexico and
- 00:31:02it's true in Spain well then it would be
- 00:31:03true all over the world and that that's
- 00:31:05problematic some scales even use the
- 00:31:08same norms from the original language
- 00:31:11let's see where they do have norms it's
- 00:31:13usually a much smaller and less
- 00:31:14representative sample that was true on
- 00:31:16the EIWA which was done in a very
- 00:31:18disproportionate sample in Puerto Rico
- 00:31:20and there become lots of other fit
- 00:31:23issues we believe OECD uses national
- 00:31:27expert committees they have double
- 00:31:29translation which means two people are
- 00:31:30actually translating the measure or two
- 00:31:32groups and then they compare the two
- 00:31:34translated measures for comparability
- 00:31:37and they have either cross checks or
- 00:31:40reconciliation we think again I'm going
- 00:31:46to skip this one I think but there are
- 00:31:47reasons why this is continuing on
- 00:31:51context matters and we know in
- 00:31:54some countries for example a huge
- 00:31:56proportion like the United States
- 00:31:58actually a huge proportion of people go
- 00:31:59to college and universities and there are
- 00:32:01some other countries where it might only
- 00:32:02be five or ten percent of the population
- 00:32:04so then you have very
- 00:32:07unusual comparisons there's lots of
- 00:32:11economic factors cultures perceptions
- 00:32:13linguistic structures styles there are
- 00:32:17countries
- 00:32:18go to different universities for example
- 00:32:21so my theme as a whole is that adapted
- 00:32:25measures have a huge appeal they have
- 00:32:27great potential they have huge financial
- 00:32:29worth for testing companies but we need
- 00:32:32to conduct more and better research on
- 00:32:34adapted measures and I question
- 00:32:36whether a lot of cross-national cross-language
- 00:32:39equatings will be possible
- 00:32:40or even meaningful thank you very much
- 00:32:44[Applause]
- test adaptation
- cultural understanding
- language assessment
- psychometrics
- ethical considerations
- test translation
- cross-cultural research
- validity
- reliability
- international assessments