View our in-class presentation at: http://prezi.com/ugewnywkdqnc/

The Myth of Objectivity

1. Introduction

Figure 1. Barack Obama addresses the value of standardized testing

Our essay will critically analyse the concepts of "standards", "testing", and "accountability". To begin, it is important to define our terms.
When discussing "standards", we will be using the definition found in the Encyclopedia of Education.
  • "The term standards here refers to official, written guidelines that define what a country or state expects its state school students to know and be able to do as a result of their schooling" (Crighton, 2002)
For the purposes of this essay, "testing" will consistently refer to "standardized testing". Again, we will use the definition found in the Encyclopedia of Education.
  • "The term standardized testing was used to refer to a certain type of multiple-choice or true/false test that could be machine-scored and was therefore thought to be 'objective.'" (Crighton, 2002)
With regard to "accountability", Jacob E. Adams and Michael W. Kirst define it as follows:
  • "[A] relationship in which a 'principal' holds an 'agent' responsible for certain kinds of performance. The agent is expected to provide an 'account' to the principal. This account describes the performance for which that agent is held responsible." (1999)

The defining quality of a good test is objectivity: the test itself should be objective, and so the results and data it produces should be objective as well. Given the title of our essay, "The Myth of Objectivity", it is evident that we have come to a critical and skeptical position on the objectivity of standardized testing. To illustrate this central theme we have included the following graph (Figure 2), which will be revisited throughout the essay. Given the data in Figure 2, which school would you choose for your child? The answer seems obvious: School B. But can it be said that School B is objectively the best school? Our hope is that by the end of this essay the choice of School B will not be as obvious as it seems now.
Figure 2. Hypothetical Standardized Test Scores for 3 Schools

2. Initiating Readings Summarization and Analyses

2.1 "Examined Life" by Malcolm Gladwell

Stanley Kaplan grew up in Brooklyn, New York, on Avenue K. From the time he was a little boy, Stanley was immersed in his studies while his friends were playing stickball. He always went out of his way to assist students who were having difficulty understanding the material presented in class. On occasion, he would even take over instructing a math class. After graduating Phi Beta Kappa from City College, Kaplan began tutoring students out of the basement of his parents' home, which he called the "Stanley H. Kaplan Educational Center." In 1946, he was approached by a student for help with a college entrance exam he was unfamiliar with, called the Scholastic Aptitude Test (S.A.T.).
Figure 3. Malcolm Gladwell

The intention of the test was to measure 'innate ability.' It was supposed to measure not what a student had learned, but his or her capability to learn. Stanley was puzzled by the idea of not studying for a test, so when students came to him to be tutored for the S.A.T., he gave them repetitive exercises to practice for the test. His students seemed to score highly on the exam after following his guidance. In the 1970s, he went national and designed an S.A.T. preparation course.

Kaplan undermined the design of the S.A.T. As soon as he was able to coach students to write the S.A.T., the validity of the aptitude test was lost. Research now suggests that the S.A.T. has weak predictive validity, a statistical measure of how well a high-school student's performance on a given test predicts his or her performance as a college freshman. By contrast, achievement tests, which measure accomplishment, are 10 times more likely to predict the success of students.
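Predictive validity is usually reported as a correlation between test scores and a later outcome. The following is a minimal sketch of that idea, assuming Pearson's correlation coefficient as the measure; the scores and grade-point averages are hypothetical values invented for illustration and do not come from Gladwell's article or any study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-variation of the paired values around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list has no variation")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: entrance-test scores and the same students' freshman GPAs.
test_scores = [1010, 1150, 1230, 1320, 1400, 1480]
freshman_gpa = [2.9, 2.4, 3.1, 2.8, 3.6, 3.2]

validity = pearson_r(test_scores, freshman_gpa)
print(f"Predictive validity (Pearson r): {validity:.2f}")
```

A value near 1 would mean the test orders students almost exactly as their freshman results do; a value near 0 would mean the test says little about later performance.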

The University of California is one of the first universities to initiate a move to a holistic admission process, which abandons the reliance on standardized test scores. Kaplan ultimately devoted a great amount of time to figuring out the ideology of the S.A.T. He used old exams as templates to learn its design. He reasoned that anyone who could understand the ideology of the exam would have a tremendous advantage.

Additional research shows that performance is not simply a matter of innate ability; parental investment and practice are also factors in predicting how well one might do. Ultimately, Stanley Kaplan punctured the mystique of the S.A.T. His program revealed that even test-taking skills could be developed by practice.

2.2 "Goals 2000: What's in a Name?" by Susan Ohanian

Figure 4. Susan Ohanian
This article begins by questioning the educational goals entitled “Goals 2000”, which have been implemented by the American government. These goals deal with what the government deems important steps that American youth must take during their education. The author refutes these notions and develops many interesting points that explain why the government is just “interfering in the lives of children”. Ohanian describes the ways in which the corporate world is meddling with the educational theories and practices of today. She explains that these people describe students as “human capital” (p. 345) and the relationships between teachers and the community as the relationship between “buyers and sellers” (p. 345). The article explains that it is business people who are defining the business of education, and the author insists that they should stop because they are completely ignorant of how the education system truly functions. Ohanian develops this argument by stating that test developers pay no attention to the social structures, events and ramifications at play when they are developing a test. From this article it can be concluded that students are simply seen as intake machines that can be bought or sold on an open market, and that the only concern of test developers is the vast quantity of funds that can be generated through the educational system.

The author investigates standardized testing and comes to the conclusion that these tests are not serving any real educational purpose because, in some cases, 98% of schools in the state are receiving failing grades. She finds it terrible that these test scores reflect issues outside of the educational system, including real estate values and community funding. The article gives examples of ridiculous questions that have appeared on previous standardized tests and then explains the reasoning that the test creators gave in choosing the correct answer. Ohanian makes the point that students do not grow according to standards, so how can society justify using these tests to pinpoint the intellectual aptitude of a student? These standardized tests are pre-condemning students to failure. Ohanian points out that when students fail standardized tests miserably, society is quick to blame students and teachers, but never challenges the system that designed the test in the first place. The standards being thrust onto the student body of America are having negative implications, leading children as young as 9 years old to think they will bring shame upon their family if they do not pass a test (p. 348). Ohanian points out that standardized tests seem to be taking over all forms of major assessment and evaluation, leaving many students with unsatisfactory grades. She also explores the concept of accountability, stating that the individuals who develop the tests do not have to answer to anyone, as only teachers and students are challenged when failing grades are achieved. She adds that students should be allowed to fail without facing dire repercussions, as it is through failure that students learn how they can improve. The author concludes by stating that it is the educational system that is actually failing, and that improvements can only be made through massive restructuring.

This article is written by an individual who is very sceptical of standardized testing. Writing from an extremely left perspective, Ohanian makes many accusations and allegations against the merits of standardization. The article shows the differences that exist between conformity and critical thinking, and how individual thinking can be detrimental to overall grades. Throughout the article, it is brought to the reader's attention that there is no way in which the validity of test creation can be assured. The article also describes the way in which students are forced to conform to the test, while tests never conform to students. It brings to light that these standardized tests offer no assessment for learning, only assessment of learning, and that testing is seen as an end and not a means. Standardized tests are also freighted with assumptions, and as a society we are ignorant of how to fully understand them. The article argues that the problems with testing are just a cover-up for the larger governmental problems that exist socially. It also shows that, as a social group, we believe that the output data generated from tests alone is significant enough to assign a level of intelligence to a student. We believe that numbers are truthful, whereas other, more narrative means of evaluation may be seen as fictitious. The myth of tests being objective incorporates our hunger for numbers and masks an arbitrariness that, under inspection, seems outrageous.

2.3 "Beware of the Standards, Not Just the Tests" by Alfie Kohn
Figure 5. Alfie Kohn

Mr. Kohn argues that too many people are criticizing the tests and not the standards. He goes on to say that the two are inextricably connected: tests serve as the enforcement mechanism for the standards. So if the standards are flawed, then the testing must also be flawed: garbage in, garbage out!
Mr. Kohn put the idea of 'standards' under a microscope and settled on four categories by which to judge them.

#1 How Specific? (Students learning how to learn vs. Being told what to learn)
Students learning how to learn:
Harold Howe II, the U.S. commissioner of education under President Johnson (1963-1969), was asked what the national standards should be like. He responded, "as vague as possible."
Kohn argues that it's better to offer broad guidelines for student learning. This gets students more actively involved in designing their own learning, formulating questions and creating their own projects. Students can then reason carefully, communicate clearly and have fun!
Being told what to learn:
Policy makers want a teacher-proof curriculum. Teaching then becomes a race to cover a huge amount of material, which dumbs down classrooms. It also leaves students excluded from the learning process and alienated, and learning isn't fun.
Teaching well isn't about working your way through a list of what students should know, but about getting students to think for themselves.

#2 How Quantifiable? (Measurement vs. Exploration)
Standards are chosen for their testability, and tests are based on those standards. Can we teach a student a list of facts and skills and have that person write an exam which we can grade? You have to be careful! It is much easier to quantify the number of times a semicolon has been used correctly in an essay than it is to quantify how well the student has explored the ideas of that essay. Some things can't be quantified and measured, and these turn out to be very important for a student's future and for understanding the world.
Kohn concludes: "the more emphasis that is placed on picking standards that are measurable, the less ambitious teaching will become."

#3 How Uniform? (Opportunity vs. Uniformity)
A positive aspect of standardized testing is making sure students from poor families don't receive a second-rate education.
But standards dictate what each student needs to know by the end of each grade. This assumes that each child is the same, forces all children to learn at the same pace, and has negative effects on learning (as described before).
Not every student learns at the same speed; some are more gifted than others.
We should push each student to their full potential!

#4 Guidelines or Mandates? (Free thinking vs Conformity)
Free-thinking standards, or standards that offer only guidelines: these help teachers venture into different ways of thinking and improve their craft and skills.
Standards presented as mandates: a "teach this or else" approach. Virtually all the states have chosen this approach. It controls teachers. It is insulting to teachers to suggest that they need to be told what and how to teach, since it implies that they otherwise wouldn't know how. Some of the best educators might find another career. Students will be the ones who suffer.

Mr. Kohn had three overall views:
1) Standards are just as problematic as testing.
2) Our top priority should be to free educators and students, to allow teachers to instruct creatively and guide students to learn how to learn
3) What is troubling today is not people disagreeing, but rather the rarity of such discussion.

2.4 "The Question of Fairness" by E.D. Hirsch
Figure 6. E.D. Hirsch

In an attempt to counteract the current inequalities between the test scores of ethnic and racial minority students and white students, some people have suggested using compensatory practices such as “renorming” test scores to raise the averages of the lower achieving students. Hirsch argues that this is not a practical solution because it simply erases the differences between students instead of addressing them.
(Renorming: One takes the scores from each “racial and socio-economic” group that shows a gap, and then makes a mathematical adjustment for all members of that group)
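Hirsch does not specify the exact adjustment, so the following is only a rough sketch of what such a "mathematical adjustment" could look like. It assumes the simplest case, a constant shift that moves each group's mean to a common target; the function name, group labels and scores are all hypothetical.

```python
def renorm_by_mean_shift(scores_by_group, target_mean):
    """Shift every score in a group by a constant so the group mean equals target_mean.

    This is one assumed form of renorming, chosen for simplicity; it is not
    taken from Hirsch's text.
    """
    adjusted = {}
    for group, scores in scores_by_group.items():
        group_mean = sum(scores) / len(scores)
        shift = target_mean - group_mean
        # Every member of the group receives the same adjustment.
        adjusted[group] = [score + shift for score in scores]
    return adjusted


# Hypothetical raw scores for two groups that show a gap.
raw_scores = {
    "group_a": [55, 60, 65],
    "group_b": [70, 75, 80],
}

renormed = renorm_by_mean_shift(raw_scores, target_mean=75)
for group, scores in renormed.items():
    print(group, scores, "mean:", sum(scores) / len(scores))
```

After the shift both groups report the same average, yet nothing about what any student knows or can do has changed, which is the heart of the objection that follows.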
The strongest objection to renorming is the “practical injustice” it does to the very people it is trying to help; getting high scores on a test is not the ticket to social equity for students with low educational competencies. Real social justice, argues Hirsch, lies in improving the economic conditions of groups that have been both economically and educationally depressed. Nowadays, he continues, one’s real-world competencies are much more important than one’s grades or scores. As such, superficially increasing the achievement of minority students through methods like renorming, or adjusting the ways in which student performance is measured, still does not address the most crucial part of the problem.
“Anti-testers” who suggest that tests like the SATs are biased are right in one sense and wrong in another: the tests are not “culturally biased” in that they contain information that would function to discriminate against people belonging to a specific ethnic or racial group; tests are, however, biased against certain cultural and linguistic groups that do not promote reading, writing, and speaking of standard English, and the solving of math problems. But this is a problem that will necessarily arise if the tests continue to focus on skills that are necessary to function in the marketplace – as they should, says Hirsch.
Most parents, including minority parents, accept the idea that schools should teach mainstream science, mathematics, and language arts. These parents clearly recognize the direct connection between economic advancement for their children and the mastery of the culture of the marketplace.
An important concept that Hirsch discusses is that of the lingua franca. Hirsch describes the lingua franca as the language of the marketplace, but the term is more widely used to mean the "standard language" of an era - the language in which people of different mother tongues communicate. Currently, the lingua franca is standard English. Hirsch goes on to say that the marketplace is a commons that erases ethnic distinctions and is, in fact, the creator of the lingua franca. The standard of written English we use now is a hybrid created in and by the marketplace; as such, the "notion that such a hybrid culture, devised to enable communication between strangers, is somebody’s essential, identity-defining culture is a historical [mistake]."
Another point of contention for those in favour of renorming is that of the "monocultural classroom". What detractors fail to see is that effective classroom schooling has to be monocultural for the same reason the marketplace has to be – so that all can participate. But it is misleading to suggest that this monocultural school-based culture is not itself a hybrid, as lingua francas always are.
Despite school integration, Hirsch posits, American public education has always had a differential effect on social classes, and as such a differential effect on ethnic and racial groups who belong disproportionately to disadvantaged classes. The reason is as follows: "students from middle and upper classes, coming from educated homes, learn more in school and become more competent than educationally less advantaged students because the intellectual capital derived from their homes enables them to derive a great deal more from poorer schooling than can students who are in no position to fill in the gaps with home-provided knowledge."
Finally, fairness in testing cannot be separated from fairness in schooling. As mentioned earlier, the focus on test scores neglects the underlying issue: incompetent students and ineffective schools.

3. The Achievement Gap

3.1 Understanding the Achievement Gap

To carry on any discourse about “The Achievement Gap” we must first define the term. According to the Encyclopedia of Education, “numerous cross-cultural studies indicate that many ethnic minority students are not faring well in U.S. schools” (Ho, Raley, Whipple, 2002); the disparity in academic achievement between ethnic groups is referred to as “The Achievement Gap”. The term is most frequently used in the United States, and is also used to describe the differences in academic achievement between social and socioeconomic groups.

While high school graduation, college-enrollment and -completion statistics are occasionally used as a measurement tool, the most common data used to compare groups are scores on state and nation-wide standardized tests. In its National Assessment of Educational Progress (NAEP), the US governmental agency National Center for Education Statistics found the following disparities in achievement between Whites, Blacks, and Hispanics:

Figure 7. Reading- ages 9 (light gray), 13 (dark gray), and 17 (black)


3.2 Overcoming the Achievement Gap - Addressing the Quality of Education, Not the Quality of Tests

As introduced in section 2.4 "The Question of Fairness", E.D. Hirsch discusses the practice of "renorming" - arguing that it does not, in fact, help the minority students that proponents claim to be benefitting. The argument that Hirsch makes, and that I have come to agree with, is that the focus should be on addressing the gap in student knowledge as opposed to the gap in test results. (Hirsch, 1999)

In an article titled "It can be done, it's being done, and here's how" (2009), Karin Chenoweth profiles a number of American middle and secondary schools that have succeeded in lessening and closing the achievement gap - not by focussing on standardized test results as the problem, but by using them as an indicator. Her findings on the success of these schools are summarized succinctly in the following observation: "they succeed where other schools fail because they ruthlessly organize themselves around one thing: helping students learn a great deal." (Chenoweth, 2009)

Chenoweth explores several factors that have contributed to the success of these schools; for the purposes of this section I will address only one: background knowledge. A critical part of the struggle faced by minority and low-income students is a lack of background or precursory knowledge. As mentioned previously (section 2.4), minority and low-income students often struggle because the gaps in their knowledge are not being filled in at home. (Hirsch, 1999) Whereas middle- and high-income students may have their learning reinforced and supplemented by educated parents, school is essentially the only source of knowledge for many minority and low-income students. (Chenoweth, 2009; Gladwell, 2008; Hirsch, 1999) In his best-selling book "Outliers" (2008), Malcolm Gladwell examines studies by sociologist Karl Alexander that compare the results of Baltimore first graders on a math- and reading-skills exam, the California Achievement Test. Alexander found that by the end of the school year low-income students scored just as well as, or better than, their middle- and high-income counterparts on these exams. The gap appeared when students were tested in the fall, after returning from summer break. The reading and math scores of middle- and high-income students increased over the summer, while low-income students showed no gains and often had lower scores after the summer. (Gladwell, 2008)

In the successful schools that Chenoweth profiles, this issue is addressed head-on. Chenoweth states "[schools] that successfully teach students of poverty and students of color do not begin with the assumption that there are things they don’t have to explain" (2009); she then goes on to explain the rigorous process of diagnostic assessment that these schools practice. The deciding factor in success, however, is that the schools do not simply use the scores to rank and classify students; the scores are used to inform curriculum and classroom practices. Gaps in student knowledge are identified and then filled in with resources such as documentaries, monthly field trips to the zoo or theatre, and trips to the state capital - all with the intended purpose of building student vocabulary and general background knowledge. One teacher gives the example of viewing a nature documentary on earthquakes and volcanoes so that his students could understand the references in a novel study. (Chenoweth, 2009)

Using Standards to Guide Learning - When "Teaching to the Test" is Useful

"Good Tests are necessary to instruct, to monitor, and to motivate." - E.D. Hirsch

Hirsch and many people in the "pro-standards/-test" camp argue that the standardized tests that students are required to write are indeed a valid measure of learning, and that they test relevant and valid material. Again, in the practical world of educating low-income and minority children this argument is essentially moot.

One of the distinguishing characteristics of the schools profiled in Chenoweth’s article is that instead of analysing and critiquing the state and national standards, the schools regularly “study their state’s standards (and, sometimes, other states’ standards), think about what their students already know and are able to do, and decide what more they need to learn.” (Chenoweth, 2009) Schools that are succeeding in lessening and closing the achievement gap are using the standards as a tool instead of an excuse.

Figure 8 is a video profiling the KIPP Academy of Opportunity. At about 2:30 the students begin to talk about the progress they have made in terms of their standardized test results. The standards provide students with concrete ways to measure their learning and their progress.

Figure 8. KIPP Academy of Opportunity

3.3 Canadian Results

It is important to acknowledge that the majority of the information and statistics in this section relate to American schools, and the American population at large. The ethnic, racial, and socioeconomic demographics of Canada are quite different than those of the United States; while most of the comparative education literature from the United States deals with Whites, Blacks, and Hispanics, Canadian literature focuses more on the differences between Canadian-born (or White) students, Aboriginal students, and Immigrants/children of immigrants.

In my opinion, the Canadian group that can be compared most closely to the African-American population of the US is Aboriginal students. The two groups are similar in that they are both "indigenous" (in that they did not recently come to the country), they share a painful history of oppression, and they face similar challenges with respect to socioeconomic conditions. The two groups have had similar high school graduation rates in recent years as well (53% of African-Americans, 54% of Aboriginals as of 1999), although Aboriginals have made significant progress in the last ten years to increase their graduation rate to about 66% (Levin, 2009).

Another important distinction between the United States and Canada is that of the success of immigrants and children of immigrants. Recent Canadian studies have shown higher levels of educational attainment in foreign born students and visible minorities than in their native-born counterparts. (Abada & Tenkorang, 2009)

3.4 Connecting Themes

In each of the following sections of our essay we will conclude with a "Connecting Themes" section to revisit Figure 2 in our introduction.
With regard to the achievement gap, it is crucial to consider what the graph in Figure 2 does not tell us. To draw any meaningful conclusions about the students at these schools, more information on the ethnic, racial, or socioeconomic breakdown of each school would be helpful.
With respect to standards and testing, if school "A" were to adopt the approach of the schools profiled in Karin Chenoweth's article (2009) - using state and national standards as guidelines that inform curriculum and school policy - their results could soon look like those of school "C", and eventually "B".

4. Reliability and Validity

4.1 It’s Not Just a Test

Traditionally, standardized tests serve two primary purposes: to judge the quality of pedagogical instruction and to quantitatively measure the learning accomplished by students (Grant et al., 2002). However, standardized tests also serve a tertiary purpose of posing as an educational bar, which all students and schools can be measured against and consequently ranked as either ‘sufficiently meeting the expected curriculum standards’ or shunted as being ‘less than average.’ Lastly, the quaternary purpose for some high-stakes standardized tests is to use the results to drive changes for further teaching and learning. Having said that, scholars have also suggested that standardized tests are “used as substitutes for perhaps less convenient, more difficult or otherwise cumbersome modes of measurement and evaluation” (Botel, 1986).

The results from standardized tests can be vital components of a student’s future; the spectrum of opportunities can be drastically reduced for a student who receives a poor grade on a so-called entrance or exit exam, more commonly referred to as a high-stakes examination. Figure 9 depicts how members of society can often allow test scores to define their identity and self-worth. Thus, an understanding of the reliability and validity of mass-produced standardized tests is critical for the progression and optimization of the standards of education.
Figure 9. A cartoon depicting the significance of standardized test scores


4.2 Unraveling Reliability

In the context of standardized tests, reliability is “the degree to which test scores for a group of test takers are consistent over repeated applications of a measurement procedure and hence are inferred to be dependable and repeatable for an individual test taker” (Rudner and Schafer, 2001). The reliability of a standardized test is expressed as a coefficient between 0 and 1, where 1 denotes an infallible or undeniably reliable exam; standardized tests usually have a coefficient of 0.90 (Rudner and Schafer, 2001).

The high reliability measure has led to the widespread implementation of standardized testing. Riffert (2005) argues that it is this high reliability coefficient of 0.90, compared with the reliability of individual teachers, which is estimated at between 0.30 and 0.50, that enables testing to continue in our culture. Thus, it is the high reliability of standardized testing that holds statistical power and catalyzes top-down educational changes and reform, on the grounds that classroom teachers paint an unreliable picture of student proficiency.
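One way to see what these coefficients mean for an individual score is the standard error of measurement from classical test theory, SEM = SD x sqrt(1 - reliability). The sketch below applies that formula to the coefficients quoted above; the score standard deviation of 100 points is a hypothetical value chosen for illustration, and applying the formula to the teacher figures is our own assumption rather than something taken from Riffert.

```python
from math import sqrt


def standard_error_of_measurement(score_sd, reliability):
    """Classical test theory: SEM = SD * sqrt(1 - reliability)."""
    if score_sd < 0:
        raise ValueError("standard deviation cannot be negative")
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return score_sd * sqrt(1.0 - reliability)


# Hypothetical spread of scores; only the reliability values come from the text.
HYPOTHETICAL_SD = 100.0

coefficients = [
    ("standardized test", 0.90),
    ("teacher estimate (upper bound)", 0.50),
    ("teacher estimate (lower bound)", 0.30),
]

for label, reliability in coefficients:
    sem = standard_error_of_measurement(HYPOTHETICAL_SD, reliability)
    print(f"{label}: reliability {reliability:.2f} -> SEM of about {sem:.1f} points")
```

Under these assumptions a reliability of 0.90 still leaves an uncertainty of roughly 32 points around any one student's score, against roughly 71 to 84 points for reliabilities of 0.50 and 0.30. A highly reliable test is more consistent, but an individual result is still not exact.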

4.3 Validity and Standardized Testing: Is there another option?

The validity of a standardized test is determined by how well the test measures what it was designed and intended to measure. As discussed in the summary of "Examined Life" by Gladwell, the S.A.T. II, which aims to measure achievement and gauge mastery of the high school curriculum, has better predictive validity than the S.A.T. I, which measures aptitude and inherent capability. As is the case with the S.A.T., many researchers have found that the most valid measure of students’ capabilities consists of combining various forms of psychometrics. These findings are supported by the research conducted at Concordia University by Volante (2004), who found that the validity of standardized tests is undermined by the phenomenon he calls “teaching to the test” (Figure 10). In his research, he observed that many schools allocate time for test preparation and familiarization at the cost of teaching the curricula. This suggests that the scores that some schools receive are not an indication of student achievement or aptitude, but rather of how much time a particular school deems appropriate for test preparation.

Figure 10. Katie Couric addresses the phenomenon of "teaching to the test."
Standardized testing offers a wide-scale method of testing students on identical material in a setting where variables can be controlled as the test-makers see fit. However, is there another way to obtain this information without compromising validity? In a study conducted by Grigorenko et al. (2009), the researchers found that the best measure of students’ ability accounts for aspects of self-regulated learning (SRL), such as academic self-efficacy, academic motivation and academic locus of control. In addition, validity increases when the measure incorporates the WICS (Wisdom, Intelligence, Creativity Synthesized) theoretical framework. When both the SRL and WICS indicators were measured, the validity of predicting success increased. These data suggest that although standardized testing may be valid, there may be other performance assessment measures that are more effective and extensive in measuring student achievement.

4.4 Connecting Themes

When we look again at Figure 2 and reframe our thinking to include the reliability and validity of standardized test scores, it is apparent that the data may be flawed. If, for example, some schools are preparing for the test in advance, which undermines the validity of the test, or the trends in reliability vary from year to year, this static graph suddenly no longer depicts what seemed so clear at first glance.

Finally, in Figure 11, experts in Canada are asked what is learned from standardized testing (EQAO) in Canada. The experts in this video agree that (a) the results are a measure of the system's performance, not the student's, (b) parents misunderstand the results, and (c) the test only measures certain kinds of achievement, mostly literacy and numeracy, and is only valid at predicting narrow measures of success. This video reiterates the concerns about the validity and reliability of standardized tests and about whether they are in fact an effective diagnostic tool.

Figure 11. Experts discussing what we learn from EQAO testing

5. Bloom's Taxonomy

5.1 Understanding Bloom’s Taxonomy

In 1956 a handbook was circulated setting out a taxonomy for cognitive thinking. Bloom, Engelhart, Furst, Hill, and Krathwohl were the authors of this handbook, entitled "The Taxonomy of Educational Objectives, The Classification of Educational Goals, Handbook I: Cognitive Domain." Bloom’s Taxonomy separates educational objectives into three “domains”: Affective, Cognitive, and Psychomotor. Handbook I focuses only on the cognitive domain, breaking it down into six categories of learning.
“The taxonomy classifies cognitive performances into six major headings arranged from simple to complex:
1. Knowledge: the recall of methods and processes
2. Comprehension: type of understanding or apprehension such that the individual knows what is being communicated
3. Application: the use of abstractions in particular and concrete situations
4. Analysis: the breakdown of a communication into its constituent elements or parts
5. Synthesis: the putting together of elements and parts so as to form a whole
6. Evaluation: judgments about the value of material and methods for given purposes”
(Nitko, 2001)

Figure 12. Bloom's Taxonomy

5.2 Taxonomy, Testing and Culture

Bloom’s Taxonomy is used in some classrooms today as a tool for rating the standards. Questions in Mathematics, for instance, can take a variety of forms that gauge students’ understanding of a multitude of topics. For example, a student might be asked to solve for ‘x’, to create an equation representing a real-life scenario, or to draw a graph. Teachers using the old percentage approach might assign 10 marks to each of these three questions. This creates the assumption that all three questions have the same degree of difficulty. What if one of these questions requires a higher level of understanding or knowledge in order to be solved? A lower-level student might get a higher mark, as they are able to gain more points on the easier questions. This is where Bloom’s Taxonomy can separate these questions into different categories to be evaluated. “The goal was to categorize the levels of abstraction of questions that commonly occur in educational settings. It provided a structure that could be used to develop tests that would accompany teaching to the objectives.” (Kohl, 2006)
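The weighting problem described above can be made concrete with a small sketch in Python. The question names, the Bloom category attached to each question, and the marks for the two students are all hypothetical, invented only for illustration; none of them come from the sources cited in this essay.

```python
# Hypothetical three-question test: each question is worth 10 marks and is
# tagged with an assumed Bloom category (for illustration only).
QUESTIONS = {
    "solve_for_x": ("Knowledge", 10),
    "draw_graph": ("Application", 10),
    "model_scenario": ("Synthesis", 10),
}

# Hypothetical marks earned by two students on each question.
MARKS = {
    "student_1": {"solve_for_x": 10, "draw_graph": 8, "model_scenario": 3},
    "student_2": {"solve_for_x": 6, "draw_graph": 7, "model_scenario": 8},
}


def percentage(marks):
    """Overall score under the plain percentage approach."""
    total_possible = sum(worth for _, worth in QUESTIONS.values())
    return round(100 * sum(marks.values()) / total_possible)


def by_category(marks):
    """Marks earned and marks available, grouped by Bloom category."""
    report = {}
    for question, (category, worth) in QUESTIONS.items():
        earned, available = report.get(category, (0, 0))
        report[category] = (earned + marks[question], available + worth)
    return report


for student, marks in MARKS.items():
    print(student, percentage(marks), by_category(marks))
```

In this invented example both students receive the same overall percentage, yet the category breakdown shows that one earned most of their marks on recall while the other was stronger on the higher-level question. The percentage alone hides that difference.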
Bloom’s Taxonomy can be used as more than a measurement tool. Let’s say two teachers each write a test, and both are marked out of 80. This does not mean the tests are equal. One test could be much easier than the other. Bloom’s Taxonomy can help by assigning a certain percentage of the questions to each category, helping to equalize the two tests while still leaving the teachers free to decide the questions. This can also be extended to unit planning and long-term goals. Bloom saw potential in this taxonomy as it was being developed. “He believed it could serve as a: common language about learning goals to facilitate communication across persons, subject matter, and grade levels; basis for determining for a particular course or curriculum the specific meaning of broad educational goals, such as those found in the currently prevalent national, state, and local standards; means for determining the congruence of educational objectives, activities, and assessments in a unit, course, or curriculum; and panorama of the range of educational possibilities against which the limited breadth and depth of any particular educational course or curriculum could be contrasted.” (Krathwohl, 2002)
Bloom’s Taxonomy only touches on the Cognitive principles of learning and not the Affective (emotional) or Psychomotor (physical manipulation of objects). The problem is classrooms across America are full of students with diverse backgrounds, religions and cultures. Bloom’s doesn’t address these ideologies either. “This approach did not deal at all with culture, content, or ideas. It was a first draft of a map that would presumably lay out guidelines into which all of the substance and content of learning could be fitted.” (Kohl, 2006) Bloom’s Taxonomy is constantly being updated and revised as our understanding of knowledge and learning also evolves. As educators we must understand what Bloom’s Taxonomy does for us but also what it doesn’t.

5.3 Personal Experiences of Bloom’s Taxonomy

Bloom’s Taxonomy was introduced into Canadian classrooms after I graduated from High School. I was only aware of percentage scoring and letter grading until my practicum in November 2009. I was at an Adult High School where most of my day was spent marking Mathematics papers. I used percentages at the request of my associate teacher. I then asked her why: why percentages and not Bloom’s Taxonomy? She told me she had tried Bloom’s but found it too time consuming, as we had to mark 20-30 assignments averaging 15 pages each. Using Bloom’s, we would have had to count up all the different categories on each page and write the total for each on the cover. But then I was asked to create an exam, and she told me I had to use Bloom’s Taxonomy. Poring through the course content, I had to identify questions that fit into knowledge, comprehension, application, etc. I found that in the end the exam was well rounded, as it contained not only questions from throughout the course but also varying degrees of difficulty. The students were able to use the feedback from the exam to identify which areas they needed to work on.

5.4 Connecting Themes

When looking at Figure 2, you might make the assumption that School B is effective at educating its students whereas School A is not. What this graph doesn’t tell you is how the students are being assessed, among other things. School B might be using a percentage approach when grading tests and assignments. This means equal weight might be given to easy and difficult questions alike, and the tests might not be focusing on all the different cognitive levels of understanding. Students with lower ability may therefore be given higher marks than students at Schools A and C, which could be using a Bloom’s Taxonomy approach to learning. That approach gives students more feedback when they are evaluated, increases their understanding, and helps them concentrate on the areas in which they need improvement. And, of course, it better prepares them for the future, which is the ultimate goal.

6. Grade Inflation

6.1 Understanding the Practice

Figure 13. Source: http://rattlernation.blogspot.com/2009/09/opinion-fix-grade-inflation-before.html

According to Minnesota State University, grade inflation can be defined as “the change in grade patterns so that the overwhelming majority of students receive higher grades for the same quantity and quality of work done by students in the past” (Minnesota State Colleges and Universities, 2009). Alteration of student achievement levels has been an issue of high debate in many academic institutions, with some institutions supporting the practice while others condemn it. There are a variety of causes for grade inflation, including pressure from the educational institution to retain student enrollment rates, higher grades being given in the hope of better teacher evaluations, changing grading practices, teacher and institution incentives, and increased levels of sympathy and/or empathy given to students. There is, however, one level of equality that exists in grade inflation, and this is between the sexes. According to Bejar and Blew (1981), whatever the cause of grade inflation, it is operating equally for males and females. Therefore, it can be understood that grade inflation is not a gender issue but rather an educational and pedagogical one.

6.2 The Incentive to Inflate

Freakonomics: A Rogue Economist Explores the Hidden Side of Everything
By Steven D. Levitt and Stephen J. Dubner
Chapter 1: What Do Schoolteachers and Sumo Wrestlers Have in Common?
Figure 14: Freakonomics Cover

This contemporary text evaluates many social issues, and this case study investigates personal morality. The morality, or lack thereof, of entrepreneurs, athletes and teachers was explored, and the subject of teacher deviance (in the form of cheating) was investigated. The teacher cheating that was studied took the form of inflating student scores on standardized tests. The motivation behind inflation was also studied.
This case study begins by examining the presence and influence of incentives on human life. Levitt and Dubner (2006) explain that incentives urge people to do more of a good thing and less of a bad thing. From this reading it can be determined that incentives do not arise naturally or by coincidence and are instead developed through human and social interaction. Levitt and Dubner categorize incentives into the following three groups:
a) economic b) social c) moral

Within society, dishonest people exist and they will attempt to take advantage of these incentive groups in any way possible. They will attempt to cheat the system to ensure personal gain. This case study investigated the Chicago Public School system, which embraced standardized testing in 1996, under the threat of being shut down if they did not improve student reading and comprehension levels. Advocates for the tests argued that they would raise the standards of learning and give students a greater incentive to study. Though authors such as Gladwell have argued that the value and reliability of these tests rest on great assumptions, advocates still maintained that the tests were beneficial to student development. Opponents argued that certain students would be unfairly punished if they did not perform well on this single test. Arguments that teachers would focus on teaching to the test rather than on more important lessons were also presented. Using computer-generated algorithms, student test results were analyzed and teachers who were possibly using grade inflation techniques were identified. Tests were then re-administered to a control group and the suspected teachers under the supervision of board employees. These results were analyzed and it was found that a large number of the suspected teachers were guilty of grade inflation. A number of these teachers were relieved of their positions, and upon further study it was found that levels of suspected cheating dropped by 30% the following year.

This case study critically examined how standardized tests have warped the incentives to such a high degree that teachers now have a reason to raise grades. If students receive poor scores, a teacher’s career is at stake, but if students perform well, teachers can be promoted which leads to higher financial gain, praise and recognition for their teaching skills. Therefore, the incentive for a teacher to inflate test scores is great. Another issue that is examined in this article is the concept of accountability. Teachers make the decision to cheat because their deviance is rarely investigated or punished. Therefore, teachers are rarely held accountable for their actions and feel no pressure to perform within the high degree of morality that should be present in their profession.

Teachers may feel that they are promoting student growth and aiding in development when in reality they are providing an irreversible disservice. As made apparent by Ohanian, in following years, students will find themselves in a dire struggle as they realize they have been cheated out of a quality education by a teacher with more concern for their personal self worth than the young minds they are supposed to be developing.

6.3 Grade Inflation in Ontario's Universities

Obtaining data that can be used to study the presence of grade inflation can be difficult. Maclean’s magazine, which is famous for its annual ranking of Canadian universities, does not even include this statistic when ranked positions are being formulated. Paul Anglin and Ronald Meng, who are both professors in the department of Economics at the University of Windsor, overcame these barriers and obtained data from two time periods, 1973-1974 and 1993-1994, which they used to assess the presence of grade inflation within Ontario universities. Their study found that though grade inflation is not uniform, it is the common trend, and this practice challenges the social and financial worth of obtaining a university degree. Due to the effects of grade inflation, being awarded a degree from a university is not as distinctive as it was in previous decades. Like Levitt and Dubner, Anglin and Meng agree that grades can be manipulated and used as an attraction device for potential students. Their study investigates the influence of grade inflation on employers and concludes that inflation makes the signals of a strong employee less obvious, and that variation in grading across both time periods and disciplines makes assessment of quality increasingly difficult. Even the number of students receiving the “coveted A” has increased, leading to further questions about the actual value of continuing education.

This study brings forth a number of implications for universities in Ontario. As noted by the authors of this study, universities in Ontario are under pressure to attract and retain students even in the presence of declining federal support. This means that professors may feel inclined to award higher grades for work of a lower quality just so the faculty remains at a surplus both financially and in terms of student enrolment. As a result, competition between universities is beginning to increase. From this, it can be determined that in the future, students may be offered larger incentives, such as entrance scholarships, for enrolling in a university. Using Levitt and Dubner’s theories on incentives, this may seem like a remarkable arrangement, but upon further examination such incentives may prove hollow, as students come to realize that they may be accepting not only lowered tuition fees but also a decreased quality of education.
Click on the widget below to obtain a full copy of “Evidence on Grades and Grade Inflation in Ontario’s Universities” by Anglin and Meng.

6.4 Personal Experiences of Grade Inflation

As an individual who has undergone several different levels of educational study, I believe that I have been subjected to the use of a bell curve on multiple occasions. During the assessment of one activity in high school, the teacher distinctly told students that he would use the bell curve to score the assignment. Students were told that it was a common practice in the post-secondary setting and that it was being used to familiarize us with its outcome. This manipulation of grades assigns a certain percentage of the class to each grade category. The assignment therefore became a competition for the “coveted A”. As students, we became concerned not with the overall quality of our own projects but only with whether they were better than the work of other students.
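How a curve of this kind ties a grade to rank rather than to quality can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the share of the class assigned to each grade, and the student names and scores are all hypothetical, and real curving schemes vary from teacher to teacher.

```python
def curve_grades(scores, shares):
    """Assign letter grades by class rank.

    scores: dict mapping student name to raw score.
    shares: list of (grade, percent_of_class) pairs from best to worst;
            the percentages are assumed to add up to 100.
    """
    # Rank students from highest raw score to lowest.
    ranked = sorted(scores, key=scores.get, reverse=True)
    grades = {}
    start = 0
    cumulative = 0
    for grade, percent in shares:
        cumulative += percent
        # Position in the ranking where this grade band ends.
        end = round(cumulative * len(ranked) / 100)
        for student in ranked[start:end]:
            grades[student] = grade
        start = end
    return grades


# Hypothetical split: 20% of the class gets an A, 40% a B, 40% a C.
SHARES = [("A", 20), ("B", 40), ("C", 40)]

# The same raw score of 85 for Ana in two invented classes.
class_one = {"Ana": 85, "Ben": 70, "Cal": 65, "Dee": 60, "Eli": 55}
class_two = {"Ana": 85, "Fay": 95, "Gus": 92, "Hal": 90, "Ivy": 88}

print(curve_grades(class_one, SHARES)["Ana"])
print(curve_grades(class_two, SHARES)["Ana"])
```

In the first invented class Ana's 85 is the top score and earns the A; in the second, the identical 85 is the lowest score and earns a C. The grade reports her standing relative to classmates, which is why the competition described above arises.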

Though it was not my own personal experience, during my undergraduate degree a classmate submitted an assignment in a course and received a grade of A. After making subtle changes to the assignment, he submitted it again the following year for a different course. He was considerably disappointed when he received a grade of C- on the paper. Though the chances of the professor of the first-year course admitting he was partial to the practice of grade inflation were slim, considering that this particular faculty was having noted trouble retaining second-year enrolment, the possibility of grade inflation became apparent. This was the first time that I began to seriously consider the use and practice of grade inflation within my postsecondary institution. At first I was in shock, though after some research I came to understand that the practice of increasing or altering student scores was more common than I had thought.

6.5 Avoid the Hassle, Just Give Them an A

This video is of George Leef, the director of research for the John W. Pope Center for Higher Education Policy, outlining some of the problems associated with grade inflation on college campuses. These comments were offered during an interview with Donna Martinez for Carolina Journal Radio (carolinajournalradio.com), Program No. 310. Video courtesy of Carolina Journal TV (cjtv.carolinajournal.com).

Video Analysis:
This video demonstrates that teachers may inflate grades just to avoid the hassle of justifying low grades to students. Using Levitt and Dubner’s notions, the incentive to inflate grades lies in the desire to avoid conflict. The lesson that can be drawn from this media insert is that, as teachers, we must be conscious of the reasons behind the grades we award and must have suitable justifications for awarding them. As professionals, teachers must also have the courage to explain their decisions in front of others and remain firm in the judgments they have made.

Figure 15. The incentive to inflate grades

6.6 Connecting Themes

Through a careful analysis of initiating readings and supplemental texts, a number of conclusions about the practice of grade inflation can be made. The most common theme that is evident throughout the texts studied is that grade inflation is a reality and exists even in the universities of Ontario. There are a number of incentives for a teacher to inflate student test scores, some driven by personal gain while others are driven by a government or institutional mandate. This brings into focus the idea that the standardized tests that are thought to be so significant in raising the standard of education are in fact a hindrance to the development of society’s youth. These tests can be punitive in nature, as they force student conformity and punish individual thought. Test scores have even been so low that they are inflated to give students a break. A better break for the students would have been to never subject them to the tests in the first place.
This leads us to the conclusion that the practice of inflating student grades and test scores only robs students of the quality education they are entitled to. Teachers may feel they are promoting student development when in actuality they are depriving students of a fair and just schooling experience.

Another inference that can be made is that high-stakes tests are not just high stakes for students. Teachers are also under pressure to have their students perform well, and the outcome of these tests can have large implications for the careers of teachers. These incentives can lead to an astonishing level of dishonesty, which seems even more unbelievable when one considers that the profession of teaching is founded on honesty and trust. A teacher’s modification of grade scores can also be influenced by the fact that teachers are only rarely held accountable for their actions. The concept of accountability can be investigated on an even wider scope, and it can be determined that test creators are even less accountable for the tests they create. Therefore, whenever students produce poor results, the students are the ones who face the largest amount of punishment. In rare cases, such as the Chicago Public School system case study investigated in this section, teachers can be held accountable, but blame seldom falls on the creator or institution that developed the test in the first place.

The final implication that can be made relates to the title of this essay. The practice of grade and test score inflation makes it evident that objectivity is in fact a myth. The alteration of student achievement levels demonstrates that both the test questions and the test outcomes are influenced by subjective factors. Therefore, in relation to Figure 2, one cannot simply decide that students should be sent to School B because it exhibits the highest average test grade. Upon reflection, one may begin to question whether the grades of School B were inflated to allow it to surpass the average test scores of the other schools, in which case it may not be the best school. By considering this, one is able to conclude that standardized tests are in fact not objective, and as all aspects of the test, from creation through administration to scoring, are subject to alteration, the actual pedagogical reasoning behind testing becomes difficult to defend.

7. Conclusion

In our investigation of the themes and issues studied in our research, we have deconstructed the ideology behind objectivity and its purpose in relation to standards, testing and accountability. Our initial understanding was that standardized testing was an objective and uniform assessment tool. However, upon closer examination, we have come to understand that the flaws of wide-scale and high-stakes testing have undermined the sought-after objectivity. Figure 16 is a visual representation of our findings.

Figure 16. A visual representation of the common themes throughout the essay

Some implications that have been made evident through our research include:
  • “Teaching to the test” neglects the broader scope of education and compromises the validity of standardized testing.
  • Grade Inflation and Renorming deprive students of an authentic achievement experience.
  • Untested subjects such as music are de-emphasized due to our narrow definition of success that includes focusing on literacy and numeracy.

These implications relate to Figure 2, which has been used as a recurring theme throughout our essay. This graph is a static display of achievement; it represents pure output data and does not reflect factors such as baseline position or increases or decreases in standardized achievement levels. Thus, it is unrealistic to draw firm conclusions from this data alone. When revisiting this graph, we are forced to re-examine and reframe our thinking to include the multitude of variables that influence the standardized scores of each of the hypothetical schools.

8. Works Cited

Adams, J.E., & Kirst, M.W. (1999). "New Demands and Concepts for Educational Accountability: Striving for Results in an Era of Excellence." in Handbook of Research on Educational.

Abada, T., & Tenkorang, E.Y. (2009) Pursuit of university education among the children of immigrants in Canada: the roles of parental human capital and social capital. Journal of Youth Studies, 12(2), 185-207.

Bejar, I., & Blew, E. (1981). Grade Inflation and the Validity of the Scholastic Aptitude Test. Educational Testing Service: College Board Report, 81(3), 1-16.

Botel, M. (1986). A Comparative Study of the Validity of the Botel Reading Inventory and Selected Standardized Tests. Botel Reading Inventory, Penn Valley School, MO.

Chenoweth, K. (2009). It can be done, it's being done, and here's how. Phi Delta Kappan, 91(1), 38-43

Crighton, J. (2002). Standardized tests and educational policy. In Encyclopedia of Education. Ed. James W. Guthrie. (Vol. 2, 2nd Ed., pp. 2530). New York: Macmillan Reference USA.

Dubner, S. & Levitt, S. (2005). Freakonomics: A Rogue Economist Explores the Hidden Side of Everything. William Morrow and Company.

Gladwell, M. “Examined Life” [The New Yorker, 17 December 2001].

Gladwell, M. (2008). Outliers: The Story of Success. New York: Little, Brown and Company.

Grant, S.G., Gladwell, J.M., Lauricella, A.M., Derme-Insinna, A., Pullano, L. and Tzetzo, K. (2002). When increasing stakes need not mean increasing standards: the case of the New York state global history and geography exam. Theory and Research in Social Education, 30(4), 488-515.

Grigorenko, E.L., Jarvin, L., Diffley, R., Goodyear, J., Shanahan, E.J., and Sternberg, R.J. (2009). Are SSATs and GPA Enough? A Theory-Based Approach to Predicting Academic Success in Secondary School. Journal of Educational Psychology, 101(4), 964-981.

Hirsch, E.D. The Question of Fairness [from The Schools We Need and Why We Don't Have Them, New York: Anchor/Doubleday, 1999, pp. 206-214].

Ho, H., Raley, J.D., & Whipple, A.D. (2002). Ethnicity. In Encyclopedia of Education. Ed. James W. Guthrie. (Vol. 2, 2nd Ed., pp. 1122-1123). New York: Macmillan Reference USA.

Kohl, Herbert (2006). ERIC A Love Supreme-Riffing on the Standards

Kohn, A. Beware of the Standards, Not Just the Tests. [Education Week, 26 September 2001].

Krathwohl, D. R. (2002). ERIC A Revision of Bloom’s Taxonomy: An Overview. p. 212-18

Levin, B. (2009). Aboriginal Education Still Needs Work. Phi Delta Kappan, 90(9), 689-690

Minnesota State Colleges and Universities (2009). Grade Inflation Article. Retrieved from http://www.mnsu.edu/cetl/teachingresources/articles/gradeinflation.html

Nitko, Anthony J. Educational Assessment of Students. [Third Edition, 2001].

Ohanian S. Goals 2000: What's In a Name? [Phi Delta Kappan 81 (5), January 2000, pp. 345-355].

Riffert, F. (2005). Interchange: A Quarterly Review of Education, 36, 231-252.

Rudner, L. M & Schafer, W. D. (2001). ERIC Clearinghouse on Assessment and Evaluation.

Volante, L. (2004). Teaching To the Test: What Every Educator and Policy-maker Should Know. Canadian Journal of Educational Administration and Policy, 35.

Malcolm Gladwell <http://www.mdcbowen.org/cobb/archives/gladwell.jpg>
Susan Ohanian <http://www.vtcommons.org/files/images/Susan.thumbnail.jpg>

Alfie Kohn <http://www.gurteen.com/gurteen/gurteen.nsf/id/L000786/$File/alfie-kohn.jpg>
E.D. Hirsch <http://edrev.asu.edu/reviews/rev558-hirsch.jpg>