Hidden Benefits and Unintended Consequences of No Child Left Behind Policies for Students Who Are Deaf or Hard of Hearing
The University of Texas at Austin
No Child Left Behind (NCLB) creates a high-stakes environment by holding schools accountable for how all students perform on state assessments, including students with disabilities and students who are English Language Learners. The focus of this article is on the impact of NCLB on students who are deaf or hard of hearing (SDHH). SDHH have diverse linguistic characteristics and are served in a range of educational settings. The purpose of this article is to explore the hidden benefits and consequences of NCLB policy on SDHH in two areas: assessment and accountability. Drawing on findings from the author's program of research, the article illustrates areas where policy may differentially affect students depending on their state of residence and educational setting. The discussion ends with a summary of benefits and hidden consequences of NCLB for SDHH.
Key Words: accountability; No Child Left Behind; assessment; students with disabilities
The purpose of this article is to discuss the impact of No Child Left Behind (NCLB) legislation on the educational structures that serve students who are deaf or hard of hearing.1 As laid out in the introduction to the legislation, the purpose of NCLB is to ensure that "all children will have a fair, equal, and significant opportunity to receive a high-quality education and reach, at a minimum, proficiency on challenging state academic achievement standards and state assessments" (No Child Left Behind, 2001). This article reviews two key components of NCLB: assessment and accountability. Assessment policies include guidelines for how many students participate in standardized and alternate assessments as well as the use of testing accommodations. The accountability framework outlines how states use scores from assessments to make determinations about progress made toward NCLB goals.
For the purpose of NCLB accountability legislation, students who are deaf or hard of hearing fall under the group of students with disabilities.2 There are substantial gaps between the standardized assessment results for students with disabilities and those without. Education Week recently published "Quality Counts 2004: Count Me In: Special Education in an Era of Standards" (2004), a state-by-state evaluation of participation and proficiency rates for students with disabilities in large-scale assessments.3 At the 4th-grade level, more than half of the states had achievement gaps of 25% to 50% between students with disabilities and students without disabilities (Cawthon, 2004). The gaps in proficiency only increased for students in 8th and 10th grades. More recent analyses of changes in student proficiency since the start of NCLB suggest that there may have been some closing of the achievement gaps between student subgroups (Center on Education Policy, 2007b). Yet the data on achievement gap trends for students with disabilities cannot support strong conclusions because of the many changes through the early years of NCLB implementation in how students are tested and how scores are interpreted. The achievement gap for students who are deaf or hard of hearing appears to be similar to that of students with disabilities as a whole (National Center on Low-Incidence Disabilities, 2006). In early analyses of academic performance on state assessments under NCLB, Cawthon (2004, 2005) presented available information on student achievement at schools for the deaf. For example, as of 2005, a total of 15 states reported standardized assessment results for students attending state-administered schools for the deaf. There was a range of student proficiency rates in math and reading on school report cards. However, overall performance was low; no more than half of students assessed at any school demonstrated grade-level proficiency in math or reading. 
In some schools, no students met state benchmarks. The advantage of looking at assessment and accountability components is transparency: NCLB requires states to design policies and data-reporting measures that make it possible to track the progress toward proficiency goals. With NCLB goals of 100% proficiency for all student subgroups within 8 years, there is certainly a long way to go in closing this achievement gap. To what extent will NCLB have an impact on educational outcomes for students with disabilities? It is not clear whether students with disabilities as a subgroup, particularly students who are deaf or hard of hearing, will benefit from all of the strategies implemented under NCLB. Now that reauthorization is in the near future, it is important to identify both the strengths of the assessment and the accountability elements of the program. For students who are deaf or hard of hearing, revisions to the law can strengthen the positive impact already in place and remedy those hidden, but real, negative consequences of NCLB.
NCLB is based on three main conceptual components: standards, assessments, and stakes (Quenemoen, Lehr, Thurlow, & Massanari, 2001; examples of this framework applied are discussed in Ladson-Billings & Tate, 2006; Orfield & Kornhaber, 2001; Peterson & West, 2003). Standards refer to the content knowledge and skills that form the basis for evaluating student proficiency. Standards-based reform began in the 1990s and was the precursor to the larger accountability framework we see today (Loeding & Crittenden, 1993; Porter & Smithson, 2001). Clear standards are essential to an effective accountability framework because it is important to be able to define what schools are asked to teach their students (for an in-depth discussion of state literacy standards, see Thompson, Johnstone, Thurlow, & Clapper, 2004; for a discussion of mathematics content standards, see Palacios, 2005). Assessments are used to measure a school's effectiveness in teaching students standards-based material (Linn, 2000). Assessments of student achievement can vary in their ability to accurately represent student proficiency in core content areas (Sanders & Horn, 1995). Valid and representative test results are an important part of credible assessments (Phillips, 1994). Stakes are the end result of the accountability process (Moe, 2003). After assessment results are aggregated, policy then designates decision rules about the implications of student assessment scores. If a school is seen to meet expectations of ensuring adequate student proficiency on state standards, as measured by assessments, then the consequences for the school may be a reward or recognition. If expectations are not met, the stakes may be more restrictive or include intervention at the administrative or curricular level in hopes of improving future results. Taken together, standards, assessments, and stakes make up the Grand Theory of Action of NCLB (Forte-Fast & Hebbler, 2004). 
Through clear expectations of educational goals, measurement of student progress, and consequences for not meeting these goals, NCLB hopes to motivate schools to improve instruction in core content areas. One of the main assumptions behind NCLB is that states, districts, and schools have the capacity (or potential capacity) to increase student achievement. To maintain accountability for all students, NCLB legislation requires states to disaggregate student performance data by (at minimum) student subgroups, including students with disabilities. The purpose of highlighting student performance for each student subgroup is to encourage schools and districts to improve student outcomes, not just as an average across all students, but specifically for students who have historically been underserved.
One of the key components of the NCLB legislation is standardized assessment. The purpose of standardized assessments is to have a uniform, efficient, and valid method of measuring schools' progress in developing student knowledge. Assessments focus on the core content areas of reading, mathematics, and science (new in 2007 to 2008). With an emphasis on annual (now Grades 3 to 8 and one grade in high school) assessment, NCLB represents a more rigorous and extensive implementation of required standardized assessment programs that had been in place in previous years. By tying test participation and results to an accountability mechanism, including funding and school administration elements, NCLB creates a high-stakes assessment environment.
Accommodations and Standardized Assessment
Although there is no cap on the number of students who can participate with accommodations, there are limits on the kinds of accommodations students can use. Each state sets its own regulations about accommodations use on assessments (Clapper, Morse, Lazarus, Thompson, & Thurlow, 2005; Lazarus et al., 2006; Minnema, Thurlow, Anderson, & Stone, 2005). Three commonly allowed accommodations (without restrictions on the use of student test scores) are extended time, small-group or individual testing, and the use of interpreters for test directions (Thurlow & Bolt, 2001). Other accommodations that may be allowed, with restrictions on student test scores or only in certain circumstances, include the use of interpreters for test items or having test items read aloud to the student (descriptions of these accommodations examples are given in the appendix). Accommodations that change the presentation of the test item may pose a threat to assessment validity because they can change the cognitive demands of the test item (Koretz & Barton, 2003). These accommodations are thus viewed with greater caution than accommodations that change the setting of assessment (Braden & Elliott, 2003; McKevitt & Elliott, 2003).
Alternate Assessment
NCLB places significant restrictions on the proportion of student scores from alternate assessments that can be used toward a district's calculation of proficiency (Yell, Katsiyannis, & Shiner, 2006). Under the original implementation of NCLB, a maximum of 1% of the students in each district could have their alternate assessment scores counted toward proficiency (these caps apply only to the district and not to individual schools; Yell et al., 2006). In the past year, there have been some proposed modifications that, in certain cases, may allow an additional 2% of a district's eligible test takers to participate in modified assessments and still be counted toward proficiency rates (National Archives and Records Administration, 2005, 2007). This 2% flexibility option is geared toward students who have persistent academic difficulties (Council for Exceptional Children, 2006). After the district has met the allocation of scores that can be counted from alternate assessments, the remaining scores are counted as not proficient, regardless of the actual assessment results.
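The cap described above can be illustrated with a small worked example. The sketch below is not taken from the legislation or from any state's procedures; the function name, the rounding rule (rounding the cap down to a whole student), and the example district size are assumptions made only for illustration.

```python
def apply_alternate_assessment_cap(district_tested, alt_proficient, cap_percent=1):
    """Illustrative sketch of a district-level cap on alternate assessment scores.

    district_tested: number of students tested in the district (hypothetical input)
    alt_proficient:  number of students scoring proficient on an alternate assessment
    cap_percent:     share of district test takers whose alternate assessment
                     scores may count as proficient (1% under the original rule)

    Returns (counted_as_proficient, reclassified_as_not_proficient).
    Assumption: the cap is rounded down to a whole number of students.
    """
    cap = district_tested * cap_percent // 100  # integer arithmetic avoids float rounding
    counted = min(alt_proficient, cap)
    reclassified = alt_proficient - counted
    return counted, reclassified


# Hypothetical district with 5,000 tested students and 80 proficient alternate scores.
counted, reclassified = apply_alternate_assessment_cap(5000, 80)
print(counted, reclassified)
```

In this hypothetical district the cap is 50 students, so 30 of the 80 proficient alternate assessment scores would be reported as not proficient regardless of the actual results.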
Assessment results are reported through a uniform reporting mechanism for all schools, districts, and states: reports of adequate yearly progress (AYP). Student proficiency data are typically reported on a "school report card" with a rating of whether the school or district has met AYP benchmarks. The AYP frameworks build on each state's definition of standards for learning, assessments, and benchmarks for proficiency. Although the format is uniform, individual state components vary greatly. (For a discussion of differences in state definitions of proficiency and the benchmarks for annual progress, see Linn, 2003, and Porter, Linn, & Trimble, 2005.)
Group Size for Reporting
Confidence Intervals
Students With Disabilities and AYP
Unintended consequences and hidden benefits of NCLB may have long-term effects on children who are deaf or hard of hearing. The deaf and hard of hearing student population is unique in several ways. First, it is a low-incidence group, carrying with it the challenges of having only a few students educated in a single location. There are only an estimated 70,000 students who are deaf or hard of hearing who receive special services in public schools in the United States, less than 1% of the total K through 12 student population (Mitchell, 2004; U.S. Department of Education, 2004). Second, when their primary language is American Sign Language, many students who are deaf or hard of hearing are part of a cultural and linguistic minority (Lane, 1999). Although classified as a disability for the purpose of NCLB and much of the relevant special education legislation (Americans with Disabilities Act, 1990; Rehabilitation Act, 1973), deafness also has linguistic and cultural aspects that affect how some students experience educational reform. Finally, deaf or hard of hearing students are a diverse group served by several different educational structures (Gallaudet Research Institute, 2005; Ramsey, 1997). For example, hearing losses range from mild to profound, with differences in how each individual perceives sound, particularly speech (Musselman, 2000). Individuals also vary in the onset of hearing loss and whether they have attended early intervention programs for language or speech development (Marschark, 1997).
Education Settings
Although there is often overlap between these three models, such as a school for the deaf providing regional or itinerant services, it is from these three perspectives that I will discuss benefits and consequences of NCLB. Depending on individual state NCLB policies, accountability reform affects the district programs, regular education schools, and schools for the deaf in different ways (Cawthon, 2004). It is through these unique and varied educational structures that NCLB legislation will reach students who are deaf or hard of hearing. Studying students who are deaf or hard of hearing thus provides policy developers with a useful example of potential long-term effects of NCLB on students with low-incidence disabilities, those who may not use English as a first language, or those who are served by educational settings outside of the mainstream.
The purpose of this article is to present 2004 to 2005 data on how NCLB assessment and accountability components affect students who are deaf or hard of hearing. Two questions guide this discussion:
National Survey of Assessments
Findings for the first research question draw on the results of the Second Annual National Survey of Assessments and Accommodations for Students Who Are Deaf or Hard of Hearing (Cawthon & the Online Research Lab, in press). The purpose of the survey was to collect data on the ways students who are deaf or hard of hearing participated in state assessments used for NCLB accountability during the 2004 to 2005 academic year. Data from this survey allow for a descriptive analysis of accommodations use across states with different assessment policies. Participants provided information about the location of the school or program, characteristics of the educational setting, student characteristics, number of students participating in state standardized and alternate assessments, and testing accommodations used, where relevant. The National Survey investigated assessment practices for deaf or hard of hearing students in K through 12 public school settings. The survey format included multiple-choice, Likert-type scale, and open-ended response items. The survey instrument was administered in two ways: (a) online at the project Web site, www.dhh-assess-survey.org (developed using www.surveymonkey.com) and (b) paper versions provided to individuals with stamped, self-addressed envelopes for returned responses. Incentives for participation included entry in a drawing for $25 gift certificates upon completion of the survey. Multiple sampling strategies were required because recruitment can be particularly challenging when the target population is dispersed throughout varied educational settings (Cawthon, 2006). Survey participants were individual teachers recruited initially through the Gallaudet Research Institute's Annual Survey of Schools and Programs contact list as well as the participant list from the First Annual National Survey. 
Additional recruitment contacts were made through Web site affiliations, state lists of programs and services, and 687 personal e-mails and postcard invitations by the principal investigator. Most participants (344, or 88%) responded online, and 48 (12%) responded via hard copy. Participant confidentiality was maintained by coding individual responses with an identification number; no specific names of individuals or schools are reported in the study results.
Participants
Teachers who responded to this survey served just more than 9,300 students nationwide; this is approximately 12% of the estimated 70,000 population in the United States who receive special services under the Individuals with Disabilities Education Act (IDEA; U.S. Department of Education, 2004). The sample was not random; all known schools and programs serving deaf or hard of hearing students were contacted to participate in the study. The representativeness of this sample varied by census region. A total of 10% of the students in the National Survey were from the Northeast Census region, compared with 18% of the Child Count estimates of the deaf or hard of hearing student population (Mitchell, 2004). Students of study participants from the Midwest region made up 22% of the sample, compared with 23% in the Child Count estimates. Because of recruitment procedures, the sample was heavily weighted toward students in the South Census region (43%). This was a higher concentration of students in the South than in the IDEA Child Count (31%). Finally, 24% of the students served by study participants came from the West Census region, compared with 25% in the national estimate. In sum, the study sample was representative of the national population estimates in the West and Midwest Census regions, overrepresented students from the South region, and underrepresented students from the Northeast Census region.
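The regional comparison above can be summarized as differences in percentage points between the survey sample and the Child Count estimates. The short sketch below simply restates the percentages reported in this section; the variable and function names are illustrative only.

```python
# Percentages reported in the text: survey sample vs. IDEA Child Count estimates.
sample_pct = {"Northeast": 10, "Midwest": 22, "South": 43, "West": 24}
child_count_pct = {"Northeast": 18, "Midwest": 23, "South": 31, "West": 25}


def representation_gaps(sample, reference):
    """Return sample minus reference, in percentage points, for each region."""
    return {region: sample[region] - reference[region] for region in sample}


gaps = representation_gaps(sample_pct, child_count_pct)
for region, gap in gaps.items():
    # Positive values: region is overrepresented in the sample; negative: underrepresented.
    print(f"{region}: {gap:+d} percentage points")
```

The South stands out at 12 points above the Child Count estimate and the Northeast at 8 points below, whereas the Midwest and West each differ by a single point.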
Accountability Plans and AYP Report Cards
The second source of data was the individual 2004 to 2005 school report cards for schools for the deaf provided on state department of education Web sites. A list of state Web sites can be found at the Education Resources Organization Directory: http://bcol02.ed.gov/Programs/EROD/org_list.cfm?category=ID-SEA. Each state determines its own method of data reporting and display. Most state Web sites had a designated location for NCLB information and accountability data. State special education directors were contacted to verify the location of school report cards and clarify reporting policies. The list of these contacts was found on the National Association of State Directors of Special Education Web site (NASDE, 2006).4
National Survey
National survey data were analyzed using the SPSS software package. Analysis began with descriptive reports of the prevalence of student participation in standardized assessments and their use of seven accommodations: extended time, small groups or individual administration, test directions interpreted, test items read aloud, test items interpreted, student signs response to a scribe, and simplified English. Full results of accommodations used can be found in Cawthon & the Online Research Lab (2006). Policy variables were added to the data set to allow for analysis by the type of accommodations policy in each participant's state (Cawthon, 2007). These designations included whether the accommodation was "Allowed," "Allowed with implications for scoring and/or aggregation," "Allowed in certain circumstances," or "Prohibited" for use on state standardized assessments (Clapper et al., 2005). These policy variables are the main independent variables in the secondary data analysis presented in this article.
Accountability Plans
State has a definition of "public school" and "local educational agency" for AYP accountability purposes. The State Accountability System produces AYP decisions for all public schools, including public schools with variant grade configurations (e.g., K-12), public schools that serve special populations (e.g., alternative public schools, juvenile institutions, state public schools for the blind) and public charter schools. It also holds accountable public schools with no grades assessed (e.g., K-2). (U.S. Department of Education, 2002, p. 7)
State plans were defined as fitting into one of four categories: (a) the school for the deaf receives an AYP report card, (b) assessment scores for students at the school for the deaf are sent to the sending or referring district, (c) student scores are aggregated at the state level, or (d) schools for the deaf are exempt from score reporting.
School Report Cards
Assessment and Students Who Are Deaf or Hard of Hearing
Standardized assessment and accommodations
Because state NCLB report cards do not provide information about use of accommodations or participation in alternate assessments, it is difficult to track the assessment practices that lead to student scores. The first goal of this analysis is to better understand how state NCLB policies might affect assessment practices for students who are deaf or hard of hearing. Participation data from 2004 to 2005 statewide assessments are shown in Figure 1 (Cawthon & the Online Research Lab, in press). In this national sample of 7,646 students, the majority participated in standardized assessments with accommodations (70%). (This figure is smaller than the overall sample for the study because not all participants provided student assessment participation information.) An additional 13% of students represented in this study participated in alternate assessments. The 10% that did not participate includes students who were not enrolled for the entire school year or who were in grades that were not assessed in 2004 to 2005.
Students who are deaf or hard of hearing participate in standardized assessments using a range of accommodations (Cawthon & the Online Research Lab, 2006; Horvath, Kampfer-Bohach, & Kearns, 2005). Teachers in this study reported which accommodations were used by at least one of their students who were deaf or hard of hearing on the 2004 to 2005 state assessment (Cawthon & the Online Research Lab, in press). Extended time was one of the most prevalent accommodations for students who are deaf or hard of hearing, used by more than 80% of respondents who had students receiving an accommodation on statewide tests. An interpreter for test directions (81%) and a separate room or small groups for test administration (90%) were also used by the majority of teachers in this study for at least one subject (math or reading). It is understandable that these three cluster together: having test directions interpreted may both take additional time and be distracting to those who are not receiving this accommodation. Also prevalent, though not to the same extent as the accommodations above, are test items read aloud (70%) and test items interpreted (65%).
Interaction of assessment policy and practice
This policy context provides an important backdrop for the potential impact of accommodations policies on students who are deaf or hard of hearing. Although few states prohibit the use of accommodations outright, policies on the use of some accommodations may restrict how student scores are used within state NCLB accountability frameworks. More specifically, assessments given with restricted accommodations may result in students being scored as "below proficiency" regardless of test score or in the score being removed from the school's aggregated total. For example, only a small percentage (2%) of deaf or hard of hearing students served by teachers in this study lived in states where having test items read aloud is allowed without implications for scoring or aggregation of results. If this is a frequently used accommodation, as results indicate, there may be cause for concern about whether student scores will be used toward schools' demonstration of student proficiency (as provided in reports of Adequate Yearly Progress, discussed in the next section). Depending on where the student lived, policies for score use may result in different ways that students using the accommodations were integrated into the accountability framework. This is demonstrated in Table 1 by the distribution of students in states with policies for student-signed response and test items interpreted. For student-signed responses, just more than a third (37%) of students lived in states where state policies allow for students to sign their responses to test items without implications for scoring. However, about half (53%) lived in states where there was no clear policy for this accommodation in the Clapper et al. (2005) summary. Lack of policy information was not as significant for test items interpreted (only 12% of students lived in states without a policy). 
Instead, about a third of students lived in states where there were no implications for scoring (30%); another third in states where test items interpreted was allowed in certain circumstances but with no implications for scoring (38%), and the remaining students (19%) in states where there were restrictions both in the tests for which this accommodation is allowed and in how scores were aggregated within the school's accountability framework. This is a surprisingly small proportion of students in states that allowed test items interpreted, particularly when seen in comparison to the distribution of students in states that allow test items to be read aloud. To illustrate the importance of the state policy context, consider data for students from the state of California. In the Second Annual Survey, 40 participants from California (serving 890 students) provided information about their assessment practices with deaf or hard of hearing students. California policy for the 2004 to 2005 assessment year allowed students to use extended time, small groups, test directions interpreted, and student signs response without restriction on the state assessments. Changes to the test items, such as having test items read aloud or interpreted, could only be used in certain circumstances and had implications for including the score in aggregates of student performance. There was no policy information available for the use of simplified English. California had a relatively lenient accommodations policy, whereas other states have more restrictions on assessments taken with accommodations. States such as California served a relatively large proportion of the study sample; policies in the larger states weigh more heavily in these overall summaries of how accommodations policy and practice interact. Table 1 provided an overview of the potential impact of state policies based on the distribution of students by state of residence. 
Table 2 shows available data on assessment practices with students who are deaf or hard of hearing, aggregated by state policy categories (Clapper et al., 2005). Figures in Table 2 represent the proportion of teachers who had at least one student participate in standardized assessments using each accommodation. Percentages across each row total to 100%, or all teachers who provided information for that accommodation. Cells with "—" indicate accommodations where no teacher reported information about student participation in states with those policies.
Two columns to note are Allowed (first column) and Did not use Accommodation (last column). These columns help to anchor our understanding of what accommodations are important to look at in this analysis. In the first column, we see the relative prevalence of accommodations use in states that allow them to be used without restriction. For example, more than half (57%) of teachers with students who used extended time reside in states that allow for this accommodation without restriction. Other high-prevalence accommodations in this category include small groups (86%) and test directions interpreted (77%). On the other end of the table, we see which accommodations were not used by teachers with students who are deaf or hard of hearing. Student-signed responses (83%) and simplified English (91%) stand out as accommodations that are not common practice. As shown in Table 1, these accommodations were also those that had a large percentage of students who lived in states without policy information on their use in accountability frameworks. For most of the accommodations in Table 2, students either largely received accommodations that allow for their score to be included or the accommodations are not prevalent enough to cause concern. However, read-aloud test items and interpreted test items showed a more complex picture of the relationship between policy and practice. For read aloud, only 3% of teachers using the accommodation lived in states where it is allowed without implications. Approximately half (49%) lived in states where it is allowed in certain circumstances, usually for tests that do not target reading skills. An additional 16% lived in states that limit both the tests during which read aloud can be used and how the scores are aggregated. Student scores with interpreted test items also have restrictions, but in contrast with read aloud, 40% of teachers who reported using this accommodation lived in states with implications for scoring. 
How does the interaction between policy and practice play out in the California example? California state policy explicitly allows students to use extended time, small groups, interpreter for test directions, and student sign response without restriction. Students who use these accommodations should have their scores fully included in the NCLB accountability process. We turn our focus to the remaining three accommodations, each with restrictions on how scores can be used: test items read aloud, test items interpreted, and the use of a simplified English format. For test items read aloud, 36% of participants had a student use this accommodation for math, reading, or both subjects. A similar proportion, 38%, used test items interpreted for at least one tested subject. Only 8% of teachers in California used a simplified English format of an assessment with their students who are deaf or hard of hearing. For students who participated using either test items read aloud or interpreted (students from more than a third of California survey participants), it is likely that their test scores were not aggregated or automatically received a "zero" or "below basic" proficiency rating. In sum, assessment practices with students who are deaf or hard of hearing are complex, but concerns about the validity of test scores and inclusion in AYP accountability appear to be limited to read-aloud and interpreted test item accommodations. The application of these findings is not limited to students who are deaf or hard of hearing. Students with disabilities, as a whole, use accommodations to participate in statewide NCLB assessments, including those that result in restrictions in how scores are included in AYP decisions (Thompson et al., 2005). 
Students who are English Language Learners (ELLs), especially those with disabilities, may face similar difficulties in having their scores meaningfully represented in the accountability frameworks (Albus & Thurlow, 2005; Anderson, Minnema, Thurlow, & Hall-Lande, 2005). English language proficiency and literacy are essential to a student's ability to access information on a standardized test. Accommodations, such as having the test items read aloud or interpreted, may help a student gain access to the test content. Yet assessment policies for ELL students with disabilities are still in the early stages of development (Anderson et al., 2005). Increasing clarity in this area should have a direct benefit for teachers and administrators who serve students who are developing proficiency in English literacy, including those who are deaf or hard of hearing.
Alternate assessments
Summary
Accountability
Under NCLB, schools for the deaf that receive their own report card should also receive their own AYP designation. In 2004, 11 schools had a school report card with an AYP designation, and 16 schools had a report card without an AYP designation (Cawthon, 2004). The number of schools with a report card for the 2004 to 2005 school year doubled to 23 schools in 15 states. Approximately 25% of the schools for the deaf that reported data met AYP guidelines. It is not clear how representative this group of schools is relative to the rest of the schools for the deaf because there is still a great deal of missing information regarding student proficiency and AYP status. For 2004 to 2005, only 12 of the 29 states with policies to include schools for the deaf in NCLB data reporting had report cards with both student proficiency scores and AYP designations. As schools for the deaf with report cards are the only direct way to see the impact of NCLB on educational structures that serve students who are deaf or hard of hearing, increasing the number of reporting mechanisms available will be an important part of how we are able to track the effectiveness of NCLB.
Challenges in data reporting
If a school for the deaf with an AYP report card was in a state with a very low minimum number of students (e.g., Maryland, with a minimum n = 5), it was more likely to report individual grade proficiency rates than if the school was in a state with a high minimum group size (as with California, n = 100). A school will automatically meet AYP if there are not enough students to meet minimum n for data reporting. Although many schools for the deaf enroll more than the minimum for data reporting, it is often the case that there were not enough students in the assessed grade (e.g., 8th or 10th grade) to meet these requirements. Of the 15 states with report cards for their schools for the deaf for 2004 to 2005, five (Arizona, Florida, Hawaii, Rhode Island, and Washington) did not report data because there was an insufficient sample of student scores for the assessed grade. These findings confirm research that suggests that a rise in minimum n results in exclusion of students with disabilities from the accountability framework (Simpson et al., 2006). If states raise their minimum group sizes, we will likely know less about how NCLB is working to improve educational outcomes for students who attend schools for the deaf.
A second challenge for AYP report cards at schools for the deaf is in how small student populations interact with the use of confidence intervals. When combined with small group sizes, confidence intervals can result in misleading interpretations of student scores. The smaller the group size, the wider the range of scores that statistically meet the criteria for passing the AYP benchmark. Large confidence intervals allow for a wide range of average scores to be counted as proficient, ranges that can hide what are actually low rates of student proficiency. There are even cases where a 0% proficiency, because it is within the confidence interval, is counted as meeting AYP.
In 2004 to 2005, of the nine schools for the deaf that met AYP targets for mathematics, four did so through the use of confidence intervals. For reading, two of the five schools that met AYP targets did so through confidence intervals. The relationship between proficiency rates and confidence intervals will be an important factor to watch in future years.
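The interaction between minimum group size and confidence intervals can be illustrated with a short sketch. It assumes a normal-approximation interval centered on the state proficiency target, so that a school meets the benchmark when its observed rate falls at or above the lower edge of that interval. States use differing formulas, and the function names, the 20% target, and the z value of 1.96 are hypothetical choices made for illustration, not any state's actual rule.

```python
import math


def ayp_lower_bound(target, n, z=1.96):
    """Lower edge of a normal-approximation interval centered on the target rate.

    Assumption for illustration only: real state formulas vary.
    """
    standard_error = math.sqrt(target * (1 - target) / n)
    return max(0.0, target - z * standard_error)


def meets_ayp(proficient, n, target, min_n, z=1.96):
    """Return True if the group counts as meeting the benchmark."""
    if n < min_n:
        # As described in the text: too few students to report,
        # so the school automatically meets AYP.
        return True
    observed_rate = proficient / n
    return observed_rate >= ayp_lower_bound(target, n, z)


if __name__ == "__main__":
    target = 0.20  # hypothetical proficiency target
    for n in (5, 30, 100):
        bound = ayp_lower_bound(target, n)
        print(f"n={n:3d}  lowest passing rate = {bound:.3f}")
    # With only five tested students the interval reaches zero,
    # so a group with no proficient students still passes.
    print(meets_ayp(proficient=0, n=5, target=target, min_n=5))
```

Under these assumptions the lowest passing rate rises as the group grows, which is the pattern the text describes: small tested groups can meet the benchmark at proficiency rates far below the nominal target.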
Summary
Hidden Benefits of NCLB
The overall goal of NCLB is to improve achievement for all students. There are several aspects of NCLB legislation that may benefit students who are deaf or hard of hearing and, by extension, other students with disabilities or English Language Learners: an emphasis on student outcomes instead of placement, more transparent reporting mechanisms, and better research on testing accommodations.
Student outcomes
Reporting frameworks
Accommodations research
The high stakes of state standardized assessment should result in more rigorous and much needed research on the validity of accommodations that change the presentation of the test items. Several researchers have begun comprehensive validity programs of study for students with disabilities as a whole (Elliott, Kratochwill, & McKevitt, 2001; Elliott, McKevitt, & Kettler, 2002; Koretz & Hamilton, 2001; Tindal & Fuchs, 2000). These studies include a look at the effects of read aloud on both reading and mathematics assessments. However, results thus far are limited in that they do not study the validity of using these accommodations with students who are deaf or hard of hearing. In addition to emerging research on the use of test items read aloud or interpreted, other language accommodations include having students sign responses that are then recorded by a scribe. Each of these accommodations allows students who are deaf or hard of hearing to participate in assessments in their primary language, which is sometimes also the language of instruction. Research on language, communication, and American Sign Language accommodations within test administration will play an important role in improving the validity of accommodations used with students who are deaf or hard of hearing.
Unintended Consequences
Accommodations and integrating test scores
Alternate assessments
Equally important as concerns about validity are those about the appropriate use of alternate assessments. Alternate assessment use is a small but important component of how students who are deaf or hard of hearing participate in statewide assessments. Teachers are wary of how standardized assessments without language-sensitive accommodations may underestimate student knowledge (Nagle, Yunker, & Malmgren, 2006). They are also concerned about having students participate in assessments several grades above their level of proficiency. Unfortunately, a higher proportion of students may currently participate in alternate assessments than NCLB has room for in accountability frameworks. This point is similar to concerns raised by recent roundtable discussions of issues facing students with disabilities as a whole (Center on Education Policy, 2007a). Alternate assessment policies may have a disproportionately negative effect on districts with schools for the deaf or regional programs because a higher proportion of their students take alternate assessments than at schools that serve fewer students who are deaf or hard of hearing (Cawthon & the Online Research Lab, 2006). The impact of alternate assessment use on student integration in accountability needs to be verified with additional studies at the school and district level.
Differential effect on educational structures
The second setting where students are taught is district programs that serve students from schools in the region. Districts are accountable for student proficiency, but district AYP report cards offer little information about the effectiveness of programs for special populations, specifically. District programs are not like stand-alone schools for the deaf because they do not receive a single report card that holds the program accountable for student achievement. Instead, student scores are sent back to the referring or sending school, dispersing the student scores across the district. This said, district programs may affect district AYP designations to the extent that students in their programs make up a large proportion of the students with disabilities within the district. For example, depending on the number of other students scoring below proficiency, scores from students who are deaf or hard of hearing may alter whether the district meets state proficiency benchmarks. There are also similar concerns about the inclusion of student test scores that were obtained using accommodations such as test items read aloud or interpreted. Concerns about the use of alternate assessments, particularly when the prevalence exceeds the district's 3% cap on whether proficiency scores can be counted, are most salient for districts with large programs for students with disabilities. As with accountability for students in mainstreamed settings, accountability for district programs for deaf or hard of hearing students is related to the overall status of AYP reports for students with disabilities as a whole.
Most of the assessment and accommodation information available for students who are deaf or hard of hearing is for those who attend schools for the deaf. In contrast with mainstreamed settings and regional programs, schools for the deaf are the only educational setting that receives its own report cards and AYP designation.
Whether a school for the deaf itself is accountable for student test scores depends on the state policy for special education schools. For states that aggregate student scores at the hosting school, there is the potential for a transparent and consistent mechanism for knowing how students fare on state assessments. At schools for the deaf, issues of minimum subgroup size and confidence intervals will continue to play a role in what we know about student proficiency for students with disabilities. Yet overall, this represents transparent accountability for only a small proportion of students who are deaf or hard of hearing.
One challenge of a large-scale accountability system is to hold all schools to the same standards while leveraging the mechanisms that will improve outcomes for a diverse student population. By disaggregating assessment scores by subgroups of students and integrating them into how schools are evaluated, NCLB broadens accountability from the school level (overall student performance) to the subgroup level. As a result, the bar for satisfactory progress has been raised and shows that there is significant work to be done. There are places where the assumptions of the NCLB Grand Theory of Action are met and some where there is little connection between policy and the lived reality in schools. There are two key lessons we can take from this discussion. First, NCLB accountability depends on a clear assessment process, one with alignment between state policies and teacher practice. One finding from this analysis is that teachers use a range of accommodations that reflect the linguistic diversity of their students. State policies are, at times, at odds with local-level decisions. The implication is not simply that teachers should change how they include students in assessments or that policies should be revised to meet what happens in classrooms. Instead, we need a thorough investigation of why these accommodations are used with students who are deaf or hard of hearing, what is needed to ensure access to test content, and what impact these changes have on the validity of the test scores. The result of this effort should be an informed policy framework that takes both research and practice into account. The second requirement for a successful accountability framework is a consistent application of high-stakes policies. The Center on Education Policy noted in its own analysis that shifting policies and measurement practices result in great difficulty in tracking the progress of students with disabilities who are proficient from year to year (Center on Education Policy, 2007b).
Because both assessment and AYP guidelines vary from state to state, it is difficult to know how to interpret the relative level of student achievement in schools and districts. For students from low-incidence populations, cross-state analyses are necessary because of the small number of students who may participate in each assessed grade. Yet inconsistent accountability guidelines make comparisons of student outcomes impossible. For example, students who participate in assessments with controversial accommodations may find their scores included in school AYP decisions in one state but not in another. This leads to different decisions about the quality of the school with potentially significant consequences. Reducing variability in assessment and accountability policy would improve accuracy in measuring how NCLB increases student learning. Findings in this article also illustrate the broader challenges of using standardized assessments as the basis for high-stakes AYP decision making. State standardized assessments were not designed to measure achievement in students without grade-level English literacy proficiency and academic preparation. Testing accommodations are useful in facilitating participation in assessments, but they are limited in their capacity to "even the playing field" for students with disabilities without invalidating test scores. Current pilot programs for using measures of student gains in knowledge and skills, or growth models, are potential ways to demonstrate how students are making progress toward proficiency benchmarks instead of absolute achievement levels (U.S. Department of Education, 2007). States will need to be deliberate in how they adjust growth models to apply to students without English proficiency, with disabilities, or who may have other persistent academic difficulties. 
For example, growth models can consider the additional time these students may need to reach proficiency, extending the time schools have to bring all students to proficiency. Similarly, models can tie benchmarks to the student's level of English proficiency and opportunity to learn content area materials, creating a more accurate picture of what schools are accountable for in a given school year. Even under recent modifications to the guidelines, students who are deaf or hard of hearing do not typically qualify for alternate assessments, limiting the way in which their scores can be included in accountability decisions. The same issue applies to students who may not have a severe cognitive disability but do not have the written English proficiency to meaningfully participate in a standardized assessment. There is a gap here that needs to be filled in a way that allows teachers to make assessment decisions that best match the linguistic and academic skills of their students. The Commission on No Child Left Behind, although not endorsing formative assessments or portfolios of student achievement, does concur with the position that current assessment tools for students with disabilities or who are English Language Learners are insufficient (Commission on No Child Left Behind, 2007). Perhaps the role of alternate assessments in NCLB thus needs to be revisited, either by broadening whose scores are eligible or by providing additional options for students who cannot meaningfully participate in standardized assessments because of language barriers. Without these adjustments, students who are deaf or hard of hearing and other English Language Learners may be unintentionally excluded from NCLB. Analysis in this article demonstrated that there are some significant barriers to demonstrating accountability for all student groups in AYP decisions, especially students who are a low-incidence group or who attend special education schools.
Only a small number of student assessment scores, those at schools for the deaf that receive a report card, are transparent in the current AYP structure. In 2004 to 2005, more than half of the states where there was a policy in place to report scores at schools for the deaf had done so. Students in mainstreamed and district program settings are folded into the larger "students with disabilities" context, holding the schools accountable in a general sense but making it difficult for the community to monitor academic proficiency of students who are deaf or hard of hearing, specifically. NCLB accountability should not be limited to students who are part of a large subgroup or who attend their local neighborhood school. State-level or district-level aggregation of student achievement for low-incidence groups is one recommended way to increase the transparency of NCLB effectiveness for students from low-incidence populations. The success of NCLB will not be measured in how it raises scores for the majority of students but in how well students who have previously struggled to meet academic goals achieve new levels of success. The research community now has the opportunity to study the effectiveness of accountability strategies on improving student achievement for student subgroups, not just students as a whole. Yet NCLB lacks a mechanism for understanding how the education structure changes and improves student outcomes. Measures of instructional content and its impact on students, particularly those from diverse subgroups, are largely missing from the NCLB Grand Theory of Action. In essence, we see what comes out of the "black box" of learning but understand little about what happens within it. For example, do all students have the opportunity to learn standards-based content that is on state assessments? How do we know whether student gains are because of changes in instruction or other factors?
The implicit assumption that NCLB will change the quality of instruction needs to be explicit for the community to be able to target areas for improvement. Future reauthorizations of NCLB (or additional reforms) thus will need to go beyond measuring student achievement and focus on actually increasing our capacity to meet the educational needs of students with disabilities, English Language Learners, and students with diverse backgrounds.
STEPHANIE W. CAWTHON has a background in educational psychology and human development and focuses on the impact of educational policy on students with disabilities. Department of Educational Psychology, SZB 254, 1 University Station D5800, The University of Texas at Austin, Austin, TX, 78712-0383; stephanie.cawthon@mail.utexas.edu. The author's research background is in educational psychology and human development, looking specifically at issues of language and academic achievement in students who are deaf or hard of hearing. The author has conducted a series of studies investigating the effect of educational reform at both the local level (in classrooms) and the macro level (across the United States). The author hopes this research shows how large-scale reforms affect students with low-incidence disabilities and those with diverse linguistic backgrounds. From this vantage point, the author hopes to help develop research-based practices and educational policies that are responsive to the needs of these students. This work was supported in part by the 2006 Faculty Excellence Fund at Walden University. The author wishes to thank the many participants of the National Survey of Assessments and Accommodations for Students Who Are Deaf or Hard of Hearing, her students in the Online Research Lab at Walden University, and all of those who have supported this work. Portions of the findings here were presented at the 2005 Conference of Educational Administrators of Schools and Programs for the Deaf and the 2006 meeting of the American Educational Research Association. The author also gives many thanks to the editors and reviewers for their assistance in shaping this article for publication.
1 The definition of deaf and hard of hearing varies by hearing threshold and cultural identity. Researchers in this field take care to describe the characteristics of the deaf or hard of hearing population participating in each study. Deaf or hard of hearing may include those who are culturally deaf, sign language users, those with cochlear implants, those who wear hearing aids, and those who use a range of communication styles in a variety of settings. Because there is no clear distinction between who identifies with deaf and deaf or hard of hearing groups, this article will refer to the group as a whole.
2 Designation of deafness as a disability does not go uncontested by those who advocate for a linguistic and cultural minority group understanding of deaf persons (Lane, 1999).
3 These data include participation in, accommodations with, and proficiency scores for statewide, large-scale assessments. Many formulations are similar, though not always identical, to the assessment procedures used to determine adequate yearly progress.
4 This approach was used in analyses of student proficiency in other school years (Cawthon, 2004, in press).
Received for publication September 11, 2006. Revision received May 12, 2007. Accepted for publication July 3, 2007.
American Educational Research Journal, Vol. 44, No. 3, 460-492 (2007)











