
The Nature and Effects of Middle School Mathematics Teacher Learning Experiences

by Heather C. Hill, 2011

Background/Context: Teachers’ mathematical knowledge has been the subject of recent federal policy, public programs, and scholarly attention. Numerous reports identify a need for improving this knowledge, and total federal spending on content-focused math and science professional development during the period 2002–2007 is estimated to be above $1.2 billion.

Purpose/Objective/Research Question/Focus of Study: In this study, we investigated the patterns in and the effect of professional development in the area of middle school mathematics. We focus in particular on professional development thought to influence teachers’ mathematical knowledge, and use a measure of such knowledge to gauge potential growth in a sample of teachers included in a national survey.

Research Design: We surveyed a nationally representative sample of roughly 1000 teachers in both 2005 and 2006 and obtained responses from 461 middle school teachers at both time points. The survey measured their mathematical knowledge for teaching at both time points and also inquired about their professional learning opportunities during the intervening year.

Data Collection and Analysis: We used survey data both to describe the nature of teachers’ learning opportunities during 2005–2006 and to associate participation in these learning opportunities with teachers’ 2005 knowledge scores as well as other characteristics. We also linked these learning opportunities to observed growth in teacher knowledge.

Findings/Results: Results indicate that extensive effort and expenditures have not dramatically transformed teacher professional development practices as compared to past descriptions, in that learning opportunities reported by teachers are still typically short and fragmented. However, there are indications that teachers’ mathematical knowledge might have improved somewhat during this time period.
Conclusions/Recommendations: Results from this study suggest a very mixed picture of the middle school mathematics professional development system. Effects on teacher knowledge are modest, and critics might claim that money would be better spent in hiring or induction programs.

Teachers’ mathematical knowledge has been the subject of recent federal policy, public programs, and scholarly attention. No Child Left Behind attempted to require every middle school mathematics teacher, for instance, to “demonstrate competence” in the subject by completing a subject matter major, advanced degree or credential, or passing a state certification test in the subject taught. The “math-science partnerships” funded by the National Science Foundation and U.S. Education Department spent nearly $1.2 billion^{1} on providing mathematics and science learning experiences for preservice and inservice teachers between the years 2002 and 2007. Local district efforts doubtlessly added to this total. Recent commission reports, national panels, and scholarly investigations have all urged attention to K–8 teachers’ mathematical knowledge. Professional learning opportunities offer one important avenue for improving such knowledge.

However, evidence for the efficacy of specific programs, and for the professional development sector of the educational economy more broadly, is mixed. Despite several well-known and effective programs (Carpenter, Fennema, Franke, & Empson, 1989; Cobb et al., 1993; Garet, Porter, Desimone, Birman, & Yoon, 2001; Saxe, Gearhart, & Nasir, 2001), many question the adequacy of the opportunities reaching the majority of teachers (Desimone, Porter, Garet, Yoon, & Birman, 2002; Hill, 2004; Little, 1989; Wei, Darling-Hammond, Andree, Richardson, & Orphanos, 2009; Wilson & Berne, 1999). Little evidence exists regarding the efficacy of currently available programs.
Further, evidence from past surveys suggests that for many teachers, professional learning opportunities are often short and ill regarded (NCES, 2001; Whittington, 2002). This suggests that commonly available professional learning opportunities may prove an only modestly effective route to the goals of current reforms.

To learn more about how national concern and attention might act through specific types of professional learning opportunities, this paper addresses how and whether such opportunities can improve teachers’ mathematical knowledge for teaching. It focuses in particular on inservice experiences of middle school mathematics teachers, the subject of arguably the most concentrated efforts to improve teacher quality in mathematics. Although not a formal evaluation of any specific program, this study surveyed a nationally representative sample of middle school mathematics teachers twice (in the winters of 2005 and 2006) to gauge which learning opportunities teachers attended over the course of one year and to determine whether any of those learning opportunities were associated with changes in these teachers’ mathematical knowledge. Specifically:

• In what learning opportunities do teachers participate during the 12-month period between survey waves?
• What predicts teachers’ attendance in different learning opportunities?
• Do the data suggest an increase in teacher mathematical knowledge across waves of the survey?
• If there is such an increase, can it be attributed to any specific public policy or program?

The answers to these questions yield a picture of the professional development system that, although promising in some regards, is consistent with past reports of inefficacious programs and limited teacher involvement.

BACKGROUND

In recent years, policymakers have focused on improving teachers’ subject-matter knowledge on the belief that such knowledge is linked to student outcomes.
However, evidence on this point, particularly in mathematics, is equivocal. In a recent review of the literature, the National Mathematics Advisory Panel (NMAP, 2008) claimed that “research that has used teacher test scores and other ad hoc measures [to predict student achievement] has produced mixed results” (NMAP, 5–16). This echoes findings from a more general literature review across multiple disciplinary subjects by Hanushek (2003), which also found mixed results. However, the NMAP argued that the closer the measure of teacher knowledge to the actual work done in the classroom, the more likely a positive association to achievement would be found. For instance, Harris & Sass (2007) find no effect of teacher SAT scores on student achievement; however, Hill, Rowan, & Ball (2005) find that a measure that taps more job-specific mathematical knowledge does predict student achievement. Rockoff, Jacob, Kane, & Staiger (2008) directly compared new teachers’ general cognitive ability with a measure similar to the one used in Hill et al. (2005), finding that only the latter was a significant predictor of students’ value-added scores. And the results in Hanushek (2003) suggest that of all teacher-level indicators, including teacher experience, certification, salary, and education, teacher test scores are the most consistently significant and positive predictors of student outcomes.

Given this relationship, one might ask whether the U.S. teaching force holds the mathematical knowledge needed to teach competently. Researchers have yet to establish how much or what kind(s) of knowledge is necessary to produce adequate instruction, or to benefit students in classrooms. However, the evidence that does exist about the mathematical knowledge of elementary and middle school teachers is troubling.
Numerous qualitative and mixed-method studies have documented shortcomings in the mathematical knowledge of elementary school teachers (e.g., Ball, 1990a; Ma, 1999), and the few studies of middle and high school teachers that exist suggest that even relatively stronger collegiate subject-matter preparation does not necessarily ensure deep knowledge of the content taught to students (Ball, 1990b; Even, 1993; Post, Harel, Behr, & Lesh, 1991; Swafford, Jones, & Thornton, 1997). A recent quantitative study found a three-quarter standard deviation gap in mathematical knowledge for teaching between middle school teachers credentialed for grades 6–12 and those credentialed for grades K–8 (Hill, 2007). An international comparison found that future U.S. middle school math teachers are not as well prepared as those in other countries, including some that typically score higher on international studies (Schmidt et al., 2007). Given these findings, it is not surprising that current reform efforts have focused on increasing inservice teachers’ content knowledge through professional development and related means.

Although the heavy emphasis on teachers’ content knowledge is relatively recent, the more general body of research on professional development shows that this mode of workforce development can have positive effects. Several program-specific evaluations (e.g., Carpenter et al., 1989; Cobb et al., 1991; Saxe et al., 2001) have shown improved teacher-level performance and/or student-level outcomes. Several studies have also examined the efficacy of larger programs and policies intended to improve teacher knowledge or practice. Cohen & Hill (2001) showed that teachers who attended two specific types of workshop reported more innovative instructional practice; their schools also posted higher scores on a state assessment.
Harris & Sass (2007) demonstrated via a large longitudinal sample that in Florida, middle and high school teacher participation in content-specific training was associated with improved student achievement. Desimone et al. (2002) demonstrated via a longitudinal sample that teachers whose professional development focused on specific instructional practices did tend to use more of those practices in the classroom. In the area of teachers’ mathematical knowledge specifically, several studies conducted evaluations of teacher growth without attempting to link teacher outcomes to student achievement. Hill & Ball (2004) found that professional development workshops that were longer in time and focused on mathematical explanation, representation, and communication boosted teachers’ mathematical knowledge for teaching scores more than did shorter workshops with less of such focus. Garet et al. (2001) demonstrated that professional development that concentrated on content knowledge, provided opportunities for active learning, and cohered with other learning activities was related to self-reports of improved knowledge, skill, and change.

Nonetheless, some observers suggest that the professional development described in such studies is atypical of that which reaches the average teacher (Desimone et al., 2002; Hill, 2004; Little, 1989; Wilson & Berne, 1999). Garet et al. (2001) noted, for instance, that only about one-quarter of the teachers in their sample experienced consistent, high-quality professional development. Cohen & Hill (2001) made a similar observation, noting that a minority of teachers in their sample attended workshops that were associated with positive teacher and student outcomes. Teachers themselves agree: when queried about the impact of their past three years of professional development experiences, fewer than a quarter, on average, reported that professional development affected their instruction (Horizon Research, 2002; NCES, 2001).
Most teachers, in fact, reported that professional development reinforced their existing practices, with a minority reporting no impact at all.

This may result, in part, from the fact that most teachers do not or cannot invest substantial amounts of time in professional development. Surveys with nationally representative samples show that the modal teacher typically spends only a few hours studying in a specific content area each year (e.g., student assessment; Horizon Research, 2002; NCES, 2001). Although teachers do accumulate professional development hours across topics, this may lead to fragmentation and more superficial learning opportunities. Other questions on these and similar surveys (Cohen & Hill, 2001) have asked teachers to report their overall time investment in conventional professional development in the past year; teachers typically report 1–2 days.

The weakness of professional development as a means toward instructional improvement may, in some part, trace to the nature of this subsector of the educational economy. As Rowan (2002) noted, the offerings in this sector come from mainly small-scale for-profit and not-for-profit enterprises, and the content is prone to faddishness, as new educational “problems” and “solutions” emerge. Districts contract with multiple providers and often maintain a “menu” of offerings (Little, 1989; also see Desimone et al., 2002). Thus the quality and effects of the professional development are constrained by what is immediately available, as this enterprise, despite growth in the number of companies offering professional development at scale, remains largely local (Hill, 2009).

Teachers’ decisions to enroll in professional development have also been the subject of much speculation. Although there is little direct evidence about teachers’ decisions on a large scale, some themes emerge.
Many authors (Cohen & Hill, 2001; Desimone et al., 2002) look at highly variable within-school patterns of teacher participation and conclude that teachers are the drivers of their own professional development choices. Participation in programs, and in particular in programs that require extensive time investment, is largely voluntary, and individual predispositions, interests, and capacity may play a large role in making those decisions (see also Supovitz & Zief, 2000). Others, however, see districts as playing a stronger role by mandating attendance at specific workshops (Little, 1989). While no survey has directly asked teachers about the mix of mandated and choice-based professional development, one summary of published evaluations of science professional development found that most programs relied upon volunteers to fill their ranks (Bobrowsky, Marx, & Fishman, 2001).

Finally, although nearly every teacher attends some form of professional development yearly (NCES, 2001), there are few evaluations of program outcomes (Borko, 2004; Wayne, Yoon, Zhu, Cronen, & Garet, 2008). In its recent report, the National Math Panel identified only eight studies between 1989 and 2008 that rigorously examined the relationship between professional development and student outcomes. Wayne et al. (2008) note that despite wide consensus around features of “effective” professional development, such features have yet to be confirmed empirically.

Past empirical evidence, then, is remarkably consistent regarding undesirable features of the professional development system, including slender time investments on the part of teachers, often-shifting programs, and variable program quality. Questions arising from these literatures, combined with current large-scale policy efforts to improve middle school teachers’ mathematical knowledge, inform the study described here.
Have such policy efforts, which have taken place primarily through teacher professional development and course taking, resulted in a more knowledgeable U.S. teacher population? Have they changed the patterns of participation on the part of the “average” teacher, as compared to past studies? The next section describes how data from the study address these questions.

METHODS

This study, like several others in the past decade, uses survey methodology to examine the potential efficacy of professional development in improving teaching and learning. Garet et al. (2001) surveyed over one thousand participants in the federally funded Eisenhower Professional Development Program and found that specific program characteristics were associated with teachers’ self-reports of both knowledge and skill enhancement, as well as changed teaching. Cohen & Hill (2001) similarly surveyed nearly 600 California teachers and were able to link attendance at specific professional development programs to innovative approaches to teaching and learning, and to stronger school performance on a state assessment. Desimone et al. (2002) followed a sample of 207 teachers longitudinally, identifying an impact of specific professional development opportunities on instructional practice.

This study combines many of the best features of these previous research efforts. First, it uses data from two time points rather than a single cross-sectional survey to assess teacher learning. Although not as strong as studies that employ three or more waves of data (e.g., Desimone et al., 2002; see Singer & Willett, 2003), having baseline data provides advantages over cross-sectional studies. Second, this study directly measures teachers’ mathematical knowledge for teaching (MKT) rather than relying on items representing instructional practices or self-reports of learning. Third, respondents come from a nationally representative sample, and thus paint a broad picture of professional learning opportunities.
SAMPLING AND ADMINISTRATION

Our project’s goal in sample selection was to accurately represent the population of middle school teachers in the U.S. To achieve this goal at modest expense, we elected to use a mail survey. Although mail surveys have significant disadvantages (for instance, we have no knowledge of whether teachers used outside resources to answer the items and improve their scores), an in-person survey would have been prohibitively expensive, given the geographical dispersion of respondents.

To obtain the sample, we first selected schools from the Common Core Database (CCD) of the National Center for Education Statistics (NCES). We defined “middle schools” as those schools that had at least 10 students in each of the sixth and seventh, or seventh and eighth grades. In the first year of the study, our data-contracting organization, the Institute for Social Research (ISR), successfully confirmed by phone the teacher roster in 1,065 schools, and randomly selected 1,000 schools from that number. Within each school, the ISR selected one teacher at random. Teachers were mailed an initial survey in April 2005, compensated $50 per survey for their effort, and reminded of our interest in their response up to three times per wave. In the first year, we obtained a 64% response rate. Respondents were instructed to imagine themselves encountering the survey’s mathematics problems in real-life teaching situations, and to take only two to three minutes to answer each item.

In the second year of the study, the ISR recontacted the initial 1,000 schools to update information. If a teacher had left the school in the intervening year, the interviewer asked for forwarding information and/or completed an Internet search for the teacher. Of the initial sample, 876 teachers were located. All located teachers, regardless of new teaching assignment or retirement status, were included, and an identical mailing/compensation procedure was used.
Of this sample of 876, there were 499 (57%) teachers who returned completed surveys mailed in February 2006. Of the 2005 sample, 461 teachers (46%) returned both surveys. Although this is a lower-than-hoped-for survey response, it is not low given the longitudinal nature of the study design.^{2} This group forms the basis for the analyses reported below.

An analysis using the mathematical knowledge measure, described below, suggests that those who responded in 2005 but not 2006 were lower in knowledge (a difference of 0.34 for algebra, and of 0.23 for number/operations) than those who responded both years. The 2005 MKT standard deviation dropped by roughly 0.10, suggesting less overall variability; however, the range of scores stayed the same. Nonresponders in 2006 were more likely to come from higher-poverty schools (free and reduced-lunch eligibility of 43% as opposed to 34%) but were similar in experience. In theory, differences between the 2005 and 2006 responders should be controlled by the use of only individuals who responded in both years. While the loss of some variability limits generalization to the sample of 2005–2006 responders rather than to the national sample, as long as the sample is not truncated as a result of nonresponse, results of multivariate statistical tests should hold.

MEASURES DEVELOPMENT

Teacher Knowledge Instrument

Development of the mathematical knowledge for teaching instrument (MKT) was described in depth in Hill (2007). I present only a brief summary here. First, we wanted to write items that would measure teachers’ content knowledge for teaching middle school mathematics, rather than their knowledge of high school or college mathematics (e.g., calculus, trigonometry, differential equations) or their pure mathematical aptitude or skill (Bass & Ball, 2003).
Said another way, we wanted to capture profession-specific mathematical knowledge, or the mathematics that teachers would need to know in order to communicate this subject effectively to students. Clearly, teachers need to have “common content knowledge” (CCK), or the mathematical knowledge that is common across mathematically focused professions such as accounting, nursing, and engineering, and which in part forms the basis of the standard middle-grades curriculum. However, we also sought to assess teachers’ specialized content knowledge (SCK), or the mathematical knowledge that only teachers are likely to use. Examples include mathematical explanations for common rules or procedures; common nonsymbolic representations (or links between representations) of mathematical topics; the ability to unpack and understand nonstandard solution methods; and mathematical definitions used in accurate yet also grade-level-appropriate ways. This knowledge, like common content knowledge, is wholly mathematical; individuals taking our assessment need not know about students, or instructional methods or materials in order to answer such items correctly. However, SCK is mathematical knowledge that most nonteaching adults would not possess.

Second, instrument development did not have a goal of writing a bank of items that teachers needed to know in order to teach; determining this body of knowledge would have taken years and might have yielded an instrument with poor measurement properties. Instead, developers aimed to discriminate well among individuals of differing knowledge levels. To do this, our goal was to target items such that the average item was answered correctly approximately 50% of the time, and to have items that spanned a wide range of difficulties, from relatively easy items that nearly all teachers could answer correctly to items that nearly all teachers would answer incorrectly.
As a result, these instruments are not criterion referenced; there is no substantive interpretation of particular scores or performance on specific items.

Third, we chose two key areas of the middle school curriculum on which to focus our items: number/operations, which comprises the majority of the curriculum at these grades, and prealgebra/algebra. There were 36 stems (problem situations) on the 2005 questionnaire and 37 stems on the 2006 form. Because some stems had multiple items beneath them, there was a total of 92 items in 2005 and 69 items on the 2006 form. In both years, items were split between number and operations (44 and 40 in the two years, respectively) and prealgebra/algebra (48 and 27^{3}). Results of factor analyses suggest the 2005 and 2006 forms have only one general factor, and thus we could consider focusing on a single overall score for mathematical knowledge. However, there are reasons to separate number/operations from algebra for analysis, including the fact that most professional development treats only one topic at a time. Despite the fact that mathematical knowledge appears quite general in the population of teachers, specific topics might be differentially affected during the course of learning opportunities. Table 1 shows the scales and scale reliabilities used in this analysis.

Table 1. Scales and Scale Reliability
To understand better what our instruments measure, this project undertook extensive validation work. First, we conducted content validity checks, ensuring that our item pools provided fair coverage of the topics (e.g., number and operations) they intended to represent. Then we developed evidence for convergent and predictive validity. In one recent study, we found a correlation of 0.58 between middle school teachers’ MKT scores and the mathematical quality of their instruction, as observed over six lessons. Teachers with lower MKT scores were more prone to mathematical errors and less likely to offer students mathematical explanations, representations, and disciplinary insight. These teachers’ MKT scores were also linked to their students’ value-added scores, with a correlation of up to 0.45 (Hill, Umland, & Kapitula, under review). This finding echoes a study done with elementary teachers, where students of teachers who answered more items correctly gained more over the course of a year of instruction (Hill et al., 2005). Finally, a separate research project has recently linked elementary and middle school teachers’ MKT scores to student outcomes in value-added models (Rockoff et al., 2008).

OPPORTUNITIES TO LEARN AND DEMOGRAPHIC VARIABLES

In 2006, we asked teachers to report on the learning opportunities they had encountered in the previous 12 months. We selected learning opportunities that, based on either advocates’ rhetoric or on past evidence, have teachers’ learning of mathematical content as a significant goal. The specific learning opportunities include:

• Undergraduate or graduate-level mathematics courses. Our expectation was that teachers seeking “highly qualified” status under NCLB might take mathematics coursework to fulfill such requirements. Members of the mathematics department typically teach these courses.
• Undergraduate or graduate-level mathematics methods courses.
Teachers might take these courses, typically offered through schools of education by mathematics education faculty, as part of their efforts to obtain an advanced degree or update their skills.
• Institutes or workshops associated with a math-science partnership (MSP). MSPs are typically operated jointly by mathematicians and mathematics educators, and originate from two funding streams. One, administered by the National Science Foundation, awards MSPs on a competitive basis. The other, administered by the U.S. Department of Education, awards MSP funds to states for transmission to local districts partnered with local universities. On the view that many teachers would not be able to accurately distinguish the original funding source of their MSP institute, and because these programs are both federal in nature and have similar goals, we elected to ask only about attendance at generic MSPs.
• Lesson study. Teachers might participate in lesson study by forming or joining a group to study curriculum materials and research, jointly adapting or designing a lesson, teaching the lesson, analyzing student responses, and revising or reteaching the lesson. U.S. lesson study is an adaptation of the methods used by Japanese teachers for professional development (Lewis, Perry, Hurd, & O’Connell, 2006). Lesson study is intended (as designed) to be mathematically intensive for a few select topics. However, the extent to which wide-scale implementation follows this design is unknown.
• Professional development surrounding new mathematics texts. Previous research (Cohen & Hill, 2001) identified curriculum-focused workshops as one method for improving teachers’ knowledge and facility in teaching. However, most text-based professional development is neither pedagogically nor mathematically intensive, consisting of an introduction to new products or even, in some cases, a sales pitch from publisher representatives.
In addition, both the 2005 and 2006 surveys asked teachers about their class assignments, years of experience, and credential type. The 2006 survey asked about teachers’ overall levels of mathematics and nonmathematics professional development. School characteristics, in particular the percentage of students who were free-lunch eligible in a given teacher’s school, were merged onto the dataset from the NCES Common Core Datafile.

ANALYSIS

Teachers’ answers to each year’s survey were entered into a two-parameter item-response theory (IRT) model. IRT returns person-level scores expressed in standard deviations, with a mean of 0 and a standard deviation of 1. We use this metric, rather than the more readily interpretable percent correct, because differences in percentages do not always represent equal intervals, and because our items are not criterion referenced. Two-parameter IRT models were used to score teacher responses; these models give greater weight to items with strong person-discrimination indices and less weight to items with low person-discrimination.

In addition to descriptive statistics, three types of analyses answered the questions posed at the outset of this paper. First, a probit model examined associations between teacher characteristics and attendance at different learning opportunities. Explanatory variables included school socioeconomic status, teachers’ gender and credential type, their year-one (2005) MKT score, and their 2005 teaching assignments (remedial/special education, integrated mathematics, general, algebra). Clearly, this is only half the story: teacher participation in specific learning opportunities is also determined by their availability, cost, and distance, as well as by district mandates/preferences regarding teacher attendance. However, these data were not available for the sample, a point we address in the conclusion.

Second, the 2005 and 2006 forms were equated and inspected for any growth in teachers’ mathematical knowledge.
Equating accounts for any differences in the overall difficulty level of items between years of the assessment. Items on the 2006 form, for instance, might have been more difficult than items on the 2005 form, making it appear as if teachers lost ground as measured in raw percentages when, in reality, the item difficulty levels accounted for the differences. The main method for equating was a comparison of three linking items per subject area that were repeated on both the 2005 and 2006 forms.^{4} If percent correct among the group of teachers who took both forms was roughly equivalent for the linking items (i.e., no evidence of learning in the aggregate), the forms were equated using common-person equating. That is, all items and all teachers from both 2005 and 2006 were entered into the same IRT calibration model, and teacher scores for these two years were derived from the item parameters produced in this calibration model. However, if percent correct on the linking items differed between the two years (as it did for one of the scales), we followed a common linking-item strategy for equating. In this strategy, the same items from 2005 and 2006 were lined up and an estimate of the relative difficulty of each form obtained. This estimate was used to adjust the 2006 form difficulty to the 2005 level. This adjustment was then the estimate for growth in the sample.

Finally, to determine whether change in teacher knowledge was related to a specific professional development activity, the analysis included a covariate adjustment model, with the 2006 MKT score as the dependent variable and the 2005 score and other independent variables as predictors. Covariate adjustment models are favored over gain-score models (ASA, 2007) because the model does not make an assumption about the slope of the relationship between the scores in each year, and because it lends itself to the investigation of curvilinear trends in the rate of change.
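The contrast between a covariate adjustment model and a gain-score model can be illustrated with a minimal Python sketch. This is not the study's analysis: the data are synthetic and noise-free, only one hypothetical attendance indicator is included, and the names `ols`, `mkt_2005`, `mkt_2006`, and `attended` are assumptions made for illustration.

```python
def ols(X, y):
    """Least squares via the normal equations (X'X) b = X'y, Gauss-Jordan solve."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        # Partial pivoting, then eliminate column c from every other row.
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Synthetic, illustrative data (NOT from the study).
mkt_2005 = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
attended = [0, 1, 0, 1, 1, 0, 1, 0]  # hypothetical attendance indicator
# Assumed generating model: intercept 0.1, slope 0.8 on the 2005 score,
# and an attendance effect of 0.2.
mkt_2006 = [0.1 + 0.8 * s + 0.2 * a for s, a in zip(mkt_2005, attended)]

# Covariate adjustment: the slope on the 2005 score is estimated freely.
b_cov = ols([[1.0, s, a] for s, a in zip(mkt_2005, attended)], mkt_2006)

# Gain-score model: regressing (2006 - 2005) on attendance implicitly
# fixes the slope on the 2005 score at exactly 1.
gains = [y2 - y1 for y1, y2 in zip(mkt_2005, mkt_2006)]
b_gain = ols([[1.0, a] for a in attended], gains)

print("covariate adjustment:", b_cov)
print("gain score:", b_gain)
```

Because the synthetic data contain no noise, the covariate adjustment fit recovers the assumed intercept, slope, and attendance effect, whereas the gain-score fit has no slope parameter to estimate.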
All statistical procedures were conducted in SAS 9.1, and independent variables were standardized for ease of comparison.

RESULTS

TEACHERS’ LEARNING OPPORTUNITIES

Table 2 shows teachers’ self-reported learning opportunities in the year between the two surveys, as reported on the 2006 form.^{5} Overall, 80% of teachers in the sample reported attending one of the opportunities listed here, a figure that closely parallels the 86% who, in a separate question, reported spending more than six hours in math-related professional development. This suggests the learning opportunities covered in Table 2 comprised a large proportion of the professional development available to these teachers during the 2005–2006 school year.^{6} The most popular form of mathematics-related professional development was lesson study, enrolling over 50% of teachers who responded to both waves of the survey. Participation in an MSP institute or workshop was nearly as popular, enrolling 48% of respondents. Mathematics and mathematics methods coursework was less frequently attended by teachers, which makes sense, given the financial and time commitments both require.^{7} Learning about new curriculum materials in a workshop format was reported by over a third of respondents.

Table 2. Teachers’ Learning Opportunities by Program or Type, 2005–2006
Like other studies (Cohen & Hill, 2001; Horizon, 2002; NCES, 2001), this one found that the modal professional development experience lasted eight hours or less (Table 2).^{8} This was true of every professional development category we inquired about, from coursework to lesson study.^{9} Still, a significant number of teachers reported spending over one day in most of the professional development programs listed above. Overall, 44% of the sample reported attending at least one of the listed professional development activities for more than a day.

As in past studies (Cohen & Hill, 2001), this survey found that many teachers combined attendance at different types of workshops during the year under study. Although 29% attended only one option, 25% combined two and 26% combined three or more. This suggests that teacher professional development is still largely fragmented across different types of learning opportunities within a given year. Our survey also uncovered a significant amount of nonmathematics professional development; 78% reported attending such learning opportunities, and the hours spent in other professional development were roughly comparable to those spent in mathematics professional development.

Overall, this paints a picture of a teacher population actively engaged in professional development, but engaged at only a minimal level of involvement, with potentially fragmented experiences. As in other studies (Cohen & Hill, 2000; Supovitz & Zief, 2000), many teachers limited their time investment to levels below what is generally considered effective by scholars. The combining of multiple professional development programs also appears in other studies (Cohen & Hill, 2000; Desimone et al., 2002), with attendant worry that such fragmentation of experiences leads to a lack of time and opportunity for in-depth study and growth.
PREDICTING ATTENDANCE AT LEARNING OPPORTUNITIES

The next question concerns whether teachers who participate in professional development appear to differ from those who do not. Previous research (Hill, 2007; Hill & Lubienski, 2007) has shown that teachers vary substantially in their mathematical knowledge, and that teachers who work in schools serving students of low socioeconomic status (SES) perform, on average, significantly worse on the MKT measure. Yet these low-SES schools serve students who would benefit from having teachers who are more prepared than the average U.S. middle school teacher. Ideally, the teachers more in need of mathematical knowledge would not only sign up more often for the professional development listed, but also engage in the opportunities (including MSPs and content/methods coursework) that were more likely to be mathematics-intensive.

To determine whether this was the case, the probit models (presented in Table 3) predicted attendance at any learning opportunity for more than eight hours (first column) and then at each learning opportunity for more than eight hours (subsequent columns). Attendance was predicted by teachers’ 2005 MKT scores, along with credentials, 2005 courses taught, and background variables. The results show that, in general, there is little tendency for these opportunities to reach where they are needed most, at least in the way that policymakers might hope. With the exception of MSPs, less knowledgeable teachers were not more or less likely either to attend these professional development opportunities generally or to attend more content-focused opportunities within this set. As for MSPs, the tendency for less knowledgeable teachers to attend was very slight, with a significance level of .10.
Additional models checked for nonlinearity by adding dummy variables representing high-scoring and low-scoring 2005 teachers (not shown), but found no general tendency on the part of either the mathematically needy or mathematically apt to enroll in any type of professional development. The exception was mathematics content coursework, where lowest-knowledge teachers were more likely to avoid this offering (odds ratio = 0.22). This runs counter to policymakers’ hope that lower-knowledge teachers would enroll in professional development focused exclusively on mathematics content, but accords with the results presented in Desimone et al. (2006).

Table 3. Participation in an Opportunity to Learn of More Than Eight Hours
+ Significant at p < .10. * Significant at p < .05. ** Significant at p < .01. *** Significant at p < .001.

The same lack of association between predictors and attendance was also true for student free-lunch eligibility (FLE). Only attendance at more than eight hours of MSP workshops/institutes and participation in mathematics coursework were predicted by the characteristics of students in a teacher’s school. And again, this is not a striking difference, with the odds of a teacher in an average-FLE school attending these workshops reaching only about 1.5 times the odds of a teacher working in a school one standard deviation lower on the FLE measure. In the ideal world, teachers of low-income students would be directed in greater numbers toward more mathematically substantive professional development.

In fact, the most consistent predictor of significant professional development experiences was the courses teachers reported teaching in 2005. Teaching an integrated mathematics program (e.g., the Interactive Mathematics Program (IMP) or Connected Mathematics) positively predicted participation in MSPs, lesson study, and new textbook workshops. Teaching a remedial math class positively predicted attendance at a textbook workshop. Teaching prealgebra or algebra slightly predicted attendance at an MSP.

The absence of “supply side” variables in these models (such as the availability of local opportunities, the cost of those opportunities, district preferences, and so forth) is of concern. If such variables were related to those included in the model, bias on regression coefficients due to omitted variables could occur. However, it is difficult to imagine how omitted variable bias would result in weak or no relationships for at least one key variable of interest, teachers’ prior MKT, as the presence of low-MKT teachers should theoretically spur more local opportunities.
More likely is a scenario in which decisions to enroll in professional development (whether made by teachers, districts, school personnel, or others) are simply orthogonal to mathematical knowledge. This lack of strong associations between most individual characteristics and professional development suggests that these offerings, often considered the primary mode for the upgrading of the teaching workforce, are unlikely to reach where most needed. In other words, with the exception of MSPs, there is no strong tendency for teachers with weaker MKT, or teachers of low-SES students, to enroll in professional development, or to favor the programs more likely to focus intensively on mathematics. This is discussed at more length in the conclusion.

However, the lack of strong selection effects, particularly with regard to school-level free-lunch eligibility and teachers’ prior MKT scores, is good news for efforts to identify potential learning effects. Although MSPs appear to be more likely to enroll low-SES/low-MKT teachers, for other learning opportunities there appears to be little or no selection into professional development based on teacher characteristics. This also further allays concerns regarding survey nonresponse; although low-knowledge teachers were more likely to drop out of the study, those who did participate had little tendency to attend specific forms of professional development.

DO TEACHERS LEARN AND, IF SO, HOW?

Next is an examination of the two years of data for evidence of teacher learning. To assess this possibility, the 2005 and 2006 surveys contained six identical items: three in algebra and three in number/operations. Table 4 shows little change in the percent correct on three common algebra items. The lack of learning effects in this area is also suggested by Table 5, in which the 2005 score and having taught algebra during the 2005–2006 school year were nearly the only significant predictors of the 2006 score.
There was also a slight tendency for teachers who scored at the 2005 extremes to score even better than expected in 2006, as shown by the 2005 score-squared terms. None of the learning opportunities we inquired about, and which so many teachers attended, were positively related to higher levels of teacher MKT in algebra (Table 5, models 1 and 3). The effect of teaching algebra during the year of the survey was 0.34 (p < 0.001). The other significant predictor in this model was the percent of free-lunch-eligible students within the teachers’ school; teachers from higher-poverty schools tended to lose ground, marginally, relative to teachers from lower-poverty schools (standardized beta = 0.06, p = 0.05).

Table 4. Differences in Percent Correct, 2005–2006, for Common Linking Items
Table 5. 2006 Score Predicted by 2005 Score and Learning Opportunities
+ Significant at p < .10. * Significant at p < .05. ** Significant at p < .01. *** Significant at p < .001.

The story is different for number and operations. Table 4 shows middle school teachers posted gains on all three linking items between the two administrations. Determining the true size of this gain, as measured in standard deviation units, is problematic, as having only three items in common between the two forms limited the accuracy of and methods for determining true growth. Treating the 2005 and 2006 samples as independent and using common item equating, the IRT-estimated gain is 0.25 standard deviations, and a bootstrap analysis that simulated the standard errors of the equating parameters put the average gain at nearly 0.5.^{10} Neither estimation method is entirely satisfactory; both rely solely on teacher performance on three items. A number of alternative hypotheses, including an “unlucky” choice of linking items (i.e., choosing three with large positive deviations from the 2005 percent correct), test-retest memory effects for specific items, or selection effects among those who received the second-year survey, could all inflate percentage correct scores on these particular items.

Fortunately, this project had previously conducted a study nearly identical to this one, and was able to use results from it to examine typical changes in percent correct in test-retest situations. This previous study, which focused on a K–6 population of nonrandomly sampled teachers, enrolled 281 teachers who took identical 24-item MKT assessments at two time points, roughly 9 months apart. Figure 1 shows the distribution of differences in percent correct for individual items over this time period. On average, teachers gained only 1% on these items, a statistically negligible amount. Two-thirds of values on the second administration were within ±3% of values on the first administration.
One-quarter of items had a deviation of +4% or more, and 8% of items had a deviation of −4% or less. Although there is clearly a greater chance of randomly selecting linking items with a positive than with a negative deviation, large positive deviations are still in the minority in this study. Basic probability theory suggests that we would have had to be very unlucky to select three such large positive-deviation items (0.25^{3} ≈ 0.016, or roughly a 1.6% chance of selecting such items by chance, assuming similar data structures).

Figure 1. Previous test-retest differentials in p-value (elementary sample)

Models predicting 2006 teacher performance from 2005 performance and opportunity-to-learn variables provided more clues to the veracity of these gain estimates. Because these scores were standardized via IRT, a comparison of 2005 and 2006 scores essentially comprised a comparison of teacher rank-order performance in these two years. Coefficients on the independent variables may or may not reflect real changes in teacher knowledge over the intervening year, but they do tell us how certain groups of teachers were becoming more or less like one another. Moreover, these models can tell us whether specific opportunities to learn resulted in better performance, vis-à-vis the average teacher in the sample. If significant predictors are found, this suggests the gain estimates above are at least correct in direction and in relative magnitude.

Table 5, model 2 shows that the original versions (standardized, but entered in the increments shown in Table 2) of some professional development opportunities do predict teacher performance on the assessment. The 2005 score and 2005 score-squared terms still explain the major portion of the variance in these models,^{11} with an adjusted r-squared of 0.57. However, with opportunity-to-learn variables included, the adjusted r-squared rises slightly to 0.59, and both math methods courses and math-science partnerships become significant.
For math methods coursework, a one-unit increment in time investment yields a 0.06 gain in posttest performance. For math-science partnerships, the effect is the same size, yet in the opposite direction. Finally, these models show a negative effect of teaching in a high-poverty school, with such teachers tending to lose ground compared to others in the population.

To investigate this further, and in particular to assess the possibility that the original scale does not properly capture the effects of concentrated professional development, the fourth column shows models that divided each learning opportunity into two variables: attended for eight hours or less, and attended for nine or more hours. Two dummy variables were created, with the teachers who did not attend any professional development program serving as the reference category. Table 5, model 4 shows that none of these learning opportunities had an effect on teacher knowledge at eight hours or less. This echoes the many studies (e.g., Garet et al., 2001) which demonstrate that short-term workshops do little to improve teaching, learning, or in this case, teacher knowledge. Yet Table 5 also shows that, of the opportunities to learn we inquired about, attending graduate or undergraduate mathematics methods coursework for nine or more hours was related to the 2006 score. These teachers scored 0.23 standard deviations higher than teachers who did not attend math methods courses. Figure 2, which plots program effects by hours invested, shows this effect rose and peaked among those who spent large amounts of time (40–80 hours) on this form of professional learning. Many of these teachers were younger, in their third to fifth year of teaching, and presumably seeking a master’s degree or other certification that would either increase their salary or provide a basis for recertification.

Figure 2. Number and operations gain by hours invested

Model results are not favorable for math-science partnerships.
Individuals who enrolled in these partnerships performed, on average, 0.18 standard deviations (standard error = 0.07) below teachers who reported attending no professional development. This does not necessarily mean that the MSPs caused a direct loss of professional knowledge; one possibility based on evidence in Table 3, for instance, is that teachers from high-poverty schools were more likely to choose to attend an MSP, and that those working in high-poverty schools lost ground across this year, perhaps as a result of less challenging student curriculum and coursework.^{12} In fact, MSP attendees were more likely to be teaching remedial coursework (r = 0.07, p < .10) than the general population. Further, MSPs had no negative effect on teacher algebra scores using the year-to-year equating procedures, and MSP attendees were slightly less mathematically knowledgeable at the outset of this program. This suggests caution in interpreting the effects of this program.

Results for lesson study, new textbook workshops, and math coursework were all flat in these models. In many respects, these findings make sense. Lesson study is designed to delve in depth into specific mathematical content, such as proportional reasoning or operations with fractions; although teachers may grow on that specific content over the course of a lesson study cycle (often 3–4 months), the MKT measure may not contain sufficient items on the topic to gauge growth. Workshops on new textbooks vary dramatically, from sales pitches by publishers to in-depth work around how to implement specific lessons. And math content coursework might also not align well with the MKT measure, particularly if that coursework focuses on postsecondary topics.

Results from these models are fairly stable; models that included each independent variable singly, when compared to the models in which all of the independent variables were included, did not change which variables appeared to be important.
Interactions between 2005 IRT scores and the variables representing high investments in math, methods, and MSPs were insignificant. Compared to the effect of the 2005 mathematics knowledge score, the effects of these independent variables were not large. They increased the r-squared by about 0.02 over models with only control variables, suggesting again that the effect of professional development on teacher knowledge was, in this set of respondents, not large. Further, the significant learning opportunity was one that enrolled relatively few teachers (7%).

This, then, leads back to the problem of a true growth estimate for the population of teachers. The most conservative interpretation would be to say that the sample of teachers gained modestly (a percent-correct difference of 5–10%, or a standardized score gain of roughly 0.25) on the three number and operations items included on both waves of the survey. These items covered key middle school topic areas: an explanation for why the division algorithm works, an explanation for why cross-multiplication works as a method for finding unknowns in ratios, and an item that asked teachers to evaluate different methods for solving problems involving proportional reasoning. However, the extent to which gains on these items represented true gains in the population of teachers across the wider domain of knowledge is unknown.

There are some reasons to think that the gains seen on these items might generalize past the immediate three included on both forms to the domain of number/operations more generally. These items were chosen to represent a range of item difficulties (easy, medium, difficult), and because they each also had a high discrimination index. They represented the kinds of topics teachers might learn in content-focused professional development, including the meaning of operations and how to evaluate nonstandard solution methods.
This suggests that they are well-positioned to track a shift upwards in teacher knowledge during this time. Item gains were at the outer edge of gains recorded in a similar 2002 study. In addition, teacher number/operations scores are responsive to specific opportunities to learn this content. The fact that mathematics methods coursework, rather than mathematics content coursework, predicted these gains is also suggestive, as the topics covered in methods courses are more likely to match the items on the assessment than content courses, which typically deal in more advanced mathematical topics. Yet, as one colleague suggested upon viewing these results, a gain of 0.25 standard deviations per year means that the problems of mathematics teaching and learning in the United States would be quickly solved. Thus, the best guess is that the gain of 0.25 indicated by these common items most likely represents the upper bound on growth during this time.

DISCUSSION AND CONCLUSIONS

This study, like many survey-based studies of its kind, has numerous limitations. Response rates are typically below 50% in multiwave studies, and this was no exception. Nonresponders might differ from responders in important ways, rendering the descriptive analysis of professional development attendance particularly problematic and limiting results of the regression models to this particular set of respondents, rather than allowing a generalization to the wider sample of U.S. middle school teachers. The study also lacked any way to track the availability of teachers’ opportunities to learn, or the “supply” side of the models matching teachers to learning opportunities. Without these data, full models of teacher choice and attendance are not possible. We asked only about generic programs and formats, which limits our capacity to make claims about effective content.
Further, claims about the absolute size of growth in teachers’ MKT are disputable, given the use of only a few linking items per construct. For these reasons, conclusions from this study are tentative.

Nevertheless, some provisional observations can be made from the data presented here. This study compares portraits of individuals’ participation in professional development during 2005–2006 with those from older studies. Based on a comparison of evidence, it appears that patterns of teacher professional development are not strikingly different than they were 10 to 20 years ago. Although new professional development methods have become popular (e.g., lesson study) and there has been a strong move toward content-focused learning opportunities, the patterns uncovered by studies such as Little (1989), Horizon Research (2002), NCES (2001), and Cohen and Hill (2001) remain the same. Most teachers participate in a day or two of professional development per year, and that professional development is often scattered across multiple sites. And teachers in this study reported dividing their professional development time roughly equally between mathematics and nonmathematics professional development. Without tracking the same teachers and their opportunities to learn over time, we cannot detect fine-grained trends. However, the consistency of reports from Little (1989), Cohen and Hill (2001), NCES (2001), and others is striking.

The associations between teacher characteristics (in particular, teacher need) and participation in professional learning also appear weak. Less knowledgeable teachers are not more likely to spend more than eight hours in a mathematics-focused learning opportunity, nor did prior mathematical knowledge strongly predict attendance at any specific workshop. Teachers in high-poverty schools were more likely to attend a Math-Science Partnership and/or content coursework.
Yet neither of these learning opportunities predicted upwards movement of participants’ MKT score, relative to other teachers in the sample. This is particularly troubling in light of the timing of this study, which occurred during a period in which the federal government had elected to target middle school teachers without mathematics subject majors and, presumably, with weaker mathematical backgrounds. However, findings are consistent with prior studies, including Cohen & Hill (2001), which found weak predictors of teacher enrollment in professional development, and Desimone et al. (2006), which found that more-qualified and prepared teachers were in fact more likely to enroll in content-focused workshops.

One possible reason for the relative constancy of the professional development system is its broad support from both public policy and individual-level incentives. Teachers typically face district and state requirements for participation in professional development. In many locations, these requirements, which enable continued employment and/or relicensure, consist of a minimum number of hours (typically 8–16 hours) and in many places, specific “bins” of professional development types that must be fulfilled (e.g., HIV training, campus safety training). Teachers who do not possess a master’s degree do possess an incentive to invest in coursework leading to that degree. Yet with this exception, seldom are deeper, more substantive investments in learning mathematics formally rewarded. Teachers who participate in 40 hours of lesson study do not receive official credit in excess of those who participate in eight hours, or those who participate in less intellectually serious pursuits.

Another reason for the constancy of the professional development system might be the nature of individual and institutional forces that direct teachers to participate in learning opportunities.
Observers of the system often worry that teachers do not have enough choice (Little, 1989) or have too much choice (Cohen & Hill, 2001) in crafting their own learning trajectories. This study suggests that this debate misses the mark. Instead, it is the failure of both individuals and institutions to direct teachers in need into appropriate mathematical learning opportunities that is salient. Teachers in this study vary in their baseline level of mathematical knowledge. Yet although policymakers recognize this problem, there are no means to ameliorate it. Neither individual choice nor district/school mandates will suffice in the absence of diagnostic information on teachers’ mathematical knowledge.

One possibility is a plan under consideration in Massachusetts.^{13} There, teachers in underperforming schools (based on state test scores) will be required to take a mathematics assessment and, in consultation with their principal or other supervisors, design an appropriate plan of professional development. Ideally, such a plan would enable the identification of the lowest-knowledge teachers, which would serve both to improve the chances that these individuals would invest heavily in learning mathematics, and also to flag these individuals for closer observation and mentoring within schools. This plan will undoubtedly be controversial as well as difficult to implement; however, given the failure of current policies to target those teachers in need of the most improvement (perhaps because of the use of proxy measures and the focus on postsecondary mathematics coursework), such a plan is worth broadly considering.

In a related point, one broad concern about the professional development system is its ability to serve teachers working in the neediest schools.
The evidence here, however, suggests that these teachers do not more often take mathematics professional development, do not select into the most mathematically intensive of these opportunities, and do not overpopulate the one opportunity found to be associated with 2006 scores: math methods coursework. How well this system serves high-poverty schools is a serious issue for policymakers to consider.

Next, the finding that teachers who invest time in mathematics methods coursework tend to grow, relative to teachers who attend no professional development, is well substantiated by the data, but its practical interpretation comes with several caveats. The first is that this study examines only broad-scale programs and formats; it is unknown whether any specific mathematics methods course is effective. In a similar vein, the lack of significant positive effects for other programs and formats may mask considerable variability; some MSPs, for instance, may be highly effective while others are not. Finally, the effects of the methods coursework are relatively small, in terms of variance explained, owing to both a modest effect size and the tiny fraction of teachers who attend such programs. Nevertheless, identifying this as a potential source of growth in teacher knowledge should spark interest in this form of teacher development, and in conducting follow-up research to determine the accuracy of this claim.

Finally, as economists would point out, there are costs and benefits to spending federal dollars on professional development, as opposed to other forms of teacher workforce improvement. Unfortunately, the design of this study does not allow a definitive answer to the question of size of teacher learning effects. But the data presented here are still instructive, both as suggestions of trends and as possible causal mechanisms.
As noted above, the federal government spent roughly $1 billion in the 2002–2007 period on Math-Science Partnerships, matched no doubt by district and private (teacher) spending on coursework and other forms of mathematics professional development. And teachers who responded to both waves of our survey gained up to 0.25 standard deviations in number and operations MKT over this time. To the extent that this is a true effect rather than an artifact of our choice of linking items, any effect is promising. It means that, over time, the quality of the U.S. teaching force might be improving. Yet critics might argue that other methods, such as raising licensure requirements or bringing in mathematically talented individuals from other professions (and offering them opportunities to learn MKT), might prove more cost-effective. This is a pressing issue for future research.

In addition to research on specific topics or programs, future large-scale research might help resolve many questions that arise regarding this study’s claims. Strengthening study design would be one important step. This would include using many more common items among waves, conducting follow-up work with survey nonresponders to estimate the effects of nonresponse, and asking more specific questions about teachers’ learning opportunities. Another would be to extend this type of study into a true panel design, perhaps with oversampling of teachers known to be participating in specific programs, to ascertain longer-term effects, and to remove statistical difficulties associated with having only two time points of data. Finally, becoming more clear on teacher selection into professional learning opportunities by asking many more direct questions about district/school mandates, reasons for selection, reasons for the length of time investment, and so forth, would be of benefit to many.

Acknowledgments

The author would like to thank Dan Koretz, Stephen G.
Schilling, and Judy Singer for assistance with the paper; errors remain the property of the author. This research was funded by NSF grants REC-0207649, EHR-0233456, and EHR-0335411.

Notes

1. Downloaded and totaled on December 21, 2007, from www.nsf.gov/awardsearch/ and www.ed.gov/programs/mathsci/awards.
2. Desimone et al. (2002) had 430 teachers in their original sample, but analyzed responses from only 207; cases were deleted for several reasons, including incomplete responses across waves. Other longitudinal studies (e.g., Heck, 2008) fail to report their effective sample size, naming only the number of completed questionnaires received.
3. Yearly scores are constructed and equated via IRT methods; the number of items per year does not affect findings.
4. We used only three per content area because the main goal of piloting was to provide a rich and diverse item bank for program evaluation. With only 35 items per form, repeating additional items would decrease the size of that item bank.
5. Results are weighted so that sample reflects national averages.
6. In fact, only 40 teachers in the sample reported having attended professional development in mathematics, but did not mark one of the opportunities listed in Table 2.
7. An unusually large number of teachers (12%) reported spending 8 hours or less in university mathematics methods or content courses. Whether these teachers reported credit hours (rather than hours), enrolled then dropped out of the program, or were simply confused by the question is unknown.
8. In a separate question, we asked for teachers’ estimate of their overall investment in mathematics-focused professional development: 36% reported spending 6 hours or less, 32% reported attending between 6 and 15 hours, and 31% reported attending over two days.
9.
Without validation work, we cannot know why so many individuals reported spending 8 hours or less in a setting (such as a content or methods course) that typically meets for about 40 hours/semester.
10. Thanks to Stephen G. Schilling for this calculation.
11. The positive coefficient on the IRT squared term suggests that teachers at the lowest and highest ends of the 2005 distribution gained more than those in the middle.
12. Deleting MSP from the number/operations model also makes the FLE variable fully significant. We checked alternative explanations, for instance that some MSP attendees were disproportionately no longer teaching mathematics (and thus were forgetting mathematics for teaching). This was not the case.
13. Downloaded 6/1/08 from http://www.doe.mass.edu/lawsregs/603cmr2.html?section=05.

References

American Statistical Association. (2007). Using statistics effectively in mathematics education research. Retrieved December 31, 2007, from http://www.amstat.org/research_grants/pdfs/SMERReport.pdf
Ball, D. L. (1990a). The mathematical understandings that prospective teachers bring to teacher education. Elementary School Journal, 90(4), 449–466.
Ball, D. L. (1990b). Prospective elementary and secondary teachers’ understanding of division. Journal for Research in Mathematics Education, 21, 132–144.
Ball, D. L., & Bass, H. (2003). Making mathematics reasonable in school. In G. Martin (Ed.), Research compendium for the principles and standards for school mathematics (pp. 27–44). Reston, VA: National Council of Teachers of Mathematics.
Bobrowsky, W., Marx, R., & Fishman, B. J. (2001). The empirical base for professional development in science education: Moving beyond volunteers. Paper presented at the National Association of Research in Science Teaching, St. Louis, MO.
Borko, H. (2004). Professional development and teacher learning: Mapping the terrain. Educational Researcher, 33(8), 3–15.
Carpenter, T. P., Fennema, E., Franke, M. L., & Empson, S. B. (1999).
Children's mathematics: Cognitively guided instruction. Portsmouth, NH: Heinemann. Cobb, P., Wood, T., Yackel, E., Nicholls, L., Wheatley, G., Trigatti, B., & Perlwitz, M. (1991). Assessment of a problemcentered secondgrade mathematics project. Journal for Research in Mathematics Education, 22, 3–29. Cohen, D. K., & Hill, H. C. (2001). Learning policy: When state education reform works. New Haven, CN: Yale University Press. Desimone, L. Porter, A. C., Garet, M. S., Yoon, K. S., & Birman, B. S. (2002). Effects of professional development on teachers’ instruction: Results from a threeyear longitudinal study. Educational Evaluation and Policy Analysis, 24(2), 81–112. Desimone, L., Smith, T., & Ueno, K. (2006). Are teachers who need sustained, contentfocused professional development getting it? An administrator’s dilemma. Educational Administration Quarterly 42(2), 179–215. Even, R. (1993). Subjectmatter knowledge and pedagogical content knowledge: Prospective secondary teachers and the function concept. Journal for Research in Mathematics Education, 24, 94–116. Garet, M. S., Porter, A. C., Desimone, L., Birman, B. F., & Yoon, K. S. (2001). What makes professional development effective? Lessons from a national sample of teachers. American Educational Research Journal, 38(4), 915–945. Hanushek, E. A. (2003). The failure of inputbased schooling policies, Economic Journal 113, pp. F64–F98. Harris, D. N., & Sass, T. R. (2007). Teacher training, teacher quality, and student achievement. Washington DC: CALDER Working Paper 3. Hill, H. C. (2004). Professional development standards and practices in elementary school mathematics. Elementary School Journal 104, 215–31. Hill, H. C. (2007). Mathematical knowledge of middle school teachers: Implications for the No Child Left Behind policy initiative. Educational Evaluation and Policy Analysis (29), 95–114. Hill, H. C. (2009). Fixing teacher professional development. Phi Delta Kappan, 90, 470–477. Hill, H. C., & Ball, D. L. (2004). 
Learning mathematics for teaching: Results from California’s Mathematics Professional Development Institutes. Journal for Research in Mathematics Education 35, 330–351. Hill, H. C., & Lubienski, S. T. (2007). Teachers’ mathematics knowledge for teaching and school context: A study of California teachers. Educational Policy 21(5), 747–768. Hill, H. C., Rowan, B., & Ball, D. L. (2005). Effects of teachers' mathematical knowledge for teaching on student achievement. American Educational Research Journal, 42, 371–406. Horizon Research. (2002). The 2000 national survey of science and mathematics education: compendium of tables. Chapel Hill, NC: Horizon Research. Lewis, C., Perry, R., Hurd, J., & O’Connell, M. P. (2006). Lesson study comes of age in North America. Phi Delta Kappan, 88(4), 273–281. Little, J. W. (1989). District policy choices and teachers' professional development opportunities. Educational Evaluation & Policy Analysis, 11, 165–179. Ma, L. (1999). Knowing and teaching elementary mathematics: Teachers' understanding of fundamental mathematics in china and the united states. Mahwah, NJ: Erlbaum. National Center for Education Statistics. (2001). Teacher preparation and professional development: 2000, (NCES 2001–088, 2001). Washington, DC: National Center for Education Statistics. Unpublished manuscript. National Mathematics Advisory Panel (2008). Foundations for success. Washington, DC: U.S. Department of Education. Post, T. R., Harel, G., Behr, M. J., & Lesh, R. (1991). Intermediate teachers' knowledge of rational number concepts. New York, NY: State University of New York Press. Rockoff, J. E., Jacob, B. A., Kane, T. J., & Staiger, D. O. (2008). Can you recognize an effective teacher when you recruit one? NBER Working Paper 14485. Cambridge, MA: National Bureau of Economic Research. Rowan, B. (2002). The ecology of school improvement: Notes on the school improvement industry in the United States. Journal of Educational Change 3, (3–4), 283–314. Saxe, G. 
B., Gearhart, M., & Nasir, N. S. (2001). Enhancing students' understands of mathematics: A study of three contrasting approaches to professional support. Journal of Mathematics Teacher Education, 4, 55–79. Schmidt, W. H., & others. (2007). The preparation gap: Teacher education for middle school mathematics in six countries (MT21 report). Michigan State University: Authors. Unpublished manuscript. Singer, J. D., & Willett, J. B. (2003). Applied longitudinal data analysis: Modeling change and event occurrence. New York, NY: Oxford University Press. Supovitz, J., & Zief, S. G. (2000). Teacher quality: Survey reveals invisible barriers to teacher participation. Journal of Staff Development 21, 25–28. Swafford, J. O., Jones, G. A., & Thornton, C. A. (1997). Increased knowledge in geometry and instructional practice. Journal for Research in Mathematics Education, 28(4), 467–483. Wayne, A. J., Yoon, K. S., Zhu, P., Cronen, S., & Garet, M. S. (2008). Experimenting with teacher professional development: Motives and methods. Educational Researcher, 37(8), 469. Wei, R. C., DarlingHammond, L., Andree, A., Richardson, N., & Orphanos, S. (2009). Professional learning in the learning profession: A status report on teacher development in the United States and abroad. Dallas, TX: National Staff Development Council. Whittingon, D. (2002). The status of middle school mathematics teaching. Chapel Hill, NC: Horizon Research. Wilson, S. M., & Berne, J. (1999). Teacher learning and the acquisition of professional knowledge: An examination of research on contemporary professional development. Review of Educational Research 24, 173–209.


