THE WORLD BANK
Discussion Paper
EDUCATION AND TRAINING SERIES
Report No. EDT64

Examination Reform in Kenya

H.C.A. Somerset (consultant)

February 1987

Education Policy Division
Education and Training Department
Operations Policy Staff

The views presented here are those of the author(s), and they should not be interpreted as reflecting those of the World Bank.

The World Bank does not accept responsibility for the views expressed herein, which are those of the author(s) and should not be attributed to the World Bank or its affiliated organizations. The findings, interpretations, and conclusions are the results of research or analysis supported by the Bank; they do not necessarily represent official policy of the Bank.

Copyright © 1987 The International Bank for Reconstruction and Development / The World Bank

This report discusses a program carried out in Kenya directed towards the reform of the examination system, and particularly of the examination which terminates the basic education cycle and governs access to secondary school. The program had five specific aims: first, to improve the efficiency of the examination as a selection instrument; second, to give less-privileged pupils (including rural pupils, girls, and those from low-income families) a better chance of showing their abilities and hence gaining access to secondary school; third, to encourage and assist teachers to provide all pupils, but especially those for whom basic education will be terminal, with a more relevant set of cognitive skills; fourth, to improve the overall quality of basic education; and fifth, to reduce quality differences among districts and among schools.

Towards these five goals, two main instruments of reform were employed. First, the content of the examination itself was changed substantially.
It now attempts to test competence in a much wider range of cognitive skills, including the ability to observe and to interpret data, to reason and to solve problems, and to communicate effectively using connected prose. Whenever possible, these skills are tested in contexts similar to those in which they are likely to be used when the pupils leave school.

Second, the examination is now used as an instrument for monitoring the performance of the schools, and hence for improving educational quality. The information generated is of two main kinds: overall performance information, which provides an incentive for schools and districts to improve their performance; and item response information, which provides insight into the difficulties pupils are having in answering the questions, and hence into the remedial action needed. Incentive information is fed back to the schools through district and school merit order lists, and guidance information through the CPE Newsletter, published annually.

The initial impact of the introduction of the feedback system was to widen achievement differences among districts, because the districts which were already performing well responded to the new information more rapidly. In 1980 and 1981, however, after the system had been in operation for four years, this trend was reversed: nearly all districts which had been lagging showed striking performance gains.

Content changes have in general made the examination both a better test of terminally-relevant skills and a more efficient selection instrument. But in some cases the goals of relevance and equity have proved incompatible with each other: items testing relevant skills have been answered much less successfully by pupils in low-cost schools than by pupils in privileged high-cost and private schools. Efforts are being made through the information feedback system to provide teachers in low-cost schools with the guidance they need to teach these skills more effectively.
ACKNOWLEDGMENTS

So many people participated in various ways in the reform program discussed in these pages that it would be impossible to name all of them. They include Ruth Kagia, Wilfred Kimalat, Philip Kitui, Michael Mbithi, Fred Njoroge, Joe O'Connor, Jean Palutikof, Ian Roberts, Mike Savage, and Mark Sinclair, all of whom played key roles in formulating objectives for the reformed examination, and in devising appropriate questions to test these objectives. Joe Mbugua and Roy Lock wrote many of the computer programs needed.

The leadership of B.M. Makau, first as Chief Examinations Officer to the Examinations Section, Ministry of Education, and then as Secretary to the Kenya National Examinations Council, requires special acknowledgment. Without Ben Makau's commitment to the use of research as an instrument of change, and his active involvement in establishing new goals, new procedures, and new measuring instruments, the reforms would never have come into being.

During the early stages of the project, before moving to the Ministry of Education, I worked at the Institute for Development Studies, University of Nairobi, and during the writing-up period, I was based at the Institute of Development Studies at the University of Sussex, U.K. To the Directors, Fellows, and staff of both institutes, I am grateful for much stimulus and support.

The various drafts of this document have been typed most efficiently by Margaret Ochieng and Maureen Dickson. Both developed uncanny skills in decoding my handwriting.

The main financial costs of the reform program were met from the recurrent budgets of the Examinations Section and the Kenya National Examinations Council. My own participation, however, was financed by grants from the Rockefeller Foundation, New York. I am most grateful to Dr. Ralph K. Davidson, Dr.
David Court, and other officers of the Foundation for their interest and commitment, and especially for their recognition that a long-term involvement was needed to produce results.

This draft was completed in mid-1983. Since then, there has been a major change in the formal structure of education in Kenya. The 7-4-2-3 structure, which was current when the reform program discussed in these pages was being implemented, has been replaced by a simplified 8-4-4 structure. The open-access cycle has been extended from seven to eight years, and the two-year A-level course will shortly be abolished.

As a consequence of this change, pupils sat the Certificate of Primary Education (CPE) examination for the last time in 1983. To replace the CPE, a new terminating examination, the Kenya Certificate of Primary Education, or KCPE, was introduced in 1985 for eighth-grade pupils. The KCPE differs from its predecessor in three important ways: first, two papers in Kiswahili have been introduced, one multiple-choice in format and the other a composition paper; second, the subjects which made up the CPE general paper are now examined in separate, multiple-choice papers; and third, pupils are required to participate in a practical project chosen from a range of subjects, including agriculture, art, and craft. Their work is assessed by teams of primary teachers, and the marks awarded count towards the overall KCPE result. All of these reforms have had substantial impact.

There have also been important changes in the leadership of the Kenya National Examinations Council. The Council now has a new Secretary and Deputy. The reform program discussed in these pages, however, has been continued and extended.

The CPE program generated a great deal of quantitative and qualitative data, only a small proportion of which is reported here. The paper is intended mainly as a description of an action program rather than as a research monograph.
I hope it will be possible to discuss research findings more fully in a later publication.

In writing this report, I have benefitted from discussions with many people, but the opinions expressed are entirely my own. They should not be taken as representing the views of the Kenya Ministry of Higher Education, the Kenya National Examinations Council, the Rockefeller Foundation, the Institute for Development Studies, Nairobi, the Institute of Development Studies, Sussex, nor of the World Bank.

Table of Contents

CHAPTER 1. INTRODUCTION
    A. WHY EXAMINATIONS ARE IMPORTANT
    B. THE QUALITY OF EXAMINATIONS
CHAPTER 2. THE EXAMINATIONS SYSTEM IN KENYA
    A. THE STRUCTURE OF EXAMINATIONS
    B. THE ADMINISTRATION OF EXAMINATIONS
    C. THE EXAMINATIONS RESEARCH UNIT
CHAPTER 3. CHANGES IN THE KENYA CERTIFICATE OF PRIMARY EDUCATION: GOALS AND MODES OF REFORM
    A. GOALS OF REFORM
    B. MODES OF REFORM
    C. MODES AND GOALS: THE ANTICIPATED LINKAGES
CHAPTER 4. CPE EXAMINATION CHANGES: THE MULTIPLE CHOICE PAPERS
    A. SCIENCE
    B. OTHER SUBJECTS
    C. CHANGES IN SETTING AND MODERATION PROCEDURES
CHAPTER 5. CPE EXAMINATION CHANGES: THE ENGLISH COMPOSITION PAPER
CHAPTER 6. CPE PROCESSING CHANGES: STANDARDISATION OF SCORES AND DETECTION OF CHEATING
    A. THE STANDARDISATION OF CPE SCORES
    B. DETECTION OF CHEATING
CHAPTER 7. INFORMATION FROM THE CPE: ANALYSIS AND FEEDBACK
    A. TYPES OF INFORMATION GENERATED
    B. ASSESSMENT OF THE EQUITY OF THE CPE
    C. ASSESSMENT OF THE EFFICIENCY OF THE CPE
CHAPTER 8. EFFECTS OF THE REFORM PROGRAM: SOME PRELIMINARY DATA
    A. THE EFFECTS OF THE INFORMATION FEEDBACK SYSTEM ON QUALITY DIFFERENCES AMONG DISTRICTS
    B. RELEVANCE, EQUITY AND EFFICIENCY
BIBLIOGRAPHY
ANNEXES
    1. MEAN DISTRICT STANDARD SCORES, 1981
    2. MEAN SCHOOL STANDARD SCORES, TANA RIVER DISTRICT
    3. EXCERPTS FROM CPE NEWSLETTERS, 1980 AND 1981
        A. ENGLISH OBJECTIVE PAPER, 1980
        B. SCIENCE PAPER, 1980
        C. ENGLISH COMPOSITION, 1981
        D. MATHEMATICS, 1981
        E. HISTORY, 1981
        F. GEOGRAPHY, 1981
    4. HEAD TEACHER'S QUESTIONNAIRE, 1981 NEWSLETTER
    5. ITEM DIFFICULTY PROFILE, 1980

CHAPTER 1

INTRODUCTION

A. Why Examinations are Important

Wherever selection examinations are entrenched in an education system they arouse strong feelings. They are attacked, with good reason, on two main counts.

In the first place, examinations are seen as casting a thrall over the work which should be done in the schools. They thwart attempts to introduce more relevant curricula and more imaginative teaching methods. By encouraging pupils to concentrate on the rote recall of disconnected facts, or even on the memorisation of set answers to likely questions, they act as destroyers of creativity and effective thought.
These are some of the more pernicious aspects of Dore's 'diploma disease', discussed in his perceptive book of that title (Dore 1976).

Secondly, examinations are attacked for the harshness and capriciousness with which they allocate educational and career chances. No single event, it is argued, taking place over a period of only a few days, should have such profound, and often irreversible, effects on the life prospects of young people.

Feelings are usually strongest about the examination which terminates the basic 'open-access' cycle of education: that is, the period during which all children have the right to schooling partly or wholly at the public expense without having to meet a performance criterion. In many low-income countries, particularly those in Africa, the open-access cycle is short, so children sit the terminating examination while they are still immature, and too young to have much chance of finding employment if they fail. In the Gambia, for example, open-access education lasts only six years, and in Tanzania, Kenya and Botswana only seven years, so many children reach their first selection hurdle when they are only fourteen years old, or even younger.1

Given these circumstances it is hardly surprising that proposals to reduce the role of examinations have been a keystone of plans for reform of many third-world education systems. A conservative proposal, one which has been partly implemented in many countries, is that the number of selection examinations should be reduced through extension of the open-access cycle. A more radical proposal is that external examinations should be abolished altogether, and their place taken by internal assessment.

1. Extension of the open-access cycle.
Over the past two decades or so a high proportion of third-world countries have extended their open-access cycle, and in consequence have been able to reduce the number of selection examination hurdles which pupils must surmount in their progress from the infant room to university. In Kenya the open-access cycle was extended in the early 1960s from only four years to seven years, and the number of selection examinations reduced from four to three. Infant and middle-standard teachers were thus released from the responsibility of preparing their pupils for an external examination, and were free to explore the possibilities of less formal methods. Without this reform, it is doubtful that the emphasis on activity and child-centred methods, which was being introduced at about the same time through the New Primary Approach, would have gained such rapid and widespread acceptance.

During the 1970s two major reports on the Kenyan education system, the International Labour Office Report (1972) and the Report of the National Committee on Educational Objectives and Priorities (1976), both recommended a further extension of the basic cycle to nine years. If this proposal were implemented, the number of selection hurdles would fall to only two. Furthermore, pupils would be a great deal more mature when they sat their first selection examination: on average, sixteen years old rather than only fourteen years as at present. Their marks would thus give a more accurate indication of their potential to benefit from continued education, and they would be better fitted to compete for employment opportunities if they failed.

Few would argue that seven years of open-access education is enough. In an ideal world, no child would be denied the right to publicly-provided education on the grounds of inadequate achievement before he had completed at least ten years at school. Unfortunately, however, cost is a major constraint.
Third-world countries are poor countries, and many already spend a disproportionately high share of their meagre resources in supporting formal education. In Kenya, well over 30% of annual government expenditure goes to meet costs of formal and non-formal education and training programmes. Any substantial increase in this percentage would endanger investment in developments which directly create more wealth and hence more employment. More money spent on education would almost certainly mean less money available for improving agriculture, developing irrigation, or promoting industrialisation. Education does not create jobs; for the most part it can only equip individuals with some of the skills they need to compete for existing opportunities.

It is conceivable, of course, that ways could be found to extend the open-access cycle without significantly increasing costs. Possibilities include increasing the proportion of untrained teachers, increasing average class sizes, lowering the salaries of trained teachers, and reducing the money spent on books and other school equipment. All of these measures, however, would almost certainly have detrimental effects on school quality, and as we shall see later in this paper, low and uneven standards of performance are already a major cause for concern. In Kenya, the amount of money spent on school equipment is barely adequate to provide exercise books, chalk and perhaps one textbook per child each year. Any substantial reduction would leave many children without books at all. Conditions are similar in many other third-world countries (Heyneman 1980).2

Despite the difficulties, it seems inevitable that over the next decade most third-world countries that have not already done so will find ways to extend open-access education. But this step, however it is taken, will not solve the problem of selection.
Some examinations will be abolished, but other selection systems, for pupils a little further along the educational pipeline, will have to take their place. The need to select does not derive from factors which educational planners can control; rather, it has its roots in the context of poverty within which all third-world education systems must function.

But from this it does not follow that selection must necessarily be carried out through the agency of external examinations. We have already mentioned the possibility that internal assessment, carried out by head teachers and class teachers, could be used instead.

2. Internal assessment as a substitute for external examinations?

In some affluent countries, particularly those with strong egalitarian traditions, internal assessment has almost entirely replaced external examinations as a means of judging pupil performance. In the USA, for instance, the first major external examination that must be faced is the college entrance examination. In New Zealand, pupils are still required to sit an external examination after their eleventh year of schooling, but most university recruits are selected simply on the basis of their schools' recommendation. This system, which has been in operation for more than 35 years, was subjected to a detailed evaluation during the 1950s. The data indicated that schools were, for the most part, identifying their ablest pupils with an impressive degree of accuracy (Parkyn, 1958).

At first sight, the virtues of internal assessment as a replacement for external examinations in third-world countries seem impressive. Teachers know the strengths and weaknesses of each individual pupil well, and can judge his capacity to benefit from further education, or from a particular training or employment opportunity, from a much wider sample of work than is accessible to the external examiner, reading a script written by a pupil he has never met.
Furthermore, teachers have the opportunity to observe pupils over extended periods of time, so their assessments are less likely to be biassed by day-to-day variations in performance than are those of the external examiner. Again, a teacher can take non-cognitive aspects of performance into account if required, whereas an examination script gives a measure of cognitive skills alone. Internal assessment has the further advantage that it is less stressful to the pupils: the anxiety and tension which inevitably build up as examination day approaches are avoided. Finally, internal assessment would free the schools from the restrictive backwash effects on content and teaching methods which a selection examination imposes during the last two or three years of the course. Given these advantages, it is small wonder that internal assessment has been frequently urged as a solution to third-world selection problems.

Nevertheless, despite the obvious attractions of internal assessment, most third-world countries continue to rely heavily on external examinations. This is not the result of conservatism, or lack of knowledge of the alternatives. Rather, it is the consequence of the economic and social context in which third-world education systems must operate. Four major aspects can be identified.

In the first place, because of the limited resources available for education, opportunities to continue formal schooling past the open-access cycle are usually sharply restricted. Consequently, competition for the available opportunities is intense.
In Kenya, the structure is as follows, for government-maintained schools only:

    Cycle                            Duration   Terminating    Proportion continuing
                                                examination    to next cycle
    Primary (open access)            7 years    CPE            13%
    Secondary (restricted access)    4 years    O-level        20-25%
    Higher                           2 years    A-level        30-35%
    University

    CPE: Certificate of Primary Education
    O-Level: Ordinary Level (or Certificate of Secondary Education)
    A-Level: Advanced Level (or Certificate of Advanced Secondary Education)

It can be seen that only about one pupil in every hundred of those entering the first year of the open-access cycle (at the age of six or seven years) can expect to emerge from the upper-secondary cycle, 13 or so years later, with university-entrance qualifications. Furthermore, the severest restriction comes early: at the end of the open-access cycle.

In Tanzania, opportunities are even more limited. In 1979, the continuation ratio at the end of the open-access cycle was only 4.2% for boys and 2.8% for girls. It is anticipated that the impact of universal primary education will reduce these ratios even further over the next few years.

Second, competition for employment opportunities is intense. It is a salient characteristic of third-world countries that the number of school leavers entering the job market each year far exceeds the new opportunities that become available. In Kenya, the total number of new formal jobs, of all types and skill levels, which were created between 1975 and 1979 was only about 150,000. Over the same period, the primary schools produced about 650,000 CPE graduates who did not enter secondary school, and the secondary schools produced about 230,000 O-level graduates who did not enter higher secondary school.

Third, the chances of a school leaver finding employment improve sharply as his educational and examination qualifications improve.
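The two sets of Kenyan figures quoted above can be checked with a short calculation. The sketch below is illustrative only: it assumes that the three continuation ratios in the table can simply be multiplied together (ignoring dropout and repetition within each cycle), and the function and variable names are hypothetical rather than taken from the report.

```python
# Illustrative check of figures quoted in the text; all names are hypothetical.

# Continuation ratios for government-maintained schools in Kenya (from the table),
# written as (low, high) bounds.
PRIMARY_TO_SECONDARY = (0.13, 0.13)   # after CPE
SECONDARY_TO_HIGHER = (0.20, 0.25)    # after O-level
HIGHER_TO_UNIVERSITY = (0.30, 0.35)   # after A-level


def cumulative_range(*ratios):
    """Multiply (low, high) continuation ratios across successive cycles.

    Assumes the ratios are independent and ignores dropout within a cycle.
    """
    low, high = 1.0, 1.0
    for lo, hi in ratios:
        low *= lo
        high *= hi
    return low, high


def leavers_per_job(leavers, new_jobs):
    """Number of school leavers for every new formal job."""
    return sum(leavers) / new_jobs


low, high = cumulative_range(
    PRIMARY_TO_SECONDARY, SECONDARY_TO_HIGHER, HIGHER_TO_UNIVERSITY
)
print(f"Share surviving all three hurdles: {low:.1%} to {high:.1%}")

# 1975-1979: about 650,000 CPE leavers and 230,000 O-level leavers,
# against about 150,000 new formal jobs.
ratio = leavers_per_job([650_000, 230_000], 150_000)
print(f"Leavers per new formal job: {ratio:.1f}")
```

Under these assumptions the surviving share lies between roughly 0.8% and 1.1%, which is consistent with the statement that about one entrant in a hundred reaches university-entrance level, and there were nearly six leavers for every new formal job.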
Recruiters, especially those in the public sector, generally prefer A-level graduates to O-level graduates, and O-level graduates to CPE graduates. At each qualification level, opportunities usually go to those with the best examination results. In Kenya, a Careers Booklet is issued to O-level leavers each year, listing every public-sector employment opportunity for which they are eligible to apply. Virtually without exception, O-level passes in specified subjects, and at specified levels of performance, are stipulated for every opportunity.

Finally, once the school leaver has his job, his starting wage or salary will be closely geared to his educational and examination qualifications. The differentials are likely to be extremely steep. In Kenya, a university graduate can expect to start work at a salary at least double that which will be earned by an O-level graduate with a two or three year post-O-level training qualification. The O-level graduate, in turn, will earn at least five times as much as the CPE graduate, assuming that the latter is lucky enough to find work at all.

Given the circumstances we have just outlined, competition for opportunities is inevitably intense, and education becomes largely a struggle to survive. In countries such as Kenya, voluntary dropout from government-maintained schooling is common only during the open-access cycle. Once pupils have surmounted the first selection hurdle and entered the first restricted-access cycle, in a high-quality school, voluntary dropout becomes rare. Most of those who leave are forced out, sometimes by inability to pay fees but more often because they fail to perform well enough in one of the subsequent selection examinations.

The situation in prosperous countries is very different.
In New Zealand, for example, jobs until recently were quite easy to find, and the wages earned by a skilled manual worker with minimal educational qualifications were not a great deal lower than the salary paid to a non-manual worker with a university degree. In these circumstances, many young people left school of their own choice, soon after they reached the minimum leaving age. Education did not hold intrinsic rewards for them, and the extra money they might possibly earn with a few years more schooling was not enough to compensate them for earnings they would certainly lose in the meantime by staying on. It was not necessary to use an examination or other mode of assessment to force these pupils to leave school; they wanted to go in any case.

The problems of educational assessment and selection are thus much more acute in the third world than they are in prosperous countries. Because so much more is at stake, considerations of objectivity and fairness in the allocation of opportunities become over-riding. Conversely, subjectivity and personal bias are at all costs to be avoided. It is for these reasons that external examinations are so strongly entrenched in most highly competitive education systems.

When objectivity is a paramount concern, the distancing of the external examiner from the candidates becomes a virtue rather than a vice. The examiner may have only a pile of scripts to work from, but at least personal factors will not affect the marks he awards. The internal assessor, by contrast, may know a great deal more about the strengths and weaknesses of each candidate, but if he attempts to take this less tangible evidence into account he lays himself open to the charge of favouritism. A judgement based on sensitive personal appraisal can always appear biassed to someone who does not have access to all the evidence, or who weighs the evidence differently.
In any situation where there is extreme competition for scarce resources, formal rules must be established for their allocation: informal rules based on trust inevitably break down. When life chances are being decided, it is not surprising that many prefer to leave the verdict to an impersonal examination, whatever its defects as a measuring instrument, rather than trust the judgement of an assessor, however disinterested he may be.

The reader may object that the categorical statements just made are contradicted by the case of Tanzania. Tanzania is one of the poorest countries in the world, and her education system one of the most selective. Nevertheless Tanzania has committed herself without reserve to the goal of reducing the role of examinations in the assessment of merit, and in the allocation of opportunities. This policy was stated unequivocally in the Musoma Declaration of 1974:

    The National Executive Committee (of the Party) agrees that a student's progress has to be measured, but the existing examination system encourages interpersonal competition, whereas students' progress can be measured without necessarily involving them in a competition for individual excellence. The National Executive Committee therefore directs that the excessive emphasis now placed on written examinations must be reduced, and that the student's progress in the classroom plus his performance of other functions and the work which he will do as part of his education, must all be continually assessed and the combined result is what should constitute his success or failure.

In response to this directive, internal assessment has now been introduced at all three major selection points within Tanzania's education system. The ILO Report "Towards Self Reliance: Development, Employment and Equity Issues in Tanzania" (1978) contains an evaluation of the functioning of internal assessment in Tanzania (pp.
205-214), and Omari and Manase (1977) discuss the system as part of a valuable critique of the whole examination. The present writer has also carried out a brief study (Somerset 1979).

At the primary level, three main criteria are used to assess leavers, and to determine who should go to secondary school:

1. Performance in the Primary School Leaving Examination (PSLE). This is a conventional selection examination, made up of short answer and multiple-choice questions, similar in format and content to those used in many other African countries, except that one paper contains a group of questions testing political knowledge. The papers are set by the National Examinations Council and marked by teams set up by Regional Education Officers (REOs).

2. Progress in classroom work, as judged by the teachers. In practice, this assessment is based almost entirely on marks in internal examinations and tests. Record cards are maintained for each pupil, and from these the school identifies the pupil in each class with the highest overall average in the final three standards. The name of this boy or girl is sent to the REO as a 'pre-selected' candidate.

3. Commitment to national goals and aspirations, as indicated by participation in school work activities. These non-cognitive qualities are assessed by the head teacher in a written report which is sent to the REO.

The REOs thus have three separate assessments of each pupil from which to decide who should go to secondary school: an external assessment of cognitive skills; an internal assessment of cognitive skills; and an internal assessment of non-cognitive qualities. If the three sets of facts were given equal weighting, the primary teachers would be more important than the external examination in determining the allocation of secondary places. In practice, however, this is not the case; the PSLE merit order is by far the most significant factor.
The present writer calculated for three of the twenty regions in Tanzania the number of 'pre-selected' candidates accepted for secondary school during the 1978 selection exercise who would not have been selected if the PSLE merit-order list had been followed. In one region, these candidates made up less than 3% of the total intake; in a second, about 2%; in the third, less than 1%.

There were several reasons why these proportions were so low, the most important being that 'pre-selected' candidates were not allocated secondary places automatically, but were required to rank highly in the performance lists in the external examination as well.3 In most districts, standards of performance varied a great deal from school to school, and candidates from a few good schools dominated the lists. In districts where schools were of more homogeneous quality, pre-selected candidates were often limited by a further provision that they should not take up more than 25% of the available secondary places.4

Internal assessments of non-cognitive qualities were even less important in selection. In the three regions, only one candidate whose examination marks qualified him for a secondary place had received an unfavourable report. It was explained that few head teachers would want to jeopardise the chances of their best pupils by writing negative comments, if only because the prestige of a primary school still depends heavily on its success in gaining secondary school places for its leavers.5 In the exceptional case just noted, the selection panel did not accept the head teacher's assessment, but set up a committee of enquiry. This committee decided that the assessment had been too harsh, and the pupil was awarded a secondary place.

It is apparent that at the primary level internal assessment of both cognitive and non-cognitive qualities has a minimal effect on selection decisions, despite the fact that a great deal of conscientious effort is devoted to it.
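The comparison described above, between the candidates actually admitted and those who would have been admitted on the PSLE merit order alone, can be sketched as follows. This is only an illustration: the candidate identifiers and data are invented, and the function name `share_outside_merit_order` is hypothetical, not a procedure named in the study.

```python
def share_outside_merit_order(admitted, merit_order):
    """Share of the intake that a strict merit-order selection would have excluded.

    admitted    -- candidates actually given secondary places
    merit_order -- all candidates, best examination mark first
    """
    places = len(admitted)
    # Candidates who would fill the same number of places on examination marks alone.
    by_merit = set(merit_order[:places])
    outside = [c for c in admitted if c not in by_merit]
    return len(outside) / places


# Invented example: ten candidates ranked by PSLE mark, five places available.
merit_order = ["c01", "c02", "c03", "c04", "c05", "c06", "c07", "c08", "c09", "c10"]
# One 'pre-selected' candidate (c07) is admitted in place of c05.
admitted = ["c01", "c02", "c03", "c04", "c07"]

print(share_outside_merit_order(admitted, merit_order))  # 0.2
```

In this invented case one place in five goes to a candidate outside the merit order; the proportions reported for the three Tanzanian regions were far smaller, at 3% or less.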
But at the next selection hurdle, after four years of secondary education, the picture is different. Assessment of non-cognitive qualities is hardly more significant than it is at the primary level. According to the ILO Report, only 0.1% of pupils who qualified for higher secondary education on academic criteria were denied places because of unfavourable ratings on qualities such as attitudes to work, cooperativeness, leadership etc. But by contrast, teachers' assessments of cognitive qualities play a major role in selection decisions. The secondary schools keep detailed performance records of all their pupils throughout their four-year course, although, as at the primary level, assessments are mainly based on examinations.⁶ Each year these assessments are summarised and sent to the National Examinations Council. When the pupils have completed the four-year secondary course, their internal marks are standardised together with those they gain on the external Secondary Leaving Examination, so that each set of marks contributes 50% of the final assessment.

No doubt one reason why more account is taken of internal assessments at the secondary level is that secondary teachers are more highly qualified than primary teachers, and hence are seen as better judges. But situational differences are probably more important. At secondary school each class of pupils is taught by a much larger group of teachers than at primary school, both because secondary schools tend to be bigger and because they teach a wider range of subjects. Hence at secondary school the overall assessment of each pupil's achievements is very much a collective decision, arrived at by pooling the judgements of many people; whereas at primary school each teacher carries a proportionately much larger share of the responsibility.

Secondary schools also differ sharply from primary schools in the degree of out-of-school contact which teachers have with pupils and their families.
Because primary schools usually serve a restricted catchment area, teachers meet parents frequently. Further, in Tanzania (and in many other third-world countries, including Kenya) the primary school is often a focus for local activities, and hence an arena in which leaders compete for power. If he is to run his school successfully, the head teacher must establish effective relationships with these leaders, and must, indeed, become an influential man himself. Secondary schools, by contrast, tend to stand more aloof from the local community. Well-established rural secondary schools typically draw their pupils from a very wide catchment area, mainly because selection is so severe. In such a school, most or all of the pupils will be boarders, and teachers are unlikely to know more than a few parents well.

It is clear, then, that the conditions in which internal assessment is carried out at the secondary level are much more conducive to success than those at the primary level. The secondary teacher is more distanced from the candidates than the primary teacher, and in this his role is nearer to that of the external examiner. He meets his pupils for the most part in the limited context of classroom and school compound, and hence must base his assessments mainly on their schoolwork. Moreover, even if his judgements are overly influenced by subjective factors, such biases will be largely compensated when his assessments are pooled with those of other teachers.

By contrast, the primary school teacher, and especially the head teacher, is much more exposed. He knows the parents of most of his pupils, particularly those who play active roles in local affairs. Further, he is identified as the person who is mainly responsible for internal assessments.
He can base his assessment of the pupil's cognitive merit on performance in tests and internal examinations, but when it comes to non-cognitive qualities the evidence available to him is less concrete, and more subject to differing interpretations. Given that so much hangs on his decision, he is placed in an almost untenable situation. His dilemma is particularly acute when he must pass judgement on the son or daughter of a local leader. In any case, as we have already noted, he will hardly wish to jeopardise the chances of any pupil who might win a secondary school place. In the circumstances, it is not surprising that many head teachers avoid the dilemma by writing favourable comments about most pupils.

Even at the secondary level, however, and even with respect to cognitive qualities alone, the application of internal assessment is by no means straightforward. Quality differences among the schools, and differences in assessment standards, create serious problems.⁷ If the raw marks submitted by the schools were accepted at their face value, without adjustment, the best schools would be penalised, because they tend to judge their students more conservatively than the weaker schools. In addition, an incentive would be created for all schools to give all their students high marks.

In Tanzania, a sophisticated computer program has been devised to cope with these difficulties. In effect, the marks gained in the external secondary school leaving examination are used as a yardstick to correct the biases in the internal assessments. First, mean marks and standard deviations are calculated for each school, both for the external examination and for the internal assessments.⁸ Then the means and SDs for the internal assessments are changed, so that for each school they are identical to the corresponding statistics for the external examination. By these measures the effect of internal assessment on the final ordering of candidates is restricted.
Both the internal assessment and the external examination can change the ordering of candidates within each school, but the standing of each school relative to other schools is determined by the external examination alone.

It is clear, then, that Tanzania has succeeded in devising an efficient system for the internal assessment of performance at the secondary level, although not at the primary level. It should be noted, however, that even at the secondary level, internal assessment works only because it is a part of a total system of evaluation which includes an external examination as well. Unless alternative methods can be found for measuring the quality of secondary schools, internal assessment can succeed only as a complement to an external examination, not as a substitute for it.

In summary, then, it is apparent that selection examinations still have a long life in front of them in third-world countries. In this section we have discussed two alternatives: to abandon selection altogether, and to replace external selection examinations by internal assessment. Given the shortage of job opportunities and the steep, education-linked salary differentials in third-world countries, some form of educational selection is essential. Without it, educational costs would escalate, and these could be met only by cutting into resources which would be better employed in investments which would directly create more jobs. The experience of Tanzania has shown that internal assessment can become an important element in selection, although probably only at the secondary school level. Nevertheless, selection examinations cannot be abandoned entirely. Because of the effects of variations in school quality, internal assessment can be viable only as one part of a total system of evaluation which also includes an external examination.
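The rescaling of internal marks against the external examination, as described for the Tanzanian secondary system, can be illustrated with a minimal sketch. It is not the program used in Tanzania, whose details the text does not give: the function names are hypothetical, the calculation is done for one school at a time, and the equal 50/50 combination follows the weighting stated earlier in this chapter.

```python
from statistics import mean, pstdev


def rescale_internal(internal, external):
    """Shift and stretch one school's internal marks so that their mean and
    standard deviation match those of the same pupils' external marks."""
    m_int, s_int = mean(internal), pstdev(internal)
    m_ext, s_ext = mean(external), pstdev(external)
    if s_int == 0:
        # All internal marks identical: every pupil gets the external mean.
        return [float(m_ext) for _ in internal]
    return [m_ext + (x - m_int) * s_ext / s_int for x in internal]


def final_marks(internal, external):
    """Combine rescaled internal marks and external marks with equal weight."""
    scaled = rescale_internal(internal, external)
    return [0.5 * i + 0.5 * e for i, e in zip(scaled, external)]


# A school that marks generously (internal marks far above its external marks).
internal = [70, 85, 95]
external = [30, 50, 40]
print(rescale_internal(internal, external))
print(final_marks(internal, external))
```

Because the rescaled internal marks have the same mean as the school's external marks, the school's average final mark equals its external average: internal assessment can reorder pupils within the school but cannot raise the school relative to others. Using the population or the sample standard deviation gives the same result here, since only the ratio of the two SDs matters.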
But must it necessarily follow that the negative effects of examinations which we listed in the first paragraph of this paper are inescapable? Is it inevitable that examinations will continue to block efforts at curriculum reform in third-world countries, encourage rote memorisation, and discourage creativity and effective thinking? In Chapters 3 to 8 of this report we shall attempt to establish that the answer to these questions is no: the negative effects listed are not the necessary consequences of examinations as such, but only of bad examinations.

B. The Quality of Examinations

The Jobs and Skills Programme for Africa (JASPA) has recently completed a study of formal qualifications and school leaver unemployment in Africa which includes an analysis of selection examinations in eight countries (JASPA 1981). A useful overview and synthesis of the same material has also been published (Little, 1981). While the quality of examination papers studied varied from very good to very bad, the overall standard was poor.

From the analyses presented, a continuum of different types of examination can be discerned. Towards one pole are examinations in which the predominant influence comes from specialists in testing and measurement. The primary school leaving examinations set in Sierra Leone, Ghana and The Gambia are near this pole. For convenience, we may refer to them as measurement-oriented examinations. Towards the other pole are examinations in which the predominant influence comes from professional educators. The primary leaving examinations in Tanzania, Somalia and in Kenya up to about 1973 are near this pole. We shall refer to them as curriculum-oriented examinations. Near the middle of the continuum are the primary leaving examinations set in Zambia, and in Kenya since about 1974. In these examinations, the influence of neither group of specialists seems to predominate.
In general, curriculum-oriented examinations are commoner than measurement-oriented examinations, particularly at the secondary level, where the contribution to setting of subject specialists becomes crucial.

The measurement-oriented primary-level examinations covered in the JASPA survey were of good quality in a technical sense. Much skill and care had gone into their construction and presentation. It seems likely that the performance characteristics of the items (difficulty level, discrimination) had been determined by statistical analysis. But the range of content over which knowledge and skills were tested was narrowly limited: in all three countries it was confined to English and mathematics. Competence in areas of the curriculum such as health, nutrition and agriculture, which are specially relevant to terminal pupils, was not tested at all. This must inevitably have unfortunate backwash effects on the amount of time teachers are prepared to devote to these important topics during the final years of the open-access cycle. Further, the testing of reasoning skill was confined to so-called 'aptitude' tests, and based on non-curricular materials. The implicit assumption seems to be that the ability to reason effectively is not a skill which teachers should attempt to develop through ordinary school subjects, presumably because it is seen as largely impervious to the effects of teaching.

The curriculum-oriented primary-level examinations included in the survey had more severe problems. As we would expect, they covered a wider range of school subjects than the measurement-oriented examinations. Typically, science, biology, history and geography were tested in addition to mathematics and English. But the range of intellectual skills tested through these additional subjects was constricted in the extreme. With few exceptions, the items set tested simply the ability to recall memorised knowledge. Further, the technical quality of the papers was low.
For example, items sometimes had more than one correct answer, or none at all. In other cases, the correct answer could be identified on the basis of some extraneous characteristic (length, specificity, etc.).

It is not difficult to suggest reasons why third-world examinations are so often inadequate to their purpose. With good justification, professional educators in curriculum development centres, teachers' colleges, universities and ministries of education tend to have ambivalent feelings about selection examinations, especially examinations which terminate basic education and bar many children from access to secondary school. They see these examinations as at best a necessary evil, to be tolerated but not encouraged; at worst, as a barrier to progress which should be swept away. They prefer to devote their energies to work which they regard as more creative: the development of new curricula, the in-servicing of teachers, the preparation of new teaching materials. Perhaps for similar reasons, examinations have also received little attention from the World Bank and other funding agencies.

But the need to assess and to select is inescapable. In some countries, the dilemma is partly resolved by handing over much of the responsibility to testing and measurement experts. These specialists usually produce examinations of good technical quality, but judging from the JASPA survey they tend to be more sensitive to the role of examinations as selection instruments than to their role in promoting effective teaching of useful knowledge and skills over the full range of the curriculum. Alternatively, and perhaps more commonly, examination development tends to be neglected. The preparation of examination papers is often relegated to part-time or even spare-time work.

But in the schools, examinations are never neglected.
For the last two years leading up to any selection examination, the effective curriculum of the class is defined not by the official syllabus or the official textbooks, nor by what the teachers were taught during their last in-service course; but by the content of the most recent selection examination papers. Badly set examinations can thus sabotage the most dedicated efforts to improve education through curriculum reform and similar means.

An example from Kenya in the early 1970s is instructive. At this time, the syllabus in use for the teaching of primary science laid much emphasis on the development of both scientific process skills and scientific concepts:

Lessons should be based on children's observations. One of the teacher's tasks is to put the children into situations where they can observe. Another is to try to demonstrate the principles underlying what the children see. A third is to help the children record what they see, and begin to draw conclusions from it. (Kenya Primary School Syllabus, 1967, p. 111)

But the science questions being set at the time for the Certificate of Primary Education reflected none of these concerns. Almost without exception, they tested knowledge of isolated fragments of factual material. They thus gave no indication as to whether the pupil had started to develop any of the process skills specified in the syllabus. Moreover, the facts tested were often highly specialised, and quite outside the experience or understanding of primary school children. Examples of the questions set at this time are given in Chapter 4 (pp. 40-41).

Not surprisingly, most upper primary teachers had abandoned the syllabus and concentrated their efforts on rote memorisation. A high proportion used one of the privately-published CPE guidebooks as their science textbook. These guidebooks provide a comprehensive coverage of all factual material examined in past CPE papers, and are brought up to date each year.
One teacher had built up an item bank of about 600 knowledge items, culled largely from old CPE papers. Every week he devised a science test from this bank for his Standard 7 pupils. His CPE results were among the best in the district.

Similar observations can be made about the science questions set for recent primary school leaving examinations in Botswana and Tanzania. In both countries, nearly all the questions tested factual recall, although not so many of them required knowledge of technical terms as was the case in Kenya. Questions testing higher-level scientific skills were virtually absent. The case of Tanzania is particularly striking, because the examination seems to have been unaffected by the imaginative efforts which have been made since the Arusha Declaration in 1967 to reshape primary education as an instrument of rural transformation.

The science section from one of the papers set for the 1978 Primary Leaving Examination is given in full on page 18.⁹ Clearly, few of these questions tested knowledge relevant to the primary school leaver. Most of the content was drawn from the fields of physical science and formal biology; very little from fields such as agriculture, nutrition, or health. Moreover, only one of the questions (No. 45) tested any skill apart from the ability to remember factual material. No attempt was made to assess whether pupils had developed any of the process skills such as the ability to apply old knowledge to new situations, the ability to gather new information and interpret it accurately, or the ability to infer valid conclusions from the analysis of information. All these are skills essential for autonomous functioning within any environment, but especially one undergoing rapid change. It is expecting a great deal of teachers to ask them to commit themselves wholeheartedly to a programme of radical educational reform if the main criterion against which their competence and success is measured remains unreformed.
¹⁰ There is no intrinsic reason, however, why examinations should lag so far behind other aspects of education, as they do in the cases we have just discussed. If examination development received the same resources of professional skill and time as is given to, for example, the development of new teaching materials, examinations could be a spearhead of educational reform, instead of a barrier to it. Teachers respond very quickly to changes in examinations; they will respond just as readily to changes for the better as to changes for the worse.

The point was made succinctly in a letter written to the Kenya National Examinations Council recently by an A-level Physics teacher. He was reacting to a newsletter distributed to schools which stressed that, in future, practical examinations for A-level Physics would require the candidate to demonstrate planning, decision-making, and problem-solving skills, instead of merely the ability to follow instructions:

I wholeheartedly agree with the comments on practical work at A-level; simply following instructions is not what should be expected. The 1980 examination suggests that there will be even greater moves away from this barren type of practical examination in future. But why do teachers teach their students simply to follow instructions and not to think about the practical work? I would suggest it is quite simply because that is what the examination has required in the past. Many teachers know that their students should be able to do more than this, but their main concern is to get the students through the examination - and we can see that the teacher is judged on the examination performance of his students.

Tanzania Primary School Leaving Examination: Science Items, 1978

36. A frog can live both on land and in water because
A. it has eyes that enable it to see [illegible] in water
B. it has lungs and skin which it uses for breathing [illegible] water
C. it has special legs that enable it to swim in water
D. its food is [illegible] on land and in water

37. Plant roots grow towards
A. light
B. gravitational pull
C. water
D. soil

38. Iron is prevented from rusting by painting it with
A. petrol
B. kerosene
C. [illegible]
D. oil

39. The requirement for eggs to hatch is usually
A. constant heat
B. constant light
C. presence of mother to warm them
D. a machine made for the purpose of hatching eggs

40. An instrument that makes use of the properties of light is
A. a looking glass
B. a torch
C. [illegible]
D. sun goggles

41. A true insect has three main parts. Which do you think is an insect in the following list?
A. a spider
B. a grasshopper
C. a centipede
D. a pupa

42. An electric instrument used for stepping up or down electricity
A. a transformer
B. a transistor
C. an amplifier
D. a generator

43. All planets go round the
A. moon
B. earth
C. sun
D. stars

44. To foretell what the weather will be in the future by observing the present trend is known as
A. forecasting the weather
B. guessing the weather
C. [illegible] the weather
D. knowing the weather

45. A boy of 20 kg weight sits 3 metres from the fulcrum of a lever. Where will another boy of 30 kg weight sit for the lever to be in equilibrium?
A. 3 metres
B. 5 metres
C. 4 metres
D. 2 metres

46. Soil that retains water for a long time is
A. clay soil
B. loam soil
C. sandy soil
D. soil with a lot of compost

47. Matter is said to be chemically changed if
A. its weight is changed
B. it cannot be changed into its original state again
C. its state has completely changed
D. its [illegible] has changed

48. Choose a statement which is untrue
A. sound travels in all states of matter
B. sound travels in vacuum only
C. the speed of sound depends on the medium in which the sound travels
D. sound is reflected

49. When two unlike magnetic poles are brought near each other
A. they [illegible] each other
B. they [illegible] each other
C. they pull each other
D. they collide repeatedly

50. Vitamin C is found in large quantities in
[options illegible]

Footnotes

1. There are a number of countries, none of which is familiar to the present writer, which run their education systems without recourse to formal selection examinations. It would be invaluable to have detailed studies of the mechanisms by which access to continued education and to formal employment in these countries is controlled; and of the efficiency and equity of these mechanisms in comparison with examinations.

2. Educational qualifications are, of course, themselves derived from examination qualifications, and are thus a close proxy for them. At the end of each cycle of the education system, those who score the highest marks in the terminating examination use their examination qualifications to continue their education into the next cycle, while those with poorer results use their qualifications to attempt to find employment. Thus an A-level leaver, for example, is by definition a person who performed well at O-level, even though his A-level results may be indifferent.

3. A pre-selected candidate was awarded a secondary school place only if his performance rank was equal to or less than double the number of places available for the district. For example, if 23 places were available, a pre-selected candidate gained a place only if his rank order was No. 46 or better.

4. Unless, of course, any additional pre-selected candidates beyond the 25% quota would have won a secondary school place on their PSLE results alone. In most districts there was in 1978 only about 1.5 secondary places available for every class of primary leavers, so pre-selected candidates would have taken up more than 60% of secondary places if these two provisions had not been enforced.

5.
The only results issued after the PSLE are lists of pupils awarded secondary places from each primary school, so parents and the general public have no other measure with which to gauge how well the pupils have been taught.

6. Each pupil, working as a member of a small group, contributes to a practical project which counts towards the internal assessment. All members of the project group receive the same mark. But in 1979, project work made up only 10% of the total internal mark, as compared to 50% for school examinations and 40% for class work (mainly tests and written assignments).

7. In Kenya, quality differences among secondary schools are huge. In an external examination, the poorest student at one of the best schools will often score higher marks than the best student at one of the poorest schools.

8. The standard deviation is a measure of the extent to which scores scatter from the mean.

9. Separate examination papers are set for each of the 20 regions in Tanzania. This means that for science alone, 300 different items must be written every year. For the examination as a whole, the total is no less than 4,000! The enormous burden of this work is a major reason for the deficiencies of the items discussed in the text. The original version of the paper was in Kiswahili, but it was translated into English for an English-medium school in the region where it was used.

10. Omari and Manassi's (1977) critique of the examination includes a useful discussion of the technical qualities of items set for the Primary School Leaving Examination.

CHAPTER 2. THE EXAMINATIONS SYSTEM IN KENYA

A. The Structure of Examinations

The structure of the examination system in Kenya has already been briefly referred to in Chapter 1 (see pp. 15-16).
There are three main selection hurdles which pupils must surmount to remain within the government-maintained system for the full thirteen-year course and then enter the national university: the Certificate of Primary Education (CPE) examination at the end of the primary cycle; the Ordinary-level (O-level) examination (also known as the Kenya Certificate of Education or KCE) after the secondary cycle; and the Advanced-level (A-level) examination (or the Kenya Advanced Certificate of Education, KACE) after the final upper-secondary cycle.

The Certificate of Primary Education terminates the open-access cycle. In theory pupils should sit the CPE after seven years of primary schooling, when they are 13 or 14 years of age. In practice, however, many pupils repeat one or more years while they are at primary school and so take eight or even nine years to complete the cycle. In the rural areas, the average age of pupils sitting the CPE is about 15 years; in the cities and towns, about one year younger.

In 1962, the year before independence, there were fewer than 28,000 candidates for the examination which was the predecessor to the CPE. Over the next few years there were spectacular increases as Kenyans sought to qualify themselves for the newly-available opportunities in middle-level and higher-level employment. By 1967, candidates had risen to more than 138,000, a five-fold increase in five years. After 1967, however, the growth in enrolments fell back rapidly until, by the mid-1970s, it averaged less than 3% per annum. By 1979 there were 270,000 candidates, representing about 60% of the age cohort. Then in 1980 there was a sudden sharp rise to 328,000, an increase of 21.5% over 1979, followed by a further rise to 348,000 in 1981.
¹ These increases were the consequence of the first of two recent major reforms in primary education: the progressive abolition of school fees starting in 1974, and the abolition of all non-fee charges (building levies, equipment levies, etc.) in 1979. The 1980 CPE group was the first to have passed right through the primary school without being required to pay school fees at any stage. When the impact of the abolition of non-fee charges has worked its way through the primary system by the mid-1980s, it can be anticipated that there will be not far short of half a million pupils completing the open-access cycle each year.

Five school subjects are tested in the CPE: English, Mathematics, Science, History and Geography. English is examined over two papers: an objective, multiple-choice paper consisting of 50 items, and a composition paper which tests candidates' ability to write connected prose. Mathematics is examined in a single 50-item multiple-choice paper. The remaining three subjects, science, history and geography, are examined together in a single paper known as the General Paper, which is also multiple-choice in format. Forty items are devoted to science, twenty-five to history and twenty-five to geography. Four alternative answers are provided for all multiple-choice questions. Candidates record their answers to the three multiple-choice papers on specially-printed answer sheets which are sensed optically by a document reader. The responses are then processed by computer.

The use of multiple-choice format and computer marking goes back as far as 1966. It was dictated by necessity. As we have already noted, primary enrolments were growing rapidly at the time, in response to the changed political climate brought about by internal self-government in 1962 and independence in 1963, and the expansion of educational and employment opportunities for Kenyans which followed.
By 1966 there were already more than 100,000 candidates for the examination, and the old manual system of marking could no longer cope. Now, with three times as many candidates, the examination is processed in just four weeks.

At first, the entire examination was converted to multiple choice, the only writing expected of candidates being to record their names, the names of their schools, and their index numbers. This change, however, produced unfortunate backwash effects: because there was no longer an incentive to teach pupils how to write continuous prose, many schools stopped doing so.² For this reason, an English Composition paper was re-introduced into the examination in 1973. The marking of the composition paper requires a major feat of organisation. More than 1,000 examiners are needed, so that the work can be completed within the time available.

The Kenya Certificate of Education (O-level) examination comes at the end of the four-year secondary course, after a total of eleven years of formal education. It is the successor to the Cambridge Overseas O-level examination, which was the terminating examination for secondary education in Kenya until the early 1970s. In format, the examination is still based closely on the Cambridge model, but the syllabi for many subjects have been changed to reflect Kenyan conditions and priorities.

Candidates offer a wider range of subjects for O-level than for CPE: the usual number is eight. The only compulsory subjects are English and Mathematics. Performance is categorised in each subject into nine grades, with grades 1 and 2 being the highest (distinctions) and grade 9 the lowest (fail). The candidate's grades for his best six subjects are totalled to give his grade aggregate. If his aggregate is 23 points or better he receives a Division I pass; if it is between 24 and 33 points, a Division II pass; and if between 34 and 43 points, a Division III pass.
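The grade-aggregate rule just described can be illustrated with a short sketch. The function names are hypothetical, and two points are assumptions rather than facts from the text: that a candidate needs at least six graded subjects for an aggregate to be computed, and that an aggregate above 43 earns no division pass. Any further conditions attached to a pass (for example, requirements on particular subjects) are not stated in the text and are not modelled.

```python
def grade_aggregate(grades):
    """Sum of the six best (numerically lowest) subject grades on the 1-9 scale."""
    if len(grades) < 6:
        raise ValueError("at least six subject grades are needed (assumption)")
    if any(g < 1 or g > 9 for g in grades):
        raise ValueError("grades must lie between 1 and 9")
    return sum(sorted(grades)[:6])


def division(aggregate):
    """Map a grade aggregate to the division bands given in the text."""
    if aggregate <= 23:
        return "Division I"
    if aggregate <= 33:
        return "Division II"
    if aggregate <= 43:
        return "Division III"
    return None  # outcome above 43 points is not stated in the text


# A candidate offering the usual eight subjects.
grades = [1, 2, 3, 3, 4, 5, 7, 9]
total = grade_aggregate(grades)
print(total, division(total))
```

Because grade 1 is the best, a lower aggregate is a better result; the lowest possible aggregate is 6.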
Most of the questions set for O-level examinations require essay-type answers, although some papers contain multiple-choice items.

During the early and mid-1960s the O-level was an examination for the fortunate few. Those who performed well were assured of continued education to A-level, while even those with poor marks could be almost certain that they would be recruited either directly into employment, or into a training course leading to employment. The end of this era of virtually automatic access to further opportunities can be pinpointed accurately. A follow-up study of O-level leavers from each of the years 1965, 1966, 1967 and 1968 was carried out by Kinyanjui (1974). He found that in the 1965 cohort, only 2% of leavers were unemployed after one year. In the following two cohorts the proportion was even lower: only 1%. Among the 1968 leavers, however, there was a sudden jump to 14%.

Over the past 15 years O-level has developed into a mass examination, not unlike the CPE. In 1965 there were fewer than 6,000 candidates, but by 1970 there were more than 19,000, and by 1981, more than 93,000. Much of this increase is accounted for by rapid expansion in the unaided school system. Exact data are not available, but over half the 1981 candidates were from unaided schools. A few unaided schools offer high-cost education of superior quality to the urban elite, but most are second-chance institutions, catering for primary school leavers who fail to perform well enough in the CPE to gain entry to government-maintained secondary schools. They are of two main types: Harambee schools, run on a self-help basis by local communities mainly in rural areas; and private schools, run on a commercial basis mainly in the cities and towns. With a few striking exceptions, standards in these schools are extremely low.
Most present only an educational facade, behind which pupils are provided with little more than a four-year period in which to adjust their aspirations downwards.

The Kenya Advanced Certificate of Education (A-level) terminates the upper secondary course, and is thus the final external examination in the formal school system. Following the English model, the upper secondary course is essentially a two-year specialist preparation for university entrance rather than the culmination of a general education. Each pupil studies three, or at most four, subjects; and if he is wise he chooses them with university entrance requirements in mind as much as his own abilities and interests. For example, if he wants to study engineering at university, he must offer mathematics, physics, and either chemistry or additional mathematics at A-level; while if he wants to study medicine, he must offer chemistry, biology and either mathematics or physics. Pupils hoping to enter pure or applied science courses at university usually drop all study of languages and the humanities, while those hoping to enter non-science courses usually drop all study of the sciences. But the O-level examination results do not establish a basis for such a neat dichotomy: many of the ablest pupils perform equally well both in the sciences and in the languages and humanities (Gakuru 1977). By forcing these pupils into premature choice, the present A-level system leads to many unfortunate career decisions, and to much loss of talent, especially of talent spanning a broad spectrum of intellectual skills.³

Unlike CPE and O-level, A-level has not become a mass examination, mainly because government has, until recently, restricted the growth of upper secondary provision to keep it in line with the absorptive capacity of the university. As late as 1975 the number of A-level candidates was only a little over 4,000.
Over the past few years expansion has been more rapid, but even in 1980 there were fewer than 10,000 candidates. As a consequence, A-level leavers who do not enter university still have fair prospects of finding employment. Many of them pre-empt pre-service training opportunities which in earlier years would have gone to O-level leavers.

The Certificate of Primary Education, O-level and A-level are the three major examinations controlling access to mainstream formal education in Kenya. In addition, there are numerous other more specialised examinations. One of these, the Primary Teachers' Qualifying Examination, is central to the issues discussed in this paper. It came into being during the mid-1970s when the examining and certification of graduates from the 18 primary teachers' colleges was centralised. Candidates, who number about 5,000 each year, are examined in twelve teaching subjects and in professional studies; in addition their practical teaching skills are assessed. The papers in most subjects have been revised drastically over the past few years, with two main aims in view: first, to encourage the colleges to give more attention to the professional preparation of teachers and less to subject content;⁴ and second, to bring the examination into line with changes in the CPE. The reformed CPE, which we shall discuss in the following chapters, attempts to assess whether pupils at the end of the open-access cycle have developed a range of relevant intellectual skills as well as a body of knowledge; the reformed teachers' examination attempts to ensure that newly-qualified teachers have mastered a repertoire of methods through which they can help pupils to develop these skills.

The Kenya Junior Secondary Examination (KJSE) should also be mentioned briefly.
Originally it functioned as a terminal examination for unaided secondary schools which lacked the facilities to offer a course longer than two years. A small proportion of the most successful candidates were recruited into government-maintained secondary schools at the third-form level and others into teacher education. Now, however, most unaided schools offer the full four-year course, and most recruits to teacher education have completed O-level, so the examination is declining in importance, although there are still over 50,000 candidates each year.

B. The Administration of Examinations

Until 1980 there were two quite separate bodies responsible for the setting and administration of examinations in Kenya: the Examinations Section of the Ministry of Education, and the East African Examinations Council. The Examinations Section controlled the two lower-level school examinations, the CPE and KJSE, and also the Primary Teachers' Qualifying Examination. Higher-level examinations, including O-level and A-level, were conducted by the East African Examinations Council, a parastatal body set up by the now-defunct East African Community to run a common examinations system for the three East African countries: Kenya, Tanzania and Uganda. Before the establishment of the Council, the Cambridge Overseas Examinations Syndicate ran higher-level examinations in all three countries. Tanzania withdrew from the Council in 1974, but Kenya and Uganda continued to run the common system for a further six years. By 1980, however, differences between the education systems of the two remaining participant countries had grown to such a point that it was no longer possible to maintain common examination papers and common standards. The old Council was dissolved, and in Kenya a new body, the Kenya National Examinations Council (KNEC), came into being to take its place. At the same time the old Examinations Section was merged with the new Council.
Thus, although the geographical scope of the KNEC is narrower than that of the old Council, its professional scope is much wider: all external examinations below the university level now come under its aegis.

C. The Examinations Research Unit

The first tentative steps towards using research as an instrument for reforming examinations were taken in 1971, when a pilot item analysis of some CPE papers was carried out. This analysis led to the introduction of verbal reasoning items in the 1972 English paper. Full-scale item analysis of CPE started in 1972, and has been continued ever since. By 1975, the impact of research could be seen in all CPE papers. This early research was conducted from the Kenya Institute of Education and the University of Nairobi. By the mid-1970s, however, the volume of work had increased to a point where it could no longer be continued on a part-time basis. In 1977, a research unit was established within the Examinations Section at the initiative of the Head of the Section, with a staff of four research officers. In 1980 the Unit became a constituent of the new Examinations Council. By 1981 its staff had increased to nine professional workers, six of whom were Kenyans and three expatriates.

The main responsibility of the Unit when it was first established was to continue and extend the programme of applied research into examinations, using item analysis and other methods. Results were to be made available both to the examination setters, to help them develop better questions, and to the schools, to provide them with guidance as to the new demands being made by the examinations. A programme of research into educational issues of more general concern was also undertaken.⁵ But the Unit was to be concerned with action as well as with research. Its members were expected to take leading roles in setting up a programme of examinations reform, and then in carrying it out.
Two major workshops, held in 1978 and 1979, were central to the reform programme. Participants included curriculum development specialists, school inspectors, primary teachers' college lecturers and secondary school teachers, as well as examinations research workers and other examinations officers. The two workshops followed a common pattern: at each, the first few days were spent in discussing and agreeing on a set of goals for a major examination, and the following week in writing questions and items of the types needed to meet the new goals.

Since the workshops, the Research Unit has been involved in a wide range of activities directed towards examinations reform. Research workers have participated regularly as members of examination-setting teams; additionally, they have assisted in the preparation of new examination syllabuses, in the establishment of new processing procedures, and in the training of examination markers, both through formal courses and as leaders of marking teams. Outside the Council, participation in numerous in-service courses has provided opportunities to explain the examination changes to teachers, teacher educators, and field officers.

The Unit has also undertaken tasks not directly connected with the research and reform programmes. For instance, officers have helped with the establishment of new recruitment procedures for primary teacher trainees, and in the work of a task force established by the Kenya Institute of Education to examine problems of mathematics education. The whole of December and much of January each year are given over to helping with the routine processing of CPE and O-level results, a task so huge that virtually every officer of the Examinations Council is involved.

The emphasis given to applied activities has increased over the years since the Unit was established. Indeed, there have been times when these activities have threatened to swamp the Unit's core research functions.
Because of its origins within the old Examinations Section, the work of the Research Unit has until recently been centred largely on lower-level examinations, and more particularly on the Certificate of Primary Education and the Primary Teachers' Qualifying examinations. Since the Unit became part of the new Council in 1980, however, studies of secondary-level examinations have been initiated. By late 1981, research reports on O-level English and physical science, and A-level physics, had been completed and sent to schools, and investigations of O-level mathematics, chemistry and history were under way. Some of these studies have led to substantial changes in the content of the relevant examination papers.

But it will not be possible for some time yet to subject secondary-level examinations to the same detailed scrutiny as has been applied to the CPE. The range of subjects to be covered is wider; and furthermore, research workers with specialist knowledge of particular subjects are needed for effective analysis. In addition, statistical data are less easily accessible, because secondary-level examinations are marked mainly by teams of examiners rather than by computer.

Thus we shall focus our attention in the remaining chapters of this report on the CPE. First we shall outline, in Chapters 3-7, the major changes which have been brought into effect with this examination; and then, in Chapter 8, we shall evaluate the results of some of these changes.

Footnotes

1. Over the same period (1962-1981) there was a quiet revolution in the participation of girls in primary schooling and in the terminating examination. Between 1958 and 1962 fewer than 24 girls sat the examination each year for every 100 boys, and this proportion showed no tendency to rise. In 1963, however, the trend turned sharply upwards, and has continued rising ever since. In 1967 there were 41 girls for every 100 boys, in 1970, 50 girls, and in 1981, nearly 76 girls.
In the better-developed rural areas (although not in the cities and towns) girls have already achieved virtual parity with boys in CPE enrolments (Somerset, 1977).

2. ... a casual inspection of some of the children's workbooks in Standard VII will reveal in most subjects lists of question numbers at the side of the page and then merely the corresponding answer letter, a, b, c or d ... (King, 1974, p 127).

3. As an attempt to bridge the gap between the 'two cultures', A-level candidates are required to sit a general paper along with their main subjects. The questions set offer a wide choice of topics, including current affairs, scientific developments, social change, traditional culture, etc. But the paper counts as only half a subject, and few university faculties pay much attention to the results when making selection decisions. As a consequence, general studies occupies only a marginal place in the work program of most upper secondary classes.

4. One major reason for the bias towards subject content (particularly secondary-level content) and away from teaching methodology is that few primary teachers' college lecturers have had primary teaching experience. The majority have taught at the secondary level only. This in turn is due to qualification requirements: almost no primary teachers have the formal educational qualifications needed to become teachers' college lecturers.

5. Research which has been carried out under this heading includes studies of the effects of socioeconomic background factors on CPE performance, differences between boys and girls in access to schooling and in achievement, and the effects of previous teaching experience and O-level grades on the success of teacher trainees.

CHAPTER 3. CHANGES IN THE KENYA CPE: GOALS AND MODES OF REFORM
We shall discuss the reform of the Kenya CPE examination from two aspects: first, the goals towards which the programme of reform has been directed; and second, the modes through which the programme has been carried out. The main goals, and the main modes, can be summarised as follows.

Goals of reform                    Modes of reform
1. Allocational goals              1. Changes in examination questions
   (a) Efficiency                  2. Introduction of information feedback system
   (b) Equity                         (a) Incentive information
2. Educational goals                  (b) Guidance information
   (a) Relevance
   (b) Overall quality
   (c) Distribution of quality

The programme has been directed towards five main goals, which can be grouped into two sets: allocational goals and educational goals. The two allocational goals concern the functions of the CPE as an allocator of secondary school chances. The reforms have aimed to produce an examination system which allocates these chances more efficiently, and more equitably. The three educational goals concern the backwash effects of the CPE on the functioning of the primary schools. The reforms have aimed to produce a system which would promote the teaching of relevant skills and knowledge in the schools, an improvement in the overall quality of education, and a reduction of quality differences among schools and geographical areas.

Towards these five goals, two major modes of reform have been employed. First, radical changes have been made in the types of question asked in the examination. A much wider range of cognitive skills is now tested, and the content through which these skills are tested is substantially different. Second, an information feedback system has been set up based on performance data from the examination. The information provides schools both with guidance as to how they can best prepare pupils to meet the new intellectual demands being made of them, and with an incentive to do so.

A. Goals of Reform

1. Allocational goals

Each year the CPE examination is used as a means of identifying a small group of primary school leavers (at present about 13 out of every 100) to receive the privilege of four years' continued formal education at the public expense.

(i) Efficiency. Questions asked in the examination should discriminate effectively between the abler pupils who are most likely to succeed at secondary school, and pupils of lower ability.

(ii) Equity. The questions should be fair towards underprivileged groups, including rural pupils (particularly those from the arid and semi-arid areas), girls, and those from less-favoured socioeconomic backgrounds. Individual items will inevitably show biases, but over the examination as a whole no particular group should be unduly favoured.

The methods used to assess the efficiency and equity of CPE items will be discussed in Chapter 7.

2. Educational goals

Educational goals concern the backwash effects of the CPE on the functioning of the primary schools. As we have seen, during the last 2-3 years of the open-access cycle the topics which teachers choose to teach and the methods by which they teach them are determined more by the nature of the terminating examination than by the specifications of the formal curriculum. The aim has been to harness the backwash effects of the examination to constructive purposes, so that the aims of the curriculum developers are buttressed and not sabotaged.

(i) Relevance. Questions asked in CPE should provide teachers with an incentive to attempt to develop in their pupils relevant skills and knowledge. CPE should be neither specifically a selection examination nor specifically a terminal examination; it should promote the development of a set of competencies which will be useful both to those who enter secondary school and to those for whom basic education is terminal.

(ii) Overall quality.
The CPE system should encourage and assist teachers to develop effective methods for teaching relevant competencies so that the overall quality of basic education is improved.

(iii) Distribution of quality. The CPE system should provide particular encouragement and assistance to the less successful schools, so that quality differences among schools and geographical areas are reduced.

B. Modes of Reform

1. Changes in examination content

The first major steps in the reform of the CPE examination system involved changes in the types of questions asked. Until the early 1970s most questions, except in mathematics, tested little more than the candidate's ability to remember factual material. Further, the facts tested were mainly isolated fragments: in history and geography, names, dates and places; in science, definitions of technical terms; in English, grammatical rules and use of idioms. Moreover, the content of the questions was heavily determined by the secondary school selection functions of the examination rather than by its terminal functions. Much of the material tested was more appropriate to a junior secondary course than to an upper primary course; very little of it was of any practical usefulness to a primary-school leaver (Somerset 1974, pp. 170-1).

The reformed examination tests a much broader spectrum of cognitive skills, most of them skills which can be applied in a wide range of contexts, both in and out of school. Many questions, for example, test decision-making skills; pupils are expected to understand and analyse given information, and then to draw valid conclusions from it. Other questions indirectly test observational and experimental skills: they give a big advantage to pupils who have carried out practical investigations using the resources of the local environment. In the composition paper, pupils are asked to show that they can communicate effectively, using connected prose which is imaginative as well as accurate.
Remembered knowledge is still needed to answer many questions, of course. But three main changes have taken place in the testing of knowledge in recent years. First, questions have tended to focus on the cause-and-effect relationships between facts rather than on facts as specific, isolated entities. More emphasis has been given to the context of meaning in which facts are embedded than to the facts themselves. Thus the questions tend to ask "why" and "how", instead of "who", "where", "when" and "what". The second change, closely related to the first, is that many questions now require candidates to apply remembered knowledge in new, unfamiliar situations. Finally, wherever possible emphasis is given to testing information which will be particularly relevant to terminal pupils. This change is especially evident in science: much stress is given to topics such as nutrition, child care, soil conservation, first aid, sources of energy, the use of improved seeds and fertilisers, etc.

The changes in examination content will be discussed in more detail, and illustrated with examples, in Chapters 4 and 5.

2. Introduction of the information feedback system

The second main method of reform has been to use the CPE examination not merely as a tool for selection and certification, but as a source of information about the strengths and weaknesses of pupils' performance, and hence as a means for improving the quality of basic education.

An examination can be a highly effective instrument for monitoring the performance of a school system. An enormous amount of useful information can be generated as a by-product of examination processing at negligible cost, particularly if processing is carried out by computer. Until the last few years, however, this information was not even generated for CPE, let alone analysed and made available to the schools. The conception of what should be contained in a set of examination 'results' remained extremely limited.
Schools, inspectors and education officers expected to receive nothing more than a computer printout listing the candidates in each school and the grades they obtained in each of the subject areas (English, Mathematics and General subjects) together with a list of those accepted for secondary school.

The feedback information which is now provided from analysis of examination data is of two main types.

(i) Incentive information consists mainly of lists giving overall performance means in each of the three main subject areas (English, mathematics and general subjects) and in the examination as a whole, for each school within a district, and for each district within the Republic. In all lists, schools or districts are ranked in performance order, from top to bottom. School performance lists are sent to the district field officers for distribution to the schools, but the district performance list is made available nationally, to the press, the teachers' union and to Ministry officials as well as to the district field staff and the individual schools. A sample district list (for 1981) is given at Annex 1 and a sample schools list (for Tana River District, one of the smallest, in 1981) at Annex 2.

(ii) Guidance information is based mainly on analysis of performance in individual questions rather than on overall performance. The most important guidance feedback document is the CPE Newsletter, which is distributed annually to all schools, field officers and professional educators, and to the press and the teachers' union. The Newsletter has two main purposes: first, to explain to teachers the changes taking place in the examination, both in the content of the items and in the skills being tested; and second, to identify key topics and skills which are causing candidates particular difficulties, and to suggest ways in which teachers might help pupils tackle them more successfully. Extended excerpts from the 1980 and 1981 Newsletters are reproduced at Annex 3.
The annex includes the full 1980 reports for the English objective paper and for science, and the 1981 reports for mathematics, history, geography and the English composition paper.

It will be apparent that the introduction of the information feedback system had to wait until the reform of the examination itself was well under way. There would have been no point in establishing incentives for schools to improve their performance in an examination which tested little more than rote recall; to do so would have been merely to exacerbate the already prevalent tendency for teachers to prepare their pupils by cramming them with facts. Nor did the schools need guidance as to the methods they should follow to prepare their pupils for such an examination: as we have noted, the more successful had already devised highly efficient methods, based on protracted revision, frequent mock examinations, and the use of commercially published CPE examination guides. Thus the introduction of examination feedback was delayed until it was clear to everyone that changes were taking place, and that new preparation methods were needed. The first guidance information was sent to the schools in 1976, and the first incentive information in 1977. The full feedback system which has been developed since then will be discussed in Chapter 7.

C. Modes and Goals: The Anticipated Linkages

We have now outlined the two main modes of change which have made up the programme of CPE reform, and the five main goals which it was hoped the programme would achieve. Figure 3.1 shows in schematic form the anticipated linkages between changes and goals.

[Figure 3.1: schematic linking the modes of reform (1. Changes in examination questions: (a) content, (b) skills; 2. Introduction of information feedback system: (a) incentive, (b) guidance) to the educational and allocational goals of reform (more efficient allocation; more equitable allocation; higher quality education; reduced quality differences among schools/districts).]

The examination content changes were directed towards both of the allocational goals, efficiency and equity, and towards one of the educational goals, relevance. It was not anticipated that the examination changes, by themselves, would help improve the quality of upper primary education or reduce the quality differences among schools and geographical areas.

(i) Efficiency. It was hoped that, by improving the technical quality of the CPE items and by tapping a wider range of cognitive skills, the efficiency of the examination as a measuring instrument, and hence as an allocator of secondary-school chances, would be improved.

(ii) Equity. Two changes, it was hoped, would make the examination fairer to the less-privileged: first, the inclusion of a substantial proportion of questions testing reasoning ability and other higher-level skills; and second, the basing of questions, wherever possible, on material which was within the experience of most pupils. It was anticipated that these changes would give pupils from less developed parts of the country, from poor-quality schools, and from underprivileged socioeconomic backgrounds a better chance of showing their talents, and hence of winning secondary school places.

(iii) Relevance. It was hoped that teachers would respond to signals from the reformed examination by revising both their teaching methods and the content of their lessons, so that pupils would receive a more relevant education during the final year of the primary course.

The introduction of the information feedback system was directed towards the improvement of the overall quality of upper primary education, and towards the reduction of quality differences among schools and geographical areas.
It was also hoped that the feedback system would have positive effects on the relevance of upper primary education.

(i) Overall quality. Guidance information, it was hoped, would help improve overall quality in three ways: first, by making explicit the new competencies which pupils needed to master in order to succeed in the reformed examination; second, by identifying areas of weakness in the performance of pupils which needed special attention; and third, by suggesting methods by which teachers could tackle these areas of weakness. Incentive information, it was hoped, would help by promoting competition among schools in each district, and among districts in the country as a whole.

(ii) Reduction of quality differences. The incentive information now available makes it easy for district inspectors, teachers' advisory centre tutors and other members of the district-level primary school support teams to identify the weaker schools. It was hoped that this would lead to more effective remedial action. Similarly, the relative standing of each district in the Republic as a whole is now widely known, to local and national political leaders, professional educators, and to the general public. It was hoped that this information would lead to initiatives towards strengthening the district support teams in the weaker districts.

[The pages that follow in the original reproduce sample CPE science questions; their text is not recoverable in this copy. The legible fragments cover acids and bases, bottle feeding and diarrhoea, weighing a fish on a balance, a cotton planting and spraying experiment, separating mixtures with a perforated tin can, and substitutes for fertiliser.]

2. Recall of terminology. Not only do the 1972 and 1973 questions test recall almost exclusively, but they also focus narrowly on specific factual material, and especially on terminology. The concern is with the vocabulary of science rather than with its application. Moreover, many of the terms are technical: sedimentary, saline, barometer, capillarity, etc.

In the 1979 and 1981 papers, by contrast, only a few technical terms are used, and their role is quite different. Question 61 uses the terms 'acids' and 'bases', but candidates are not expected to have memorised their meaning. A definition of both is provided in the stem of the item. The terms simply provide a convenient shorthand to facilitate the testing of a quite different set of skills (reasoning from given data).

Question 79 requires knowledge of the word diarrhoea. The word is complex and difficult to spell, but it refers to a phenomenon which is quite familiar to all CPE candidates, especially those from low-income backgrounds. Further, synonyms are available in all mother-tongue languages. Hence the term is not technical in the sense that terms such as 'barometer' and 'telescope' are technical, referring as they do to objects which most pupils can learn about only from books.
Diarrhoea is one of the commonest causes of infant sickness and death in Kenya, so it is important that pupils should learn about its causes and treatment during open-access education. In the absence of an acceptable simpler word, there is no alternative to using the scientific term.

3. Relevance of Knowledge tested. The knowledge tested in the 1972 and 1973 papers reflects a preoccupation with the selection functions of the CPE, virtually to the exclusion of its functions as a terminal examination. But even for secondary school entrants the relevance of much of the knowledge tested must be questioned. Secondary school pupils need, of course, to know many specific facts, including definitions of terms, but unless they also develop an understanding of broader cause-and-effect patterns, which link facts together and give them meaning, their store of information will be of limited usefulness. Items testing knowledge of causes and consequences are entirely lacking.

For the terminal pupil, the only item which tests knowledge in a clearly relevant topic area is No. 50, from the 1972 paper. But most of the potential relevance is lost because again the item focusses narrowly, like most of the others, on terminology. It is far more important for pupils to understand the reasons for soil loss, its consequences for agriculture and for water supplies, and the steps that can be taken to prevent it, than it is for them to know that such loss is called erosion.

There are, of course, no objective criteria available which allow us to distinguish categorically between knowledge which is "relevant" and "non-relevant" for any group of school pupils. A value judgement is always involved in making the distinction. Nevertheless it seems likely that most judges would agree that questions 77 and 79 - the only knowledge items among the six quoted from the 1979 and 1981 papers - both test factual information relevant to a high proportion of primary school leavers.
In recent years, it has become increasingly common for mothers of low-income families in Kenya to abandon breast feeding in favour of artificial feeding. The setting team, therefore, considered it important that school leavers should know the main reasons why breast feeding is generally preferable. Similarly, the cost of artificial fertilisers has risen steeply in Kenya in recent years, largely as a result of oil-price rises, so it was felt important that leavers should know the low-cost alternatives.

The format of these two questions should be noted. In each case, three correct statements are given with only one incorrect statement, and the task of the candidate is essentially to identify the incorrect statement (in No.77 he must also decide whether three or only two of the statements are correct). This mode of presentation is used frequently with knowledge questions because it enhances the usefulness of CPE papers as a supplementary learning resource. It is inevitable that in subsequent years teachers will use these papers repeatedly during examination preparation, and also in writing items for "mock" examinations. Whenever a teacher revises questions 77 and 79 with his class, he will be presented with three of the most important reasons why breast feeding is preferable to bottle feeding, instead of only one; and with three low-cost alternatives to artificial fertiliser, again instead of only one. In third-world countries, up-to-date textbooks and other learning resources are often in short supply in the schools, and teachers must often make do with obsolete materials, or even with none at all. Examination papers should, therefore, to the maximum extent possible, teach as well as test.[1]

A selection of further knowledge items from CPE science papers since 1975 testing useful information of various kinds is given on page 44.

A Selection of CPE Science Items 1975-1980 Testing Relevant Knowledge

1975. Good farmers dip or spray their cattle to kill cattle ticks. The main reason they do this is because
A. ticks make holes in the skin of cattle and this reduces the value of the leather
B. ticks feed on the blood of cattle, which makes them weak
C. cattle ticks carry diseases which are dangerous to human beings
D. ticks carry [illegible] which can kill cattle
(No. [illegible])

19[illegible]. There is a swamp at the far end of [illegible] school compound. [illegible]'s home is on the other side of the swamp. If he walks through this swamp on his way to school each day, which one of the following diseases is he most likely to get?
A. Bilharzia
B. Kwashiorkor
C. Rickets
D. Smallpox
(No. 41)

19[illegible]. [illegible]'s mother gave four reasons why she wanted to replace the old grass roof on her house with a mabati (corrugated iron) roof. Three of her reasons are correct. Which one is WRONG?
A. A mabati roof keeps the house cooler
B. Snakes and rats cannot hide in mabati roofs easily
C. Clean water can be collected from the mabati roof
D. Mabati roofs last longer than grass roofs
(No. [illegible])

19[illegible]. Less than [illegible] of Kenya's land area is covered with forests. Four possible reasons why parts of Kenya should be kept for forests are given below. Only three of the reasons are correct. Pick out the one which is NOT correct.
A. Forests help prevent soil erosion
B. Forests help the soil to store water
C. Forests provide raw materials for paper manufacture
D. Forests help prevent the spread of sleeping sickness
(No. 54)

1977. Hadija put clay around the sides of her jiko, leaving the holes open, and let it dry. This made the jiko work much better because it
A. made the jiko heavier
B. allowed wood to be burned in the jiko
C. reduced the loss of heat from the sides
D. increased the flow of the air to the jiko
(No. 59)

1977. Wanjiku works in a health clinic. A mother brings a small child suffering from kwashiorkor. The mother says she cannot afford to buy meat, eggs or milk for her child. What should Wanjiku advise her to do?
A. Give the child bananas or oranges from her shamba to eat each day
B. Bring the child to the clinic each week for injections
C. Feed the child with plenty of posho or ugali
D. Use more beans or groundnuts in preparing the child's food
(No. 68)

1978. Every year a lot of food is lost in Kenya because weevils destroy the beans stored by farmers. Three of the following are good ways to store beans so that they are not destroyed by weevils. Which one is NOT a good way?
A. In bags stacked on the floor of a store
B. In sealed pots mixed with ash
C. In sealed granaries made from wattle, mud and cow dung
D. [illegible]
(No. [illegible])

19[illegible]. There was an outbreak of foot-and-mouth disease on Katuko's farm. She was told not to move her cattle from her farm. The main reason for this was that
A. it would be easier to sell cattle on the farm
B. sick cattle would die if moved
C. the cattle would become healthy if they remained on the farm
D. moving the cattle off the farm would spread the disease
(No. [illegible])

1979. Cut dry grass is often placed between rows of coffee trees. Three of the following are reasons for doing this. Which one is NOT a reason?
A. The grass reduces the speed with which the rain strikes the soil.
B. Water evaporates from the soil more slowly.
C. The grass protects the coffee beans from pests.
D. The grass helps to prevent weeds from growing around the coffee trees.
(No. 56)

1979. Diarrhoea is a disease that kills many babies in Kenya. When babies have diarrhoea they lose a lot of water and foods. One correct way to treat babies with diarrhoea is to
A. keep them wrapped up and warm so that they sweat out the sickness
B. give them drinks of boiled cold water containing some sugar and a little salt
C. give them solid foods containing plenty of carbohydrates
D. give them very little food or water until the diarrhoea stops
(No. [illegible])

1979. Wangari lives in a village with many people. From which one of the following sources can Wangari collect the best drinking water?
A. A big river nearby.
B. The rain from her mabati roof.
C. The dam near the village.
D. A swamp on her [illegible].
(No. 89)

1980. Three of the following help to prevent the spread of cholera. Which one does NOT?
A. Using pit latrines or toilets.
B. Putting oil on stagnant water.
C. Washing hands before eating.
D. Keeping food covered.
(No. [illegible])

1980. Which one of the following should be done immediately if a child burns her hand on a jiko?
A. Put the child's hand in cold water.
B. Burst the blister to let the liquid out.
C. Wrap the hand tightly in a bandage.
D. Cover the burn with cooking fat.
(No. [illegible])

1980. When Njeri was cooking [illegible] on a jiko, the hot oil in the sufuria caught fire. Which one of the following should Njeri do to put out the fire?
A. Take the sufuria off the fire.
B. Put a big cover over the sufuria.
C. Pour water on the sufuria.
D. Blow very hard on the flames.
(No. 90)

4. Testing of scientific skills. As we have already noted, items testing cognitive skills of various kinds now predominate in the CPE science paper. Following Bloom's (1956) widely-used taxonomy, skill items can be categorised into three main types.

Comprehension items involve essentially routine mental operations. The sequence of steps to be carried out is either specified in the questions, or is familiar to the student through previous practice. Thus a question which required pupils to make a temperature or rainfall reading from a graph would be a comprehension item, and so would another which required them to calculate distance on a map from the scale.

In application and analysis questions, by contrast, the sequence of steps needed to work out the answer is not specified, nor is it familiar to pupils through previous practice. The candidate must first decide for himself what steps are needed, and in what order, to solve the problem. Thus questions of these types involve decision making and problem-solving skills: essentially they test the pupil's ability to think effectively.
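The distinction just drawn rests on two yes/no judgements about an item, and it can be written down as a small decision rule. The Python sketch below is illustrative only: the names `ItemProfile`, `steps_specified`, `steps_familiar` and `classify_item` are assumptions introduced here, not terms used in the CPE setting procedures, and in practice the judgement is made by the setting team rather than computed.

```python
from dataclasses import dataclass


@dataclass
class ItemProfile:
    """A setter's judgement about one skill item (hypothetical structure)."""
    steps_specified: bool  # the question itself spells out the steps to follow
    steps_familiar: bool   # the candidates have practised this sequence before


def classify_item(item: ItemProfile) -> str:
    """Return 'comprehension' for routine items, else 'application/analysis'."""
    if item.steps_specified or item.steps_familiar:
        return "comprehension"
    return "application/analysis"


# Reading a rainfall value from a graph: a practised, routine operation.
graph_reading = ItemProfile(steps_specified=False, steps_familiar=True)
# A problem where the candidate must work out the steps for himself.
novel_problem = ItemProfile(steps_specified=False, steps_familiar=False)

print(classify_item(graph_reading))
print(classify_item(novel_problem))
```

Nothing in the rule is fixed by the wording of the question alone: the value of `steps_familiar` depends on what the candidates have previously practised, so the judgement has to be made for a particular group of pupils.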
It is not always easy to separate comprehension items on the one hand from application and analysis items on the other. The distinction, as we have seen, hinges mainly on familiarity. Any sequence of mental operations becomes routine if repeated often enough, and no longer requires decision-making and reasoning. Thus a question which required application or analysis skills in the CPE examination might involve only comprehension if asked in the O-level examination, four years later.

Similarly, the distinction between application and analysis questions is by no means always easy to draw. Application questions involve the use of remembered knowledge in novel situations to solve a problem or draw a valid conclusion. Question 63 from the 1979 paper would be classified as an application item. To answer it successfully, pupils need to be familiar with the uses of a sieve, either from practical observations at school or from out-of-school experiences. They must also be able to estimate the approximate sizes of peas, beans, rice grains and particles of wheat flour and powdered salt, using metric units. With this knowledge, they must then group the substances into two size categories (above and below 1 mm.) and hence decide which pair of substances the sieve could separate.

Question 61 would be classified as an analysis item. Like No. 63, it involves reasoning to solve a problem. But the candidate must work with unfamiliar data, provided as a part of the question, rather than with remembered knowledge. He must start by analysing this information, and identifying the relationships implicit in it.

In categorising CPE items, the distinction between application and analysis skills has not been found especially useful. Commonly, the two groups have been combined into a single category, and described as reasoning items. Their essential characteristic is that the candidate must set up for himself an appropriate sequence of cognitive steps to solve a problem or reach a valid conclusion.

Question 78 deserves special comment. Like questions 61 and 63, it is mainly a test of reasoning, but in addition a further element is involved. In most reasoning items, only one of the four alternative answers presents a valid conclusion from the given information. But with No. 78, all four conclusions are valid. The candidate must decide which conclusion it would be most important for agricultural officers to stress in giving advice to farmers. In the Bloom taxonomy, such questions are categorised as evaluation items.

The proportion of reasoning items in CPE science has increased steadily since they were first introduced in 1974. They now make up between one-third and one-half of the items each year. The main reason for this heavy stress is that science provides more opportunities than other subjects (with the exception of mathematics) for the testing of decision-making and problem-solving skills in terminally-relevant contexts. Clearly, the most important task facing the primary teacher is to develop pupils' ability to think effectively. But it is not certain to what extent the skills involved transfer readily from one situation to another. For example, a pupil may have learned to solve quite successfully the intellectual puzzles based deliberately on non-curricular content typically contained in verbal and non-verbal reasoning tests; but this encapsulated experience may not necessarily enable him to tackle more effectively a reasoning problem concerning crop yields (such as No. 78), or the correct weight of a fish (No. 62). Hence reasoning skills are tested, wherever possible, in contexts similar to those in which they are likely to be needed.

A further point should be made about questions 62 and 78. Although both are reasoning questions, in each case the correct answer also teaches terminally-relevant knowledge.
Few CPE leavers can look forward to formal jobs; most must generate economic opportunities for themselves in the "informal" sector, as farmers, craftsmen or traders. For those who work as farmers, knowing how to check that scales are correctly set when they sell their coffee, maize, milk or other farm produce will be vital to their income-earning prospects. Illiterate and poorly-educated farmers are often cheated in this way. The same knowledge is, of course, essential to craftsmen and traders, and, indeed, to anyone who ever buys or sells goods by weight. For those in cotton-growing areas, it will be useful to know that yields can be increased far more by early planting (which costs nothing) than by repeated spraying (which is expensive).[2]

Behind the specific content of these questions lies a more general message: low-income families can find at least some ways to use their limited resources more effectively. The same message underlies the content of No. 77 (it is cheaper as well as better to breast-feed babies rather than to bottle-feed them) and No. 79 (wood-ash, farmyard manure and compost can be used as low-cost alternatives to artificial fertiliser). A number of further examples can be seen among the items quoted on page 44, including No. 59 from the 1977 paper (cooking stoves are more efficient and burn less fuel if the metal sides are insulated with clay) and No. 68 from the same year (low-cost plant foods such as beans and groundnuts can be used to cure kwashiorkor if animal products such as eggs, milk or meat are too expensive).

It will have been noted that underlying the emphasis now given to the testing of reasoning skills in the CPE is the assumption that the ability to think effectively is a skill that is teachable, no more immune to the effects of good or bad teaching than any other skill. Hence, reasoning questions are included in the examination not merely to identify thinking skills, but also to encourage teachers to attempt to develop them.
Empirical evidence in support of this view has been presented elsewhere (Somerset 1977, Makau and Somerset 1978), and a more detailed investigation of the same issue is now under way.[3]

5. Use of non-verbal media for presenting information. Another striking difference between the 1972-73 papers and those for 1979 and 1981 is in the use of non-verbal media for the presentation of information. In the earlier papers, words only are used, but in the 1979 and 1981 papers, charts or diagrams supplement words in four of the six items quoted. In the papers as a whole, the proportion is over 80% in both instances.

But despite these changes, the 1979 and 1981 papers certainly make greater demands on the verbal reading and comprehension skills of the candidates than the 1972 and 1973 papers. To compensate, the time allowed per question has been more than doubled, from 48 seconds per question in 1973 to 100 seconds in 1979. Judging from the answer patterns, most candidates have enough time to complete the paper (which of course includes history and geography as well as science) in the time allowed. Even in the last few items of the paper, the proportion of candidates failing to give an answer rarely reaches as high as 2%.[4]

6. Observation and Experiment. In a multiple-choice paper it is, of course, difficult to assess directly to what extent pupils have been involved in a programme of observation and experiment. Wherever possible, however, questions in recent papers have been written in such a way that pupils with practical experience have a substantial advantage. Question 63 from the 1979 paper, for instance, would be much easier for pupils for whom the process of sieving was familiar, and question 62 from the same paper more difficult for candidates who had never carried out weighing experiments or at least observed how different types of balance work.
It is hoped that the backwash effects of questions such as these will promote a more exploratory approach towards the teaching of science in the primary schools, and a more intensive use of the resources of the local environment.

B. Changes in Other CPE Papers.

1. Geography. Reform of the geography paper started in 1976, two years later than science. The changes made have been closely parallel. At the knowledge level, the number of items testing isolated facts has been sharply reduced, and the number testing understanding of cause-and-effect relationships increased. Before 1976, few items tested geographical skills: they now make up between one-third and one-half of the items each year.

A high proportion of the skills items involve map reading and interpretation. Typically, a sequence of six to eight questions is written based on an imaginary large-scale map of a small locality, containing features which most pupils should be familiar with either from everyday experience or from geographical field trips: roads, bridges, schools, markets, post offices, boreholes etc. Each year, a different type of locality is chosen: in one year, a tea-growing area; in another, a semi-arid pastoral area; in a third, a coastal area. The geography section of the 1981 CPE Newsletter given at Annex 3 contains a detailed discussion of one such sequence.

2. History. Remembered knowledge continues to be more important in history than in other CPE subjects. Nevertheless, since reforms started in 1976, efforts have been made to reduce the role of recall in the items set. In the 1980 paper, for example, four of the 25 questions required candidates to work with new information, presented through a map or table, which they had almost certainly never seen before. In other questions, candidates were expected to synthesise facts they had remembered into a new perspective, to demonstrate understanding of patterns or trends in history.
These changes are discussed in more detail, and illustrated with examples, in the history section of the 1981 CPE Newsletter, given at Annex 3 (pp. 144 to 147).

3. Mathematics. The introduction of changes in the CPE Mathematics paper cannot be pinpointed to a particular year. An analysis of the 1971 paper carried out for the ILO Mission in 1972 indicated that, as with science, the paper was almost entirely a tool for secondary selection. Little attention was paid to testing the numerical skills which would be needed by primary school leavers. Paradoxically, however, it was found that the items which tested secondary-level skills were, on average, much less effective in identifying the most capable candidates than items which tested basic numeracy skills (ILO, 1972; Somerset, 1974).

Starting in about 1974, the proportion of items testing these more advanced topics (mainly formal algebra and geometry) was gradually reduced. Instead, more questions were asked testing candidates' ability to apply basic number skills to the solution of practical number problems.

Three years later, however, the situation was complicated by the need to test a new "modern" mathematics syllabus. This new syllabus, which included topics such as set theory, number bases, and transformation geometry, had been introduced into Standard 1 in all schools in 1971, and extended each year to the next standard as the 1971 intake moved through the primary system. The first cohort thus reached CPE in 1977. In that year, 17 of the 50 items tested "modern" topics. When the performance patterns were analysed it was found that, somewhat contrary to expectation, pupils found most of the modern items relatively easy; whereas by contrast performance in number application problems (especially problems involving money, time, and metric units of measurement) was weak.
It seems that teachers had been devoting too much time to the more abstract parts of the new syllabus and neglecting topics of practical relevance.[5] For this reason, the number of "modern" items in the paper was reduced rapidly: from 17 in 1977, to 13 in 1978, and only 4 in 1979. In 1980, they were omitted from the paper altogether. Correspondingly, a renewed stress was given to application problems: they numbered 12 in 1978, 20 in 1979, and as many as 27 in 1980. This change in emphasis, and the reasons for it, were discussed in detail in both the 1979 and 1980 Newsletters. Nevertheless, performance in these questions was still relatively weak even in 1980, as can be seen from the data given in the extract from the 1981 Newsletter at Annex 3 (see p. 129).

4. English. The changes in the English objective paper started as early as 1971, three years before those in science. In that year, twenty verbal reasoning items were included in a paper of 100 items. Twenty of the remaining items tested comprehension of passages of connected prose, and the remaining 60 mainly knowledge of grammar, syntax, and usage. Similar proportions (60% grammar and syntax, 20% comprehension, 20% verbal reasoning) were maintained until 1975, when the balance of the paper was changed sharply in favour of comprehension, at the expense of items testing grammar, syntax and idiomatic usages.

These changes were designed to take account of the context in which rural pupils hear and use the English language. It was felt that in a country in which most people, if they speak English at all, use it only as a second or third language, the terminating examination should stress its role as a medium through which knowledge and ideas can be received and communicated, rather than as a medium for everyday conversation. Hence questions testing unusual words, phrases or idioms were eliminated from the paper altogether.
Examples of such expressions from the 1973 and 1974 papers include "plucked up his courage", "in their teens" and "not of his standing". Similarly, it was decided that in certain cases, questions testing points of grammar where the commonly-heard Kenyan usage differs from the correct usage in standard English should be omitted.

Since 1975, approximately 50% of questions each year have tested comprehension; 20-30% grammar and syntax; and 20-25% verbal reasoning. Examples of the types of comprehension question set in recent papers can be seen in the extract from the 1980 CPE Newsletter given at Annex 3 (see pages 120 to 151). A more detailed account of the changes made in the English paper has been given elsewhere (Somerset 1977).

C. Changes in Setting and Moderation Procedures.

Before the introduction of the item changes just outlined, the setting and moderating of CPE items was relatively straightforward. Each year an examiner was appointed for each of the five subjects, who took sole responsibility for preparing a draft paper. In most subjects the same examiner set the questions over a period of several years. The draft papers were reviewed by moderating committees, made up of senior ministry officials and external moderators. A moderation meeting for a single subject usually ran for about half a day.

The first major change in these procedures came in 1975, when team setting for the science paper was introduced. This change was a direct consequence of the introduction of questions testing higher-level skills in 1974. It became apparent immediately that writing the new questions imposed enormously increased demands on the setter, in terms both of professional skill and of time. Recall questions of the type being asked before 1974 could be written quickly: an experienced setter could expect to complete a paper of 40 such questions in less than a week. But the more complex questions asked in increasing numbers since 1974 require much more work.
The most demanding items are perhaps reasoning questions based on real empirical data, such as No. 78 from the 1981 paper discussed above. Five distinct steps are involved in preparing such an item: first, initial conception; second, identification of suitable data from published sources; third, re-arrangement and simplification of the data to make it readily comprehensible to primary school leavers; fourth, construction of a draft item around the data; and fifth, final editing. The entire process can easily take 3-4 working days - nearly as long as it took to prepare a complete CPE science paper prior to 1974.

Before item writing starts, the setting team reviews the results of the statistical analysis of the previous year's paper and decides what changes are needed. It may be, for example, that questions of certain types were too difficult, or unduly biassed against pupils in rural schools or schools in low-income urban areas. Special attention is paid to questions which discriminated inefficiently between more-able and less-able pupils, and an attempt is made to identify the reasons why.

The team then draws up a blueprint for the paper they are about to prepare, specifying the approximate proportions of items which should test each of the important intellectual skills and each of the major topic areas. It might be decided, for example, that not more than 40% of items should test straightforward recall of knowledge, and that most of these should be in the fields of nutrition, health and agriculture.

A rough allocation of tasks is then agreed on. Each setter undertakes to focus his efforts on writing questions to cover specific areas of the blueprint. But in practice this division of labour is only very partial. In recent years scarcely any of the questions finally accepted into the examination papers have been the unaided work of single setters.
Typically, the setter produces only a first draft of his question, which is then reviewed in detail by the setting team. Sometimes the team edits the question on the spot and produces an agreed version; but more often the setter is asked to prepare a second draft, incorporating the suggested changes. On occasion this iterative procedure is repeated several times.

The edited questions are then submitted as a draft examination paper to the moderation committee. It is, of course, the aim of every setting group to have its paper accepted by the committee with only minor modifications, but in science this has never happened. In fact, the questions receive just as thorough a review during moderation as they do during the setting meetings. It is unusual for the science moderation meeting to be completed in less than two days.

Before the start of the CPE reform program, the total time taken to write and moderate a science paper of 40 items was probably not more than 10-12 man-days. In recent years the same processes have taken at least ten times as long: a minimum of 100-120 man-days.

Teams of setters for CPE subjects other than science were not formally appointed until 1978. From 1975 onwards, however, setters increasingly sought help from professional colleagues on an informal basis.

The composition of the CPE setting teams varies a good deal from subject to subject. A typical team consists of 5-8 members, drawn mainly from among teacher educators, curriculum developers and school inspectors. A professional officer of the Examinations Council usually acts as coordinator. Participants in the first item-writing workshop, held in August 1978, provided a high proportion of the initial recruits. Each year an attempt is made to include at least two new setters in each team as trainees. It is clear, however, that more effective methods will need to be devised to identify individuals with the qualities needed to become skilled item writers.
The ad hoc methods used until now have proved quite inadequate. Shortage of skilled setters is at present the single most severe constraint on the institutionalisation and extension of the examination reform program.

Footnotes

1. When this mode is used, it is often necessary for the candidate to make a response which is negative in some sense: he must choose the answer which is false, least important, different from the others, least likely, etc. Question 79 is an example. Although these negative items tend to be somewhat more difficult than the others, they are equally effective in discriminating abler pupils from the less able. To reduce misinterpretation, the key words are printed in bold type. Since 1980 upper case bold has been used, as can be seen in items 77-79.

2. The trends are very similar for most other annual crops. If labour is short, the opportunity costs of planting cotton at the beginning of the rains may be high, because food crops must also be planted at this time.

3. When reasoning items were first introduced, the complaint was sometimes heard from teachers that the science paper had become just an "intelligence test", containing no "real science". Sometimes the corollary was drawn that because intelligence is largely inherited, there was little a teacher could do to prepare pupils for such a paper. The role of the teacher in developing thinking skills was discussed in the 1979 Newsletter in the following terms:

"Until quite recently it was often said that reasoning ability, or intelligence, is mainly inherited. Some children inherit high intelligence from their parents, while others inherit low intelligence; and education can make very little difference. Recent research, however, has shown that this is false.
Some children are certainly more successful in reasoning tasks than others, but a good teacher can improve the reasoning skills of all his pupils. With most of the reasoning questions in the 1978 General paper there were big differences in performance from school to school. For instance, among the rural schools, the proportion of pupils correctly answering the question about food chains at Lake Nakuru which we have just discussed ranged from well over 70% to under 30%. Some of the schools where the pupils answered very well were in isolated areas, where only a few of the parents had received any education, so it is clear that the pupils' good performance was due mainly to the skill of their teachers, and not to their family backgrounds."

4. Although, of course, it is likely that some of the weaker candidates resort to guesswork when time is nearly up.

5. Essentially, number application questions are very similar to reasoning questions. In each case, the candidate must decide for himself which cognitive steps are needed, and in which order, to solve a problem. In the real world, situations in which an appropriate sequence must be worked out are, of course, much commoner than situations in which the sequence is specified.

CHAPTER 5. CPE EXAMINATION CHANGES: THE ENGLISH COMPOSITION PAPER

As we noted in Chapter 2, when the CPE took over from its predecessor in 1966, the changeover in format from short-answer and essay-type questions to multiple-choice was complete. The new examination was designed to be processed almost entirely by computer. Human intervention was needed only to correct technical hitches: an answer sheet which failed to pass through the document reader, for example, or an incorrectly-entered index number. Administratively the change brought enormous benefits, but it soon became clear that professionally its costs were unacceptably high.
In many, perhaps most, schools the teaching of the skills of continuous prose writing fell rapidly into decay, because these skills were no longer needed to score high marks. For this reason, an English composition paper was restored to the examination in 1973. In most years it has consisted simply of one compulsory essay topic.

Initially, composition marks were not weighted heavily; in 1973 and 1974, for instance, they accounted for only 30 per cent of the total marks awarded for English. When, however, the standardisation system was set up in 1976 (see Chapter 6), the weightings were set in such a way that the composition paper and the objective paper each contributed equal variance to the total English mark. Thus it is now impossible to gain a high score in English without performing well in the composition paper.

The paper is marked by teachers, working in pairs. They number more than 1000. Until recently half the examiners were primary teachers and half secondary teachers, but now nearly all are primary teachers. The two examiners making up a team read each script separately, then compare their assessments and decide on a final mark. The allocation of scripts is arranged so that no examiner marks essays from schools in his own province.

Over the past two years, both the criteria and the methods used for assessment have been changed radically. Previously a two-tier system of assessment was used. The examiners first decided whether the facility shown by the candidate in the writing of English prose made him an acceptable candidate for entry to a Government-maintained school, a borderline candidate, or an unacceptable candidate. They then awarded him an exact mark, within a range specified in advance for each of the three categories. In making their assessments they were guided by a formal mark scheme, which listed points to be considered.
These centred mainly around accuracy in language use: correct grammar, spelling, punctuation, sentence structure, and use of vocabulary. Because the points were generalised, the same marking scheme was used from year to year.

The secondary school entry criterion has now been dropped, partly because it was felt that CPE candidates should be judged as primary school leavers rather than as secondary school recruits, and partly because the criterion was vague almost to the point of being meaningless. Secondary schools differ widely in the quality of the recruits they attract, so there is no common standard of competence which they all regard as 'acceptable'.

The main criterion for assessment is now a series of 10 standard essays. These are chosen by the chief examiners in advance, from the 'live' scripts written by the actual CPE candidates. They are selected to represent the full range of performance, from the best essays to the poorest. Each examiner receives a photocopy of these essays, and uses them as his main yardstick for assessment.

Further, the qualities required of an essay to score high marks have been changed: more emphasis is now given to the flow of ideas and to originality. It was felt that the formal rules of English language usage - grammar, syntax, punctuation, etc. - could be tested adequately in the objective paper; and that therefore composition marking should concentrate on aspects of communication skill which cannot be tested through multiple-choice items. For 1980, three main criteria were used: accuracy, fluency and imagination. These were specified in more detail as follows:

Accuracy:
1. Correct verb tenses and tense agreements
2. Accurate use of vocabulary
3. Correct spelling and pronunciation

Fluency:
1. Words in correct order
2. Sentences connected and paragraphs in sequence
3. Ideas developed in a logical sequence

Imagination:
1. Imaginative but appropriate use of words and phrases
2. Variety of sentence structures
3. Creative ideas

In support of these changes, the essay topics chosen for the paper have increasingly tended to encourage creative rather than descriptive writing. The topics are generally presented in one or other of two ways. Sometimes, a sequence of four drawings is printed in the examination paper, and candidates are asked to write a story centred around the events shown in the drawings. In other years a less structured mode is used: candidates are simply given the opening few sentences of a story and asked to complete it in their own words. It has been found that these modes not only encourage livelier, more imaginative writing, but they also make it more difficult for candidates to write essays which they have memorised in advance. This occurred frequently when stereotyped composition topics such as "My home" or "My school" were set.

The standards of professional competence now being required of the examiners are much higher than they were in the past, so the period of initial training has been extended. Examiners spend at least a day discussing the merits and demerits of the 10 standard essays before marking starts. As background for the discussion, the chief examiners prepare in advance written notes on each of the standard essays, in terms of the three criteria. Much attention is given during discussion to the subtle points of difference which make one essay worth a few marks more, or a few marks less, than the essay adjacent to it on either side in the standard scale.

Because the frame of reference they are using is a set of real CPE essays, rather than a set of generalised points in a formal marking scheme, examiners very quickly begin to relate the discussions to their own work as teachers of English, and hence to formulate ideas as to how they could improve standards of prose writing in their own schools. The work thus becomes more than a marking exercise; it is an intensive period of in-service training as well.
For many new examiners, the quality of the essays in the upper part of the standard scale comes as a revelation. Thus, when they return to their classrooms they take with them not only new ideas and skills, but also a new standard of reference against which to judge the work of their own pupils.

During marking of the 1980 compositions a survey of examiners was conducted which established that most examiners were being recruited from the more accessible schools, particularly from those in urban areas. Moreover, a high proportion of the schools taking part sent more than one examiner. For 1981, steps were taken to ensure a more equitable geographical coverage. District quotas were established, based on the number of schools, and from each district only the schools with the best CPE results in the previous year were invited to participate. Only one examiner was accepted from any one school and normally he or she was expected to be the English teacher.

With these measures it is hoped to achieve three main aims. First, the efficiency of the marking will be improved, because only successful English teachers will be recruited. Second, the in-servicing benefits to be gained from taking part in marking will be more widely distributed.¹ Third, an additional incentive to improve English teaching in the primary schools will be created, because good performance in CPE will be rewarded by participation in the marking exercise in the following year.

One difficulty in setting up the new system has been to devise composition topics which, while encouraging the exercise of the imagination, are of sufficiently broad appeal to allow pupils from all types of school and socioeconomic background to show their talents. A possible solution might be to provide alternative topics, but this would create other problems. Training examiners would, of course, be more complex, and marking slower.
More important, there would be substantial difficulties in establishing equivalent mark scales among the various topics, especially if they were very different from each other.

Annex 3 includes a section from the 1981 CPE Newsletter discussing the 1980 composition paper in some detail (see p. 128). The topic set for the year is given, together with six of the standard essays from the series of ten which formed the main yardstick for assessment. The wide range of quality in the written work produced by CPE candidates is illustrated vividly by these essays. The two best essays in the series (Nos. 9 and 10) would have received high marks if submitted as O-level work; while the poorest (No. 1) was not an essay at all; it consisted of sentences copied repetitively from the examination paper.

It is still too early to say with any confidence whether or not the changed assessment methods, and the use of the marking exercise as an opportunity for in-servicing, have had any real effect in improving the overall quality of prose writing in the schools. The statistical evidence is ambiguous. The mean CPE composition mark changed hardly at all between 1979 and 1981, but the mode mark (that is, the commonest mark) rose from only two out of 40 in 1979, to six in 1980, and to ten in 1981. This suggests that the proportion of very weak candidates, who cannot write a coherent essay in English, may be dropping; but a systematic analysis of a sample of essays from the three years would be needed to establish the point. Despite the use of standard scripts, CPE marking is based essentially on subjective judgement, so statistical evidence is of limited significance. Apparent changes in performance may be due to changes in marking standards, or in the difficulty of the topics set.
Nevertheless, it is clear from impressionistic evidence that the changes are having a marked impact in individual schools, particularly in schools from which a teacher has been recruited for the marking exercise. A high proportion of these teachers take home the set of standard compositions used in marking, and use them in their teaching during the following year. In one school visited recently the marking of Standard 7 compositions is now shared by all teachers. Each teacher takes responsibility for marking the compositions of five pupils, and for discussing the work of each pupil with him individually. Pupils are rotated among the teachers each month. In another school the best essays written by the CPE class are displayed in a wall newspaper. This provides pupils in all the upper standards with additional reading material, and CPE candidates with both an incentive to improve their writing skills, and a set of models to help them do so.

One final piece of impressionistic data should be recorded. When the present system for setting and evaluating the CPE compositions was established, a major concern was to encourage awareness of the more holistic qualities in prose writing, such as the flow of words, sentences and ideas, and of the use of imagination. In the past there had been a strong tendency for teachers to focus their attention on identifying specific errors in grammar and syntax while marking an essay. Sometimes the total mark awarded was based entirely on the number of errors identified. In consequence pupils tended to produce essays which were short, unadventurous and uninteresting.

Since the changes were introduced, however, it has become evident that the development of creative writing skills among primary school pupils is not really constrained by lack of imaginative ideas, as might have been supposed from reading compositions written in earlier years. With encouragement, many candidates are capable of producing lively, original essays.
A high proportion of them are handicapped, however, because they lack the basic language skills needed to communicate effectively. Composition No. 4 from the 1980 standard series, given in Annex 3c, is a good example of an essay written by such a candidate.

Footnote

1. To help spread the benefits to lower-quality schools, examiners will be encouraged during the marking exercise to in-service non-participating teachers from neighbouring schools (as well as from their own school) when they return home. Zonal organisations, each consisting of about 5 to 10 schools from a compact area, already exist in many districts for the purposes of cooperative teaching and testing. These organisations could be the basis for such in-servicing.

CHAPTER 6. CPE PROCESSING CHANGES: STANDARDISATION OF SCORES AND DETECTION OF CHEATING

In this chapter we shall discuss briefly two major changes in CPE processing procedures which, while not central to the reform programme, have been important for its success. The first of them is standardisation of scores, and the second, the use of a computer system to detect cheating. Standardisation is carried out by many examination authorities as a routine part of results processing, but as far as is known the cheating detection system is unique.

A. The Standardisation of CPE Scores

Standardisation is a procedure which involves adjusting the raw marks scored in an examination to allow for differences in difficulty, and differences in the extent to which the marks scatter. It is a valuable component of examination processing, but unfortunately it often produces results which are easily misinterpreted.

In the standardisation process, differences in difficulty among the papers are measured in terms of differences in the mean (average) raw marks scored by all candidates, and differences in scatter in terms of the standard deviation (SD). In the CPE, the two measures vary considerably, both from subject to subject and from year to year.
There is a consistent tendency for the SD to be higher in mathematics than in the other subjects, particularly English. In 1978, for example, the SD for mathematics was 9.40; for English, 7.88. In 1980, the corresponding figures were 9.22 and 7.05. Mathematics marks, in other words, tend to scatter more than English marks.

Standardisation of examination results involves converting the raw marks so that the means and standard deviations for each paper are identical. For CPE¹ the mean is set arbitrarily at 50 points and the SD at 15 points. It should be stressed that the values chosen for the two statistics are arbitrary, although once they have been determined they should, of course, remain constant from year to year so that the standard scores are comparable.

In the most widely used standard scale, the so-called IQ scale, the mean and standard deviation are set at 100 and 15 points respectively. Another commonly used standard scale is the T-scale, with a mean of 50 and standard deviation of 10. T-scores are used quite commonly for standardising examination results. In Botswana, for example, they are used for the Primary School Leaving Examination.

Standard scores have four main advantages over raw marks, all closely related to each other:

1. A standard score always conveys exact information as to the standing of a pupil relative to other pupils sitting the same examination, whereas a raw mark does not. For example, we know that the pupil with a standard score of 80 is two standard deviations better than average, because 80 is 30 points (or two standard deviations) higher than 50. From his raw mark we can infer nothing, unless we know also the raw mean and standard deviation. Similarly, we would know that a pupil with a standard score of 35 was one SD below average, and another with a score of 50, exactly average.

2. Further, provided the scores are distributed in a normal (bell-shaped) curve, a candidate's percentile ranking can be estimated from his standard score using the table of normal-curve functions given in most statistics texts. For example, the pupil with a standard score of 80 (M + 2SD) will have scored higher than about 97½ per cent of candidates, whereas the pupil whose score was only 35 (M - 1SD) will have beaten only about 16 per cent of candidates. A standard score of 95 (M + 3SD) will place a pupil among the top three candidates in each 1000.²

3. From these properties it follows that standard scores can be used to compare directly the performance of a candidate or group of candidates from subject to subject or from year to year. For example, if a certain district averages 55 points in mathematics in one year and 60 points in the following year, its performance improved by one-third of a standard deviation relative to the country as a whole.

4. Standard scores are essential if marks from several papers are to be summed to give a total mark, and it is desired that each paper should contribute equally (or in a fixed ratio) to that total. The contribution of each paper is determined by its scatter; not, as is sometimes thought, by the average mark or the total mark possible.³ Because standardisation equalises the standard deviations, it also equalises the contribution of each paper to the total. If CPE marks were not standardised, mathematics would in most years contribute more to the total mark (on which selection to secondary school is mainly based) than English or the subjects of the general paper.

But standard scores have an important weakness, which is often lost sight of. A standard score is a measure of relative performance. It tells us how the candidate or group of candidates performed in comparison to other candidates, but gives us no information as to the absolute level of performance.
Because the mean and standard deviation are adjusted to fixed and arbitrary levels, a standard score is likely to be positively misleading if interpreted to indicate whether achievement in a particular subject is good or bad. Overall performance in English may be improving from year to year, or overall performance in Mathematics deteriorating, but these trends will not be reflected in the standard scores.

Standard scores can be especially dangerous when the overall level of performance is poor, as is unfortunately the case very often in third-world countries, including Kenya. The mean standard score is usually set at 50, but the mean raw mark, expressed as a percentage, is often much lower than this. Hence the standard scores are systematically higher than the raw percentage marks. But because standard scores are almost identical in appearance with percentage marks, they are easily confused with them. Many users of examination results have only a hazy understanding of standard scores, and tend to interpret them as if they were the more familiar percentage marks. If they do this they receive the impression that performance is much better than it really is.

Standard scores are especially prone to be higher than raw marks in the below-average range. In 1980 CPE mathematics, for example, a raw mark of 61 per cent converted to a standard score of 62, but a raw mark of 30 per cent converted to 38, and a chance-level mark of 25 per cent to 35. Hence the baseline for assessing mathematical competence in 1980 was 35 standard score points: a computer, generating answers from random numbers, could have been expected to score at about this level. A similar situation was found with the mathematics paper in the Botswana Primary Leaving Examination of 1976. In this paper, the baseline score was as high as 40 points (Somerset, 1977).
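The conversion described in this section can be sketched in a few lines of Python. This is a minimal illustration, not the CPE processing program: it uses the formula given in footnote 1 of this chapter (mean 50, SD 15), the two-digit cap mentioned in footnote 2, and a normal-curve estimate of percentile standing. The function names and the rounding to whole numbers are assumptions made for the example.

```python
import math

CPE_MEAN = 50.0  # standard-score mean used for CPE (from the text)
CPE_SD = 15.0    # standard-score SD used for CPE (from the text)
CPE_CAP = 99     # footnote 2: scores of 100 or more are reported as 99


def standard_score(raw, raw_mean, raw_sd, mean=CPE_MEAN, sd=CPE_SD):
    """Convert a raw mark X to a standard score: mean + (X - M) * sd / SD."""
    if raw_sd <= 0:
        raise ValueError("raw_sd must be positive")
    return mean + (raw - raw_mean) * sd / raw_sd


def reported_score(raw, raw_mean, raw_sd):
    """Whole-number score with the two-digit cap (the rounding is an assumption)."""
    return min(CPE_CAP, round(standard_score(raw, raw_mean, raw_sd)))


def percent_below(score, mean=CPE_MEAN, sd=CPE_SD):
    """Percentage of candidates expected to score lower, assuming a normal curve."""
    z = (score - mean) / sd
    return 100.0 * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Worked example from footnote 1: raw mean 20, raw SD 10, raw mark 40.
example = standard_score(40, raw_mean=20, raw_sd=10)  # 80.0
```

Under the normal-curve assumption, `percent_below(80)` and `percent_below(35)` come out close to the 97½ and 16 per cent quoted above. Nothing in the calculation refers to the absolute level of performance, which is the weakness the text goes on to stress.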
It is clear, then, that standard scores can be most misleading if interpreted without reference to the raw marks from which they were derived. In Kenya nearly 50,000 individual pupils, or nearly 15 per cent of all candidates, scored below 36 standard score points in the 1980 CPE mathematics paper. Effectively, these pupils showed no evidence in the examination of having learned any mathematics at all. The failure of these pupils cannot be attributed to lack of mathematical ability: they were heavily concentrated in a relatively small number of schools. Most of these schools were of extremely low quality; in fact, in more than 60 the average mathematics standard score was below 36! Members of the district support teams will have been able to identify these schools from the mean standard score lists, but it is likely that in many cases the full seriousness of the situation will not have been appreciated.

Standard scores are essential when the results from different papers must be combined to give an overall total, and they are useful for comparing the relative performance of a candidate or group of candidates from subject to subject or from year to year. But for planning and evaluation purposes standard scores are quite inadequate.

For this reason, standard scores are used in CPE analysis only when the focus is on the comparison of the performance of individuals or groups. They are thus used for the secondary school selection lists, and for the schools and district mean scores lists. But when the focus is on evaluation of the quality of performance, as in the CPE Newsletter, raw marks are used. The 1981 Newsletter included a discussion of the differences between the two types of measure, but it is likely that many teachers and field officers continue to interpret standard scores as raw percentage marks. In all probability, the misunderstanding will continue until it has been possible to explain the differences at district-level in-service courses.⁴

B. The Cheating Detection System

Selection examinations which terminate open-access education in less-developed countries such as Kenya are typically mass examinations, sat by huge numbers of candidates in many small centres. Problems of administration and supervision are consequently acute. Moreover for many candidates these examinations constitute the main, perhaps the only, chance to escape from poverty. With the rewards to examination success being so high and the penalties to failure so severe, there are strong incentives to giving candidates some illegitimate assistance.

By 1975 it was evident that cheating in CPE was becoming increasingly common. In some schools, for example, the marks of most candidates were identical or nearly identical in some subjects. It was therefore decided to introduce a cheating detection program into the CPE processing system. The details of this program are kept confidential for obvious reasons, but essentially it works by searching the item response profiles in each school for evidence of unusual patterns. When such patterns are detected, the program prints them, and they are then checked manually. The program also calculates a suggested deduction, made up of a correction element, which compensates for the marks gained by cheating, and a penalty element. Lists of cheating schools are sent to District Education Officers.

When the program was first applied, it was found that nearly ten per cent of schools were involved. In a number of districts so many candidates cheated that they would have taken up all the available secondary school places had they not been detected and their marks reduced. Moreover, CPE performance in most of these districts was well below average, suggesting that in these districts cheating had become not merely a supplement to teaching as a way of helping pupils enter secondary school, but in some cases a substitute for it.
Clearly the impact of the CPE reform program in these districts would have been severely reduced if this situation had continued unchanged. Teachers who were making little real effort to prepare their pupils for the examination would hardly have responded to the item changes, or to the information feedback system, by providing their pupils with more relevant education of higher quality. Similarly, allocation of secondary places could hardly have become more efficient or more equitable if the marks ascribed to many candidates were not a genuine measure of their performance.

The detection program produced an immediate effect. The proportion of schools involved dropped sharply, and is now less than 2 per cent. In many districts where cheating was formerly widespread it has been eliminated entirely. Most of the remaining cases are confined to two geographical areas.

It is difficult to obtain data as to the incidence of cheating in similar examinations in other countries, but it is certain that the situation which obtained in Kenya up to 1975 is by no means unique. Given the circumstances in which these examinations must be conducted, it is probably unrealistic to suppose that cheating can be eliminated altogether. But the Kenya experience over the past five years indicates that it can be contained to the point where it is no longer a serious problem.⁵ Botswana has recently introduced a cheating detection program based on the Kenya program. In its first year of operation several cases were identified.

Footnotes

1. The conversion formula is as follows:

$$\mathrm{SS} = 50 + (X - M)\,\frac{15}{\mathrm{SD}}$$

where X = raw mark obtained by the candidate, M = mean raw mark, and SD = standard deviation of raw marks. Thus in a CPE paper in which the mean raw mark was 20 and the standard deviation 10, a candidate who scored 40 would receive a standard score of

$$50 + (40 - 20)\,\frac{15}{10} = 80$$

2. About one candidate in every 2000 can be expected to score 100 standard score points or higher, but for CPE any such pupils are awarded a score of 99. The scale is thus confined to two digits.

3. This perhaps becomes clearer if the effects of a paper with an SD of zero are considered. In this paper, the marks have no scatter: they all fall at the mean value. Thus adding the marks from the paper to the marks from other papers has no effect on the overall ordering of candidates: it simply adds a constant. The paper contributes no variance to the aggregate, and this is the case irrespective of the mean or maximum mark.

4. With hindsight, it would probably have been preferable to have chosen a higher arbitrary mean for the standard-score system when it was being set up; perhaps 100, as used for IQ scores. This would have made the two sets of marks quite distinct, and avoided their being confused.

5. This discussion of cheating is, of course, relevant to our previous discussion of the viability of internal assessment as a means of selection at the end of the open-access cycle (see pp 4-13).

CHAPTER 7. INFORMATION FROM THE CPE: ANALYSIS AND FEEDBACK

A. Types of Information Generated

The scope for computer analysis of performance in the CPE English composition paper is limited, because only the total mark gained by each candidate is input to the computer. Hence detailed analyses of the strengths and weaknesses of the candidates' prose writing must be carried out manually. Because of the time and effort needed, only small samples can be worked with.

But for the multiple-choice papers the situation is quite different. The computer stores the specific response of each of the 350,000 candidates to each of the 190 multiple-choice items, so the possibilities for analysis are in theory at least unrestricted. In practice, of course, data analysis follows certain set routines, which, once established, do not vary much from year to year.
Table 7.1 sets out the main ways in which CPE performance data is summarised.

Table 7.1. CPE Performance Information: Main Types.

Level of analysis                            Item response data   Overall performance data
by individual pupils                         (raw responses)      (always available)
by school                                                         1976
by district and province                     1976                 1976
by sex (boys or girls)                       1977                 1976
by cost category and location                1973                 1973
  (high-cost and low-cost, rural and urban)
for Kenya as a whole                         1976                 1976

Essentially there are two main ways in which raw CPE data can be analysed: either the responses can be counted across items, to give overall performance data; or they can be counted across pupils to give item response data. These counts can be made either for all candidates, or for sub-groups, by categories such as school, district, province, sex etc.

Traditional CPE "results lists" consist of overall performance data at the individual-pupil level. These data have always been available, of course! The years in which various other types of analysis were carried out for the first time are indicated in Table 7.1.

Overall performance data provide mainly incentive information. A school, for example, can compare its performance with that of other schools in the same district, or a district its performance with other districts in the country as a whole. Item response data, by contrast, are mainly a source of guidance information, particularly if the full response profile, showing the proportion of pupils choosing each of the alternative answers, is available. Properly interpreted, such data can provide teachers and the primary school support teams with a great deal of information as to why performance in certain topics is weak, and hence as to what remedial action is needed.

The two main documents through which incentive information is fed back to the schools are the district merit order list and the schools merit order lists. Both were generated for the first time after the 1976 examination.
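The two ways of counting raw responses described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the pupil identifiers, the answer strings and the three-item answer key are invented for the example, and the real CPE routines are not documented here.

```python
from collections import Counter

# Hypothetical raw responses: one answer string per pupil, one letter per item.
RESPONSES = {"p1": "ABD", "p2": "ABC", "p3": "CBD", "p4": "ADD"}
KEY = "ABD"  # hypothetical answer key for the three items


def overall_scores(responses, key):
    """Count across items: the number of correct answers for each pupil."""
    return {
        pupil: sum(given == correct for given, correct in zip(answers, key))
        for pupil, answers in responses.items()
    }


def item_profiles(responses, n_items):
    """Count across pupils: percentage choosing each alternative, item by item."""
    profiles = []
    for i in range(n_items):
        counts = Counter(answers[i] for answers in responses.values())
        total = sum(counts.values())
        profiles.append({opt: 100.0 * n / total for opt, n in sorted(counts.items())})
    return profiles


scores = overall_scores(RESPONSES, KEY)          # one total per pupil
profiles = item_profiles(RESPONSES, len(KEY))    # one response profile per item
```

Sub-group figures (by school, district, sex and so on) follow from the same two functions applied to the subset of pupils in each group.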
There is, of course, only one district list for the whole country, but separate schools lists are prepared for each district. Sample district and schools lists are given at Annexes 1 and 2 respectively.

For the first two years, distribution of the district and schools lists was restricted. Senior officials at the Ministry headquarters received copies, and also district education officers and other field staff making up the primary school support teams, but no public announcement was made. After the 1978 examination, however, it was decided that the results should be given wide publicity. The district merit order list was made available to the press, and district education officers were asked to ensure that every school learned its average score in the three subject areas and its performance ranking within the district.

These statistics created a great deal of interest, and CPE performance immediately became a major public issue. Politicians, ministry officials, teachers' union officials, administrative officers and newspaper editors joined in a lively debate as to the causes of poor performance and the remedies for it. In many districts, trophies were established to give recognition to outstandingly successful schools. After the issue of the 1979 results during the last few days of the year, the debate was resumed at perhaps an even higher level of intensity.

These events were followed in 1980 and 1981 by some striking changes in the performance status of many districts, and also changes in the extent to which school quality (at least as measured by CPE performance) varies from district to district. We shall discuss these data in Chapter 8.

The major instrument through which guidance information is fed back to the schools is the CPE Newsletter. The first Newsletter, issued on a trial basis in 1976, was only ten pages long, and printed on flimsy paper.
The impact it produced was rather disappointing: in many schools it was filed away unread, as simply another routine circular. After a gap of three years, a new Newsletter was issued in 1979. This time, care was taken to present the material in an attractive and permanent form. The text ran to 20 pages and was enclosed within a cardboard cover illustrated with a colour cartoon by a well-known local artist. This edition looked like a proper book, similar to many on sale in the bookshops, and was read much more carefully in the schools. The same format was repeated in 1980 and 1981. The 1980 edition ran to more than 40 pages, and the 1981 edition to more than 60.

Each primary school in the country receives a copy of the Newsletter, through the District Basic Education Office. In addition, copies are supplied to all members of district and primary school support teams, and to all professional officers of the Ministry, including those at the Kenya Institute of Education and in the Inspectorate. Each of the 18 primary teachers' training colleges receives enough copies to form a class set, for use during professional studies and teaching methodology courses. In 1981, the print order was 15,000.¹

It is a great deal more difficult to set up an effective feedback system for guidance information than it is for incentive information. Incentive information is essentially straightforward: the same set of facts is required year after year, in the same format. Moreover, the data need little explanation: every teacher is familiar with performance merit order lists, and knows how to interpret them. Once a viable computer system has been set up, it can be left to produce the necessary printout each year more or less automatically, as part of routine processing. Human intervention is needed only to ensure that the lists reach all the different groups for which they are intended.

But guidance information is different.
In the first place, the amount of examination data which can be generated at the individual-item level is enormous, and it would be quite impracticable to supply all of it to the schools. Drastic selection is necessary. For CPE, this is done with the five major goals of the reform programme in mind. Items singled out for special attention in the CPE Newsletter nearly always have one or more of the following characteristics:

i. they test specially-relevant skills or knowledge
ii. they are answered correctly by only a small proportion of candidates
iii. they cause particular difficulties to pupils in under-privileged groups.

Secondly, the interpretation of guidance data is by no means always obvious. As we have already noted, information as to why performance is weak in a particular question comes mainly from the item response profile. In a skillfully-written multiple-choice item, each of the distractors (incorrect answers) will have been selected by the item writers because it taps an error likely to be made by pupils in tackling a question of that type. Hence if the question is answered poorly, the response profile will generally provide clues as to the direction remedial work should take. But the interpretation of these clues is not always self-evident: even to the item-writers themselves, the reasons why pupils choose one alternative answer to an item rather than another are sometimes puzzling. Clearly, teachers cannot be expected to understand item-response profiles without help. In the Newsletter an attempt is made to explain the meaning of the data, and to make explicit its implications for classroom work, in simple, easily-understood language.

It will be useful to illustrate the process with a concrete example. The 1980 CPE Newsletter included a discussion of Questions 31 and 32 from the 1979 CPE Mathematics paper. The excerpt begins with the questions themselves, and the response profiles.

Use
the following information to answer Questions 31 and 32.

A price list in a duka showed the following prices:

Kimbo                    Unga                     Tea
2-kg Tin    sh 24.60     2-kg Packet  sh 5.35     1-kg Packet   sh 7.30
1-kg Tin    sh 12.90     1-kg Packet  sh 3.15     ½-kg Packet   sh 3.70
½-kg Tin    sh 6.55                               100-g Packet  75 cts

31. Muthoni bought a 2-kg tin of Kimbo, a 2-kg packet of Unga and half a kilogram of tea. How much did she pay altogether?

                                   A          B          C          D
                                   sh 33.65   sh 33.60   sh 35.15   sh 35.80
Percentage of candidates
giving each answer                 79.9%      9.1%       6.4%       4.2%

32. If the shopkeeper had only 1-kg tins of Kimbo and only 100-g packets of tea, how much more would Muthoni have had to pay for her 2 kg of Kimbo, 2 kg of Unga and half a kilogram of tea?

                                   A          B          C          D
                                   sh 1.20    sh 1.25    sh 0.05    Nothing
Percentage of candidates
giving each answer                 10.1%      28.2%      9.0%       51.4%

The first of these problems was quite easy. The information to be read from the list was straightforward, and only one operation (addition) had to be carried out. Nearly 80% reached the correct answer.

But the second question (no. 32) was much more difficult. It was answered correctly by only 28.2% of candidates. The question required more operations than no. 31, but judging from the answers given, this was not the main reason performance was poor. More than half the candidates thought that Muthoni would pay nothing more if she bought two 1-kg tins of Kimbo instead of one 2-kg tin, and five 100-g packets of tea instead of one half-kg packet. These candidates failed to look carefully at the information given in the price list. They simply assumed that a [...]-kg tin of [...]

[...] ways, checks on the equity of the CPE examination.

Among the 1980 science questions, one of the most deviant was No. 53, which tested candidates' observation of rainbows. The three districts in which this item was answered best were Busia, Kwale and Bungoma, which are all below-average in overall CPE performance.
But they are all districts where rainfall is plentiful, so presumably their candidates had more opportunities for observing rainbows. Two of the three districts in which the item was most difficult are semi-arid.

Another item which showed strong biases tested candidates' knowledge of methods for preventing cholera. In three districts where there had been a recent outbreak of the disease the proportion answering correctly was well over 80%, as compared with less than 75% of rural pupils generally. Most items testing knowledge of modern farming methods (e.g. the use of hybrid seeds, the prevention of soil erosion, etc.) tend to favour pupils from the small (but densely-populated) high-potential agricultural areas.

1. Rural-rural biases

Biases among rural districts occur most frequently with science and geography items, because in these subjects the content of the questions draws most heavily on the out-of-school experience of the candidates. These biases are assessed from the district item difficulty profiles, which give the difficulty indices (the percentage of candidates answering correctly) for each item in each district. The profile for the science questions in the 1980 examination is given at Annex 5.

The scatter of indices for any one item is, of course, affected by the overall quality of education in the various districts as well as by any biases in the particular item. It would be quite feasible to partial out the quality effects statistically, but a more straightforward method has been preferred. For each item, the three top and the three bottom rural districts are identified and marked by a colour code: a red circle for a top district, a blue circle for a bottom district. When this has been done, the overall performance ranking is apparent visually: the high-scoring districts show a row of red circles, and the low-scoring districts a row of blue circles.
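The colour-coding procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch: the district labels and percentages are invented, and the rule used to call a mark "out of place" (a top mark for a district in the lower half of the overall ranking, or a bottom mark for a district in the upper half) is an assumption made here to imitate the visual inspection, not a rule stated in the report.

```python
from statistics import mean

def mark_items(profile, k=3):
    """For each item, return (top_k, bottom_k) districts by difficulty index.

    profile maps district -> list of percentages answering each item correctly.
    The top_k districts correspond to red circles, the bottom_k to blue circles.
    """
    districts = list(profile)
    n_items = len(next(iter(profile.values())))
    marks = []
    for i in range(n_items):
        ranked = sorted(districts, key=lambda d: profile[d][i], reverse=True)
        marks.append((ranked[:k], ranked[-k:]))
    return marks

def deviant_items(profile, k=3):
    """Flag items whose red or blue marks fall against the overall ranking.

    Assumption (not from the report): a district is "high-scoring" if it lies
    in the upper half of districts by mean difficulty index, "low-scoring" if
    in the lower half.
    """
    overall = sorted(profile, key=lambda d: mean(profile[d]), reverse=True)
    half = len(overall) // 2
    high, low = set(overall[:half]), set(overall[-half:])
    flagged = []
    for i, (top, bottom) in enumerate(mark_items(profile, k)):
        odd_red = [d for d in top if d in low]       # red circle in a "blue" row
        odd_blue = [d for d in bottom if d in high]  # blue circle in a "red" row
        if odd_red or odd_blue:
            flagged.append((i, odd_red, odd_blue))
    return flagged

# Invented illustration: six hypothetical districts, four items.
sample = {
    "A": [80, 75, 82, 40],
    "B": [78, 70, 79, 45],
    "C": [70, 68, 71, 50],
    "D": [60, 55, 62, 52],
    "E": [55, 50, 56, 70],
    "F": [50, 45, 49, 75],
}

print(deviant_items(sample, k=2))
# -> [(3, ['F', 'E'], ['B', 'A'])]
```

In this made-up profile only the last item (index 3) is flagged: it is answered best in the two districts that rank lowest overall, and worst in the two that rank highest, which is the pattern the colour code is meant to expose.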
Items which deviate from the overall trend are easily identifiable, particularly in cases where a row of red circles is broken by a blue circle, or a row of blue circles by a red circle.

Clearly, such biases cannot be eliminated except at the cost of losing much relevant content. What is done, as already mentioned, is to attempt to achieve a balance over the examination as a whole, so that the biases are as far as possible self-cancelling. But in practice it has proved extremely difficult to write items which positively discriminate in favour of candidates from arid and semi-arid districts, which are the least-developed parts of the country. The most successful item in this respect in the 1980 science paper was No. 66.

66. Karanja saw this footprint in the mud:

The footprint could have been made by a
A. dog
B. donkey
C. hen
D. goat

This item was most often answered correctly in Samburu (85.3%), Kitui (82.5%), Keiyo Marakwet (82.0%) and Baringo (82.0%), compared to a national average for rural schools of 74.3%. (In Nairobi low-cost schools, only 64.6% answered correctly.) All of these are low-rainfall districts, in which subsistence and cash incomes depend more on the keeping of livestock than on agriculture. Only one (Baringo) was among the top ten districts in 1980.

But another item which was also intended to favour the pastoral areas was less effective:

62. Here is a picture of a bull. Three of the pictures below are different from the picture above. Which one is the SAME?

Samburu was again at the top of the list (83.6%), with Tana River, another semi-arid district, in third place (83.3%). But sharing top place with Samburu was Nyeri, which is perhaps the most highly-developed rural district in Kenya, and one in which zebu cattle of the type shown in the picture are becoming increasingly uncommon. It is also consistently the top-ranking rural district in overall CPE performance.
Furthermore, Turkana, which is almost entirely pastoral, was the bottom-ranking rural district, consistent with its overall performance ranking of 37th. It seems that with this item, the ability to analyse material presented pictorially, a general skill now tested in many CPE questions, was at least as important as familiarity with the specific content. All the information needed to answer correctly was present in the pictures. Item No. 66, by contrast, could not be answered from the given data: it required pupils to have actually observed the footprints of various animals, either during science fieldwork or as part of out-of-school experience.

The item which has most successfully discriminated in favour of disadvantaged rural districts dates back as far as 1977.

49. Kamau saw the new moon shining very low in the sky. Which one of the following did it look like?

The district in which this question was answered best was Turkana, which, as we have seen, is one of the lowest-ranking in overall performance. The proportion of pupils answering correctly was 43%. Both economically and educationally, Turkana is one of the least-developed parts of the country. Subsistence is based on pastoralism, and only a small minority of children attend school at all. It seems that in Turkana, pupils may give relatively greater attention to natural phenomena such as the moon and stars because of the paucity of man-made structures and objects.

Turkana was followed by Kilifi, Wajir, Kwale, Garissa, Mandera, Lamu and Marsabit districts, all with between 35% and 39% of pupils answering correctly. Again, these are all poorly-developed districts, mainly arid or semi-arid, with below-average school participation rates. Further, they are all, with the exception of Marsabit, predominantly Islamic in religion.
Observations of the moon thus have special significance: the ending of the month of Ramadhan, for instance, is signalled by the sighting of the new moon.

In the slightly more favoured pastoral areas and also in the marginal agricultural areas performance was poorer: the proportion of correct answers varied between 25% and 34%. But in the high-potential areas, extending in a continuous belt from Meru district in Eastern Province to Lake Victoria, performance was poorer still: the proportion did not rise above 24%, and in Nyeri, the top-scoring rural district overall, was as low as 18%.

Urban pupils also found the question difficult. In Nairobi city only 23% of pupils in low-cost schools answered correctly. But in the high-cost schools, which in overall CPE grades far outstrip most schools in the country, the proportion was only 14%, the poorest performance of all. In these schools the commonest answers were A (37%) and B (39%), which suggest that many pupils answered from their reading of books published in Britain or the United States, which show the crescent of the new moon pointing to one side, rather than from their observation of the new moon as it appears in Kenya, with its horns pointing upwards.

It can be seen from these examples that in some cases appropriate choice of content can produce a performance bias in favour of the less privileged parts of the country. But given the need to produce an examination which is broadly relevant to as many pupils as possible, the scope for such positive discrimination is limited. It would, for example, be very difficult to justify eliminating from the e;