
the sun itself to volatilize carbon; why, even if the small comets said, in the Philosophical Transactions, to be throwing off the incandescent vapour of carbon every night they were under observation, even in a dark and cold sky, had been taken thence and placed on the very surface of the sun itself, and had experienced there not only the heat which that other comet had experienced of earth's × 47,000, but earth's × 300,000, they could not have shown a pure carbon-spectrum.

As our sun, according to Father Secchi, ranks only among the yellow stars, and they are supposed not to be so hot as the white stars, perhaps the vapour of carbon may exist glowing and incandescent in Sirius, which is so noted a member of the latter class of stars. We may, too, perhaps be privileged to see the actual and real spectral lines of carbon there, in any good telespectroscope, but with the drawback that, however plainly the lines may appear in themselves, we cannot recognize their chemical origin and assign them their true name, because neither has man ever yet volatilized pure carbon, nor has any angel (in default of theory) ever told us the wave-lengths of carbon-lines when the carbon has been volatilized by a higher power.

Hydrocarbon compound it is given to man to volatilize and spectroscope; and he should be thankful for its many admirable uses; but as to the spectrum of the pure carbon element being seen in the base of the flame of every little candle made and set alight by human hands, it would be well if certain modern men, and the secret committee of the Royal Society in particular, were to come forward openly and confess with deep contrition in the words of ancient Job,

"I have uttered that I understood not; things too wonderful for me, which I knew not."

"Wherefore I abhor myself, and repent in dust and ashes."

IV. Statistics by Intercomparison, with Remarks on the Law of Frequency of Error. By FRANCIS GALTON, F.R.S.*

My object is to describe a method for obtaining simple statistical results which has the merit of being applicable to a multitude of objects lying outside the present limits of statistical inquiry, and which, I believe, may prove of service in various branches of anthropological research. It has already been proposed (Lecture, Royal Institution, Friday evening, February 27, 1874), and in some degree acted upon ('Hereditary Genius,' p. 26), by myself. What I have now to offer is a more complete explanation and a considerable development of previous views.

*Communicated by the Author.

Phil. Mag. S. 4. Vol. 49. No. 322. Jan. 1875.


The process of obtaining mean values &c. now consists in measuring each individual with a standard that bears a scale of equal divisions, and afterwards in performing certain arithmetical operations upon the mass of figures derived from these numerous measurements. I wish to point out that, in order to procure a specimen having, in one sense, the mean value of the quality we are investigating, we do not require any one of the appliances just mentioned: that is, we do not require (1) independent measurements, nor (2) arithmetical operations; we are (3) able to dispense with standards of reference, in the common acceptation of the phrase, being able to create and afterwards indirectly to define them; and (4) it will be explained how a rough division of our standard into a scale of degrees may not unfrequently be effected. Therefore it is theoretically possible, in a great degree, to replace the ordinary process of obtaining statistics by another, much simpler in conception, more convenient in certain cases, and of incomparably wider applicability.

Nothing more is required for the due performance of this process than to be able to say which of two objects, placed side by side, or known by description, has the larger share of the quality we are dealing with. Whenever we possess this power of discrimination, it is clear that we can marshal a group of objects in the order in which they severally possess that quality. For example, if we are inquiring into the statistics of height, we can marshal a number of men in the order of their several heights. This I suppose to be effected wholly by intercomparison, without the aid of any external standard. The object then found to occupy the middle position of the series must possess the quality in such a degree that the number of objects in the series that have more of it is equal to that of those that have less of it. In other words, it represents the mean value of the series in at least one of the many senses in which that term may be used. Recurring to the previous illustration, in order to learn the mean height of the men, we have only to select the middlemost one and measure him; or if no standard of feet and inches is obtainable, we must describe his height with reference to numerous familiar objects, so as to preserve for ourselves and to convey to strangers as just an idea of it as we can. Similarly the mean speed of a number of horses would be that of the horse which was middlemost in the running.
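The marshalling just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (has_more, marshal, middlemost) and the sample heights are hypothetical, and the judgment of which of two objects has the larger share of the quality is stood in for by a numeric comparison. The ranking itself uses nothing but that pairwise judgment.

```python
from functools import cmp_to_key

def has_more(a, b):
    """Hypothetical stand-in for the observer's judgment: True if a has
    the larger share of the quality than b."""
    return a > b

def marshal(objects, has_more):
    """Rank objects from least to greatest using only pairwise judgments."""
    def compare(a, b):
        if has_more(a, b):
            return 1
        if has_more(b, a):
            return -1
        return 0
    return sorted(objects, key=cmp_to_key(compare))

def middlemost(objects, has_more):
    """Return the object occupying the middle position of the marshalled series."""
    ranked = marshal(objects, has_more)
    return ranked[len(ranked) // 2]

# Illustrative heights only; any objects that can be compared would do.
heights = [68, 71, 64, 66, 70, 73, 67]
middle_man = middlemost(heights, has_more)
```

With an even number of objects there is no single middle position; the sketch then takes the upper of the two central objects, which is a convention chosen here and not one stated in the text.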

If we proceed a step further and desire to compare the mean height of two populations, we have simply to compare the representative man contributed by each of them. Similarly, if we wish to compare the performances of boys in corresponding classes of different schools, we need only compare together the middle boys in each of those classes.

The next great point to be determined is the divergency of the series, that is, the tendency of individual objects in it to diverge from the mean value of all of them. The most convenient measure of divergency is to take the object that has the mean value, on the one hand, and those objects, on the other, whose divergence in either direction is such that one half of the objects in the series on the same side of the mean diverge more than it does, and the other half less. The difference between the mean and either of these objects is the measure in question, technically and rather absurdly called the "probable error." Statisticians find this by an arithmetical treatment of their numerous measurements; I propose simply to take the objects that occupy respectively the first and third quarter points of the series. I prefer, on principle, to reckon the divergencies in excess separately from those in deficiency. They cannot be the same unless the series is symmetrical, which experience shows me to be very rarely the case. It will be observed that my process fails in giving the difference (probable error) in numerical terms; what it does is to select specimens whose differences are precisely those we seek, and which we must appreciate as we best can.
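A minimal sketch of picking the quarter-point specimens from a series that has already been marshalled, and of reckoning the divergence in excess separately from that in deficiency. The index convention, the function name quarter_points, and the sample values are assumptions made for illustration. The subtraction is possible here only because the sample specimens happen to be numbers; the process itself merely selects the specimens.

```python
def quarter_points(ranked):
    """Return the specimens at the first quarter, middle, and third quarter
    of an already marshalled series (index convention assumed here)."""
    n = len(ranked)
    return ranked[n // 4], ranked[n // 2], ranked[(3 * n) // 4]

# Illustrative, deliberately unsymmetrical series, already in order.
ranked = [61, 64, 65, 66, 67, 68, 70, 72, 78]
p, m, q = quarter_points(ranked)

# Divergence reckoned separately on each side of the middle specimen.
deficiency = m - p
excess = q - m
```

In this sample the two divergences differ (2 below the middle, 3 above it), which is the situation in which keeping them separate matters.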

We have seen how the mean heights &c. of two populations may be compared; in exactly the same way may we compare the divergencies in two populations whose mean height is the same, by collating representative men taken respectively from the first and third quarter points of the series in each case.

We may be confident that if any group be selected with the ordinary precautions well known to statisticians, it will be so far what may be called "generic" that the individual differences of members of that group will be due to various combinations of pretty much the same set of variable influences. Consequently, by the well-known laws of combinations, medium values will occur very much more frequently than extreme ones, the rarity of the latter rapidly increasing as the deviation slowly increases. Therefore, when the objects are marshalled in the order of their magnitude along a level base at equal distances apart, a line drawn freely through the tops of the ordinates which represent their several magnitudes will form a curve of double curvature. It will be nearly horizontal over a long space in the middle, if the objects are very numerous; it will bend down at one end until it is nearly vertical, and it will rise up at the other end until there also it is nearly vertical. Such a curve is called, in the phraseology of architects, an "ogive," and is represented by O G in the diagram (fig. 1), in which the process of statistics by intercomparison is clearly shown. If n = the length of the base of the ogive, whose ordinate y represents the magnitude of the object that stands at a distance x from that end of the base where the ordinates are smallest, then the number of objects less than y : the number of objects greater than y :: x : n - x. The ordinate m at n/2 represents the mean value of the series, and p, q at n/4 and 3n/4, taken in connexion with m, give data for estimating the divergence; thus q - m is the divergence (probable error) of at least that portion of the series that is in excess of the mean, and m - p is that of at least the other portion. When the series is symmetrical, q - m = m - p, and either, or the mean of both, may be taken as the divergence of the series generally. No doubt we are liable to deal with cases in which there may be some interruption in the steady sweep of the ogive; but the experience of qualities which we can measure, assures us that we need fear no large irregularity of that kind when dealing with those which, as yet, we have no certain means of measuring.

[Fig. 1: diagram of the ogive O G.]
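The shape of the ogive can be illustrated with a toy model. It is an assumption of this sketch, not a statement from the text, that each object's magnitude is the number of favourable outcomes among K equal, independent two-way influences, with one object for every possible combination. The names K and ogive are hypothetical.

```python
from math import comb

K = 10  # assumed number of equal, independent two-way influences

# One object per combination of influences. comb(K, j) objects share the
# magnitude j, so building the list in order of j marshals it as we go.
ogive = [j for j in range(K + 1) for _ in range(comb(K, j))]
n = len(ogive)  # 2**K objects along the base

# Ordinates at the quarter, middle, and three-quarter points of the base.
p, m, q = ogive[n // 4], ogive[n // 2], ogive[(3 * n) // 4]

# Magnitudes found across the middle half of the base, against the full range.
middle_half = set(ogive[n // 4:(3 * n) // 4])
full_range = (ogive[0], ogive[-1])
```

Under these assumptions the middle half of the base covers only the magnitudes 4 to 6, while the whole series runs from 0 to 10: the curve is nearly level in the middle and steep at both ends. The model is symmetrical, so q - m and m - p come out equal.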

When we marshal a series, we may arrange them roughly, except in the neighbourhood of the critical points; and thus much labour will be saved. But the most practical way of setting to work would probably depend not on the mere discrimination of greater and less, but also on a rough sense of what is much greater or much less. We have called the objects at the 1/4, 1/2, and 3/4 distances p, m, and q respectively; let us sort the objects into two equal portions P and Q, of small and great, taking no more pains about the sorting than will ensure that P contains p and all smaller than p, and that Q contains q and all larger than q. Next, beginning, say, with group P, sort away alternately to right and left the larger and the smaller objects, roughly at first, but proceeding with more care as the residuum diminishes and the differences become less obvious. The last remaining object will be p. Similarly we find q. Then m will be found in the same way from the group compounded of those that were sorted to the right from P and to the left from Q.
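The sorting procedure above can be sketched as follows. This is a sketch under assumptions, not a definitive rendering of the method: the function names peel_to_middle and find_p_m_q are hypothetical, the rough division into P and Q is stood in for by an ordinary sort, and the observer's eye for the largest and smallest remaining object is stood in for by max and min.

```python
def peel_to_middle(group):
    """Set aside alternately the largest remaining object (to the right)
    and the smallest (to the left) until a single object is left."""
    remaining = list(group)
    left, right = [], []
    take_largest = True
    while len(remaining) > 1:
        if take_largest:
            pick = max(remaining)
            right.append(pick)
        else:
            pick = min(remaining)
            left.append(pick)
        remaining.remove(pick)
        take_largest = not take_largest
    return remaining[0], left, right

def find_p_m_q(objects):
    """Find the quarter-point, middle, and three-quarter-point specimens."""
    # Stand-in for the rough division into small (P) and great (Q).
    ranked = sorted(objects)
    half = len(ranked) // 2
    P, Q = ranked[:half], ranked[half:]
    p, _, right_of_p = peel_to_middle(P)
    q, left_of_q, _ = peel_to_middle(Q)
    # m comes from those sorted to the right from P and to the left from Q.
    m, _, _ = peel_to_middle(right_of_p + left_of_q)
    return p, m, q

# Illustrative objects: twelve magnitudes in no particular order.
objects = [7, 2, 11, 5, 9, 1, 12, 4, 8, 3, 10, 6]
p, m, q = find_p_m_q(objects)
```

For these twelve objects the sketch returns the third, sixth, and ninth in order of magnitude. With an even number of objects the middle is not a single object, and which of the two central ones is returned depends on the order of peeling chosen here.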

There are not a few cases where both the ordinary method and that by intercomparison are equally applicable, but in which the latter would prove the more rapid and convenient. I would mention one of some importance to those anthropologists who may hereafter collect data in uncivilized countries. A barbarian chief might often be induced to marshal his men in the order of their heights, or in that of the popular estimate of their skill in any capacity; but it would require some apparatus and a great deal of time to measure each man separately, even supposing it possible to overcome the usually strong repugnance of uncivilized people to any such proceeding.

The practice of sorting objects into classes may be said to be coextensive with commerce, the industries, and the arts. It is adopted in the numerous examinations, whether pass or competitive, some or other of which all youths have now to undergo. It is adopted with every thing that has a money-value; and all acts of morality and of intellectual effort have to submit to a verdict of "good," "indifferent," or "bad."

The specimen values obtained by the process I have described are capable of being reproduced so long as the statistical conditions remain unchanged. They are also capable of being described in various ways, and therefore of forming permanent standards of reference. Their importance then becomes of the same kind as that which the melting-points of well-defined alloys or those of iron and of other metals had to chemists when no reliable thermometer existed for high temperatures. These were excellent for reference, though their relations inter se were subject to doubt. But we need never remain wholly in the dark as to the relative value of our specimens, methods appropriate to each case being sure to exist by which we may gain enlightenment. The measurement of work done by any faculty when trained and exerted to its uttermost, would be frequently available as a test of its absolute efficacy.

There is another method, which I have already advocated and adopted, for gaining an insight into the absolute efficacies of qualities, on which there remains more to say. Whenever we have grounds for believing the law of frequency of error to apply, we may work backwards, and, from the relative frequency of occurrence of various magnitudes, derive a knowledge of the true relative values of those magnitudes, expressed in units of probable error. The law of frequency of error says that "mag
