It is clear from (63) that the change of volume is unaltered by moving the centre of force from any one point to any other point which lies on the same equipotential surface of the solid.

In applying (63) it must be borne in mind that in general a body cannot be in equilibrium under the action of a single centre of force, and that when equilibrium exists under the joint action of a number of centres of force it is to a certain extent arbitrary to regard the effect of one as independent of the others. In the case of a planet, for instance, the "centrifugal force" in the orbital path balances the sun's attraction. If we regarded the planet as an elastic solid describing a circular path, and neglected for the time being the effects of self-gravitation and rotation about an axis, we should find its elastic change of volume given by two terms. One of these terms would be of the type (64), depending on the mass of the sun; the other would depend on the angular velocity in the orbit. The persistence, however, of the planet in its orbit implies a necessary relation between the gravitational and "centrifugal" forces, so that the two terms in the expression for δv could be combined into one, containing as we choose the angular velocity or the gravitational constant.

§ 18. In obtaining (63) we treated the source of the gravitational forces as wholly external to the solid. In a self-gravitating homogeneous spherical "earth" of radius a, we should have

$$X/x = Y/y = Z/z = -g'\rho/a,$$

where g' is "gravity" at the surface. Substituting in (2) we easily deduce

[Equation (65) is not legible in this copy.]

As I have repeatedly remarked elsewhere, the application of such a formula to the actual earth, or other principal planet, leads to numerical results which cannot be reconciled with the fundamental hypotheses of mathematical elasticity unless k be much larger than in known materials under the conditions existing near the earth's surface.

In small bodies, on the other hand, the effects of self-gravitation are very small. For instance, for a sphere of iron (ρ = 7.5, k = 15 × 10^8 grammes wt. per sq. cm.) of one metre radius we should get from (65)

$$\delta v / v = -2 \times 10^{-14},$$

or

$$\delta a = -7 \times 10^{-12}\ \text{mm}.$$
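The order of magnitude quoted for the iron sphere can be checked with a short calculation. The sketch below is not taken from the paper: since (65) is not legible in this copy, it assumes the form δv/v = −g'ρa/(5k), which is what a uniform radial body force gives when integrated over a sphere, and it assumes a density of 7.5 g/cm³ together with modern values of the gravitational constant and of terrestrial gravity (used to convert to grammes weight). All variable names are hypothetical.

```python
import math

# Assumed constants (modern values, not quoted in the text).
G = 6.674e-8        # gravitational constant, cm^3 g^-1 s^-2
g_earth = 981.0     # terrestrial gravity, cm s^-2, converts dynes to grammes weight

# Values read from the text (density taken as 7.5 g/cm^3).
rho = 7.5           # g cm^-3
k = 15e8            # bulk modulus, grammes wt. per sq. cm
a = 100.0           # radius of one metre, in cm

# Surface "gravity" of the homogeneous sphere, expressed in grammes weight per gramme.
g_prime = (4.0 / 3.0) * math.pi * G * rho * a / g_earth

# ASSUMED form of (65): delta_v / v = -g' * rho * a / (5 k).
dv_over_v = -g_prime * rho * a / (5.0 * k)

# For a uniform contraction, delta_a / a = (delta_v / v) / 3; convert cm to mm.
da_mm = a * dv_over_v / 3.0 * 10.0

print(dv_over_v, da_mm)
```

Under these assumptions the two results come out at roughly 2 parts in 10^14 and 7 parts in 10^12 of a millimetre, in agreement with the figures quoted.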

[To be continued.]

LIII. On Lines and Planes of Closest Fit to Systems of Points in Space. By KARL PEARSON, F.R.S., University College, London*.

(1) In many physical, statistical, and biological investigations it is desirable to represent a system of points in plane, three, or higher dimensioned space by the "best-fitting" straight line or plane. Analytically this consists in taking

$$y = a_0 + a_1 x, \quad\text{or}\quad z = a_0 + a_1 x + b_1 y,$$

$$\text{or}\quad z = a_0 + a_1 x_1 + a_2 x_2 + a_3 x_3 + \ldots + a_n x_n,$$

where $y, x, z, x_1, x_2, \ldots x_n$ are variables, and determining the "best" values for the constants $a_0, a_1, b_1, a_0, a_1, a_2, a_3, \ldots a_n$ in relation to the observed corresponding values of the variables. In nearly all the cases dealt with in the text-books of least squares, the variables on the right of our equations are treated as the independent, those on the left as the dependent variables. The result of this treatment is that we get one straight line or plane if we treat some one variable as the independent variable, and a quite different one if we treat another variable as the independent variable. There is no paradox about this; it is, in fact, an easily understood and most important feature of the theory of a system of correlated variables. The most probable value of y for a given value of x, say, is not given by the same relation as the most probable value of x for a given value of y. Or, to take a concrete example, the most probable stature of a man with a given length of leg l being s, the most probable length of leg for a man of stature s will not be l. The "best-fitting" lines and planes for the cases of 2 up to n variables for a correlated system are given in my memoir on regression †. They depend upon a determination of the means, standard-deviations, and correlation-coefficients of the system. In such cases the values of the independent variables are supposed to be accurately known, and the probable value of the dependent variable is ascertained.
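That the two regressions give different lines is easily checked numerically. The following is a minimal sketch in Python; the five observations and all variable names are invented for illustration and are not taken from the memoir. It relies only on the standard least-squares result that the slope of y on x is the product-moment divided by the second moment of x, and conversely.

```python
# Hypothetical observations (illustrative only, not data from the paper).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.2, 1.9, 3.2, 3.8, 5.1]
n = len(xs)

mean_x = sum(xs) / n
mean_y = sum(ys) / n
var_x = sum((x - mean_x) ** 2 for x in xs) / n
var_y = sum((y - mean_y) ** 2 for y in ys) / n
cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

# Regression of y on x: y - mean_y = b_yx * (x - mean_x).
b_yx = cov_xy / var_x
# Regression of x on y: x - mean_x = b_xy * (y - mean_y).
b_xy = cov_xy / var_y

# Drawn on the same (x, y) axes, the second line has slope 1 / b_xy.
slope_y_on_x = b_yx
slope_x_on_y = 1.0 / b_xy

# Square of the correlation coefficient.
r_squared = cov_xy ** 2 / (var_x * var_y)

print(slope_y_on_x, slope_x_on_y, r_squared)
```

Both lines pass through the point of means, but their slopes coincide only when the correlation is perfect, since the product b_yx · b_xy equals r².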

(2) In many cases of physics and biology, however, the "independent" variable is subject to just as much deviation or error as the "dependent" variable. We do not, for example, know x accurately and then proceed to find y, but both x and y are found by experiment or observation. We observe x and y and seek for a unique functional relation between them. Men of given stature may have a variety of leg-lengths; but a point at a given time will have one position only, although our observations of both time and position may be in error, and vary from experiment to experiment. In the case we are about to deal with, we suppose the observed variables, all subject to error, to be plotted in plane, three-dimensioned or higher space, and we endeavour to take a line (or plane) which will be the "best fit" to such a system of points.

* Communicated by the Author.

† Phil. Trans. vol. clxxxvii. A, pp. 301 et seq.

Of course the term "best fit" is really arbitrary; but a good fit will clearly be obtained if we make the sum of the squares of the perpendiculars from the system of points upon the line or plane a minimum.

For example: Let $P_1, P_2, \ldots P_n$ be the system of points with coordinates $x_1, y_1$; $x_2, y_2$; ... $x_n, y_n$, and perpendicular distances $p_1, p_2, \ldots p_n$ from a line AB. Then we shall make

$$U = S(p^2) = \text{a minimum}.$$

If y were the dependent variable, we should have made

$$S(y' - y)^2 = \text{a minimum}$$

(y' being the ordinate of the theoretical line at the point x which corresponds to y), had we wanted to determine the best-fitting line in the usual manner.
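The difference between the two quantities to be minimized may be made concrete. The sketch below, with invented points and an arbitrary trial line y = mx + c (all names hypothetical), computes the perpendicular distances by dropping a perpendicular on the line, and compares their sum of squares with the sum of squared deviations in y alone.

```python
# Hypothetical observations and an arbitrary trial line y = m*x + c.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.2, 1.9, 3.2, 3.8, 5.1]
m, c = 0.9, 0.3

def squared_perpendicular(x, y):
    """Squared distance from (x, y) to the foot of its perpendicular on the line."""
    # Parameter of the foot along the direction (1, m), measured from (0, c).
    t = (x + (y - c) * m) / (1.0 + m * m)
    foot_x, foot_y = t, c + m * t
    return (x - foot_x) ** 2 + (y - foot_y) ** 2

# U = S(p^2): deviations measured perpendicular to the line.
u_perpendicular = sum(squared_perpendicular(x, y) for x, y in zip(xs, ys))

# S(y' - y)^2: deviations measured parallel to the axis of y only.
u_vertical = sum((m * x + c - y) ** 2 for x, y in zip(xs, ys))

print(u_perpendicular, u_vertical)
```

For any one line the two sums differ by the factor 1 + m², so that the line which makes one a minimum is not in general the line which makes the other a minimum.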


Now clearly $U = S(p^2)$ is the moment of momentum, the second moment of the system of points, supposed equally loaded, about the line AB. But the second moment of a system about a series of parallel lines is always least for the line going through the centroid. Hence: The best-fitting straight line for a system of points in a space of any order goes through the centroid of the system.
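The property of parallel lines appealed to here is the familiar one that the second moment about a line at distance d from the centroid exceeds that about the parallel line through the centroid by n d². A minimal numerical check, with invented points and an arbitrarily chosen direction (all names hypothetical):

```python
import math

# Hypothetical equally loaded points.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.2, 1.9, 3.2, 3.8, 5.1]
n = len(xs)

# Centroid of the system.
cx = sum(xs) / n
cy = sum(ys) / n

# Arbitrary direction for the family of parallel lines (radians).
theta = 0.6
# Unit normal common to every line of the family.
nx, ny = -math.sin(theta), math.cos(theta)

def second_moment(offset):
    """S(p^2) about the line of the family lying `offset` from the centroid."""
    return sum((nx * (x - cx) + ny * (y - cy) - offset) ** 2
               for x, y in zip(xs, ys))

through_centroid = second_moment(0.0)
for d in (-1.0, 0.5, 2.0):
    # The two printed moments agree: S(p^2) at offset d equals S(p^2) at 0 plus n*d^2.
    print(d, second_moment(d), through_centroid + n * d * d)
```

Whatever direction is chosen, the excess n d² vanishes only for d = 0, which is the statement in the text.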

Now let there be n points each fixed by q variables $x_1, x_2, \ldots x_q$, and let

$$\bar{x}_1 = S(x_1)/n, \quad \bar{x}_2 = S(x_2)/n, \ \ldots \ \bar{x}_q = S(x_q)/n \qquad \text{(i.)}$$

fix the centroid, or the mean values of the variables;

$$\sigma_{x_1}^2 = S(x_1^2)/n - \bar{x}_1^2, \quad \sigma_{x_2}^2 = S(x_2^2)/n - \bar{x}_2^2, \ \ldots \ \sigma_{x_q}^2 = S(x_q^2)/n - \bar{x}_q^2 \qquad \text{(ii.)}$$

fix the standard-deviations (errors of mean square), or indirectly the moments of inertia or second-moments about the axes of coordinates, through the centroid parallel to the axes of the variables $x_1, x_2, \ldots x_q$. And, lastly, let

$$r_{x_u x_v} = \frac{S(x_u x_v)/n - \bar{x}_u \bar{x}_v}{\sigma_{x_u} \sigma_{x_v}} \qquad \text{(iii.)}$$

for all pairs of values of u and v from 1, 2, 3,... q, fix the correlations of the variables, or indirectly the products of inertia or product-moments about the axes.

Now let $l_1, l_2, l_3 \ldots l_q$ be the generalized direction-cosines of a plane at perpendicular distance p from the origin. We shall have

$$l_1^2 + l_2^2 + l_3^2 + \ldots + l_q^2 = 1. \qquad \text{(iv.)}$$

Further, if U be the sum of the squares of the perpendicular distances of the system of n points from the plane

$$l_1 x_1 + l_2 x_2 + l_3 x_3 + \ldots + l_q x_q = p, \qquad \text{(v.)}$$

we require to make a minimum of

$$U = S(l_1 x_1 + l_2 x_2 + l_3 x_3 + \ldots + l_q x_q - p)^2, \qquad \text{(vi.)}$$

by variation of $l_1, l_2, \ldots l_q$, p subject to (iv.). Differentiate first with regard to p and we have

$$l_1 S(x_1) + l_2 S(x_2) + l_3 S(x_3) + \ldots + l_q S(x_q) - np = 0;$$

$$\therefore \quad p = l_1 \bar{x}_1 + l_2 \bar{x}_2 + \ldots + l_q \bar{x}_q, \qquad \text{(vii.)}$$

which shows us from (v.) that: the best-fitting plane passes through the centroid of the system.

Now vary (vi.) and add to it Q times the variation of (iv.), Q being an undetermined multiplier. We have, by equating to zero the coefficient of $\delta l_u$,

$$l_1 S(x_1 x_u) + l_2 S(x_2 x_u) + \ldots + l_u S(x_u^2) + \ldots + l_q S(x_q x_u) - p S(x_u) + Q l_u = 0.$$

Or, substituting for p from (vii.) and using (ii.) and (iii.):

$$l_1 \sigma_{x_1} \sigma_{x_u} r_{x_1 x_u} + l_2 \sigma_{x_2} \sigma_{x_u} r_{x_2 x_u} + \ldots + l_u \sigma_{x_u}^2 + \ldots + l_q \sigma_{x_q} \sigma_{x_u} r_{x_q x_u} + Q l_u / n = 0. \qquad \text{(viii.)}$$

Multiplying each type equation by its corresponding $l_u$, adding together and remembering (iv.), we find

$$-Q = U_m,$$

where $U_m$ is the minimum value of U.

Now let $\Sigma^2$ be the mean square of the residuals, or

$$\Sigma^2 = \frac{S(l_1 x_1 + l_2 x_2 + \ldots + l_q x_q - p)^2}{n} = \frac{U_m}{n} = -\frac{Q}{n},$$

and a physical meaning has been given to Q: $\sqrt{-Q/n}$ is the "mean square residual," i.e., the quantity, the square of which is the mean square of the residuals.
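In modern matrix terms (a restatement, not the wording of the paper), the type equations say that the direction-cosines form an eigenvector of the matrix of second moments and product-moments about the centroid, the mean square of the residuals being the corresponding eigenvalue. A minimal sketch for q = 2, where the least eigenvalue has a closed form; the data and all names are invented for illustration, and the eigenvector formula used assumes a non-zero product-moment.

```python
import math

# Hypothetical observations, both coordinates subject to error.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.2, 1.9, 3.2, 3.8, 5.1]
n = len(xs)

# Centroid, second moments and product-moment about the centroid.
mx = sum(xs) / n
my = sum(ys) / n
sxx = sum((x - mx) ** 2 for x in xs) / n
syy = sum((y - my) ** 2 for y in ys) / n
sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n

# Least eigenvalue of the symmetric matrix [[sxx, sxy], [sxy, syy]].
half_trace = (sxx + syy) / 2.0
radius = math.hypot((sxx - syy) / 2.0, sxy)
lam_min = half_trace - radius

# Corresponding eigenvector (valid because sxy != 0 here), normalized so
# that l1^2 + l2^2 = 1 as the direction-cosines require.
l1, l2 = sxy, lam_min - sxx
norm = math.hypot(l1, l2)
l1, l2 = l1 / norm, l2 / norm

# The plane (here a line) passes through the centroid.
p = l1 * mx + l2 * my

# Mean square of the perpendicular residuals from the fitted line.
mean_sq_residual = sum((l1 * x + l2 * y - p) ** 2 for x, y in zip(xs, ys)) / n

# Left-hand sides of the two type equations; both should vanish.
type_eq_1 = l1 * sxx + l2 * sxy - lam_min * l1
type_eq_2 = l1 * sxy + l2 * syy - lam_min * l2

print(lam_min, mean_sq_residual, type_eq_1, type_eq_2)
```

The mean square residual agrees with the least eigenvalue, and the other eigenvalue would give the direction of worst, not best, fit.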

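The type equation (viii.) may now be written:

$$l_1 \sigma_{x_1} \sigma_{x_u} r_{x_1 x_u} + l_2 \sigma_{x_2} \sigma_{x_u} r_{x_2 x_u} + \ldots + l_u (\sigma_{x_u}^2 - \Sigma^2) + \ldots + l_q \sigma_{x_q} \sigma_{x_u} r_{x_q x_u} = 0. \qquad \text{(ix.)}$$

We can eliminate the l's, and dividing out row and column of resulting determinant by the corresponding σ, we have:

$$\begin{vmatrix}
1 - \Sigma^2/\sigma_{x_1}^2 & r_{x_1 x_2} & r_{x_1 x_3} & \ldots & r_{x_1 x_q} \\
r_{x_2 x_1} & 1 - \Sigma^2/\sigma_{x_2}^2 & r_{x_2 x_3} & \ldots & r_{x_2 x_q} \\
\ldots & \ldots & \ldots & \ldots & \ldots \\
r_{x_q x_1} & r_{x_q x_2} & r_{x_q x_3} & \ldots & 1 - \Sigma^2/\sigma_{x_q}^2
\end{vmatrix} = 0$$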

as a determinantal equation to find $\Sigma^2$. We must choose the least root of this equation, for the mean square residual must
