
# stats_stat_correlation

(PECL stats >= 1.0.0)

stats_stat_correlation - Returns the Pearson correlation coefficient of two data sets

### Description

stats_stat_correlation(array `$arr1`, array `$arr2`): float

Returns the Pearson correlation coefficient between `arr1` and `arr2`.
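
For reference, the Pearson coefficient of paired samples is conventionally defined as below. The symbols $x_i$, $y_i$ (the elements of `arr1` and `arr2`), $n$ (their common length), and $\bar{x}$, $\bar{y}$ (their means) are notation introduced here; this is the textbook definition, not a statement about the extension's internals.

$$
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
$$

The value lies between -1 and 1 and is undefined when either data set has zero variance, because the denominator is then zero.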

### Parameters

`arr1`

The first array

`arr2`

The second array

### Return Values

Returns the Pearson correlation coefficient between `arr1` and `arr2`, or `false` on failure.
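
Because the function can return `false`, and is not defined at all when the PECL stats extension is not loaded, callers may want to guard the call. The following is a minimal sketch under those assumptions; `correlation_or_null()` is a hypothetical helper name, not part of the extension, and the length check is a precaution added here rather than documented behavior.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical wrapper around stats_stat_correlation().
 * Returns null when the extension is missing, the inputs are
 * unusable, or the extension reports failure with false.
 */
function correlation_or_null(array $arr1, array $arr2): ?float
{
    // The function only exists when PECL stats is loaded.
    if (!function_exists('stats_stat_correlation')) {
        return null;
    }

    // Precaution: require paired data with at least two points.
    if (count($arr1) !== count($arr2) || count($arr1) < 2) {
        return null;
    }

    $result = stats_stat_correlation($arr1, $arr2);

    // Compare strictly: a coefficient of 0.0 is a valid result, false is not.
    return $result === false ? null : (float) $result;
}

$r = correlation_or_null([1, 2, 3], [2, 4, 6]);
echo $r === null ? "correlation unavailable\n" : $r . "\n";
```

The strict `=== false` comparison matters here: a loose check such as `if (!$result)` would also treat a legitimate coefficient of `0.0` as a failure.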

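This function was undefined for me, so I implemented my own correlation, which is much faster and simpler than the other user implementation on this page.

```php
<?php
// Pearson correlation of two equal-length, zero-indexed numeric arrays.
function Corr($x, $y)
{
    $length = count($x);
    $mean1 = array_sum($x) / $length;
    $mean2 = array_sum($y) / $length;

    $axb = 0; // sum of products of deviations
    $a2 = 0;  // sum of squared deviations of $x
    $b2 = 0;  // sum of squared deviations of $y

    for ($i = 0; $i < $length; $i++) {
        $a = $x[$i] - $mean1;
        $b = $y[$i] - $mean2;
        $axb = $axb + ($a * $b);
        $a2 = $a2 + pow($a, 2);
        $b2 = $b2 + pow($b, 2);
    }

    $corr = $axb / sqrt($a2 * $b2);

    return $corr;
}
```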

Please note that this function expects two arrays of numeric values (integers, in my testing). I tested this function and found that it calculates the Pearson correlation coefficient of the two arrays.

---

Here's suggested documentation:

stats_stat_correlation - Calculate the Pearson correlation coefficient of two arrays of numbers.

Parameters:

- arr1 = array(integer1a, integer2a, ...)
- arr2 = array(integer1b, integer2b, ...)

(Note that the two arrays must contain the same number of elements.)

Return value: the Pearson correlation coefficient as a decimal number (e.g. 0.934399822094)

Code example:

```php
<?php
// Provided by admin@maychu.net
$array_x = array(5, 3, 6, 7, 4, 2, 9, 5);
$array_y = array(4, 3, 4, 8, 3, 2, 10, 5);
$pearson = stats_stat_correlation($array_x, $array_y);
echo $pearson;
?>
```

I tried to use this function, but got a not-defined error. Anyway, I have created a set of functions to replace it:

```php
<?php
// Since Correlation needs two arrays, I am hardcoding them
$array1[0] = 59.3;
$array1[1] = 61.2;
$array1[2] = 56.8;
$array1[3] = 97.55;

$array2[0] = 565.82;
$array2[1] = 54.568;
$array2[2] = 84.22;
$array2[3] = 483.55;

// To find the correlation of the two arrays, simply call the
// function Correlation that takes two arrays:
$correlation = Correlation($array1, $array2);

// Displaying the calculated Correlation:
print $correlation;

// The functions that work behind the scenes to calculate the correlation

function Correlation($arr1, $arr2)
{
    $correlation = 0;
    $k = SumProductMeanDeviation($arr1, $arr2);
    $ssmd1 = SumSquareMeanDeviation($arr1);
    $ssmd2 = SumSquareMeanDeviation($arr2);
    $product = $ssmd1 * $ssmd2;
    $res = sqrt($product);
    $correlation = $k / $res;
    return $correlation;
}

function SumProductMeanDeviation($arr1, $arr2)
{
    $sum = 0;
    $num = count($arr1);
    for ($i = 0; $i < $num; $i++) {
        $sum = $sum + ProductMeanDeviation($arr1, $arr2, $i);
    }
    return $sum;
}

function ProductMeanDeviation($arr1, $arr2, $item)
{
    return (MeanDeviation($arr1, $item) * MeanDeviation($arr2, $item));
}

function SumSquareMeanDeviation($arr)
{
    $sum = 0;
    $num = count($arr);
    for ($i = 0; $i < $num; $i++) {
        $sum = $sum + SquareMeanDeviation($arr, $i);
    }
    return $sum;
}

function SquareMeanDeviation($arr, $item)
{
    return MeanDeviation($arr, $item) * MeanDeviation($arr, $item);
}

function SumMeanDeviation($arr)
{
    $sum = 0;
    $num = count($arr);
    for ($i = 0; $i < $num; $i++) {
        $sum = $sum + MeanDeviation($arr, $i);
    }
    return $sum;
}

function MeanDeviation($arr, $item)
{
    $average = Average($arr);
    return $arr[$item] - $average;
}

function Average($arr)
{
    $sum = Sum($arr);
    $num = count($arr);
    return $sum / $num;
}

function Sum($arr)
{
    return array_sum($arr);
}
?>
```
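
Both hand-written replacements in the notes above divide by the square root of the product of the squared deviations. That product is zero whenever one array is constant, and in PHP 8 the `/` operator then throws a `DivisionByZeroError`. They also assume equal-length, zero-indexed arrays. A guarded variant might look like the sketch below; `pearson_safe()` is a hypothetical name introduced here, and returning `null` for undefined cases is a design choice of this sketch, not the extension's behavior.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: Pearson correlation in plain PHP with input checks.
 * Returns null when the coefficient is undefined (mismatched lengths,
 * fewer than two points, or zero variance in either array).
 */
function pearson_safe(array $x, array $y): ?float
{
    $n = count($x);
    if ($n < 2 || $n !== count($y)) {
        return null;
    }

    // Re-index so that non-sequential keys do not break positional pairing.
    $x = array_values($x);
    $y = array_values($y);

    $meanX = array_sum($x) / $n;
    $meanY = array_sum($y) / $n;

    $sxy = 0.0; // sum of products of deviations
    $sxx = 0.0; // sum of squared deviations of $x
    $syy = 0.0; // sum of squared deviations of $y

    for ($i = 0; $i < $n; $i++) {
        $dx = $x[$i] - $meanX;
        $dy = $y[$i] - $meanY;
        $sxy += $dx * $dy;
        $sxx += $dx * $dx;
        $syy += $dy * $dy;
    }

    $denominator = sqrt($sxx * $syy);
    if ($denominator == 0.0) {
        return null; // at least one array is constant
    }

    return $sxy / $denominator;
}

// Same data as the stats_stat_correlation() example in the notes above.
$r = pearson_safe([5, 3, 6, 7, 4, 2, 9, 5], [4, 3, 4, 8, 3, 2, 10, 5]);
echo $r === null ? "undefined\n" : round($r, 4) . "\n"; // prints 0.9344
```

The means are computed once up front, so the loop stays linear in the array length, unlike the multi-function version, which recomputes the average for every element.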