The correlation coefficient between variables is a measure of how well the variables are related. This function is available in Excel with the CORREL function. But what if you’re using PowerPivot and still need to calculate the correlation coefficient for your data in DAX? While the formula isn’t available to date in DAX, you can build the function by identifying your x,y, and n as stated in the formula for
Pearson’s Correlation Coefficient.
In my case, to ensure data integrity as well as my sanity, I built the nominator and the denominator and then divided the two if the denominator was not 0. In all formulas below, Transactions is my n, Price is my x, ShipQty is my y, and MainData is my Power Pivot ‘table’. I have another table, Product, related by ProductID, that contains a distinct selection of the products in MainData.
Transactions:
COUNTA([Price])
Nominator:
([Transactions]*(SUMX(RELATEDTABLE(MainData), (MainData[Price]*MainData[ShipQty])))) - ((SUMX(RELATEDTABLE(MainData), MainData[Price]))*(SUMX(RELATEDTABLE(MainData), MainData[ShipQty])))
Denominator:
if((([Transactions]*(SUMX(RELATEDTABLE(MainData), (MainData[Price]*MainData[Price]))))-(SUMX(RELATEDTABLE(MainData), MainData[Price])*SUMX(RELATEDTABLE(MainData), MainData[Price])))*(([Transactions]*(SUMX(RELATEDTABLE(MainData), (MainData[ShipQty]*MainData[ShipQty]))))-(SUMX(RELATEDTABLE(MainData), MainData[ShipQty])*SUMX(RELATEDTABLE(MainData), MainData[ShipQty]))) < 0, 0, SQRT((([Transactions]*(SUMX(RELATEDTABLE(MainData), (MainData[Price]*MainData[Price]))))-(SUMX(RELATEDTABLE(MainData), MainData[Price])*SUMX(RELATEDTABLE(MainData), MainData[Price])))*(([Transactions]*(SUMX(RELATEDTABLE(MainData), (MainData[ShipQty]*MainData[ShipQty]))))-(SUMX(RELATEDTABLE(MainData), MainData[ShipQty])*SUMX(RELATEDTABLE(MainData), MainData[ShipQty])))))
Correlation Coefficient:
ROUNDDOWN(IF([Denominator] = 0, 0, [Nominator]/[Denominator]),1)