Determine the Fraction of Rows that Meet a Condition (Ex: Calculating Accuracy on a Test Dataset in SQL)
Let's say you have a table that contains a row of expected values and a row of predicted values, and we want to determine the accuracy. What we need to do is find cases where the expected values are equal to the predicted values, count them, and divide them by the total number of rows.
Here's the SQL used to generate that. Note that I only want to find the accuracy on my test data (this dataframe was generated using a K-nearest-neighbors machine learning model. Check out this community post here if you'd like to learn more about how to build this dataframe), hence the where clause.
In the below SQL, ESTIMATED_TARGET is the predicted data, ACTUAL_TARGET is the expected result
select sum( case when ESTIMATED_TARGET = ACTUAL_TARGET then 1 else 0 end ) * 1.0 / count(*) as accuracy from [knn_model] where DATASET = 'test'
You can leverage a similar calculation if you're trying to determine what fraction of your rows meet a certain criteria!