Normalizing Parameters with SQL

If you have multiple metrics that you want to weigh together, it may make sense to normalize the data so one metric (that is measured on a larger scale), doesn't disproportionately sway the composite score.

This SQL snippet (which uses CTEs), is one way you can normalize data in SQL. all_plays simply counts the number of plays per user_id. parameters allows us to pull the maximum and minimum number of plays per user_id.

Finally we normalize and rescale the data in the final select statement, with the minimum games played set to 0 and the maximum set to 1. Everything else falls in between these 2 extremes.

Any other methods you prefer to normalize data?

with
  all_plays as (
    select
      user_id
      , count(*) as number_gameplays
    from
      gameplays
    group by
      user_id
  )
  , parameters as (
    select
      max(number_gameplays) as max_plays
      , min(number_gameplays) as min_plays
    from
      all_plays
  )
select
  user_id
  , (max_plays - number_gameplays) * 1.0 / (max_plays - min_plays) as normalized_plays
from
  all_plays, parameters
Reply Oldest first
  • Oldest first
  • Newest first
  • Active threads
  • Popular
Like Follow
  • 1 mth agoLast active
  • 28Views
  • 1 Following