Generate True Count From Start/End Date Range

I work for a company that uses a subscription revenue model, where subscriptions are stored in postgres as start and end timestamps. For our insurance & billing department, it's important to know how many assets were actually out on any day of the week, but there are a couple issues:

  • The date ranges for a subscription are variable from 1 to 28+ days
  • The start and end timestamps can be retroactively changed by ops

If you're starting with a start and end timestamp (e.g. '2017-09-05' & '2017-09-15') and want to extrapolate all rows between those ('2017-09-05... '09-06'...'09-07'...'09-08'...)

Start by generating a table of dates in a given range,

FROM ( SELECT
  date( generate_series('2017-09-01'::timestamp, now(), '1 day') as days
) as all_dates

Then left join in your events table on both the start and end dates.

LEFT JOIN ( SELECT
  e.sub_id
  ,e.start --example: '2017-09-05'
  ,e.end --example: '2017-09-15'

  FROM events e
) as subs
  ON (subs.end >= all_dates.days) AND (subs.start <= all_dates.days)

This leaves you with an extrapolated table of all the days between the two dates for quick and easy aggregation.

4replies Oldest first
  • Oldest first
  • Newest first
  • Active threads
  • Popular
  • Thanks Alex Selby ! Our subscription data is structured the same way and we do much the same thing, except we have a view full of dates so that we can run these queries on Redshift. (When Redshift supports generate_series, I'll be able to retire happy.)

    It starts with the "numbers" view:

    select
      row_number() over(order by 1) - 1 as nmbr
    from
      any_large_table
    order by
      nmbr
    limit 10000001

    This produces a table like: 

     

    Then to make our "all_dates" view, we do:

    select
      dateadd(day,nmbr,'2013-01-01') dte
    from
      [numbers]
    order by
      1

    This gives us a table of dates going back to 2013, early enough for any analysis on our own data. 😉 You can adjust the start date to taste:

     

    Then instead of joining to "generate_series" we join to "[all_dates]" in our subscription analysis SQL. Hope this helps!

    Reply Like 1
  • When I need a numbers table and don't have a generate_series function I do it with a random table like so :)

    with numbers as (select
      row_number() over(order by created_at)
    from
      [table_with_a_timestamp]
    limit 10000)
    Reply Like
Like1 Follow
  • 1 Likes
  • 1 yr agoLast active
  • 4Replies
  • 1013Views
  • 4 Following