Automatically Transpose Any Table with SQL Only

Congratulations, if you've reached this page, you're about to gain a very valuable tool. For all of you that do not have Python/R on your site and cannot transpose a table in 1 simple line of code, I'm happy to present to you a custom solution that will allow you to easily and automatically transpose tables using SQL only.

To do this, you'll need the following:

  • A Periscope Site with Cache
  • A cached table or a materialized view
  • 20 Minutes to follow this walkthrough

As some of you may know, transposing a table manually using SQL only is a tedious process involving unions and case/when statements. For example, a guide on stackoverflow shows that a 6x2 table requires at least 18 lines of code to transpose. As the number of columns and rows increases, the code to transpose increases as well.

In this guide, we will build a dedicated dashboard that will generate code to transpose any table. Assuming you don't feed it gigantic tables, this should work for most use cases!

The finished product will look like the image below. Simply select your schema and table along with the column that should be the index for the new table. Copy the text from the updated chart and paste it into a new chart to get the transposed version.

 
Steps

  1. Create new dashboard
  2. Create a grandparent-parent-child trio of dashboard filters
  3. Create a materialized view
  4. Create the transposer chart

Step 1: Create a New Dashboard

This is the easiest step. We can create a new dashboard from the menu at the left, following these instructions. It's a good idea to name it something like Table Transposer so that others know how to use it. You might even add an extra chart that includes instructions for your users. Go ahead and remove any default filters from this dashboard 

Step 2: Create a Materialized View

Normally, system tables don't like to join to non-system tables. However, we need a way to enumerate the columns and tables of the schemas on the cache. Luckily, there is a table in Redshift which can provide this information and be cached for future use and cross-database joins!

Let's go ahead and create a new view with the following SQL:

select
      svv_columns.table_schema,
      svv_columns.table_name,
      svv_columns.column_name,
      svv_columns.data_type,
      svv_columns.ordinal_position
    from
      svv_columns
    where
      svv_columns.table_schema not ilike 'pg_%'
      and svv_columns.table_schema != 'information_schema'
    union all
    select
        view_schema::varchar as table_schema,
        view_name::varchar as table_name,
        col_name::varchar as column_name,
        col_type::varchar as data_type,
        col_num::int as ordinal_position
    from
      pg_get_late_binding_view_cols()
        cols(  view_schema name
             , view_name name
             , col_name name
             , col_type varchar
             , col_num int)

Right now, we don't need to worry about exactly how this query works, but I encourage you to give it a look over to find out for yourself!

Step 3: Create Filters for the Dashboard

This part is a bit more complex. We'll go through the three filters one by one, starting with the grandparent filter that references the view we just materialized.

Filter 1: Schema

First, we need something that lets users choose a schema and excludes system tables. Use the following code to generate a filter called schema:

select distinct table_schema, table_schema
from [redshift_tables]

And make sure to enable radio button selection under More Options:

 
 

Filter 2: Table

The second filter will be a child of the first filter. This will let users select a table from the previously selected schema. Use the following code to generate the the filter values:

select distinct table_name, table_name, table_schema
from [redshift_tables]

This filter needs to be set as a child of the previous filter as well as set to radio style buttons:


 

Filter 3: Column

The third and final column represents the index about which the table will be transposed. In other words, the data in the transposed table will have one row for each value of the column selected with this filter. It will make more sense when you give it a try yourself, trust me!

select distinct
  column_name
  , column_name, table_name
from [redshift_tables]

This will also need to be set up as a child filter:
 

Step 4: Create the Main Chart

Okay, now that we have our infrastructure set up, we have to create a very clever chart that will output SQL. That's right, the display of this chart isn't data, but SQL Code that can be copied and pasted into a new chart. We'll change a couple chart settings too. For now, start with the following query:

-- --yes_cache
select col_1, col_2, col_3, col_4, col_5, col_6, col_7, col_8, col_9 from (
select
  'select value_description,' col_1
  , null col_2
  , null col_3
  , null col_4
  , null col_5
  , null col_6
  , null col_7
  , null col_8
  , null col_9
  , 1 as sort_order
union all
select
  'max(case when'
  , '[column]'
  , '::text = trim(translate('''
  , replace([column], '''', '''''')
  , null
  , ''', chr(9),''''))then value::text else null end)'
  , case when regexp_count(SUBSTRING([column]::text, 1,1), '[a-zA-Z]') = 1 then REGEXP_REPLACE([column]::text, '[^a-zA-Z0-9]+', '_') else 'col_' || REGEXP_REPLACE([column]::text, '[^a-zA-Z0-9]+', '_') end
  , null
  , case
    when ROW_NUMBER() OVER(ORDER BY [column]) != (select count(*) from [schema].[table])
      then ','
    else null
  end
  , 2 as sort_order
from
  [schema].[table]
union all
select
  'from ('
  , null
  , null
  , null
  , null
  , null
  , null
  , null
  , null
  , 3 as sort_order
union all
select
  'select ' || '[column],'
  , column_name
  , '::text value, trim('''
  , null
  , column_name
  , ''') value_description from'
  , null
  , '[schema].[table]'
  , case
    when ROW_NUMBER() OVER(ORDER BY column_name) != (select count(*) from  [redshift_tables] where
  table_schema = '[schema]'
  and table_name = '[table]'
  and column_name != '[column]')
      then 'union all'
    else null
  end
  , 4 as sort_order
from
  [redshift_tables]
where
  table_schema = '[schema]'
  and table_name = '[table]'
  and column_name != '[column]'
union all
select
  ') group by value_description'
  , null
  , null
  , null
  , null
  , null
  , null
  , null
  , null
  , 5 as sort_order
order by
  sort_order) tbl

This is a very complex query that employs a lot of different tricks including window functions, unions, regular expressions, direct replacement filters, string casting, trimming, translating, and subqueries!

Feel free to dive into the code, but do not feel like you need to understand it to get benefit from it. Let's look at the chart settings we need to set to optimize this:

 

Hiding table line numbers and aligning the table to fill the space will allow easier copy pasting of the code. Save the chart and you should be good to go!

In the end, you should have an experience similar to the following gif:

 

Reply Oldest first
  • Oldest first
  • Newest first
  • Active threads
  • Popular
Like4 Follow
  • 4 Likes
  • 3 mths agoLast active
  • 140Views
  • 1 Following