AWS Redshift Data Warehouse

Introduction

Amazon Redshift is a popular, scalable, and easy-to-manage option for setting up a data warehouse in the cloud. Read more about its features and how to get started with the platform from AWS here.

If you wish to use Redshift to host the data that you are aggregating using DataChannel, you can either set up a cluster of your own or use a DataChannel Managed Redshift warehouse. This document shows how to connect or provision a Redshift-based warehouse in minutes using the DataChannel platform.

Self Managed Redshift Cluster

Prerequisites for connecting your Redshift Cluster
  • Create a Redshift cluster with enough capacity to store the volume of data you anticipate.

  • Create an S3 bucket in the same region as your cluster, and configure access to the bucket from your Redshift cluster. Read more here.
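
As an illustration of the bucket access prerequisite, the following minimal sketch builds an IAM policy document scoped to a single S3 bucket. The exact set of permissions DataChannel needs is not stated in this document, so the actions listed here (listing the bucket, plus reading and writing objects) are an assumption, and the function name `build_bucket_policy` and the bucket name are hypothetical. Check the AWS documentation before attaching a policy like this to a user or role.

```python
import json


def build_bucket_policy(bucket_name: str) -> dict:
    """Return an IAM policy document granting access to one S3 bucket.

    The chosen actions are an assumption for a staging bucket that is
    written to by a loader and read by Redshift; adjust as needed.
    """
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level permission: list the staged files.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [bucket_arn],
            },
            {
                # Object-level permissions: write staged files and read them back.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": [f"{bucket_arn}/*"],
            },
        ],
    }


if __name__ == "__main__":
    # "my-datachannel-staging" is a placeholder bucket name.
    print(json.dumps(build_bucket_policy("my-datachannel-staging"), indent=2))
```

Scoping the policy to one bucket keeps the credentials you later enter in the warehouse form from reaching any other data in your AWS account.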

Tips to ensure smooth functioning
  • Create a separate schema for the data coming in from DataChannel.

  • Create two users: one for loading the data and another for reading it from BI tools and similar clients. This helps you configure Workload Management (WLM) and helps prevent your queries from queuing up.
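
The two tips above can be sketched as a small helper that generates the Redshift SQL for a dedicated schema, a load user, and a read-only user. This is only a sketch: the function name `build_setup_sql`, the schema and user names, and the password placeholders are all hypothetical, and the statements should be reviewed by whoever administers your cluster before they are run.

```python
import re
from typing import List

# Conservative pattern for unquoted SQL identifiers.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_setup_sql(schema: str, load_user: str, read_user: str) -> List[str]:
    """Return SQL statements for a dedicated schema and two users."""
    for name in (schema, load_user, read_user):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"not a simple SQL identifier: {name!r}")
    return [
        # Replace the password placeholders before running these statements.
        f"CREATE USER {load_user} PASSWORD '<load-user-password>';",
        f"CREATE USER {read_user} PASSWORD '<read-user-password>';",
        # The load user owns the schema, so it can create tables and load data.
        f"CREATE SCHEMA IF NOT EXISTS {schema} AUTHORIZATION {load_user};",
        # The read user may look inside the schema...
        f"GRANT USAGE ON SCHEMA {schema} TO {read_user};",
        # ...and may select from tables the load user creates later.
        f"ALTER DEFAULT PRIVILEGES FOR USER {load_user} IN SCHEMA {schema} "
        f"GRANT SELECT ON TABLES TO {read_user};",
    ]


if __name__ == "__main__":
    for statement in build_setup_sql("datachannel", "dc_loader", "bi_reader"):
        print(statement)
```

If you list the read user in the Select Users field of the warehouse form, DataChannel grants select rights on the tables it creates, so the last statement may be redundant in that setup.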

Step By Step Guide

Step 1

Click on the Data Warehouses tab in the left sidebar navigation to reach the Data Warehouses module, as shown below.

[Screenshot: Data Warehouses module]

Step 2

Click on Add New to add a new Data Warehouse to your account.

Step 3

Select Redshift from the Storage Type drop-down options.

[Screenshot: Storage Type selection]

Step 4

Enter the details for your Redshift cluster in the form and click on Save to add the warehouse. Each field in the form is explained below. Refer to the AWS Redshift documentation here to learn how to find this information for your cluster.

[Screenshot: Redshift warehouse form]

Field (Required or Optional): Description

  • Name (Required): Provide a name for your warehouse. It must be unique across your account.

  • Host (Required): Provide the hostname or endpoint of the cluster.

  • Username (Required): Provide the username that will be used to create tables and load data. This user needs all rights on the schema that you intend to use. If you are creating a dedicated schema for the data from DataChannel (which is recommended), this user can be the schema owner.

  • Password (Required): Provide the password for the load user.

  • Select Users (Optional): Comma-separated list of users who should get select rights on the tables that DataChannel creates using the schema and username you specify.

  • Port (Required): Provide the port number for your cluster. The default is 5439 unless you changed it while creating your Redshift cluster.

  • DB Name (Required): Provide the name of the database you have created in your cluster.

  • Schema Name (Required): Provide the database schema where DataChannel should push the data. As mentioned above, it is recommended to create a new schema for DataChannel in your database.

  • Use DataChannel S3 (Required): Leave this toggle off so that you can specify your own S3 bucket.

  • AWS Location (Required): Provide the AWS region where your S3 bucket has been created. This should typically be the same as the region in which your Redshift cluster is hosted. Example: us-east-1

  • Bucket Name (Required): Provide the name of the S3 bucket where DataChannel should copy files before loading them into your Redshift instance. Note that DataChannel does not remove the files after they have been copied into Redshift, so it is advisable to use S3 lifecycle rules to manage the removal or archival of the raw files and control S3 costs.

  • Access Key (Required): Provide the access key required to access the S3 bucket using the API. Refer to the AWS documentation here to learn how to manage your access keys.

  • Secret Key (Required): Provide the secret key required to access the S3 bucket using the API. Refer to the AWS documentation here to learn how to manage your secret keys.

  • IAM Role (Required): Provide the IAM role required to access the S3 bucket using the API. Refer to the AWS documentation here to learn how to create and manage your IAM role.
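
Because DataChannel does not remove the staged files from your bucket, it may help to see what an S3 lifecycle rule for them could look like. The sketch below builds a lifecycle configuration that expires objects after a given number of days and, optionally, applies it with boto3. The function names, the 30-day retention period, and the bucket name are assumptions chosen for illustration; pick a retention period that matches your own reload and audit needs.

```python
def build_lifecycle_config(expire_after_days: int, prefix: str = "") -> dict:
    """Return an S3 lifecycle configuration that expires old objects.

    An empty prefix applies the rule to every object in the bucket.
    """
    if expire_after_days < 1:
        raise ValueError("expire_after_days must be at least 1")
    return {
        "Rules": [
            {
                "ID": f"expire-staged-files-after-{expire_after_days}-days",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Expiration": {"Days": expire_after_days},
            }
        ]
    }


def apply_lifecycle_config(bucket_name: str, config: dict) -> None:
    """Apply the configuration to a bucket (needs boto3 and AWS credentials)."""
    import boto3  # third-party; imported here so the builder works without it

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket_name,
        LifecycleConfiguration=config,
    )


if __name__ == "__main__":
    config = build_lifecycle_config(expire_after_days=30)
    print(config)
    # Uncomment to apply; "my-datachannel-staging" is a placeholder bucket name.
    # apply_lifecycle_config("my-datachannel-staging", config)
```

Note that applying a lifecycle configuration this way replaces any lifecycle rules already set on the bucket, so merge in existing rules first if you have them.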

DataChannel Managed Redshift

Adding a DataChannel managed warehouse is simple: switch the two toggles, Use DataChannel Redshift and Use DataChannel S3, to ON and click on Save.

[Screenshot: DataChannel managed Redshift toggles]

Still have Questions?

We’ll be happy to help you with any questions you might have! Send us an email at info@datachannel.co.
