Google BigQuery

Introduction

Google BigQuery is one of the most popular options for those who want to set up a data warehouse in the cloud, because it is scalable and easy to manage. Read more about its features and how to get started with the platform here.

If you wish to use this platform to host the data you are aggregating with DataChannel, you can either set up a BigQuery warehouse of your own or use a DataChannel Managed BigQuery Warehouse. This document shows how to connect or provision a BigQuery-based warehouse in minutes using the DataChannel Platform.

Self Managed Google BigQuery Warehouse

Prerequisites for connecting your BigQuery Warehouse
  • Create a Google BigQuery Dataset using the steps given here.

  • Create a Google Cloud Storage bucket in the same region as your BigQuery Dataset.

  • Ensure you have granted the required permissions as per the documentation here.
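
The same-region requirement above is easy to get wrong because the two consoles may show location names in different letter case. The following is a minimal sketch of a pre-flight check, assuming you have copied both location names by hand; the function names and the sample values are hypothetical and are not part of DataChannel or any Google library. It only compares strings, so it will not recognise that a single region sits inside a multi-region location.

```python
def normalize_location(name: str) -> str:
    """Trim whitespace and lower-case a location name so that
    'US-CENTRAL1' and 'us-central1' compare equal."""
    return name.strip().lower()


def same_location(dataset_location: str, bucket_location: str) -> bool:
    """Return True when the dataset and bucket location names match,
    ignoring case and surrounding whitespace."""
    return normalize_location(dataset_location) == normalize_location(bucket_location)


if __name__ == "__main__":
    # Placeholder values; substitute the locations shown in your own consoles.
    print(same_location("us-central1", "US-CENTRAL1"))
    print(same_location("EU", "US"))
```

Running a check like this before filling in the form can catch a mismatch early, before it surfaces as a failed load.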

Step By Step Guide

Step 1

Click on the Data Warehouses tab in the left sidebar navigation to reach the Data Warehouses module.

Step 2

Click on Add New to add a new Data Warehouse to your account.

Step 3

Select BigQuery from the Storage Type drop-down options.

Step 4

Enter the details for your BigQuery Dataset in the form and click on Save to add the warehouse. An explanation of each of the fields in the form is given in the table below. Refer to the Google BigQuery documentation here to find out how to get this information for your project.

| Field | Requirement | Description |
| --- | --- | --- |
| Name | Required | Provide a name for your warehouse. It must be unique across your account. |
| Dataset Name | Required | Provide the name of your BigQuery Dataset (you can get this from the BigQuery console). |
| Project ID | Required | Provide the project ID of the cloud project that contains the BigQuery Dataset. |
| BigQuery Region | Required | Provide the region / location where your Dataset is located. |
| BigQuery Authentication Code | Required | Click on the link Generate Authentication Code and follow the process given here to generate a code using OAuth2. |
| Use DataChannel GCS | Required | Leave this toggle off so that you can specify your own GCS bucket. |
| GCS Project ID | Required | Provide the ID of the project where you have created the Google Cloud Storage bucket that stores the raw data files. Note that DataChannel does not remove the files after they have been copied into GCS, so it is advisable to use lifecycle rules to remove or archive the raw files and keep GCS costs under control. |
| GCS Bucket Name | Required | Provide the name of the cloud storage bucket you have created for DataChannel. |
| GCS Region Name | Required | Provide the name of the region where your GCS bucket is located. Note: this should be the same as the region of your BigQuery Dataset. |
| GCS Authentication Code | Required | Click on the link Generate Authentication Code and follow the process given here to generate a code using OAuth2. |
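
Because the raw files are not removed from your bucket automatically, a lifecycle configuration is one way to keep storage costs in check. The sketch below builds a configuration in the JSON shape that Cloud Storage lifecycle rules use: one rule moves objects to a colder storage class and a second rule deletes them. The function name, the NEARLINE storage class and the 30 and 90 day thresholds are illustrative assumptions, not DataChannel recommendations; choose values that match your own retention needs.

```python
import json


def build_lifecycle_config(archive_after_days: int, delete_after_days: int) -> dict:
    """Build a Cloud Storage lifecycle configuration with two rules:
    move objects to NEARLINE once they reach archive_after_days,
    then delete them once they reach delete_after_days."""
    if not 0 < archive_after_days < delete_after_days:
        raise ValueError("expected 0 < archive_after_days < delete_after_days")
    return {
        "rule": [
            {
                # Rule 1: move older raw files to a cheaper storage class.
                "action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
                "condition": {"age": archive_after_days},
            },
            {
                # Rule 2: delete raw files once they are no longer needed.
                "action": {"type": "Delete"},
                "condition": {"age": delete_after_days},
            },
        ]
    }


if __name__ == "__main__":
    # Example thresholds only (assumed values): archive at 30 days, delete at 90.
    print(json.dumps(build_lifecycle_config(30, 90), indent=2))
```

The printed JSON can be saved to a file and applied to the bucket, for example with `gsutil lifecycle set lifecycle.json gs://<your-bucket>`, where the bucket name is a placeholder for your own.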

DataChannel Managed BigQuery

Adding a DataChannel managed warehouse is simple: switch the two toggles, Use DataChannel BigQuery and Use DataChannel GCS, to ON and click on the Save button to add the warehouse.


Still have Questions?

We’ll be happy to help you with any questions you might have! Send us an email at info@datachannel.co.
