Core Concepts
This section introduces the key concepts and terminology that will help you understand the services provided by DataChannel.
What is a Data Warehouse?
Let’s understand this with the help of an analogy.
Suppose Tom runs a chain of kids' clothing stores at various locations. To meet the needs of his customers across age groups and genders, he needs to keep sufficient inventory for a range of products in assorted sizes and styles. He must also stock up according to changing demand in different seasons. So, in addition to the stores where the actual operations take place, he needs a warehouse where he can receive products from a host of suppliers, stock up according to predicted demand, and replenish the inventory at each store according to its sales pattern. Thus, the warehouse serves as a store where products are received, sorted, stored, and retrieved as and when required.
Likewise, your business activities generate huge volumes of data. You have data about your products, customers, marketing campaigns, employee training, shipments, employees, suppliers, finances, inventory and so on; the list is endless. Unless this data is available in a centralized location, in a usable, well-indexed form, your business departments tend to work in silos, where the left hand is unaware of the actions of the right. All the data needed for better-informed business decisions is right there in your company, but you cannot take advantage of it because it resides in disparate sources that not everyone can access.
Thus, you need a location to receive, sort, store and retrieve your data as and when required, so that you can use multiple Business Intelligence tools to run analyses that support your decision making. A data warehouse is such a centralized location: a repository of large volumes of data and a system that supports data management and business intelligence activities. Data flows into a data warehouse from various sources and apps through forward pipelines, and flows from the warehouse back out to apps through reverse syncs.
Read more about data warehouses in our blog post titled The Best Applications of Data Warehousing.
Supported Warehouses
Choosing where your data will reside is the first decision you need to make. DataChannel supports several data warehouse services to give you flexibility and control over where you store your data. You can pick between a self-managed warehouse and a DataChannel-managed warehouse.
The supported self-managed warehouses are AWS Redshift, Azure Synapse Analytics, Google BigQuery, Snowflake, and MySQL.
Alternatively, you may connect to a DataChannel-managed AWS Redshift or Google BigQuery warehouse. Warehouse management is prone to challenges, and unless you have a dedicated team for warehouse administration and troubleshooting, you may find the task difficult. So, if you do not have the IT resources to set up and manage a warehouse, DataChannel can set up a managed warehouse for you.
Before you decide on a data warehouse, we urge you to carefully weigh pricing, performance, security, reliability, and scale for each of these options, keeping in mind your specific business requirements. Note that the data warehouse chosen while configuring a connector cannot be changed afterwards.
What is a Source?
A source is the location from which you would like to move your data. It is the starting point, or origin, of your data.
Thus, for a forward pipeline that you are configuring, the source is the Web Application/Database/Service/File from which the data is to be moved. For example, if you are moving data from the Shopify platform to your AWS Redshift data warehouse using the Shopify Connector, then Shopify is the source.
For a reverse sync, on the other hand, the source is the data warehouse from which the data is to be moved. For example, if you are moving data from your BigQuery data warehouse to your Facebook Ads platform using the Facebook Ads Reverse Connector, then your BigQuery data warehouse is the source.
What is a Destination?
A destination is the location to which you would like to send your data. It is the landing point where your data is headed.
Thus, for a forward pipeline that you are configuring, the destination is the data warehouse to which the data is to be sent. For example, if you are moving data from the Shopify platform to your AWS Redshift data warehouse using the Shopify Connector, then your AWS Redshift data warehouse is the destination.
For a reverse sync, on the other hand, the destination is the app to which the data is to be sent. For example, if you are moving data from your BigQuery data warehouse to your Facebook Ads platform using the Facebook Ads Reverse Connector, then the Facebook Ads platform is the destination.
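The way source and destination swap roles between a forward pipeline and a reverse sync can be sketched in a few lines of Python. This is only an illustration of the definitions above: the `Direction` and `DataMovement` names are hypothetical and are not part of any DataChannel API.

```python
from dataclasses import dataclass
from enum import Enum


class Direction(Enum):
    FORWARD = "forward"  # app/database/service/file -> data warehouse
    REVERSE = "reverse"  # data warehouse -> app


@dataclass(frozen=True)
class DataMovement:
    """Hypothetical record of one data movement (illustration only)."""
    app: str
    warehouse: str
    direction: Direction

    @property
    def source(self) -> str:
        # Forward pipelines start at the app; reverse syncs start at the warehouse.
        return self.app if self.direction is Direction.FORWARD else self.warehouse

    @property
    def destination(self) -> str:
        # The destination is always the other endpoint.
        return self.warehouse if self.direction is Direction.FORWARD else self.app


shopify_pipeline = DataMovement("Shopify", "AWS Redshift", Direction.FORWARD)
facebook_sync = DataMovement("Facebook Ads", "BigQuery", Direction.REVERSE)

print(shopify_pipeline.source, "->", shopify_pipeline.destination)
print(facebook_sync.source, "->", facebook_sync.destination)
```

The two printed lines correspond to the Shopify and Facebook Ads examples given above.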
What is an API?
Apps and information systems are built with different software and follow different conventions, so they need a facilitator to understand each other. An Application Programming Interface (API) is the code that allows two software programs to communicate with each other. It works through calls and responses: the program that needs data/resources sends a call (request) through the API to the program that holds them, and that program sends back a response.
What are API endpoints?
Imagine yourself traveling through a tunnel. There is a specific point where you can enter the tunnel and a specific point where you can exit the tunnel. You cannot just enter / exit the tunnel anywhere you like.
Similarly, API endpoints can be thought of as the entry and exit points of a communication tunnel. A request to a software program for access to a resource can only be made at a specific digital location where such requests are accepted and responded to. Thus, API endpoints play a key role in ensuring that the communication facilitated by the API is successful.
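The request/response mechanism and the role of endpoints can be illustrated with a small in-process sketch. Everything here is assumed for illustration: the `ORDERS` data, the `/orders` paths and the `call_api` function are hypothetical, and no real network call or real application API is involved.

```python
from typing import Callable, Dict, Tuple

# Hypothetical in-memory data standing in for a real application's records.
ORDERS = [{"id": 1, "total": 40.0}, {"id": 2, "total": 15.5}]

Response = Tuple[int, object]  # (status code, body)


def list_orders() -> Response:
    return 200, ORDERS


def count_orders() -> Response:
    return 200, {"count": len(ORDERS)}


# The endpoints: the only locations at which requests are accepted.
ENDPOINTS: Dict[str, Callable[[], Response]] = {
    "/orders": list_orders,
    "/orders/count": count_orders,
}


def call_api(path: str) -> Response:
    """Send a request to `path` and return the application's response."""
    handler = ENDPOINTS.get(path)
    if handler is None:
        # No entry point exists here, so the request cannot be served.
        return 404, {"error": f"no endpoint at {path}"}
    return handler()


print(call_api("/orders/count"))
print(call_api("/customers"))
```

A request to a registered path gets a response with data, while a request to any other path is rejected, much like trying to enter the tunnel where there is no entrance.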
What are Connectors?
Connectors are software elements that move data into and out of a database/application/service/file/warehouse. Thus, connectors are the communication tunnel through which data travels from one location to another. A connector contains all the information required to access a resource, such as libraries, the URL, and the authentication method. It collects data from a source and delivers it to your chosen destination. Each connector is designed to deliver data from a specific source to a specific destination, so that incompatibilities and differences between the two are addressed during the run.
There are two types of connectors that you can build using DataChannel: Forward Connectors and Reverse Connectors.
What are Forward Connectors?
Forward Connectors are connectors that move data into your data warehouse. Please click here for a list of DataChannel supported Forward Connectors. Every Forward Connector comprises several pipelines, each of which can be used to request data/resources from a specific API endpoint in a Web Application/Database/Service/File and send it to your preferred data warehouse. Thus, you can use pre-built Forward Connectors to sync data from a variety of data sources, including Cloud/SaaS applications, relational databases, cloud data storage and ad hoc files.
What are Reverse Connectors?
Reverse Connectors are connectors that collect data from your data warehouse and deliver it to your chosen Web Application/Database/Service/File. Please click here for a list of DataChannel supported Reverse Connectors. Every Reverse Connector comprises several syncs, each of which can be used to request data/resources from your data warehouse and send it to a specific API endpoint in a Web Application/Database/Service/File.
What is Data Extraction?
Data extraction involves pulling data from homogeneous or heterogeneous sources and validating it to check whether the data has been pulled correctly.
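As a rough sketch of "pull, then validate", the Python snippet below pulls records from an in-memory list and checks two things: that each record has the fields we expect, and that the number of records pulled matches the number the source reported. The field names, the `extract` function and the checks themselves are assumptions made for illustration; they do not describe how DataChannel validates data internally.

```python
from typing import Dict, Iterable, List, Tuple

REQUIRED_FIELDS = ("order_id", "amount")  # hypothetical schema


def extract(records: Iterable[Dict], expected_count: int) -> Tuple[List[Dict], List[str]]:
    """Pull records and return (valid_rows, problems found during validation)."""
    rows: List[Dict] = []
    problems: List[str] = []
    pulled = 0
    for position, record in enumerate(records):
        pulled += 1
        # A field counts as missing if it is absent or set to None.
        missing = [f for f in REQUIRED_FIELDS if record.get(f) is None]
        if missing:
            problems.append(f"record {position}: missing {', '.join(missing)}")
        else:
            rows.append(record)
    # Compare what was pulled against what the source said it holds.
    if pulled != expected_count:
        problems.append(f"expected {expected_count} records, pulled {pulled}")
    return rows, problems


source_records = [
    {"order_id": 1, "amount": 40.0},
    {"order_id": 2, "amount": None},
]
rows, problems = extract(source_records, expected_count=2)
print(rows)
print(problems)
```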
What is Data Transformation?
Data Transformation involves cleansing the data, applying a set of rules/queries, checking the integrity of the data, creating rows with aggregates and so on. Data transformation is an important step because data drawn from disparate sources might not follow the same schema.
DataChannel supports SQL-based Data Transformations that can be scheduled to automatically update your tables/ views/ calculated fields whenever new data is loaded into your Data Destinations.
Some useful data transformations are changing the data type of a column, renaming columns, performing lookup operations, finding the lowest/highest value in a column, replacing null values with a predefined value, creating new rows that consist of subtotals/aggregate values, transposing or pivoting, and mapping values in a column to other values. However, you should attempt transformations on your data only if you have a good knowledge and understanding of SQL queries.
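A few of the transformations listed above can be tried out locally. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in for a warehouse, and the `raw_orders` table and its columns are hypothetical; SQL syntax differs between warehouses, so treat the queries as an illustration rather than as statements to paste into DataChannel.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE raw_orders (id INTEGER, amt TEXT, region TEXT)")
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?)",
    [(1, "40.0", "north"), (2, "15.5", None), (3, "10.0", "north")],
)

# Rename a column (amt -> amount), change its data type (TEXT -> REAL),
# and replace null values with a predefined value.
conn.execute(
    """
    CREATE VIEW orders_clean AS
    SELECT id,
           CAST(amt AS REAL)           AS amount,
           COALESCE(region, 'unknown') AS region
    FROM raw_orders
    """
)

# Create aggregate rows: subtotal, lowest and highest value per region.
totals = conn.execute(
    """
    SELECT region,
           SUM(amount) AS total,
           MIN(amount) AS lowest,
           MAX(amount) AS highest
    FROM orders_clean
    GROUP BY region
    ORDER BY region
    """
).fetchall()
print(totals)
```

Defining the cleaned data as a view means the aggregate query always reads the current contents of `raw_orders`, which mirrors the idea of transformations that stay up to date as new data is loaded.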
What is Data Loading?
Data Loading involves writing, updating or overwriting the pulled data in the destination.
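The difference between writing, updating and overwriting can be sketched as follows. The snippet again uses Python's built-in `sqlite3` module as a stand-in destination; the `load` function, its `mode` names and the `customers` table are hypothetical, and the mapping of each mode to a SQL statement is one reasonable interpretation, not a description of DataChannel's loader.

```python
import sqlite3


def load(conn: sqlite3.Connection, rows, mode: str) -> None:
    """Load (id, name) rows into the hypothetical `customers` table."""
    if mode == "write":
        # Add new rows; existing rows are left untouched.
        sql = "INSERT INTO customers (id, name) VALUES (?, ?)"
    elif mode == "update":
        # Replace rows whose id already exists, insert the rest.
        sql = "INSERT OR REPLACE INTO customers (id, name) VALUES (?, ?)"
    elif mode == "overwrite":
        # Discard everything in the destination, then write the new rows.
        conn.execute("DELETE FROM customers")
        sql = "INSERT INTO customers (id, name) VALUES (?, ?)"
    else:
        raise ValueError(f"unknown load mode: {mode}")
    conn.executemany(sql, rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")

load(conn, [(1, "Ann"), (2, "Bob")], "write")
load(conn, [(2, "Bobby"), (3, "Cy")], "update")
after_update = conn.execute("SELECT id, name FROM customers ORDER BY id").fetchall()

load(conn, [(9, "Zed")], "overwrite")
after_overwrite = conn.execute("SELECT id, name FROM customers ORDER BY id").fetchall()

print(after_update)
print(after_overwrite)
```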
Still have Questions?
We’ll be happy to help you with any questions you might have! Send us an email at info@datachannel.co.