Until very recently, if your company needed to share data with other branches, business partners or customers, you had to either produce a file of the data, build a REST API, or create a report and then send it somehow to the recipient. To make this data pipeline accessible and easy to use, your company had to spend a lot of money and effort building up the processes. And even when the data was flowing, it wouldn’t be anywhere near real time, and customization would again take a lot of time and investment.
Data sharing via flat files
When sharing data as files, the options are usually CSV, XML, JSON, Excel or some other flat file format. With the file approach you must first agree on what kind of data you are sending and which filters are used to reduce the amount of data in the files. Then you must build the data pipeline using SFTP, HTTPS or some other secure transfer protocol. On the other end of the pipeline there must be a matching endpoint that uses the same protocol to receive the data and save it to the recipient’s servers. From the file, the data is then usually parsed into a form that is usable in reports (most likely a database table). Finally, the customer builds their own reports over the data, so that the data is more visual and accessible for business users. It is a very long and error-prone way of sharing data.
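To make the receiving end of the file approach concrete, here is a minimal sketch of the parse-and-load step. It assumes the file has already arrived and uses an in-memory SQLite database in place of the recipient's real reporting database. The column names, the `orders` table and the region filter are hypothetical and exist only for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical file content as it might arrive over SFTP or HTTPS.
RECEIVED_CSV = """order_id,region,amount
1,north,120.50
2,south,80.00
3,north,42.25
"""


def load_csv_into_table(csv_text, connection, region):
    """Parse CSV text and insert only the rows matching the agreed region filter."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER, region TEXT, amount REAL)"
    )
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [
        (int(row["order_id"]), row["region"], float(row["amount"]))
        for row in reader
        if row["region"] == region
    ]
    connection.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    connection.commit()
    return len(rows)


# In-memory database standing in for the recipient's reporting database.
connection = sqlite3.connect(":memory:")
loaded = load_csv_into_table(RECEIVED_CSV, connection, region="north")
total = connection.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
print(loaded, total)
```

Even this small step depends on both parties agreeing on column names, types and filters in advance, which is one of the places where the file approach tends to break.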
Data sharing via API
A REST API is a somewhat more modern approach to the data sharing challenge. You create a software interface over your data and let customers connect on their own, whenever they need the data. The REST API can also include filters, so that the amount of data returned stays manageable. The challenge with REST APIs is that they are not well suited to large amounts of data, because moving data over internet connections takes time. It also takes money and effort to build the REST API software layer over the data, and working hours go into access control when granting access to REST API users. And again, at the consuming end, the data must be reformatted for the reporting tool of their choice. This is usually better than the file approach, but it is very time consuming and still not a good way to share large amounts of data.
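The filtering idea can be sketched without a real web framework. The example below only illustrates how query-string filters might narrow the data an endpoint returns. The `/orders` path, the `region` and `limit` parameters and the sample records are all assumptions made for this example, not part of any particular API.

```python
import json
from urllib.parse import parse_qs, urlparse

# Hypothetical data set sitting behind the API.
ORDERS = [
    {"order_id": 1, "region": "north", "amount": 120.5},
    {"order_id": 2, "region": "south", "amount": 80.0},
    {"order_id": 3, "region": "north", "amount": 42.25},
]


def handle_get(url):
    """Return a JSON body with the orders matching the query-string filters."""
    query = parse_qs(urlparse(url).query)
    # Both filters are optional; without them every record is returned.
    region = query.get("region", [None])[0]
    limit = int(query.get("limit", [len(ORDERS)])[0])
    matching = [o for o in ORDERS if region is None or o["region"] == region]
    return json.dumps(matching[:limit])


body = handle_get("/orders?region=north&limit=1")
print(body)
```

A production API would add authentication, paging and input validation on top of this, which is where much of the building effort described above goes.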
Data sharing via reports
Data can be shared as reports by simply emailing them to the recipient as PDFs or in other report formats. This takes minimal effort, because most reporting tools have this functionality built in. This approach limits the receiver to the report layout predefined by the sending party, and the amount of data is likewise limited (filtered) by the sender. Another way to share reports is to give other parties access to the reporting server portal, so that partners and customers can filter and select the data they want to see at that time. This second approach also takes some working hours for access control, and the reports are usually fairly fixed, so other parties cannot create their own. For changes, the other parties must ask the data owner to modify the reports, or the data owner must in turn ask a third party to make the changes. Changes usually take a while to become visible to end users. If end users want to combine the report data with their own data, consuming reports is not a viable option. In addition to the reports, another data sharing option is then needed (one of the options presented above: flat file or REST API).
Modern data sharing
A modern and, so far, the best answer to data sharing challenges is Snowflake’s data sharing feature. Currently, if you and your branches, partners or customers are using Snowflake in the same region, it is possible to share database tables, views or user-defined functions (UDFs) from account to account. One share provider can have multiple share consumers. On the data consumer side, a database is created from the share, so the consumer can grant their own users access to the data as they wish.
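As a rough sketch of what setting up a share involves, the Python helpers below only assemble the SQL statements that a provider and a consumer account would typically run. They do not connect to Snowflake. The statements follow Snowflake's documented share syntax as far as I know, but check them against the current documentation before use. All object, account and role names are hypothetical placeholders.

```python
def provider_statements(share, database, schema, table, consumer_account):
    """SQL a provider account would run to expose one table through a share."""
    return [
        f"CREATE SHARE {share}",
        f"GRANT USAGE ON DATABASE {database} TO SHARE {share}",
        f"GRANT USAGE ON SCHEMA {database}.{schema} TO SHARE {share}",
        f"GRANT SELECT ON TABLE {database}.{schema}.{table} TO SHARE {share}",
        f"ALTER SHARE {share} ADD ACCOUNTS = {consumer_account}",
    ]


def consumer_statements(share, provider_account, local_database, role):
    """SQL a consumer account would run to create a database from the share."""
    return [
        f"CREATE DATABASE {local_database} FROM SHARE {provider_account}.{share}",
        f"GRANT IMPORTED PRIVILEGES ON DATABASE {local_database} TO ROLE {role}",
    ]


# Hypothetical names, for illustration only.
provider_sql = provider_statements(
    "sales_share", "sales_db", "public", "orders", "partner_account"
)
consumer_sql = consumer_statements(
    "sales_share", "provider_account", "shared_sales", "analyst"
)
for statement in provider_sql + consumer_sql:
    print(statement + ";")
```

The consumer side needs only two statements: one to create the database from the share, and one to let a local role query it.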
When you share Snowflake data, no actual data is copied or transferred between accounts. Your data consumer only gets a read-only view of the data, so the original data cannot be deleted or modified through the share.
Snowflake costs are based on queries and storage. With a share, data consumers don’t pay storage costs at all, only for the queries they run against the data. The exception is when data consumers copy the data into their own databases, e.g. to join it with their own data; then they also pay for the storage of the copied data.
There is no limit to the number of shares you can create in your account, so the data provider can create different shares for different kinds of data. The one limitation is that only one database can be created per share on the data consumer side.
Currently, sharing between Snowflake accounts works only if the accounts are in the same cloud and in the same region within that cloud. On Snowflake’s development roadmap, there is an effort to enable sharing data between regions and even from one cloud provider to another, but the schedule for these features is still open.