Azure Synapse Link for Azure Cosmos DB
The Azure Synapse team recently released a powerful new add-in for Synapse Studio that makes online analytics simpler than ever.
Previously: You had to create a Spark instance to run Spark jobs. Also, to avoid overloading the operational database's resources, you had to move all the relevant data to a data lake.
Now: With Synapse Link for Cosmos DB you can run online analytics with the following advantages: You can create Spark notebooks directly from Cosmos DB collections/containers. Spark jobs run against an isolated copy of the container data (the analytical store), which helps you avoid overloading the operational database's resources. Also, you don’t need to create a separate Spark instance; you can run the jobs on the Synapse compute pool.
There is no need for an additional data transformation process, a separately provisioned Spark service, or other tools. Moreover, the analytics workload has no effect on database operations.
The new add-in is available for the Azure Cosmos DB SQL API and in preview for the MongoDB API.
For example, I’ve created a new collection, “drivers”, in Cosmos DB. This collection has the “Analytical Store” option enabled.
Then, via Synapse Studio, you can run Spark jobs on this collection. This is very useful for online analytics.
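As a rough sketch of what such a Spark job can look like, the snippet below reads the “drivers” collection from a Synapse notebook using the "cosmos.olap" format, which targets the analytical store rather than the operational data. The linked service name "CosmosDbLinkedService" and the helper function names are assumptions made for illustration; substitute whatever your workspace uses.

```python
# Minimal sketch: load a Cosmos DB container's analytical store in a Synapse Spark notebook.
# "CosmosDbLinkedService" is a hypothetical linked service name, not one from this post.

def analytical_store_options(linked_service, container):
    """Build the reader options that point Spark at a container's analytical store."""
    return {
        "spark.synapse.linkedService": linked_service,
        "spark.cosmos.container": container,
    }


def load_container(spark, linked_service, container):
    """Return a DataFrame backed by the analytical store of the given container."""
    reader = spark.read.format("cosmos.olap")
    for key, value in analytical_store_options(linked_service, container).items():
        reader = reader.option(key, value)
    return reader.load()


# In a Synapse notebook, where a `spark` session is already defined:
# drivers_df = load_container(spark, "CosmosDbLinkedService", "drivers")
# drivers_df.printSchema()
```

Keeping the options in a small helper makes it easy to reuse the same settings for other containers in the same account.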
Cost: The Cosmos DB analytical store used by Azure Synapse Link follows a consumption-based pricing model, which consists of job usage + compute size.
For more information, check out the Azure documentation.
Feel free to reach out to us with any Azure-related questions: [email protected]