CloudThat's Blog
www.cloudthat.com

Shaik Munwar Basha

Step-by-Step Guide to Connect Azure Databricks to an Azure Storage Account

May 8, 2021

Azure Databricks is a fully managed, Platform-as-a-Service (PaaS) offering released on Feb 27, 2019. It leverages the Microsoft cloud to scale rapidly, host massive amounts of data effortlessly, and streamline workflows for better collaboration between business executives, data scientists, and engineers.

Azure Databricks is a "first party" Microsoft service, the result of a unique year-long collaboration between the Microsoft and Databricks teams to provide Databricks' Apache Spark-based analytics service as an integral part of the Microsoft Azure platform.

Azure Databricks uses the Azure Active Directory (AAD) security framework: existing credentials and their corresponding security settings can be reused, and access and identity control are handled in the same environment. Using AAD also allows easy integration with the entire Azure stack, including Data Lake Storage (as a data source or an output), SQL Data Warehouse, Blob Storage, and Azure Event Hubs.

You can use Blob storage to expose data publicly to the world or to store application data privately. For those of you familiar with Azure, Databricks is a premier alternative to Azure HDInsight and Azure Data Lake Analytics.


Connecting Azure Databricks to the Azure Storage Account

  • Create a storage account and create a private container in it.
  • Upload a blob file into the container; you can download the sample file from the given link: https://csg10032000aeaa88a0.blob.core.windows.net/datafile/employe_data.csv
  • Open the blob's context menu, click Generate SAS, and copy the generated SAS token; store it somewhere safe, as we will use it later.
  • Create an Azure Databricks workspace.
  • Click Create, select your subscription (if you have several), select or create a resource group, choose the region in which to create the Databricks workspace, and select a pricing tier.
  • Leave the remaining settings unchanged, click Review + Create, and wait for validation.
  • Once validation completes, click Create.
  • Once the deployment completes, click the Go to resource button.
  • Click Launch Workspace; it will redirect you to the Azure Databricks workspace.
  • Click Clusters in the left pane, then click Create Cluster; provide a cluster name, set Cluster Mode to Standard, keep the default configuration for the remaining options, and create the cluster.
  • Start your cluster and make sure it reaches a running state.
  • Click Workspace in the left pane, then right-click inside the workspace and choose Create -> Notebook.
  • Give the notebook a name, select Scala as the Default Language, select the cluster you created earlier, and click Create.
  • Paste the code below into the notebook to make the connection with your storage account.

    val containerName = "<Container Name>"
    val storageAccountName = "<Storage Account Name>"
    val sas = "<Generated SAS Token>"
    val config = "fs.azure.sas." + containerName + "." + storageAccountName + ".blob.core.windows.net"

    // Mount the container (not an individual blob) under /mnt/myfile.
    dbutils.fs.mount(
      source = "wasbs://" + containerName + "@" + storageAccountName + ".blob.core.windows.net/",
      mountPoint = "/mnt/myfile",
      extraConfigs = Map(config -> sas))

    // Read the uploaded CSV through the mount point.
    val mydf = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv("/mnt/myfile/employe_data.csv")
    display(mydf)
  • If you can fetch the data, you have successfully connected Azure Databricks to your storage account.
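The mount in the notebook hinges on two strings: the per-container SAS configuration key and the wasbs:// container URL. Here is a minimal, self-contained sketch of how they are assembled (the `WasbHelpers` object and the sample names are illustrative, not part of the Databricks API):

```scala
// Illustrative helpers (not a Databricks API) that build the two strings
// the mount call above relies on.
object WasbHelpers {
  // Per-container SAS configuration key for the WASB driver.
  def sasConfigKey(container: String, account: String): String =
    s"fs.azure.sas.$container.$account.blob.core.windows.net"

  // wasbs:// URL addressing the container root (note the trailing slash).
  def containerUrl(container: String, account: String): String =
    s"wasbs://$container@$account.blob.core.windows.net/"
}

println(WasbHelpers.sasConfigKey("datafile", "mystorageacct"))
// fs.azure.sas.datafile.mystorageacct.blob.core.windows.net
println(WasbHelpers.containerUrl("datafile", "mystorageacct"))
// wasbs://datafile@mystorageacct.blob.core.windows.net/
```

One practical note: mounting an already-mounted path throws an error on re-runs, so `dbutils.fs.mounts()` can be used to list active mounts and `dbutils.fs.unmount("/mnt/myfile")` to remove one before mounting again.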

Conclusion:

So far, we have covered creating an Azure Databricks workspace, creating a cluster and a notebook, and connecting a storage account to Databricks to access its data using Scala. Engineers who collaborate with business stakeholders to identify and meet data requirements, while designing and implementing the management, monitoring, security, and privacy of data using the full stack of Azure services, will benefit extensively from understanding Databricks.
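As a footnote to the walkthrough: mounting is convenient, but the same file can also be read without a mount by setting the SAS key directly on the Spark session configuration. This is a sketch using the same placeholder names as the notebook; it only runs inside a Databricks cluster (or a Spark session with the hadoop-azure connector on the classpath):

```scala
// Configuration sketch (placeholder names; requires a Databricks/Spark
// session with the hadoop-azure WASB connector): read the blob directly,
// without dbutils.fs.mount.
val containerName = "<Container Name>"
val storageAccountName = "<Storage Account Name>"
val sas = "<Generated SAS Token>"

// Register the SAS token for this container on the session.
spark.conf.set(
  s"fs.azure.sas.$containerName.$storageAccountName.blob.core.windows.net", sas)

// Read the CSV straight from its wasbs:// URL.
val df = spark.read
  .option("header", "true")
  .option("inferSchema", "true")
  .csv(s"wasbs://$containerName@$storageAccountName.blob.core.windows.net/employe_data.csv")
df.show(5)
```

A session-level SAS key applies only to the current Spark session, whereas a mount is visible to all clusters in the workspace until unmounted, which is worth keeping in mind when several users share a workspace.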

Join our online forum discussions and study groups to enrich your knowledge in pursuit of becoming a Data Science expert with DP-200 Exam: Implementing an Azure Data Solution. Here is a comprehensive study guide to help you crack the exam along with sample questions.



© 2022 CloudThat Technologies. All rights reserved.