Welcome to our Microsoft Azure Data on Cloud Bootcamp, where we’re dedicated to equipping you with the practical skills needed to excel in today’s job market.
In this blog, we’ll dive into a series of Hands-on Labs and Projects meticulously designed to prepare you for real-world challenges and job interviews. Throughout this journey, you’ll explore a variety of topics essential for mastering data management and analysis on the Azure cloud platform. Our bootcamp offers a structured learning path to cultivate job-ready skills, from foundational concepts to advanced techniques.
Let’s embark on this transformative experience together, unlocking the doors to endless opportunities in the realm of data engineering, data science, and analytics.
This post supports your Data on Microsoft Azure journey, whether you are learning at your own pace or with your team. There are 50+ Hands-On Labs in this course.
1. Hands-On Labs For Data on Azure Cloud
1.1 Azure Basics labs
- Register For A Free Microsoft Azure Cloud Account
- Switch to pay as you go account
- Create billing alert using Azure portal
1.2 Azure Data Analyst
- Prepare Data in Power BI Desktop
- Load Data in Power BI Desktop
- Model Data in Power BI Desktop
- Create DAX Calculations in Power BI Desktop – Part 1
- Create DAX Calculations in Power BI Desktop – Part 2
- Design a Report in Power BI Desktop – Part 1
- Design a Report in Power BI Desktop – Part 2
- Create a Power BI Dashboard
- Perform Data Analysis in Power BI Desktop
- Enforce Row-Level Security
1.3 Azure Data Engineer
- Explore Azure Synapse Analytics
- Query Files using a Serverless SQL Pool
- Transform files using a serverless SQL pool
- Analyze data in a lake database
- Analyze data in a data lake with Spark
- Transform data using Spark in Synapse Analytics
- Use Delta Lake with Spark in Azure Synapse Analytics
- Explore a relational data warehouse
- Load Data into a Relational Data Warehouse
- Build a data pipeline in Azure Synapse Analytics
- Use an Apache Spark notebook in a pipeline
- Use Azure Synapse Link for Azure Cosmos DB
- Use Azure Synapse Link for SQL
- Get started with Azure Stream Analytics
- Ingest real-time data with Azure Stream Analytics and Azure Synapse Analytics
- Create a real-time report with Azure Stream Analytics and Microsoft Power BI
- Use Microsoft Purview with Azure Synapse Analytics
- Explore Azure Databricks
- Use Spark in Azure Databricks
- Use Delta Lake in Azure Databricks
- Use a SQL Warehouse in Azure Databricks
- Automate an Azure Databricks Notebook with Azure Data Factory
1.4 Azure Data Scientist
- Explore the Azure Machine Learning Workspace
- Explore developer tools for Workspace interaction
- Make Data available in Azure Machine Learning
- Work with Compute resources in Azure Machine Learning
- Work with environments in Azure Machine Learning
- Train a model with the Azure Machine Learning Designer
- Find the best classification model with Automated Machine Learning
- Track model training in notebooks with MLflow
- Run a training script as a command job in Azure Machine Learning
- Use MLflow to track training jobs
- Perform Hyperparameter Tuning with a Sweep job
- Run pipelines in Azure Machine Learning
- Create and explore the Responsible AI dashboard
- Log and register models with MLflow
- Deploy a model to a batch endpoint
- Deploy a model to a managed online endpoint
2. Real-Time Projects
- Project 1: Design a dashboard with a basic set of visualizations and DAX queries
- Project 2: Big Data Visualization Project
- Project 3: Transform Data by Using Azure Data Factory
- Project 4: Tokyo Olympics Insights
Download Data on Azure Cloud Bootcamp Brochure
1.1 Azure Basics labs
1) Register For A Free Microsoft Azure Cloud Account
Creating a Microsoft Azure free account is one way to access Azure services. When you start using Azure with a free account, you get a USD 200 credit to spend in the first 30 days after you sign up. In addition, you get free monthly amounts of two groups of services: popular services, which are free for 12 months, and more than 25 other services that are always free.
2) Switch to a Pay-As-You-Go account
The pay-as-you-go subscription model is a pricing strategy where customers pay only for the resources they use. This is in contrast to traditional subscription models, where customers are charged a fixed monthly or annual fee regardless of their usage.
The PAYG model is becoming increasingly popular for a number of reasons. First, it gives customers more flexibility and control over their spending. Customers can scale their usage up or down as needed, and they only pay for what they use.
Second, the PAYG model can help businesses to reduce their upfront costs. Businesses do not need to commit to a long-term contract or purchase a large upfront license fee. This can free up capital for other business needs.
3) Create billing alert using Azure portal
Azure billing alerts allow you to monitor your Azure spending and receive notifications when your spending exceeds a certain threshold. This can help you to avoid unexpected costs and stay within your budget.
Billing alerts are based on the following concepts:
- Threshold: A threshold is a value that you specify. When your Azure spending exceeds the threshold, the alert will be triggered.
- Frequency: The frequency is how often the alert condition is evaluated. You can choose to evaluate the alert condition every hour, day, week, or month.
- Alert actions: Alert actions are the steps that are taken when the alert is triggered. You can choose to send an email notification, create an Azure Monitor alert, or call a webhook.
1.2 Azure Data Analyst
1) Prepare Data In Power BI Desktop
Before you create reports in Power BI, you first need to extract data from your data sources. Power BI Desktop allows you to get data from many different types of files. When you click the Get data feature in Power BI, you can see a list of the available options from which you can import your data.
In this lab, we will focus on the first step: getting the data from many different data sources and importing it into Power BI by using Power Query.
2) Load Data In Power BI Desktop
Consider the scenario where you’ve imported data into Power BI from several different sources and, when you examine the loaded data, it is not well prepared for analysis. What could make the data unprepared for analysis?
Power BI and Power Query come with a powerful environment to clean the raw data and prepare the data. In this lab, you will learn how to transform raw data with Power Query Editor in Power BI Desktop.
3) Model Data In Power BI Desktop, Part 1
In this lab, you’ll start developing the data model. This involves creating relationships between tables and then configuring table and column properties to improve the friendliness and usability of the data model. You’ll also create hierarchies and quick measures.
You’ll also create a many-to-many relationship between the Sales table and the Salesperson table, and implement row-level security to ensure that a salesperson can only analyze sales data for their assigned region(s).
4) Create DAX Calculations In Power BI Desktop, Part 1
DAX (Data Analysis Expressions) is a programming language used throughout Power BI to create calculated columns, measures, and custom tables. It is a collection of operators, functions, and constants that can be used in an expression, or formula, to calculate and return one or more values.
In this lab, you’ll create calculated columns, calculated tables, and simple measures using Data Analysis Expressions (DAX).
5) Create DAX Calculations In Power BI Desktop, Part 2
In this lab, you’ll create measures with Data Analysis Expressions involving filter context manipulation and you’ll use Time Intelligence functions.
6) Design A Report In Power BI Desktop – Part 1
Power BI visuals are attractive graphics and charts that you can use to present your data. Visuals let you share data insights more effectively and improve retention, comprehension, and appeal. After you’ve loaded and modeled your data in Power BI Desktop, you are ready to start creating your reports.
In this lab, you’ll create a three-page report and then you’ll publish it to Power BI, where you can easily open and interact with the report.
7) Design A Report In Power BI Desktop, Part 2
In this lab, you’ll enhance the Sales Analysis report with advanced interactions and drillthrough features. You’ll learn how to work with sync slicers and drillthrough pages, and you’ll add bookmarks and buttons to your reports.
8) Create A Power BI Dashboard
Power BI reports and Power BI dashboards are not the same. Dashboards allow report consumers to create a single output of directed data that is personalized just for them.
Dashboards can be composed of visuals pinned from different reports: while a report uses data from a single dataset, a dashboard can contain visuals from many different datasets.
In this lab, you’ll create the Sales Monitoring dashboard.
9) Perform Data Analysis In Power BI Desktop
In this lab, you’ll use the AI-powered advanced analytic capabilities of Power BI to enhance your reports. You’ll create a forecast to estimate possible future sales revenue, build a decomposition tree, and use the Key influencers AI visual to determine what influences profitability.
10) Enforce Row-level Security
In this lab, you will enforce row-level security to ensure that a salesperson can only ever see sales made in their assigned region(s).
USERPRINCIPALNAME() is a Data Analysis Expressions (DAX) function that returns the name of the authenticated user. This means the Salesperson (Performance) table is filtered by the User Principal Name (UPN) of the user querying the model; for example, a security role can apply a row filter such as [UPN] = USERPRINCIPALNAME() to that table.
This is the list of activity guides/hands-on labs required to prepare for the PL-300 Microsoft Power BI Data Analyst exam.
1.3 Azure Data Engineer
1) Explore Azure Synapse Analytics
In the competitive world of retail, staying ahead of the curve is essential. As the Sr. Data Engineer of a growing retail company called “Fashion Forward,” you’re faced with the challenge of making data-driven decisions in a rapidly changing market. To meet this challenge, you decide to explore Azure Synapse Analytics, a promising solution in the Microsoft Azure ecosystem.
2) Query Files using a Serverless SQL Pool
In today’s data-driven business landscape, organizations rely heavily on data analysis to make informed decisions. As the Head of Data Analytics at “TechCo,” a rapidly expanding technology company, you are tasked with finding an efficient way to analyze large volumes of data without the complexities of managing infrastructure. To address this challenge, you decide to leverage Serverless SQL Pools within Microsoft Azure Synapse Analytics for querying files and extracting valuable insights.
3) Transform files using a serverless SQL pool
In today’s data-centric business environment, organizations often face the challenge of efficiently transforming large volumes of data to extract valuable insights. As the Director of Data Engineering at “DataTech Solutions,” a data-focused company, you are tasked with finding a scalable and cost-effective solution for data transformation. To meet this challenge, you decide to utilize a serverless SQL pool within Azure Synapse Analytics for transforming files and enhancing data quality.
4) Analyze data in a lake database
In today’s data-driven business environment, organizations are continually looking for ways to harness the power of their data for insights and decision-making. As the Chief Analytics Officer (CAO) of a global retail conglomerate called “RetailX,” you recognize the importance of analyzing vast amounts of data stored in your data lake database. To derive valuable insights and drive data-centric strategies, you embark on a journey to analyze data in your data lake database.
5) Analyze data in a data lake with Spark
In today’s data-centric business landscape, organizations are continuously seeking ways to unlock the insights hidden within their vast repositories of data. As the Chief Data Officer (CDO) of “DataInsights Corp,” a leading data analytics company, you recognize the importance of harnessing the power of big data technologies. To extract valuable insights and drive data-centric strategies, you embark on a journey to analyze data in your data lake using Apache Spark.
6) Transform data using Spark in Synapse Analytics
Data engineers often use Spark notebooks as one of their preferred tools to perform extract, transform, and load (ETL) or extract, load, and transform (ELT) activities that transform data from one format or structure to another.
As the Chief Data Officer (CDO) of “TechSolutions Inc.,” a prominent technology firm, you recognize the importance of efficient data transformation. To address this need, you embark on a journey to transform data using Apache Spark within Microsoft Azure Synapse Analytics, a powerful platform for data integration and analytics.
In this exercise, you’ll use a Spark notebook in Azure Synapse Analytics to transform data in files.
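To make this more concrete, here is a minimal PySpark sketch of the kind of file transformation the exercise walks through. The storage path and column names are hypothetical placeholders, and in a Synapse notebook the Spark session is already provided for you.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, to_date, year, month

# In a Synapse notebook the session already exists; this line keeps the sketch self-contained.
spark = SparkSession.builder.appName("transform-order-files").getOrCreate()

# Read raw CSV files from the data lake (hypothetical storage account and path)
orders = spark.read.load(
    "abfss://files@mydatalake.dfs.core.windows.net/sales/orders/*.csv",
    format="csv",
    header=True,
    inferSchema=True,
)

# Derive Year and Month columns from the order date, then drop the original column
transformed = (
    orders
    .withColumn("OrderDate", to_date(col("OrderDate")))
    .withColumn("Year", year(col("OrderDate")))
    .withColumn("Month", month(col("OrderDate")))
    .drop("OrderDate")
)

# Write the result back to the lake as Parquet, partitioned for efficient querying
transformed.write.mode("overwrite").partitionBy("Year", "Month").parquet(
    "abfss://files@mydatalake.dfs.core.windows.net/sales/transformed_orders"
)
```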
7) Use Delta Lake with Spark in Microsoft Azure Synapse Analytics
In the era of data-driven decision-making, companies are constantly seeking ways to improve data processing, storage, and analytics. As the Chief Data Officer (CDO) of a global financial institution named “FinTechCorp,” you face the challenge of managing vast volumes of financial data securely and efficiently. To address these needs, you decide to leverage Delta Lake with Spark within Azure Synapse Analytics, a powerful combination for modern data processing.
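As a rough illustration of what working with Delta Lake looks like, the sketch below saves a small DataFrame as a Delta table, registers it in the metastore, and reads an earlier version back with time travel. The path and table name are hypothetical, and it assumes a Spark pool (such as the ones in Synapse or Databricks) where Delta Lake support is already available.

```python
from pyspark.sql import SparkSession

# Assumes a Spark environment with Delta Lake support (e.g. a Synapse or Databricks Spark pool).
spark = SparkSession.builder.appName("delta-lake-sketch").getOrCreate()

delta_path = "abfss://files@mydatalake.dfs.core.windows.net/delta/products"  # hypothetical path

# Save a DataFrame in Delta format
products = spark.createDataFrame(
    [(1, "Mountain Bike", 1295.00), (2, "Road Bike", 1599.00)],
    ["ProductID", "ProductName", "ListPrice"],
)
products.write.format("delta").mode("overwrite").save(delta_path)

# Register the Delta files as a catalog table so they can also be queried with SQL
spark.sql(f"CREATE TABLE IF NOT EXISTS products USING DELTA LOCATION '{delta_path}'")

# Read the current data, and use time travel to read the table as it was at version 0
current = spark.read.format("delta").load(delta_path)
version_zero = spark.read.format("delta").option("versionAsOf", 0).load(delta_path)
current.show()
```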
8) Explore a relational data warehouse
Azure Synapse Analytics is built on a scalable set of capabilities that support enterprise data warehousing, including file-based data analytics in a data lake, large-scale relational data warehouses, and the data transfer and transformation pipelines used to load them.
As the Data Engineer of “DataTech Enterprises,” a leading data-driven company, you recognize the importance of exploring a relational data warehouse to unlock valuable insights for informed decision-making and strategic planning.
In this lab, you’ll explore how to use a dedicated SQL pool in Microsoft Azure Synapse Analytics to store and query data in a relational data warehouse.
9) Load Data into a Relational Data Warehouse
Extract, Load, and Transform (ELT) is a process by which data is extracted from a source system, loaded into a dedicated SQL pool, and then transformed.
To centralize and manage data effectively, you initiate a project to load data into a relational data warehouse for comprehensive analytics.
In this exercise, you’re going to load data into a dedicated SQL Pool.
10) Build a data pipeline in Microsoft Azure Synapse Analytics
In this exercise, you’ll load data into a dedicated SQL pool using a pipeline in Azure Synapse Analytics. The pipeline will encapsulate a data flow that loads product data into a table in a data warehouse.
11) Use an Apache Spark notebook in a pipeline
The Synapse Notebook activity enables you to run data processing code in Spark notebooks as a task in a pipeline, making it possible to automate big data processing and integrate it into extract, transform, and load (ETL) workloads.
In this exercise, we’re going to create an Azure Synapse Analytics pipeline that includes an activity to run an Apache Spark notebook.
12) Use Microsoft Azure Synapse Link for Azure Cosmos DB
As a Sr. Data Engineer of “CloudData Enterprises,” a forward-thinking technology company, you understand the importance of real-time data analytics. To meet this need, you initiate a project to leverage Azure Synapse Link for Azure Cosmos DB, a powerful solution for enhancing real-time analytics and decision-making.
13) Use Azure Synapse Link for SQL
As the Data Engineer of “DataXcellence Corp,” a cutting-edge data-focused company, you recognize the importance of real-time analytics. To meet this need, you initiate a project to leverage Azure Synapse Link for SQL, a powerful solution that enables real-time access to data stored in Azure Synapse Analytics.
14) Get started with Microsoft Azure Stream Analytics
In today’s fast-paced business environment, organizations need to harness the power of real-time data processing to gain actionable insights and respond swiftly to changing conditions. As the Data Engineer of “StreamTech Innovations,” a forward-thinking technology company, you recognize the importance of real-time analytics. To stay ahead in the market and drive data-driven strategies, you initiate a project to get started with Azure Stream Analytics, a powerful platform for processing and analyzing streaming data.
15) Ingest real-time data with Microsoft Azure Stream Analytics and Microsoft Azure Synapse Analytics
In this exercise, you’ll use Azure Stream Analytics to process a stream of sales order data, such as might be generated from an online retail application. The order data will be sent to Azure Event Hubs, from where your Azure Stream Analytics job will read the data and ingest it into Azure Synapse Analytics.
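For context, the order events typically reach Event Hubs from a small client application. Below is a hedged Python sketch of such a producer using the azure-eventhub package; the connection string, hub name, and order fields are placeholders, and the Stream Analytics job would then read from this hub.

```python
import json
from azure.eventhub import EventHubProducerClient, EventData

# Placeholder connection string and event hub name
producer = EventHubProducerClient.from_connection_string(
    conn_str="<event-hubs-namespace-connection-string>",
    eventhub_name="orders",
)

# A hypothetical sales order event
order = {"OrderId": 1001, "ProductId": 14, "Quantity": 2, "UnitPrice": 19.99}

# Send the event as a single-item batch
with producer:
    batch = producer.create_batch()
    batch.add(EventData(json.dumps(order)))
    producer.send_batch(batch)
```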
16) Create a real-time report with Azure Stream Analytics and Microsoft Power BI
In this exercise, you’ll use Azure Stream Analytics to process a stream of sales order data, such as might be generated from an online retail application. The order data will be sent to Azure Event Hubs, from where your Azure Stream Analytics job will read and summarize the data before sending it to Power BI, where you will visualize the data in a report.
17) Use Microsoft Purview with Microsoft Azure Synapse Analytics
In today’s data-driven business landscape, organizations are constantly seeking ways to govern the growing volumes of data that flow through their analytics solutions. As the Data Engineer of “DataTech Solutions,” a prominent data-focused company, you recognize the importance of knowing where your data comes from and how it is transformed. To meet this need, you initiate a project to use Microsoft Purview with Azure Synapse Analytics, cataloging your data assets and tracking the lineage of data as it moves through your pipelines.
18) Explore Azure Databricks
In today’s data-driven business landscape, organizations are constantly seeking ways to gain deeper insights from their data. As the Data Engineer of “TechInsights,” a dynamic technology company, you recognize the need for a robust platform to analyze and derive value from diverse datasets. To address this need, you decide to explore Azure Databricks, a powerful analytics platform that combines the best of Apache Spark with Azure cloud capabilities.
19) Use Spark in Azure Databricks
In today’s data-driven landscape, organizations are continually seeking ways to unlock the insights hidden within their massive datasets. As the Data Engineer of “DataTech Enterprises,” a dynamic data-focused company, you recognize the importance of leveraging cutting-edge technologies for data analytics. To meet this demand, you decide to utilize Apache Spark within Azure Databricks, a high-performance analytics platform, to empower your data analytics team.
20) Use Delta Lake in Azure Databricks
In today’s data-driven world, organizations are continually seeking ways to streamline data management, improve data quality, and enhance analytics capabilities. As the Chief Technology Officer (CTO) of a fast-growing e-commerce company named “ShopifyX,” you’re faced with the challenge of managing and analyzing diverse data sources. To address these challenges, you decide to implement Delta Lake within Azure Databricks, a powerful combination for data lake management and analytics.
21) Use a SQL Warehouse in Azure Databricks
In today’s data-driven business landscape, organizations need to make informed decisions quickly to stay competitive and responsive. As the Chief Data Officer (CDO) of “DataInsights Corp,” a forward-thinking data-driven company, you recognize the importance of giving analysts a simple, SQL-based way to query data. To empower your organization with self-service analytics, you initiate a project to use a SQL Warehouse in Azure Databricks, enabling analysts to run SQL queries against data in the lakehouse and build reports on the results.
22) Automate an Azure Databricks Notebook with Azure Data Factory
In today’s data-centric world, organizations rely on efficient data processing to extract insights and drive informed decision-making. As the Chief Data Officer (CDO) of “DataOps Solutions,” an innovative data-focused company, you understand the importance of automating data workflows. To enhance productivity and enable seamless data processing, you initiate a project to automate an Azure Databricks Notebook with Azure Data Factory, streamlining data transformation and analysis.
1.4 Azure Data Scientist
1) Explore the Microsoft Azure Machine Learning Workspace
The workspace is the first resource you create for Azure Machine Learning. It provides a centralized place to work with all the assets you create when you use Microsoft Azure Machine Learning, such as data, compute, model training code, logged metrics, and trained models. The workspace keeps a history of all training runs so you can pick the best model.
2) Explore developer tools for Workspace interaction
A developer tasked with integrating their application with an AML workspace must explore tools for provisioning resources, submitting and monitoring jobs, managing data and models, automating workflows, and extending functionality.
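One of those developer tools is the Azure Machine Learning Python SDK (v2). Below is a minimal, hedged sketch of connecting to a workspace with it; the subscription, resource group, and workspace names are placeholders you would replace with your own.

```python
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

# Connect to an existing workspace (placeholder identifiers)
ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<workspace-name>",
)

# A quick sanity check: list the compute targets in the workspace
for compute in ml_client.compute.list():
    print(compute.name, compute.type)
```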
Check out: Overview of Azure Machine Learning Service
3) Make Data available in Azure Machine Learning
As a data scientist working at a company that is new to machine learning, you have been tasked with making your data available in Azure Machine Learning (AML). This involves creating and configuring datastores within your AML workspace, which will enable you to store and manage your data in a secure and scalable manner.
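One way to make data available is to register it as a data asset with the Python SDK (v2). The sketch below registers a local CSV file as a URI file asset; the file path, asset name, and workspace identifiers are hypothetical.

```python
from azure.ai.ml import MLClient
from azure.ai.ml.entities import Data
from azure.ai.ml.constants import AssetTypes
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    DefaultAzureCredential(), "<subscription-id>", "<resource-group>", "<workspace-name>"
)

# Register a local CSV file as a URI file data asset (hypothetical path and name)
diabetes_data = Data(
    path="./data/diabetes.csv",
    type=AssetTypes.URI_FILE,
    name="diabetes-data",
    description="Training data registered as a data asset",
)
ml_client.data.create_or_update(diabetes_data)
```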
4) Work with Compute resources in Azure Machine Learning
In this exercise, you’ll learn how to use cloud compute in Microsoft Azure Machine Learning to run experiments and production code at scale.
5) Work with environments in Azure Machine Learning
To run notebooks and scripts, you must ensure that the required packages are installed. Environments allow you to specify the runtimes and Python packages that must be used by your compute to run your code.
In this exercise, you will learn about environments and how to use them when training machine learning models with Azure Machine Learning compute.
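As a rough sketch of how a custom environment is defined with the Python SDK (v2): the base image shown is one of the Azure ML base images, and the conda file path is a placeholder for a specification listing your required packages.

```python
from azure.ai.ml import MLClient
from azure.ai.ml.entities import Environment
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    DefaultAzureCredential(), "<subscription-id>", "<resource-group>", "<workspace-name>"
)

# Define a custom environment from a base Docker image plus a conda specification
sklearn_env = Environment(
    name="sklearn-env",
    description="Environment for scikit-learn training scripts",
    image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04",  # example base image
    conda_file="./conda-env.yml",  # placeholder conda spec (python, pip, scikit-learn, etc.)
)
ml_client.environments.create_or_update(sklearn_env)
```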
Read more: MLOps is based on DevOps principles and practices that increase the efficiency of workflows and improve the quality and consistency of machine learning solutions.
6) Train a model with the Microsoft Azure Machine Learning Designer
Azure Machine Learning Designer provides a drag-and-drop interface with which you can define a workflow. You can create a workflow to train a model, testing and comparing multiple algorithms with ease.
In this exercise, you’ll use the Designer to quickly train and compare two classification algorithms.
7) Find the best classification model with Automated Machine Learning
AutoML allows you to try multiple preprocessing transformations and algorithms with your data to find the best machine learning model.
In this exercise, you’ll use automated machine learning to determine the optimal algorithm and preprocessing steps for a model by performing multiple training runs in parallel.
8) Track model training in notebooks with MLflow
To track your work and keep an overview of the models you train and how they perform, you can use MLflow tracking.
In this exercise, you’ll use MLflow within a notebook running on a compute instance to log model training.
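A minimal sketch of this pattern is shown below, assuming the notebook runs on an Azure ML compute instance where the MLflow tracking URI already points at the workspace; the experiment name, parameter, and metric value are illustrative.

```python
import mlflow

# On an Azure ML compute instance the tracking URI already points at the workspace
mlflow.set_experiment("diabetes-training")

with mlflow.start_run():
    mlflow.log_param("regularization_rate", 0.1)
    # ... train a model here ...
    accuracy = 0.85  # placeholder for a metric computed from your trained model
    mlflow.log_metric("accuracy", accuracy)
```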
9) Run a training script as a command job in Azure Machine Learning
A notebook is ideal for experimentation and development. Once you’ve developed a machine learning model and it’s ready for production, you’ll want to train it with a script. You can run a script as a command job.
In this exercise, you’ll test a script and then run it as a command job.
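A hedged sketch of submitting such a command job with the Python SDK (v2) is shown below; the script folder, compute cluster, and curated environment name are assumptions you would adapt to your workspace.

```python
from azure.ai.ml import MLClient, command
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    DefaultAzureCredential(), "<subscription-id>", "<resource-group>", "<workspace-name>"
)

# Run ./src/train.py as a command job on a compute cluster (placeholder names)
job = command(
    code="./src",
    command="python train.py --reg_rate 0.01",
    environment="AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest",  # example curated environment
    compute="aml-cluster",
    display_name="train-diabetes-model",
    experiment_name="diabetes-training",
)
returned_job = ml_client.create_or_update(job)
print(returned_job.name)
```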
Read more about datastores and datasets in our blog: Working With Azure Datastores and Datasets.
10) Use MLflow to track training jobs
MLflow is an open-source platform for managing the end-to-end machine learning lifecycle. MLflow Tracking is a component that logs and tracks your training job metrics, parameters and model artifacts.
In this exercise, you’ll use MLflow to track a model training run as a command job.
11) Perform Hyperparameter Tuning with a Sweep job
Hyperparameters are variables that affect how a model is trained, but which can’t be derived from the training data. Choosing the optimal hyperparameter values for model training can be difficult, and usually involves a great deal of trial and error.
In this exercise, you’ll use Microsoft Azure Machine Learning to tune hyperparameters by performing multiple training trials in parallel.
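The sketch below shows the general shape of a sweep job in the Python SDK (v2): a command job with a tunable input is turned into a sweep over a small search space. The script, compute, environment, and metric names are placeholders; the primary metric must match a metric your training script actually logs.

```python
from azure.ai.ml import MLClient, command
from azure.ai.ml.sweep import Choice
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    DefaultAzureCredential(), "<subscription-id>", "<resource-group>", "<workspace-name>"
)

# Base command job with a tunable input (placeholder script and environment)
job = command(
    code="./src",
    command="python train.py --reg_rate ${{inputs.reg_rate}}",
    inputs={"reg_rate": 0.01},
    environment="AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest",
    compute="aml-cluster",
)

# Replace the fixed value with a search space and configure the sweep
job_for_sweep = job(reg_rate=Choice(values=[0.01, 0.1, 1.0]))
sweep_job = job_for_sweep.sweep(
    compute="aml-cluster",
    sampling_algorithm="grid",
    primary_metric="Accuracy",  # must match a metric logged by train.py
    goal="Maximize",
)
sweep_job.set_limits(max_total_trials=6, max_concurrent_trials=2, timeout=7200)
ml_client.create_or_update(sweep_job)
```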
12) Run pipelines in Microsoft Azure Machine Learning
You can use the Python SDK to perform all of the tasks required to create and operate a machine-learning solution in Azure. Rather than perform these tasks individually, you can use pipelines to orchestrate the steps required to prepare data, run training scripts, and other tasks.
In this exercise, you’ll run multiple scripts as a pipeline job.
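As a rough sketch, a pipeline in the Python SDK (v2) chains components together with the @pipeline decorator. The component YAML files, their input and output names, the data asset, and the compute cluster below are all hypothetical.

```python
from azure.ai.ml import MLClient, Input, load_component
from azure.ai.ml.dsl import pipeline
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    DefaultAzureCredential(), "<subscription-id>", "<resource-group>", "<workspace-name>"
)

# Load two components defined in YAML (hypothetical files and port names)
prep_data = load_component(source="./prep_data.yml")
train_model = load_component(source="./train_model.yml")

@pipeline()
def diabetes_pipeline(pipeline_input_data):
    prep_step = prep_data(input_data=pipeline_input_data)
    train_step = train_model(training_data=prep_step.outputs.output_data)
    return {"trained_model": train_step.outputs.model_output}

# Build the pipeline job from a registered data asset and submit it
pipeline_job = diabetes_pipeline(Input(type="uri_file", path="azureml:diabetes-data:1"))
pipeline_job.settings.default_compute = "aml-cluster"
ml_client.jobs.create_or_update(pipeline_job, experiment_name="diabetes-pipeline")
```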
13) Create and explore the Responsible AI dashboard
After you train your model, you’ll want to evaluate your model to explore whether it’s performing as expected. Next to performance metrics, there are other factors you can take into consideration. The responsible AI dashboard in Azure Machine Learning allows you to analyze the data and the model’s predictions to identify any bias or unfairness.
In this exercise, you’ll prepare your data and create a responsible AI dashboard in Azure Machine Learning.
14) Log and register models with MLflow
MLflow is an open-source platform for managing the end-to-end machine learning lifecycle. When you log models with MLflow, you can easily move the model across platforms and workloads.
In this exercise, you’ll use MLflow to log machine learning models.
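A minimal, self-contained sketch of logging and registering a scikit-learn model with MLflow is shown below; the model, data, and registered model name are illustrative, and registering assumes the tracking URI points at a backend that supports the model registry (for example, an Azure ML workspace).

```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

with mlflow.start_run() as run:
    model = LogisticRegression(max_iter=200).fit(X, y)
    # Log the trained model as an MLflow model artifact
    mlflow.sklearn.log_model(model, artifact_path="model")

# Register the logged model so it gets a name and version in the model registry
mlflow.register_model(f"runs:/{run.info.run_id}/model", "iris-classifier")
```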
15) Deploy a model to a batch endpoint
In many scenarios, inferencing is performed as a batch process that uses a predictive model to score a large number of cases. To implement this kind of inferencing solution in Azure Machine Learning, you can create a batch endpoint.
In this exercise, you’ll deploy an MLflow model to a batch endpoint, and test it on sample data by submitting a job.
16) Deploy a model to a managed online endpoint
To consume a model in an application, and get real-time predictions, you’ll want to deploy the model to a managed online endpoint. An MLflow model is easily deployed since you won’t need to define the environment or create the scoring script.
In this exercise, you’ll deploy an MLflow model to a managed online endpoint, and test it on sample data.
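A hedged sketch with the Python SDK (v2) is shown below: it creates a managed online endpoint and deploys a registered MLflow model to it. The endpoint name (which must be unique in its region), model reference, VM size, and workspace identifiers are placeholders.

```python
from azure.ai.ml import MLClient
from azure.ai.ml.entities import ManagedOnlineEndpoint, ManagedOnlineDeployment
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    DefaultAzureCredential(), "<subscription-id>", "<resource-group>", "<workspace-name>"
)

# Create the endpoint (name must be unique within the Azure region)
endpoint = ManagedOnlineEndpoint(name="diabetes-endpoint-demo", auth_mode="key")
ml_client.online_endpoints.begin_create_or_update(endpoint).result()

# Deploy a registered MLflow model; no scoring script or environment is needed for MLflow models
deployment = ManagedOnlineDeployment(
    name="blue",
    endpoint_name="diabetes-endpoint-demo",
    model="azureml:diabetes-model:1",   # placeholder registered model
    instance_type="Standard_DS2_v2",    # choose a supported VM SKU
    instance_count=1,
)
ml_client.online_deployments.begin_create_or_update(deployment).result()
```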
1. Design a dashboard with a basic set of visualizations and DAX queries:
In a rapidly evolving marketing landscape, the company’s bold approach to data management sets it apart as a forward-thinking industry player. By implementing a powerful business intelligence solution, the company aims to streamline its operations, enhance decision-making capabilities, and maintain its competitive advantage. With complete insights into inventory, performance metrics, competitor activities, and other critical data, the company is well-equipped to navigate the challenges of the marketing industry and drive its business forward into a prosperous future.
2. Big Data Visualization Project:
In this Data Engineering project, our team will tackle the task of designing a comprehensive solution to effectively handle and process historical flight delay and weather data. Our primary objective is to develop a machine-learning model capable of accurately predicting flight delays. Throughout this blog, we will delve into the intricacies of ingesting and preparing the vast amounts of data involved, as well as the crucial steps involved in training and deploying the predictive model. Join us as we embark on this exciting data engineering journey of harnessing the power of data and machine learning to improve the accuracy of flight delay predictions.
3. Transform Data by Using Azure Data Factory:
You are a Data Engineer with Hexelo. You need to provision a new Azure Data Factory that supports a data pipeline that will transform data. First, you will design a batch processing solution, and then you will add directories to a storage account that uses a Data Lake Storage hierarchical namespace. Next, you will deploy an Azure Data Factory, and then you will create a data pipeline. Finally, you will author a copy data activity that will transform data into a blob data file, and then you will test and publish the data pipeline.
4. Tokyo Olympics Insights:
As data engineers, we are expected to convert this raw data into information that can be interpreted by data scientists and business analysts. We need to create a Power BI report that will help us gain insights into details such as:
- Gender distribution across various Olympic disciplines
- Medals distribution across the countries
- Discipline distribution across the countries
Related/References
- Microsoft Power BI VS Tableau | Which one is Better?
- Introduction To Data Analysis Expression (DAX) In Power BI
- Azure Data Lake For Beginners: All You Need To Know
- Introduction to Big Data and Big Data Architectures
- Azure Machine Learning Studio
- Overview of Hyperparameter Tuning In Azure
- Object Detection And Tracking In Azure Machine Learning
Next Task For You
In our Azure Data on Cloud Job-Oriented training program, we will cover 50+ Hands-On Labs. If you want to begin your journey towards becoming a Microsoft Certified Associate and get high-paying jobs, check out our FREE CLASS.