Cloud storage is part of any application architecture. You need storage for backups, files for your users, disks for virtual machines, archive files, and log files.
You use different types of files and blobs to fulfill your business requirements, and you need a highly available, scalable, and secure storage service to support your software solution.
I wrote this blog post after reviewing a lot of material about the Azure Storage platform.
This post is for:
- Software architects
- Software developers
- IT Admins
- Technical Leaders
- Anyone interested in learning more about the Azure Storage platform.
By the end of this post, you will have a good overview of the Azure Storage platform and its services, and of when and why to use them.
What is Azure Storage?
Let’s start with a quick overview of Azure storage. It is the cloud storage service on Microsoft Azure.
It offers many services:
- Blobs
- Files
- Tables
- Queues
- Disks
Other related services, such as Azure HPC Cache, Azure FXT Edge Filer, Data Box, and Azure StorSimple, are not covered in this post.
Blobs
Blobs are binary large objects, such as images, videos, audio, log files, or any other documents. The key idea is that Blob storage supports any data, whether it is text or binary.
There are 3 types of blobs:
- Block blobs store text or binary files of up to about 4.7 TB.
- Append blobs are optimized for append operations. Log files are the ideal example.
- Page blobs are random-access files of up to 8 TB. The common example is virtual hard drives for virtual machines.
Files
The managed cloud Files service supports SMB (the Server Message Block protocol), so it can be used as a file share in the cloud or on-premises, and you can map or mount it as a drive on Windows, Linux, or macOS. It’s a good choice for supporting legacy applications that store files in a folder.
I used it once for a legacy application that I planned to “lift and shift” to the cloud. I hosted the application on two virtual machines and configured the Files service as a shared drive to store the files. No change in the implementation was needed at all.
Tables
It’s a NoSQL key-value store. It’s good for storing data that you need to query by key.
Queues
A cloud messaging service for exchanging messages between application components. It helps you decouple your application’s components so that you can scale them separately.
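To make the decoupling idea concrete, here is a minimal sketch that uses Python’s in-process `queue.Queue` as a stand-in for a cloud queue. It is not the Azure Queues SDK; the function name `run_pipeline` and the message format are assumptions made for illustration only. The producer only knows how to put messages on the queue, and the worker only knows how to take them off.

```python
import queue
import threading


def run_pipeline(orders):
    """Enqueue orders from a producer and process them in a separate worker."""
    messages = queue.Queue()  # stand-in for a cloud queue (assumption)
    processed = []

    def worker():
        # The consumer never talks to the producer directly, only to the queue.
        while True:
            message = messages.get()
            if message is None:  # sentinel value: no more work
                break
            processed.append(f"processed:{message}")

    consumer = threading.Thread(target=worker)
    consumer.start()

    # The producer just drops messages on the queue and moves on.
    for order in orders:
        messages.put(order)
    messages.put(None)

    consumer.join()
    return processed
```

Because the two sides share only the queue, you could add more workers when the backlog grows without touching the producer.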
Disks
The Azure disk storage service is used with Azure virtual machines. It provides highly durable and scalable disks. You can also use it to transfer VHDs to Azure.
Conclusion!
All Azure storage services are managed services, so Azure handles all updates and patches for you.
To create an Azure storage account, you need an Azure subscription first.
You can get a free account subscription for one year that includes more than 25 free service tiers. You also get a $200 credit to explore Azure during the first 30 days.
Choosing a Storage Solution that Fulfills Your Business Requirements
Let’s dive in and talk about Azure storage services in more detail, so you know which services meet your business requirements, and when to use and when not to use them.
Don’t forget to wear your awesome software architect’s hat 🙂 and let’s do it!
Let’s discuss the common requirements one by one:
Durability
Durability means that your data survives hardware failures. A data persistence service achieves it by maintaining additional copies of all data and failing over to one of them when the primary server crashes.
The failover should happen without any data loss or downtime.
Azure Storage offers several data redundancy options at the storage account level to keep your data durable:
1- LRS (Locally redundant storage)
This option keeps 3 copies of your data in one data center (a single physical location) in the selected region. It provides at least 11 nines of durability for objects over a given year.
2- ZRS (Zone redundant storage)
This option replicates data across 3 availability zones in the primary region of the storage account. Each availability zone is a separate physical location within the same region. This adds one more nine, for 12 nines of durability for data objects over a given year.
3- GRS (Geographically redundant storage)
This option keeps 3 copies of the data in a single physical location in the primary region, like LRS, and asynchronously replicates them to another 3 copies in a physical location in the secondary region.
This provides 16 nines of durability for data objects over a given year.
4- GZRS (Geo zone redundant storage)
This option keeps 3 copies of the data in 3 different availability zones in the primary region, like ZRS, and asynchronously replicates them to another 3 copies in a physical location in the secondary region.
This also provides 16 nines of durability for data objects over a given year.
Enabling Read-access to Data in Secondary Region
With the geo-redundant options above, GRS and GZRS, the secondary region is not available for read access unless a failover occurs because the primary region fails.
With RA-GRS and RA-GZRS, the secondary region is always available for read-only access.
A summary of the storage redundancy options is available in the Azure documentation.
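The options above can be captured in a small lookup structure. The sketch below is only an illustration of how you might encode them when comparing designs: the durability figures come from this post, while the class name, the function `pick_redundancy`, and the idea of returning the first option that satisfies the requirements (listed from least to most protection) are my own assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedundancyOption:
    name: str
    durability_nines: int  # figures as quoted in this post
    zone_redundant: bool   # copies spread across availability zones
    geo_redundant: bool    # copies also kept in a secondary region


# Listed from least to most protection (ordering is an assumption of this sketch).
OPTIONS = [
    RedundancyOption("LRS", 11, zone_redundant=False, geo_redundant=False),
    RedundancyOption("ZRS", 12, zone_redundant=True, geo_redundant=False),
    RedundancyOption("GRS", 16, zone_redundant=False, geo_redundant=True),
    RedundancyOption("GZRS", 16, zone_redundant=True, geo_redundant=True),
]


def pick_redundancy(need_zone: bool, need_geo: bool, read_secondary: bool = False) -> str:
    """Return the first option that satisfies the stated requirements."""
    # Reading from the secondary region only makes sense with geo-redundancy.
    need_geo = need_geo or read_secondary
    for option in OPTIONS:
        zone_ok = option.zone_redundant or not need_zone
        geo_ok = option.geo_redundant or not need_geo
        if zone_ok and geo_ok:
            return f"RA-{option.name}" if read_secondary else option.name
    raise ValueError("no redundancy option matches the requirements")
```

For example, `pick_redundancy(need_zone=True, need_geo=True, read_secondary=True)` returns `"RA-GZRS"`.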
Performance
Azure storage offers many options to support different performance levels. You can choose what matches your performance requirements:
1- Performance tiers
There are two performance tiers:
- Premium
- Standard
Premium is your choice if you need fast and consistent response times. It’s the perfect choice for applications that perform many small read or write operations. Common cases are data analytics, AI/ML, and data transformation, where fast and consistent response is a must.
Standard is optimized for high capacity and high throughput on large data sets. Common cases are data backups, media content, and bulk data processing.
A quick note: you can’t change the performance tier of an existing account from Standard to Premium. If you need to, you can use the AzCopy command-line tool to migrate data from the Standard account to a new Premium account.
2- Access tiers
You can configure how readily available your data is, based on your needs, by selecting the access tier for your storage account. You can also set the access tier at the blob level, which means different blobs in the same account can have different access tiers.
There are 3 access tiers:
- Hot
- Cool
- Archive (on blob level)
The Hot access tier is suitable for data that is accessed frequently. It has very low retrieval latency.
The Cool access tier is suitable for data that is infrequently accessed and stored for at least 30 days.
The Archive access tier is suitable for data that is rarely accessed and stored for at least 180 days. It has high retrieval latency: it can take hours before the data is available.
One thing to pay attention to is that accessing archived data costs more than accessing hot or cool data.
Property | Archive | Cool | Hot |
---|---|---|---|
Data Access Occurrence | Very rarely (at least 180 days) | Infrequently (at least 30 days) | Frequently |
Can Be Configured At | Blob level only | Account level and blob level | Account level and blob level |
Storage Cost | Lowest | Lower than Hot | Highest |
Access Cost | Highest | Higher than Hot | Lowest |
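As a rough rule of thumb, the thresholds above can be turned into a tiny helper. This is a sketch only: the 30-day and 180-day cut-offs come from the table, but the function name, its parameters, and the decision to require an explicit opt-in before suggesting Archive (because retrieval can take hours) are assumptions of mine, not an Azure API.

```python
def suggest_access_tier(days_between_reads: int, tolerate_hours_of_latency: bool = False) -> str:
    """Suggest an access tier from how often the data is expected to be read."""
    if days_between_reads < 0:
        raise ValueError("days_between_reads must not be negative")
    # Archive only when the data is rarely read AND slow retrieval is acceptable.
    if days_between_reads >= 180 and tolerate_hours_of_latency:
        return "Archive"
    if days_between_reads >= 30:
        return "Cool"
    return "Hot"
```

A real decision would also weigh storage and access prices, which this sketch ignores.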
After you assess your business requirements, you will have a good understanding of the best tier for your blob storage.
You may create different Azure storage accounts for the different kinds of blobs in your application.
You can move blobs from one access tier to another to get the best balance of performance and cost.
Azure also has a good feature that helps you manage the life cycle of your blobs: Azure Blob Storage lifecycle management.
Reliability
Reliability refers to the ability of software to perform consistently according to its specifications.
Remember that redundancy will not protect your data against software errors or bugs. If a file is deleted in the primary region, it will be deleted in the secondary one too.
Azure storage offers various features on top of which you can build a reliable software solution:
1- Versioning
Blob versioning is a new feature that automatically maintains historical copies of blob data.
It was still in preview when I wrote this post and was not intended for production use.
At that time, it was available in the following regions:
- France Central
- Canada East
- Canada Central
2- Snapshots
A snapshot is a read-only version of a blob taken at a point in time. A blob can have many snapshots, and they can only be deleted explicitly. More details about snapshots and how they affect billing are in the Azure documentation.
For now, snapshots may fit your business requirements if you want to track the historical changes of your blobs.
3- Soft delete
The soft delete feature, as its name implies, protects your blobs in case they are accidentally deleted by a user or even by a bug in the code.
You can recover deleted blob data to its original state within a retention period that you configure at the storage account level.
The retention period can be between 1 and 365 days.
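The retention rule is easy to reason about with a small sketch. The 1 to 365 day range comes from the text above; the function name `is_recoverable` and the exact boundary handling are assumptions for illustration and do not describe how the service itself computes expiry.

```python
from datetime import datetime, timedelta, timezone

MIN_RETENTION_DAYS = 1
MAX_RETENTION_DAYS = 365


def is_recoverable(deleted_at: datetime, now: datetime, retention_days: int) -> bool:
    """Return True if a soft-deleted blob is still inside its retention period."""
    if not MIN_RETENTION_DAYS <= retention_days <= MAX_RETENTION_DAYS:
        raise ValueError("retention period must be between 1 and 365 days")
    # Recoverable from the moment of deletion until the retention period ends.
    return deleted_at <= now < deleted_at + timedelta(days=retention_days)
```

With a 7-day retention period, a blob deleted on January 1 would still be recoverable on January 7 but not on January 9.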
Conclusion!
I think snapshots, soft delete, and versioning are all really useful features, and they could be a good choice for your application instead of writing code to implement the same functionality.
Security
Security is a big topic for sure, and Azure Storage supports various features to ensure security at different layers.
Azure storage security is part of the Azure security baseline in general. You can find many security recommendations in the Azure security baseline documentation.
Here I picked the 5 most important features you need to know when considering Azure Storage for your solution:
1- Authorizing Access to Data
Each time you access the data in a storage account, you send a request over HTTP or HTTPS.
You can use different types of authorization on an Azure storage account.
- Access keys: each Azure storage account has two keys that you can use in your application to access data programmatically. Access keys work at the storage account level.
- SAS (Shared Access Signature): a URI that grants access rights to an Azure storage resource. It has an expiry date, and it can be generated using the CLI, PowerShell, or programmatically with an SDK.
- Anonymous read access
- Azure AD
- On-premise AD domain services
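To show the general idea behind a SAS, a URI that carries its own permissions, expiry, and a signature derived from a secret key, here is a simplified sketch using only the Python standard library. It is deliberately not the real Azure SAS format: the query parameter names (`perm`, `exp`, `sig`), the string that gets signed, and both function names are assumptions made up for this example. In a real application you would generate SAS tokens with the Azure tools mentioned above.

```python
import base64
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlsplit


def _sign(key: bytes, resource: str, permissions: str, expiry: int) -> str:
    # Illustrative string-to-sign; the real SAS format is different.
    string_to_sign = f"{resource}\n{permissions}\n{expiry}"
    digest = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii")


def make_signed_url(base_url: str, resource: str, permissions: str, expiry: int, key: bytes) -> str:
    """Build a URL that embeds permissions, an expiry timestamp, and a signature."""
    query = urlencode({
        "perm": permissions,
        "exp": expiry,
        "sig": _sign(key, resource, permissions, expiry),
    })
    return f"{base_url}{resource}?{query}"


def is_request_allowed(url: str, needed_permission: str, now: int, key: bytes) -> bool:
    """Check signature, expiry, and permission, as the storage side would."""
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    try:
        permissions = params["perm"][0]
        expiry = int(params["exp"][0])
        signature = params["sig"][0]
    except (KeyError, ValueError):
        return False
    expected = _sign(key, parts.path, permissions, expiry)
    # Constant-time comparison avoids leaking how much of the signature matched.
    signature_ok = hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8"))
    return signature_ok and now < expiry and needed_permission in permissions
```

The point of the design is that the client never sees the key: tampering with the permissions or the expiry invalidates the signature, so the holder of the URL gets exactly the access that was granted and nothing more.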
Kindly note that the Azure storage services (Blobs, Files, Queues, and Tables) do not all support every authorization type.
The most widely supported authorization types are access keys and SAS.
Service | Access Keys | SAS | Azure AD | On-Premises AD | Anonymous Read Access |
---|---|---|---|---|---|
Blobs | Supported | Supported | Supported | Not supported | Supported |
Files (SMB) | Supported | Not supported | Supported | Supported | Not supported |
Files (REST) | Supported | Supported | Not supported | Not supported | Not supported |
Queues | Supported | Supported | Supported | Not supported | Not supported |
Tables | Supported | Supported | Not supported | Not supported | Not supported |
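If you want to check this matrix in code, for example in a design checklist script, it maps naturally to a dictionary of sets. The sketch below simply transcribes the table as it stands in this post; the names `AUTH_SUPPORT`, `supports`, and `services_supporting` are hypothetical, and you should verify the matrix against the current Azure documentation because support changes over time.

```python
# Authorization types per service, transcribed from the table in this post.
AUTH_SUPPORT = {
    "Blobs": {"Access Keys", "SAS", "Azure AD", "Anonymous Read Access"},
    "Files (SMB)": {"Access Keys", "Azure AD", "On-Premises AD"},
    "Files (REST)": {"Access Keys", "SAS"},
    "Queues": {"Access Keys", "SAS", "Azure AD"},
    "Tables": {"Access Keys", "SAS"},
}


def supports(service: str, method: str) -> bool:
    """Return True if the service supports the authorization method."""
    return method in AUTH_SUPPORT[service]


def services_supporting(method: str) -> list:
    """List the services that support the authorization method, alphabetically."""
    return sorted(name for name, methods in AUTH_SUPPORT.items() if method in methods)
```

For instance, `services_supporting("Azure AD")` returns `["Blobs", "Files (SMB)", "Queues"]`.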
2- Container Read Access Permissions
Under an Azure storage account, you can create an unlimited number of containers. Containers hold blobs. You can configure access permissions at the container level, and they apply to all blobs within the container.
You can set the access permission to one of the following options:
- Private: all blobs are private and can only be accessed if the client has permission.
- Public blob access: all blobs can be read by anonymous users.
- Public container access: all blobs can be read by anonymous users, just like public blob access. In addition, the container itself is public, so its blobs can be listed.
Remember that this is read access only.
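The three options differ only in what an anonymous user may do, which the following sketch spells out. The enum, the operation names (`"read_blob"`, `"list_blobs"`), and the function are hypothetical names chosen for this example; they mirror the descriptions above rather than any SDK type.

```python
from enum import Enum


class AccessLevel(Enum):
    PRIVATE = "private"
    BLOB = "blob"            # public blob access
    CONTAINER = "container"  # public container access


def anonymous_allowed(level: AccessLevel, operation: str) -> bool:
    """Return True if an anonymous user may perform the operation."""
    if operation == "read_blob":
        return level in (AccessLevel.BLOB, AccessLevel.CONTAINER)
    if operation == "list_blobs":
        return level is AccessLevel.CONTAINER
    # Anything else (writes, deletes) is never anonymous: public access is read-only.
    return False
```

Note that with public blob access an anonymous user still needs to know the blob's address, because listing the container is not allowed.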
3- Data Encryption at Rest
Azure storage, like any Azure product, has many security measures. Data encryption at rest is one of them, and it adds another layer of data protection.
In short, all data in Azure storage accounts is encrypted, so physical access to the hardware where the data is stored is useless to an attacker. Without the encryption key, the attacker can’t read the data even with the hard drive in hand.
Besides that, data encryption at rest is a business requirement for some customers because of government regulations.
So it is both a security measure and a way to satisfy compliance and regulatory requirements.
4- Firewalls and Virtual Networks
Storage accounts have a public endpoint that is accessible through the internet by default. You can limit that to a specific virtual network based on your architecture.
Your storage firewall configuration also enables you to select trusted Azure platform services to access the storage account securely.
5- Advanced Threat Protection
It’s another layer of security that helps you address potential threats to the data in Azure storage.
A security alert is triggered when abnormal activity or unusual access occurs. Security alerts can be sent by email to admins, who can then review the details of the suspicious activity.
ATP leverages machine learning and behavioral analytics to detect abnormal behavior.
One more thing: ATP (Advanced Threat Protection) for Azure Storage is currently available only for Blob Storage.
Development (Implementation)
Azure storage supports many languages through its SDKs and tools.
1- Supported SDKs
One of my favorite languages is C#, and I usually use the .NET Core SDK. Other languages have SDKs too, so you can evaluate them.
Language | SDK on Github | Package |
---|---|---|
.NET Core | GitHub | NuGet Packages |
NodeJs | GitHub | NPM |
PHP | GitHub | |
Java | GitHub | |
Python | GitHub | SDK Docs |
Go language | GitHub | GitHub |
Azure storage can also be accessed through a REST API. Check the Azure developer page for more information about tools and SDKs.
The Cost
Azure storage is not expensive. The actual cost you pay depends on many Azure storage options:
- The volume of data you store
- The region where your data is stored
- The data redundancy option
- The storage account type
- Access tiers
- Performance tiers
- Subscription model
- The support plan
- Others.
The Azure Pricing Calculator will help you estimate the expected cost based on the options you are considering.
Summary!
Azure Storage helps you store and manage data using different types of data persistence services in the cloud. It provides you with fast, available, stable, secure, and inexpensive storage.
Azure storage has quota limits that you have to check when you design your application. Some of these limits are:
- The total number of Azure storage accounts per region
- Maximum storage account capacity
- Maximum requests per second
- and much more.
Azure storage has a service level agreement (SLA) that describes Microsoft’s commitment to service availability and to some performance metrics, such as processing time.
One of the metrics is the geo-replication lag between the primary and secondary regions. Many other metrics are defined clearly in the SLA.
Finally, as with all technologies, change will happen: some features will be deprecated and new features will be released. One good way to keep up to date is the Azure product updates page, where you can see all new feature announcements and anything on the roadmap for all Azure services.
The official Azure storage documentation has much more detailed information about the things I covered and about many things I didn’t cover.
Questions and Answers
It’s a cloud storage service on Microsoft Azure. It offers various services to store different types of blobs, files, NoSQL tables, queues, and disks.
A managed cloud Files service that supports SMB (the Server Message Block protocol), so it can be used as a file share in the cloud or on-premises, and you can map or mount it as a drive on Windows, Linux, or macOS.
A cloud messaging service for exchanging messages between application components. Messages are generally processed in roughly first-in, first-out order, although strict ordering is not guaranteed. It helps you decouple your application’s components so that you can scale them separately.
It’s a NoSQL key-value store. It’s good for storing data that you need to query by key.
Blobs are binary large objects, such as images, videos, audio, log files, or any other documents. The key idea is that Blob storage supports any data, whether it is text or binary.
File Storage.
LRS (Locally redundant storage) is the least durable option.
The Archive option can only be set at the blob level.
It’s a URI that grants access rights to an Azure storage resource. It has an expiry date, and it can be generated using the CLI, PowerShell, the Azure portal, or programmatically with an SDK.