- Suppose you have two video files stored as blobs. One of the videos is business-critical and requires a replication policy that creates multiple copies across geographically diverse datacenters. The other video is non-critical, and a local replication policy is sufficient. Which of the following options would satisfy both the data diversity and cost sensitivity considerations?
A. Create a single storage account that makes use of locally redundant storage (LRS) and host both videos from here.
B. Create a single storage account that makes use of Geo-redundant storage (GRS) and host both videos from here.
C. Create two storage accounts. The first account makes use of geo-redundant storage (GRS) and hosts the business-critical video content. The second account makes use of locally redundant storage (LRS) and hosts the non-critical video content.
- The name of a storage account must be:
A. Unique within the containing resource group.
B. Unique within your Azure subscription.
C. Globally unique.
- In a typical project, when would you create your storage account(s)?
A. At the beginning, during project setup.
B. After deployment, when the project is running.
C. At the end, during resource cleanup.
Large file shares support shares of up to 100 TiB. However, this type of storage account can't be converted to a geo-redundant storage offering, and the upgrade is permanent.
EPHEMERAL OS DISKS
An ephemeral OS disk is a virtual disk that saves data on the local virtual machine storage. An ephemeral disk has faster read-and-write latency than a managed disk. It's also faster to reset the image to the original boot state if you're using an ephemeral disk. However, an individual virtual machine failure might destroy all the data on an ephemeral disk and leave the virtual machine unable to boot. Because ephemeral disks reside on the host's local storage, they incur no storage costs.
Ephemeral disks work well when you want to host a stateless workload, such as the business logic for a multitier website or a microservice. Such applications are tolerant of individual virtual machine failures, because requests can be rerouted to other virtual machines in the system. You can reset the failed virtual machine to its original boot state rapidly and get it back up and running faster than if it used managed disks.
Most disks that you use with virtual machines in Azure are managed disks. A managed disk is a virtual hard disk for which Azure manages all the required physical infrastructure. Because Azure takes care of the underlying complexity, managed disks are easy to use. You can just provision them and attach them to virtual machines.
Virtual hard disks in Azure are stored as page blobs in an Azure Storage account, but you don’t have to create storage accounts, blob containers, and page blobs yourself or maintain this infrastructure later.
The benefits of managed disks include:
- Simple scalability. You can create up to 50,000 managed disks of each type in each region in your subscription.
- High availability. Managed disks support 99.999% availability by storing data three times. If there’s a failure in one replica, the other two can maintain full read-write functionality.
- Integration with availability sets and zones. If you place your virtual machines into an availability set, Azure automatically distributes the managed disks for those machines into different fault domains so that your machines are resilient to localized failures. You can also use availability zones, which distribute data across multiple datacenters, for even greater availability.
- Support for Azure Backup. Azure Backup natively supports managed disks, which includes encrypted disks.
- Granular access control. You can use Azure role-based access control (RBAC) to grant access to specific user accounts for specific operations on a managed disk. For example, you could ensure that only an administrator can delete a disk.
- Support for encryption. To protect sensitive data on a managed disk from unauthorized access, you can encrypt it by using Azure Storage Service Encryption (SSE), which is provided with Azure Storage accounts. Alternatively, you can use Azure Disk Encryption (ADE), which uses BitLocker for Windows virtual machines, and DM-Crypt for Linux virtual machines.
An unmanaged disk, like a managed disk, is stored as a page blob in an Azure Storage account. The difference is that with unmanaged disks, you create and maintain this storage account manually. This requirement means that you have to keep track of the IOPS limits within a storage account and ensure that you don't overprovision its throughput. You must also manage security and RBAC access at the storage account level, rather than for each individual disk as you can with managed disks.
Because unmanaged disks don’t support all of the scalability and management features that you’ve seen for managed disks, they’re no longer widely used. Consider using them only if you want to manually set up and manage the data for your virtual machine in the storage account.
In the portal, to use unmanaged disks, expand the Advanced section on the Disks page of the Create a virtual machine wizard.
Originally, all virtual hard disks in Azure were unmanaged. If you’re running an old virtual machine, it might have unmanaged disks. You can convert those unmanaged disks to managed disks by using the ConvertTo-AzureRmVmManagedDisk PowerShell cmdlet.
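If you prefer the Azure CLI, a hedged sketch of the same conversion follows; the VM must be deallocated before conversion, and the resource group and VM names are placeholders.

```shell
# Convert a VM's unmanaged disks to managed disks with the Azure CLI.
# Stop (deallocate) the VM first; conversion can't run on a running VM.
az vm deallocate --resource-group my-rg --name my-vm
az vm convert --resource-group my-rg --name my-vm
```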
- Which of these disk roles should you use to store a set of help videos that explain how to use the accounting application?
- OS disk
- Data disk
- Temporary disk
- You have a business-critical database, and you want to store it on a virtual disk with 99.999% availability. What kind of disk should you use?
- An ephemeral disk
- An unmanaged disk
- A managed disk
Ultra SSDs provide the highest disk performance available in Azure. Choose them when you need the fastest storage performance, which includes high throughput, high IOPS, and low latency.
The performance of an Ultra SSD depends on the size you select, as you can see from examples in this table:
| Disk size (GB) | IOPS | Throughput (MB/s) |
| --- | --- | --- |
Ultra disks can have capacities from 4 GB up to 64 TB. A unique feature of ultra disks is that you can adjust the IOPS and throughput values while they’re running and without detaching them from the host virtual machine. Performance adjustments can take up to an hour to take effect.
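As an illustration, that live adjustment can be made with `az disk update`; the disk and resource group names below are placeholders.

```shell
# Adjust an attached ultra disk's IOPS and throughput without detaching it.
# The new values can take up to an hour to take effect.
az disk update \
  --name my-ultra-disk \
  --resource-group my-rg \
  --disk-iops-read-write 8000 \
  --disk-mbps-read-write 300
```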
Ultra disks are a new disk type and currently have some limitations:
- They’re only available in a subset of Azure regions.
- They can only be attached to virtual machines that are in availability zones.
- They can only be attached to ES/DS v3 virtual machines.
- They can only be used as data disks and can only be created as empty disks.
- They don’t support disk snapshots, virtual machine images, scale sets, Azure Disk Encryption, Azure Backup, or Azure Site Recovery.
Premium SSDs are the next tier down from ultra disks in terms of performance, but they still provide high throughput and IOPS with low latency. Premium disks don’t have the current limitations of ultra disks. For example, they’re available in all regions and can be used with virtual machines that are outside of availability zones.
You can’t adjust performance without detaching these disks from their virtual machine. Also, you can only use premium SSDs with larger virtual machine sizes, which are compatible with premium storage.
This table has examples that illustrate the high performance of premium SSDs:
With premium SSDs, these performance figures are guaranteed. There’s no such guarantee for standard tier disks, which can be impacted occasionally by high demand.
| Disk size name | Disk size | IOPS | Throughput (MB/s) |
| --- | --- | --- | --- |
If you need higher performance than standard disks provide, or if you can’t sustain occasional drops in performance, use premium SSDs. Also use premium SSDs when you want the highest performance but can’t use ultra disks because of their current limitations. Premium SSDs are a good fit for mission-critical workloads in medium and large organizations.
Standard SSDs in Azure are a cost-effective storage option for virtual machines that need consistent performance at lower speeds. Standard SSDs aren’t as fast as premium or ultra SSDs, but they still have latencies in the range of 1 millisecond to 10 milliseconds and up to 6,000 IOPS. They’re available to attach to any virtual machine, no matter what size it is.
This table has examples that illustrate the performance characteristics of standard SSDs in several sizes:
| Disk size name | Disk size (GB) | IOPS | Throughput (MB/s) |
| --- | --- | --- | --- |
These performance figures aren’t guaranteed but are achieved 99% of the time.
Use standard SSDs when you have budgetary constraints and a workload that isn’t disk intensive. For example, web servers, lightly used enterprise applications, and test servers can all run on standard SSDs.
If you choose to use standard HDDs, data is stored on conventional magnetic disk drives with moving spindles. Disks are slower and speeds are more variable than for SSDs, but write latencies are under 10 ms and read latencies are under 20 ms. As with standard SSDs, you can use standard HDDs with any virtual machine.
- You need the best disk performance for a new virtual machine. What disk type should you use, assuming all disk types are supported in your region?
- Ultra SSD
- Premium SSD
- Standard HDD
- You have a medium-sized application with modest IO requirements. However, you need to ensure that a throughput of 25 MBps is guaranteed. What disk type should you use?
- Premium SSD
- Standard SSD
- Standard HDD
- For the administrative web interface server, suppose you want to use Azure Backup to protect the content of the virtual machine’s disks. How would this requirement change the disk type you choose?
- Use standard SSDs instead of standard HDDs.
- Use premium SSDs instead of standard HDDs.
- No change is needed.
- For the standby database servers, suppose that your requirements change. You decide that a minimum IOPS of 1,100 is absolutely required at all times. How would this requirement change the disk type you choose?
- Use premium SSDs instead of standard SSDs.
- Use standard HDDs instead of standard SSDs.
- Use unmanaged disks instead of managed disks.
LOCALLY REDUNDANT STORAGE
Locally redundant storage (LRS) replicates your data three times within a single physical location in the primary region. LRS provides at least 99.999999999% (11 nines) durability of objects over a given year. LRS is the lowest-cost redundancy option and offers the least durability compared to other options. LRS protects your data against server rack and drive failures.
LRS doesn’t protect you from a datacenter-wide outage. If the datacenter goes down, you could lose your data.
GEOGRAPHICALLY REDUNDANT STORAGE
Geo-redundant storage (GRS) copies your data synchronously three times within a single physical location in the primary region using LRS. It then copies your data asynchronously to a single physical location in a secondary region that is hundreds of miles away from the primary region. GRS offers durability for Azure Storage data objects of at least 99.99999999999999% (16 nines) over a given year. A write operation is first committed to the primary location and replicated using LRS. The update is then replicated asynchronously to the secondary region. When data is written to the secondary location, it's also replicated within that location using LRS.
READ-ACCESS GEO-REDUNDANT STORAGE
With geo-redundant storage (GRS), the secondary region isn't available for read access until the primary region fails over. If you want to read from the secondary region even when the primary region hasn't failed, use read-access geo-redundant storage (RA-GRS) as your replication type.
ZONE-REDUNDANT STORAGE
Zone-redundant storage (ZRS) replicates your Azure Storage data synchronously across three Azure availability zones in the primary region. Each availability zone is a separate physical location with independent power, cooling, and networking. ZRS offers durability for Azure Storage data objects of at least 99.9999999999% (12 nines) over a given year.
With ZRS, your data is still accessible for both read and write operations even if a zone becomes unavailable. If a zone becomes unavailable, Azure undertakes networking updates, such as DNS repointing. These updates may affect your application if you access data before the updates have completed. When designing applications for ZRS, follow practices for transient fault handling, including implementing retry policies with exponential back-off.
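The retry practice above can be sketched as a small, generic shell helper; it isn't Azure-specific, and the commented `az storage blob download` call at the end is only one example of an operation you might wrap.

```shell
# Retry a command with exponential back-off: wait 1s, 2s, 4s, ... between
# attempts, up to a maximum number of attempts.
retry_with_backoff() {
  max_attempts=$1; shift
  attempt=1
  delay=1
  while ! "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    sleep "$delay"
    delay=$((delay * 2))     # exponential back-off
    attempt=$((attempt + 1))
  done
}

# Example (placeholder names):
# retry_with_backoff 5 az storage blob download \
#   --account-name mystorageacct --container-name assets \
#   --name video.mp4 --file video.mp4
```

A production retry policy would usually also add jitter and cap the maximum delay, but this shows the shape of the practice.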
A write request to a storage account that is using ZRS happens synchronously. The write operation returns successfully only after the data is written to all replicas across the three availability zones.
Microsoft recommends using ZRS in the primary region for scenarios that require consistency, durability, and high availability. We also recommend using ZRS if you want to restrict an application to replicate data only within a country or region because of data governance requirements.
ZRS provides excellent performance, low latency, and resiliency for your data if it becomes temporarily unavailable. However, ZRS by itself may not protect your data against a regional disaster where multiple zones are permanently affected. For protection against regional disasters, Microsoft recommends using geo-zone-redundant storage (GZRS), which uses ZRS in the primary region and also geo-replicates your data to a secondary region.
Because all availability zones are in a single region, ZRS can’t protect your data from a regional level outage.
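In practice, the redundancy option is just the SKU you pass when creating the storage account. A hedged Azure CLI sketch (account, group, and region names are placeholders):

```shell
# Create a general-purpose v2 storage account that uses zone-redundant storage.
az storage account create \
  --name mystorageacct \
  --resource-group my-rg \
  --location westeurope \
  --sku Standard_ZRS \
  --kind StorageV2
```

Swapping the `--sku` value (for example, `Standard_LRS`, `Standard_GRS`, or `Standard_GZRS`) selects the other redundancy options discussed here.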
GEO-ZONE-REDUNDANT STORAGE
Geo-zone-redundant storage (GZRS) combines the high availability provided by redundancy across availability zones with protection from regional outages provided by geo-replication. Data in a GZRS storage account is copied across three Azure availability zones in the primary region and is also replicated to a secondary geographic region for protection from regional disasters. Microsoft recommends using GZRS for applications requiring maximum consistency, durability, and availability, excellent performance, and resilience for disaster recovery.
With a GZRS storage account, you can continue to read and write data if an availability zone becomes unavailable or is unrecoverable. Additionally, your data is also durable in the case of a complete regional outage or a disaster in which the primary region isn't recoverable. GZRS is designed to provide at least 99.99999999999999% (16 nines) durability of objects over a given year.
Only general-purpose v2 storage accounts support GZRS and RA-GZRS. For more information about storage account types, see Azure storage account overview. GZRS and RA-GZRS support block blobs, page blobs (except for VHD disks), files, tables, and queues.
READ-ACCESS GEO-ZONE-REDUNDANT STORAGE
Read-access geo-zone-redundant storage (RA-GZRS) uses the same replication method as GZRS but lets you read from the secondary region. If you want to read the data that’s replicated to the secondary region, even if your primary isn’t experiencing downtime, use RA-GZRS for your replication type.
A paired region is where an Azure region is paired with another in the same geographical location to protect against regional outage. Paired regions are used with GRS and GZRS replication types.
Live migrations are done by creating an Azure support request in the Azure portal. You'll then be contacted by a support representative about your live migration request. There are some limitations to live migration. For example:
- Unlike a manual migration, you won't know exactly when a live migration will complete.
- Data can only be migrated to the same region.
- Live migration is only supported for data held in standard storage account types.
- If your account contains a large file share, live migration to GZRS isn’t supported.
- Suppose you want data redundancy and protection from a datacenter outage but for compliance reasons you need to keep your data in the same region. Which redundancy option should you choose?
- Suppose you want to provide the highest availability to your storage data and to protect against a region-wide outage. Which redundancy option should you choose?
- Suppose you’re using geo-zone-redundant storage (GZRS) and there’s an outage in your primary region. You initiate a failover to the secondary region. How does the failover affect the data in your storage account?
- You might have some data loss.
- Data in the primary region is inaccessible to users.
- After you fail over, the data is still protected with geo-replication.
ENCRYPTION AT REST
All data written to Azure Storage is automatically encrypted by Storage Service Encryption (SSE) with a 256-bit Advanced Encryption Standard (AES) cipher. SSE automatically encrypts data when writing it to Azure Storage. When you read data from Azure Storage, Azure Storage decrypts the data before returning it. This process incurs no additional charges and doesn’t degrade performance. It can’t be disabled.
For virtual machines (VMs), Azure lets you encrypt virtual hard disks (VHDs) by using Azure Disk Encryption. This encryption uses BitLocker for Windows images, and it uses dm-crypt for Linux.
ENCRYPTION IN TRANSIT
Keep your data secure by enabling transport-level security between Azure and the client. Always use HTTPS to secure communication over the public internet. When you call the REST APIs to access objects in storage accounts, you can enforce the use of HTTPS by requiring secure transfer for the storage account. After you enable secure transfer, connections that use HTTP will be refused. This flag will also enforce secure transfer over SMB by requiring SMB 3.0 for all file share mounts.
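For example, the secure-transfer requirement can be enabled on an existing account with the Azure CLI; the account and resource group names below are placeholders.

```shell
# Require secure transfer on a storage account: HTTP connections are refused,
# and file share mounts must use SMB 3.0.
az storage account update \
  --name mystorageacct \
  --resource-group my-rg \
  --https-only true
```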
Azure Storage supports cross-domain access through cross-origin resource sharing (CORS). CORS uses HTTP headers so that a web application at one domain can access resources from a server at a different domain. By using CORS, web apps ensure that they load only authorized content from authorized sources.
CORS support is an optional flag you can enable on Storage accounts. The flag adds the appropriate headers when you use HTTP GET requests to retrieve resources from the Storage account.
ROLE-BASED ACCESS CONTROL
Azure Storage supports Azure Active Directory and role-based access control (RBAC) for both resource management and data operations. You can assign RBAC roles, scoped to a subscription, a resource group, a storage account, or an individual container or queue, to security principals. Use Azure Active Directory to authorize resource management operations, such as configuration changes. Azure AD is also supported for data operations on Blob and Queue storage.
You can audit Azure Storage access by using the built-in Storage Analytics service. Storage Analytics logs every operation in real time, and you can search the Storage Analytics logs for specific requests. Filter based on the authentication mechanism, the success of the operation, or the resource that was accessed.
STORAGE ACCOUNT KEYS
In Azure Storage accounts, shared keys are called storage account keys. Azure creates two of these keys (primary and secondary) for each storage account you create, and they give full access to everything in the account. Because these keys are powerful, use them only with trusted in-house applications that you control completely.
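One common precaution is to rotate the keys periodically. A hedged Azure CLI sketch, assuming you're already signed in and using placeholder names:

```shell
# List the two keys for an account, then regenerate (rotate) the primary key.
# Regenerating a key invalidates the old value for any client still using it.
az storage account keys list \
  --account-name mystorageacct \
  --resource-group my-rg \
  --output table

az storage account keys renew \
  --account-name mystorageacct \
  --resource-group my-rg \
  --key primary
```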
SHARED ACCESS SIGNATURES
As a best practice, you shouldn’t share storage account keys with external third-party applications. If these apps need access to your data, you’ll need to secure their connections without using storage account keys. For untrusted clients, use a shared access signature (SAS). A shared access signature is a string that contains a security token that can be attached to a URI. Use a shared access signature to delegate access to storage objects and specify constraints, such as the permissions and the time range of access.
TYPES OF SHARED ACCESS SIGNATURES
You can use a service-level shared access signature to allow access to specific resources in a storage account. You’d use this type of shared access signature, for example, to allow an app to retrieve a list of files in a file system or to download a file.
Use an account-level shared access signature to allow access to anything that a service-level shared access signature can allow, plus additional resources and abilities. For example, you can use an account-level shared access signature to allow the ability to create file systems.
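As a sketch, a service-level SAS granting read-only access to a container for seven days can be generated with the Azure CLI. The account and container names are placeholders, the `date` invocation uses GNU syntax, and the command assumes you're already authenticated to the account; it prints a SAS token to append to the resource URI.

```shell
# Generate a read-only, time-limited SAS for one container.
expiry=$(date -u -d "+7 days" '+%Y-%m-%dT%H:%MZ')

az storage container generate-sas \
  --account-name mystorageacct \
  --name assets \
  --permissions r \
  --expiry "$expiry" \
  --output tsv
```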
ADVANCED THREAT PROTECTION
What you really want is a way to be notified when suspicious activity is happening. That’s where the Advanced Threat Protection feature in Azure Storage can help.
Advanced Threat Protection detects anomalies in account activity. It then notifies you of potentially harmful attempts to access your account. You don't have to be a security expert or manage security monitoring systems to take advantage of this layer of threat protection. Currently, Advanced Threat Protection for Azure Storage is available for the Blob service. Security alerts are integrated with Azure Security Center. The alerts are sent by email to subscription admins.
AZURE DATA LAKE STORAGE SECURITY FEATURES
Azure Data Lake Storage Gen2 provides a first-class data lake solution that allows enterprises to pull together their data. It’s built on Azure Blob storage, so it inherits all of the security features. Along with role-based access control (RBAC), Azure Data Lake Storage Gen2 provides access control lists (ACLs) that are POSIX-compliant and that restrict access to only authorized users, groups, or service principals. It applies restrictions in a way that’s flexible, fine-grained, and manageable. Azure Data Lake Storage Gen2 authenticates through Azure Active Directory OAuth 2.0 bearer tokens. This allows for flexible authentication schemes, including federation with Azure AD Connect and multifactor authentication that provides stronger protection than just passwords.
- You are working on a project with a 3rd party vendor to build a website for a customer. The image assets that will be used on the website are stored in an Azure Storage account that is held in your subscription. You want to give read access to this data for a limited period of time. What security option would be the best option to use?
- CORS Support
- Storage Account
- Shared Access Signatures
- When configuring network access to your Azure Storage Account, what is the default network rule?
- To allow all connections from all networks
- To allow all connection from a private IP address range
- To deny all connections from all networks
- Which Azure service detects anomalies in account activities and notifies you of potential harmful attempts to access your account?
- Advanced Threat Protection
- Azure Storage Account Security Feature
- Encryption in transit
DATA ACCESS METHODS
There are two built-in methods of data access supported by Azure Files. One method is direct access via a mounted drive in your operating system. The other method is to use a Windows server (either on-premises or in Azure) and install Azure File Sync to synchronize the files between local shares and Azure Files.
FILE REDUNDANCY OPTIONS
Because Azure Files stores files in a storage account, you can choose between standard or premium performance storage accounts:
- Standard performance: Double-digit ms latency, 10,000 IOPS, 300-MBps bandwidth
- Premium performance: Single-digit ms latency, 100,000 IOPS, 5-GBps bandwidth
Standard performance accounts use HDDs to store data. With HDDs, costs are lower but so is performance. SSD arrays back the premium storage account's performance, which comes with higher costs. Currently, premium accounts can only use file storage accounts, with ZRS available in a limited number of regions.
|You can easily re-create data, and cost is a priority.||✔|
|Data must be stored in a single known location.||✔|
|Premium performance is required.||✔||✔|
|Data needs to be highly available, and redundancy is a priority.||✔||✔|
|99.999999999% (11 nines) durability.||✔|
|99.9999999999% (12 nines) durability.||✔|
|99.99999999999999% (16 nines) durability.||✔|
| Tool | Description |
| --- | --- |
| AzCopy | Command-line tool that offers the best performance, especially for a low volume of small files. |
| Robocopy | Command-line tool shipped with Windows and Windows Server. AzCopy is written to be Azure aware and performs better. |
| Azure Storage Explorer | Graphical file management utility that runs on Windows, Linux, and macOS. |
| Azure portal | Use the portal to import files and folders. |
| Azure File Sync | Can be used to do the initial data transfer, and then uninstalled after the data is transferred. |
| Azure Data Box | If you have up to 35 TB of data and you need it imported in less than a week. |
- You’ve been asked by a local manufacturing company that runs dedicated software in their warehouse to keep track of stock. The software needs to run on machines in the warehouse, but the management team wants to access the output from the head office. The limited bandwidth available in the warehouse caused them problems in the past when they tried to use cloud-based solutions. You recommend that they use Azure Files. Which is the best method to sync the files with the cloud?
- Create an Azure Files share and directly mount shares on the machines in the warehouse.
- Use a machine in the warehouse to host a file share, install Azure File Sync, and share a drive with the rest of the warehouse.
- Install Azure File Sync on every machine in the warehouse and head office.
- The manufacturing company has a number of sensors that record time-relative data. Only the most recent data is useful. The company wants the lowest cost storage for this data. What is the best kind of storage account for them?
- The manufacturing company’s finance department wants to control how the data is being transferred to Azure Files. They want a graphical tool to manage the process, but they don’t want to use the Azure portal. What tool do you recommend they use?
- Azure Data Box
- Azure Storage Explorer
Azure Import/Export provides a way for organizations to import data into, and export data from, Azure Storage. The service offers a secure, reliable, and cost-effective method to transfer large amounts of data. By using the service, you send and receive physical disks that contain your data between your on-premises location and an Azure datacenter. You ship data that's stored on your own disk drives. These disk drives can be Serial ATA (SATA) hard-disk drives (HDDs) or solid-state drives (SSDs).
THE WAIMPORTEXPORT TOOL
If you're importing data into Azure Storage, your data must be written to disk in a specific format. Use the WAImportExport drive preparation tool to do this. This tool checks a drive and prepares a journal file that's then used by an import job when data is being imported into Azure. The WAImportExport tool:
- Formats the drive and checks it for errors before data is copied to the disks.
- Encrypts the data on the drive.
- Quickly scans the data and determines how many physical drives are required to hold the data being transferred.
- Creates the journal files that are used for import and export operations. A journal file contains information about the drive serial number, encryption key, and storage account. Each drive you prepare with the Azure Import/Export tool has a single journal file.
There are two versions of this tool:
- Version 1 supports import or export of data to or from Azure Blob storage.
- Version 2 supports import of data into Azure Files.
For export jobs, the Import/Export Service uses BitLocker to encrypt the drive before it’s shipped back to you. For import jobs, all data must be encrypted through BitLocker before you send the disks to Microsoft. You can encrypt disks by using the WAImportExport tool. Or, you can manually enable BitLocker on the drive and provide the encryption key to the WAImportExport tool.
- What does the Azure Import/Export service do?
- Migrates large amounts of data between on-premises storage and Azure by using disks supplied by Microsoft
- Transfers large amounts of data between on-premises storage and Azure by using a high-bandwidth network connection
- Transfers large amounts of data between on-premises storage and an Azure Storage account without tying up network bandwidth
- What’s the purpose of the WAImportExport tool?
- Use it to copy files to a disk to import data into Azure Storage.
- It prepares a drive containing files for import into Azure Storage.
- Use it to create an import job. You specify the files that the disk contains and what should be imported into Azure Storage.
HOW AZURE IMPORT/EXPORT WORKS
To use Azure Import/Export, you create a job that specifies the data that you want to import or export. You then prepare the disks to use to transfer the data. For an import job, you write your data to these disks and ship them to an Azure datacenter. Microsoft uploads the data for you. For an export job, you prepare a set of blank disks and ship them to an Azure datacenter. Microsoft copies the data to these disks and ships them back to you.
EXPORT DATA FROM AZURE
You can use the Import/Export service to export data from Azure Blob storage only. You can’t export data that’s stored in Azure Files.
You must have the following items to support the export process:
- An active Azure subscription and an Azure Storage account holding your data in Azure Blob storage
- A system running a supported version of Windows
- BitLocker enabled on the Windows system
- WAImportExport version 1 downloaded and installed from the Microsoft Download Center
- An active account with a shipping carrier like FedEx or DHL for shipping drives to an Azure datacenter
- A set of disks that you can send to an Azure datacenter on which to copy the data from Azure Storage
- Azure Import/Export allows you to export data from __.
- Azure Storage
- Azure Blob storage
- Azure Blob and Azure Storage
- You receive disk drives with your exported data from Microsoft. You try to copy the data from the disks back to on-premises storage, but you can’t access the data. Why might this be the case?
- You don’t have permissions to view the data. Ask the Azure tenant administrator to give you access to the storage account from which the data was exported.
- You didn’t specify the correct BitLocker encryption key when you tried to access the data.
- You haven’t provided the correct password for your Azure account.
OFFLINE TRANSFER OF MASSIVE DATA
The Import/Export service is an offline solution. It’s designed to handle more data than would be feasible to transmit over a network connection. Using the Import/Export service, you take responsibility for preparing and shipping the necessary hardware.
Microsoft provides an alternative solution in the form of the Azure Data Box family. The Data Box family uses Microsoft-supplied devices to transfer data from your on-premises location into Azure Storage. A Data Box device is a proprietary, tamper-proof network appliance. You connect the device to your own internal network to move data to the device. You ship the device back to Microsoft, which then uploads data from the device into Azure Storage.
*** Azure Data Box family is the recommended solution for handling very large import jobs when the organization is located in a region where Data Box is supported. It’s an easier process than using the Import/Export service. ***
*** Azure Data Box family doesn’t support offline export from Azure. *** For offline export of large amounts of data from Azure to an on-premises location, we recommend Azure Import/Export.
ONLINE TRANSFER OF MASSIVE DATA
The Import/Export service doesn’t provide an online option. If you need an online method to transfer massive amounts of data, you can use *** Azure Data Box Edge or Azure Data Box Gateway ***. Data Box Edge is a physical network appliance that you install on-premises. The device connects to your storage account in the cloud. Data Box Gateway is a virtual network appliance. Both of these products support data transfer from an on-premises location to Azure.
ONLINE TRANSFER OF SMALLER DATA VOLUMES
If you’re looking to import or export more moderate volumes of data to and from Azure Blob storage, consider using other tools like AzCopy or Azure Storage Explorer.
AzCopy is a simple but powerful command-line tool that lets you copy files to or from Azure Storage accounts. With AzCopy you can:
- Upload, download, and copy files to Azure Blob storage
- Upload, download, and copy files to Azure Files
- Copy files between storage accounts
- Copy files between storage accounts in different regions
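As a sketch, the basic `azcopy copy` syntax can be composed like this (the storage account, container, and SAS token below are hypothetical placeholders):

```python
# Sketch: compose "azcopy copy" invocations for uploading to Blob storage.
# The account, container, and SAS values used here are hypothetical.
def azcopy_upload_cmd(local_path, account, container, sas, recursive=False):
    """Build an 'azcopy copy' command line as a list of arguments."""
    dest = f"https://{account}.blob.core.windows.net/{container}?{sas}"
    cmd = ["azcopy", "copy", local_path, dest]
    if recursive:
        cmd.append("--recursive")  # needed when copying a whole directory
    return cmd

cmd = azcopy_upload_cmd("./videos", "mystorageacct", "media", "sv=...", recursive=True)
print(" ".join(cmd))
```

You'd pass the resulting command to a shell or `subprocess.run`; the same `azcopy copy` form works for downloads and account-to-account copies by swapping source and destination.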
| Dataset | Network bandwidth | Solution to use |
|---|---|---|
| Large dataset | Low bandwidth | Azure Import/Export for export; Data Box Disk or Data Box for import where supported; otherwise use Azure Import/Export |
| Large dataset | High bandwidth (1 Gbps – 100 Gbps) | AzCopy; to import data, Azure Data Factory, Azure Data Box Edge, or Azure Data Box Gateway |
| Large dataset | Moderate bandwidth (100 Mbps – 1 Gbps) | Azure Import/Export for export, or the Azure Data Box family for import where supported |
| Small dataset | Low to moderate bandwidth (< 1 Gbps) | If transferring only a few files, use Azure Storage Explorer, the Azure portal, AzCopy, or the Azure CLI |
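To see why bandwidth drives this choice, here's a quick back-of-the-envelope estimate (the 80% link-utilization figure is an assumption, not an Azure number):

```python
def transfer_hours(data_tb, bandwidth_gbps, efficiency=0.8):
    """Estimate hours to move data_tb terabytes over a link of
    bandwidth_gbps gigabits/second at the given utilization."""
    bits = data_tb * 1e12 * 8                     # terabytes -> bits
    seconds = bits / (bandwidth_gbps * 1e9 * efficiency)
    return seconds / 3600

# 80 TB over a 100 Mbps (0.1 Gbps) link takes roughly 2,200 hours,
# about three months, which is why offline devices exist.
print(round(transfer_hours(80, 0.1)))
```

The same 80 TB over a 10 Gbps link drops to around a day, which is why AzCopy and the online appliances become viable at high bandwidth.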
- When should you use AzCopy instead of the Azure Import/Export service?
- To import or export small amounts of data from Azure Storage when network latency isn’t an issue
- To import or export large amounts of data
- To transfer data from Azure Storage to on-premises storage while you are offline
- When should you use Azure Import/Export service instead of Data Box Family?
- To transfer data using an online connection
- To use a Microsoft-supplied device instead of disks you supply
- To export data from Azure Storage while you’re offline
AZURE DATA BOX PRODUCTS
The Azure Data Box family can be divided into two groups, for offline and online data transfer. Offline data transfer allows you to move large amounts of data to Azure whenever you have time, network bandwidth, or cost constraints.
OFFLINE DATA TRANSFER
The devices in the offline grouping include:
- Data Box Disk: Provides one ~35-TB transfer to Azure. Connect and copy data over USB.
- Data Box: Provides one ~80-TB transfer to Azure per order. Connect and copy data to the device over standard network interface protocols like SMB and NFS.
- Data Box Heavy: Provides one ~800-TB transfer to Azure. Use high-throughput network interfaces to connect and copy data to the device. This process uses standard network interface protocols like SMB and NFS. Data Box Heavy is like two Data Boxes, each with an independent node.
ONLINE DATA TRANSFER
Online data transfer enables a link between your on-premises assets and Azure. Transferring huge amounts of data to Azure becomes as simple as copying data to a network share. Online data transfer is ideal when you need a continuous link to transfer a massive amount of data.
- Data Box Edge: This device is a dedicated appliance with 12 TB of local SSD storage. It can preprocess and run machine learning on data before uploading it to Azure.
- Data Box Gateway: This device is an entirely virtual appliance. It’s based on a virtual machine that you provision in your on-premises environment.
- You have network bandwidth constraints and ~70 terabytes (TB) of data to import into Azure. Which Azure Data Box device should you order?
- Data Box or multiple orders of Data Box Disk
- Data Box Edge
- Data Box Heavy
- You have a huge amount of data being generated by smart devices and applications at your data center. You want to perform rapid machine learning-based inference on the data before moving it to Azure for deeper analysis. Which Azure Data Box device would you use?
- Data Box Edge
- Data Box Gateway
- Data Box Heavy
AZURE DATA BOX FAMILY WORKFLOW
COPY DATA USING STANDARD TOOLS
You can copy data by using standard tools. For example, use File Explorer to drag and drop files. Or, use any Server Message Block (SMB)-compatible file-copy tool like Robocopy.
USE AZURE STORAGE FILE-NAMING CONVENTIONS AND SIZE LIMITS
When you copy data, all the standard Azure Storage naming conventions apply:
- Subfolder names should be lowercase, from 3 to 63 characters, and consist only of letters, numbers, and hyphens. Consecutive hyphens aren’t allowed.
- Directory and file names for Azure Files shouldn’t exceed 255 characters.
- File size must not exceed ~4.75 tebibytes (TiB) for block blobs, ~8 TiB for page blobs, and ~1 TiB for Azure Files.
Copy data into the appropriate folder for your storage type: PageBlob, BlockBlob, AzureFile, or ManagedDisk.
- Use the ManagedDisk folder for virtual hard disks (VHDs) that you want to migrate to Azure. Use the PageBlob folder for VHDX files.
- Any files copied directly to the PageBlob or BlockBlob folders are inserted in a default $root container. Subfolders are created as containers in Azure.
- For Azure Files, files must be in subfolders under the AzureFile folder. Any files copied to the root of the AzureFile folder are uploaded as block blobs instead of Azure Files items.
If you don’t follow the file structure, size limit, and naming conventions, the data upload to Azure might fail. If you’re using Windows, we recommend that you validate the files by using DataBoxDiskValidation.cmd, which is provided in the DataBoxDiskImport folder. If you have time, use the generate checksums option to validate your data before sending it to Azure.
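As a rough illustration, the subfolder naming rule described above can be checked with a short validator. This sketch additionally requires the name to start and end with a letter or number, matching Azure's container rules; it is not a substitute for running DataBoxDiskValidation.cmd:

```python
import re

# Lowercase letters, digits, and hyphens; no consecutive hyphens;
# must start and end with a letter or digit; 3-63 characters total.
_NAME = re.compile(r"^[a-z0-9]([a-z0-9]|-(?!-))*[a-z0-9]$")

def valid_container_name(name):
    """Check a subfolder name against the container naming rules above."""
    return 3 <= len(name) <= 63 and bool(_NAME.match(name))

print(valid_container_name("video-archive"))   # True
print(valid_container_name("Video--Archive"))  # False: uppercase, double hyphen
```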
- What method would you use to unlock the disks you’ve received?
- Get the passkey from the Device Details page in the Azure portal.
- Look on the reverse side of the Data Box Disk device for the passkey.
- Use the passkey in the root folder of the Data Box Disk device.
- You want to copy a virtual hard disk in the VHDX format to an Azure Data Box Disk. What folder do you copy the VHDX to?
- PageBlob folder
- BlockBlob folder
- AzureFiles folder
- ManagedDisk folder
*** Azure Data Box family doesn’t support export of data from Azure. ***
If you’re not in a region supported by Azure Data Box family, consider using Azure Import/Export to import data into Azure.
AZURE DATA FACTORY
Azure Data Factory is a service that enables you to organize, move, and transform large quantities of data from many different sources. In Data Factory, you create data pipelines that ingest data from relational databases, NoSQL databases, and other systems. You can use Azure Machine Learning, Hadoop, Spark, and other services to process and transform that data. Then, at the end of the pipeline, you can publish the transformed data to Azure SQL Data Warehouse, Azure SQL Database, Azure Cosmos DB, and Azure Storage.
Use this service, if you have complex data transformation needs but don’t want to write scripts or compile code.
SCRIPTED OR PROGRAMMATIC TRANSFER
- Azure PowerShell
- Azure CLI
- Which data import method is best for importing daily traffic-camera video data when you have moderate to high network bandwidth?
- Data Box Gateway
- Data Box Edge
- Data Box Disk
- What’s the maximum amount of data that can be transferred to Azure in one operation through the Azure Data Box Disk?
- 35 TB
- 500 TB
- 80 TB
AZURE FILE SYNC
Azure File Sync allows you to extend your on-premises file shares into Azure. It works with your existing on-premises file shares to expand your storage capacity and provide redundancy in the cloud. It requires Windows Server 2012 R2 or later. You can access your on-premises file share with any file-sharing protocol that Windows Server supports, like SMB, NFS, or FTPS.
Azure File Sync uses your on-premises file server as a local cache for your Azure file share. With cloud tiering, you can cache locally on your file server the files your organization uses the most. The files that are used less frequently are accessible from the same local share, but only a pointer to the data is stored there. When a user opens such a file, the rest of the file data is pulled from Azure Files.
File A is used frequently, so the entire file is available on the local file share. File B isn't used often, so when it's opened, the rest of the file is retrieved from the Azure file share. With cloud tiering, you're storing a smaller set of file data locally, so you have more local storage space for the files your organization uses more often. By default, cloud tiering is off. You enable it when you create the server endpoint.
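Cloud tiering can be pictured with a toy model: keep the most recently used files local until the cache budget is spent, and tier the rest. This is only an illustration of the policy idea, not how the File Sync agent is actually implemented:

```python
# Toy model of cloud tiering: hottest files stay local, the rest are
# tiered (only a pointer remains on the local share).
def plan_tiering(files, cache_limit_gb):
    """files: list of (name, size_gb, last_access) tuples.
    Returns (local, tiered) name lists, hottest files first."""
    local, tiered, used = [], [], 0.0
    for name, size, _ in sorted(files, key=lambda f: f[2], reverse=True):
        if used + size <= cache_limit_gb:
            local.append(name)   # fits in the local cache budget
            used += size
        else:
            tiered.append(name)  # archived to the Azure file share
    return local, tiered

files = [("FileA.mp4", 4, 200), ("FileB.mp4", 6, 10), ("logs.txt", 1, 150)]
local, tiered = plan_tiering(files, cache_limit_gb=5)
print(local, tiered)   # FileA and logs.txt stay local; FileB is tiered
```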
- Storage Sync Service is the high-level Azure resource for Azure File Sync. The service is a peer of the storage account and, like a storage account, is deployed into an Azure resource group.
- A sync group outlines the replication topology for a set of files or folders. All endpoints located in the same sync group are kept in sync with each other. If you have different sets of files that must be in sync and managed with Azure File Sync, you would create two sync groups and different endpoints.
- A registered server represents the trust relationship between the on-premises server and the Storage Sync Service. You can register multiple servers to the Storage Sync Service. But a server can be registered with only one Storage Sync Service at a time.
- Azure File Sync agent is a downloadable package that enables Windows Server to be synced with an Azure file share. The agent has three components:
- FileSyncSvc.exe. Service that monitors changes on endpoints.
- StorageSync.sys. Azure file system filter driver.
- PowerShell management cmdlets.
- A server endpoint represents a specific location on a registered server, like a folder on a local disk. Multiple server endpoints can exist on the same volume if their paths don’t overlap.
- The cloud endpoint is the Azure file share that’s part of a sync group. The whole file share syncs, and an Azure file share can be a member of only one sync group at a time.
- Cloud tiering is an optional feature of Azure File Sync that allows frequently accessed files to be cached locally on the server. Files are cached or tiered according to the cloud tiering policy you create.
HOW DOES IT WORK?
Azure File Sync uses a software-based agent that’s installed on the on-premises server that you want to replicate. This agent communicates with the Storage Sync Service.
Azure File Sync uses Windows USN journaling on the Windows Server computer to automatically start a sync session when files change on the server endpoint. So changes made to the on-premises file share are immediately detected and replicated to the Azure file share.
Azure Files doesn’t yet have change notification or journaling. So Azure File Sync has a scheduled job called a change detection job. This job is initiated every 24 hours. So if you change a file on the Azure file share, you might not see the change on the on-premises file share for at least 24 hours.
Before you consider using Azure File Sync with your on-premises servers, be aware of the following possible problems:
- Antivirus: Antivirus programs work by scanning files known for malicious code. This feature might cause an undesired recall of tiered files. Most recent antivirus products, including Microsoft products like Windows Defender and System Center Endpoint Protection, recognize and support dealing with these files. But if you’re using a third-party program, check compatibility with the software vendor.
- Backup: Like antivirus solutions, backup solutions can cause the recall and processing of tiered files. We highly recommend you use Azure Backup because it backs up the data on the Azure file share itself. If you’re restoring files from Azure Backup, it’s important to use volume-level or file-level restore operations when you’re using Azure File Sync. Files restored by these methods are automatically synced to all endpoints in the sync group, and existing files are replaced with the newly restored versions.
- Encryption: Azure File Sync works with common encryption methods from Microsoft, including BitLocker, Azure Information Protection, Azure Rights Management, and Active Directory RMS. Azure File Sync doesn’t work with the NTFS file system encryption method, Encrypted File System (EFS).
Azure File Sync has these system requirements for your local file server:
- Operating system: Windows Server 2012 R2, Windows Server 2016, or Windows Server 2019, in either Datacenter or Standard edition in full or core deployments.
- Memory: 2 GB of RAM or more.
- Patches: Latest Windows patches applied.
- Storage: Locally attached volume formatted in the NTFS file format. Remote storage connected by USB isn’t supported.
The supported NTFS file system features are:
- Access control lists (ACLs): ACLs are preserved and enforced on Windows Server endpoints.
- NTFS compression: Compressing files to save space is fully supported.
- Sparse files: Sparse files are stored in a more efficient way than normal files. Sparse files are supported, but, during the sync to the cloud, they’re stored as normal full files.
- How often does the cloud endpoint change detection job run?
- Every 12 hours.
- Every 8 hours.
- Every 24 hours.
- What is the Azure File Sync agent?
- It’s installed on a server to enable Azure File Sync replication between the local file share and an Azure file share.
- It’s installed on a server to set NTFS permissions on files and folders.
- It’s installed on an Azure file share to control on-premises file and folder replication traffic.
- How do you assess your server’s compatibility with Azure File Sync?
- Download and run the Azure File Sync agent to assess the file share and server.
- Install the Azure PowerShell module on the server and use the cmdlet Invoke-AzStorageSyncCompatibilityCheck.
- Register the server with the Storage Sync Service to have the server evaluated for compatibility.
- In what order do you create the Azure resources needed to support Azure File Sync?
- Storage Sync Service, storage account, file share, and then the sync group.
- Storage account, file share, Storage Sync Service, and then the sync group.
- Storage account, file share, sync group, and then Storage Sync Service.
- What is cloud tiering in Azure File Sync?
- It’s a policy you create that prioritizes the sync order of file shares.
- It’s a policy that sets the frequency at which the sync job runs.
- It’s a feature that archives infrequently accessed files to free up space on the local file share.
- What’s the deployment process for Azure File Sync?
- Evaluate your on-premises system, create the Azure resources, install the Azure File Sync agent, register the on-premises server, and create the server endpoint.
- Create the Azure resources, install the Azure File Sync agent, register the on-premises server, and create the server endpoint.
- Evaluate your on-premises system, create the Azure resources, install the Azure File Sync agent on a virtual machine, register the on-premises server, and create the server endpoint.
- Which of these answers isn’t a possible cause for file sync issues?
- Your company’s firewall rules are blocking network traffic on port 445.
- The virtual machine that runs the Storage Sync Service is stopped.
- Your on-premises server doesn’t support SMB encryption.
- Which caching option is a good choice for write-heavy operations such as storing log files?
- For which type of disk does Azure restart the VM in order to change caching type?
- Operating system (OS)
- Zone-redundant storage (ZRS)
- Suppose you are using Azure PowerShell to manage a VM. You have a local object that represents the VM and you’ve made several updates to that local object. Which PowerShell cmdlet would you use to apply those local changes to the actual VM?
- In general, increased diversity means an increased number of storage accounts. A storage account by itself has no financial cost. However, the settings you choose for the account do influence the cost of services in the account. Use multiple storage accounts, each configured for its workload, to reduce costs.
- The storage account name is used as part of the URI for API access, so it must be globally unique.
- Storage accounts are stable for the lifetime of a project. It’s common to create them at the start of a project.
- Place all your application data, including databases and static content files, on data disks.
- Managed disks guarantee 99.999% availability because data is automatically replicated.
- Ultra SSDs offer the best performance of disk types on Azure.
- Standard-tier disks (standard SSD and HDD) don’t guarantee a minimum throughput. For such a guarantee, use premium SSD disks.
- Azure Backup supports managed disks, so no change is needed.
- Although larger standard SSD sizes support this IOPS, it’s not guaranteed and might occasionally drop below 1,100. Premium SSDs do have a guaranteed performance level.
- Zone-redundant storage (ZRS) copies your data to three different availability zones within a single region. If one datacenter is experiencing an outage, your data remains accessible from another availability zone within the same Azure region.
- With geo-zone-redundant storage (GZRS), your data is copied across three availability zones in your primary region and to a paired secondary region. You could also use geographically redundant storage (GRS), read-access geo-redundant storage (RA-GRS), or read-access geo-zone-redundant storage (RA-GZRS).
- There’s delay before data is copied from the primary region and written to the secondary because data is copied asynchronously. After you fail over, compare the last sync time and last fail over time.
- A shared access signature is a string that contains a security token that can be attached to a URI. Use a shared access signature to delegate access to storage objects and specify constraints, such as the permissions and the time range of access.
- The default network rule is to allow all connections from all networks.
- Advanced Threat Protection detects anomalies in account activity. It then notifies you of potentially harmful attempts to access your account.
- Low bandwidth means Azure File Sync will handle the updating and syncing of files efficiently over the low-bandwidth network.
- LRS: this option is the best because it’s the lowest cost, and because the data is being continuously created, data loss isn’t an issue.
- You send and receive physical disks holding your data between your on-premises location and an Azure datacenter.
- Your data must be written to disk by using a specific format. The tool checks the drive and prepares a journal file that’s then used by an import job.
- You can only export blob data with the Azure Import/Export service.
- Find the encryption keys in the details for the export job in the Azure portal.
- Use AzCopy to transfer small to moderate amounts of data online across the network.
- Data Box Family doesn’t support exporting data from Azure Storage.
- Data Box allows up to an 80-TB transfer. Two Data Box Disk orders allow up to 70 TB.
- Data Box Edge is an online physical appliance that provides for data preprocessing and machine learning before transfer to the cloud.
- The passkey is shown on the Device Details page in the Azure portal.
- The PageBlob folder is the correct place to copy the VHDX.
- Data Box Gateway is ideally suited to continuous ingestion scenarios.
- 35 TB is the maximum amount of transferable data for a single Data Box Disk order.
- The detection job runs every 24 hours.
- Azure File Sync agent is a downloadable package that enables a Windows Server file share to be synced with an Azure file share.
- The results of the cmdlet can tell you if the OS, file system, file names, or folder names have compatibility issues.
- Create the storage account, and then create a file share within the storage account. Create the Storage Sync Service, and then create the sync group within the Storage Sync Service.
- Cloud tiering allows frequently accessed files to be cached on the local server. Infrequently accessed files are tiered, or archived, to the Azure file share according to the policy you create.
- Verify that your on-premises server’s OS and file system are supported. Then create the required resources in Azure. On the local server, install the Azure File Sync agent and register the server. Finally, create the server endpoint in Azure.
- The Storage Sync Service is the high-level Azure resource for Azure File Sync. You create this resource in Azure.
- Write-heavy operations generally do not benefit from caching. ‘None’ is probably the best choice for a disk dedicated to log files.
- Changing the cache setting of the OS disk requires a VM restart.
- This cmdlet updates the state of an Azure virtual machine to the state of a virtual machine object.