Path2.Mod1.b - Make Data Available - Creating Datastores Flashcards
I-B AK ST, ab
Create an Azure Blob Datastore through CLI:
- Three credential types you can use to connect to a Blob
- The type value for Blob
- The full CLI command line assuming a YAML file
The CLI requires you to run az ml datastore create --file <yaml file> with a YAML file that sets type: azure_blob and supplies one of these credentials:
1. Identity-based access using account_name
2. Account Key using credentials: account_key
3. SAS Tokens using credentials: sas_token
Example of the YAML file. Note the type and the credentials:
```yaml
# my_blob_datastore.yml
$schema: https://azuremlschemas.azureedge.net/latest/azureBlob.schema.json
name: blob_example
type: azure_blob
description: Datastore pointing to a blob container.
account_name: mytestblobstore
container_name: data-container
credentials:
  account_key: XXXxxxXXXxXXXXxxXXXXXxXXXXXxXxxXxXXXxXXX
```
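For comparison, a sketch of the identity-based (credential-less) variant from option 1 above: the credentials block is simply omitted and only account_name identifies the storage account. The name and description below are illustrative, not from the original card.

```yaml
# my_blob_credless_datastore.yml (illustrative sketch)
$schema: https://azuremlschemas.azureedge.net/latest/azureBlob.schema.json
name: blob_credless_example
type: azure_blob
description: Credential-less datastore pointing to a blob container.
account_name: mytestblobstore
container_name: data-container
```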
I-B AK ST, ABDs
Create an Azure Blob Datastore through Python SDK:
- Three credential types you can use to connect to a Blob
- The Constructor type for Blob Datastores
The Python SDK mirrors the CLI: construct an AzureBlobDatastore, then call MLClient.create_or_update(datastore_instance), with credentials set one of these ways:
1. Identity-based access using account_name
2. Account Key using credentials: AccountKeyConfiguration(account_key)
3. SAS Token using credentials: SasTokenConfiguration(sas_token)
Example of that code. Note the AzureBlobDatastore constructor and SasTokenConfiguration:
```python
from azure.ai.ml.entities import AzureBlobDatastore
from azure.ai.ml.entities import SasTokenConfiguration
from azure.ai.ml import MLClient

ml_client = MLClient.from_config()
store = AzureBlobDatastore(
    name="blob_sas_example",
    description="Datastore pointing to a blob container using SAS token.",
    account_name="mytestblobstore",
    container_name="data-container",
    credentials=SasTokenConfiguration(
        sas_token="?xx=XXXX-XX-XX&xx=xxxx&xxx=xxx"
    ),
)
ml_client.create_or_update(store)
```
I-B SP, adlg2
Create an Azure Data Lake Gen 2 Datastore through CLI:
- Two credential types you can use to connect to a Data Lake Gen 2
- The type value for Data Lake Gen 2
The CLI requires you to run az ml datastore create --file <yaml file> with a YAML file that sets type: azure_data_lake_gen2 and supplies one of these credentials:
1. Identity-based access using account_name
2. Service Principal using credentials: tenant_id, client_id, client_secret
Example of the YAML file. Note the type and the credentials:
```yaml
# my_adls_datastore.yml
$schema: https://azuremlschemas.azureedge.net/latest/azureDataLakeGen2.schema.json
name: adls_gen2_example
type: azure_data_lake_gen2
description: Datastore pointing to an Azure Data Lake Storage Gen2.
account_name: mytestdatalakegen2
filesystem: my-gen2-container
credentials:
  tenant_id: XXXXXXXX-XXXX-XXXX-XXXX-XXX
  client_id: XXXXXXXX-XXXX-XXXX-XXXX-XXXX
  client_secret: XXXXXXXXXXXXXXXXXXXXXXX
```
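A sketch of the identity-based variant from option 1 above, following the same credential-less pattern the Gen 1 card uses: drop the credentials block and keep only account_name and filesystem. The name and description here are illustrative.

```yaml
# my_adls_credless_datastore.yml (illustrative sketch)
$schema: https://azuremlschemas.azureedge.net/latest/azureDataLakeGen2.schema.json
name: adls_gen2_credless_example
type: azure_data_lake_gen2
description: Credential-less datastore pointing to an Azure Data Lake Storage Gen2.
account_name: mytestdatalakegen2
filesystem: my-gen2-container
```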
I-B SP, ADLG2Ds
Create an Azure Data Lake Gen 2 Datastore through Python SDK:
- Two credential types you can use to connect to a Data Lake Gen 2
- The Constructor type for Data Lake Gen 2 Datastores
The Python SDK mirrors the CLI: construct an AzureDataLakeGen2Datastore, then call MLClient.create_or_update(datastore_instance), with credentials set one of these ways:
1. Identity-based access using account_name
2. Service Principal using credentials: ServicePrincipalCredentials(tenant_id, client_id, client_secret)
Example of that code. Note the AzureDataLakeGen2Datastore constructor and ServicePrincipalCredentials:
```python
from azure.ai.ml.entities import AzureDataLakeGen2Datastore
from azure.ai.ml.entities._datastore.credentials import ServicePrincipalCredentials
from azure.ai.ml import MLClient

ml_client = MLClient.from_config()
store = AzureDataLakeGen2Datastore(
    name="adls_gen2_example",
    description="Datastore pointing to an Azure Data Lake Storage Gen2.",
    account_name="mytestdatalakegen2",
    filesystem="my-gen2-container",
    credentials=ServicePrincipalCredentials(
        tenant_id="XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX",
        client_id="XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX",
        client_secret="XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
    ),
)
ml_client.create_or_update(store)
```
AK ST, af
Create an Azure Files Datastore through CLI:
- Two credential types you can use to connect to a File Store
- The type value for File Store
The CLI requires you to run az ml datastore create --file <yaml file> with a YAML file that sets type: azure_file and supplies one of these credentials:
1. Account Key access using credentials: account_key
2. SAS Token using credentials: sas_token
Example of the YAML file. Note the type and the credentials:
```yaml
# my_files_datastore.yml
$schema: https://azuremlschemas.azureedge.net/latest/azureFile.schema.json
name: file_sas_example
type: azure_file
description: Datastore pointing to an Azure File Share using SAS token.
account_name: mytestfilestore
file_share_name: my-share
credentials:
  sas_token: ?xx=XXXX-XX-XX&xx=xxxx&xxx=xxx&xx=xxxxx
```
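A sketch of the Account Key variant from option 1 above, mirroring the account_key pattern shown in the Blob card. The name, description, and placeholder key are illustrative.

```yaml
# my_files_ak_datastore.yml (illustrative sketch)
$schema: https://azuremlschemas.azureedge.net/latest/azureFile.schema.json
name: file_ak_example
type: azure_file
description: Datastore pointing to an Azure File Share using an account key.
account_name: mytestfilestore
file_share_name: my-share
credentials:
  account_key: XXXxxxXXXxXXXXxxXXXXXxXXXXXxXxxXxXXXxXXX
```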
AK ST, AFDs
Create an Azure Files Datastore through Python SDK:
- Two credential types you can use to connect to a File Store
- The Constructor type for File Datastores
The Python SDK mirrors the CLI: construct an AzureFileDatastore, then call MLClient.create_or_update(datastore_instance), with credentials set one of these ways:
1. Account Key access using AccountKeyConfiguration(account_key)
2. SAS Token using SasTokenConfiguration(sas_token)
Example of that code. Note the AzureFileDatastore constructor and SasTokenConfiguration:
```python
from azure.ai.ml.entities import AzureFileDatastore
from azure.ai.ml.entities import SasTokenConfiguration
from azure.ai.ml import MLClient

ml_client = MLClient.from_config()
store = AzureFileDatastore(
    name="file_sas_example",
    description="Datastore pointing to an Azure File Share using SAS token.",
    account_name="mytestfilestore",
    file_share_name="my-share",
    credentials=SasTokenConfiguration(
        sas_token="?xx=XXXX-XX-XX&xx=xxxx&xxx=xxXXXX"
    ),
)
ml_client.create_or_update(store)
```
I-B SP, adlg1
Create an Azure Data Lake Gen 1 Datastore through CLI:
- Two credential types you can use to connect to a Data Lake Gen 1
- The type value for Data Lake Gen 1
The CLI is exactly the same as Gen 2 except that the YAML file sets type: azure_data_lake_gen1. You run az ml datastore create --file <yaml file> with credentials set one of these ways:
1. Identity-based access based on system authentication
2. Service Principal using credentials: tenant_id, client_id, client_secret
Example of the YAML file. Note the type and the simplicity of identity-based access:
```yaml
# my_adls_datastore.yml
$schema: https://azuremlschemas.azureedge.net/latest/azureDataLakeGen1.schema.json
name: adls_gen1_credless_example
type: azure_data_lake_gen1
description: Credential-less datastore pointing to an Azure Data Lake Storage Gen1.
store_name: mytestdatalakegen1
```
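A sketch of the Service Principal variant from option 2 above, following the same credentials block shape as the Gen 2 card (note Gen 1 uses store_name rather than account_name). The name, description, and placeholder values are illustrative.

```yaml
# my_adls_gen1_sp_datastore.yml (illustrative sketch)
$schema: https://azuremlschemas.azureedge.net/latest/azureDataLakeGen1.schema.json
name: adls_gen1_sp_example
type: azure_data_lake_gen1
description: Datastore pointing to an Azure Data Lake Storage Gen1 using a service principal.
store_name: mytestdatalakegen1
credentials:
  tenant_id: XXXXXXXX-XXXX-XXXX-XXXX-XXX
  client_id: XXXXXXXX-XXXX-XXXX-XXXX-XXXX
  client_secret: XXXXXXXXXXXXXXXXXXXXXXX
```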
I-B SP, ADLG1Ds
Create an Azure Data Lake Gen 1 Datastore through Python SDK:
- Two credential types you can use to connect to a Data Lake Gen 1
- The Constructor type for Data Lake Gen 1 Datastores
The Python SDK is exactly the same as Gen 2 except that it uses the AzureDataLakeGen1Datastore constructor, then calls MLClient.create_or_update(datastore_instance):
1. Identity-based access based on system authentication
2. Service Principal using credentials: ServicePrincipalCredentials(tenant_id, client_id, client_secret)
Example of that code. Note the AzureDataLakeGen1Datastore constructor and ServicePrincipalCredentials:
```python
from azure.ai.ml.entities import AzureDataLakeGen1Datastore
from azure.ai.ml.entities._datastore.credentials import ServicePrincipalCredentials
from azure.ai.ml import MLClient

ml_client = MLClient.from_config()
store = AzureDataLakeGen1Datastore(
    name="adls_gen1_example",
    description="Datastore pointing to an Azure Data Lake Storage Gen1.",
    store_name="mytestdatalakegen1",
    credentials=ServicePrincipalCredentials(
        tenant_id="XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX",
        client_id="XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX",
        client_secret="XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
    ),
)
ml_client.create_or_update(store)
```
Of the four Datastores we have studied, which one does not accept Identity-Based Credentials?
Azure File Datastore
Of the four Datastores we have studied, which ones can accept Service Principal for Credentials?
Azure Data Lakes (Gen 1 and 2)
Of the four Datastores we have studied, which ones can accept Access Key and SAS Token Credentials (Credentials-based authentication)?
Azure Blob Datastores and Azure File Datastores