Create Azure Batch for Covalent

1. Install Terraform

sudo dnf install -y dnf-plugins-core
sudo dnf config-manager --add-repo https://rpm.releases.hashicorp.com/RHEL/hashicorp.repo
sudo dnf -y install terraform

2. Install Azure CLI

Refer to "Install the Azure CLI on Linux" on Microsoft Learn.

# Example for CentOS 8
sudo rpm --import https://packages.microsoft.com/keys/microsoft.asc
sudo dnf install -y https://packages.microsoft.com/config/rhel/8/packages-microsoft-prod.rpm
sudo dnf install -y azure-cli
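
Before moving on, it can help to confirm that both tools installed in steps 1 and 2 are on the PATH. The following is a minimal sketch. The name check_tools is a hypothetical helper, not part of Terraform or the Azure CLI.

```shell
# check_tools is a hypothetical helper: it reports whether each named
# command can be found on PATH and returns non-zero if any is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage after steps 1 and 2:
#   check_tools terraform az && terraform version && az version
```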

3. Log in to the Azure CLI

az login  # Follow the prompts to complete the sign-in
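
If the account can access more than one subscription, the active subscription in the CLI may not be the one you intend to use. The sketch below assumes you already know the subscription ID (the same value used later in terraform.tfvars). The name use_subscription is a hypothetical helper.

```shell
# use_subscription is a hypothetical helper: it makes the given
# subscription the active one and then prints the active account.
use_subscription() {
    az account set --subscription "$1" &&
        az account show --query '{name:name, id:id, tenantId:tenantId}' --output table
}

# Usage:
#   use_subscription "my-subscription-id"
```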

4. Download Terraform scripts for Azure Batch

Download the Terraform scripts archive, scripts.tar, to your HOME folder.

Extract the files from scripts.tar:

tar -xvf scripts.tar
mv scripts/ azurebatch_terraform_scripts/

5. Configuration

cd $HOME/azurebatch_terraform_scripts

vi terraform.tfvars
prefix          = "my-prefix"
subscription_id = "my-subscription-id"
tenant_id       = "my-tenant-id"
vm_name         = "Standard_A1_v2"
owners          = ["my-user-id"]

vi versions.tf
required_version = "~> 1.6.2"
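
After editing terraform.tfvars, a quick check can catch sample values that were left in place. This is a minimal sketch that assumes the placeholders all start with "my-", as in the sample above. A real value that happens to start with "my-" would be reported too. The name check_tfvars is a hypothetical helper.

```shell
# check_tfvars is a hypothetical helper: it fails if the file still
# contains a quoted value starting with "my-".
check_tfvars() {
    file="${1:-terraform.tfvars}"
    if [ ! -f "$file" ]; then
        echo "file not found: $file" >&2
        return 2
    fi
    # grep prints the offending lines with their line numbers.
    if grep -n '"my-' "$file"; then
        echo "placeholder values remain in $file" >&2
        return 1
    fi
    echo "no placeholders found in $file"
}

# Usage:
#   cd $HOME/azurebatch_terraform_scripts && check_tfvars
```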

Notes:

prefix: The names of all created cloud resources start with the value of prefix.

subscription_id, tenant_id, owners: The subscription ID, tenant ID, and user ID can be found in the Azure web console. See chapter 6 of https://hpc.lenovo.com/lico/hybrid/hpc/en-us/initialize.html.

vm_name: The VM size in Azure, for example Standard_A1_v2.
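
As an alternative to the web console, the same IDs can usually be read from the logged-in Azure CLI session. This is a sketch under two assumptions: that the owners entry is the object ID of the signed-in user, and that your Azure CLI version exposes it as the id field (older versions used objectId). The name print_azure_ids is a hypothetical helper.

```shell
# print_azure_ids is a hypothetical helper: it prints tfvars-style lines
# built from the active Azure CLI session.
print_azure_ids() {
    echo "subscription_id = \"$(az account show --query id --output tsv)\""
    echo "tenant_id       = \"$(az account show --query tenantId --output tsv)\""
    # Assumption: owners expects the signed-in user's object ID.
    echo "owners          = [\"$(az ad signed-in-user show --query id --output tsv)\"]"
}

# Usage (after az login):
#   print_azure_ids
```

Compare the printed values with the Azure web console before copying them into terraform.tfvars.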

6. Create Azure Batch Resources

Initialize the Terraform environment:

[root@head azurebatch_terraform_scripts]# terraform init

Initializing the backend...

Initializing provider plugins...
- Reusing previous version of hashicorp/azurerm from the dependency lock file
- Reusing previous version of hashicorp/azuread from the dependency lock file
- Using previously-installed hashicorp/azurerm v3.73.0
- Using previously-installed hashicorp/azuread v2.42.0

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.

Generate the execution plan:

[root@head azurebatch_terraform_scripts]# terraform plan

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

...

Plan: 14 to add, 0 to change, 0 to destroy.

Changes to Outputs:
  + covalent_azurebatch_object = (known after apply)
  + plugin_client_secret       = (sensitive value)
  + plugin_client_username     = (known after apply)
  + user_identity_resource_id  = (known after apply)
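
If you prefer to apply exactly the plan you reviewed rather than using -auto-approve, Terraform can save the plan to a file and apply that file later. This is a minimal sketch. The file name tfplan and the helper names save_plan and apply_saved_plan are hypothetical.

```shell
# Hypothetical plan file name; override by setting PLAN_FILE beforehand.
PLAN_FILE="${PLAN_FILE:-tfplan}"

# Write the execution plan to PLAN_FILE so it can be reviewed.
save_plan() {
    terraform plan -out="$PLAN_FILE"
}

# Apply the saved plan. Terraform does not ask for confirmation when
# it is given a saved plan file.
apply_saved_plan() {
    terraform apply "$PLAN_FILE"
}

# Usage:
#   save_plan
#   terraform show "$PLAN_FILE"   # review the saved plan
#   apply_saved_plan
```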

Create the resources:

[root@head azurebatch_terraform_scripts]# terraform apply -auto-approve

...

Apply complete! Resources: 14 added, 0 changed, 0 destroyed.

Outputs:

covalent_azurebatch_object = <<EOT
    executor = ct.executor.AzureBatchExecutor(
        tenant_id="my-tenant-id",
        client_id="my-client-id",
        client_secret=plugin_client_secret,
        batch_account_url="my-batch-account-url",
        storage_account_name="my-storage-account-name",
        pool_id="my-pool-id",
    )

EOT
plugin_client_secret = <sensitive>
plugin_client_username = "cd2b13bc-562f-4430-bd58-8f29f44aed7e"
user_identity_resource_id = "8bbd3fda-a0af-4e1c-8701-ca696a525196"

Get the client secret:

[root@head azurebatch_terraform_scripts]# terraform output -raw plugin_client_secret
OZB8Q~YeIC5uroIuG6znMcyniTutKaKXyn46dcee
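
To see all values needed for the executor form in one place, the two outputs can be printed together. This is a minimal sketch that relies only on the output names shown above. The name show_executor_values is a hypothetical helper. Note that it prints the secret to the terminal.

```shell
# show_executor_values is a hypothetical helper: it prints the executor
# snippet followed by the client secret.
show_executor_values() {
    terraform output -raw covalent_azurebatch_object
    printf 'client_secret = '
    terraform output -raw plugin_client_secret
    printf '\n'
}

# Usage (run inside azurebatch_terraform_scripts after terraform apply):
#   show_executor_values
```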

7. Add a Covalent executor for Azure Batch in the LiCO web portal

In the Azure Batch Executor form, fill in the values from the terraform apply output. Covalent templates can then use the Azure Batch Executor.

8. Delete Azure Batch resources

If you no longer need the Azure Batch resources, delete them with the following command.

[root@head azurebatch_terraform_scripts]# terraform destroy -auto-approve
azuread_application.batch: Refreshing state... [id=866dbd77-1fc2-4e6f-b76e-96281a596b2b]
azuread_service_principal.batch: Refreshing state... [id=4fd2abff-157d-4d50-b775-46c4aafc90db]
azuread_service_principal_password.covalent_plugin: Refreshing state... [id=4fd2abff-157d-4d50-b775-46c4aafc90db/password/4d4677df-a178-42a5-8484-16e195300d36]
azurerm_role_definition.covalent_batch: Refreshing state... [id=/subscriptions/66139a44-e4f5-4135-aa86-de2152485836/providers/Microsoft.Authorization/roleDefinitions/bc0fe064-e428-8651-0a21-e3075b6d85d9|/subscriptions/66139a44-e4f5-4135-aa86-de2152485836]
azurerm_resource_group.batch: Refreshing state... [id=/subscriptions/66139a44-e4f5-4135-aa86-de2152485836/resourceGroups/lenovolico-covalent-batch]
azurerm_role_assignment.covalent_plugin_storage: Refreshing state... [id=/subscriptions/66139a44-e4f5-4135-aa86-de2152485836/providers/Microsoft.Authorization/roleAssignments/06c0ebd2-8013-89e3-da8d-d0afb9d4484b]
azurerm_user_assigned_identity.batch: Refreshing state... [id=/subscriptions/66139a44-e4f5-4135-aa86-de2152485836/resourceGroups/lenovolico-covalent-batch/providers/Microsoft.ManagedIdentity/userAssignedIdentities/lenovolicocovalentbatch]
azurerm_storage_account.batch: Refreshing state... [id=/subscriptions/66139a44-e4f5-4135-aa86-de2152485836/resourceGroups/lenovolico-covalent-batch/providers/Microsoft.Storage/storageAccounts/lenovolicocovalentbatch]
azurerm_role_assignment.covalent_plugin_batch: Refreshing state... [id=/subscriptions/66139a44-e4f5-4135-aa86-de2152485836/providers/Microsoft.Authorization/roleAssignments/b9e4f215-9376-e060-7dbb-b65325e1eb26]
azurerm_role_assignment.batch_to_storage: Refreshing state... [id=/subscriptions/66139a44-e4f5-4135-aa86-de2152485836/providers/Microsoft.Authorization/roleAssignments/fba7c591-3402-7985-1475-eb46bb31d412]
azurerm_role_assignment.batch_to_acr: Refreshing state... [id=/subscriptions/66139a44-e4f5-4135-aa86-de2152485836/providers/Microsoft.Authorization/roleAssignments/7581b0ae-2092-ddb6-4dc2-2fd470a96cc4]
azurerm_storage_container.assets: Refreshing state... [id=https://lenovolicocovalentbatch.blob.core.windows.net/covalent-assets]
azurerm_batch_account.covalent: Refreshing state... [id=/subscriptions/66139a44-e4f5-4135-aa86-de2152485836/resourceGroups/lenovolico-covalent-batch/providers/Microsoft.Batch/batchAccounts/lenovolicocovalentbatch]
azurerm_batch_pool.covalent: Refreshing state... [id=/subscriptions/66139a44-e4f5-4135-aa86-de2152485836/resourceGroups/lenovolico-covalent-batch/providers/Microsoft.Batch/batchAccounts/lenovolicocovalentbatch/pools/default]

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  - destroy

Terraform will perform the following actions:

  ...
azurerm_resource_group.batch: Destruction complete after 1m7s

Destroy complete! Resources: 14 destroyed.
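
After the destroy completes, you can confirm from the Azure CLI that the resource group no longer exists. This sketch assumes the resource group is named "<prefix>-covalent-batch", which is the pattern visible in the log above (lenovolico-covalent-batch). Check the name against your own output. The name resource_group_gone is a hypothetical helper.

```shell
# resource_group_gone is a hypothetical helper: it succeeds when the
# resource group for the given prefix no longer exists.
resource_group_gone() {
    # Assumption: the group name follows "<prefix>-covalent-batch".
    rg="$1-covalent-batch"
    # az group exists prints "true" or "false".
    [ "$(az group exists --name "$rg")" = "false" ]
}

# Usage:
#   resource_group_gone "my-prefix" && echo "resource group deleted"
```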