Administration Flashcards
Revise the administration concepts in Databricks.
What are workspace admins?
Workspace admins have admin privileges within a single workspace. They can
* manage workspace-level identities,
* regulate compute use, and
* enable and delegate role-based access control (Premium plan or above only).
What is the effect of Unity Catalog on identity management in a workspace?
If your workspace is enabled for Unity Catalog, identities should be added at the account level. Workspace admins can then assign users, groups, and service principals to their workspace.
What compute resources can a workspace admin create?
Workspace admins can create SQL warehouses (a compute resource that lets you run SQL commands on data objects within Databricks SQL) and clusters for their workspace users.
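As a rough, hedged illustration of the first capability, a workspace admin could create a SQL warehouse through the SQL Warehouses REST API (POST /api/2.0/sql/warehouses). The warehouse name, sizing values, and environment variables below are assumptions made for this sketch; verify the field names against the current API reference.

```python
import os
import requests

# Assumed placeholders: workspace URL and a personal access token.
HOST = os.environ["DATABRICKS_HOST"]    # e.g. https://<workspace>.cloud.databricks.com
TOKEN = os.environ["DATABRICKS_TOKEN"]

# Create a small auto-stopping SQL warehouse for workspace users.
resp = requests.post(
    f"{HOST}/api/2.0/sql/warehouses",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "name": "analyst-warehouse",   # hypothetical name
        "cluster_size": "2X-Small",
        "min_num_clusters": 1,
        "max_num_clusters": 2,
        "auto_stop_mins": 10,
    },
)
resp.raise_for_status()
print("Created SQL warehouse:", resp.json().get("id"))
```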
How does a workspace admin regulate compute usage?
Workspace admins have the following tools:
- Limit workspace users’ cluster creation options with cluster policies (see the policy sketch after this list).
Databricks recommends managing all init scripts as cluster-scoped init scripts. Instead of using global init scripts, manage init scripts using cluster policies.
- Learn which compute resources have Unity Catalog access.
- Grant S3 bucket access through clusters using instance profiles.
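As a minimal sketch of the cluster-policy approach above, the definition below caps autoscaling, fixes auto-termination, and pins a cluster-scoped init script, then registers the policy through the Cluster Policies REST API (POST /api/2.0/policies/clusters/create). The attribute paths follow the cluster policy reference as I understand it, and the init script path and policy name are hypothetical; check both against your workspace before use.

```python
import json
import os
import requests

HOST = os.environ["DATABRICKS_HOST"]
TOKEN = os.environ["DATABRICKS_TOKEN"]

# Each key is a cluster attribute path; each value is a constraint on it.
definition = {
    "autoscale.max_workers": {"type": "range", "maxValue": 10},
    "autotermination_minutes": {"type": "fixed", "value": 60, "hidden": True},
    # Pin a cluster-scoped init script stored in workspace files (hypothetical path).
    "init_scripts.0.workspace.destination": {
        "type": "fixed",
        "value": "/Shared/init/install-monitoring.sh",
    },
}

resp = requests.post(
    f"{HOST}/api/2.0/policies/clusters/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"name": "team-standard-policy", "definition": json.dumps(definition)},
)
resp.raise_for_status()
print("Created cluster policy:", resp.json().get("policy_id"))
```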
What are different Admin types in Databricks?
There are two main levels of admin privileges available on the Databricks platform:
Account admins: Manage the Databricks account, including workspace creation, user management, cloud resources, and account usage monitoring.
Workspace admins: Manage workspace identities, access control, settings, and features for individual workspaces in the account.
Additionally, users can be assigned these feature-specific admin roles, which have narrower sets of privileges:
Marketplace admins: Manage their account’s Databricks Marketplace provider profile, including creating and managing Marketplace listings.
Metastore admins: Manage privileges and ownership for all securable objects within a Unity Catalog metastore, such as who can create catalogs or query a table.
What are account Admins?
Account admins have privileges over the entire Databricks account. As an account admin, you can
* create workspaces,
* configure cloud resources,
* enable Unity Catalog (if your Databricks account was created after November 8, 2023, your workspaces might have Unity Catalog enabled by default),
* view usage data (monitor the account with system tables),
* manage account identities, settings, and subscriptions, and
* manage previews.
Account admins can also delegate the account admin and workspace admin roles to any other user.
How does an account admin manage identities?
If you’ve enabled Unity Catalog for at least one workspace in your account, identities (users, groups, and service principals) should be managed in the account console. Account admins can grant permissions and assign workspaces to these identities.
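As a hedged sketch of account-level identity management, the snippet below uses the Databricks SDK for Python (databricks-sdk) to create a user in the account console; the OAuth credential fields, account ID, and email address are placeholders, and the exact method signatures should be confirmed against the SDK documentation.

```python
from databricks.sdk import AccountClient

# Placeholders -- supply your own account ID and service principal credentials.
a = AccountClient(
    host="https://accounts.cloud.databricks.com",
    account_id="<account-id>",
    client_id="<service-principal-client-id>",
    client_secret="<service-principal-secret>",
)

# Add a user at the account level; workspace admins can then assign
# this identity to their workspaces.
user = a.users.create(
    user_name="new.analyst@example.com",
    display_name="New Analyst",
)
print("Created account-level user:", user.id)
```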
What are the different support options in Databricks?
If you have any questions about setting up Databricks and need live help, please e-mail onboarding-help@databricks.com.
If you have a Databricks support package, you can open and manage support cases with Databricks.
If your organization does not have a Databricks support subscription, or if you are not an authorized contact for your company’s support subscription, you can get answers to many questions in Databricks Office Hours or from the Databricks Community.
How to locate the account ID?
To retrieve your account ID, go to the account console and click the down arrow next to your username in the upper right corner. In the drop-down menu you can view and copy your Account ID.
You must be in the account console to retrieve the account ID; the ID is not displayed inside a workspace.
What are different billing methods in Databricks?
- Pay-as-you-go accounts through AWS Marketplace
- Pay-as-you-go accounts paid by credit card to Databricks
- Contract accounts
Your account’s billing method is permanent and cannot be changed after you sign up for your account.
What is the default pricing plan for new accounts?
By default, new accounts are on the Premium plan, which adds audit logging, role-based access control, and other features that give you more control over security, governance, and more.
Are access control settings enabled if you upgrade from the Standard plan to a higher plan?
Access control settings are disabled by default on workspaces that are upgraded from the Standard plan to the Premium plan or above. Once an access control setting is enabled, it cannot be disabled.
What are the things to keep in mind after canceling a subscription?
After you cancel your subscription:
- You can no longer access workspaces, notebooks, or data in your Databricks account.
- In accordance with the Databricks terms of service, any Customer Content contained within workspaces tied to your subscription will be deleted within 30 days of cancellation.
- You can’t sign up for a new subscription using the same email address. You must provide a new email address in the sign-up form.
Once a subscription is canceled, Databricks is not responsible for cleaning up the resources attached to the account. Databricks recommends terminating all compute resources before you cancel your subscription. Additionally, terminate any Databricks-associated resources from your AWS console.
What are the steps to delete a Databricks account?
Before you delete a Databricks account, you must first cancel your Databricks subscription and delete all Unity Catalog metastores in the account. After you delete all metastores associated with your organization’s account, you can start the process to delete your account.
If you need to delete your Databricks account, reach out to your account team for assistance or file a ticket at help.databricks.com.
Where can you import a usage dashboard?
Account admins can import cost management AI/BI dashboards to any Unity Catalog-enabled workspace in their account.
Which underlying tables does the usage dashboard use?
To use the imported dashboard, a user must have SELECT permission on the system.billing.usage and system.billing.list_prices tables. The dashboard’s data is subject to the usage table’s retention policies.
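For example, a user with SELECT on these system tables could reproduce the dashboard’s core aggregation in a notebook with a query like the sketch below. The column names (usage_date, sku_name, usage_quantity) follow the documented usage table schema, but check them against your own system tables; spark and display are assumed to be the notebook-provided objects.

```python
# Run in a Databricks notebook on Unity Catalog-enabled compute.
# Requires SELECT on system.billing.usage.
df = spark.sql(
    """
    SELECT usage_date,
           sku_name,
           SUM(usage_quantity) AS dbus
    FROM system.billing.usage
    WHERE usage_date >= date_sub(current_date(), 30)
    GROUP BY usage_date, sku_name
    ORDER BY usage_date
    """
)
display(df)
```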
What is the maximum number of data lines shown in the usage graph?
If there are more than 10 workspaces, SKUs, or tag values, the chart displays the nine with the highest usage. The usage of the remaining workspaces, SKUs, or tag values is aggregated and displayed in a single line, which is labeled as combined.
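To illustrate that roll-up outside the dashboard, the sketch below (reusing the assumed system.billing.usage schema from the previous example) keeps the nine highest-usage workspaces and folds everything else into a single "combined" series.

```python
from pyspark.sql import functions as F

usage = spark.table("system.billing.usage")

# Total DBUs per workspace.
per_ws = usage.groupBy("workspace_id").agg(F.sum("usage_quantity").alias("dbus"))

# Keep the nine highest-usage workspaces; label the rest "combined".
top9 = [r["workspace_id"] for r in per_ws.orderBy(F.desc("dbus")).limit(9).collect()]
rolled = (
    per_ws.withColumn(
        "series",
        F.when(F.col("workspace_id").isin(top9), F.col("workspace_id").cast("string"))
         .otherwise(F.lit("combined")),
    )
    .groupBy("series")
    .agg(F.sum("dbus").alias("dbus"))
)
display(rolled)
```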
What are the limitations of budgets?
- There could be up to a 24-hour delay between usage occurring and an email notification being sent.
- After you create a new budget, there could be a delay before you can see the budget details.
- Budgets do not factor in any billing credits or negotiated discounts your account might have. The spent amount is calculated by multiplying usage by the SKU list price.
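As a rough approximation of how the spent amount is derived (usage multiplied by the SKU list price, ignoring credits and discounts), the sketch below joins the two billing system tables; the pricing.default column and the time-range join conditions are taken from the documented list_prices schema, so confirm them before relying on the numbers.

```python
# Approximate list-price spend: usage quantity x SKU list price.
# Requires SELECT on system.billing.usage and system.billing.list_prices.
spend = spark.sql(
    """
    SELECT u.usage_date,
           u.sku_name,
           SUM(u.usage_quantity * p.pricing.default) AS list_price_spend
    FROM system.billing.usage AS u
    JOIN system.billing.list_prices AS p
      ON u.sku_name = p.sku_name
     AND u.usage_start_time >= p.price_start_time
     AND (p.price_end_time IS NULL OR u.usage_start_time < p.price_end_time)
    GROUP BY u.usage_date, u.sku_name
    ORDER BY u.usage_date
    """
)
display(spend)
```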
What is the default level of network access for serverless compute?
Serverless compute for notebooks and jobs has unrestricted access to the public internet by default.
Who can enable serverless compute in an account, and what can it be used for?
After an account admin enables serverless compute, all eligible workspaces in the account will have access to use serverless compute for notebooks, jobs, and Delta Live Tables.
If your account was created after March 28, 2022, serverless compute is enabled by default for your workspaces.
What is the eligibility for serverless compute enablement in a workspace?
To be eligible for serverless compute for notebooks and jobs, your workspace must meet the following requirements:
- Must have Unity Catalog enabled.
- Must be in a supported region.
What are No Isolation Shared clusters?
No Isolation Shared clusters run arbitrary code from multiple users in the same shared environment, similar to what happens on a cloud Virtual Machine that is shared across multiple users.
Data or internal credentials provisioned to that environment might be accessible to any code running within that environment. To call Databricks APIs for normal operations, access tokens are provisioned on behalf of users to these clusters.
When a higher-privileged user, such as a workspace administrator, runs commands on a cluster, their higher-privileged token is visible in the same environment.
Who can protect admin credentials from being shared on No Isolation Shared clusters?
**Account admins** can prevent internal credentials from being automatically generated for Databricks workspace admins on No Isolation Shared clusters.
What are the limitations of enabling admin protection for No Isolation Shared clusters?
The following Databricks features do not work if you enable admin protection for No Isolation Shared clusters on your account:
- Machine Learning Runtime workloads.
- Workspace files.
- dbutils Secrets utility.
- dbutils Notebook utility.
- Delta Lake operations by admins that create, modify, or update data.
Other features might not work for admin users on this cluster type because these features rely on automatically generated internal credentials.
In those cases, Databricks recommends that admins do one of the following:
- Use a different cluster type than “No isolation shared” or its equivalent legacy cluster types.
- Create a non-admin user when using No Isolation Shared clusters.
What are the prerequisites for enabling automatic cluster update?
Enabling this feature on a workspace requires that you add the Enhanced Security and Compliance add-on as described on the pricing page. This feature also requires the Enterprise pricing tier.
What is the default time for automatic cluster update?
By default, automatic cluster update is scheduled for **the first Sunday of every month at 1:00 AM UTC**. Account admins can use the Databricks account console to change the maintenance window frequency, start date, and start time.
Which resources does automatic cluster update apply to?
This applies to all compute resources that run in the classic compute plane:
1. clusters,
2. pools,
3. classic SQL warehouses, and
4. legacy Model Serving.
It does not apply to serverless compute resources.
Will the machines restart during the maintenance window of automatic cluster update?
If there are no updates for images for running compute resources, by default they are not restarted but you can configure the feature to force restart during the maintenance window.
Who can enable automatic cluster update on a workspace?
You must be an account admin to configure automatic cluster update. Although this setting is configured for each workspace, the controls for this feature are part of the account console UI, not the workspace admin console.
What are serverless quotas?
Serverless quotas are a safety measure for serverless compute. Quotas are enforced on serverless compute for notebooks, jobs, Delta Live Tables, and for serverless SQL warehouses.
Quotas are measured in Databricks Units (DBUs) per hour. A DBU is a normalized unit of processing power on the Databricks platform used for measurement and pricing purposes.
What are the serverless quotas for serverless compute for notebooks, jobs, and Delta Live Tables?
Serverless compute for notebooks, jobs, and Delta Live Tables includes a scale-up limit that imposes a maximum cost per workload per hour.
Because this limit is enforced per workload, it does not prevent you from launching new jobs or notebooks using serverless compute.
What are the serverless quotas for serverless SQL warehouses?
Quotas for serverless SQL warehouses restrict the number of serverless compute resources a customer can have at a time. This quota is enforced at the regional level for all workspaces in your account.
When you reach your serverless quota in a region, workspaces cannot launch new serverless SQL warehouses. Reaching this limit does not terminate existing SQL warehouses in the region. However, if you reach the limit, Databricks prevents increasing the number of compute resources in the warehouse.
There are initial default quotas for accounts, but Databricks proactively increases quotas based on your type of account and how you use Databricks. **Databricks intends to prevent typical customers from reaching quota limits during normal usage.**
What are the two subscription tiers available in Databricks?
Premium and Enterprise tiers. The Standard tier is now deprecated.
What is a workspace in Databricks?
A workspace is a Databricks deployment in a cloud service account. It provides a unified environment for working with Databricks assets for a specified set of users.
After how much time does the usage information get updated?
If you subscribed to Databricks through AWS Marketplace, Databricks usage takes 24 hours to appear in the AWS Billing & Cloud Management dashboard. Usage appears in the Databricks account console after one hour.