Study Areas Flashcards
What are the 10 high-level services that make up a Tableau Server installation (2019.2)? Hint: see the diagram on the Admin Guide page. Note: there are more ‘processes’ running within these high-level services.
Gateway, Backgrounder, Data Server, Data Engine, VizQL Server, Cache Server, Application Server, Ask Data, Elastic Cache, Tableau Prep Conductor
Which Process: What process is the component that redirects traffic from all Tableau clients to the available server nodes in a cluster
Gateway
Which Process: What is a logical grouping of services that provide data freshness, shared metadata management, governed data sources, and in-memory data? The underlying processes that power Data Services are the Backgrounder, Data Server, and Data Engine processes.
Data Services
Which Process: Composed of the VizQL Server and Cache Server processes, provides user-facing visualization and analytics services and caching services
Analytics Services
Which Process: Sharing and Collaboration, and Content Management Service are powered by the Application Server process. Core Tableau Server functionality such as user login, content management (projects, sites, permissioning, etc.) and administration activities are provided by the Application Server process.
Application Server
Which Process: contains structured relational data like metadata, permissions, workbooks, data extracts, user info, and other data. The File Store process enables data extract file redundancy across the cluster and ensures extracts are locally available on all cluster nodes. Under heavier loads, extract files are available locally across the cluster for faster processing and rendering. All of the Tableau Server services use and rely on the Repository process
Repository Service
Where can you install Tableau Server
Tableau’s architecture is flexible, allowing you to run the platform just about anywhere. You can install Tableau Server on-premises, in your private cloud or data center, on Amazon EC2, on Google Cloud Platform, or on MS Azure. Tableau analytics platform can also run atop virtualization platforms. We recommend you follow the best practices for each virtualization platform to ensure the best performance from Tableau Server
Coordination Service - Overview
The Coordination Service is built on Apache ZooKeeper, an open-source project, and coordinates activities on the server, guaranteeing a quorum in the event of a failure, and serving as the source of “truth” regarding the server topology, configuration, and state. The service is installed automatically on the initial Tableau Server node, but no additional instances are installed as you add additional nodes. Because the successful functioning of Tableau Server depends on a properly functioning Coordination Service, we recommend that for server installations of three or more nodes, you add additional instances of the Coordination Service by deploying a new Coordination Service ensemble. This provides redundancy and improved availability in the event that one instance of the Coordination Service has problems.
Coordination Service - Hardware
The hardware for your cluster can have some effect on how well the Coordination Service runs. In particular:
Memory. The Coordination Service maintains state information in memory. By design, the memory footprint is small, and is typically not a factor in overall server performance.
Disk speed. Because the service stores state information on disk, it benefits from fast disk speed on the individual node computers.
Connection speed between nodes. The service communicates continuously between cluster nodes; a fast connection between nodes helps with efficient synchronization.
Coordination Service - Configuration
The Coordination Service is installed automatically on the initial node of Tableau Server. If you are running a single-node installation, you do not need to do anything to deploy or configure the Coordination Service. If your installation includes three or more nodes, you’ll be prompted to configure a Coordination Service ensemble when you add your third node. This is not required, but is highly recommended as the Coordination Service serves a key function for high availability, acting as the source of “truth” about server topology, configuration, and state.
To configure a Coordination Service ensemble, use the TSM CLI and add the Coordination Service to the nodes you want running it. For details on how to deploy a Coordination Service ensemble, see Deploy a Coordination Service Ensemble .
Coordination Service - Quorum
To ensure that the Coordination Service can work properly, the service requires a quorum—a minimum number of instances of the service. This means that the number of nodes in your installation impacts how many instances of the Coordination Service you want to configure in your ensemble.
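A rough illustration of the quorum idea, assuming the usual ZooKeeper rule that a strict majority of ensemble instances must be available. The function names are hypothetical and are not part of any Tableau tooling.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of an ensemble with the given size."""
    if ensemble_size < 1:
        raise ValueError("an ensemble needs at least one instance")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many instances can be lost while a majority remains."""
    return ensemble_size - quorum_size(ensemble_size)


# Print size, quorum, and tolerated failures for the allowed ensemble sizes.
for size in (1, 3, 5):
    print(size, quorum_size(size), tolerated_failures(size))
```

Under this rule a single-instance ensemble tolerates no failures, which is one way to see why odd ensemble sizes of three or five are used for high availability.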
Coordination Service - Number of instances
The maximum number of Coordination Service instances you can have in an ensemble on Tableau Server depends on how many Tableau Server nodes you have in your deployment. Configure a Coordination Service ensemble based on these guidelines:
Total number of server nodes | Recommended number of Coordination Service nodes in ensemble (must be 1, 3, or 5) | Notes
1-2 nodes | 1 node | This is the default and requires no changes unless you want to move the Coordination Service off your initial node and onto your additional node.
3-4 nodes | 3 nodes |
5 or more nodes | 5 nodes | Five is the maximum number of Coordination Service instances you can install.
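The guideline above can be written as a small lookup. This is only a sketch of the table; the function name is hypothetical and nothing here calls Tableau Server.

```python
def recommended_ensemble_size(total_nodes: int) -> int:
    """Map a cluster size to the recommended ensemble size (1, 3, or 5)."""
    if total_nodes < 1:
        raise ValueError("a deployment has at least one node")
    if total_nodes <= 2:
        return 1
    if total_nodes <= 4:
        return 3
    return 5  # five is the maximum ensemble size


for nodes in (1, 2, 3, 4, 5, 8):
    print(nodes, recommended_ensemble_size(nodes))
```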
If you reduce the nodes in your cluster from three (or more) to two nodes, a warning tells you Tableau Server can no longer support high availability:
A minimum of three Tableau Server nodes are required for high availability. You can add a third node now,
or continue with only two nodes. Continuing with only two nodes means Tableau Server will not be highly available.
You can always add a third node later. Click OK to continue with 2 nodes, or Cancel to go back and add a node.
If you continue, Tableau Server will run, but you will not have any automatic failover of the repository.
Coordination Service - Check Status
The Coordination Service is not included in the listing when you View Server Process Status. To see the state of the service, you can use the tsm status command:
tsm status -v
Data Engine - Overview
Hyper is Tableau’s in-memory Data Engine technology, optimized for fast data ingestion and analytical query processing on large or complex data sets. Starting with the Tableau 10.5 release, Hyper powers the Data Engine in Tableau Server, Tableau Desktop, Tableau Online, and Tableau Public. The Data Engine is used when creating, refreshing, or querying extracts. It is also used for cross-database joins to support federated data sources with multiple connections.
Data Engine - CPU Usage
Hyper technology leverages the new instruction sets in CPU and is capable of parallelizing and scaling to all the available cores. Hyper technology is designed to scale to many cores efficiently, and also to maximize the use of each single core as much as possible. This means that you can expect to see the CPU being fully used during query processing. Adding more CPU is expected to result in performance improvement.
Modern operating systems such as Microsoft Windows, Apple macOS, and Linux have mechanisms to make sure that even if a CPU is fully used, incoming and other active processes can run simultaneously. In addition, to manage overall resource consumption and to prevent overloading and completely starving other processes running on the machine, the Data Engine monitors itself to stay within the limits set in the Tableau Server Resource Manager (SRM). Tableau Server Resource Manager monitors the resource consumption and notifies Data Engine to reduce the usage when it exceeds the predefined limit.
Data Engine - Memory
Memory usage of the Data Engine depends on the amount of data required to answer the query. The Data Engine will try to run the query in memory first. Working set memory is allocated to store intermediate data structures during query processing. In most cases, systems have enough memory for this type of processing, but if there isn’t enough available memory, or if more than 80% of RAM is utilized, the Data Engine shifts to spooling by temporarily writing to disk. The temporary file is deleted after the query has been answered. Spooling is therefore an indication that more memory may be needed. Memory usage should be monitored and memory upgraded as needed to avoid performance issues caused by spooling.
To manage memory resources on the machine, the maximum memory limit for Data Engine is set by Tableau Server Resource Manager (SRM).
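The spooling rule described above (not enough free memory, or more than 80% of RAM in use) can be sketched as a simple check. This is a study aid only: the function and parameter names are hypothetical, and the real Data Engine decision is internal to Hyper.

```python
SPOOL_THRESHOLD = 0.80  # fraction of RAM in use, taken from the card above


def should_spool(total_ram_gb: float, used_ram_gb: float, working_set_gb: float) -> bool:
    """Return True if a query would spill to disk under the stated rule."""
    if total_ram_gb <= 0:
        raise ValueError("total RAM must be positive")
    free_gb = total_ram_gb - used_ram_gb
    over_threshold = used_ram_gb / total_ram_gb > SPOOL_THRESHOLD
    not_enough_free = working_set_gb > free_gb
    return over_threshold or not_enough_free


print(should_spool(32, 10, 4))   # plenty of headroom
print(should_spool(32, 28, 1))   # more than 80% of RAM already in use
print(should_spool(32, 20, 16))  # working set larger than free memory
```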
Data Engine - Server Configuration
A single instance of Data Engine is automatically installed per node where an instance of File Store, Application Server (VizPortal), VizQLServer, Data Server, or Backgrounder is installed. The Data Engine can scale by itself and uses as much CPU and memory as needed, thus removing the need for multiple instances of the Data Engine. For more information on the server processes, see Tableau Server Processes.
The instance of Data Engine installed on the node where File Store is installed is used for querying data for view requests. The instance of Data Engine installed on the node where backgrounder is installed is used for extract creation and refreshes. This is an important consideration when you are doing performance tuning. For more information, see Performance Tuning Examples.
Data Server, VizQL Server, and the Application Server (VizPortal) all use the local instance of Data Engine to do cross-database joins and create shadow extracts. Shadow extract files are only created when you work with workbooks that are based on non-legacy Excel or text, or statistical files. Tableau creates a shadow extract file in order to load the data more quickly.
In Tableau Server 10.5 one instance of Data Engine is installed automatically when you install backgrounder. The backgrounder process uses the single instance of Data Engine (hyperd.exe) installed on the same node
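As a memory aid for the placement rule above, the sketch below checks whether a node would get a Data Engine instance based on the processes configured on it. The process names are plain strings chosen for illustration; this does not query a real topology.

```python
# Processes that bring an automatic Data Engine instance with them,
# as listed in the card above.
DATA_ENGINE_TRIGGERS = frozenset({
    "File Store",
    "Application Server",
    "VizQL Server",
    "Data Server",
    "Backgrounder",
})


def node_gets_data_engine(processes) -> bool:
    """True if any process on the node triggers a Data Engine install."""
    return not DATA_ENGINE_TRIGGERS.isdisjoint(processes)


print(node_gets_data_engine({"Gateway"}))
print(node_gets_data_engine({"Gateway", "Backgrounder"}))
```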
2019.2 Minimum Specs
Trial: 4 Cores / 8 vCPU & 16GB RAM
Prod: 8 Cores / 16 vCPU & 32GB RAM
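A quick way to drill these numbers is a check that compares a machine against them. The figures are copied from the card above; the function name and the tier labels are hypothetical.

```python
# (minimum physical cores, minimum RAM in GB) per tier, from the card above.
MIN_SPECS = {
    "trial": (4, 16),
    "production": (8, 32),
}


def meets_minimum(physical_cores: int, ram_gb: int, tier: str = "production") -> bool:
    """True if the machine meets the card's minimum for the given tier."""
    min_cores, min_ram_gb = MIN_SPECS[tier]
    return physical_cores >= min_cores and ram_gb >= min_ram_gb


print(meets_minimum(4, 16, "trial"))
print(meets_minimum(4, 16, "production"))
```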
TSM - Connect to remote server
tsm login -s -u
Anon Access
Anonymous access to embedded Tableau views requires that you enable “guest user” for Tableau Server. Guest user also requires that you license Tableau Server according to the number of cores you are running, rather than a named-user (interactor) model
Scale Out
When you scale out Tableau Server, you add computers (or nodes). To create a highly available deployment with failover, you need at least three nodes. For example, you might run most CPU-intensive server processes on two nodes and use the third node for the gateway and coordination controller services.
Task Priority
The priority determines the order in which refresh tasks are run, where 0 is the highest priority and 100 is the lowest priority. The priority is set to 50 by default.
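A minimal sketch of the ordering rule, assuming that tasks with equal priority keep their original order (that tie-break is an assumption, not something the card states). The class and function names are hypothetical.

```python
from dataclasses import dataclass

DEFAULT_PRIORITY = 50  # default from the card above


@dataclass
class RefreshTask:
    name: str
    priority: int = DEFAULT_PRIORITY

    def __post_init__(self):
        if not 0 <= self.priority <= 100:
            raise ValueError("priority must be between 0 and 100")


def run_order(tasks):
    """Lower number runs first; sorted() is stable, so ties keep input order."""
    return sorted(tasks, key=lambda task: task.priority)


tasks = [RefreshTask("weekly"), RefreshTask("urgent", 0), RefreshTask("bulk", 100)]
print([task.name for task in run_order(tasks)])
```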
Execution mode
The execution mode indicates to the Tableau Server backgrounder processes whether to run refreshes in parallel or serially. Schedules that run in parallel use all available backgrounder processes and serial schedules run on only one backgrounder process. However, a schedule can contain one or more refresh tasks, and each task will only use one backgrounder process, whether in parallel or serial mode. This means that a schedule in parallel execution mode will use all available backgrounder processes to run the tasks under it in parallel, but each task will only use one backgrounder process. A serial schedule uses only one backgrounder process to run one task at a time.
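The difference between the two modes comes down to how many backgrounder processes a schedule can occupy at once. The sketch below models only that count; the function name and the mode strings are hypothetical.

```python
def backgrounders_in_use(mode: str, task_count: int, backgrounder_count: int) -> int:
    """Maximum backgrounders one schedule occupies at the same time."""
    if mode not in ("parallel", "serial"):
        raise ValueError("mode must be 'parallel' or 'serial'")
    if task_count < 0 or backgrounder_count < 1:
        raise ValueError("need a non-negative task count and at least one backgrounder")
    if task_count == 0:
        return 0
    if mode == "serial":
        return 1  # one task at a time on one backgrounder
    # Parallel: each task still uses exactly one backgrounder.
    return min(task_count, backgrounder_count)


print(backgrounders_in_use("parallel", 10, 4))
print(backgrounders_in_use("parallel", 2, 4))
print(backgrounders_in_use("serial", 10, 4))
```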
Task Frequency
You can set the frequency to hourly, daily, weekly, or monthly.
When can data refresh be performed
At publish time
UI (Data Window)
Command Line (find example)
Schedule
Schedule Failure Email Goes to?
Email goes to data and workbook owners
Receivers can opt out
Which Service? Stores and distributes files needed by TSM (e.g. certificates, customization files, etc.).
Client File Service (CFS)
Failure Notifications - Site or Server setting?
Site
What is contained in the Refresh Failure emails
Extract name
Location on the server
Time of last successful refresh
Number of consecutive times the refresh has failed
Reason for the failure
Possible solution
How many times will a refresh fail before suspension
After five consecutive failures, the refresh schedule is suspended until you or the data owner takes an action to address the cause of the failure, such as updating database credentials or a path to the original data file
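The word "consecutive" matters here: a success resets the count. The sketch below is a toy model of that behaviour with a hypothetical class name; it is not how Tableau Server stores schedule state.

```python
class RefreshSchedule:
    """Toy model: suspend after N consecutive failures (default 5)."""

    def __init__(self, threshold: int = 5):
        self.threshold = threshold
        self.consecutive_failures = 0
        self.suspended = False

    def record_run(self, succeeded: bool) -> None:
        if self.suspended:
            return  # a suspended schedule does not run
        if succeeded:
            self.consecutive_failures = 0
            return
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.threshold:
            self.suspended = True

    def resume(self) -> None:
        """Called once the owner has fixed the cause of the failure."""
        self.consecutive_failures = 0
        self.suspended = False


schedule = RefreshSchedule()
for outcome in (False, False, False, False, True, False, False, False, False):
    schedule.record_run(outcome)
print(schedule.suspended)  # four failures, a success, then four more failures
schedule.record_run(False)
print(schedule.suspended)  # fifth consecutive failure
```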
How do you change the number of times will a refresh fail before suspension?
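Use tsm to set backgrounder.failure_threshold_for_run_prevention via the command line: tsm configuration set -k backgrounder.failure_threshold_for_run_prevention -v <number>. Note that wgserver.alerts.observed_days is a different setting: it controls how many days back the failure email looks for the last successful refresh.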
What are the different Task types?
Refreshing extracts
Running flows (Prep)
Delivering subscriptions
Internal system jobs
What limitations impact scheduled tasks?
The number of concurrent tasks is limited to the number of backgrounder processes you have configured for Tableau Server.
Separate refreshes for the same extract cannot run at the same time.
Tasks associated with a schedule that is set to run serially run one at a time.
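The first two limitations can be pictured as a filter over the pending queue. This sketch assumes tasks are simple (task_id, extract_name) pairs and ignores priority and serial schedules; all names are hypothetical.

```python
def startable_tasks(pending, running, backgrounder_count):
    """Pick pending tasks that may start now.

    Enforces two limits from the card above: at most one task per
    backgrounder, and no two refreshes of the same extract at once.
    """
    free_slots = max(backgrounder_count - len(running), 0)
    busy_extracts = {extract for _, extract in running}
    started = []
    for task_id, extract in pending:
        if len(started) == free_slots:
            break  # every backgrounder is occupied
        if extract in busy_extracts:
            continue  # this extract is already being refreshed
        started.append((task_id, extract))
        busy_extracts.add(extract)
    return started


pending = [("t1", "sales"), ("t2", "sales"), ("t3", "hr")]
running = [("t0", "ops")]
print(startable_tasks(pending, running, backgrounder_count=3))
```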
Subscriptions are at a site or server level
Site
How do subscriptions get suspended?
After 5 failures, via the UI or API
How do you change the number of failures before a subscription suspends?
tsm configuration set -k backgrounder.subscription_failure_threshold_for_run_prevention -v 14 (14 being the new value).
What do you see on the Job view in the Web UI?
For each job generated, there is a Job ID , the status of that Job, the priority, the type of task that the job was generated from, as well as the current run time, if the job is in-progress, and current queue time, if queued. The Job ID can be useful when viewing jobs on Admin views and can also be used to query the Workgroups Database.
What do you see when you click on the Job ID on the Job view in the Web UI?
When you click on the Job ID, you will see more detailed information about the job, such as the Job LUID, the project name, the schedule , the content name, content owner, and the site name. The site name is displayed if you navigate to the Jobs page using the Manage All Sites menu.
How can you cancel a running job in 2019.2?
In 2019.2 and earlier, jobs can only be cancelled via the REST API. From 2019.3 you can also cancel jobs using the UI.
What are the different Job Statuses
Completed, In Progress, In Progress (late), Pending, Pending (late), Cancelled, Failed
What should you consider if you have a slow extract refresh
Isolate the time period it is running in
Reduce the size of the extract
Configure incremental refresh
Reduce the refresh frequency
Consider the execution mode (parallel)
Increase the number of backgrounder processes
Isolate the affected backgrounder processes on a distributed cluster
What types of user login security are supported?
Either local auth or 3rd party such as Kerberos, SSPI, SAML, or OpenID.
How can you add users when configured for local security?
User identities can be added to Tableau Server in the server UI, using tabcmd Commands, or using the REST API.
What are the different types of external security that can be configured for TS?
NTLM & SSPI - do not use SSPI for SAML. See tsm authentication sspi
Kerberos (Windows)
SAML
OpenID
Mutual SSL (install client certificate)
Trusted Authentication - supply ticket to whitelisted Web Server
LDAP
What are proxy servers and how can TS leverage them?
To enable internet access to Tableau Server, do not place it directly on the internet or in a DMZ. Use proxy servers instead.
Forward proxy servers mediate traffic from inside the network to targets on the internet.
Reverse proxy servers mediate traffic from the internet to targets inside the network.
What syntax do you use with TSM to set a configuration property
tsm configuration set --key <config.key> --value <config_value> (short form: -k and -v)
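For scripted changes, the command can be assembled as an argument list rather than a hand-typed string. This sketch only builds and prints the commands, assuming the -k and -v flags and a follow-up tsm pending-changes apply; it does not run tsm, and the helper name is hypothetical.

```python
import shlex


def tsm_set_commands(key: str, value) -> list:
    """Build the argv lists for setting one key and applying the change."""
    if not key:
        raise ValueError("a configuration key is required")
    set_cmd = ["tsm", "configuration", "set", "-k", key, "-v", str(value)]
    apply_cmd = ["tsm", "pending-changes", "apply"]
    return [set_cmd, apply_cmd]


# Example key taken from the subscription-threshold card in these notes.
commands = tsm_set_commands(
    "backgrounder.subscription_failure_threshold_for_run_prevention", 14
)
for cmd in commands:
    print(shlex.join(cmd))
```

Keeping the commands as lists makes it straightforward to pass them to subprocess.run later without shell quoting problems.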
What syntax do you use to retrieve a configuration value using TSM
tsm configuration get -k
Can you use tsm to connect to a node remotely?
Yes
How many groups of tsm commands can you name?
tsm authentication, tsm configuration, tsm configuration set Options, tsm customize, tsm data-access, tsm initialize, tsm jobs, tsm licenses, tsm login, tsm logout, tsm maintenance, tsm pending-changes, tsm register, tsm reset, tsm restart, tsm security, tsm settings, tsm sites, tsm start, tsm status, tsm stop, tsm topology, tsm user-identity-store, tsm version, tsm File Paths
What are the different types of maintenance calls you can make using tsm eg backup/cleanup etc?
tsm maintenance backup
tsm maintenance cleanup
tsm maintenance metadata-services (enable, disable, get-status)
tsm maintenance reindex-search
tsm maintenance restore
tsm maintenance send-logs
tsm maintenance validate-resources
tsm maintenance ziplogs
How do you see what changes are pending on the server using tsm?
tsm pending-changes list
What types of operation can you perform using tsm topology commands?
You can use the tsm topology commands to prepare File Store nodes for safe removal or to put them back into read-write mode. You can also initiate a repository failover, get a list of nodes or ports, get the bootstrap configuration file required to add additional nodes to your cluster, remove nodes, and configure external repository.
Which services require access to the licence server
Application Server Backgrounder Data Engine Data Server Tableau Prep Conductor VizQL Server
When creating a new project, how are the default permissions configured?
They are inherited from the parent project, or from the Default project if the new project is at the top level.
If a project grants permission for a capability but its parent project denies it, which project takes precedence?
If there are nested projects, permissions set at the child level take precedence over permissions set at the parent level
If there is a change to permissions at the project level, will it be applied to all of the content within that project?
No. Changes to permissions at the project level only impact new content, they are not enforced for existing content
Can permissions on content be different to the project permissions?
Yes. If there are permissions set on content (workbook, data source, or flow) during or after publication, these take precedence over rules set at the project level.
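The precedence stated in this card can be reduced to a one-line rule: a rule set on the content wins, otherwise the project rule applies. The sketch below models only that single point and ignores everything else Tableau evaluates; the function name and the "allow"/"deny" strings are hypothetical.

```python
from typing import Optional


def effective_rule(project_rule: Optional[str], content_rule: Optional[str]) -> Optional[str]:
    """Content-level rule takes precedence; fall back to the project rule."""
    return content_rule if content_rule is not None else project_rule


print(effective_rule("allow", "deny"))   # content rule wins
print(effective_rule("allow", None))     # no content rule, project rule applies
```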
What happens to the permissions of a workbook if a user who is not the owner of the workbook saves over the top of it?
They become the owner, with all the permissions that entails. The original owner’s access to the content is then determined by their permissions as a user rather than the owner.
Which two processes run only on the initial node and cannot be moved to any other node except in a failure situation
Licence Service (Manager) and TSM Controller (Administration Controller)
Which two processes are installed on the initial node but can be added or moved to other nodes?
Client File Service (CFS) and the Coordination Service.