Basic Administration Flashcards
Name some security features
- SSL
- host-based authentication
- object-level permissions
- logging
- groups & roles
Name some recovery and availability features
- Streaming replication (async and sync), aka hot standby
- Cascading streaming replication
- pg_basebackup, hot backup
- PITR (point in time recovery)
Name some cool advanced features
- triggers and functions
- many procedural languages (PL/pgSQL, Perl, Tcl, PHP, Java, Python, etc)
- custom procedural languages
- upgrade using pg_upgrade
- unlogged tables
- materialized views
What does MVCC stand for
Multi-Version Concurrency Control
List some characteristics of MVCC
- Maintains data consistency.
- Each transaction sees a snapshot of the database as it was at the beginning of the transaction.
- Writers block writers; nothing else blocks.
- Concurrent transactions are isolated.
- Prevents transactions from seeing inconsistent data.
What does WAL stand for?
Write-Ahead Logs (or Write-Ahead Logging)
How does WAL work?
- Records each data change before it actually takes place.
- Data is not considered ‘safe’ until the log is written to disk.
- Provides recovery in case of crash or failure.
- As DML is executed, changes are recorded in memory in WAL buffers (as well as the shared data buffers).
- When COMMIT is executed, the WAL buffers are flushed to the WAL segment files
- Later the dirty data buffers are written to disk
What is ACID?
Atomicity, Consistency, Isolation, Durability
List some commonly used pg_dump switches
- -a  Data only
- -s  Definitions (“schema”) only
- -n NAME  Dump named schema (“namespace”)
- -t NAME  Dump named table
- -Fp  Plain format
- -Fc  Custom format
- -Fd  Directory format
- -Ft  Tar format
- -f FILE  Dump to named file
- -o  Dump OIDs
- -j JOBS  Parallelize directory-format dumps
- -v  Verbose
Name four general methods for backup
- SQL dump (pg_dump, pg_dumpall)
- Filesystem dump
- Continuous archiving
- Streaming replication
List two methods (command line examples) for dealing with very large databases when doing SQL backups
- pg_dump … | gzip -c > file.sql.gz
- pg_dump … | split -b 1m - filename
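The compress-and-split mechanics can be demonstrated without a live cluster; here a generated file stands in for the pg_dump output stream (on the restore side, psql would read the reassembled stream the same way):

```shell
# Stand-in for pg_dump output: a dummy SQL script (no live cluster needed).
seq 1 1000 | sed 's/.*/INSERT INTO t VALUES (&);/' > dump.sql

# Method 1: compress the dump stream.
gzip -c dump.sql > dump.sql.gz
# Restore side would be: gunzip -c dump.sql.gz | psql dbname
gunzip -c dump.sql.gz > restored1.sql

# Method 2: split the dump into fixed-size chunks (1MB in the card; 8KB here).
split -b 8k dump.sql chunk_
# Restore side would be: cat chunk_* | psql dbname
cat chunk_* > restored2.sql

cmp dump.sql restored1.sql && cmp dump.sql restored2.sql && echo OK
```

Note that `cat chunk_*` works because split's generated suffixes (aa, ab, …) sort in creation order.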
Describe two methods for restoring SQL dumps done with pg_dump
- psql < TEXT_FORMAT_DUMP_FILE (or psql -f TEXT_FORMAT_DUMP_FILE)
- pg_restore NON_TEXT_FORMAT_DUMP_FILE
Using command defaults for pg_dump, when you restore a backup into a new cluster, is there anything you need to do first?
Yes, create the target database. By default, pg_dump does not include the command to create the target database.
What does the --create (-C) switch to pg_dump do?
pg_dump --create
inserts a command to create and reconnect to the target database before the commands to populate that database.
What does the --clean switch to pg_dump do?
pg_dump --clean
drops the target database prior to recreating it, when using the --create (-C) switch.
List important pg_restore options
pg_restore
- -d DB  Connect to the specified database, and, if -C (--create) is not specified, restore into this database also
- -C  Create the database specified in the backup and restore into it
- -a  Restore data only (assumes the SQL objects like tables have already been created)
- -s  Restore object definitions (DDL) only, not data
- -t NAME  Restore named table
- -n NAME  Restore named schema/namespace
- -v  Verbose
What command can dump an entire cluster as SQL?
pg_dumpall
What does pg_dumpall do?
pg_dumpall dumps all databases in the cluster, and also global objects such as roles and tablespaces.
List commonly used pg_dumpall switches
pg_dumpall
- -a  Dump data only
- -s  Dump definitions only
- -c  Clean (drop) objects before recreating
- -g  Dump global objects (roles, etc) only, not databases
- -r  Dump roles only, not databases or other global objects
- -O  Skip ownership commands
- -x  Skip privileges
- --disable-triggers  Disable triggers during data-only restore
- -v  Verbose
How many times does pg_dumpall ask for authentication credentials?
At least once per database in the cluster, plus once for the initial connection that collects global objects.
Why is it a good idea to use a pgpass file when using pg_dumpall?
A pgpass file is a good idea for pg_dumpall because the user’s credentials are requested once per database in the cluster.
List some characteristics of SQL dumps as a backup mechanism
- Generate a text file containing SQL commands (or a binary representation thereof)
- pg_dump is the relevant command (or pg_dumpall for the entire cluster)
- pg_dump does not block readers or writers
- pg_dump does not operate with special permissions
- pg_dump dumps are internally consistent and are a snapshot of the database at the time pg_dump begins running
Next to SQL dump, what is the next simplest backup approach?
Simple file system level backup is an alternative to SQL dumps, provided that either 1) the database is shut down during the backup, or 2) the native snapshot feature of the filesystem is used (if available).
Why mightn’t filesystem snapshots work without database downtime, even if your filesystem supports this feature?
Filesystem snapshots might not give a consistent backup if the database is spread across multiple filesystems, since the snapshots cannot all be taken at exactly the same instant; in that case you would have to stop the database while taking the snapshots.
Describe the process of restoring a backup made via a filesystem snapshot
A filesystem snapshot backup can be restored by 1) copying the backup into place (including the WAL logs) and 2) starting the database, which will go into crash recovery mode and replay WAL logs.
How can you take your filesystem snapshot backup so that the restore is as fast as possible?
To minimize restoration time of a filesystem snapshot, force a CHECKPOINT right before the snapshot – this will minimize the amount of WAL logs that have to be replayed.
Can you use tar to back up a database? If so, describe how.
Yes, you can back up a PostgreSQL database by 1) shutting down the database; 2) using tar to copy all of the files in the database cluster; and 3) restarting the database.
How would you use rsync to backup a database?
You can back up a database using rsync as follows: 1) With the database running, use rsync to copy all the files. 2) Shut down the database. 3) Use rsync again to copy the files; this second rsync will create a consistent image of the database and will be quite fast, minimizing downtime.
After SQL dumps and simple filesystem backups, what is a third, more complicated mechanism for backups?
A third backup approach is continuous archiving of WAL logs (combined with a possibly inconsistent filesystem backup, such as produced by tar, even with the database running).
What is another name for continuous archiving as used for backup purposes?
Online backup
What is the primary purpose of WAL logs?
The primary purpose of WAL logs is to allow database commits to happen quickly (without the data being fully written to the final data pages) but to prevent loss of information in case of a crash – the WAL logs can be “played” when the database starts up after a crash, thus restoring the physical database to match its logical state at the time of the crash.
What is a secondary use to which WAL logs can be put?
Online backup – first, get WAL log archiving started, in which full and switched WAL logs are copied to backup storage before being recycled; then take a file system backup while the database is running (tar, rsync, etc). If the database crashes, or if you want to revert the database to a specified point in time (PIT), you can copy the original full backup into place along with the archived WAL files, and start the database.
How do you enable continuous archiving of WAL files?
- wal_level must be set to archive or hot_standby (as opposed to, say, minimal)
- archive_mode must be set to on (default is off)
- archive_command must be defined: a command to copy the WAL files somewhere
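A minimal postgresql.conf fragment putting the three settings together (the archive directory path is illustrative):

```
# postgresql.conf -- minimal continuous-archiving setup
wal_level = archive           # or hot_standby; 'minimal' is not enough
archive_mode = on             # requires a server restart
archive_command = 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'
```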
How can you achieve point in time recovery (PITR) with continuous archiving?
PITR can be accomplished by having the database replay the WAL files only up to a specified file, not all the way to the last file.
How can you use continuous archiving to achieve warm standby?
Have a second server loaded with the base backup file (filesystem-level backup), and feed the archived WAL files to this second server. At any time, the WAL files can be replayed on this second server so that the second server can take over from the first with a nearly identical state.
Why can’t pg_dump and pg_dumpall be used with online backup via continuous archiving?
pg_dump and pg_dumpall produce logical, not physical, backups of the database. They don't capture the file-level (physical) state of the cluster that archived WAL files must be replayed against; WAL replay requires a physical base backup.
What is the default size of WAL segment files?
16MB – this can be changed by recompiling PostgreSQL
How are WAL segment files named?
Numerically, according to the position in the abstract WAL sequence. I say abstract sequence because there is only a small number of physical files, which are recycled by renaming when a particular WAL file is no longer needed (it has been checkpointed; i.e. the changes it encodes have been reflected to the actual data pages).
What is the simplest useful archive_command value on Unix systems?
archive_command = 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'
What does %f mean in the archive_command?
%f is the base file name of the WAL file
What does %p mean in the archive_command?
%p is the full path name of the WAL file
How do you specify a literal percent sign in the archive_command?
%%
Describe the return value/exit status of the archive_command
0 if the file could be copied successfully; otherwise, non-zero.
What happens if the archive_command starts returning non-zero for every file (perhaps because of network error, unmounted fs, etc)?
The pg_xlog directory will continue to fill up with WAL files. If the containing filesystem fills up, PostgreSQL will do a “panic” shutdown – no committed transactions will be lost, however.
Does continuous archiving back up changes made to postgresql.conf, pg_hba.conf, etc?
No, you must have another approach to back up changes to the configuration files in the data directory.
If you exclude the PostgreSQL data directory from the server’s normal backup procedures, how can the config files be continuously backed up (since continuous archiving does not do so)?
Start PostgreSQL with -D config_dir, where config_dir is a directory outside of the data directory, some place where the normal operating system backup procedures will back the directory up. Then, in postgresql.conf within this directory, use the data_directory parameter to point to the actual data directory.
How could you temporarily turn off WAL archiving without stopping the server?
Set archive_command to the empty string and reload the server – WAL files will start to accumulate in the pg_xlog directory, though.
What is pg_basebackup?
pg_basebackup is a command for taking a base backup of the live database, to which you would apply archived WAL files in order to recover from a disaster.
Describe pg_receivexlog
- streams transaction logs from a running cluster
- uses streaming replication protocol
- these files can be used for PITR
- logs streamed in real time
- can be used instead of archive command
- example: pg_receivexlog -h localhost -D /usr/local/pgsql/archive
Describe how the archive_command `cp -i %p /mnt/server/archivedir/%f </dev/null` works
Uses the Unix file copy (cp) command to copy the source WAL file (%p - path) to the desired destination path (/mnt/server/archivedir/%f). The twist is that the interactive (-i) switch is used, presumably to prevent an already existing file of the same name from being overwritten. “But wait”, you say, “how can this command be interactive? There’s no human involved!” That’s where the redirection from /dev/null comes in – if a file would be overwritten, the prompt is written, and the empty response causes an error (status 1), which is what we would want in this scenario.
Describe the “low level API” for base backup that is an alternative to using the pg_basebackup command
- Connect with psql and issue the command "select pg_start_backup('backup label');"
- Back up the data directory using tar or rsync, etc.
- Now issue the SQL command "select pg_stop_backup();"
[Note: no label is required for the stop; it’s assumed (or enforced) that there is only one backup happening at a time.]
What CLI command can be used to take a base backup of a live cluster?
pg_basebackup
Can a backup taken by pg_basebackup be used for PITR?
Yes
Can a backup taken by pg_basebackup be used for streaming replication?
Yes
What does pg_basebackup do, exactly?
It uses the low-level backup API (pg_start_backup('label') and pg_stop_backup()) wrapped around some binary copying mechanism, i.e. it automatically puts the database in and out of backup mode.
List important pg_basebackup switches/options
- -D DIR  destination for backup
- -F p|t  format: plain or tar
- -X  include WAL files created during the backup, so the backup is fully usable
- -z  compress with gzip
- -Z LEVEL  compression level
- -P  enable progress reporting
- -h  host on which cluster is running
- -p  cluster port
Describe what this command is doing: pg_basebackup -h localhost -D /usr/local/pgsql/backup
pg_basebackup is being used to take a base backup of the cluster running on localhost; the files will be written to /usr/local/pgsql/backup/.
Describe the configuration steps required to use pg_basebackup
- Modify pg_hba.conf to add a replication connection, e.g.: "host replication postgres IP_ADDR/32 trust"
- archive_command = 'cp -i %p /dest/dir/%f'
- archive_mode = on  # requires restart
- max_wal_senders = 3
- wal_keep_segments = NUM
- wal_level = archive
Describe the steps to performing (Point-In-Time) Recovery
- Stop the server (if it is running).
- If you have enough space, keep a copy of the data directory and the transaction logs.
- Remove all directories and files from the cluster data directory.
- Restore the database files from the base backup (file system backup).
- Verify the ownership of restored backup directories (must not be root).
- Remove any files in pg_xlog/.
- If you have any unarchived WAL segment files recovered from the crashed cluster, copy them into pg_xlog/.
- Create a recovery command file recovery.conf in the cluster data directory.
- Start the server.
- Upon completion of the recovery process, the server will rename recovery.conf to recovery.done.
List the settings (with types) in the recovery.conf file used for point-in-time recovery
restore_command (string)
  # unix:    restore_command = 'cp /archive/dir/%f "%p"'
  # windows: restore_command = 'copy c:\\archive\\dir\\"%f" "%p"'
recovery_target_name (string)
recovery_target_time (timestamp)
recovery_target_xid (string)
recovery_target_inclusive (boolean)
recovery_target_timeline (string)
pause_at_recovery_target (boolean)
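A minimal recovery.conf sketch for recovering up to a specific moment (the timestamp and archive path are illustrative):

```
# recovery.conf -- placed in the cluster data directory before startup
restore_command = 'cp /archive/dir/%f "%p"'
recovery_target_time = '2014-06-01 12:00:00'
recovery_target_inclusive = true
```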
Why might one choose to set archive_timeout to something other than the default of 0?
When doing WAL archiving, remember that only full (16MB) WAL files are shipped, so if the transaction rate and volume are low, you could be exposed to losing data in case of catastrophe. The archive_timeout forces a WAL log to be shipped after the specified number of seconds has passed, regardless of whether it is full or not. If the value is too low (but not zero), it could lead to WAL bloat in the archive.
Describe what the pg_xlogdump contrib module does and how to use it
- pg_xlogdump displays the WAL in human-readable format
- can give wrong results when the server is running
- Syntax: pg_xlogdump [option ...] [startseg [endseg]]
What is the name of the contrib module that can display the WAL in a human-readable format
pg_xlogdump
What are the two methods for upgrading PostgreSQL?
- Old-school: pg_dump and then restore into the new cluster
- New-school: pg_upgrade
List important features of pg_upgrade
- Helps upgrade between major releases of PostgreSQL (that would ordinarily require dump and restore)
- Supports upgrading PG 8.3.X or later to latest version
- Verifies old and new clusters are binary compatible
- -c (--check) option checks the clusters for compatibility without changing anything
- Can do in-place or side-by-side upgrade
- Side-by-side upgrade requires double storage
- Can be done with parallel jobs in PG 9.3+
List important pg_upgrade options/switches
- -b old_bindir
- -B new_bindir
- -d old_datadir
- -D new_datadir
- -p old_port
- -P new_port
- -c  check clusters only (no change)
- -j  number of jobs
- -k  use hard links instead of copying files
- -r  retain SQL and log files even after successful completion
- -u user_name
What PG architecture component creates execution plans?
Planner
What PG architecture component chooses the most efficient plan for a query?
Optimizer
List some execution plan components
- row estimates (cardinality)
- access method: sequential or index
- join method: hash, nested loop, etc
- join type and order
- sorting and aggregates
Describe EXPLAIN usage
EXPLAIN (options) statement
where option can be one of:
ANALYZE, VERBOSE, COSTS, BUFFERS, TIMING, FORMAT {TEXT|XML|JSON|YAML}
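For example, against the emp table used elsewhere in these cards (the grouping column is illustrative):

```sql
-- Execute the query, reporting actual row counts, timings, and buffer usage:
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT deptno, count(*) FROM emp GROUP BY deptno;
```

ANALYZE actually runs the statement, so wrap data-modifying statements in a transaction you roll back.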
In the explain output:
seq scan on emp (cost=X..Y rows=R width=W) what are X, Y, R, and W?
X=estimated startup cost
Y=estimated total cost
R=estimated rows output
W=estimated average width in bytes of rows output
Plans are made and costed using table statistics - what are two primary table statistics?
Table statistics in pg_class include reltuples, the estimated number of rows in a table, and relpages, the estimated number of disk blocks taken up by a table or index.
Are table statistics updated in real time?
No, table statistics are updated using ANALYZE or VACUUM ANALYZE
What tables are the table statistics stored in?
pg_class and pg_statistic store the table statistics; pg_stats is a view on pg_statistic that is more commonly used.
What gets analyzed when you run ‘ANALYZE’ (with no options)?
The tables (and their indexes) in the current database that the current user has permission to analyze (in practice, those the user owns).
Can you ANALYZE just a single table or column
Yes - “ANALYZE some_table” or “ANALYZE some_table(some_col)”
Does autovacuum run ANALYZE
Yes, autovacuum runs ANALYZE by default.
Does a DELETE immediately remove a row?
No, DELETE merely marks a row as deleted; the row space can be reused or removed by VACUUM .
What are four useful functions of VACUUM?
- Can recover or reuse space occupied by obsolete rows (from deletes and updates)
- Can (via ANALYZE option) update data statistics
- Updates the visibility map, which speeds up index-only scans
- Protects against loss of very old data due to transaction ID wraparound
What is the visibility map useful for?
It makes index-only scans more efficient (and more likely to be chosen by the query planner) - the full heap tuple doesn’t need to be read in order to determine whether an index entry is valid or not.
What does plain VACUUM do (as opposed to VACUUM FULL)?
- removes dead rows and marks the space available for future reuse.
- Does not shrink the file except for dead rows at the end of the table
What does VACUUM FULL do that plain vacuum does not do?
- Uses a more aggressive algorithm: it rewrites the entire table into a new file with no dead space
- Takes much more time
- Temporarily requires extra disk space roughly equal to the size of the table, since the old and new copies coexist until the rewrite completes
Describe the problem of transaction ID wraparound failure?
Each transaction has a 32-bit ID (XID) that is allocated serially, and each row version is marked with the XID of the transaction that created it. A transaction is not allowed to see rows with XIDs that are larger than its own, because these are “in the future”. But because there is a limit to how large XIDs can be (2^32, approx 4 billion), at some point XIDs wrap around to 0, at which point a row with a new, low XID is seen as older than higher XIDs, when actually it is newer. Vacuuming prevents this by “freezing” old rows: they are marked as frozen (historically by setting their XID to the special FrozenXID, in newer versions by setting a flag bit), which means “committed so long ago as to be visible to all transactions”, so they are exempt from XID comparisons and survive wraparound.
How often must a table be vacuumed to avoid transaction ID wraparound failure?
At least once every two billion transactions.
What is the visibility map?
The visibility map for a table keeps track of which pages contain only tuples that are visible to all current and future transactions (until the page is modified, anyway). This has two benefits: 1) it helps vacuum avoid looking at pages unnecessarily, and 2) it allows index scans to avoid grabbing the heap tuple merely to determine if the current transaction is allowed to see the tuple for an index entry. The VM is tiny compared to the table/heap, so for very large tables, the cost savings can be significant. This allows “index-only scans” to be used.
What is the name of the file that contains the visibility map for a table?
_vm
How does the visibility map get updated?
VACUUM updates it.
What can you say about the size of the visibility map of a table?
The visibility map is very small, so it is readily cached, and many index entries can be checked for visibility with very little memory or disk I/O.
When do you need to run REINDEX?
REINDEX should be run when:
- an index is corrupted (rare)
- an index is bloated (many nearly-empty pages)
- a storage parameter for the index (like fillfactor) has been changed
- an index build with CONCURRENTLY failed, leaving an invalid index
What is the syntax for REINDEX?
REINDEX {INDEX|TABLE|DATABASE|SYSTEM} name [FORCE]
Where is meta-information about tables and other objects stored?
The pg_catalog schema is where system information about a database is stored.
What sorts of objects are stored in pg_catalog?
The following objects are stored in the pg_catalog schema:
- System tables (like pg_class)
- System functions (e.g. pg_database_size)
- System views (pg_stat_activity)
Is pg_catalog always part of the search path?
Yes, pg_catalog is effectively part of the search_path.
What psql command lists system information tables and views?
The \dS psql command lists tables and views from the pg_catalog schema (in addition to other tables and schemas in the search path).
What are the system catalog tables holding tables, constraints, indexes, triggers, and views?
pg_tables, pg_constraint (no s), pg_indexes, pg_trigger (no s), and pg_views - (thanks for the consistency, guys!) These are all views except for pg_constraint.
What is the system function for showing the database to which you are connected?
current_database() is the system function for showing the database to which you are connected
What is the system function for showing the first schema in the search path?
current_schema() is the system function for showing the first schema in the search path.
What are the system functions for showing the client IP address and port and server address and port for the current connection?
inet_client_addr, inet_client_port, inet_server_addr, inet_server_port
What is the system function for telling how long the cluster has been up?
pg_postmaster_start_time
What are the columns in the \dS output?
schema, relation name, relation type, and owner
What extra columns are shown by \dS+?
size and description (the latter is empty for system objects)
What is the difference between the session_user and current_user/user system functions?
session_user() is analogous to the UNIX “real user”, current_user() to the “effective user”
What system function returns the array of schemas in the search path?
current_schemas(boolean)
What does the boolean parameter to the current_schemas() function do?
If true, the schemas implicitly in the search path (usually pg_catalog) are also included. If false, just the schemas from the normal explicit search_path are included.
What is the system function for returning the value of a configuration parameter/variable?
current_setting(setting_name_str)
What is the system function for modifying a configuration variable?
set_config()
What SQL command is the current_setting() function equivalent to?
SHOW
What SQL command is the set_config() function equivalent to?
SET
What system function cancels the current query of a backend process?
pg_cancel_backend(pid) cancels the current query in a backend process - the argument is pid
What system function terminates/kills a backend process?
pg_terminate_backend(pid)
What is the usage for the system function set_config()?
set_config(setting, new_value, is_local_to_transaction) (if not transaction-specific, then applies to the session).
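A sketch of current_setting() and set_config() working as a pair (the parameter and values are illustrative):

```sql
SELECT current_setting('work_mem');            -- show the current value
SELECT set_config('work_mem', '64MB', false);  -- false = session-wide, not transaction-local
SELECT current_setting('work_mem');            -- now reports '64MB'
```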
What is the system function for reloading the PostgreSQL config files?
pg_reload_conf() reloads the configuration files
What system function rotates the server’s log file?
pg_rotate_logfile
What two functions are used to start and stop online backup?
pg_start_backup(label, [fast]) and pg_stop_backup()
What is the function for determining the disk space used by a tablespace?
pg_tablespace_size(name_or_oid)
What is the function for determining the disk space used by a database?
pg_database_size(name_or_oid)
What is the function for determining the disk space used by a relation, not including indexes and toasted data?
pg_relation_size(name_or_oid)
What is the function for determining a relation’s TOTAL size, including indexes and TOAST data?
pg_total_relation_size(name_or_oid)
Is the size reported in the size column in the \d+ output the same as pg_relation_size() or pg_total_relation_size()?
Neither one!
What system function reports the number of bytes required to store the given operand?
pg_column_size(something)
List some file operation functions
pg_ls_dir, pg_read_file, pg_stat_file
What system function can be used to return filenames in a specified directory?
pg_ls_dir(dir_relative_to_data_dir) - superuser only. E.g.: select pg_ls_dir('.') lists all files in the data directory
What system function can read a file
pg_read_file(path) reads a file, one line per row
What psql command can you use to determine a function’s usage?
\df func_name
What system view can you use to see details of open connections and running transactions?
pg_stat_activity
What system view can you use to see the list of current_locks being held?
pg_locks
What system view can you use to see details on databases?
pg_stat_database
What system view can you use to see usage statistics for tables?
pg_stat_user_tables
What system view can you use to see usage statistics for indexes?
pg_stat_user_indexes
What system view can you use to see usage statistics for functions?
pg_stat_user_functions
Write a query to find the list of schemas currently in your search path
-- Show all schemas explicitly in search path:
select current_schemas(false);
You need to determine the names and definitions of all of the views in your schema. Create a report that retrieves view information: the view name and definition text.
select viewname, definition from pg_views where schemaname = 'edbstore';
Write a statement in psql to reload the config file.
select pg_reload_conf();
Create a report of all the users who are currently connected. The report must display the total session time of each connected user.
select usename as user, now()-backend_start as session_time from pg_stat_activity;
You find a user who has been connected to the server for a long time and decide to gracefully kill that connection. Write a statement to perform this task.
select pg_terminate_backend(pid) from pg_stat_activity where usename = 'blah';
Write a query to display name and size of all the databases in your cluster. Size must be displayed using a meaningful unit.
select datname, pg_size_pretty(pg_database_size(oid)) from pg_database order by pg_database_size(oid);
What does the COPY command do?
COPY moves data between tables and file-system files on the database server
What is the SQL command for importing data from a file or program execution into a table, and give basic syntax?
COPY table_name FROM 'filename'
-- or:
COPY table_name FROM PROGRAM 'command'
What does COPY … FROM do?
Copies data from a file into a table
Can COPY operate on selected columns in a table?
Yes
What is the SQL command for exporting data from a table or a query to a file (give basic syntax)?
COPY table_name TO 'filename'
-- or:
COPY table_name TO PROGRAM 'command'
What is the detailed syntax of COPY TO?
COPY { table_name [ ( column_name [, …] ) ] | ( query ) }
TO { 'filename' | PROGRAM 'command' | STDOUT }
[ [ WITH ] ( option [, …] ) ]
where option can be one of:
FORMAT format_name
OIDS [ boolean ]
FREEZE [ boolean ]
DELIMITER 'delimiter_character'
NULL 'null_string'
HEADER [ boolean ]
QUOTE 'quote_character'
ESCAPE 'escape_character'
FORCE_QUOTE { ( column_name [, …] ) | * }
FORCE_NOT_NULL ( column_name [, …] )
ENCODING 'encoding_name'
What is the SQL command for exporting the contents of the emp
table to the file /tmp/emp.csv
on the database server, using a CSV format, with a header included?
COPY emp TO '/tmp/emp.csv' WITH (FORMAT CSV, HEADER);
-- Don't forget the parentheses around options!
How would you use the COPY command on a remote host using psql on that remote host to load data from a file on the local host (pretend that psql isn’t on the local host, or you don’t have permission to connect from the local host). Say the file is called emp.csv, the remote host is remote.host, the DB user and database are edbstore/edbstore, and the target table is emp.
cat emp.csv | ssh remote.host "psql -U edbstore edbstore -c 'copy emp from stdin;'"
What does COPY FREEZE do? Under what circumstances can you use it? What are the caveats? What is the syntax?
COPY tablename FROM 'filename' WITH (FREEZE)
will freeze the loaded rows. It can only be used if the target table was previously created or truncated in the same transaction. This prevents VACUUM from having to do this freezing at some point in the future. The caveat is that the rows will be visible to all other transactions as soon as they are loaded (before the end of the enclosing transaction), which is a violation of MVCC.
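A sketch of the required pattern, truncating (or creating) the target table inside the same transaction as the load (the table and file names are illustrative):

```sql
BEGIN;
TRUNCATE stage_emp;   -- or: CREATE TABLE stage_emp (...) in this transaction
COPY stage_emp FROM '/tmp/emp.csv' WITH (FORMAT CSV, FREEZE);
COMMIT;
```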
What SQL command would you use (as the superuser on the database server) to dump the contents of the emp table to a csv file, with column headers and a pipe delimiter?
COPY emp TO '/tmp/emp.csv' WITH (FORMAT CSV, HEADER, DELIMITER '|')
What SQL command would you use to create a table copyemp
with the same structure as the emp
table?
CREATE TABLE copyemp (LIKE emp);
Name some reliability features of PostgreSQL
- ACID-compliant
- Supports transactions
- Supports savepoints
- Uses Write-Ahead Logging
Name some features of PostgreSQL that increase scalability
- MVCC (connection scalability)
- Table partitioning (size)
- Tablespaces (size)
What is the maximum database size in PostgreSQL?
unlimited
What is the maximum table size?
32 TB
What is the maximum row size?
1.6 TB
What is the maximum field size?
1 GB
What is the maximum rows per table?
Unlimited
What is the maximum columns per table?
250-1600, depending on column types
What is the maximum indexes per table?
Unlimited
What is the PostgreSQL term for a table or index?
Relation
What is a “relation” in commercial database terminology?
A table or index
What is the PostgreSQL term for a row?
Tuple
What is a “tuple” in commercial database terminology
Row
What is the PostgreSQL term for a column
Attribute
What is an “attribute” in commercial database terminology?
Column
What was the academic precursor to PostgreSQL?
Ingres
At what institution was PostgreSQL first created?
UC Berkeley
What commercial database products trace their lineage to PostgreSQL or Ingres?
SQL Server, Informix, Ingres
What is the postmaster?
The postmaster listens for connections from clients and spawns a new backend process to handle each connection. The postmaster manages these backend processes as well as other background utility processes.