Basic Administration Flashcards
Name some security features
- SSL
- host-based authentication
- object-level permissions
- logging
- groups & roles
Name some recovery and availability features
- Streaming replication (async and sync), aka hot standby
- Cascading streaming replication
- pg_basebackup, hot backup
- PITR (point in time recovery)
Name some cool advanced features
- triggers and functions
- many procedural languages (pgSQL, Perl, TCL, PHP, Java, Python, etc)
- custom procedural languages
- upgrade using pg_upgrade
- unlogged tables
- materialized views
What does MVCC stand for
Multi-Version Concurrency Control
List some characteristics of MVCC
- Maintains data consistency.
- Each transaction sees a snapshot of the database as it was at the beginning of the transaction.
- Writers block writers; nothing else blocks.
- Concurrent transactions are isolated.
- Prevents transactions from seeing inconsistent data.
What does WAL stand for?
Write-Ahead Logs (or Write-Ahead Logging)
How does WAL work?
- Records each data change before it actually takes place.
- Data is not considered ‘safe’ until the log is written to disk.
- Provides recovery in case of crash or failure.
- As DML is executed, changes are recorded in memory in WAL buffers (as well as the shared data buffers).
- When COMMIT is executed, the WAL buffers are flushed to the WAL segment files
- Later the dirty data buffers are written to disk
What is ACID?
Atomicity, Consistency, Isolation, Durability
List some commonly used pg_dump switches
- a Data only
- s Definitions (“schema”) only
- n NAME Dump named schema (“namespace”)
- t NAME Dump named table
- Fp Plain format
- Fc Custom format
- Fd Directory format
- Ft Tar format
- f FILE Dump to named file
- o Dump OIDs
- j JOBS Parallelize directory format dumps
- v Verbose
Name four general methods for backup
- SQL dump (pg_dump, pg_dumpall)
- Filesystem dump
- Continuous archiving
- Streaming replication
List two methods (command line examples) for dealing with very large databases when doing SQL backups
- pg_dump … | gzip -c > file.sql.gz
* pg_dump … | split -b 1m - filename
Describe two methods for restoring SQL dumps done with pg_dump
- psql < TEXT_FORMAT_DUMP_FILE (or psql -f TEXT_FORMAT_DUMP_FILE)
- pg_restore NON_TEXT_FORMAT_DUMP_FILE
Using command defaults for pg_dump, when you restore a backup into a new cluster, is there anything you need to do first?
Yes, create the target database. By default, pg_dump does not include the command to create the target database.
What does the –create (-C) switch to pg_dump do?
pg_dump --create
inserts a command to create and reconnect to the target database before the commands to populate that database.
What does the –clean switch to pg_dump do?
pg_dump --clean
drops the target database prior to recreating it, when using the –create (-C) switch.
List important pg_restore options
pg_restore
- d DB Connect to the specified database, and, if -C (–create) is not specified, restore into this database also.
- C Create the database specified in the backup and restore into it
- a Restore data only (assumes the SQL objects like tables have already been created)
- s Restore object definitions (DDL) only, not data
- t NAME Restore named table
- n NAME Restore named schema/namespace
- v Verbose
What command can dump an entire cluster as SQL?
pg_dumpall
What does pg_dumpall do?
pg_dumpall dumps all databases in the cluster, and also global objects such as roles and tablespaces.
List commonly used pg_dumpall switches
pg_dumpall
- a Dump data only
- s Dump definitions only
- c Clean (drop) objects before recreating
- g Dump global objects (roles, etc) only, not databases
- r Dump roles only, not databases or other global objects
- O Skip ownership commands
- x Skip privileges
- -disable-triggers Disable triggers during data-only restore
- v Verbose
How many times does pg_dumpall ask for authentication credentials?
Once per database (and more?)
Why is it a good idea to use a pgpass file when using pg_dumpall?
A pgpass file is a good idea for pg_dumpall because the user’s credentials are requested once per database in the cluster.
List some characteristics of SQL dumps as a backup mechanism
- Generate a text file containing SQL commands (or a binary representation thereof)
- pg_dump is the relevant command (or pg_dumpall for the entire cluster)
- pg_dump does not block readers or writers
- pg_dump does not operate with special permissions
- pg_dump dumps are internally consistent and are a snapshot of the database at the time pg_dump begins running
Next to SQL dump, what is the next most simple backup approach?
Simple file system level backup is an alternative to SQL dumps, provided that either 1) the database is shut down during the backup, or 2) the native snapshot feature of the filesystem is used (if available).
Why mightn’t filesystem snapshots work without database downtime, even if your filesystem supports this feature?
Filesystem snapshots might not work if the database is spread across multiple filesystems; if not, you would have to stop the database to take the multiple snapshots.
- format: plain or tar - X - include log files created during the backup, so the backup is fully usable - z - compress with gzip - Z - compression level - P - enable progress reporting - h - host on which cluster is running - p - cluster port