Protecting Greenplum databases
with Spectrum Protect using SPFS
Sample Server Policy Domain
define domain GREENDOM desc="SPFS"
define policyset GREENDOM GREENSET1 desc="SPFS"
define mgmtclass GREENDOM GREENSET1 GREEN_35DAYS desc="SPFS"
define copygr GREENDOM GREENSET1 GREEN_35DAYS RETO=35 RETE=35 VERD=NOLIMIT VERE=NOLIMIT DEST=DEDUPPOOL
define copygr GREENDOM GREENSET1 GREEN_35DAYS RETVER=35 DEST=DEDUPPOOL TYPE=ARCHIVE
assign defmgmt GREENDOM GREENSET1 GREEN_35DAYS
activate policyset GREENDOM GREENSET1
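A matching client node can then be registered to the domain. This is a sketch, not part of the original policy sample: the node name follows the dsm.sys sample below, the password is a placeholder, and archdelete=no/backdelete=no match the retention behavior described at the end of this document.
register node greennode <password> domain=GREENDOM archdelete=no backdelete=no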
Sample /etc/spfs/spfs.opt:
* Sample spfs.opt
*
MOUNTPOINT /arch
NODENAME greennode
NODEPWDFILE /etc/spfs/TSM.PWD
OPTIONFILE /etc/spfs/dsm.opt
FACILITY log_local0
DATATYPE archive
IDLETIME 300
CACHETIME 129600
SENDSTATISTICS yes
WORKERS 48
Sample dsm.sys
* greenplum spfs api backup
* Create a symbolic link from the API directory so it shares the main dsm.sys file:
* ln -s /opt/tivoli/tsm/client/ba/bin/dsm.sys /opt/tivoli/tsm/client/api/bin64/dsm.sys
ServerName greennode
NODENAME greennode
COMMMethod TCPip
TCPServeraddress 192.168.100.100
TCPPort 1500
* HTTPPORT must be different for each scheduler service on the same server, e.g. 1581, 1582, etc.
HTTPPort 1582
PASSWORDACCESS Generate
* Default TCPWINDOWSIZE is 64; 256 is usually the largest value that can be set
* without tuning tcp in sysctl.conf. If set too large, a TCPWINDOWSIZE error is
* logged only in dsmerror.log, so check the log after changing it.
TCPWINDOWSIZE 256
TCPNODELAY Yes
TCPBUFFSIZE 64
SCHEDMODE polling
QUERYSCHEDPERIOD 1
* Only set MANAGEDSERVICES if using dsmcad for the scheduler instead of the legacy dsmc sched
MANAGEDSERVICES SCHEDULE
SCHEDLOGNAME /var/opt/logs/spfs/spfs_dsmsched.log
ERRORLOGNAME /var/opt/logs/spfs/spfs_dsmerror.log
SCHEDLOGRET 7 D
ERRORLOGRET 7 D
* Following two options enable client side dedup with compression.
* Only use compression with deduplication.
DEDUPLICATION Yes
COMPRESSION yes
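After linking dsm.sys into the API directory, basic connectivity and password storage can be verified with the backup-archive client. These commands are an illustration using the paths from the samples above:
# ls -l /opt/tivoli/tsm/client/api/bin64/dsm.sys
# dsmc query session -optfile=/etc/spfs/dsm.opt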
/etc/spfs/dsm.opt:
* Must match the ServerName stanza in dsm.sys
ServerName greennode
With 2 cluster nodes and 48 segments per node, and each cluster node on its own Spectrum Protect node name, each node starts up to 48 sessions to Spectrum Protect during a backup, so 96 in total. This doubles when client-side dedup is used, because an out-of-band control session is also started per data session to handle the client-side dedup chunk queries, so a single backup can open close to 200 sessions to the Spectrum Protect server. This is a consequence of the degree of parallelism Greenplum imposes. The "extra" dedup sessions are not started during a restore.
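The session arithmetic can be sketched in plain shell (the numbers are the ones from this example layout, not derived from any tool):

```shell
# Values from the example above: 2 cluster nodes, 48 SPFS workers per node.
NODES=2
WORKERS=48
DATA_SESSIONS=$((NODES * WORKERS))   # data sessions during a backup
WITH_DEDUP=$((DATA_SESSIONS * 2))    # client-side dedup roughly doubles this
echo "$DATA_SESSIONS data sessions, about $WITH_DEDUP with client-side dedup"
```

The Spectrum Protect server's MAXSESSIONS setting has to accommodate this peak.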
Each node needs WORKERS <nodesegcount> configured in /etc/spfs/spfs.opt; 48 in this case.
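The primary segment count per host can be read from the Greenplum catalog; for example (the hostname is a placeholder):
# psql -At -c "SELECT count(*) FROM gp_segment_configuration WHERE role='p' AND content >= 0 AND hostname='sdw1';"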
gpbackup should NOT be used with --single-data-file or --jobs. Also, if using Spectrum Protect deduplication, --no-compression should be set.
Mount SPFS filesystem
# mount.spfs /arch
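Before pointing backups at it, the mount can be checked with standard tools, for example:
# df -h /arch
# grep /arch /proc/mounts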
Sample usage commands:
# Full/Incremental:
# gpbackup --backup-dir=/arch/gpbackup --dbname testdb --no-compression --leaf-partition-data --verbose
# gpbackup --backup-dir=/arch/gpbackup --dbname testdb --no-compression --leaf-partition-data --verbose --incremental
# dropdb restdb ; time gprestore --plugin-config ~/gpbackup_spfs_plugin.yaml --timestamp 20201203113330 --create-db --redirect-db restdb --verbose
#
# Table Backup/Restore:
# gpbackup --backup-dir=/arch/gpbackup --dbname testdb --no-compression --leaf-partition-data --include-table public.tab10 --verbose
# dropdb restdb ; createdb restdb ; time gprestore --backup-dir=/arch/gpbackup --timestamp 20201203133225 --redirect-db restdb --include-table public.tab10 --verbose
#
With the Spectrum Protect nodes registered with archdel=no and backdel=no, and SPFS configured to use archives, Spectrum Protect expires the backups when they reach their archive retention age, and the client owner has no control over retention. It will not be possible to remove the backups, or mark them deleted, with gpbackup-manager.
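The retention in force can be inspected on the server console, and the archived backup files listed from a client, for example (the path pattern is illustrative):
query copygr GREENDOM GREENSET1 GREEN_35DAYS type=archive
# dsmc query archive "/arch/gpbackup/*" -subdir=yes -optfile=/etc/spfs/dsm.opt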