Protecting Greenplum databases

Tomas Dalebjörk
Dec 24, 2020

with Spectrum Protect using SPFS

Sample Server Policy Domain

define domain GREENDOM desc="SPFS"

define policyset GREENDOM GREENSET1 desc="SPFS"

define mgmtclass GREENDOM GREENSET1 GREEN_35DAYS desc="SPFS"

define copygr GREENDOM GREENSET1 GREEN_35DAYS RETO=35 RETE=35 VERD=NOLIMIT VERE=NOLIMIT DEST=DEDUPPOOL

define copygr GREENDOM GREENSET1 GREEN_35DAYS RETVER=35 DEST=DEDUPPOOL TYPE=ARCHIVE

assign defmgmt GREENDOM GREENSET1 GREEN_35DAYS

activate policyset GREENDOM GREENSET1
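
With the policy set activated, each Greenplum host also needs an SP node registered in the domain. A minimal sketch, assuming the node name greennode used in the samples below and a placeholder password; the archdelete=no and backdelete=no settings match the retention behavior described at the end of this article:

register node greennode <password> domain=GREENDOM archdelete=no backdelete=no

q node greennode f=d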

Sample /etc/spfs/spfs.opt:

* Sample spfs.opt

*

MOUNTPOINT /arch

NODENAME greennode

NODEPWDFILE /etc/spfs/TSM.PWD

OPTIONFILE /etc/spfs/dsm.opt

FACILITY log_local0

DATATYPE archive

IDLETIME 300

CACHETIME 129600

SENDSTATISTICS yes

WORKERS 48
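
FACILITY log_local0 sends the SPFS messages to syslog on the local0 facility, so a matching syslog rule is needed to land them in a file. A minimal sketch, assuming rsyslog and an arbitrary log path:

# cat /etc/rsyslog.d/spfs.conf

local0.* /var/log/spfs/spfs.log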

Sample dsm.sys

* greenplum spfs api backup

* Create symbolic link from the api directory to share the main dsm.sys file

* ln -s /opt/tivoli/tsm/client/ba/bin/dsm.sys /opt/tivoli/tsm/client/api/bin64/dsm.sys

ServerName greennode

NODENAME greennode

COMMMethod TCPip

TCPServeraddress 192.168.100.100

TCPPort 1500

* HTTPPORT should be different for each scheduler service on the same server, e.g. 1581, 1582, etc.

HTTPPort 1582

PASSWORDACCESS Generate

* Default TCPWINDOWSIZE is 64, 256 is usually the largest value that can be set

* without tuning tcp in sysctl.conf. If set too large, a TCPWINDOWSIZE error will be

* silently logged in dsmerror.log, so check the error log when changing it.

TCPWINDOWSIZE 256

TCPNODELAY Yes

TCPBUFFSIZE 64

SCHEDMODE polling

QUERYSCHEDPERIOD 1

* Only set MANAGEDSERVICES if using dsmcad for the scheduler, not legacy dsmc sched (see the scheduler sketch after this file)

MANAGEDSERVICES SCHEDULE

SCHEDLOGNAME /var/opt/logs/spfs/spfs_dsmsched.log

ERRORLOGNAME /var/opt/logs/spfs/spfs_dsmerror.log

SCHEDLOGRET 7 D

ERRORLOGRET 7 D

* Following two options enable client side dedup with compression.

* Only use compression with deduplication.

DEDUPLICATION Yes

COMPRESSION yes
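
If MANAGEDSERVICES SCHEDULE is used, dsmcad must be started against this option file, and a client schedule must exist on the server side. A minimal sketch; the schedule name, wrapper script and start time are placeholders, not part of the original setup:

# dsmcad -optfile=/etc/spfs/dsm.opt

define schedule GREENDOM GPBACKUP_DAILY action=command objects="/usr/local/bin/gpbackup_daily.sh" starttime=01:00

define association GREENDOM GPBACKUP_DAILY greennode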

Sample /etc/spfs/dsm.opt:

ServerName greennode
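
To confirm that the option file resolves the dsm.sys stanza and the node can authenticate, a quick check:

# dsmc query session -optfile=/etc/spfs/dsm.opt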

With 2 cluster nodes and 48 segments per node, and each node using its own SP node name, every node will start up to 48 sessions to Spectrum Protect during backup, 96 in total. This doubles when client-side dedup is used, because an out-of-band control session is also started alongside each data session to handle client-side dedup chunk queries, so a single backup will open close to 200 sessions against the Spectrum Protect server. This is a consequence of the way Greenplum forces parallelism. The "extra" dedup sessions are not started during a restore.
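
The server side must be sized for this session count. A minimal sketch of checking and raising the server limit, assuming an admin session; 300 is an illustrative value, not a recommendation:

q opt maxsessions

setopt maxsessions 300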

Each node needs to be configured to allow WORKERS <nodesegcount> in /etc/spfs/spfs.opt, so 48 in this case.
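
The per-host primary segment count can be read straight from the Greenplum catalog. A minimal sketch, run as gpadmin against any database (content >= 0 excludes the master, role = 'p' counts primaries only):

# psql -d postgres -c "SELECT hostname, count(*) FROM gp_segment_configuration WHERE content >= 0 AND role = 'p' GROUP BY hostname;"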

gpbackup should NOT be used with --single-data-file or --jobs. Also, when relying on SP deduplication, --no-compression should be set.

Mount SPFS filesystem

# mount.spfs /arch
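
To verify the filesystem is mounted and writable before pointing gpbackup at it:

# grep /arch /proc/mounts

# touch /arch/.spfs_test && rm /arch/.spfs_test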

Sample usage commands:

# Full/Incremental:

# gpbackup --backup-dir=/arch/gpbackup --dbname testdb --no-compression --leaf-partition-data --verbose

# gpbackup --backup-dir=/arch/gpbackup --dbname testdb --no-compression --leaf-partition-data --verbose --incremental

# dropdb restdb ; time gprestore --plugin-config ~/gpbackup_spfs_plugin.yaml --timestamp 20201203113330 --create-db --redirect-db restdb --verbose

#

# Table Backup/Restore:

# gpbackup --backup-dir=/arch/gpbackup --dbname testdb --no-compression --leaf-partition-data --include-table public.tab10 --verbose

# dropdb restdb ; createdb restdb ; time gprestore --backup-dir=/arch/gpbackup --timestamp 20201203133225 --redirect-db restdb --include-table public.tab10 --verbose

#
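
The gprestore example above references ~/gpbackup_spfs_plugin.yaml. A minimal sketch, assuming the SPFS plugin follows the standard gpbackup storage-plugin config layout; the executablepath and the folder option are assumptions, not confirmed SPFS values:

# cat ~/gpbackup_spfs_plugin.yaml

executablepath: /usr/local/bin/gpbackup_spfs_plugin

options:

  folder: /arch/gpbackup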

With the SP nodes set to archdel=no and backdel=no, and SPFS configured to use archives, Spectrum Protect will expire the backups as they reach their archive retention age, and the client owner will have no control over retention. It will not be possible to remove the backups, or mark them as deleted, with gpbackup-manager.
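
To see what is actually stored, and confirm the retention in effect, the backups can be listed from the client side as archive objects. A quick check, assuming the node and option file configured above:

# dsmc query archive "/arch/gpbackup/*" -subdir=yes -optfile=/etc/spfs/dsm.opt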
