Q1 : What is the segment in Greenplum?
A : Database instances in the Greenplum system are called segments. Segment stores the data and carries out the query processing. In Greenplum distributed systems, each segment contains a distinct portion of the data.
A : To remove the currently configured standby master host from your Greenplum Database system, run the following command in the master only:
#
Q3 : How many user schemas are there in the database?
A : Use”dn” at psql prompt.
Q4 : How to check the size of a table?
A : Table Level:
psql> select pg_size_pretty(pg_relation_size(‘schema.tablename’));
Replace schema.tablename with your search table.
Table and Index:
psql> select pg_size_pretty(pg_total_relation_size(‘schema.tablename’));
Replace schema.tablename with your search table.
Q5 : How do I get help on syntax to alter table?
A : In psql session type h alter table which will display the syntax:
gpdb=# h alter table
Each record contains two slots,
Updating a row is treated as a delete and an insert, so the XID (transaction ID) is saved to the
Now let us see how update works with help of an example;
Sample data:
[ROW id][xmin][xmax]
[R1][T1][0] —————-> First time record is inserted, R1
Updating same records after sometime
[R1][T1][T22] ————-> Record R1 is updated; T22 is the new row transaction id
[R22][T22][0] ————-> New updated record R33; Note 0 in xmax indicates new record.
If the record is deleted simply
Q7 : How to check whether Greenplum server is up and running?
A : The gpstate is the utility to check gpdb status.
Use gpstate -Q to show a quick status. Refer to gpstate –help for more options.
Sample output:
[gpadmin@stinger2]/export/home/gpadmin>gpstate -Q
gpadmin-[INFO]:-Obtaining GPDB array type, [Brief], please wait…
gpadmin-[INFO]:-Obtaining GPDB array type, [Brief], please wait…
gpadmin-[INFO]:-Quick Greenplum database status from Master instance only
gpadmin-[INFO]:———————————————————-
gpadmin-[INFO]:-GPDB fault action value = readonly
gpadmin-[INFO]:-Valid count in status view = 4
gpadmin-[INFO]:-Invalid count in status view = 0
gpadmin-[INFO]:———————————————————-
Q8 : Why do we need gpstop -m and gpstart -m?
A : The gpstart -m command allows you to start the master only and none of the data segments and is used primarily by support to get system level information/configuration. An end user would not regularly or even normally use it.
Q9 : How to add mirrors to the array?
A : The gpaddmirrors utility configures mirror segment instances for an existing Greenplum Database system that was initially configured with primary segment instances only.
For more details check help.
# gpaddmirrors –help
Q10 : How to add new user to the database?
A : Use createuser utility to create users. See createuser –help for more details.
You can also use SQL commands in
For example: CREATE USER or ROLE ….
Q11 : How to see the value of GUC?
A : By connecting GPDB database using psql query catalog or do show parameter.
Example:
gpdb# select name,setting from pg_settings where name=’GUC’;
or
gpdb# show ;
Q12 : What is resource queues?
A : Resource queues are used to manager Greenplum database workload management. All user / queries can be prioritized using Resource queues. Refer Admin guide for more details.
Q13 : How do I clone my production databaes to PreProd / QA environment?
A : If Prod and QA on same GPDB cluster, use CREATE database template .
If Prod and QA are on different clusters, use backup and restore utilities.
Q14 : What is Greenplum performance monitor and how to install ?
A : It’s a monitoring tool that collects statistics on system and query performance and builds historical data.
Q15 : What are major differences between Oracle and Greenplum?
A : Oracle is relational database. Greenplum is MPP nature. Greenplum is shared nothing architecture. There are many other differences in terms of functinality and behaviour.
A : Yes
Q17 : What are the tools available in Greenplum to take backup and restores?
A : For non-parallel backups:
Use postgres utililities (pg_dump, pg_dumpall for backup, and pg_restore for restore).
Another useful command for getting data out of
For parallel backups:
gp_dump and
A :Use Greenplum performance monitor (
A : The
Q20 : Which parameters can you use to manage workload in a Greenplum database?
A : workload management is done by creating resource queues and assigning various limits.
A : Use this option if you already have a standby master configured, and just want to resynchronize the data between the primary and backup master host. The Greenplum system catalog tables will not be updated.
#
A : If Prod and QA on same GPDB cluster, use CREATE database
If Prod and QA are on different clusters, use backup and restore utilities.
Q23 : What is good and bad about the Greenplum, compared to Oracle and Greenplum?
A : Greenplum is built on top of Postgresql. It is shared nothing, MPP architecture best for data warehousing env. Good for big data analytics purpose.
Oracle is an all-purpose database.
A : In
“ ?” – for all
“h ” For any SQL syntax help.
Q25 : What is Analyze and how frequen should I run this?
A : ANALYZE collects statistics about the contents of tables in the database, and stores the results in the system table pg_statistic. Subsequently, the query planner uses these statistics to help determine the most efficient execution plans for queries.
It is a good idea to run ANALYZE periodically, or just after making major changes in the contents of a table. Accurate statistics will help the query planner to choose the most appropriate query plan, and thereby improve the speed of query processing. A common strategy is to run VACUUM and ANALYZE once a day during a low-usage time of day.