PostgreSQL | Buda Consulting

7 Ways To Improve SQL Query Performance

by Robert Buda | May 14, 2021 | MySQL, NoSQL, PostgreSQL, SQL Server

How do you improve SQL query performance? That is a big question, and one that we get asked all the time. There is no one answer, but there is a process that we apply to make a difference in query performance. In this post, I will discuss some of the questions we ask, some of the diagnostics we run, and some of the steps we take to reduce the amount of time a query takes.

The questions to ask are similar for any relational database software, so this discussion will apply to Oracle, SQL Server, MySQL, PostgreSQL, and others. I may mention tools or processes by a database-vendor specific name but, for the most part, each software vendor has something that is equivalent.

Query tuning is a complex and iterative process, so no blog post, including this one, would be comprehensive. The objective is to help you understand how to think about tuning from a broader perspective rather than looking only at the query in question, and is more about concepts than syntax.

Questions to Ask When Looking to Improve SQL Query Performance

To narrow down where the problems are with a SQL query, we start with some basic questions about the query and how it is being executed. I will discuss each question and talk about why we ask it, and what information the answer might give us. None of these questions will tell us definitively what the problem is, but they can point us quickly in the right direction and save precious time when a client is waiting for improved response time.

Timeframe

Is the query that we are interested in (hereafter referred to as “our query”) executed during a period when the system is heavily taxed by other processes?

Why we ask: If our query is executed during a very busy time, then the problem may not be with our query at all. Reducing load on the system by examining other queries first (using this same strategy) may be more effective. So we would start by identifying and examining the most resource intensive queries first, to try to reduce overall system load.

Proximity and Size

Does our query take the same amount of time whether it is executed locally or remotely?

Why we ask: If our query is executed remotely (executed in a browser or application on a server other than the database server) and if it returns a large number of rows, then it is possible that the data transfer is the bottleneck, rather than the retrieval of the data from the database. Asking this question may help us take the network out of the equation.

Result Set Characteristics

When our query completes, does it return a large number (millions?) of rows?

Why we ask: When executing our query locally, if it takes a long time to complete, there are two possibilities. Either it takes a long time for the database software to find the data to return, or it takes a long time to return the data to the screen or the application. The former can be fixed by tuning the query; the latter may mean that our query is returning too many rows to be practical. In the latter case, we should revisit the intent of the query to see if an aggregated form of the data would be more usable, or if breaking the result set up into more manageable chunks makes sense. Also, a very large result set may be an indication of an error in the query itself, perhaps a missing join, or missing criteria resulting in a Cartesian product. In this case, we would look at the logic being expressed in the query and ensure that it matches the intent of the query.

Is the result set both large and aggregated or sorted?

Why we ask: Aggregation and sorting on large result sets require significant temporary space. If this is a significant part of the query operations, we want to look at the management of memory buffers, and temp space (System Global Area (SGA), Program Global Area (PGA) and temporary segments or their equivalents). We want to make sure that enough memory is allocated so that we are not excessively writing out to temp space, and that temp space is optimally sized and located.

Is the result set a (relatively) small subset of a large amount of data?

Why we ask: If the database is very large, and if our query returns a small subset of the data, there are two broad solutions that may be applicable: adding or optimizing indexes, and adding or optimizing partitioning. Up to a certain data size, proper indexing alone can provide adequate performance. When data gets very large, however, a combination of indexes and partitions will be necessary to provide adequate performance when querying a subset of the data.

Historical

Has the performance of the query degraded over time?

Why we ask: If the query performed well in the past, but no longer does, look at the growth rates of data in the tables referenced by the query. If the amount of data has increased significantly, new indexes may be required that were not necessary when less data was referenced. Significant data growth may also result in optimizer statistics that no longer reflect the characteristics of the data, requiring a refresh of these statistics if they are not automatically refreshed.

Does the data being queried involve many updates or deletes (as opposed to mostly inserts)?

Why we ask: Data that is frequently updated may result in index or tablespace fragmentation. This may also result in invalid statistics as in the case of significant data growth.

Conclusion

Query tuning is an iterative process and there are many other questions to ask as we get into the details. But the above questions help us see the big picture and can steer us in the right direction very quickly and help prevent us from going down the wrong path and wasting time.

If you have any other questions that you like to ask when tuning that you’d like to share, or if you have an interesting tuning story, please share in the comments.

Migrating an Oracle Schema to PostgreSQL with Partitioning Table(s) in an On-Premises Data Center

by KarlaS | Jul 15, 2020 | Best Practices, open source, Oracle, PostgreSQL

Many companies are looking for open source database solutions to help shrink the IT budget by minimizing or avoiding support costs. One such solution would be migrating an Oracle database to PostgreSQL. PostgreSQL is a widely used open source RDBMS, and the PostgreSQL database model is very similar to Oracle. All PostgreSQL features are available at no cost or minimal cost. Also PostgreSQL professionals are less expensive to hire than Oracle DBAs.

However, migrating an Oracle database to PostgreSQL is not a straightforward task in an on-premises data center. In particular, it is challenging to migrate an entire Oracle schema with partitioning tables. This technical blog shares a simple method to migrate the Oracle schema to a PostgreSQL database using open source tools. It does not cover installation and configuration of PostgreSQL.

Caution: Every environment is different, so test this in a non-production environment before working with production.

Versions used: Oracle 12.2.0.1, PostgreSQL 12.3
Operating System: CentOS Linux 8.
Tools and utilities used: SQLines Data, SQLine Converter, and PGAdmin 4.x

Pre-migration considerations:

There are quite a few data type differences between Oracle and PostgreSQL. Commonly used types (Oracle –> PostgreSQL) include: VARCHAR2–>VARCHAR, CHAR –>CHAR, CLOB –>TEXT, BLOB RAW \–>BYTEA (1 GB limit) or Large object, DATE–>TIMESTAMP(0), TIMESTAMP WITH LOCAL TIME ZONE –> TIMESTAMPTZ, NUMBER–> NUMERIC, NUMBER(4) –>SMALLINT, NUMBER(9) –> INT, NUMBER(18) –> BIGINT.
No package stored procedure is available in PostgreSQL. Consider converting packages into procedures and functions. There are several differences in built-in functions between Oracle and PostgreSQL. Please find the equivalent function in PostgreSQL user manual.

How To Migrate Oracle To PostgreSQL

I have used the SQLines Data open source tool to do this migration. Once you create all required users (roles) and tablespaces and privileges in PostgreSQL, do the initial load using SQLines Data and exclude all partition tables.

Synatax : sqldata -sd=oracle,<OracleUser>/<pwd>@<oralcleDBhost>:<OraclePort>/<OracleDBname> -td=pg,<postgresUser>/<pwd>@<hostname>:<PostgreSQLPort>,<PostgreSQLdbname> -t=<Oracleschema>.* -texcl=partitiontables -smap=<Oracleschema>:<PostgresSchema>

Example: $sqldata -sd=oracle,system/welcome1@orahost1:1521/ORATEST -td=pg,PGTEST/postpwd1@pghost1:5432,TESTDB -t=payroll.* –texcl=employee_pay_tab -smap=payroll:payroll

Truncate all PostgreSQL tables (* truncates only tables in PostgreSQL).

Get the partition table creation script without partitions from the table DDL and change the data types as appropriate to PostgreSQL.

Example:

CREATE TABLE payroll.employee_pay_tab (

Pay_period_id numeric(6),

Pay_period text COLLATE pg_catalog.”default”,

Emp_id numeric(6),

Base_sal numeric(10,2),

fed_tax numeric(10,2),

state_tax numeric(10,2),

medicare numeric(10,2),

ssn_ded numeric(10,2),

ret_ded numeric(10,2),

tot_pay numeric(10,2)

CONSTRAINT XPK_empployee_pay_tab PRIMARY KEY (Pay_period_id,Emp_id) USING INDEX TABLESPACE PAY_INDEX)

PARTITION BY LIST (Pay_period_id);

ALTER TABLE payroll.empployee_pay_tab OWNER to payroll;

ALTER TABLE payroll.empployee_pay_tab SET TABLESPACE PAY_DATA;

CREATE TABLE payroll.empployee_pay_tab_P_TEMP PARTITION OF payroll.empployee_pay_tab DEFAULT TABLESPACE PAY_DATA;

Get a partition creation script from the Oracle partition table from the database.

set linesize 150

set pages 0

set feedback on

set serveroutput on buffer 999999999;

BEGIN

FOR r IN (select ‘CREATE TABLE ‘||table_owner||’.’|| partition_name ||’ PARTITION OF ‘||table_owner||’.’||TABLE_NAME || ‘ FOR VALUES IN (‘ as col1 , HIGH_VALUE , ‘) TABLESPACE PAY_DATA;’ as col2 from dba_tab_partitions where table_owner = ‘PAYROLL’ and table_name=’EMPLOYEE_PAY_TAB’ order by 1) LOOP

IF r.HIGH_VALUE != ‘-1’ THEN

DBMS_OUTPUT.PUT_LINE(

r.col1|| r.HIGH_VALUE || r.col2

);

END IF;

END LOOP;

END;

Get a partition index creation script from the Oracle partition table from the database.

set pages 0

set linesize 150

select ‘CREATE INDEX ‘ ||index_name||’ ON payroll.employee_pay_tab (‘||column_name||’) TABLESPACE CE_INDEX;’ from (select index_name , listagg (column_name, ‘,’) WITHIN GROUP (ORDER BY column_name) column_name FROM dba_ind_columns where table_owner=‘PAYROLL’ and table_name=’EMPLOYEE_PAY_TAB’ GROUP BY index_name order by index_name)
;

Create the partition table, partitions and indexes from above scripts in PostgreSQL. Then load the Oracle schema into PostgreSQL, including partition tables, using the SQLine Data tool.

Syntax : sqldata -sd=oracle,<OracleUser>/<pwd>@<oralcleDBhost>:<OraclePort>/<OracleDBname> -td=pg,<postgresUser>/<pwd>@<hostname>:<PostgreSQLPort>,<PostgreSQLdbname> -t=<Oracleschema>.* -smap=<Oracleschema>:<PostgresSchema>

Example: $sqldata -sd=oracle,system/welcome1@orahost1:1521/ORATEST -td=pg,PGTEST/postpwd1@pghost1:5432,TESTDB -smap=payroll:payroll -topt=truncate

Verify data and create functions, procedures, views and sequences. Get the Sequences scripts from Oracle:

set linesize 150

select ‘CREATE SEQUENCE ‘||SEQUENCE_NAME||’ MINVALUE ‘||MIN_VALUE||’ MAXVALUE ‘||MAX_VALUE||’ INCREMENT BY ‘||INCREMENT_BY||’ START WITH ‘||LAST_NUMBER||’;’ from dba_sequences where SEQUENCE_OWNER=’PAYROLL’;

Get the views, procedure and packages using an Oracle DDL generation command. Use the SQLine converter to convert Oracle code into PostgreSQL. Then get table and column comments scripts from Oracle and run on PostgreSQL.

set heading off

set linesize 300

set feedback off

set verify off

set pages 100

col table_name format a30

col column_name format a30

spool tab_col_comments.sql

select ‘COMMENT ON COLUMN ‘||table_name||’.’||column_name||’ IS ”’||comments||”’;’ from DBA_COL_COMMENTS where owner=‘PAYROLL’ and comments is not null;

select ‘COMMENT ON TABLE ‘||table_name||’ IS ”’||comments||”’;’

from DBA_TAB_COMMENTS where owner=‘PAYROLL’ and comments is not null;

spool off

Now run tab_col_comments.sql in PostgreSQL.

Finally, test your application in a non-production environment.

If you’re thinking of migrating your existing database to an open source platform, contact Buda Consulting for a free consultation on best practices and “lessons learned.”

7 Ways To Improve SQL Query Performance

Questions to Ask When Looking to Improve SQL Query Performance

Timeframe

Proximity and Size

Result Set Characteristics

Historical

Conclusion

Migrating an Oracle Schema to PostgreSQL with Partitioning Table(s) in an On-Premises Data Center

Pre-migration considerations:

How To Migrate Oracle To PostgreSQL

Categories

Recent Posts

Our Address:

Phone

Email