Saturday, February 28, 2015

How to optimize SQL Queries??

How to optimize SQL Queries??

Here I'm sharing some of valuable tips to optimize SQL queries to gain high performance

  • In a SELECT statement use the column names you need to select instead of using (*)
  • When ever possible, use LEFT JOINs instead of INNER JOINs as when you are using a INNER JOIN the DBMS runs through indexes of both tables and when you are using a LEFT JOIN it runs through the indexes of one table and search for matching record in the other table.
  • Avoid using NOT operators. It is much faster to search for an exact match (positive operator) such as using the LIKE, IN, EXIST or = symbol operator instead of a negative operator such as NOT LIKE, NOT IN, NOT EXIST or != symbol. Using a negative operator will cause the search to find every single row to identify that they are ALL not belong or exist within the table. On the other hand, using a positive operator just stop immediately once the result has been found.
  • Use EXITS instead of COUNT to check if a record is exists.
  • Avoid using IN clause. EXISTS is another option or you can use JOINs
Eg:
SELECT id,
       name,
       address,
       birthday
FROM student_master
WHERE id IN
    (SELECT id
     FROM good_student)

Doing this is very expensive because SQL query will evaluate the outer query first before proceed with the inner query. Instead we can use this instead.


SELECT good.id,
       all.name,
       all.address,
       all.birthday
FROM good_student good
LEFT JOIN student_master ALL ON ALL.id = good.id

  • Use UNION ALL instead of OR as this will utilize indexes
  • Avoid using HAVING clause. HAVING clause is used to filter the rows after all the rows are selected. It is just like a filter. Do not use HAVING clause for any other purposes. 
Eg:
SELECT name,
       count(name)
FROM student_master
WHERE name != 'marry'
  AND name != 'john'
GROUP BY name;

Instead of:
SELECT name,
       count(name)
FROM student_master
GROUP BY name
HAVING name!= 'marry'
AND name!= 'john';

  • When joining tables always use filtered subsets. Please refer below examples to see the explanation.

CITY
CITY_ID CITY_NAME 
1 Paris
2 Sydney
3 Madrid
4 Colombo
5 Auckland
6 Mumbai
7 Venice


FAVOURITE_CITY
SEQ_ID CITY_ID HAVE_VISITED
1 1 FALSE
2 3 TRUE
3 4TRUE

I need to get all the cities I have visited??
Option 1:
SELECT C.CITY_ID,
       C.CITY_NAME
FROM CITY C
INNER JOIN FAVOURITE_CITY FC ON C. CITY_ID=FC. CITY_ID
WHERE FC.HAVE_VISITED='TRUE'
Here we are using a INNER JOIN so the indexes of both tables will be scanned and even though the result is correct its a costly operation.

Option 2 (Best):
SELECT C.CITY_ID,
       C. CITY_NAME
FROM FAVOURITE_CITY FC
LEFT JOIN CITY C ON C. CITY_ID=FC. CITY_ID
WHERE FC.HAVE_VISITED='TRUE'

Here we are retrieving only the favourite cities (small subset) and joining master table to get the relevant master information. This is the way to do it. 


I need to get my favorite cities which I have visited but the Name should begin with 'C'
Option 1: 
SELECT C.CITY_ID,
       C.CITY_NAME
FROM FAVOURITE_CITY FC
LEFT JOIN CITY C ON C. CITY_ID=FC. CITY_ID
WHERE FC. HAVE_VISITED =TRUE' AND C.CITY LIKE 'C%'

This query works fine. But when joining the two tables we can increase the performance by minimizing the data subset

Option 2 (Best):
SELECT C.CITY_ID,
       C.CITY_NAME
FROM FAVOURITE_CITY FC
LEFT JOIN CITY C ON C. CITY_ID=FC. CITY_ID
AND C.CITY LIKE 'C%'
WHERE FC. HAVE_VISITED

Here we get only a filtered subset of data in CITY table. So rather than using this (CITY LIKE 'C%') condition in WHERE clause it will be more useful to use when joining the tables.