How to optimize SQL Queries??
Here I'm sharing some of valuable tips to optimize SQL queries to gain high performance
- In a SELECT statement use the column names you need to select instead of using (*)
- When ever possible, use LEFT JOINs instead of INNER JOINs as when you are using a INNER JOIN the DBMS runs through indexes of both tables and when you are using a LEFT JOIN it runs through the indexes of one table and search for matching record in the other table.
- Avoid using NOT operators. It is much faster to search for an exact match (positive operator) such as using the LIKE, IN, EXIST or = symbol operator instead of a negative operator such as NOT LIKE, NOT IN, NOT EXIST or != symbol. Using a negative operator will cause the search to find every single row to identify that they are ALL not belong or exist within the table. On the other hand, using a positive operator just stop immediately once the result has been found.
- Use EXITS instead of COUNT to check if a record is exists.
- Avoid using IN clause. EXISTS is another option or you can use JOINs
Eg:
Doing this is very expensive because SQL query will evaluate the outer query first before proceed with the inner query. Instead we can use this instead.
CITY
FAVOURITE_CITY
SELECT id, name, address, birthday FROM student_master WHERE id IN (SELECT id FROM good_student)
Doing this is very expensive because SQL query will evaluate the outer query first before proceed with the inner query. Instead we can use this instead.
SELECT good.id, all.name, all.address, all.birthday FROM good_student good LEFT JOIN student_master ALL ON ALL.id = good.id
- Use UNION ALL instead of OR as this will utilize indexes
- Avoid using HAVING clause. HAVING clause is used to filter the rows after all the rows are selected. It is just like a filter. Do not use HAVING clause for any other purposes.
Eg:
SELECT name, count(name) FROM student_master WHERE name != 'marry' AND name != 'john' GROUP BY name;
Instead of:
SELECT name, count(name) FROM student_master GROUP BY name HAVING name!= 'marry' AND name!= 'john';
- When joining tables always use filtered subsets. Please refer below examples to see the explanation.
CITY
CITY_ID | CITY_NAME |
---|---|
1 | Paris |
2 | Sydney |
3 | Madrid |
4 | Colombo |
5 | Auckland |
6 | Mumbai |
7 | Venice |
FAVOURITE_CITY
SEQ_ID | CITY_ID | HAVE_VISITED |
---|---|---|
1 | 1 | FALSE |
2 | 3 | TRUE |
3 | 4 | TRUE |
Option 1:
SELECT C.CITY_ID, C.CITY_NAME FROM CITY C INNER JOIN FAVOURITE_CITY FC ON C. CITY_ID=FC. CITY_ID WHERE FC.HAVE_VISITED='TRUE'
Here we are using a INNER JOIN so the indexes of both tables will be scanned and even though the result is correct its a costly operation.
Option 2 (Best):
SELECT C.CITY_ID, C. CITY_NAME FROM FAVOURITE_CITY FC LEFT JOIN CITY C ON C. CITY_ID=FC. CITY_ID WHERE FC.HAVE_VISITED='TRUE'
Here we are retrieving only the favourite cities (small subset) and joining master table to get the relevant master information. This is the way to do it.
I need to get my favorite cities which I have visited but the Name should begin with 'C'
Option 1:
SELECT C.CITY_ID, C.CITY_NAME FROM FAVOURITE_CITY FC LEFT JOIN CITY C ON C. CITY_ID=FC. CITY_ID WHERE FC. HAVE_VISITED =TRUE' AND C.CITY LIKE 'C%'
This query works fine. But when joining the two tables we can increase the performance by minimizing the data subset
Option 2 (Best):
SELECT C.CITY_ID, C.CITY_NAME FROM FAVOURITE_CITY FC LEFT JOIN CITY C ON C. CITY_ID=FC. CITY_ID AND C.CITY LIKE 'C%' WHERE FC. HAVE_VISITED
Here we get only a filtered subset of data in CITY table. So rather than using this (CITY LIKE 'C%') condition in WHERE clause it will be more useful to use when joining the tables.