Deference between Index scan,Index seek and table scan

Index Scan:
Since a scan touches every row in the table whether or not it qualifies, the cost is proportional to the total number of rows in the table. Thus, a scan is an efficient strategy if the table is small or if most of the rows qualify for the predicate.
Index Seek:
Since a seek only touches rows that qualify and pages that contain these qualifying rows, the cost is proportional to the number of qualifying rows and pages rather than to the total number of rows in the table.
Index Scan is nothing but scanning on the data pages from the first page to the last page. If there is an index on a table, and if the query is touching a larger amount of data, which means the query is retrieving more than 50 percent or 90 percent of the data, and then optimizer would just scan all the data pages to retrieve the data rows. If there is no index, then you might see a Table Scan (Index Scan) in the execution plan.
Index seeks are generally preferred for the highly selective queries. What that means is that the query is just requesting a fewer number of rows or just retrieving the other 10 (some documents says 15 percent) of the rows of the table.
In general query optimizer tries to use an Index Seek which means that optimizer has found a useful index to retrieve recordset. But if it is not able to do so either because there is no index or no useful indexes on the table then SQL Server has to scan all the records that satisfy query condition.
Table scan :
We know that an index seek would be preferred over an index scan, but what should we do if we see a table scan in our execution plan? This demonstration shows the difference between a table scan and an index scan.
A table scan is performed on a table which does not have an index upon it (a heap) – it looks at the rows in the table. An index scan is performed on an indexed table – the index itself.
I’ve seen it written that the two are the same, and I can understand why that conclusion is made. In fact running side-by-side there is little difference between a table scan and an index scan. In terms of performance, they both run the same. The performance balance between the two is changed when you are wanting to do any meaningful operations on the data, such as join another table.
I’d like to demonstrate first the similarity of the two without furthering the query:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
SET NOCOUNT ON;
USE [master]
GO
IF NOT EXISTS (SELECT * FROM sys.databases WHERE [name] = 'Index_Test')
    CREATE DATABASE Index_Test
GO
USE [Index_Test]
GO
IF EXISTS(select * FROM sys.tables WHERE [name] = 'without_index')
    DROP TABLE without_index
CREATE TABLE without_index
(
    ID int NOT NULL,
    value varchar(50) NOT NULL
)
IF EXISTS(select * FROM sys.tables WHERE [name] = 'with_index')
    DROP TABLE with_index
CREATE TABLE with_index
(
    ID int NOT NULL PRIMARY KEY CLUSTERED,
    value varchar(50) NOT NULL
)
INSERT INTO with_index
SELECT 1,'testvalue'
UNION
SELECT 2,'anothertestvalue'
UNION
SELECT 3, 'my last test value'
INSERT INTO without_index
SELECT 1,'testvalue'
UNION
SELECT 2,'anothertestvalue'
UNION
SELECT 3, 'my last test value'
Then if Include Actual Execution Plan (CTRL+M), and execute the following:
1
2
SELECT * FROM dbo.without_index
SELECT * FROM dbo.with_index
Execution Plan Table vs Index Scan
Execution Plan
You’ll notice that the first statement on the table without the index will cause a ‘Table Scan’, while the second statement causes an ‘Index Scan’.
Notice as well that the I/O usage and CPU cost for these two operations are different, however the sub tree cost for the two operations is identical.
Table Scan:
Table Scan cost
Table Scan cost
Clustered Index Scan:
Index Scan cost
Index Scan cost
If we were to create a non-clustered index on the table ‘without_index’:
1
2
3
4
5
CREATE NONCLUSTERED INDEX [IX_TEST] ON [dbo].[without_index]
(
[ID] ASC,
[value] ASC
)
And we perform the two selects statements we’ll get two index scans:
Index Scan vs Non-Clustered Index Scan
Index Scans
You’ll notice that the I/O and CPU cost is the same.
This is only for 3 rows of ordered data inserted. If you take that to millions of rows the results are still the same. The real benefit to having an index on the table is made aparent when you want to perform any operations on this data such as joining another table. If we magnify the scale of the inserts up to 1m, when a join onto another table is performed the table scan performs a sort and uses a paralellism to hash join the data. Whereas our indexed table is straing joined as the data is already sorted. This is when the performance cost is 96% of the batch for the without_index table and just 4% for with_index table over 1m rows.

Conclusion

So we know that index seeks are preferred over index scans. This demonstration shows that index scans are preferred over table scans. In fact, a good rule of thumb is to always include at least one index on a table this way we avoid table scans altogether.

Comments

Popular Posts