Tuesday, 5 February 2008

Removing Column does not remove data

When using ALTER TABLE DROP COLUMN SQL Server does not actually remove the data, it simply changes the Meta data in the system tables so that the column no longer exists.

To really remove the data it is necessary to rebuild the clustered index.

NB If the table is a heap, the only way to reclaim the space is to either copy the data out into a new table, or add (and then remove) a clustered index

This is demonstrated using the example below:

-- First create a table to use for the test
CREATE TABLE mytable (
id int not null identity primary key clustered,
somedata char(8000) not null
)
go

--Populate the table with some data
SET NOCOUNT ON

DECLARE @i int
SET @i = 1

WHILE @i <= 50000
BEGIN
INSERT mytable ( somedata ) VALUES( '1234' )
SET @i = @i + 1
END

--Let's have a look at the amount of space used by the new table
DBCC SHOWCONTIG( mytable )
exec sp_spaceused mytable, true
/*
DBCC SHOWCONTIG scanning 'mytable' table...
Table: 'mytable' (1207675350); index ID: 1, database ID: 1
TABLE level scan performed.
- Pages Scanned................................: 50000
- Extents Scanned..............................: 6275
- Extent Switches..............................: 6274
- Avg. Pages per Extent........................: 8.0
- Scan Density [Best Count:Actual Count].......: 99.60% [6250:6275]
- Logical Scan Fragmentation ..................: 0.37%
- Extent Scan Fragmentation ...................: 0.13%
- Avg. Bytes Free per Page.....................: 69.0
- Avg. Page Density (full).....................: 99.15%
DBCC execution completed. If DBCC printed error messages, contact your system administrator.

name,rows,reserved,data,index_size,unused
mytable,50000 ,401544 KB,400000 KB,1496 KB,48 KB

*/
-- As you can see the table covers some 50k pages and around 40mb in size

-- Now let's drop the columns

ALTER TABLE mytable
DROP COLUMN somedata

--Let's now recheck the amount of space used by the table

DBCC SHOWCONTIG( mytable )
exec sp_spaceused mytable, true

/*
DBCC SHOWCONTIG scanning 'mytable' table...
Table: 'mytable' (1207675350); index ID: 1, database ID: 1
TABLE level scan performed.
- Pages Scanned................................: 50000
- Extents Scanned..............................: 6275
- Extent Switches..............................: 6274
- Avg. Pages per Extent........................: 8.0
- Scan Density [Best Count:Actual Count].......: 99.60% [6250:6275]
- Logical Scan Fragmentation ..................: 0.37%
- Extent Scan Fragmentation ...................: 0.13%
- Avg. Bytes Free per Page.....................: 69.0
- Avg. Page Density (full).....................: 99.15%
DBCC execution completed. If DBCC printed error messages, contact your system administrator.

name,rows,reserved,data,index_size,unused
mytable,50000 ,401544 KB,400000 KB,1496 KB,48 KB
*/
-- As you can see, the table uses up exactly the same amount of space as it did prior to the column being dropped

-- So let's rebuild the table indexes
DBCC DBREINDEX(mytable)

-- And recheck the space used by the table
DBCC SHOWCONTIG( mytable )
exec sp_spaceused mytable, true
/*
DBCC SHOWCONTIG scanning 'mytable' table...
Table: 'mytable' (1207675350); index ID: 1, database ID: 1
TABLE level scan performed.
- Pages Scanned................................: 81
- Extents Scanned..............................: 11
- Extent Switches..............................: 10
- Avg. Pages per Extent........................: 7.4
- Scan Density [Best Count:Actual Count].......: 100.00% [11:11]
- Logical Scan Fragmentation ..................: 0.00%
- Extent Scan Fragmentation ...................: 9.09%
- Avg. Bytes Free per Page.....................: 71.3
- Avg. Page Density (full).....................: 99.12%
DBCC execution completed. If DBCC printed error messages, contact your system administrator.

name,rows,reserved,data,index_size,unused
mytable,50000 ,776 KB,648 KB,16 KB,112 KB
*/

-- Now the table is only covering 81 pages / 776KB, so the data has really gone

--Tidy up
DROP TABLE mytable