Row size and NULL values

I’ve got a question at work, how row size (and table size) vary when you introduce columns with null values in the table. The question might be very simple, but the answer is not that simple. A row is basically 2 pieces: a row header and a row data. The sum of both will give you the row size. Simple as that.

For the row header the following statements are true:
1) A row header never grows (fixed size)
2) Each row has a row header
3) For non cluster tables row header size is 3 bytes
4) 3 bytes = 1 byte for flags + 1 byte if row is locked + 1 byte for column count

If row header is 3 bytes, how will vary the row data when you introduce NULL in different order in table. As example, i’ve create a table with 3 columns datatype char(1) (1 byte).

SQL> create table t (a char(1), b char(1), c char(1));
Table created.
SQL> insert into t(a,b,c) values(NULL,NULL,NULL);
SQL> insert into t(a,b,c) values('K',NULL,NULL);
SQL> insert into t(a,b,c) values(NULL,'K',NULL);
SQL> insert into t(a,b,c) values(NULL,NULL,'K');
SQL> insert into t(a,b,c) values('K',NULL,'K');
SQL> insert into t(a,b,c) values('K','K', NULL);
SQL> commit;
Commit complete.
SQL> alter system checkpoint;
System altered.

Now we need to find the block and dump it to check the size of each row and how null values affect the size of the row. For a better understanding of table t contents, next ASCII will help:

A | B | C
---------------
NULL| NULL| NULL
'K' | NULL| NULL
NULL| 'K' | NULL
NULL| NULL| 'K'
'K' | NULL| 'K'
'K' | 'K' | NULL
SQL> select distinct dbms_rowid.rowid_relative_fno(ROWID), 
dbms_rowid.rowid_block_number(ROWID) from t;
DBMS_ROWID.ROWID_RELATIVE_FNO(ROWID) DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)
------------------------------------ ------------------------------------
 1 89249
SQL> alter system dump datafile 1 block 89249;
System altered.
data_block_dump,data header at 0x7f988584665c
===============
tsiz: 0x1fa0
hsiz: 0x1e
pbl: 0x7f988584665c
 76543210
flag=--------
ntab=1
nrow=6
frre=-1
fsbo=0x1e
fseo=0x1f7c
avsp=0x1f4c
tosp=0x1f4c
0xe:pti[0] nrow=6 offs=0
0x12:pri[0] offs=0x1f9d
0x14:pri[1] offs=0x1f98
0x16:pri[2] offs=0x1f92
0x18:pri[3] offs=0x1f8b
0x1a:pri[4] offs=0x1f83
0x1c:pri[5] offs=0x1f7c
block_row_dump:
tab 0, row 0, @0x1f9d
tl: 3 fb: --H-FL-- lb: 0x1 cc: 0
tab 0, row 1, @0x1f98
tl: 5 fb: --H-FL-- lb: 0x1 cc: 1
col 0: [ 1] 4b
tab 0, row 2, @0x1f92
tl: 6 fb: --H-FL-- lb: 0x1 cc: 2
col 0: *NULL*
col 1: [ 1] 4b
tab 0, row 3, @0x1f8b
tl: 7 fb: --H-FL-- lb: 0x1 cc: 3
col 0: *NULL*
col 1: *NULL*
col 2: [ 1] 4b
tab 0, row 4, @0x1f83
tl: 8 fb: --H-FL-- lb: 0x1 cc: 3
col 0: [ 1] 4b
col 1: *NULL*
col 2: [ 1] 4b
tab 0, row 5, @0x1f7c
tl: 7 fb: --H-FL-- lb: 0x1 cc: 2
col 0: [ 1] 4b
col 1: [ 1] 4b
end_of_block_dump

The things that we need to know (for this exercice) from this dump is:

1) nrow=6 : Numbers of rows in table
2) Rows go from row 0 to row 5 (6 rows)
3) tl: X represents the size of the row in bytes.
4) cc: X represents “column count” for columns without NULL values

A little analysis on tl parameter shows the following:

A | B | C
---------------
NULL| NULL| NULL --> 3 bytes (Row #0)
'K' | NULL| NULL --> 5 bytes
NULL| 'K' | NULL --> 6 bytes
NULL| NULL| 'K' --> 7 bytes
'K' | NULL| 'K' --> 8 bytes
'K' | 'K' | NULL --> 7 bytes (Row #5)

Explanation for each row size:

Row #0:
3 bytes for this row, all NULL values, means that all row size is only the size of row header (fixed 3 bytes)

Row #1:
5 bytes = ‘K’ char is 1 byte + 1 byte that used for column lenght + 3 bytes of row header (fixed size)

Row #2:
6 bytes. That’s where things get very interesting. If NULL column is (in order) before the any real data, it will use one byte to “store” the NULL value. On Row #2 6 bytes sum is: 1 byte for NULL value + 1 byte for ‘K’ char + 1 byte for column lenght + 3 bytes for row header (fixed size).
As you see, last NULL value (no data after) will not use any storage.

Row #3:
Same explanation that Row #2, except that the first 2 columns are null, and they use 2 bytes: 2 bytes for 2 first 2 NULL columns + 1 byte for ‘K’ char + 1 byte for column lenght + 3 bytes for row header and the final size is 7 bytes

Row #4:
8 bytes = 2 byte for column lenght and data (‘K’) + 1 byte for column B NULL + 2 byte for column lenght and data (‘K’) + 3 bytes of row header

Row #5:
As you can also see, column C is NULL and does not take any storage so the 7 bytes is: 4 byte for column lenght and data (‘K’) + 3 bytes of row header

Conclusions on this example for a specified row (may apply to other datatypes too :-):

1) If you add a column that contains NULL value in the end of table column order, it will NOT take any storage.
2) If you add a column that contains NULL value between a column with data, it will use additional bytes.
3) If your row only contains NULL values, it will take (for non-clustered tables) 3 bytes, no matter how many columns you have
4) For a better understanding of a row check this picture

Hope you find this useful,

Luís

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s