Page and extent architecture guide

Analytics Platform System (PDW)

SQL database in Microsoft Fabric

This guide describes the structure of pages and extents, and how they are organized within data files.

A page is a fundamental unit of data storage in the Database Engine. The disk space allocated to a data file (.mdf or .ndf) in a database is logically divided into pages numbered contiguously from 0 to n. Disk I/O operations against data files are performed at the page level. That is, the Database Engine reads or writes whole data pages.

An extent is a collection of eight physically contiguous pages, used to manage pages efficiently. Every page belongs to an extent.
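The numbering scheme above lends itself to simple arithmetic. The following sketch assumes the 8-KiB page size used by the Database Engine, and assumes that extents are aligned so that extent 0 holds pages 0 through 7. It shows how a page number maps to a byte offset in the data file and to the extent that contains it. The function names are illustrative and are not part of any SQL Server API.

```python
PAGE_SIZE = 8 * 1024       # every page is 8 KiB
PAGES_PER_EXTENT = 8       # an extent is eight contiguous pages


def page_byte_offset(page_number: int) -> int:
    """Byte offset in the data file at which the given page starts."""
    return page_number * PAGE_SIZE


def extent_of(page_number: int) -> int:
    """Zero-based extent number containing the page.

    Assumption: extent 0 starts at page 0, so extents never straddle
    a multiple-of-eight page boundary.
    """
    return page_number // PAGES_PER_EXTENT


def pages_in_extent(extent_number: int) -> range:
    """Page numbers that make up the given extent."""
    first = extent_number * PAGES_PER_EXTENT
    return range(first, first + PAGES_PER_EXTENT)


# Size of one extent in bytes: 8 pages of 8 KiB each, that is, 64 KiB.
EXTENT_SIZE = PAGES_PER_EXTENT * PAGE_SIZE
```

For example, under these assumptions page 17 starts at byte 139,264 of the file and belongs to extent 2, which spans pages 16 through 23.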

Transaction log files (.ldf) don’t contain pages. They contain a series of log records, which don’t have a fixed size.

In a regular book, all content is written on pages. Similar to a book, the Database Engine writes all data rows on pages. The size of every page is the same: 8 KiB. In a book, most pages contain the data, or the main content of the book. Some pages contain metadata describing the content, for example, the table of contents and the index.

Similarly, most pages in the database contain actual rows of data. These are called data pages. Text/LOB pages also contain data, but are used only by large object (LOB) data types. Index pages contain index structures that help find data efficiently. Finally, a variety of system pages store the metadata describing the organization and properties of the data.

The following table describes page types.

| Page type | Contents |
| --- | --- |
| Data | Data rows with all data. Data in columns using the LOB data types can also be partially stored on data pages. |
| Text/LOB | Data in columns using the LOB data types, such as text, ntext, image, varchar(max), nvarchar(max), varbinary(max), and xml. Data in variable-length columns when the data row exceeds 8 KiB, for columns using data types such as varchar, nvarchar, varbinary, and sql_variant. |
| Index | B-tree index structures. |
| Global Allocation Map (GAM), Shared Global Allocation Map (SGAM) | Information about allocated and unallocated extents. |
| Page Free Space (PFS) | Information about page allocation and free space available on pages. |
| Index Allocation Map (IAM) | Information about the extents used by a heap or index in an allocation unit. |
| Bulk Changed Map (BCM) | Information about the extents modified by bulk operations since the last transaction log backup. |
| Differential Changed Map (DCM) | Information about the extents that have changed since the last full database backup. |

Each page begins with a 96-byte header that stores system information about the page. This information includes the page number and the page type, and can include other metadata such as the object ID and the index ID of the object and index that own the page.

A structure called the slot array is stored at the end of the page. Each 2-byte element in the slot array corresponds to a row stored on the page. A slot array element stores the byte offset of the row relative to the start of the page. The Database Engine uses these offsets to locate rows on a page.
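As a rough worked example of the space accounting this implies: a page has 8,192 bytes, of which 96 go to the header, and each row costs its own stored length plus one 2-byte slot array element. The sketch below estimates how many rows of a given stored size fit on one page. It is deliberately simplified: `rows_per_page` is a hypothetical helper, `row_size` is assumed to already include any per-row overhead, and the calculation ignores any additional limits the Database Engine applies.

```python
PAGE_SIZE = 8192     # 8 KiB page
HEADER_SIZE = 96     # page header at the start of the page
SLOT_SIZE = 2        # one slot array element per row


def rows_per_page(row_size: int) -> int:
    """Estimate how many rows of `row_size` bytes fit on one page.

    Each row consumes row_size bytes of row storage plus SLOT_SIZE
    bytes in the slot array, out of the space left after the header.
    """
    if row_size <= 0:
        raise ValueError("row_size must be a positive number of bytes")
    usable = PAGE_SIZE - HEADER_SIZE          # 8,096 bytes
    return usable // (row_size + SLOT_SIZE)
```

With these numbers, a 100-byte row costs 102 bytes in total, so about 79 such rows fit in the 8,096 bytes that follow the header.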

When the Database Engine adds a row to an empty page, it stores the row immediately after the header. The slot array element for the first row is stored at the very end of the page. As more rows are added, they are stored one after another from the beginning to the end of the page, while the slot array grows from the end to the beginning of the page, as shown in the following diagram.
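The two-directional growth described above can be modeled in a few lines. The following is a minimal sketch, not the Database Engine's actual implementation: the `Page` class and its methods are hypothetical, the slot count and next free offset are kept in Python attributes rather than in the page header, little-endian encoding of the 2-byte offsets is an assumption, and row deletion or compaction is not modeled.

```python
import struct

PAGE_SIZE = 8192     # 8 KiB page
HEADER_SIZE = 96     # header occupies the first 96 bytes
SLOT_SIZE = 2        # each slot array element is 2 bytes


class Page:
    """Toy model of a page: rows grow forward, slot array grows backward."""

    def __init__(self) -> None:
        self.buf = bytearray(PAGE_SIZE)
        self.slot_count = 0
        self.free_offset = HEADER_SIZE   # first row goes right after the header

    def _slot_pos(self, slot: int) -> int:
        # Slot 0 occupies the last two bytes of the page, slot 1 the two
        # bytes before it, and so on toward the beginning of the page.
        return PAGE_SIZE - SLOT_SIZE * (slot + 1)

    def add_row(self, row: bytes) -> int:
        """Append a row and record its offset; return the new slot number."""
        slot_pos = self._slot_pos(self.slot_count)
        # The row must end before the new slot array element begins.
        if self.free_offset + len(row) > slot_pos:
            raise ValueError("row does not fit on this page")
        self.buf[self.free_offset:self.free_offset + len(row)] = row
        # Assumed encoding: unsigned 16-bit little-endian offset.
        struct.pack_into("<H", self.buf, slot_pos, self.free_offset)
        self.free_offset += len(row)
        self.slot_count += 1
        return self.slot_count - 1

    def row_offset(self, slot: int) -> int:
        """Byte offset of the row for `slot`, relative to the page start."""
        if not 0 <= slot < self.slot_count:
            raise IndexError("no such slot")
        return struct.unpack_from("<H", self.buf, self._slot_pos(slot))[0]
```

In this model, adding a 5-byte row and then a 6-byte row to an empty page places them at offsets 96 and 101, while their slot array elements occupy the last two bytes of the page and the two bytes before that, respectively.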