Backup and restore huge partitioned database - SQL Server
This example fits the case when you have a big database in SQL Server where:
- one ore more table are partitioned
- one or more filegroup are in readonly, for example filegroups with old data
Suggestion: when you have a big database with historical data, it is a good idea to design it similar to this example.
In this case you can backup and restore filegroups separately. If you are doing full backups of this kind of database you are wasting time, space and power. You are wasting money.
We are going to create a simple database:
CREATE DATABASE [HistoryDemo]
ON PRIMARY
(NAME = N'HistoryDemo', FILENAME = N'S:\Data\HistoryDemo.mdf' , SIZE = 8192KB , FILEGROWTH = 65536KB )
LOG ON
(NAME = N'HistoryDemo_log', FILENAME = N'L:\Log\HistoryDemo_log.ldf' , SIZE = 8192KB , FILEGROWTH = 65536KB )
Now we add filegroups where partitioned data will relay:
ALTER DATABASE [HistoryDemo] ADD FILEGROUP FGVeryOldStuff;
ALTER DATABASE [HistoryDemo] ADD FILEGROUP FG_2020_Stuff;
ALTER DATABASE [HistoryDemo] ADD FILEGROUP FG_2021_Stuff;
ALTER DATABASE [HistoryDemo] ADD FILEGROUP FG_2022_Stuff;
ALTER DATABASE [HistoryDemo] ADD FILEGROUP FG_2023_Stuff;
This filegroups are still empty! We need to add datafiles. At least one for each filegroup. Let's do it:
ALTER DATABASE [HistoryDemo]
ADD FILE (NAME = N'OldStuff', FILENAME = N'S:\Data\OldStuff.ndf', SIZE = 512KB) TO FILEGROUP FGVeryOldStuff;
ALTER DATABASE [HistoryDemo]
ADD FILE (NAME = N'2020_Stuff', FILENAME = N'S:\Data\2020_Stuff.ndf', SIZE = 512KB) TO FILEGROUP FG_2020_Stuff;
ALTER DATABASE [HistoryDemo]
ADD FILE (NAME = N'2021_Stuff', FILENAME = N'S:\Data\2021_Stuff.ndf', SIZE = 512KB) TO FILEGROUP FG_2021_Stuff;
ALTER DATABASE [HistoryDemo]
ADD FILE (NAME = N'2022_Stuff', FILENAME = N'S:\Data\2022_Stuff.ndf', SIZE = 512KB) TO FILEGROUP FG_2022_Stuff;
ALTER DATABASE [HistoryDemo]
ADD FILE (NAME = N'2023_Stuff', FILENAME = N'S:\Data\2023_Stuff.ndf', SIZE = 512KB) TO FILEGROUP FG_2023_Stuff;
We need to put data in these files. So we need to prepare a partition function. See Official documentation.
USE [HistoryDemo];
CREATE PARTITION FUNCTION PFStuff (datetime)
AS RANGE RIGHT FOR VALUES ('20200101', '20210101', '20220101', '20230101');
Now we need to map this function to a partition scheme. Also see official documentation.
USE [HistoryDemo];
CREATE PARTITION SCHEME PSStuff
AS PARTITION PFStuff TO (FGVeryOldStuff, FG_2020_Stuff, FG_2021_Stuff, FG_2022_Stuff, FG_2023_Stuff);
You can see scheme and function in SSMS:
Now we create a simple table:
USE [HistoryDemo];
CREATE TABLE dbo.big_one
(
id int NOT NULL IDENTITY (1, 1),
event_time datetime NOT NULL,
some_text nchar(2000) NULL
)
ON PSStuff(event_time);
ALTER TABLE dbo.big_one
ADD CONSTRAINT PK_big_one PRIMARY KEY CLUSTERED (id, event_time);
This is not nice, loot at primary key. Not elegant, but not a problem for a demo. It is important to notice this line:
ON PSStuff(event_time);
It means that table will relay on partion scheme PSStuff.
Populate table big_one with some data, in order to have few data in every partition. This is not necessary, we just want to see our data restored!
INSERT INTO big_one (event_time, some_text) VALUES ('20180501 12:30', 'Good lunch with Emma');
INSERT INTO big_one (event_time, some_text) VALUES ('20180501 20:30', 'Good dinner with Emma');
INSERT INTO big_one (event_time, some_text) VALUES ('20200401 06:00', 'Epic prank to Dimitri');
INSERT INTO big_one (event_time, some_text) VALUES ('20200402 20:30', 'Dinner offered to Dimitri to apologize');
INSERT INTO big_one (event_time, some_text) VALUES ('20211231 23:50', 'This year was boring');
INSERT INTO big_one (event_time, some_text) VALUES ('20220915 16:00', 'Had a tea with boss');
INSERT INTO big_one (event_time, some_text) VALUES ('20230101 13:00', 'Lunch with family. Really???');
INSERT INTO big_one (event_time, some_text) VALUES ('20230701 00:11', 'Reading future');
We alse create a table on PRIMARY filegroup, just to have some data on in:
CREATE TABLE dbo.primary_table
(
id int PRIMARY KEY IDENTITY (1, 1),
some_text nvarchar(50) NULL
)
ON [PRIMARY];
And we populate it:
INSERT INTO primary_table (some_text) VALUES ('Some random text');
INSERT INTO primary_table (some_text) VALUES ('Another random text');
INSERT INTO primary_table (some_text) VALUES ('Klaatu barada nikto');
Now suppose we are in year 2023, which is true at the moment I wrote this page, and we know that data before 2023 will never be modified.
We can freeze that data by setting in READ_ONLY the correct filegroups:
USE master;
ALTER DATABASE HistoryDemo MODIFY FILEGROUP FGVeryOldStuff READ_ONLY;
ALTER DATABASE HistoryDemo MODIFY FILEGROUP FG_2020_Stuff READ_ONLY;
ALTER DATABASE HistoryDemo MODIFY FILEGROUP FG_2021_Stuff READ_ONLY;
ALTER DATABASE HistoryDemo MODIFY FILEGROUP FG_2022_Stuff READ_ONLY;
If an application tries to write on a frozen filegroup, like this query:
USE HistoryDemo;
INSERT INTO big_one (event_time, some_text) VALUES ('20180201 12:30', 'Lunch with coworker');
It gets an error:
Msg 652, Level 16, State 1, Line 2
The index "PK_big_one" for table "dbo.big_one" (RowsetId 72057594041597952) resides on a read-only filegroup ("FGVeryOldStuff"), which cannot be modified.
And this is good, because we want to keep our data consistent.
But anyone can still write on fresh data. Like this:
USE HistoryDemo;
INSERT INTO big_one (event_time, some_text) VALUES ('20230201 12:30', 'Lunch at "La quercia" reastaurant');
And we get no problem: (1 row affected).
Now the backup. First we make a backup for the readonly filegroups:
BACKUP DATABASE [HistoryDemo] FILEGROUP = 'FGVeryOldStuff' TO DISK = 'B:\BACKUP\HistoryDemo_FGVeryOldStuff.bck';
BACKUP DATABASE [HistoryDemo] FILEGROUP = 'FG_2020_Stuff' TO DISK = 'B:\BACKUP\HistoryDemo_FG_2020_Stuff.bck';
BACKUP DATABASE [HistoryDemo] FILEGROUP = 'FG_2021_Stuff' TO DISK = 'B:\BACKUP\HistoryDemo_FG_2021_Stuff.bck';
BACKUP DATABASE [HistoryDemo] FILEGROUP = 'FG_2022_Stuff' TO DISK = 'B:\BACKUP\HistoryDemo_FG_2022_Stuff.bck';
OK, now a full backup.
But do we need it really full? No, because we already have a backup of frozen filegroups!
The key, in the following script is READ_WRITE_FILEGROUPS.
USE master;
BACKUP DATABASE [HistoryDemo] READ_WRITE_FILEGROUPS
TO DISK = 'B:\BACKUP\HistoryDemo_partial.bck';
Look at the output:
Processed 432 pages for database 'HistoryDemo', file 'HistoryDemo' on file 1. Processed 24 pages for database 'HistoryDemo', file '2023_Stuff' on file 1. Processed 2 pages for database 'HistoryDemo', file 'HistoryDemo_log' on file 1. BACKUP DATABASE...FILE=<name> successfully processed 458 pages in 0.031 seconds (115.218 MB/sec).
It skipped all the readonly filegroups!
Now the restore. For this example we suppose, and tested, to go on a different server, just to simulate the loss of our source server. So we have different paths.
First we restore backup of writable filegroups:
RESTORE DATABASE [HistoryDemo]
FROM DISK = N'B:\BACKUP\HistoryDemo_partial.bck' WITH PARTIAL, RECOVERY,
MOVE N'HistoryDemo' TO N'K:\Data\HistoryDemo.mdf',
MOVE N'2023_Stuff' TO N'K:\Data\2023_Stuff.ndf',
MOVE N'HistoryDemo_log' TO N'L:\Log\HistoryDemo.ldf'
Then we restore the frozen filegroups:
RESTORE DATABASE [HistoryDemo] FILE = N'OldStuff' FROM DISK = N'B:\BACKUP\HistoryDemo_FGVeryOldStuff.bck'
WITH FILE = 1, MOVE N'OldStuff' TO N'K:\Data\OldStuff.ndf'
RESTORE DATABASE [HistoryDemo] FILE = N'2020_Stuff' FROM DISK = N'B:\BACKUP\HistoryDemo_FG_2020_Stuff.bck'
WITH FILE = 1, MOVE N'2020_Stuff' TO N'K:\Data\2020_Stuff.ndf'
RESTORE DATABASE [HistoryDemo] FILE = N'2021_Stuff' FROM DISK = N'B:\BACKUP\HistoryDemo_FG_2021_Stuff.bck'
WITH FILE = 1, MOVE N'2021_Stuff' TO N'K:\Data\2021_Stuff.ndf'
RESTORE DATABASE [HistoryDemo] FILE = N'2022_Stuff' FROM DISK = N'B:\BACKUP\HistoryDemo_FG_2022_Stuff.bck'
WITH FILE = 1, MOVE N'2022_Stuff' TO N'K:\Data\2022_Stuff.ndf'
And then we restore the 2023 stuff:
RESTORE DATABASE [HistoryDemo]
FILEGROUP = 'FG_2023_Stuff'
FROM DISK = N'B:\BACKUP\HistoryDemo_partial.bck'
WITH RECOVERY
GO
