Welcome back. As promised, here is the follow-up to the first installment, "Why are My Backups so Big?"
Switching views from the Macro to the Micro, here is a more specific example of how daily change in data may affect the size of the backups:
In one day, the following events take place:
- 2 GB of new email is sent / received by the company. This can include both external and internal email.
- User X takes 500 MB of email and moves it into a PST archive file.
- Online Maintenance / Defragmentation runs overnight on the Email Server, optimizing the storage used by the Email Server’s Database.
What the Email Server sees:
A net increase in the database size of about 1 GB.
What the backups see:
A delta of more than 2 GB in the data, because every change counts: new mail, mail that was moved out, and blocks rewritten by maintenance. This delta is then added to the backup set for the Email Server, where it is compressed and deduplicated, resulting in a net increase of about 1 GB to the size of the backup set. The retention policy is then applied to this copy of the data.
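To make the difference between the two views concrete, here is a minimal sketch of the day's arithmetic. The 2 GB and 500 MB figures come from the example above; the amount reclaimed by maintenance, the amount of rewritten data, and the reduction ratio are assumptions picked only so the totals land near the numbers in the text.

```python
# Figures from the example above.
NEW_MAIL_GB = 2.0
MOVED_TO_PST_GB = 0.5

# Assumptions for illustration only (not measured values).
RECLAIMED_BY_MAINTENANCE_GB = 0.5   # space freed by overnight maintenance
REWRITTEN_BLOCKS_GB = 0.2           # existing blocks touched by the move and defrag
REDUCTION_RATIO = 0.5               # stored size / raw size after compression + dedup


def server_net_growth_gb(new_gb, moved_gb, reclaimed_gb):
    """What the Email Server sees: additions minus everything removed or reclaimed."""
    return new_gb - moved_gb - reclaimed_gb


def backup_delta_gb(new_gb, rewritten_gb):
    """What the backups see: every changed block counts, including rewrites."""
    return new_gb + rewritten_gb


def stored_increase_gb(delta_gb, reduction_ratio):
    """Growth of the backup set once the delta is compressed and deduplicated."""
    return delta_gb * reduction_ratio


server_view = server_net_growth_gb(NEW_MAIL_GB, MOVED_TO_PST_GB, RECLAIMED_BY_MAINTENANCE_GB)
delta = backup_delta_gb(NEW_MAIL_GB, REWRITTEN_BLOCKS_GB)
stored = stored_increase_gb(delta, REDUCTION_RATIO)

print(f"Server net growth:   {server_view:.1f} GB")
print(f"Backup delta:        {delta:.1f} GB")
print(f"Backup set increase: {stored:.1f} GB")
```

The point of the sketch is that removing or reorganizing data shrinks the server's view but still counts as change from the backup's point of view.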
In short, if this had to be summarized into a mathematical formula, it would be the following:
(The Data I Want to Back Up + (How Much the Data Changes x How Long I Want to Keep the Data)) x Compression and Deduplication Ratio = Size of My Backups.
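As a rough illustration, the formula can be turned into a small estimator. This is one possible reading of it, not a vendor's sizing method: it assumes the change is a fixed daily amount that accumulates for the whole retention period, and that the compression and deduplication ratio is expressed as stored size divided by raw size. The function name and the sample figures are hypothetical.

```python
def estimate_backup_size_gb(protected_gb, daily_change_gb, retention_days, reduction_ratio):
    """Rough backup size estimate.

    protected_gb    -- the data I want to back up
    daily_change_gb -- how much the data changes per day (assumed constant)
    retention_days  -- how long I want to keep the data
    reduction_ratio -- stored size / raw size after compression and
                       deduplication (0.5 means the data is halved)
    """
    if not 0 < reduction_ratio <= 1:
        raise ValueError("reduction_ratio must be in the range (0, 1]")
    if min(protected_gb, daily_change_gb, retention_days) < 0:
        raise ValueError("sizes and retention must not be negative")

    raw_gb = protected_gb + daily_change_gb * retention_days
    return raw_gb * reduction_ratio


# Hypothetical example: 500 GB of mail data changing by 2 GB per day.
thirty_days = estimate_backup_size_gb(500, 2, 30, 0.5)
three_years = estimate_backup_size_gb(500, 2, 3 * 365, 0.5)

print(f"30-day retention: {thirty_days:.0f} GB")
print(f"3-year retention: {three_years:.0f} GB")
```

With the same data and the same daily change, the retention period alone accounts for most of the difference between the two results, which is why the retention decision matters so much.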
So what can you do to keep the size manageable?
- Stop storing (and definitely stop backing up) junk on your systems. There are several reasons why this is a good idea. Archiving older files and emails off of your production systems is a good way to manage this. Drafting and enforcing policies for your company on what can and can’t be stored on the company’s assets is another.
- Make a business decision on how long you really need to retain backup data. Remember, backups are kept in order to restore. Do you really need to be able to restore something from 3 years earlier? If you are a bank or a publicly traded company, the answer is probably yes. If not, then it’s your call.
Lastly, at some point, the size of your backups needs to be accepted as a cost of doing business. In this day and age, we are all in the digital data business to some degree. The ability to quickly and reliably recover from deleted data, hardware failure, fire, hurricane, or another disruptive event will likely determine whether the business survives that event. This is the true purpose and focus of any backup solution, and it should frame the conversation when discussing the size of the data.
