Bacula Offsite Replication


Hi everyone,

I have updated my backup server to CentOS 8.2. It runs Bacula, performing backups to disk. I would like to replicate the backups to another, offsite machine.

I have read about the ability to configure a new storage daemon at the offsite location and create Migration/Copy jobs. If I am not mistaken, this replicates only the volumes, not the catalog. I will try this.
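If I understand correctly, the director side would be something like this sketch in bacula-dir.conf (the pool, storage, client and schedule names here are placeholders I made up, not a tested configuration):

    # bacula-dir.conf (sketch, placeholder names)
    Pool {
      Name = OnsiteDiskPool
      Pool Type = Backup
      Storage = OnsiteSD
      Next Pool = OffsitePool            # Copy jobs write into this pool
    }

    Pool {
      Name = OffsitePool
      Pool Type = Backup
      Storage = OffsiteSD                # storage daemon at the remote site
    }

    Job {
      Name = "CopyToOffsite"
      Type = Copy
      Client = backupserver-fd           # required by syntax, not used by Copy
      FileSet = "Full Set"               # required by syntax, not used by Copy
      Selection Type = PoolUncopiedJobs  # copy every job not yet copied
      Pool = OnsiteDiskPool              # source pool; its Next Pool is the target
      Messages = Standard
      Schedule = "NightlyCopy"
    }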

Another way to replicate the volumes to another server would be rsync.

What would you suggest?

Thank you in advance.

Alessandro.

15 thoughts on - Bacula Offsite Replication

  • I’ve used rsync (though probably not at the scale you’re describing); it works and has enough features to meet most needs. I have run into one situation where corruption occurred during transfer (it has happened a few times, and I have no idea why), so you might want to independently confirm the integrity of the transfer.

  • Hi Leroy,

    How can I confirm that no corruption occurred during the rsync transfer?

    Thank you in advance.

    On 01/07/20 16:04, Leroy Tennison wrote:

  • What I did was use cksum to create a checksum of the source file and put it in a separate file, transmit that via rsync as well, and compare it to a cksum computed on the remote end. There are far more accurate alternatives to cksum, but I felt cksum was good enough for a basic check. Like most things in the UNIX world, there are probably other ways to do this as well.

    Interestingly enough, after I sent my previous response I discovered that I had yet another instance of the problem.
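    Roughly, the check I described looks like this (the file name, host and directory below are placeholders, and this is a sketch rather than the exact script I run):

        #!/bin/bash
        # Sketch: ship a file with rsync and compare cksum output on both ends.
        set -euo pipefail

        FILE=backup.tgz                    # placeholder file name
        DEST=offsite.example.com           # placeholder destination host
        DESTDIR=/srv/backup-copies         # placeholder destination directory

        # Checksum the source, ship the file plus its checksum, recompute remotely.
        cksum "$FILE" | awk '{print $1, $2}' > "$FILE.cksum"
        rsync -av "$FILE" "$FILE.cksum" "$DEST:$DESTDIR/"

        REMOTE=$(ssh "$DEST" "cd $DESTDIR && cksum $FILE" | awk '{print $1, $2}')
        LOCAL=$(cat "$FILE.cksum")

        if [ "$REMOTE" = "$LOCAL" ]; then
            echo "OK: checksums match"
        else
            echo "ERROR: checksum mismatch for $FILE" >&2
            exit 1
        fi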

  • I realize this shouldn’t happen; the file is a tgz and isn’t being modified while being transmitted. This has happened maybe three times this year, and unfortunately I’ve just had to deal with it rather than invest the time to research it.

  • On 01.07.20 at 17:13, Leroy Tennison wrote:

    Maybe a “RunAfterJob” configuration would help to serialize it and prevent this race condition?
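    For example, something along these lines in the Job resource (the script path and job name are placeholders, a sketch of the idea rather than a tested configuration):

        # bacula-dir.conf (sketch): start the offsite copy only after the
        # backup job has finished writing, so a volume is never copied mid-write.
        Job {
          Name = "BackupServer"
          Type = Backup
          # ... Client, FileSet, Pool, Schedule as usual ...
          RunScript {
            RunsWhen = After
            RunsOnClient = no                               # run on the Director host
            Command = "/usr/local/sbin/push-offsite.sh %i"  # %i expands to the JobId
          }
        }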

  • On 01/07/20 17:13, Leroy Tennison wrote:

    Hi Leroy,

    I think that in my case I could not use a tgz archive. I’m talking about full backups that reach 600-700 GiB; compressing them and then rsyncing them could take so much time that it would be useless.

  • I set up DRBD to replicate a ~50 TB BackupPC hive to the DR copy, an identical box in a different DC on the same campus, at approximately GigE speeds, and ran this for a year or two. It worked well enough but required babysitting from time to time. Both nodes were mdraid LVM logical volumes formatted as a single huge XFS on CentOS. I never automated the failover, as it never failed, and for a dev/test backup an 8-hour response time seemed adequate.

  • Hi John,

    thank you for your answer. I had already been considering DRBD, but I need to do some testing before starting.

    From your message it sounds as though you no longer use that solution. What do you use for this now?

    Thank you in advance.

    On 02/07/20 10:43, John Pierce wrote:

  • Unless you use tape (of that high capacity), it is advantageous to restrict the volume size to, say, 50 GB. Then when you restore, searching for specific files will be faster, and it will help your backup volume transfers as well.
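    In bacula that is just a Pool directive, something like this sketch (the pool name is a placeholder):

        # bacula-dir.conf (sketch): cap disk volumes at ~50 GB so restores and
        # offsite transfers deal with many smaller volumes instead of one huge one.
        Pool {
          Name = OnsiteDiskPool            # placeholder name
          Pool Type = Backup
          Maximum Volume Bytes = 50G       # start a new volume at ~50 GB
          Label Format = "Vol-"            # auto-label the new volumes
        }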

    Valeri



  • On 02/07/20 15:02, Valeri Galtsev wrote:

    Hi Valeri,

    thank you for your suggestion.

    Is bacula the right backup system when I need to replicate data offsite?
    Are there other backup solutions that simplify this process?

    Thank you in advance

  • Depending on the definition of offsite, you have a fundamental trade-off: do you invest the time and effort in compressing, or use the extra bandwidth; which is less costly? Hopefully a delta transfer makes sense in your situation; it should save far more than compression would once the original copy is offsite.
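    For what it’s worth, a delta transfer of a volume directory with rsync could look like this sketch (the paths, host and bandwidth cap are placeholders):

        #!/bin/bash
        # Sketch: delta-transfer backup volume files offsite with rsync.
        set -euo pipefail

        SRC=/var/lib/bacula/volumes/                    # placeholder local volume directory
        DEST=offsite.example.com:/srv/bacula-volumes/   # placeholder remote target

        # -a         archive mode (permissions, times, and so on)
        # -z         compress on the wire only; nothing is stored compressed
        # --inplace  update existing volume files block by block, so only the
        #            changed parts cross the network
        # --bwlimit  optional cap (KB/s) so the transfer does not saturate the uplink
        rsync -az --inplace --bwlimit=20000 "$SRC" "$DEST"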

  • Bacula is a great enterprise-level open source backup system. I switched to its fork Bareos at some point; I have used bacula/bareos for at least a decade. And with this extra requirement of yours I would still stay with bareos (or bacula).

    If I were to have two sets of backups, on site and off site, I would just set up a separate bacula/bareos director and storage daemon(s) off site. Add to the FDs (file daemons) an extra Director resource for the offsite director, with different passwords for the sake of security. Then there will be a complete set of everything off site, not only a set of volumes. Of course, if you only have a set of volumes and everything else has evaporated, you will still be able to restore everything, including the database records, by scanning the set of volumes, but that will take forever. I would alternate the dates of the offsite and onsite schedules, or define the backup times so that they do not overlap.
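    On the client side that amounts to two Director resources in bacula-fd.conf, along these lines (the names and passwords below are made-up placeholders):

        # bacula-fd.conf (sketch): the same file daemon answers to both the
        # onsite and the offsite director, each with its own password.
        Director {
          Name = onsite-dir                 # placeholder name
          Password = "onsite-secret"        # placeholder password
        }

        Director {
          Name = offsite-dir                # placeholder name
          Password = "offsite-secret"       # a different password for the offsite set
        }

        FileDaemon {
          Name = client1-fd
          FDport = 9102
          WorkingDirectory = /var/lib/bacula
          Pid Directory = /var/run
        }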

    Another piece of good news about this, versus just rsyncing volumes, is that bacula/bareos verifies the checksum of every backed-up file after receiving it. This ensures the consistency of the files in the remote volumes; with rsync you would have to at least verify the checksum of each volume transferred to the destination (unless I am missing something and rsync does verify checksums of transferred files; I just re-read the rsync man page and do not see verification, so hopefully an rsync expert will chime in and correct me if I am wrong about rsync).
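    One way to do that verification after the fact, whatever rsync does internally, is a checksum-only dry run; the paths and host below are placeholders:

        #!/bin/bash
        # Sketch: re-compare already-transferred volume files by checksum only.
        # -r recurse, -c compare by checksum rather than size/mtime,
        # -n dry run (compare only, copy nothing), -i itemize any differences.
        DIFFS=$(rsync -rcni /var/lib/bacula/volumes/ \
                offsite.example.com:/srv/bacula-volumes/) || exit 1

        if [ -z "$DIFFS" ]; then
            echo "OK: remote copies match by checksum"
        else
            echo "WARNING: these files differ:" >&2
            echo "$DIFFS" >&2
        fi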

    Anyway, that is what I would do.

    Valeri



  • I use the open source community edition. I work for a department of the university; they do not have much money, so pretty much everything I set up for them is free, as in free beer. I have to invest my time and knowledge, and the help of others, instead of paid support, and this way I earn my salary ;-) It suits both myself and my employers.

    Valeri

  • On 02/07/20 16:39, Valeri Galtsev wrote:

    I’m late to the thread, but thank you for your suggestion.