[Ubuntu 16.04] Node is corrupted

Thread solved
Beginner
Posts: 1
Comments: 6

Since the second backup of an Ubuntu 16.04 VM with 2 disks (one 50 GB, one 250 GB, both XFS), using incremental backups, we receive a message saying "Node is corrupted". What does this mean? I have no reason to believe that my disk is corrupted; I guess one would notice if that were true. Since it is a production machine, I cannot test with fsck, because that would require unmounting the disk. Is there a method to check why this is happening? The problems are not there with other VMs.

2016-10-31 19:09:28:858 140454809007872 I0007002E: Error 0x7002e: Forced sector-by-sector mode.
| trace level: information
| channel: tol-activity#4C16242E-50CA-4DEE-93B9-1CF0E99ACE00
| line: 0xa5695862aaf8e7dc
| file: k:/3539/resizer/backup/backup.cpp:1151
| function: BackupPartitions
| volume: data-250gb
| fstype: XFS
| $module: disk_bundle_lxa64_3539
|
| error 0x7001d: Node is corrupted.
| line: 0xa277b1a098f4d7c4
| file: k:/3539/resizer/xfs/fs_xfs.cpp:155
| function: ScanSpaceBTree
| $module: disk_bundle_lxa64_3539

Beginner
Posts: 1
Comments: 6

#1

I was incorrect in saying that "The problems are not there with other VMs." I have another Ubuntu VM that is suffering from exactly the same message.

Beginner
Posts: 1
Comments: 6

#2

On the other machine, though, the error occurs on a different line of fs_xfs.cpp.

2016-11-02 10:36:19:197 140270599378688 I0007002E: Error 0x7002e: Forced sector-by-sector mode.
| trace level: information
| channel: tol-activity#D15B6876-8C88-4B92-B912-AE9AA87F6C16
| line: 0xa5695862aaf8e7dc
| file: k:/3539/resizer/backup/backup.cpp:1151
| function: BackupPartitions
| volume: genkgo--web1--ceph--vg-root
| fstype: XFS
| $module: disk_bundle_lxa64_3539
|
| error 0x7001d: Node is corrupted.
| line: 0xa277b1a098f4d7ba
| file: k:/3539/resizer/xfs/fs_xfs.cpp:145
| function: ScanSpaceBTree
| $module: disk_bundle_lxa64_3539

Acronis Program Manager
Posts: 22
Comments: 3206

#3

Hi Frederick,

Such problems may appear if VMware Tools are not installed inside the VM (assuming that you run an agent-less backup of VMware VMs). In that case no file system quiescing is performed, so the file systems may appear in an inconsistent state within the snapshot, which causes the backup to fail over to sector-by-sector mode. If this is not the case, then please clarify the following:

1) What is the virtualization platform?

2) Are you backing up the machine using Agent for Linux installed inside it, or is the backup performed in agent-less mode (i.e. what exactly do you select as the source in the devices list in the web console; a screenshot would be really helpful)?

3) Please provide the full log from the operation: go to the Activities tab on the backed-up machine and press "Collect System Information". The log will be under the \ServiceProcess\ folder inside the downloaded .zip package.

Thank you.

Beginner
Posts: 1
Comments: 6

#4

Hi Vasily,

Thanks for your quick reply. What do you mean by VMware Tools? I installed Backup_Agent_for_Linux_x86_64.bin on my Ubuntu 16.04 machine, so I think I am not running in agent-less mode.

1) The virtualization platform is CEPH/Proxmox.
2) I back up the entire machine.
3) Where can I post the system information zip? I do not want to make that publicly available.

Regards,
Frederik

Acronis Program Manager
Posts: 22
Comments: 3206

#5

Hi Frederick,

Thank you for the clarifications. The fact that you're using the agent installed inside the VM is the most important one, since the backup flows are quite different in agent-less and agent-based modes (my suggestion about VMware Tools applies to agent-less mode only). In the case of agent-based backup, as in yours, sector-by-sector backup can be triggered only if real corruption is detected on the file system of the backed-up machine, so checking it with fsck should be the first thing to do for the affected logical volumes (note the volume name in the error message, which shows where we detected problems):

volume: genkgo--web1--ceph--vg-root

If it doesn't help then please contact our support team with the collected system information outputs and the outputs of the "fsck" run results.
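
For reference, the check could look roughly like the sketch below. This is only an illustration: the device path is taken from the error message above, fsck.xfs is deliberately a no-op on XFS, so the real check is `xfs_repair -n` (no-modify) against the unmounted device, and the commands are printed rather than executed unless you set DRY_RUN=0.

```shell
#!/bin/sh
# Sketch: read-only XFS check of the volume named in the error log.
# By default this only PRINTS the commands; set DRY_RUN=0 to run them.
DRY_RUN=${DRY_RUN:-1}
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "would run: $*"; else "$@"; fi
}

DEV=/dev/mapper/genkgo--web1--ceph--vg-root   # from the error message

run umount "$DEV"          # only if the volume can be taken offline
run xfs_repair -n "$DEV"   # -n: report inconsistencies, change nothing
```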

P.S. The fact that your other similar VM has the same symptoms may indicate corruption in the initial setup, for example if both VMs were deployed from the same template.

Thank you.

Beginner
Posts: 1
Comments: 6

#6

Well, there seems to be no problem with our disks (as I expected). I just ran xfs_repair without any problems.

genkgo@genkgo-services1-ceph:/$ sudo xfs_repair -n /dev/mapper/data-250gb
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.

Beginner
Posts: 1
Comments: 6

#7

That is the disk the backup is complaining about, as I stated when opening this thread.

Acronis Program Manager
Posts: 22
Comments: 3206

#8

Frederik,

In this case, as I mentioned in my previous reply, you should contact our support team for further assistance with the investigation. I've already confirmed with our QA team that we've run backup tests on Ubuntu with XFS on LVM with Acronis Backup 12, so in your case there must be some additional specifics in the setup that need to be discovered (for example, specifics in how the LVM volumes were originally created or formatted).

Thank you.


Beginner
Posts: 1
Comments: 6

#9

Vasily, I will do that, but I have one question left first. Does it matter that the contents of the disk are changing extremely fast? We are running an Elasticsearch service on that machine, and the disk the backup complains about holds the Elasticsearch data. That program constantly adds and removes files on the disk. Could that be the cause?

Acronis Program Manager
Posts: 22
Comments: 3206

#10

Hi Frederik,

Yes, intensive I/O on the system may indeed affect the backup, since the snapshot storage may grow too fast and overfill the drive. To avoid such situations we use a cycled snapshot-storage clean-up: data is read from sectors pending change with higher priority than from "cold" unchanged sectors. Even with this technique, however, there can be cases where the snapshot storage grows too fast. As an alternative to the Acronis snapshot technology you can switch to the LVM-snapshotting approach, where Acronis Backup uses native LVM snapshots to read data from. This would probably make sense to try in your case too.
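
Roughly, the native LVM snapshot mechanism that this mode relies on looks like the sketch below. The volume group and logical volume names (vg0/data), the 5G snapshot size, and the dd-based copy are hypothetical illustrations of the general technique, not what the product literally runs; by default the commands are only printed.

```shell
#!/bin/sh
# Sketch of a native LVM snapshot cycle (hypothetical names and sizes).
# By default this only PRINTS the commands; set DRY_RUN=0 to run them.
DRY_RUN=${DRY_RUN:-1}
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "would run: $*"; else "$@"; fi
}

# 1) Copy-on-write snapshot: reads from it see a frozen view of the
#    volume even while Elasticsearch keeps writing to the origin.
run lvcreate --snapshot --size 5G --name backup-snap /dev/vg0/data

# 2) Read the backup from the snapshot device, not the live volume.
run dd if=/dev/vg0/backup-snap of=/backup/data.img bs=4M

# 3) Remove the snapshot so its copy-on-write space is released.
run lvremove -f /dev/vg0/backup-snap
```

The point of the snapshot is that backup reads no longer race with application writes, at the cost of reserving enough copy-on-write space for the changes made while the backup runs.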

Thank you.

Beginner
Posts: 0
Comments: 3

#11

Hello, I have exactly the same problem with a physical Ubuntu server machine.

The first full backup is OK, but then the "Node is corrupted" error appears, even after recreating the plan and running a new full backup.

I've opened a support ticket and I am planning an fsck, but the machine is in production, so it is not easy to do.

Have you solved it, and could you post some information on how to fix it?

Bye

Support specialist
Posts: 0
Comments: 1400

#12

Hello Giorgio,

Welcome to Acronis forums!

It is not a bug in Acronis Cyber Backup 12.5, but rather a bug in the Linux kernel's XFS driver.
It was fixed in the kernel by the patches https://lore.kernel.org/patchwork/patch/650577/ and https://www.spinics.net/lists/linux-xfs/msg16486.html

Basically, because of the kernel bug, XFS sporadically enters a slightly inconsistent state for a limited amount of time. Even then, the kernel itself does not consider it inconsistent, because of the same bug.
If a backup starts during this window, our product sees that XFS is inconsistent (because it strictly follows the official XFS on-disk layout documentation) and enables sector-by-sector mode.

We recommend that you update the Ubuntu kernel.
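
As a rough way to check whether a given machine is likely affected, you can compare the running kernel against a minimum version with `sort -V`. Note that the "4.8" threshold below is an assumption about when the referenced patches landed upstream; verify it against your distribution's kernel changelog before relying on it.

```shell
#!/bin/sh
# Sketch: compare the running kernel against a minimum version using
# sort -V. The 4.8 threshold is an ASSUMPTION about when the XFS fix
# landed upstream; check your distro's kernel changelog to be sure.

kernel_at_least() {
  # Succeeds if version $1 is greater than or equal to version $2.
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n1)" = "$1" ]
}

current="$(uname -r | cut -d- -f1)"   # e.g. "4.4.0" on stock Ubuntu 16.04
if kernel_at_least "$current" "4.8"; then
  echo "kernel $current: likely includes the XFS fix"
else
  echo "kernel $current: may still carry the XFS bug; consider upgrading"
fi
```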

Beginner
Posts: 0
Comments: 3

#13

Thank you for your answer, I will try it and let you know if it solves the problem!
It will take some time, since the Ubuntu machine is in production and we can't reboot it now.
I think I will test it in about 3 weeks.


Beginner
Posts: 0
Comments: 3

#14

Hello, thank you very much: we have upgraded from Ubuntu 16.04 to Ubuntu 18.04 and it solved all backup issues.
It even fixed another Acronis issue: we store backups in a NAS folder, and Acronis wasn't able to connect to it as a backup location in the Acronis Console when browsing the network folders from the Ubuntu machine. After the upgrade it works smoothly.


Support specialist
Posts: 0
Comments: 1400

#15

Hello Giorgio,

Thank you for sharing the outcome here!