
[Ubuntu 16.04] Node is corrupted

Thread needs solution
Beginner
Posts: 1
Comments: 6

Since the second backup of an Ubuntu 16.04 VM with 2 disks (1 disk 50GB, 1 disk 250GB, both XFS) - using incremental - we receive a message saying "Node is corrupted". What does this mean? I have no reason to believe that my disk is corrupted; I guess one would notice if that were true. Since it is a production machine, I cannot test with fsck, because that would require unmounting the disk. Is there a way to find out why this is happening? The problem does not occur with our other VMs.

2016-10-31 19:09:28:858 140454809007872 I0007002E: Error 0x7002e: Forced sector-by-sector mode.
| trace level: information
| channel: tol-activity#4C16242E-50CA-4DEE-93B9-1CF0E99ACE00
| line: 0xa5695862aaf8e7dc
| file: k:/3539/resizer/backup/backup.cpp:1151
| function: BackupPartitions
| volume: data-250gb
| fstype: XFS
| $module: disk_bundle_lxa64_3539
|
| error 0x7001d: Node is corrupted.
| line: 0xa277b1a098f4d7c4
| file: k:/3539/resizer/xfs/fs_xfs.cpp:155
| function: ScanSpaceBTree
| $module: disk_bundle_lxa64_3539

Beginner
Posts: 1
Comments: 6

I was incorrect in saying that "the problem does not occur with our other VMs": I have another Ubuntu VM which is suffering from exactly the same error.

Beginner
Posts: 1
Comments: 6

On the other machine, though, the error occurs on a different line of fs_xfs.cpp.

2016-11-02 10:36:19:197 140270599378688 I0007002E: Error 0x7002e: Forced sector-by-sector mode.
| trace level: information
| channel: tol-activity#D15B6876-8C88-4B92-B912-AE9AA87F6C16
| line: 0xa5695862aaf8e7dc
| file: k:/3539/resizer/backup/backup.cpp:1151
| function: BackupPartitions
| volume: genkgo--web1--ceph--vg-root
| fstype: XFS
| $module: disk_bundle_lxa64_3539
|
| error 0x7001d: Node is corrupted.
| line: 0xa277b1a098f4d7ba
| file: k:/3539/resizer/xfs/fs_xfs.cpp:145
| function: ScanSpaceBTree
| $module: disk_bundle_lxa64_3539

Acronis Program Manager
Posts: 22
Comments: 3115

Hi Frederik,

Such problems may appear if there are no VMware Tools installed inside the VM (assuming that you run an agent-less backup of VMware VMs): in that case no file system quiescing is performed, so the file systems may appear in an inconsistent state within the snapshot, which causes the backup to fail over to sector-by-sector mode (a quick check for the tools is sketched after the questions below). If this is not the case, then please clarify the following:

1) What is the virtualization platform?

2) Are you backing up the machine using Agent for Linux installed inside it, or is the backup performed in agent-less mode (i.e. what exactly do you select as the source in the devices list in the web console; a screenshot would be really helpful)?

3) Please provide the full log from the operation: go to the Activities tab on the backed-up machine and press "Collect System Information". The log will be under the \ServiceProcess\ folder inside the downloaded .zip package.
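
As an aside, here is a quick way to verify whether VMware Tools are present inside a Linux guest (a minimal sketch, assuming the open-vm-tools package on a systemd-based distribution such as Ubuntu 16.04):

systemctl status open-vm-tools   # shows whether the tools service is installed and running
vmware-toolbox-cmd -v            # prints the tools version when the tools are present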

Thank you.

Beginner
Posts: 1
Comments: 6

Hi Vasily,

Thanks for your quick reply. What do you mean by VMware Tools? I installed Backup_Agent_for_Linux_x86_64.bin on my Ubuntu 16.04 machine, so I think I am not running in agent-less mode.

1) The virtualization platform is CEPH/Proxmox.
2) I back up the entire machine.
3) Where can I post the system information zip? I do not want to make that publicly available.

Regards,
Frederik

Acronis Program Manager
Posts: 22
Comments: 3115

Hi Frederik,

Thank you for the clarifications. The fact that you're using the agent installed inside the VM is the most important one, since the backup flows are quite different in agent-less and agent-based modes (my suggestion about VMware Tools applies to agent-less mode only). In agent-based backup (as in your case), sector-by-sector mode can be triggered only when real corruption is detected on the file system of the backed-up machine, so checking it with "fsck" should be the first thing to do for the affected logical volumes (note the volume name in the error message, which shows where we detected problems):

volume: genkgo--web1--ceph--vg-root

If that doesn't help, please contact our support team with the collected system information and the output of the "fsck" run.
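
For an XFS volume, a non-destructive check could look like the line below (a sketch only: xfs_repair refuses to run on a mounted filesystem, and since this particular volume holds the root filesystem, the check would have to be done from rescue media or a live environment):

sudo xfs_repair -n /dev/mapper/genkgo--web1--ceph--vg-root   # -n: no-modify mode, report problems only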

P.S. The fact that your other similar VM has the same symptoms may indicate corruption in the initial setup, for example if both VMs were deployed from the same template.

Thank you.

Beginner
Posts: 1
Comments: 6

Well, there seems to be no problem with our disks, as I expected: I just ran xfs_repair and it found no issues.

genkgo@genkgo-services1-ceph:/$ sudo xfs_repair -n /dev/mapper/data-250gb
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.
 

Beginner
Posts: 1
Comments: 6

That is the disk the backup is complaining about, as I stated when opening this thread.

Acronis Program Manager
Posts: 22
Comments: 3115

Frederik,

In this case, as I mentioned in my previous reply, you should contact our support team for further assistance with the investigation. I've already confirmed with our QA team that we have run backup tests of Ubuntu with XFS on LVM with Acronis Backup 12, so in your case there must be some additional specific of the setup that needs to be discovered (for example, how the logical volumes were originally created and formatted).
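
When gathering those setup details for support, a few read-only inspection commands can help (a sketch; the mount point of the data volume below is hypothetical):

sudo pvs && sudo vgs && sudo lvs -a -o +devices   # LVM layout: physical volumes, volume groups, logical volumes
sudo xfs_info /                                   # XFS geometry (mkfs parameters) of the mounted root filesystem
sudo xfs_info /mnt/data                           # the same for the mounted 250 GB data volume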

Thank you.

 

Beginner
Posts: 1
Comments: 6

Vasily, I will do that, but first I have one question left. Does it matter that the contents of the disk change extremely fast? We are running an Elasticsearch service on those machines, and the disk the backup complains about holds the Elasticsearch data. That program constantly adds and removes files on the disk. Could that be the cause?

Acronis Program Manager
Posts: 22
Comments: 3115

Hi Frederik,

Yes, intensive I/O may indeed affect the backup, since the snapshot storage can grow too fast and overfill the drive. To avoid such situations we use cycled snapshot-storage clean-up: data from sectors that are pending changes is read with higher priority than "cold", unchanged sectors. Even with this technique, however, there are cases where the snapshot storage grows too fast. As an alternative to the Acronis snapshot technology you can switch to the LVM snapshotting approach, where Acronis Backup reads data from native LVM snapshots. This would probably make sense to try in your case too.
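
For illustration, this is roughly what a native LVM snapshot provides: a frozen, point-in-time copy of a busy volume that can be read while the origin stays online. A minimal sketch with hypothetical names (volume group "vg0", volume "data", 10 GB reserved for changes); the snapshot must be removed afterwards, otherwise it fills up and is invalidated:

sudo lvcreate --snapshot --size 10G --name data-snap /dev/vg0/data   # freeze a point-in-time copy
sudo mount -o ro,nouuid /dev/vg0/data-snap /mnt/snap                 # XFS requires nouuid to mount a clone
# ... read or verify the snapshot contents here ...
sudo umount /mnt/snap
sudo lvremove -f /dev/vg0/data-snap                                  # release the snapshot space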

Thank you.