PLEASE BE AWARE THAT ANY INFORMATION YOU MAY FIND HERE MAY BE INACCURATE, AND COULD INCLUDE TECHNICAL INACCURACIES, TYPOGRAPHICAL ERRORS, AND EVEN SPELLING ERRORS.

 From the MANUAL page:
 The zdb command is used by  support  engineers  to  diagnose
 failures and gather statistics. Since the ZFS file system is
 always consistent on disk and is self-repairing, zdb  should
 only be run under the direction by a support engineer.

DO NOT TRY IT IN PRODUCTION. USE AT YOUR OWN RISK!



Today, we will see whether what we did with the ext3 filesystem can be done with ZFS. We start by creating a brand new filesystem and putting our file into it…

# mkfile 100m /var/fakedevices/disk0
# zpool create cow /var/fakedevices/disk0
# zfs create cow/fs01
# cp -pRf /root/bash_completion /cow/fs01/
# ls -i /cow/fs01/
4 bash_completion

Ok, now we can start to play…

# zdb -dddddd cow/fs01 4
... snipped...
        path    /bash_completion
        atime   Sun Sep 21 12:03:56 2008
        mtime   Sun Sep 21 12:03:56 2008
        ctime   Sun Sep 21 12:10:39 2008
        crtime  Sun Sep 21 12:10:39 2008
        gen     16
        mode    100644
        size    216071
        parent  3
        links   1
        xattr   0
        rdev    0x0000000000000000
Indirect blocks:
               0 L1  0:a000:400 0:120a000:400 4000L/400P F=2 B=16
               0  L0 0:40000:20000 20000L/20000P F=1 B=16
           20000  L0 0:60000:20000 20000L/20000P F=1 B=16

                segment [0000000000000000, 0000000001000000) size   16M

So, now we have the DVAs for the two data blocks (0:40000 and 0:60000). Let’s get our data, export the pool, and try to put a modified copy of the data back in place. For that, we just need the first block…

# zdb -R cow:0:40000:20000:r 2> /tmp/file-part1
# head /tmp/file-part1

#   bash_completion - programmable completion functions for bash 3.x
#                     (backwards compatible with bash 2.05b)
#
#   $Id: bash_completion,v 1.872 2006/03/01 16:20:18 ianmacd Exp $
#
#   Copyright (C) Ian Macdonald 
#
#   This program is free software; you can redistribute it and/or modify
#   it under the terms of the GNU General Public License as published by
#   the Free Software Foundation; either version 2, or (at your option)

Let’s change the file’s content and put it back. But this time let’s make a change that is harder to catch (just one byte).

# vi /tmp/file-part1
change
#   bash_completion - programmable completion functions for bash 3.x
to
#   bash_completion - programmable completion functions for bash 1.x
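
The dd commands in the next step cut the disk image at 512-byte sectors 8704 and 8960. Those numbers come from the DVA: as far as I know, DVA offsets are counted from the end of the 4 MB (0x400000) reserved at the start of every vdev for the two front labels and the boot block, so that amount has to be added before converting to sectors. A small sketch of the arithmetic (the variable names are only for illustration):

```shell
#!/bin/sh
# Assumption: 4 MB at the front of the vdev (labels + boot block)
# precede the area that DVA offsets refer to.
LABEL_BYTES=$((0x400000))
DVA_OFFSET=$((0x40000))    # from the DVA 0:40000:20000
DVA_SIZE=$((0x20000))
SECTOR=512

START=$(( (LABEL_BYTES + DVA_OFFSET) / SECTOR ))  # first sector of the block
COUNT=$(( DVA_SIZE / SECTOR ))                    # sectors in the block
END=$(( START + COUNT ))                          # first sector after it

echo "start=$START count=$COUNT end=$END"
# prints: start=8704 count=256 end=8960
```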

# zpool export cow
# dd if=/var/fakedevices/disk0 of=/tmp/fs01-part1 bs=512 count=8704
# dd if=/var/fakedevices/disk0 of=/tmp/file-part1.orig bs=512 skip=8704 count=256
# dd if=/var/fakedevices/disk0 of=/tmp/fs01-part2 bs=512 skip=8960
# dd if=/tmp/file-part1 of=/tmp/payload bs=131072 count=1
# cp -pRf /tmp/fs01-part1 /var/fakedevices/disk0
# cat /tmp/payload >> /var/fakedevices/disk0
# cat /tmp/fs01-part2 >> /var/fakedevices/disk0
# zpool import -d /var/fakedevices/ cow
# zpool status
  pool: cow
 state: ONLINE
 scrub: none requested
config:

        NAME                       STATE     READ WRITE CKSUM
        cow                        ONLINE       0     0     0
          /var/fakedevices//disk0  ONLINE       0     0     0

errors: No known data errors
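
The split-and-concatenate approach above can probably be replaced by a single dd that overwrites the block in place, using seek= and conv=notrunc. The sketch below shows the idea on a small scratch file, not on a real pool; the file names and sizes are made up for the demonstration. For the pool in this post the equivalent would presumably be seek=8704 count=256 against /var/fakedevices/disk0, but that was not tried here.

```shell
#!/bin/sh
# Demonstration on a scratch file: both methods give the same image.
work=$(mktemp -d)
img="$work/disk.img"
payload="$work/payload"

# 16 sectors of zeros stand in for the disk image.
dd if=/dev/zero of="$img" bs=512 count=16 2>/dev/null
# 2 sectors filled with 'A' stand in for the edited block.
dd if=/dev/zero bs=512 count=2 2>/dev/null | tr '\0' 'A' > "$payload"

# Method 1: split the image around sectors 4-5 and concatenate.
dd if="$img" of="$work/part1" bs=512 count=4 2>/dev/null
dd if="$img" of="$work/part2" bs=512 skip=6 2>/dev/null
cat "$work/part1" "$payload" "$work/part2" > "$work/spliced.img"

# Method 2: overwrite sectors 4-5 in place without truncating the file.
cp "$img" "$work/patched.img"
dd if="$payload" of="$work/patched.img" bs=512 seek=4 count=2 \
   conv=notrunc 2>/dev/null

if cmp -s "$work/spliced.img" "$work/patched.img"; then
    result=identical
else
    result=different
fi
size=$(wc -c < "$work/patched.img" | tr -d ' ')

echo "$result $size"
# prints: identical 8192
rm -rf "$work"
```

Without conv=notrunc, dd would truncate the output file right after the written block, which would destroy everything behind it.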

Ok, everything seems to be fine. So, let’s get our data…

# head /cow/fs01/bash_completion
#

Nothing? But our file is there…

# ls -l /cow/fs01
total 517
-rw-r--r--   1 root     root      216071 Sep 21 12:03 bash_completion
# ls -l /root/bash_completion
-rw-r--r--   1 root     root      216071 Sep 21 12:03 /root/bash_completion

Let’s run zpool status again…

# zpool status
  pool: cow
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: https://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested
config:

        NAME                       STATE     READ WRITE CKSUM
        cow                        ONLINE       0     0     2
          /var/fakedevices//disk0  ONLINE       0     0     2

errors: 1 data errors, use '-v' for a list

Oh, when we tried to access the file, ZFS noticed that the data block no longer matches the checksum stored in its block pointer. That’s why it is important to schedule scrubs: a scrub traverses the entire pool looking for errors like that. In this example I used a pool with just one disk; in a real situation, don’t do that! If we had a mirror, for example, ZFS would repair the block from the “good” copy (provided the bad guy did not mess with that one too). What can zdb show us?

# zdb -c cow

Traversing all blocks to verify checksums and verify nothing leaked ...
zdb_blkptr_cb: Got error 50 reading <21, 4, 0, 0>  -- skipping

Error counts:

        errno  count
           50  1
leaked space: vdev 0, offset 0x40000, size 131072
block traversal size 339456 != alloc 470528 (leaked 131072)

        bp count:              53
        bp logical:        655360        avg:  12365
        bp physical:       207360        avg:   3912    compression:   3.16
        bp allocated:      339456        avg:   6404    compression:   1.93
        SPA allocated:     470528       used:  0.47%
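
A note on reading this output. As far as I can tell, the bookmark <21, 4, 0, 0> is <objset, object, level, blkid>, so it points at object 4 (the inode number of our file), level 0, block 0, which is exactly the block we overwrote. Error 50 should be EBADE on Solaris, which ZFS reuses for checksum errors. The "leaked" space is the same block: zdb skipped it, so it was never counted during the traversal. A small sketch that checks the numbers (values copied from the output, variable names only for illustration):

```shell
#!/bin/sh
# Numbers copied from the zdb -c output.
alloc=470528          # "SPA allocated"
traversed=339456      # "block traversal size"
leaked=$((alloc - traversed))

# The block we overwrote, from the DVA 0:40000:20000 (hex).
block_offset=$((0x40000))
block_size=$((0x20000))

# Assumed bookmark layout: <objset, object, level, blkid>.
bookmark="<21, 4, 0, 0>"
set -- $(echo "$bookmark" | tr -d '<>,')
objset=$1
object=$2
level=$3
blkid=$4

echo "leaked=$leaked block_size=$block_size block_offset=$block_offset"
echo "objset=$objset object=$object level=$level blkid=$blkid"
# prints: leaked=131072 block_size=131072 block_offset=262144
# prints: objset=21 object=4 level=0 blkid=0
```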

Ok, we have another copy (from trusted media ;)…

# cp -pRf /root/bash_completion /cow/fs01/bash_completion
# head /cow/fs01/bash_completion

#   bash_completion - programmable completion functions for bash 3.x
#                     (backwards compatible with bash 2.05b)
#
#   $Id: bash_completion,v 1.872 2006/03/01 16:20:18 ianmacd Exp $
#
#   Copyright (C) Ian Macdonald 
#
#   This program is free software; you can redistribute it and/or modify
#   it under the terms of the GNU General Public License as published by
#   the Free Software Foundation; either version 2, or (at your option)

Now everything is in good shape again…
see ya.