Configuring storage pools with ZFS
(Optional, recommended) Bypassing the LSI controller
- [Dangerous!] Clear the controller configuration (this wipes ALL configuration and causes data loss)
/opt/MegaRAID/MegaCli/MegaCli64 -CfgClr -a0
- [Optional] On old LSI controllers, we have to create one RAID0 per disk (for some reason it does not work without RAID0; newer controllers do not need this)
/opt/MegaRAID/MegaCli/MegaCli64 -CfgEachDskRaid0 -a0
Map your storage disks by-partuuid, by-path or by-id
- Devices can be addressed by their human-friendly device names (e.g. sda, sdb). Because Linux can reassign these names, a name may end up pointing to a different physical disk, which causes problems whenever the mapping changes.
- The use of by-partuuid, by-path or by-id is recommended instead of the use of device names.
- /dev/disk/by-path -> This is the default method that will be used at PIC
- /dev/disk/by-partuuid
- /dev/disk/by-id
- For instance, for dc106.pic.es which is a SuperMicro X8DT3 we have:
[root@dc106 ~]# ls -lha /dev/disk/by-path/ | grep -v part
total 0
drwxr-xr-x 2 root root 2.2K Jun 25 14:25 .
drwxr-xr-x 7 root root  140 Jun 25 14:25 ..
lrwxrwxrwx 1 root root    9 Jun 25 14:25 pci-0000:05:00.0-scsi-0:2:0:0 -> ../../sda
lrwxrwxrwx 1 root root    9 Jun 25 14:25 pci-0000:05:00.0-scsi-0:2:1:0 -> ../../sdb
lrwxrwxrwx 1 root root    9 Jun 25 14:25 pci-0000:05:00.0-scsi-0:2:2:0 -> ../../sdc
lrwxrwxrwx 1 root root    9 Jun 25 14:25 pci-0000:05:00.0-scsi-0:2:3:0 -> ../../sdd
lrwxrwxrwx 1 root root    9 Jun 25 14:25 pci-0000:05:00.0-scsi-0:2:4:0 -> ../../sde
lrwxrwxrwx 1 root root    9 Jun 25 14:25 pci-0000:05:00.0-scsi-0:2:5:0 -> ../../sdf
lrwxrwxrwx 1 root root    9 Jun 25 14:25 pci-0000:05:00.0-scsi-0:2:6:0 -> ../../sdg
lrwxrwxrwx 1 root root    9 Jun 25 14:25 pci-0000:05:00.0-scsi-0:2:7:0 -> ../../sdh
lrwxrwxrwx 1 root root    9 Jun 25 14:25 pci-0000:05:00.0-scsi-0:2:8:0 -> ../../sdi
lrwxrwxrwx 1 root root    9 Jun 25 14:25 pci-0000:05:00.0-scsi-0:2:9:0 -> ../../sdj
lrwxrwxrwx 1 root root    9 Jun 25 14:25 pci-0000:05:00.0-scsi-0:2:10:0 -> ../../sdk
lrwxrwxrwx 1 root root    9 Jun 25 14:25 pci-0000:05:00.0-scsi-0:2:11:0 -> ../../sdl
lrwxrwxrwx 1 root root    9 Jun 25 14:25 pci-0000:05:00.0-scsi-0:2:12:0 -> ../../sdm
lrwxrwxrwx 1 root root    9 Jun 25 14:25 pci-0000:05:00.0-scsi-0:2:13:0 -> ../../sdn
lrwxrwxrwx 1 root root    9 Jun 25 14:25 pci-0000:05:00.0-scsi-0:2:14:0 -> ../../sdo
lrwxrwxrwx 1 root root    9 Jun 25 14:25 pci-0000:05:00.0-scsi-0:2:15:0 -> ../../sdp
lrwxrwxrwx 1 root root    9 Jun 25 14:25 pci-0000:05:00.0-scsi-0:2:16:0 -> ../../sdq
lrwxrwxrwx 1 root root    9 Jun 25 14:25 pci-0000:05:00.0-scsi-0:2:17:0 -> ../../sdr
lrwxrwxrwx 1 root root    9 Jun 25 14:25 pci-0000:05:00.0-scsi-0:2:18:0 -> ../../sds
lrwxrwxrwx 1 root root    9 Jun 25 14:25 pci-0000:05:00.0-scsi-0:2:19:0 -> ../../sdt
lrwxrwxrwx 1 root root    9 Jun 25 14:25 pci-0000:05:00.0-scsi-0:2:20:0 -> ../../sdu
lrwxrwxrwx 1 root root    9 Jun 25 14:25 pci-0000:05:00.0-scsi-0:2:21:0 -> ../../sdv
lrwxrwxrwx 1 root root    9 Jun 25 14:25 pci-0000:05:00.0-scsi-0:2:22:0 -> ../../sdw
lrwxrwxrwx 1 root root    9 Jun 25 14:25 pci-0000:05:00.0-scsi-0:2:23:0 -> ../../sdx
lrwxrwxrwx 1 root root    9 Jun 25 14:25 pci-0000:05:00.0-scsi-0:2:24:0 -> ../../sdy
lrwxrwxrwx 1 root root    9 Jun 25 14:25 pci-0000:05:00.0-scsi-0:2:25:0 -> ../../sdz
lrwxrwxrwx 1 root root   10 Jun 25 14:25 pci-0000:05:00.0-scsi-0:2:26:0 -> ../../sdaa
lrwxrwxrwx 1 root root   10 Jun 25 14:25 pci-0000:05:00.0-scsi-0:2:27:0 -> ../../sdab
lrwxrwxrwx 1 root root   10 Jun 25 14:25 pci-0000:05:00.0-scsi-0:2:28:0 -> ../../sdac
lrwxrwxrwx 1 root root   10 Jun 25 14:25 pci-0000:05:00.0-scsi-0:2:29:0 -> ../../sdad
lrwxrwxrwx 1 root root   10 Jun 25 14:25 pci-0000:05:00.0-scsi-0:2:30:0 -> ../../sdae
lrwxrwxrwx 1 root root   10 Jun 25 14:25 pci-0000:05:00.0-scsi-0:2:31:0 -> ../../sdaf
lrwxrwxrwx 1 root root   10 Jun 25 14:25 pci-0000:05:00.0-scsi-0:2:32:0 -> ../../sdag
lrwxrwxrwx 1 root root   10 Jun 25 14:25 pci-0000:05:00.0-scsi-0:2:33:0 -> ../../sdah
lrwxrwxrwx 1 root root   10 Jun 25 14:25 pci-0000:05:00.0-scsi-0:2:34:0 -> ../../sdai
lrwxrwxrwx 1 root root   10 Jun 25 14:25 pci-0000:05:00.0-scsi-0:2:35:0 -> ../../sdaj
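The `grep -v part` filtering shown in the listing can also be scripted when the disk names need to be fed to another command. The following is only a minimal sketch: `list_whole_disks` is a hypothetical helper name, and it assumes that partition links carry a `-partN` suffix, as udev usually names them.

```shell
#!/bin/sh
# Hypothetical helper: list whole-disk links (no "-partN" suffix) in a
# by-path style directory, printing "<link name> <resolved device>".
# The directory argument defaults to /dev/disk/by-path.
list_whole_disks() {
    dir="${1:-/dev/disk/by-path}"
    for link in "$dir"/*; do
        # Skip anything that is not a symlink (also covers an unmatched glob)
        [ -L "$link" ] || continue
        case "$link" in
            *-part[0-9]*) continue ;;   # skip partition entries
        esac
        printf '%s %s\n' "$(basename "$link")" "$(readlink -f "$link")"
    done
}
```

Calling `list_whole_disks` with no argument reads /dev/disk/by-path; passing /dev/disk/by-id as the argument lists that directory instead.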
Create a zpool with 2 parity disks
zpool create <zpool_name> <raid_type_1> <device_X> ... <device_X(n)> ... <raid_type_N> <device_Y> ... <device_Y(n)>
- raidz2 will be used in order to create 2 parity disks.
- We can create several raidz2 in the same zpool
- To improve performance, it is strongly recommended to use a power-of-2 number of data disks in each raidz2 (e.g. 4, 8 or 16), plus the 2 parity disks. The example below uses 16 + 2 = 18 disks per raidz2.
- Example:
zpool create -f pool \
  raidz2 \
    "pci-0000:05:00.0-scsi-0:2:0:0" "pci-0000:05:00.0-scsi-0:2:1:0" "pci-0000:05:00.0-scsi-0:2:2:0" \
    "pci-0000:05:00.0-scsi-0:2:3:0" "pci-0000:05:00.0-scsi-0:2:4:0" "pci-0000:05:00.0-scsi-0:2:5:0" \
    "pci-0000:05:00.0-scsi-0:2:6:0" "pci-0000:05:00.0-scsi-0:2:7:0" "pci-0000:05:00.0-scsi-0:2:8:0" \
    "pci-0000:05:00.0-scsi-0:2:9:0" "pci-0000:05:00.0-scsi-0:2:10:0" "pci-0000:05:00.0-scsi-0:2:11:0" \
    "pci-0000:05:00.0-scsi-0:2:12:0" "pci-0000:05:00.0-scsi-0:2:13:0" "pci-0000:05:00.0-scsi-0:2:14:0" \
    "pci-0000:05:00.0-scsi-0:2:15:0" "pci-0000:05:00.0-scsi-0:2:16:0" "pci-0000:05:00.0-scsi-0:2:17:0" \
  raidz2 \
    "pci-0000:05:00.0-scsi-0:2:18:0" "pci-0000:05:00.0-scsi-0:2:19:0" "pci-0000:05:00.0-scsi-0:2:20:0" \
    "pci-0000:05:00.0-scsi-0:2:21:0" "pci-0000:05:00.0-scsi-0:2:22:0" "pci-0000:05:00.0-scsi-0:2:23:0" \
    "pci-0000:05:00.0-scsi-0:2:24:0" "pci-0000:05:00.0-scsi-0:2:25:0" "pci-0000:05:00.0-scsi-0:2:26:0" \
    "pci-0000:05:00.0-scsi-0:2:27:0" "pci-0000:05:00.0-scsi-0:2:28:0" "pci-0000:05:00.0-scsi-0:2:29:0" \
    "pci-0000:05:00.0-scsi-0:2:30:0" "pci-0000:05:00.0-scsi-0:2:31:0" "pci-0000:05:00.0-scsi-0:2:32:0" \
    "pci-0000:05:00.0-scsi-0:2:33:0" "pci-0000:05:00.0-scsi-0:2:34:0" "pci-0000:05:00.0-scsi-0:2:35:0"
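Typing 36 device names by hand is error-prone, so a command like the one above could be generated instead. The sketch below only prints the command and does not run it. `build_zpool_cmd` is a hypothetical helper name, and the hard-coded controller prefix is taken from the dc106 listing, so it would need adapting on other hardware.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) a "zpool create" command that splits
# disk indices 0..total-1 into raidz2 groups of a fixed size.
# Arguments: pool name, total number of disks, disks per raidz2 group.
build_zpool_cmd() {
    pool="$1"; total="$2"; group="$3"
    # Assumption: by-path prefix of the controller, as seen on dc106
    prefix="pci-0000:05:00.0-scsi-0:2"
    cmd="zpool create -f $pool"
    i=0
    while [ "$i" -lt "$total" ]; do
        # Start a new raidz2 vdev at each group boundary
        if [ $((i % group)) -eq 0 ]; then
            cmd="$cmd raidz2"
        fi
        cmd="$cmd \"${prefix}:${i}:0\""
        i=$((i + 1))
    done
    printf '%s\n' "$cmd"
}
```

For the layout used here, `build_zpool_cmd pool 36 18` prints a command with two raidz2 groups of 18 disks each. Review the printed command before running it, since `-f` forces the use of the listed devices.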
zpool status
[root@dc106 vpool1]# zpool status
  pool: dcpool
 state: ONLINE
  scan: none requested
config:

        NAME                                STATE     READ WRITE CKSUM
        dcpool                              ONLINE       0     0     0
          raidz2-0                          ONLINE       0     0     0
            pci-0000:05:00.0-scsi-0:2:0:0   ONLINE       0     0     0
            pci-0000:05:00.0-scsi-0:2:1:0   ONLINE       0     0     0
            pci-0000:05:00.0-scsi-0:2:2:0   ONLINE       0     0     0
            pci-0000:05:00.0-scsi-0:2:3:0   ONLINE       0     0     0
            pci-0000:05:00.0-scsi-0:2:4:0   ONLINE       0     0     0
            pci-0000:05:00.0-scsi-0:2:5:0   ONLINE       0     0     0
            pci-0000:05:00.0-scsi-0:2:6:0   ONLINE       0     0     0
            pci-0000:05:00.0-scsi-0:2:7:0   ONLINE       0     0     0
            pci-0000:05:00.0-scsi-0:2:8:0   ONLINE       0     0     0
            pci-0000:05:00.0-scsi-0:2:9:0   ONLINE       0     0     0
            pci-0000:05:00.0-scsi-0:2:10:0  ONLINE       0     0     0
            pci-0000:05:00.0-scsi-0:2:11:0  ONLINE       0     0     0
            pci-0000:05:00.0-scsi-0:2:12:0  ONLINE       0     0     0
            pci-0000:05:00.0-scsi-0:2:13:0  ONLINE       0     0     0
            pci-0000:05:00.0-scsi-0:2:14:0  ONLINE       0     0     0
            pci-0000:05:00.0-scsi-0:2:15:0  ONLINE       0     0     0
            pci-0000:05:00.0-scsi-0:2:16:0  ONLINE       0     0     0
            pci-0000:05:00.0-scsi-0:2:17:0  ONLINE       0     0     0
          raidz2-1                          ONLINE       0     0     0
            pci-0000:05:00.0-scsi-0:2:18:0  ONLINE       0     0     0
            pci-0000:05:00.0-scsi-0:2:19:0  ONLINE       0     0     0
            pci-0000:05:00.0-scsi-0:2:20:0  ONLINE       0     0     0
            pci-0000:05:00.0-scsi-0:2:21:0  ONLINE       0     0     0
            pci-0000:05:00.0-scsi-0:2:22:0  ONLINE       0     0     0
            pci-0000:05:00.0-scsi-0:2:23:0  ONLINE       0     0     0
            pci-0000:05:00.0-scsi-0:2:24:0  ONLINE       0     0     0
            pci-0000:05:00.0-scsi-0:2:25:0  ONLINE       0     0     0
            pci-0000:05:00.0-scsi-0:2:26:0  ONLINE       0     0     0
            pci-0000:05:00.0-scsi-0:2:27:0  ONLINE       0     0     0
            pci-0000:05:00.0-scsi-0:2:28:0  ONLINE       0     0     0
            pci-0000:05:00.0-scsi-0:2:29:0  ONLINE       0     0     0
            pci-0000:05:00.0-scsi-0:2:30:0  ONLINE       0     0     0
            pci-0000:05:00.0-scsi-0:2:31:0  ONLINE       0     0     0
            pci-0000:05:00.0-scsi-0:2:32:0  ONLINE       0     0     0
            pci-0000:05:00.0-scsi-0:2:33:0  ONLINE       0     0     0
            pci-0000:05:00.0-scsi-0:2:34:0  ONLINE       0     0     0
            pci-0000:05:00.0-scsi-0:2:35:0  ONLINE       0     0     0

errors: No known data errors
Tuning up ZFS
- Specify the zfs_arc_min and zfs_arc_max values to use. zfs_txg_timeout can also be specified.
- zfs_arc_min: Determines the minimum size of the ZFS Adaptive Replacement Cache (ARC).
- zfs_arc_max: Determines the maximum size of the ZFS Adaptive Replacement Cache (ARC).
- zfs_txg_timeout: Specifies the transaction group timeout
- Setting ZFS tunables is specific to each environment:
- Usually we set the ARC minimum to 33% and the maximum to 75% of installed RAM. More RAM is better with ZFS, but be careful with current ZFS bugs on Linux. Currently our maximum is 50%.
- We set the transaction group timeout to 5 seconds to prevent the volume from appearing to freeze during a large batch of writes. 5 seconds is the default, but it is safer to set it explicitly.
- Example:
echo "options zfs zfs_arc_min=8589934592 zfs_arc_max=25769803776 zfs_txg_timeout=5" > /etc/modprobe.d/zfs.conf
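The byte values in the example correspond to 8 GiB (8589934592) and 24 GiB (25769803776). As a rough sketch of how such values could be derived from the percentages mentioned above, the following computes them from the installed RAM. `zfs_options_line` is a hypothetical helper name, and the 33% / 50% figures passed to it are only the ones quoted in this page, not a general recommendation.

```shell
#!/bin/sh
# Hypothetical helper: print a modprobe options line with ARC limits
# derived from total RAM.
# Arguments: total RAM in bytes, min percent, max percent, txg timeout (s).
zfs_options_line() {
    total="$1"; min_pct="$2"; max_pct="$3"; txg="$4"
    arc_min=$((total * min_pct / 100))
    arc_max=$((total * max_pct / 100))
    printf 'options zfs zfs_arc_min=%s zfs_arc_max=%s zfs_txg_timeout=%s\n' \
        "$arc_min" "$arc_max" "$txg"
}

# MemTotal in /proc/meminfo is reported in kB; convert it to bytes.
if [ -r /proc/meminfo ]; then
    total_kb=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
    zfs_options_line "$((total_kb * 1024))" 33 50 5
fi
```

The script only prints the line. After checking the values, the output can be redirected to /etc/modprobe.d/zfs.conf as in the example above.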
Administering zpool/zfs
Administering zpool
- Monitor with iostat once per second:
zpool iostat -v 1
- Check zpool status:
zpool status
- Check data consistency with scrub:
zpool scrub <zpool_name>
zpool status
- List zpool properties:
zpool list
- List the history of actions performed on a zpool:
zpool history
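For unattended monitoring, the health of each pool can be checked from a script. This is a minimal sketch under the assumption that the input comes from `zpool list -H -o name,health`, which prints tab-separated fields without a header. `report_unhealthy` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read "name<TAB>health" lines from stdin, print every
# pool whose health is not ONLINE, and return 1 if any such pool was found.
report_unhealthy() {
    awk -F '\t' '
        $2 != "ONLINE" { print $1 ": " $2; bad = 1 }
        END            { exit bad }
    '
}
```

A possible use is `zpool list -H -o name,health | report_unhealthy`, for instance from a cron job that sends mail when the command prints anything.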
Administering zfs
- Get ZFS properties:
zfs get all
- Check ZFS volumes:
zfs list
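The output of `zfs list` can also be post-processed, for example to get a rough usage percentage per dataset. The sketch below assumes its input comes from `zfs list -H -p -o name,used,avail` (tab-separated, sizes in bytes). `usage_percent` is a hypothetical helper name, and used / (used + avail) is only an approximation of how full a dataset is.

```shell
#!/bin/sh
# Hypothetical helper: read "name<TAB>used<TAB>avail" lines (bytes) from
# stdin and print "<name> <percent used>%" for each dataset.
usage_percent() {
    awk -F '\t' '($2 + $3) > 0 {
        printf "%s %d%%\n", $1, ($2 * 100) / ($2 + $3)
    }'
}
```

A possible use is `zfs list -H -p -o name,used,avail | usage_percent`.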