Index of Literature: The ZFS Intent Log (ZIL), Adaptive Replacement Cache (ARC), and L2ARC

Here are the best sources, in descending order of quality and depth, that I found on the Interwebs while studying this subject:

The ZFS Intent Log (ZIL)

The ZFS Adaptive Replacement Cache (ARC) and L2ARC

Preliminary Hypotheses

  • Mirroring the ZFS Intent Log (ZIL)
    • Pro:  It can save you from a simultaneous power failure and single-disk failure (of one of the ZIL mirrors).
      • Without a mirror, if both occur, you’d lose the last few seconds of data that had been written to the ZIL but not yet to the pool disks (the copy in RAM is gone once the power is lost), and that’s really all that you’re protecting against here.
    • Con:  It doubles all ZIL write operations (to account for the mirror), so it adds some load on the CPU.
    • Pertinent:  The ZIL is only used by synchronous write operations.  It’s unclear to what extent various common facilities make use of synchronous write operations over asynchronous write operations, but it seems it could range from insignificant to significant.
      • Performance benefits of the ZIL, therefore, are important to optimal operation.
    • Pertinent:  Loss of the SLOG (Separate Intent Log) causes the system to fail over to use the disks in the zpool themselves, as everything in the SLOG is also in the system’s RAM (and can therefore be copied to the new disks to be used in place of the recently-failed SLOG).
    • Pertinent:  (Therefore) The ZIL is restricted by ZFS from reaching a size larger than half of the total amount of RAM assigned to the system (and seems to remain much lower in practice, though under heavy disk contention it could perhaps be sizeable), so the log device need be no larger than that value.
    •  Hypothesis:
      • One small SSD used as a log device (maybe even a flash memory device over USB 3.0 if you’re desperate) should:
        • benefit system performance,
        • add no risk, suffering no irrecoverable fault in the file system if lost.
  • L2ARC:
    • Pertinent:  It was designed to either improve performance or do nothing, so it shouldn’t do any harm.
      • To explain what I mean by do nothing – if you use the L2ARC for a streaming or sequential workload, then the L2ARC will mostly ignore it and not cache it.
        • This is because the default L2ARC settings assume you are using current SSD devices, where caching random read workloads is most favourable;
        • with future SSDs (or other storage technology), we can use the L2ARC for streaming workloads as well.
      • Note:  My guess is that this limitation has largely passed by now (Gregg’s article was from 2008), at least enough for garage work. I’ll need to investigate my L2ARC configuration to ensure streaming data is cached, and I think I’ll dedicate as much to my L2ARC as I can, now that I know more about it.
    • Hypothesis:
      • One SSD of any size, with good read/write performance, used as a cache device, will increase the lifespan of your storage devices by bearing a potentially huge percentage of the burden put on those devices. The benefit seems to grow with the size of the device, capping out at 100% once the device is as large as the total capacity of the zpool.
      • Modern L2ARC device candidates outpace modest performance-optimized zpools (say, four 7200 RPM disks split into two mirrored vdevs of two disks each). I selected two random, low-priced drives of a configuration likely to form a low-budget storage service solution:
        • The Samsung 850 EVO 250 GB SSD gives about 487 MB/s sequential read and 396 MB/s sequential write @ >0.1ms latency
        • Western Digital Blue WD10EZEX 1 TB, 7200 RPM drive pool throughput: 2 x 180 = 360 MB/s sequential read and 2 x 170 = 340 MB/s sequential write @ ~6 ms total latency
        • Conclusion:  Yes, a low-priced consumer-grade SSD is likely to beat out a pool of low-priced consumer-grade hard drives in both throughput and latency.
          • Even if your SSD doesn’t quite match up with the throughput of your zpool, if you’re not hurting for absolutely optimal performance, you might consider offloading your mechanical disk wear and tear onto what is likely a cheaper SSD.
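
The arithmetic behind that comparison can be sketched in a few lines of Python. This is only a rough model using the figures quoted above; it assumes, as the bullets do, that a pool of two striped mirrors delivers about one disk's worth of sequential throughput per mirrored vdev. The names are mine, not anything from ZFS.

```python
# Figures quoted in the bullets above (MB/s).
SSD_READ_MBPS, SSD_WRITE_MBPS = 487, 396   # Samsung 850 EVO 250 GB
HDD_READ_MBPS, HDD_WRITE_MBPS = 180, 170   # one WD Blue WD10EZEX
MIRROR_VDEVS = 2                           # four disks as two 2-disk mirrors


def pool_throughput(per_disk_mbps, vdevs):
    """Rough sequential throughput of a striped-mirror pool.

    Assumption: each mirrored vdev contributes one disk's worth of
    throughput, and the vdevs add up across the stripe.
    """
    return per_disk_mbps * vdevs


pool_read = pool_throughput(HDD_READ_MBPS, MIRROR_VDEVS)
pool_write = pool_throughput(HDD_WRITE_MBPS, MIRROR_VDEVS)

print(f"read:  SSD {SSD_READ_MBPS} MB/s vs pool {pool_read} MB/s")
print(f"write: SSD {SSD_WRITE_MBPS} MB/s vs pool {pool_write} MB/s")
print("SSD ahead on both:", SSD_READ_MBPS > pool_read and SSD_WRITE_MBPS > pool_write)
```

Real pools won't hit these numbers exactly (mirrors can sometimes serve reads from both disks, for instance), so treat it as a back-of-the-envelope check rather than a benchmark.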

So it looks like an SSD lying around with nothing better to do should be partitioned and used as a cache device and a SLOG device for an underbuilt system.  It should really help with performance and there should be no consequence for its loss, aside from a loss of the performance benefit.
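
As a rough illustration of that plan, here is a small Python sketch that sizes the two partitions and prints the `zpool add` commands I would expect to run. It leans on the "no larger than half of RAM" figure from my notes above for the SLOG and gives the rest to the L2ARC. The pool name `tank`, the device paths, and the 16 GB of RAM are made-up placeholders, and the script only prints the commands; it doesn't touch any pool.

```python
from typing import List, Tuple


def plan_partitions(ssd_gb: float, ram_gb: float) -> Tuple[float, float]:
    """Split one SSD into (slog_gb, cache_gb).

    Assumption from the notes above: the SLOG never needs to be larger
    than half of RAM; whatever is left goes to the L2ARC.
    """
    slog_gb = min(ram_gb / 2, ssd_gb)
    return slog_gb, ssd_gb - slog_gb


def zpool_add_commands(pool: str, slog_dev: str, cache_dev: str) -> List[List[str]]:
    """Build (but do not run) the commands that attach a log and a cache device."""
    return [
        ["zpool", "add", pool, "log", slog_dev],
        ["zpool", "add", pool, "cache", cache_dev],
    ]


# Hypothetical example: a 250 GB SSD in a machine with 16 GB of RAM.
slog_gb, cache_gb = plan_partitions(ssd_gb=250, ram_gb=16)
print(f"SLOG partition: {slog_gb} GB, L2ARC partition: {cache_gb} GB")

# Placeholder pool name and partition paths; substitute your own.
for cmd in zpool_add_commands("tank", "/dev/sdX1", "/dev/sdX2"):
    print(" ".join(cmd))
```

Printing the commands instead of executing them leaves room to double-check the device paths before anything is added to the pool.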
