I want to share my findings regarding SMART support on NVMe as I also stumbled across this issue.
SMART on SATA vs NVMe
For SATA drives (HDDs & SSDs) there is a list of ATA-SMART-attributes with the idea of having normalized values, where higher values are always better for any attribute.
Unfortunately this system of vendor-independent values never worked well: there were a lot of vendor-specific differences and exceptions in how these values are treated.
For NVMe SSDs, SMART reporting became mandatory. The consortium decided not to continue the ATA-SMART-attributes and to use log pages instead. The SMART/Health Information log page (NVMe Log 0x02 in the smartctl output further down) contains unified NVMe-SMART-attributes, which are completely different from the ATA-SMART-attributes.
Here is also a great blog post about this:
https://utcc.utoronto.ca/~cks/space/blog/tech/NVMeAndSMART
Smartmontools NVMe support
The NVMe support of smartmontools (smartctl) is still considered experimental (according to their wiki).
For example, here are the attributes (-A) reported by smartctl for my drives:
SATA SSD:
admin@Kolibri:~> sudo smartctl -i -A /dev/sdb
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.4.0-150600.23.33-default] (SUSE RPM)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Crucial/Micron Client SSDs
Device Model: Crucial_CT525MX300SSD1
Serial Number: 1712166362A2
LU WWN Device Id: 5 00a075 1166362a2
Firmware Version: M0CR040
User Capacity: 525.112.713.216 bytes [525 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: Solid State Device
Form Factor: 2.5 inches
TRIM Command: Available, deterministic, zeroed
Device is: In smartctl database 7.3/5528
ATA Version is: ACS-3 T13/2161-D revision 5
SATA Version is: SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Sat Feb 1 14:53:47 2025 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 100 100 000 Pre-fail Always - 37
5 Reallocate_NAND_Blk_Cnt 0x0032 099 099 010 Old_age Always - 19
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 11168
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 3686
171 Program_Fail_Count 0x0032 100 100 000 Old_age Always - 0
172 Erase_Fail_Count 0x0032 100 100 000 Old_age Always - 0
173 Ave_Block-Erase_Count 0x0032 088 088 000 Old_age Always - 193
174 Unexpect_Power_Loss_Ct 0x0032 100 100 000 Old_age Always - 236
183 SATA_Interfac_Downshift 0x0032 100 100 000 Old_age Always - 0
184 Error_Correction_Count 0x0032 100 100 000 Old_age Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 34
194 Temperature_Celsius 0x0022 055 039 000 Old_age Always - 45 (Min/Max 7/61)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 19
197 Current_Pending_ECC_Cnt 0x0032 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 100 100 000 Old_age Offline - 2
199 UDMA_CRC_Error_Count 0x0032 100 100 000 Old_age Always - 1
202 Percent_Lifetime_Remain 0x0030 088 088 001 Old_age Offline - 12
206 Write_Error_Rate 0x000e 100 100 000 Old_age Always - 0
246 Total_LBAs_Written 0x0032 100 100 000 Old_age Always - 76023495678
247 Host_Program_Page_Count 0x0032 100 100 000 Old_age Always - 2376709354
248 FTL_Program_Page_Count 0x0032 100 100 000 Old_age Always - 2438291973
180 Unused_Reserve_NAND_Blk 0x0033 000 000 000 Pre-fail Always - 1925
210 Success_RAIN_Recov_Cnt 0x0032 100 100 000 Old_age Always - 88
NVMe SSD:
admin@Kolibri:~> sudo smartctl -i -A /dev/nvme0
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.4.0-150600.23.33-default] (SUSE RPM)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Number: SSD_M.2_PCIe4_4TB_InnovationIT_Y
Serial Number: H031302309130400
Firmware Version: H230829a
PCI Vendor/Subsystem ID: 0x1e4b
IEEE OUI Identifier: 0x000000
Total NVM Capacity: 4.096.805.658.624 [4,09 TB]
Unallocated NVM Capacity: 0
Controller ID: 0
NVMe Version: 2.0
Number of Namespaces: 1
Namespace 1 Size/Capacity: 4.096.805.658.624 [4,09 TB]
Namespace 1 Formatted LBA Size: 512
Namespace 1 IEEE EUI-64: 000000 2309130400
Local Time is: Sat Feb 1 14:56:52 2025 CET
=== START OF SMART DATA SECTION ===
SMART/Health Information (NVMe Log 0x02)
Critical Warning: 0x00
Temperature: 40 Celsius
Available Spare: 100%
Available Spare Threshold: 1%
Percentage Used: 0%
Data Units Read: 38.835.756 [19,8 TB]
Data Units Written: 11.391.884 [5,83 TB]
Host Read Commands: 241.217.843
Host Write Commands: 161.121.091
Controller Busy Time: 389
Power Cycles: 66
Power On Hours: 6.461
Unsafe Shutdowns: 21
Media and Data Integrity Errors: 0
Error Information Log Entries: 0
Warning Comp. Temperature Time: 0
Critical Comp. Temperature Time: 0
Temperature Sensor 1: 40 Celsius
Temperature Sensor 2: 50 Celsius
Discussion about NVMe SMART attributes
The best discussion I have found so far about interpreting the NVMe SMART attributes was on Reddit:
https://www.reddit.com/r/linuxadmin/comments/15g7dh1/identify_the_wear_level_of_the_ssd_using/
I find the information presented by NVMe SSDs much clearer and more useful than the ATA-SMART-attributes:
Available Spare: the percentage of spare blocks still available to the SSD controller
Available Spare Threshold: the limit for the associated Available Spare value, at which point the SSD will consider itself failing
Percentage Used: the estimated percentage of the drive's lifespan that has been used up
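How these three values relate to each other can be sketched in a few lines of Python. This is only an illustration, not how smartctl or any drive firmware works internally: the function name `nvme_wear_status` and the labels it returns are made up for this example, and the inputs are the plain percentages from the smartctl output. As far as I understand the NVMe specification, the drive raises its spare warning once Available Spare has fallen below the threshold, and Percentage Used is allowed to exceed 100.

```python
def nvme_wear_status(available_spare, spare_threshold, percentage_used):
    """Classify a drive from three SMART/Health values (all given in percent).

    The labels returned here are hypothetical and only meant for illustration.
    """
    # Spare blocks have dropped below the vendor-defined limit.
    if available_spare < spare_threshold:
        return "spare below threshold"
    # The vendor's endurance estimate is used up; the drive may still work.
    if percentage_used >= 100:
        return "rated endurance reached"
    return "ok"


# Values taken from the two NVMe drives shown in this post.
print(nvme_wear_status(100, 1, 0))  # InnovationIT 4 TB drive
print(nvme_wear_status(100, 5, 3))  # Patriot P310 workstation drive
```

Both of my drives come out as "ok" here, which matches the Critical Warning value of 0x00 that they report.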
Here is another example: the NVMe SSD in my workstation. After about 2 years in service, it shows a notable usage of 3%:
[simon@Bussard: ~]$ sudo smartctl -i -A /dev/nvme0
[sudo] password for root:
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.4.0-150600.23.33-default] (SUSE RPM)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Number: Patriot M.2 P310 1920GB
Serial Number: P310BACA2112100173
Firmware Version: ECFM53.1
PCI Vendor/Subsystem ID: 0x1987
IEEE OUI Identifier: 0x6479a7
Total NVM Capacity: 1.920.383.410.176 [1,92 TB]
Unallocated NVM Capacity: 0
Controller ID: 1
NVMe Version: 1.3
Number of Namespaces: 1
Namespace 1 Size/Capacity: 1.920.383.410.176 [1,92 TB]
Namespace 1 Formatted LBA Size: 512
Namespace 1 IEEE EUI-64: 6479a7 5010200e64
Local Time is: Sat Feb 1 15:02:38 2025 CET
=== START OF SMART DATA SECTION ===
SMART/Health Information (NVMe Log 0x02)
Critical Warning: 0x00
Temperature: 28 Celsius
Available Spare: 100%
Available Spare Threshold: 5%
Percentage Used: 3%
Data Units Read: 94.352.586 [48,3 TB]
Data Units Written: 44.777.078 [22,9 TB]
Host Read Commands: 563.588.212
Host Write Commands: 333.236.343
Controller Busy Time: 1.767
Power Cycles: 804
Power On Hours: 3.635
Unsafe Shutdowns: 24
Media and Data Integrity Errors: 0
Error Information Log Entries: 2.191
Warning Comp. Temperature Time: 0
Critical Comp. Temperature Time: 0
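The TB figures in square brackets follow directly from the raw Data Units counters. In the NVMe specification one data unit stands for 1000 blocks of 512 bytes, so the conversion is simple arithmetic. A small sketch (the helper name `data_units_to_tb` is just chosen for this example):

```python
# One NVMe "data unit" is 1000 blocks of 512 bytes, i.e. 512,000 bytes.
BYTES_PER_DATA_UNIT = 1000 * 512


def data_units_to_tb(data_units):
    """Convert a Data Units Read/Written counter to decimal terabytes."""
    return data_units * BYTES_PER_DATA_UNIT / 1e12


# Counters of the workstation drive shown above.
print(f"read:    {data_units_to_tb(94_352_586):.1f} TB")
print(f"written: {data_units_to_tb(44_777_078):.1f} TB")
```

This reproduces the 48,3 TB read and 22,9 TB written that smartctl prints for this drive.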
My thoughts
As the (big) change in SMART attributes is not implemented in Rockstor, I configured automatic monitoring of all my drives via smartd, including e-mail notifications, as explained here.
Honestly, I am very happy with the new SMART attributes of NVMe drives. Some years ago I tried to use the ATA-SMART-attributes with a bunch of SATA drives from different vendors, and it was a nightmare to find out which vendor uses which attributes and what the values actually mean for each drive.
The new NVMe SMART attributes seem to be much more expressive and comparable between vendors.
I think Rockstor could parse these NVMe SMART attributes and display the values of all drives side by side in a clear, simple but instructive table (Temperature, Percentage Used, Data Written, …).
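To give an idea of how little parsing that would take, here is a rough Python sketch. It is not Rockstor code and says nothing about how Rockstor is built: the function names `parse_nvme_health` and `format_table` are invented for this example, and it assumes the text layout of the SMART/Health section as printed by smartctl 7.4 above.

```python
def parse_nvme_health(smartctl_text):
    """Collect the 'Name: value' lines of the SMART/Health section into a dict."""
    health = {}
    in_health_section = False
    for line in smartctl_text.splitlines():
        if line.startswith("SMART/Health Information"):
            in_health_section = True
            continue
        if in_health_section and ":" in line:
            name, _, value = line.partition(":")
            health[name.strip()] = value.strip()
    return health


def format_table(drives, columns):
    """Render {device: health dict} as a plain-text table with aligned columns."""
    header = ["Device"] + list(columns)
    rows = [[device] + [health.get(col, "n/a") for col in columns]
            for device, health in drives.items()]
    table = [header] + rows
    widths = [max(len(row[i]) for row in table) for i in range(len(header))]
    return "\n".join(
        "  ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in table
    )


# Shortened sample, copied from the workstation drive's output in this post.
SAMPLE = """\
=== START OF SMART DATA SECTION ===
SMART/Health Information (NVMe Log 0x02)
Critical Warning: 0x00
Temperature: 28 Celsius
Available Spare: 100%
Percentage Used: 3%
Data Units Written: 44.777.078 [22,9 TB]
"""

drives = {"/dev/nvme0": parse_nvme_health(SAMPLE)}
print(format_table(drives, ["Temperature", "Percentage Used", "Data Units Written"]))
```

The values are kept as strings on purpose, because the number formatting in the text output depends on the locale (note the dots as thousands separators here). smartctl also has a JSON output mode (-j), which would probably be a more robust basis than scraping the text.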