Discussion:
writing zeros to bad sector results in persistent read error
Chris Murphy
2014-06-07 00:11:03 UTC
This is a bit off topic as it doesn't involve md raid. But bad sectors are a common source of md raid problems, so I figured I'd post this here.

Summary: Hitachi/HGST Travelstar 5K750. smartctl will not complete an extended offline test; it stops with 60% remaining, reporting the LBA of the first error. Whether I use dd to read that LBA, write zeros to it, or write zeros to a 1MB block surrounding it, I always get back a read error, not a write error. I can't get rid of this bad sector. I have used the ATA secure erase command via hdparm and get the same results. Very weird; I'd expect a write error to occur.



### This is the entry from smartctl:
Num  Test_Description   Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline   Completed: read failure  60%        1206             430197584

### Link to the full smartctl -x output
https://docs.google.com/file/d/0B_2Asp8DGjJ9VmdIZVo4UzdGaEE/edit


### This is the command I used to try to write zeros over it, and the result:
# dd if=/dev/zero of=/dev/sda seek=430197584 count=1
dd: writing to '/dev/sda': Input/output error
1+0 records in
0+0 records out
0 bytes (0 B) copied, 3.6149 s, 0.0 kB/s

### And this is the kernel message that appears as a result:

[15110.142071] ata1.00: exception Emask 0x0 SAct 0x20000 SErr 0x0 action 0x0
[15110.142079] ata1.00: irq_stat 0x40000008
[15110.142084] ata1.00: failed command: READ FPDMA QUEUED
[15110.142092] ata1.00: cmd 60/08:88:50:4b:a4/00:00:19:00:00/40 tag 17 ncq 4096 in
               res 51/40:08:50:4b:a4/00:00:19:00:00/40 Emask 0x409 (media error) <F>
[15110.142096] ata1.00: status: { DRDY ERR }
[15110.142099] ata1.00: error: { UNC }
[15110.144802] ata1.00: configured for UDMA/133
[15110.144826] sd 0:0:0:0: [sda] Unhandled sense code
[15110.144832] sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[15110.144837] sd 0:0:0:0: [sda] Sense Key : Medium Error [current] [descriptor]
[15110.144841] Descriptor sense data with sense descriptors (in hex):
[15110.144843]         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
[15110.144854]         19 a4 4b 50
[15110.144863] sd 0:0:0:0: [sda] Add. Sense: Unrecovered read error - auto reallocate failed
[15110.144865] sd 0:0:0:0: [sda] CDB: Read(10): 28 00 19 a4 4b 50 00 00 08 00
[15110.144892] end_request: I/O error, dev sda, sector 430197584
[15110.144934] ata1: EH complete

### This is the complete dmesg
https://docs.google.com/file/d/0B_2Asp8DGjJ9c3hfelQyTnNoMU0/edit

At first I thought it was because I'm writing one 512 byte logical sector, while this drive has 4096 byte physical sectors. OK, so I write out 8 logical sectors instead; still a read error. If I do this, to put the bad sector in the middle of a 1MB write:

# dd if=/dev/zero of=/dev/sda seek=430196560 count=2048
dd: writing to '/dev/sda': Input/output error
1025+0 records in
1024+0 records out

It stops right at LBA 430197584, again with a read error. So even though the drive's SMART health assessment is "pass", and there are no other SMART values below threshold (i.e. "works as designed"), this drive has effectively failed, because any write operation to this LBA results in unrecoverable failure.

Anyway I find this confusing and unexpected.


Chris Murphy
Roger Heflin
2014-06-07 01:26:44 UTC
hdparm --write-sector <sectornum> <device>

is the only thing with which I was ever able to force a rewrite and/or a reallocation.

That worked on both Seagate and WD disks to make the offline test go
past that point (at least until the next bad sector).

I did note that bad sectors appear to come in groups.
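(A sketch of that approach against the LBA from the self-test log above; --yes-i-know-what-i-am-doing is required because --write-sector destroys the sector's contents:)

# hdparm --read-sector 430197584 /dev/sda     (fails with an I/O error while the sector is pending)
# hdparm --write-sector 430197584 --yes-i-know-what-i-am-doing /dev/sda
# hdparm --read-sector 430197584 /dev/sda     (should now return zeros)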
Roman Mamedov
2014-06-07 01:51:40 UTC
On Fri, 6 Jun 2014 18:11:03 -0600
Post by Chris Murphy
# dd if=/dev/zero of=/dev/sda seek=430196560 count=2048
dd: writing to '/dev/sda': Input/output error
It stops right at LBA 430197584, again with a read error. [...]
Hello,

Try again with "oflag=direct";

If that doesn't help, remember this is a 4K-sector drive, maybe you should
retry with bs=4096 (recalculating the offset so it still writes to the proper
place).
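(Worked through for the failing sector, as a sketch: 430197584 is divisible by 8, so the same physical sector can be addressed as one 4096-byte block, where 53774698 = 430197584 / 8:)

# dd if=/dev/zero of=/dev/sda bs=4096 seek=53774698 count=1 oflag=direct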
--
With respect,
Roman
Chris Murphy
2014-06-07 16:42:50 UTC
I "embargoed" the bad sector with partitioning to get the user back up =
and running. In the course of an OS X install, it managed to create a "=
recovery/boot" partition right on top of the bad sector. It no longer r=
eturns a read error for that sector. Clearly it was fixed with whatever=
write command the installer used, and dd as I used it just does someth=
ing different and fails. There are more pending sectors so once I find =
their LBAs with SMART offline testing I'll try the other mentioned tech=
niques.

What I'm still confused about is: an ATA secure erase had been done, and yet the Current_Pending_Sector count was still 48, before and after the secure erase. That tells me that secure erase is just about zeroing, and the drive firmware isn't actually confirming whether the writes were successful. Tragic. I'd have thought it would write such sectors, confirm they're bad, and remove them from use in one whack; this is apparently not the case.
Post by Roman Mamedov
Try again with "oflag=direct";
If that doesn't help, remember this is a 4K-sector drive, maybe you should
retry with bs=4096 (recalculating the offset so it still writes to the proper
place).
# smartctl -t select,430195536-max /dev/sda

The next bad LBA reported by SMART is 430235856.

# dd if=/dev/sda skip=430235856 count=1 | hexdump -C
dd: error reading '/dev/sda': Input/output error
0+0 records in
0+0 records out
0 bytes (0 B) copied, 6.97353 s, 0.0 kB/s

# dd if=/dev/zero of=/dev/sda seek=430235856 count=1
dd: writing to '/dev/sda': Input/output error
1+0 records in
0+0 records out
0 bytes (0 B) copied, 3.69641 s, 0.0 kB/s

# dd if=/dev/zero of=/dev/sda seek=430235856 count=8
8+0 records in
8+0 records out
4096 bytes (4.1 kB) copied, 2.50232 s, 1.6 kB/s

# dd if=/dev/sda skip=430235856 count=1 | hexdump -C
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.287629 s, 1.8 kB/s
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000200

### This command with count=8 worked. I don't know why it worked this time when it didn't with the earlier LBA. The read command above, piped through hexdump, that had previously failed now works. Further, when I check SMART attributes, the Current_Pending_Sector count has dropped by a value of 8. That seems conclusive: the bad sector has been remapped.

So I'll keep doing selective offline tests to find bad sectors, write to them this way, and report back.
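(A sketch of that find-and-rewrite loop, not from the original post. The smartctl log parsing is approximate, the ten-minute wait is a guess at the selective test duration, and it assumes the reported LBA is 4 KiB aligned, as all of the LBAs in this thread are:)

#!/bin/bash
# Repeatedly: run a selective self-test from 'start' to the end of the
# disk, scrape LBA_of_first_error out of the self-test log, overwrite
# that physical sector with one aligned 4 KiB write, then resume.
dev=/dev/sda
start=430195536
while :; do
    smartctl -t select,${start}-max $dev
    sleep 600                      # wait for the selective test to finish
    lba=$(smartctl -l selftest $dev | awk '/^# 1/ {print $NF}')
    [ "$lba" = "-" ] && break      # log shows no error: done
    dd if=/dev/zero of=$dev bs=4096 seek=$((lba / 8)) count=1 oflag=direct
    start=$((lba + 8))
done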


Chris Murphy


Chris Murphy
2014-06-07 18:26:55 UTC
OK, the selective offline test is done, and now this is damn strange.

Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Selective offline Completed without error 00% 1212 -

ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 24

How can there still be pending bad sectors, and yet no error and LBA reported?


Chris Murphy
Chris Murphy
2014-06-08 00:52:16 UTC
Post by Chris Murphy
How can there still be pending bad sectors, and yet no error and LBA reported?

So I started another -t long test. And it comes up with an LBA not previously reported.

# 1 Extended offline Completed: read failure 60% 1214 430234064

# dd if=/dev/zero of=/dev/sda seek=430234064 count=8
dd: writing to '/dev/sda': Input/output error
1+0 records in
0+0 records out
0 bytes (0 B) copied, 3.63342 s, 0.0 kB/s

On this sector the technique fails.

# dd if=/dev/zero of=/dev/sda seek=430234064 count=8 oflag=direct
8+0 records in
8+0 records out
4096 bytes (4.1 kB) copied, 3.73824 s, 1.1 kB/s

This technique works.

However, this seems like a contradiction. A complete -t long results in:

# 1 Extended offline Completed without error 00% 1219 -

and yet

197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 16

How are there 16 pending sectors, with no errors found during the extended offline test? In order to fix this without SMART reporting the affected LBAs, I'd have to write to every sector on the drive. This seems like bad design or implementation.
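(Writing every sector is drastic but simple. A destructive sketch, assuming the data has already been copied off the drive; sequential 1 MiB direct writes stay 4 KiB aligned, so no read-modify-write is ever triggered:)

# dd if=/dev/zero of=/dev/sda bs=1M oflag=direct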

Chris Murphy
Roger Heflin
2014-06-08 01:50:43 UTC
Check the messages file and see if it has reported any bad sectors in the last few weeks.

Or do a dd if=/dev/sda of=/dev/null read test until it hits something,
then correct it, then continue on.

Or do repeated long/selective tests to see if you can find them.
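(A sketch of that read scan; the direct I/O and 4 KiB block size are my additions, so that the failing physical sector falls out of dd's record count:)

# dd if=/dev/sda of=/dev/null bs=4096 iflag=direct

When it stops with an I/O error, the "N+0 records in" count identifies the bad 4 KiB block; fix that sector, then resume the scan past it with skip=<N+1>.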

Though, I had a Seagate disk where I was able to get all of the pending
sectors fixed, and I still had to remove the disk from the raid, as it
would randomly pause for 7 seconds while reading sectors that were not
yet classified as pending. I tried a number of things to get the disk
to behave and/or replace those bad sectors, but finally gave up on that
disk and just replaced it (out of warranty) as I could not ever get it
to behave right.
Post by Chris Murphy
How can there still be pending bad sectors, and yet no error and LBA reported?
[...]
How are there 16 pending sectors, with no errors found during the extended offline test? In order to fix this without SMART reporting the affected LBAs, I'd have to write to every sector on the drive. This seems like bad design or implementation.
Chris Murphy
2014-06-08 21:50:03 UTC
Post by Roger Heflin
Check messages file and see if it has in the last few weeks reporting
sectors bad.
No errors, except the Current_Pending_Sector count reported by smartd, which dumps into the journal.
Post by Roger Heflin
Or do a dd if=/dev/sda of=/dev/null read test until it hits something,
then correct it, then continue on.
No errors.
Post by Roger Heflin
Or do repeated long/selective tests to see if you can find them.
No (additional) errors.
Post by Roger Heflin
Though, I had a seagate disk that I was able to get all of the pending
to be fixed, I had to remove the disk from the raid as it still would
randomly pause for 7 seconds while reading sectors that were not yet
classified as pending. I tried a number of things to try to get the
disk to behave and/or replace those bad sectors, but finally gave up
on that disk and just replaced it (out of warranty) as I could not
ever get it to behave right.
I think this drive isn't behaving correctly: it says there are pending sectors, yet it passes the extended self-test.


Chris Murphy

Wilson Jonathan
2014-06-08 08:10:45 UTC
Post by Chris Murphy
So I started another -t long test. And it comes up with an LBA not previously reported.
# 1 Extended offline Completed: read failure 60% 1214 430234064
# dd if=/dev/zero of=/dev/sda seek=430234064 count=8
dd: writing to '/dev/sda': Input/output error
On this sector the technique fails.
# dd if=/dev/zero of=/dev/sda seek=430234064 count=8 oflag=direct
8+0 records in
8+0 records out
4096 bytes (4.1 kB) copied, 3.73824 s, 1.1 kB/s
I may be missing something here, but surely after all this faffing about and errors, isn't it about time to replicate the data to a new drive and then hit this one repeatedly with a very large hammer?

The law of diminishing returns must surely be coming into play by now.
Chris Murphy
2014-06-10 00:09:56 UTC
Post by Wilson Jonathan
I may be missing something here, but surely after all this faffing about
and errors isn't it about time to replicate the data to a new drive and
then hit this one repeatedly with a very large hammer.
The law of diminishing returns must surely be coming into play by now.

No, the question here isn't what's the right course of action from this point. This is an academic question: whether the reported behaviors are as designed.

From an enterprise perspective, my understanding is that even one bad sector is disqualifying, and the drive goes back to the manufacturer if it's under warranty, or is otherwise demoted to less important use if it's not.

For consumer drives, which this is, all the manufacturers will say the drive is functioning as designed with bad sectors *if* they're being reallocated. Maybe some of them won't quibble and will send a replacement drive anyway.

But what I'm reporting is an instance where an ATA Secure Erase definitely did not fix up a single one of the bad sectors. Maybe that's consistent with the spec, I don't know, but it's not what I'd expect, seeing as every sector, with and without an LBA assigned, is overwritten. Yet pending sectors were not remapped. Further, overwriting all sectors by software (not merely with the ATA Secure Erase command) yields no errors, yet SMART reports there are still pending sectors, while its own extended test says there are none. I think that's bad behavior. But perhaps I don't understand the design and it's actually working as designed.


Chris Murphy
Wilson Jonathan
2014-06-10 06:52:30 UTC
[Message not displayed in the archive.]
Phillip Susi
2014-10-08 17:56:51 UTC
Post by Chris Murphy
But what I'm reporting is an instance where an ATA Secure Erase
definitely did not fix up a single one of the bad sectors. Maybe
that's consistent with the spec, I don't know, but it's not what
I'd expect, seeing as every sector, with and without an LBA
assigned, is overwritten. Yet pending sectors were not remapped.
Further, overwriting all sectors by software (not merely with the
ATA Secure Erase command) yields no errors, yet SMART reports there
are still pending sectors, while its own extended test says there
are none. I think that's bad behavior. But perhaps I don't
understand the design and it's actually working as designed.
It sounds like what happened is the secure erase successfully rewrote
the sectors that were already flagged as pending, but did not
decrement the pending count.

FYI, rather than continuing to run a smart selftest to find one
sector, then use dd to fix it, and repeat, it would be much faster to
use the badblocks utility to read and rewrite the whole drive. You
will want to make sure to use the correct sector size, and a
sufficiently large batch size for good performance.
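(A sketch of such a badblocks run; the block size matches the drive's 4 KiB physical sectors and the batch size is illustrative. -n is the non-destructive read-write mode, which restores each block's original contents; -w would be faster but destroys all data:)

# badblocks -b 4096 -c 4096 -n -s /dev/sda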

Wolfgang Denk
2014-06-09 19:37:23 UTC
Dear Chris,
Post by Chris Murphy
# dd if=/dev/zero of=/dev/sda seek=430234064 count=8 oflag=direct
8+0 records in
8+0 records out
4096 bytes (4.1 kB) copied, 3.73824 s, 1.1 kB/s
This has been pointed out before - if this is a 4k sector drive, then
you should really write in units of 4 k, not 8 x 512 bytes as you do
here.

Best regards,

Wolfgang Denk
--
DENX Software Engineering GmbH, MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: ***@denx.de
"Security is mostly a superstition. It does not exist in nature...
Life is either a daring adventure or nothing." - Helen Keller
Chris Murphy
2014-06-10 02:48:33 UTC
Post by Wolfgang Denk
Dear Chris,
# dd if=/dev/zero of=/dev/sda seek=430234064 count=8 oflag=direct
8+0 records in
8+0 records out
4096 bytes (4.1 kB) copied, 3.73824 s, 1.1 kB/s
This has been pointed out before - if this is a 4k sector drive, then
you should really write in units of 4 k, not 8 x 512 bytes as you do
here.
It worked, so why? The drive interface only accepts LBAs based on 512 byte sectors, so bs=512 count=8 is the same as bs=4096 count=1; it has to get translated into 512 byte LBAs regardless. If it were a 4096 byte logical sector drive, I'd agree.

Chris Murphy
Phil Turmel
2014-06-10 13:40:30 UTC
Post by Chris Murphy
Post by Wolfgang Denk
Dear Chris,
# dd if=/dev/zero of=/dev/sda seek=430234064 count=8 oflag=direct
8+0 records in 8+0 records out 4096 bytes (4.1 kB) copied,
3.73824 s, 1.1 kB/s
This has been pointed out before - if this is a 4k sector drive,
then you should really write in units of 4 k, not 8 x 512 bytes as
you do here.
It worked so, why?
Because writing 512 bytes into a 4096 byte physical sector requires a
read-modify-write cycle. That will fail if the physical sector is
unreadable. If you try to overwrite a bad 4k sector with eight 512-byte
writes, each will trigger an RMW, and the 'R' of the RMW will fail for
all eight logical sectors. If you tell dd to use a block size of 4k, a
single write will be created and passed to the drive encompassing all
eight logical sectors at once. So the drive doesn't need an RMW
cycle--a write attempt can be made without the preceding read. Then the
drive has the opportunity to complete its rewrite or remap logic.
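(The practical upshot, sketched against the failing sector from earlier in the thread — eight sub-sector direct writes each trigger the failing read, while a single aligned 4096-byte write does not; 53779258 = 430234064 / 8:)

# dd if=/dev/zero of=/dev/sda bs=512 seek=430234064 count=8 oflag=direct    (fails: each write triggers an RMW read)
# dd if=/dev/zero of=/dev/sda bs=4096 seek=53779258 count=1 oflag=direct    (succeeds: full-sector write, no read needed)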
Post by Chris Murphy
The drive interface only accepts LBAs based on 512 byte sectors, so
bs=512 count=8 is the same as bs=4096 count=1, it has to get
translated into 512 byte LBAs regardless.
The sector address does have to be translated to 512-byte LBAs. That
has nothing to do with the *size* of each write. So *NO*, it is *not*
the same.

"dd" is a terrible tool, except when it is perfect. As a general rule,
if you aren't specifying 'bs=' every time you use it, you've messed up.
And if you specify 'direct', remember that each block sized read or
write issued by dd will have to *complete* through the whole driver
stack before dd will issue the next one.
Post by Chris Murphy
If it were a 4096 byte logical sector drive I'd agree.
You do know that drives are physically incapable of writing partial
sectors? It has to be emulated, either in drive firmware or OS driver
stack. What you've written suggests you've missed that basic reality.
The rest is operator error. Roman and Wolfgang were too polite when
pointing out the need for bs=4096 -- it isn't 'should', it is 'must'.

As for the secure erase, I too am surprised that it didn't take care of
pending errors. But I am *not* surprised that new errors were
discovered shortly after, as pending errors are only ever discovered
when *reading*.

HTH,

Phil
Chris Murphy
2014-06-29 00:05:29 UTC
Post by Phil Turmel
Because writing 512 bytes into a 4096 byte physical sector requires a
read-modify-write cycle. [...] Then the
drive has the opportunity to complete its rewrite or remap logic.
By doing some SCSI command tracing with the kernel, I've learned some things about this. Whether the drive has 512 byte or 4096 byte sectors has no bearing on the actual command issued to the drive. But the use of oflag=direct does change the behavior at the SCSI layer (for both drive types).

http://www.fpaste.org/114087/
[1]
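(For anyone reproducing this, a sketch of capturing such a trace with ftrace; it assumes debugfs is mounted at /sys/kernel/debug:)

# cd /sys/kernel/debug/tracing
# echo 1 > events/scsi/scsi_dispatch_cmd_start/enable
# dd if=/dev/zero of=/dev/sdb bs=4096 count=1 oflag=direct
# cat trace
# echo 0 > events/scsi/scsi_dispatch_cmd_start/enable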

The following commands all produce the same single write command to both types of drives:

# dd if=/dev/zero of=/dev/sdb bs=512 count=8
# dd if=/dev/zero of=/dev/sdb bs=4096 count=1
# dd if=/dev/zero of=/dev/sdb bs=4096 count=1 oflag=direct

The SCSI layer is clearly combining the bs=512 count=8 into a single write command. This is inhibited with oflag=direct.

I also found intermittent issuance of READ_10 to the drive before WRITE_10, but wasn't able to figure out why it's intermittent. Maybe dd issues READ_10 the first time it's going to write to a sector, and it was the READ_10 command triggering the read failure from the drive, preventing the WRITE_10 from even being issued. I can't test this because the drive no longer reports LBAs for any bad sectors.
Post by Phil Turmel
Post by Chris Murphy
The drive interface only accepts LBAs based on 512 byte sectors, so
bs=512 count=8 is the same as bs=4096 count=1, it has to get
translated into 512 byte LBAs regardless.
The sector address does have to be translated to 512-byte LBAs. That
has nothing to do with the *size* of each write. So *NO*, it is *not*
the same.
These two dd commands definitely result in the same write command for the same size (txlen=8) to the drive being issued by the SCSI layer:
# dd if=/dev/zero of=/dev/sdb bs=512 count=8
# dd if=/dev/zero of=/dev/sdb bs=4096 count=1
Post by Phil Turmel
"dd" is a terrible tool, except when it is perfect. As a general rule,
if you aren't specifying 'bs=' every time you use it, you've messed up.
I get the same WRITE_10 command for these two commands:

# dd if=/dev/zero of=/dev/sdb count=8
# dd if=/dev/zero of=/dev/sdb bs=4096 count=1
Post by Phil Turmel
And if you specify 'direct', remember that each block sized read or
write issued by dd will have to *complete* through the whole driver
stack before dd will issue the next one.
That's consistent with the tracing results.
Post by Phil Turmel
Post by Chris Murphy
If it were a 4096 byte logical sector drive I'd agree.
You do know that drives are physically incapable of writing partial
sectors? It has to be emulated, either in drive firmware or OS driver
stack. What you've written suggests you've missed that basic reality.
The rest is operator error. Roman and Wolfgang were too polite when
pointing out the need for bs=4096 -- it isn't 'should', it is 'must'.
That's true for oflag=direct, it's not true without it.

Also included for interest is the result of issuing an hdparm write command. It works without a size specification, so I don't actually know what happens on the drive itself; plus, the command that gets issued to the drive isn't "WRITE_10" but "ATA_16".
Post by Phil Turmel
As for the secure erase, I too am surprised that it didn't take care of
pending errors. But I am *not* surprised that that new errors were
discovered shortly after, as pending errors are only ever discovered
when *reading*.
SMART read the whole drive and said no errors found, even though current pending still reports a non-zero value. I think that is surprising.


Chris Murphy




[1]
Formats better in fpaste after clicking on Wrap. But I'll post the raw data here in case someone looks at this more than a month from now.
512/512 (logical/physical sector size):

# dd if=/dev/zero of=/dev/sdb bs=512 count=8
dd-891 [000] .... 550.352639: scsi_dispatch_cmd_start: host_no=1 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 prot_op=SCSI_PROT_NORMAL cmnd=(WRITE_10 lba=0 txlen=8 protect=0 raw=2a 00 00 00 00 00 00 00 08 00)

# dd if=/dev/zero of=/dev/sdb bs=4096 count=1
dd-894 [000] .... 566.506562: scsi_dispatch_cmd_start: host_no=1 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 prot_op=SCSI_PROT_NORMAL cmnd=(WRITE_10 lba=0 txlen=8 protect=0 raw=2a 00 00 00 00 00 00 00 08 00)

# dd if=/dev/zero of=/dev/sdb bs=512 count=8 oflag=direct
dd-1042 [000] .... 10261.418019: scsi_dispatch_cmd_start: host_no=1 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 prot_op=SCSI_PROT_NORMAL cmnd=(WRITE_10 lba=0 txlen=1 protect=0 raw=2a 00 00 00 00 00 00 00 01 00)
dd-1042 [000] .... 10261.418294: scsi_dispatch_cmd_start: host_no=1 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 prot_op=SCSI_PROT_NORMAL cmnd=(WRITE_10 lba=1 txlen=1 protect=0 raw=2a 00 00 00 00 01 00 00 01 00)
dd-1042 [000] .... 10261.418650: scsi_dispatch_cmd_start: host_no=1 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 prot_op=SCSI_PROT_NORMAL cmnd=(WRITE_10 lba=2 txlen=1 protect=0 raw=2a 00 00 00 00 02 00 00 01 00)
dd-1042 [000] .... 10261.419006: scsi_dispatch_cmd_start: host_no=1 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 prot_op=SCSI_PROT_NORMAL cmnd=(WRITE_10 lba=3 txlen=1 protect=0 raw=2a 00 00 00 00 03 00 00 01 00)
dd-1042 [000] .... 10261.419203: scsi_dispatch_cmd_start: host_no=1 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 prot_op=SCSI_PROT_NORMAL cmnd=(WRITE_10 lba=4 txlen=1 protect=0 raw=2a 00 00 00 00 04 00 00 01 00)
dd-1042 [000] .... 10261.419365: scsi_dispatch_cmd_start: host_no=1 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 prot_op=SCSI_PROT_NORMAL cmnd=(WRITE_10 lba=5 txlen=1 protect=0 raw=2a 00 00 00 00 05 00 00 01 00)
dd-1042 [000] .... 10261.419527: scsi_dispatch_cmd_start: host_no=1 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 prot_op=SCSI_PROT_NORMAL cmnd=(WRITE_10 lba=6 txlen=1 protect=0 raw=2a 00 00 00 00 06 00 00 01 00)
dd-1042 [000] .... 10261.419766: scsi_dispatch_cmd_start: host_no=1 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 prot_op=SCSI_PROT_NORMAL cmnd=(WRITE_10 lba=7 txlen=1 protect=0 raw=2a 00 00 00 00 07 00 00 01 00)

# dd if=/dev/zero of=/dev/sdb bs=4096 count=1 oflag=direct
dd-1045 [001] .... 10337.899923: scsi_dispatch_cmd_start: host_no=1 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 prot_op=SCSI_PROT_NORMAL cmnd=(WRITE_10 lba=0 txlen=8 protect=0 raw=2a 00 00 00 00 00 00 00 08 00)


512/4096 (logical/physical sector size):

# dd if=/dev/zero of=/dev/sdb bs=512 count=8

dd-1814 [002] ...1 530.285126: scsi_dispatch_cmd_start: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 prot_op=SCSI_PROT_NORMAL cmnd=(WRITE_10 lba=431467112 txlen=8 protect=0 raw=2a 00 19 b7 aa 68 00 00 08 00)

# dd if=/dev/zero of=/dev/sdb bs=4096 count=1

dd-1881 [002] ...1 1094.707870: scsi_dispatch_cmd_start: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 prot_op=SCSI_PROT_NORMAL cmnd=(WRITE_10 lba=431467112 txlen=8 protect=0 raw=2a 00 19 b7 aa 68 00 00 08 00)


# dd if=/dev/zero of=/dev/sdb bs=512 count=8 oflag=direct

dd-1890 [003] ...1 1255.136864: scsi_dispatch_cmd_start: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 prot_op=SCSI_PROT_NORMAL cmnd=(WRITE_10 lba=431467112 txlen=1 protect=0 raw=2a 00 19 b7 aa 68 00 00 01 00)
dd-1890 [002] ...1 1255.422802: scsi_dispatch_cmd_start: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 prot_op=SCSI_PROT_NORMAL cmnd=(WRITE_10 lba=431467113 txlen=1 protect=0 raw=2a 00 19 b7 aa 69 00 00 01 00)
dd-1890 [002] ...1 1255.423167: scsi_dispatch_cmd_start: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 prot_op=SCSI_PROT_NORMAL cmnd=(WRITE_10 lba=431467114 txlen=1 protect=0 raw=2a 00 19 b7 aa 6a 00 00 01 00)
dd-1890 [002] ...1 1255.423386: scsi_dispatch_cmd_start: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 prot_op=SCSI_PROT_NORMAL cmnd=(WRITE_10 lba=431467115 txlen=1 protect=0 raw=2a 00 19 b7 aa 6b 00 00 01 00)
dd-1890 [000] ...1 1255.423625: scsi_dispatch_cmd_start: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 prot_op=SCSI_PROT_NORMAL cmnd=(WRITE_10 lba=431467116 txlen=1 protect=0 raw=2a 00 19 b7 aa 6c 00 00 01 00)
dd-1890 [002] ...1 1255.423921: scsi_dispatch_cmd_start: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 prot_op=SCSI_PROT_NORMAL cmnd=(WRITE_10 lba=431467117 txlen=1 protect=0 raw=2a 00 19 b7 aa 6d 00 00 01 00)
dd-1890 [002] ...1 1255.424110: scsi_dispatch_cmd_start: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 prot_op=SCSI_PROT_NORMAL cmnd=(WRITE_10 lba=431467118 txlen=1 protect=0 raw=2a 00 19 b7 aa 6e 00 00 01 00)
dd-1890 [002] ...1 1255.424309: scsi_dispatch_cmd_start: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 prot_op=SCSI_PROT_NORMAL cmnd=(WRITE_10 lba=431467119 txlen=1 protect=0 raw=2a 00 19 b7 aa 6f 00 00 01 00)

# dd if=/dev/zero of=/dev/sdb bs=4096 count=1 oflag=direct

dd-1895 [002] ...1 1388.656777: scsi_dispatch_cmd_start: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 prot_op=SCSI_PROT_NORMAL cmnd=(WRITE_10 lba=431467112 txlen=8 protect=0 raw=2a 00 19 b7 aa 68 00 00 08 00)
Martin K. Petersen
2014-06-29 23:50:16 UTC
Chris,

Chris> The SCSI layer is clearly combining the bs=512 count=8 into a
Chris> single write command. This is inhibited with oflag=direct.

It's not really the SCSI layer that does any of this but the VM and/or
the I/O scheduler (depending on how things were submitted).

Chris> I also found intermittent issuance of READ_10 to the drive,
Chris> before WRITE_10, but wasn't able to figure out why it's
Chris> intermittant.

It's either the page cache doing readahead or you doing partial writes
to uncached pages.

You can flush the page cache like this:

echo 3 > /proc/sys/vm/drop_caches
Post by Phil Turmel
You do know that drives are physically incapable of writing partial
sectors? It has to be emulated, either in drive firmware or OS
driver stack. What you've written suggests you've missed that basic
reality. The rest is operator error. Roman and Wolfgang were too
polite when pointing out the need for bs=4096 -- it isn't 'should',
it is 'must'.
Chris> That's true for oflag=direct, it's not true without it.

Correct.

In general, a buffered write() call in dd or any other userland app does
not have a 1:1 mapping with a SCSI WRITE command at the bottom of the
stack. The pages in question will simply be marked dirty and eventually
flushed to disk.

You can force a more block-centric behavior by using synchronous/direct
I/O.

Chris> Also included for interest is the result of issuing an hdparm write
Chris> command. It works without a size specification, so I don't
Chris> actually know what happens on the drive itself, plus the command
Chris> that gets issued to the drive isn't "WRITE_10" but "ATA_16".

That's because the ATA command gets encapsulated in a SCSI command so it
can pass through the SCSI layer.
--
Martin K. Petersen Oracle Linux Engineering
Roger Heflin
2014-06-30 00:51:42 UTC
All of this is probably the reason that this command exists:

hdparm --write-sector <sectornum>

I believe it sends the commands directly through the scsi/ata layers.

On Sun, Jun 29, 2014 at 6:50 PM, Martin K. Petersen wrote:
Post by Martin K. Petersen
That's because the ATA command gets encapsulated in a SCSI command so it
can pass through the SCSI layer.
Phillip Susi
2014-10-08 17:51:33 UTC
Post by Roger Heflin
hdparm --write-sector <sectornum>
I believe it directly sends the scsi/ata layer commands.
You end up with the same results as using dd (with oflag=direct); it
is just a matter of the path it takes to get there.

With dd, it calls write() to pass the data to the block layer, which
hands it to the scsi layer, which translates it into a scsi
WRITE_10/16 command, which hands it to libata which translates it into
an ata taskfile to be handed to the drive.

With hdparm --write-sector, it builds the ata taskfile, uses the SG_IO
ioctl to hand it to the block layer, which hands it down through the
scsi and libata layers which see that it needs no translation and it
goes to the drive unmodified.

The resulting taskfile the drive actually sees should be the same.


Eyal Lebedinsky
2014-06-10 22:18:39 UTC
Related, while not exactly on-topic: is there a way to list all the pending sectors (rather than just the first one failing during the extended test)? And the list of bad sectors?

I am asking about the lists kept by the disk, not the logical list kept by software raid.


TIA
Post by Chris Murphy
Summary: Hitachi/HGST Travelstar 5K750. smartctl will not complete an extended offline test; it stops with 60% remaining, reporting the LBA of the first error. [...]
Anyway I find this confusing and unexpected.
--
Eyal Lebedinsky (***@eyal.emu.id.au)