Discussion:
/sys/block/md126 still exists even after stopping the array
Francis Moreau
2014-06-24 15:38:30 UTC
Permalink
Hello,

I'm having the folloing behaviour with kernel 3.14.5 and mdadm v3.3.1.

After stopping all arrays, I still can see one of them in /sys/block/:

# cat /proc/mdstat
Personalities : [raid1]
md125 : active raid1 sdb3[1] sda3[0]
483688448 blocks super 1.2 [2/2] [UU]
[======>..............] resync = 34.9% (169161280/483688448)
finish=44.0min speed=118852K/sec
bitmap: 3/4 pages [12KB], 65536KB chunk

md126 : active raid1 sdb2[1] sda2[0]
4038656 blocks super 1.2 [2/2] [UU]

md127 : active raid1 sdb1[1] sda1[0]
524224 blocks super 1.0 [2/2] [UU]

unused devices: <none>

# mdadm --stop /dev/md12[567]
mdadm: stopped /dev/md125
mdadm: stopped /dev/md126
mdadm: stopped /dev/md127

# cat /proc/mdstat
Personalities : [raid1]
unused devices: <none>

# ls /sys/block/
md126 sda sdb sdc sr0

# ls /sys/block/md126/md/
array_size array_state bitmap chunk_size component_size layout
level max_read_errors metadata_version new_dev raid_disks
reshape_direction reshape_position resync_start safe_mode_delay

# dmesg
....
[ 1573.715476] md125: detected capacity change from 495296970752 to 0
[ 1573.715626] md: md125 stopped.
[ 1573.715633] md: unbind<sdb3>
[ 1573.740681] md: export_rdev(sdb3)
[ 1573.740694] md: unbind<sda3>
[ 1573.754008] md: export_rdev(sda3)
[ 1573.773398] md126: detected capacity change from 4135583744 to 0
[ 1573.773403] md: md126 stopped.
[ 1573.773410] md: unbind<sdb2>
[ 1573.820652] md: export_rdev(sdb2)
[ 1573.820664] md: unbind<sda2>
[ 1573.873974] md: export_rdev(sda2)
[ 1573.889904] md127: detected capacity change from 536805376 to 0
[ 1573.889910] md: md127 stopped.
[ 1573.889917] md: unbind<sdb1>
[ 1573.913978] md: export_rdev(sdb1)
[ 1573.914033] md: unbind<sda1>
[ 1573.940627] md: export_rdev(sda1)

After waiting a couple of min, stopping again md126 worked:

[ 1835.755661] md: md126 stopped.

Is this expected ?

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
NeilBrown
2014-06-25 01:03:48 UTC
Permalink
Post by Francis Moreau
Hello,
I'm having the folloing behaviour with kernel 3.14.5 and mdadm v3.3.1.
# cat /proc/mdstat
Personalities : [raid1]
md125 : active raid1 sdb3[1] sda3[0]
483688448 blocks super 1.2 [2/2] [UU]
[======>..............] resync = 34.9% (169161280/483688448)
finish=44.0min speed=118852K/sec
bitmap: 3/4 pages [12KB], 65536KB chunk
md126 : active raid1 sdb2[1] sda2[0]
4038656 blocks super 1.2 [2/2] [UU]
md127 : active raid1 sdb1[1] sda1[0]
524224 blocks super 1.0 [2/2] [UU]
unused devices: <none>
# mdadm --stop /dev/md12[567]
mdadm: stopped /dev/md125
mdadm: stopped /dev/md126
mdadm: stopped /dev/md127
# cat /proc/mdstat
Personalities : [raid1]
unused devices: <none>
# ls /sys/block/
md126 sda sdb sdc sr0
# ls /sys/block/md126/md/
array_size array_state bitmap chunk_size component_size layout
level max_read_errors metadata_version new_dev raid_disks
reshape_direction reshape_position resync_start safe_mode_delay
# dmesg
....
[ 1573.715476] md125: detected capacity change from 495296970752 to 0
[ 1573.715626] md: md125 stopped.
[ 1573.715633] md: unbind<sdb3>
[ 1573.740681] md: export_rdev(sdb3)
[ 1573.740694] md: unbind<sda3>
[ 1573.754008] md: export_rdev(sda3)
[ 1573.773398] md126: detected capacity change from 4135583744 to 0
[ 1573.773403] md: md126 stopped.
[ 1573.773410] md: unbind<sdb2>
[ 1573.820652] md: export_rdev(sdb2)
[ 1573.820664] md: unbind<sda2>
[ 1573.873974] md: export_rdev(sda2)
[ 1573.889904] md127: detected capacity change from 536805376 to 0
[ 1573.889910] md: md127 stopped.
[ 1573.889917] md: unbind<sdb1>
[ 1573.913978] md: export_rdev(sdb1)
[ 1573.914033] md: unbind<sda1>
[ 1573.940627] md: export_rdev(sda1)
[ 1835.755661] md: md126 stopped.
Is this expected ?
No overly surprising.

This is probably caused by udev, or something udev runs, opening /dev/md126
after it has been stopped. This has the effect of creating an empty inactive
array.
e.g.

mknod /dev/test b 9 57
< /dev/test

will make /sys/block/md57 appear.

It is a bit untidy, but shouldn't cause problems.

NeilBrown
Francis Moreau
2014-06-25 06:59:00 UTC
Permalink
Hello Neil,

On 06/25/2014 03:03 AM, NeilBrown wrote:
[...]
Post by NeilBrown
Post by Francis Moreau
[ 1835.755661] md: md126 stopped.
Is this expected ?
No overly surprising.
This is probably caused by udev, or something udev runs, opening /dev/md126
after it has been stopped. This has the effect of creating an empty inactive
array.
e.g.
Why opening /dev/md126 after it's been stopped didn't fail ?

Also why doing 'mdadm --stop /dev/md126' several times reports that the
array was stopped each time ?
Post by NeilBrown
mknod /dev/test b 9 57
< /dev/test
will make /sys/block/md57 appear.
It is a bit untidy, but shouldn't cause problems.
Another 'inconsistent' state I noticed is when a new array is created.
There's a time window when /sys//block/md127/md exists but the contained
disk (dev-sda1 dev-sdb1, for example) are not present in that directory.
This state (as the previous one) confuses my tool that is listening udev
events in order to register all block devices present in the system.

So I am wondering how I should deal with this 'transition' states and
how can detect them in a reliable way ?

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Sebastian Parschauer
2014-07-24 13:40:22 UTC
Permalink
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On Tue, 24 Jun 2014 17:38:30 +0200 Francis Moreau
Post by Francis Moreau
Hello,
I'm having the folloing behaviour with kernel 3.14.5 and mdadm v3.3.1.
After stopping all arrays, I still can see one of them in
# cat /proc/mdstat Personalities : [raid1] md125 : active raid1
sdb3[1] sda3[0] 483688448 blocks super 1.2 [2/2] [UU]
[======>..............] resync = 34.9% (169161280/483688448)
finish=44.0min speed=118852K/sec bitmap: 3/4 pages [12KB],
65536KB chunk
md126 : active raid1 sdb2[1] sda2[0] 4038656 blocks super 1.2
[2/2] [UU]
md127 : active raid1 sdb1[1] sda1[0] 524224 blocks super 1.0
[2/2] [UU]
unused devices: <none>
stopped /dev/md126 mdadm: stopped /dev/md127
<none>
# ls /sys/block/ md126 sda sdb sdc sr0
# ls /sys/block/md126/md/ array_size array_state bitmap
chunk_size component_size layout level max_read_errors
metadata_version new_dev raid_disks reshape_direction
reshape_position resync_start safe_mode_delay
# dmesg .... [ 1573.715476] md125: detected capacity change from
495296970752 to 0 [ 1573.715626] md: md125 stopped. [
export_rdev(sdb3) [ 1573.740694] md: unbind<sda3> [ 1573.754008]
md: export_rdev(sda3) [ 1573.773398] md126: detected capacity
change from 4135583744 to 0 [ 1573.773403] md: md126 stopped. [
export_rdev(sdb2) [ 1573.820664] md: unbind<sda2> [ 1573.873974]
md: export_rdev(sda2) [ 1573.889904] md127: detected capacity
change from 536805376 to 0 [ 1573.889910] md: md127 stopped. [
export_rdev(sdb1) [ 1573.914033] md: unbind<sda1> [ 1573.940627]
md: export_rdev(sda1)
[ 1835.755661] md: md126 stopped.
Is this expected ?
No overly surprising.
This is probably caused by udev, or something udev runs, opening
/dev/md126 after it has been stopped. This has the effect of
creating an empty inactive array. e.g.
mknod /dev/test b 9 57 < /dev/test
will make /sys/block/md57 appear.
It is a bit untidy, but shouldn't cause problems.
NeilBrown
Hi Neil,

we are also noticing this issue with udev-215 on Gentoo. They've
switched over to systemd. Deactivating all udev rules doesn't help. So
it must be an issue in the code. udev-204 works fine.
When stopping udev, then it is possible to stop the MD array and it
remains stopped (/sys/block/mdX disappears). Looks like they do
something really strange there.

Do you have any advice how to fix this in another way but to change
the kernel or to downgrade udev?

File a systemd bug?

Cheers,
Sebastian
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJT0QzGAAoJEH4DRb7WXajZXooIALBi2HzzXL0aLx6aKf3EHIRb
/PmRFhGdHZlGcETbgYflbuFYz30rLs3AYfOm+/PpGNpatppxuKkEddEq/2Lp/wJn
2cQt6cd2F6fOdgMq9MK3tM2nnj4HRPpqpq5huvnvijTYsPXRYvepjQUxFtd7/G/L
z/41cRGeeLOhlwfoUeWIwm1XKsetLgUl8VyKmxxtBRwCA5UZjfdE9u1X7TUxxA7d
ZRtASftmtXqjtxLHRzV+0LRSDgBBAYLzrJTlrgPf5oN+rW/NtjNxld5V0oqynrsW
WawC3ntc+z79tPtfHSLV1icshhmkfhcQzWjOcmW4RypFF/7h+3zTBBi262FEiGM=
=CvmT
-----END PGP SIGNATURE-----
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Artur Paszkiewicz
2014-07-24 13:51:22 UTC
Permalink
Post by Sebastian Parschauer
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On Tue, 24 Jun 2014 17:38:30 +0200 Francis Moreau
Post by Francis Moreau
Hello,
I'm having the folloing behaviour with kernel 3.14.5 and mdadm v3.3.1.
After stopping all arrays, I still can see one of them in
# cat /proc/mdstat Personalities : [raid1] md125 : active raid1
sdb3[1] sda3[0] 483688448 blocks super 1.2 [2/2] [UU]
[======>..............] resync = 34.9% (169161280/483688448)
finish=44.0min speed=118852K/sec bitmap: 3/4 pages [12KB],
65536KB chunk
md126 : active raid1 sdb2[1] sda2[0] 4038656 blocks super 1.2
[2/2] [UU]
md127 : active raid1 sdb1[1] sda1[0] 524224 blocks super 1.0
[2/2] [UU]
unused devices: <none>
stopped /dev/md126 mdadm: stopped /dev/md127
<none>
# ls /sys/block/ md126 sda sdb sdc sr0
# ls /sys/block/md126/md/ array_size array_state bitmap
chunk_size component_size layout level max_read_errors
metadata_version new_dev raid_disks reshape_direction
reshape_position resync_start safe_mode_delay
# dmesg .... [ 1573.715476] md125: detected capacity change from
495296970752 to 0 [ 1573.715626] md: md125 stopped. [
export_rdev(sdb3) [ 1573.740694] md: unbind<sda3> [ 1573.754008]
md: export_rdev(sda3) [ 1573.773398] md126: detected capacity
change from 4135583744 to 0 [ 1573.773403] md: md126 stopped. [
export_rdev(sdb2) [ 1573.820664] md: unbind<sda2> [ 1573.873974]
md: export_rdev(sda2) [ 1573.889904] md127: detected capacity
change from 536805376 to 0 [ 1573.889910] md: md127 stopped. [
export_rdev(sdb1) [ 1573.914033] md: unbind<sda1> [ 1573.940627]
md: export_rdev(sda1)
[ 1835.755661] md: md126 stopped.
Is this expected ?
No overly surprising.
This is probably caused by udev, or something udev runs, opening
/dev/md126 after it has been stopped. This has the effect of
creating an empty inactive array. e.g.
mknod /dev/test b 9 57 < /dev/test
will make /sys/block/md57 appear.
It is a bit untidy, but shouldn't cause problems.
NeilBrown
Hi Neil,
we are also noticing this issue with udev-215 on Gentoo. They've
switched over to systemd. Deactivating all udev rules doesn't help. So
it must be an issue in the code. udev-204 works fine.
When stopping udev, then it is possible to stop the MD array and it
remains stopped (/sys/block/mdX disappears). Looks like they do
something really strange there.
Do you have any advice how to fix this in another way but to change
the kernel or to downgrade udev?
File a systemd bug?
Hi,

I have also noticed this issue. It is caused by this change in udev:

commit 3ebdb81ef088afd3b4c72b516beb5610f8c93a0d
Author: Kay Sievers <kay at vrfy.org>
Date: Sun Apr 13 19:54:27 2014 -0700

udev: serialize/synchronize block device event handling with file locks

http://cgit.freedesktop.org/systemd/systemd/commit/?id=3ebdb81ef088afd3b4c72b516beb5610f8c93a0d

It seems that they have already disabled this for dm for some reason,
but not for md:

commit e918a1b5a94f270186dca59156354acd2a596494
Author: Kay Sievers <***@vrfy.org>
Date: Tue Jun 3 16:49:38 2014 +0200

udev: exclude device-mapper from block device ownership event locking

http://cgit.freedesktop.org/systemd/systemd/commit/?id=e918a1b5a94f270186dca59156354acd2a596494


Artur
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Francis Moreau
2014-09-25 16:12:07 UTC
Permalink
Hello,
Post by NeilBrown
Post by Francis Moreau
Hello,
I'm having the folloing behaviour with kernel 3.14.5 and mdadm v3.3.1.
# cat /proc/mdstat
Personalities : [raid1]
md125 : active raid1 sdb3[1] sda3[0]
483688448 blocks super 1.2 [2/2] [UU]
[======>..............] resync = 34.9% (169161280/483688448)
finish=44.0min speed=118852K/sec
bitmap: 3/4 pages [12KB], 65536KB chunk
md126 : active raid1 sdb2[1] sda2[0]
4038656 blocks super 1.2 [2/2] [UU]
md127 : active raid1 sdb1[1] sda1[0]
524224 blocks super 1.0 [2/2] [UU]
unused devices: <none>
# mdadm --stop /dev/md12[567]
mdadm: stopped /dev/md125
mdadm: stopped /dev/md126
mdadm: stopped /dev/md127
# cat /proc/mdstat
Personalities : [raid1]
unused devices: <none>
# ls /sys/block/
md126 sda sdb sdc sr0
# ls /sys/block/md126/md/
array_size array_state bitmap chunk_size component_size layout
level max_read_errors metadata_version new_dev raid_disks
reshape_direction reshape_position resync_start safe_mode_delay
# dmesg
....
[ 1573.715476] md125: detected capacity change from 495296970752 to 0
[ 1573.715626] md: md125 stopped.
[ 1573.715633] md: unbind<sdb3>
[ 1573.740681] md: export_rdev(sdb3)
[ 1573.740694] md: unbind<sda3>
[ 1573.754008] md: export_rdev(sda3)
[ 1573.773398] md126: detected capacity change from 4135583744 to 0
[ 1573.773403] md: md126 stopped.
[ 1573.773410] md: unbind<sdb2>
[ 1573.820652] md: export_rdev(sdb2)
[ 1573.820664] md: unbind<sda2>
[ 1573.873974] md: export_rdev(sda2)
[ 1573.889904] md127: detected capacity change from 536805376 to 0
[ 1573.889910] md: md127 stopped.
[ 1573.889917] md: unbind<sdb1>
[ 1573.913978] md: export_rdev(sdb1)
[ 1573.914033] md: unbind<sda1>
[ 1573.940627] md: export_rdev(sda1)
[ 1835.755661] md: md126 stopped.
Is this expected ?
No overly surprising.
This is probably caused by udev, or something udev runs, opening /dev/md126
after it has been stopped. This has the effect of creating an empty inactive
array.
e.g.
Sorry for resurecting this again, but I'm still seeing this.

Sebastian saw the same behaviour with udev 215 and it appeared that this
specific version introduced a regression which resulted in the same
behavior I described initialy.

But in my case, the version of udev used is older (204).

I tried to find out what could have opened the md device by using fuser,
but fuser reports no users.

I took a look to the udev rules which are the one shipped by mdadm 3.3.2
but nothing keep the device opened during the remove event.

Could you give me some hints here to debug this ?

Thanks


--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
NeilBrown
2014-09-26 00:33:48 UTC
Permalink
Post by Francis Moreau
Hello,
Post by NeilBrown
Post by Francis Moreau
Hello,
I'm having the folloing behaviour with kernel 3.14.5 and mdadm v3.3.1.
# cat /proc/mdstat
Personalities : [raid1]
md125 : active raid1 sdb3[1] sda3[0]
483688448 blocks super 1.2 [2/2] [UU]
[======>..............] resync = 34.9% (169161280/483688448)
finish=44.0min speed=118852K/sec
bitmap: 3/4 pages [12KB], 65536KB chunk
md126 : active raid1 sdb2[1] sda2[0]
4038656 blocks super 1.2 [2/2] [UU]
md127 : active raid1 sdb1[1] sda1[0]
524224 blocks super 1.0 [2/2] [UU]
unused devices: <none>
# mdadm --stop /dev/md12[567]
mdadm: stopped /dev/md125
mdadm: stopped /dev/md126
mdadm: stopped /dev/md127
# cat /proc/mdstat
Personalities : [raid1]
unused devices: <none>
# ls /sys/block/
md126 sda sdb sdc sr0
# ls /sys/block/md126/md/
array_size array_state bitmap chunk_size component_size layout
level max_read_errors metadata_version new_dev raid_disks
reshape_direction reshape_position resync_start safe_mode_delay
# dmesg
....
[ 1573.715476] md125: detected capacity change from 495296970752 to 0
[ 1573.715626] md: md125 stopped.
[ 1573.715633] md: unbind<sdb3>
[ 1573.740681] md: export_rdev(sdb3)
[ 1573.740694] md: unbind<sda3>
[ 1573.754008] md: export_rdev(sda3)
[ 1573.773398] md126: detected capacity change from 4135583744 to 0
[ 1573.773403] md: md126 stopped.
[ 1573.773410] md: unbind<sdb2>
[ 1573.820652] md: export_rdev(sdb2)
[ 1573.820664] md: unbind<sda2>
[ 1573.873974] md: export_rdev(sda2)
[ 1573.889904] md127: detected capacity change from 536805376 to 0
[ 1573.889910] md: md127 stopped.
[ 1573.889917] md: unbind<sdb1>
[ 1573.913978] md: export_rdev(sdb1)
[ 1573.914033] md: unbind<sda1>
[ 1573.940627] md: export_rdev(sda1)
[ 1835.755661] md: md126 stopped.
Is this expected ?
No overly surprising.
This is probably caused by udev, or something udev runs, opening /dev/md126
after it has been stopped. This has the effect of creating an empty inactive
array.
e.g.
Sorry for resurecting this again, but I'm still seeing this.
Sebastian saw the same behaviour with udev 215 and it appeared that this
specific version introduced a regression which resulted in the same
behavior I described initialy.
But in my case, the version of udev used is older (204).
I tried to find out what could have opened the md device by using fuser,
but fuser reports no users.
It is probably a transient open/close.
Post by Francis Moreau
I took a look to the udev rules which are the one shipped by mdadm 3.3.2
but nothing keep the device opened during the remove event.
Could you give me some hints here to debug this ?
Modify md_open in drivers/md/md.c to add
printk("Opened by %s\n", current->comm);

and build a new kernel. That will tell you the name of the process which
opened the device.

NeilBrown
Post by Francis Moreau
Thanks
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
More majordomo info at http://vger.kernel.org/majordomo-info.html
Francis Moreau
2014-09-26 10:23:27 UTC
Permalink
Hello Neil,
[...]
Post by NeilBrown
Post by Francis Moreau
I tried to find out what could have opened the md device by using fuser,
but fuser reports no users.
It is probably a transient open/close.
If it's open/close wouldn't the 'close' part make the device disapear ?
Post by NeilBrown
Post by Francis Moreau
I took a look to the udev rules which are the one shipped by mdadm 3.3.2
but nothing keep the device opened during the remove event.
Could you give me some hints here to debug this ?
Modify md_open in drivers/md/md.c to add
printk("Opened by %s\n", current->comm);
and build a new kernel. That will tell you the name of the process which
opened the device.
I did that I also added a trace in md_release() but strangely no trace
were outputed from there.

Here's the details of what I did:

--- %< ---
[***@localhost ~]# cat /proc/mdstat
Personalities : [raid1]
md125 : active raid1 vdc1[1] vdb1[0]
65472 blocks super 1.0 [2/2] [UU]

md126 : active raid1 vdc2[1] vdb2[0]
209536 blocks super 1.2 [2/2] [UU]

md127 : active raid1 vdb3[0] vdc3[1]
1819584 blocks super 1.2 [2/2] [UU]

unused devices: <none>

[***@localhost ~]# mdadm --stop --scan

[***@localhost ~]# dmesg | grep md_
[ 1.474207] md_open(): opened by mdadm
[ 1.475316] md_open(): opened by mdadm
[ 1.492880] md_open(): opened by mdadm
[ 1.493201] md_open(): opened by mdadm
[ 1.494690] md_open(): opened by mdadm
[ 1.499369] md_open(): opened by mdadm
[ 1.533566] md_open(): opened by mdadm
[ 1.533697] md_open(): opened by mdadm
[ 1.554419] md_open(): opened by mdadm
[ 1.574451] md_open(): opened by mdadm
[ 1.574666] md_open(): opened by mdadm
[ 1.574877] md_open(): opened by mdadm
[ 1.576822] md_open(): opened by systemd-udevd
[ 1.576895] md_open(): opened by systemd-udevd
[ 1.577029] md_open(): opened by systemd-udevd
[ 1.581850] md_open(): opened by mdadm
[ 1.584054] md_open(): opened by systemd-udevd
[ 1.584770] md_open(): opened by mdadm
[ 1.585175] md_open(): opened by mdadm
[ 1.586328] md_open(): opened by systemd-udevd
[ 1.586933] md_open(): opened by systemd-udevd
[ 1.651265] md_open(): opened by mdadm
[ 1.651320] md_open(): opened by mdadm
[ 1.651364] md_open(): opened by mdadm
[ 1.651437] md_open(): opened by mdadm
[ 1.652376] md_open(): opened by mdadm
[ 1.652452] md_open(): opened by mdadm
[ 33.486704] md_open(): opened by mdadm
[ 33.489259] md_open(): opened by mdadm
[ 33.491000] md_open(): opened by mdadm
[ 33.491767] md_open(): opened by systemd-udevd
[ 33.692255] md_open(): opened by mdadm
[ 33.692288] md_open(): opened by mdadm
[ 33.692606] md_open(): opened by mdadm
[ 33.692858] md_open(): opened by mdadm
[ 33.692942] md_open(): opened by mdadm
[ 33.693237] md_open(): opened by mdadm
[ 33.694254] md_open(): opened by mdadm
[ 33.694275] md_open(): opened by mdadm
[ 33.694373] md_open(): opened by mdadm
[ 33.695558] md_open(): opened by mdadm
[ 33.695679] md_open(): opened by mdadm
[ 33.695855] md_open(): opened by mdadm
[ 33.695894] md_open(): opened by mdadm

[***@localhost ~]# ls /dev/md125
/dev/md125

[***@localhost ~]# fuser /dev/md125

[***@localhost ~]# ps aux | grep "mdadm\|systemd-udevd"
root 366 0.0 0.1 38172 1696 ? Ss 06:04 0:00
/usr/lib/systemd/systemd-udevd
root 465 0.0 0.0 4964 924 ? Ss 06:04 0:00
/sbin/mdadm --monitor --scan --daemonise --syslog
--pid-file=/run/mdadm/mdadm.pid

[***@localhost ~]# ls -l /proc/366/fd/
total 0
lrwx------ 1 root root 64 Sep 26 06:04 0 -> /dev/null
lrwx------ 1 root root 64 Sep 26 06:04 1 -> /dev/null
lrwx------ 1 root root 64 Sep 26 06:04 10 -> socket:[8665]
lr-x------ 1 root root 64 Sep 26 06:04 11 -> /etc/udev/hwdb.bin
lrwx------ 1 root root 64 Sep 26 06:04 12 -> anon_inode:[eventpoll]
lrwx------ 1 root root 64 Sep 26 06:04 2 -> /dev/null
lrwx------ 1 root root 64 Sep 26 06:04 3 -> socket:[8144]
lrwx------ 1 root root 64 Sep 26 06:04 4 -> socket:[8103]
lrwx------ 1 root root 64 Sep 26 06:04 5 -> socket:[8660]
lrwx------ 1 root root 64 Sep 26 06:04 6 -> /run/udev/queue.bin
lr-x------ 1 root root 64 Sep 26 06:04 7 -> anon_inode:inotify
lrwx------ 1 root root 64 Sep 26 06:04 8 -> anon_inode:[signalfd]
lrwx------ 1 root root 64 Sep 26 06:04 9 -> socket:[8664]

[***@localhost ~]# ls -l /proc/465/fd/
total 0
lrwx------ 1 root root 64 Sep 26 06:04 0 -> /dev/null
lrwx------ 1 root root 64 Sep 26 06:04 1 -> /dev/null
lrwx------ 1 root root 64 Sep 26 06:04 2 -> /dev/null
lr-x------ 1 root root 64 Sep 26 06:06 4 -> /proc/mdstat
lrwx------ 1 root root 64 Sep 26 06:06 5 -> socket:[10038]

[***@localhost ~]# cat /proc/mdstat
Personalities : [raid1]
unused devices: <none>

[***@localhost ~]# ls /sys/block/md125/md/
array_size array_state bitmap/ chunk_size component_size layout
level max_read_errors metadata_version new_dev raid_disks
reshape_direction reshape_position resync_start safe_mode_delay

--- >% ---

So in my understanding, only mdadm and udevd are opening the MD devices
and mdamd was the last to open the device. For some unknown reasons,
md_release() is never called.

This happens with:

- kernel 3.14.19
- mdadm 3.3.2
- systemd 208

Can you see something wrong here ?

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
NeilBrown
2014-09-26 10:44:45 UTC
Permalink
Post by Francis Moreau
Hello Neil,
[...]
Post by NeilBrown
Post by Francis Moreau
I tried to find out what could have opened the md device by using fuser,
but fuser reports no users.
It is probably a transient open/close.
If it's open/close wouldn't the 'close' part make the device disapear ?
No. It's ... complicated.
Post by Francis Moreau
Post by NeilBrown
Post by Francis Moreau
I took a look to the udev rules which are the one shipped by mdadm 3.3.2
but nothing keep the device opened during the remove event.
Could you give me some hints here to debug this ?
Modify md_open in drivers/md/md.c to add
printk("Opened by %s\n", current->comm);
and build a new kernel. That will tell you the name of the process which
opened the device.
I did that I also added a trace in md_release() but strangely no trace
were outputed from there.
Without seeing your patch I can't guess what it happening, but I am *certain*
that md_release() would get called providing md_open didn't return an error.

It might be helpful to print out the pid and the md device number too
task_tgid_vnr(current)
will give you the pid.
mdname(mddev)
give the name of the device.

Probably there is a 'change' event happening just before the 'remove' event,
and udev runs "mdadm" on the 'change' event, and that ends up happening after
the device has been removed.

Is this really a problem? Can't you just ignore it and pretend it isn't
there?

NeilBrown
Post by Francis Moreau
--- %< ---
Personalities : [raid1]
md125 : active raid1 vdc1[1] vdb1[0]
65472 blocks super 1.0 [2/2] [UU]
md126 : active raid1 vdc2[1] vdb2[0]
209536 blocks super 1.2 [2/2] [UU]
md127 : active raid1 vdb3[0] vdc3[1]
1819584 blocks super 1.2 [2/2] [UU]
unused devices: <none>
[ 1.474207] md_open(): opened by mdadm
[ 1.475316] md_open(): opened by mdadm
[ 1.492880] md_open(): opened by mdadm
[ 1.493201] md_open(): opened by mdadm
[ 1.494690] md_open(): opened by mdadm
[ 1.499369] md_open(): opened by mdadm
[ 1.533566] md_open(): opened by mdadm
[ 1.533697] md_open(): opened by mdadm
[ 1.554419] md_open(): opened by mdadm
[ 1.574451] md_open(): opened by mdadm
[ 1.574666] md_open(): opened by mdadm
[ 1.574877] md_open(): opened by mdadm
[ 1.576822] md_open(): opened by systemd-udevd
[ 1.576895] md_open(): opened by systemd-udevd
[ 1.577029] md_open(): opened by systemd-udevd
[ 1.581850] md_open(): opened by mdadm
[ 1.584054] md_open(): opened by systemd-udevd
[ 1.584770] md_open(): opened by mdadm
[ 1.585175] md_open(): opened by mdadm
[ 1.586328] md_open(): opened by systemd-udevd
[ 1.586933] md_open(): opened by systemd-udevd
[ 1.651265] md_open(): opened by mdadm
[ 1.651320] md_open(): opened by mdadm
[ 1.651364] md_open(): opened by mdadm
[ 1.651437] md_open(): opened by mdadm
[ 1.652376] md_open(): opened by mdadm
[ 1.652452] md_open(): opened by mdadm
[ 33.486704] md_open(): opened by mdadm
[ 33.489259] md_open(): opened by mdadm
[ 33.491000] md_open(): opened by mdadm
[ 33.491767] md_open(): opened by systemd-udevd
[ 33.692255] md_open(): opened by mdadm
[ 33.692288] md_open(): opened by mdadm
[ 33.692606] md_open(): opened by mdadm
[ 33.692858] md_open(): opened by mdadm
[ 33.692942] md_open(): opened by mdadm
[ 33.693237] md_open(): opened by mdadm
[ 33.694254] md_open(): opened by mdadm
[ 33.694275] md_open(): opened by mdadm
[ 33.694373] md_open(): opened by mdadm
[ 33.695558] md_open(): opened by mdadm
[ 33.695679] md_open(): opened by mdadm
[ 33.695855] md_open(): opened by mdadm
[ 33.695894] md_open(): opened by mdadm
/dev/md125
root 366 0.0 0.1 38172 1696 ? Ss 06:04 0:00
/usr/lib/systemd/systemd-udevd
root 465 0.0 0.0 4964 924 ? Ss 06:04 0:00
/sbin/mdadm --monitor --scan --daemonise --syslog
--pid-file=/run/mdadm/mdadm.pid
total 0
lrwx------ 1 root root 64 Sep 26 06:04 0 -> /dev/null
lrwx------ 1 root root 64 Sep 26 06:04 1 -> /dev/null
lrwx------ 1 root root 64 Sep 26 06:04 10 -> socket:[8665]
lr-x------ 1 root root 64 Sep 26 06:04 11 -> /etc/udev/hwdb.bin
lrwx------ 1 root root 64 Sep 26 06:04 12 -> anon_inode:[eventpoll]
lrwx------ 1 root root 64 Sep 26 06:04 2 -> /dev/null
lrwx------ 1 root root 64 Sep 26 06:04 3 -> socket:[8144]
lrwx------ 1 root root 64 Sep 26 06:04 4 -> socket:[8103]
lrwx------ 1 root root 64 Sep 26 06:04 5 -> socket:[8660]
lrwx------ 1 root root 64 Sep 26 06:04 6 -> /run/udev/queue.bin
lr-x------ 1 root root 64 Sep 26 06:04 7 -> anon_inode:inotify
lrwx------ 1 root root 64 Sep 26 06:04 8 -> anon_inode:[signalfd]
lrwx------ 1 root root 64 Sep 26 06:04 9 -> socket:[8664]
total 0
lrwx------ 1 root root 64 Sep 26 06:04 0 -> /dev/null
lrwx------ 1 root root 64 Sep 26 06:04 1 -> /dev/null
lrwx------ 1 root root 64 Sep 26 06:04 2 -> /dev/null
lr-x------ 1 root root 64 Sep 26 06:06 4 -> /proc/mdstat
lrwx------ 1 root root 64 Sep 26 06:06 5 -> socket:[10038]
Personalities : [raid1]
unused devices: <none>
array_size array_state bitmap/ chunk_size component_size layout
level max_read_errors metadata_version new_dev raid_disks
reshape_direction reshape_position resync_start safe_mode_delay
--- >% ---
So in my understanding, only mdadm and udevd are opening the MD devices
and mdamd was the last to open the device. For some unknown reasons,
md_release() is never called.
- kernel 3.14.19
- mdadm 3.3.2
- systemd 208
Can you see something wrong here ?
Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
More majordomo info at http://vger.kernel.org/majordomo-info.html
Artur Paszkiewicz
2014-09-26 11:23:41 UTC
Permalink
Post by NeilBrown
Post by Francis Moreau
Hello Neil,
[...]
Post by NeilBrown
Post by Francis Moreau
I tried to find out what could have opened the md device by using fuser,
but fuser reports no users.
It is probably a transient open/close.
If it's open/close wouldn't the 'close' part make the device disapear ?
No. It's ... complicated.
Post by Francis Moreau
Post by NeilBrown
Post by Francis Moreau
I took a look to the udev rules which are the one shipped by mdadm 3.3.2
but nothing keep the device opened during the remove event.
Could you give me some hints here to debug this ?
Modify md_open in drivers/md/md.c to add
printk("Opened by %s\n", current->comm);
and build a new kernel. That will tell you the name of the process which
opened the device.
I did that I also added a trace in md_release() but strangely no trace
were outputed from there.
Without seeing your patch I can't guess what it happening, but I am *certain*
that md_release() would get called providing md_open didn't return an error.
It might be helpful to print out the pid and the md device number too
task_tgid_vnr(current)
will give you the pid.
mdname(mddev)
give the name of the device.
Probably there is a 'change' event happening just before the 'remove' event,
and udev runs "mdadm" on the 'change' event, and that ends up happening after
the device has been removed.
Is this really a problem? Can't you just ignore it and pretend it isn't
there?
NeilBrown
Post by Francis Moreau
--- %< ---
Personalities : [raid1]
md125 : active raid1 vdc1[1] vdb1[0]
65472 blocks super 1.0 [2/2] [UU]
md126 : active raid1 vdc2[1] vdb2[0]
209536 blocks super 1.2 [2/2] [UU]
md127 : active raid1 vdb3[0] vdc3[1]
1819584 blocks super 1.2 [2/2] [UU]
unused devices: <none>
[ 1.474207] md_open(): opened by mdadm
[ 1.475316] md_open(): opened by mdadm
[ 1.492880] md_open(): opened by mdadm
[ 1.493201] md_open(): opened by mdadm
[ 1.494690] md_open(): opened by mdadm
[ 1.499369] md_open(): opened by mdadm
[ 1.533566] md_open(): opened by mdadm
[ 1.533697] md_open(): opened by mdadm
[ 1.554419] md_open(): opened by mdadm
[ 1.574451] md_open(): opened by mdadm
[ 1.574666] md_open(): opened by mdadm
[ 1.574877] md_open(): opened by mdadm
[ 1.576822] md_open(): opened by systemd-udevd
[ 1.576895] md_open(): opened by systemd-udevd
[ 1.577029] md_open(): opened by systemd-udevd
[ 1.581850] md_open(): opened by mdadm
[ 1.584054] md_open(): opened by systemd-udevd
[ 1.584770] md_open(): opened by mdadm
[ 1.585175] md_open(): opened by mdadm
[ 1.586328] md_open(): opened by systemd-udevd
[ 1.586933] md_open(): opened by systemd-udevd
[ 1.651265] md_open(): opened by mdadm
[ 1.651320] md_open(): opened by mdadm
[ 1.651364] md_open(): opened by mdadm
[ 1.651437] md_open(): opened by mdadm
[ 1.652376] md_open(): opened by mdadm
[ 1.652452] md_open(): opened by mdadm
[ 33.486704] md_open(): opened by mdadm
[ 33.489259] md_open(): opened by mdadm
[ 33.491000] md_open(): opened by mdadm
[ 33.491767] md_open(): opened by systemd-udevd
[ 33.692255] md_open(): opened by mdadm
[ 33.692288] md_open(): opened by mdadm
[ 33.692606] md_open(): opened by mdadm
[ 33.692858] md_open(): opened by mdadm
[ 33.692942] md_open(): opened by mdadm
[ 33.693237] md_open(): opened by mdadm
[ 33.694254] md_open(): opened by mdadm
[ 33.694275] md_open(): opened by mdadm
[ 33.694373] md_open(): opened by mdadm
[ 33.695558] md_open(): opened by mdadm
[ 33.695679] md_open(): opened by mdadm
[ 33.695855] md_open(): opened by mdadm
[ 33.695894] md_open(): opened by mdadm
/dev/md125
root 366 0.0 0.1 38172 1696 ? Ss 06:04 0:00
/usr/lib/systemd/systemd-udevd
root 465 0.0 0.0 4964 924 ? Ss 06:04 0:00
/sbin/mdadm --monitor --scan --daemonise --syslog
--pid-file=/run/mdadm/mdadm.pid
total 0
lrwx------ 1 root root 64 Sep 26 06:04 0 -> /dev/null
lrwx------ 1 root root 64 Sep 26 06:04 1 -> /dev/null
lrwx------ 1 root root 64 Sep 26 06:04 10 -> socket:[8665]
lr-x------ 1 root root 64 Sep 26 06:04 11 -> /etc/udev/hwdb.bin
lrwx------ 1 root root 64 Sep 26 06:04 12 -> anon_inode:[eventpoll]
lrwx------ 1 root root 64 Sep 26 06:04 2 -> /dev/null
lrwx------ 1 root root 64 Sep 26 06:04 3 -> socket:[8144]
lrwx------ 1 root root 64 Sep 26 06:04 4 -> socket:[8103]
lrwx------ 1 root root 64 Sep 26 06:04 5 -> socket:[8660]
lrwx------ 1 root root 64 Sep 26 06:04 6 -> /run/udev/queue.bin
lr-x------ 1 root root 64 Sep 26 06:04 7 -> anon_inode:inotify
lrwx------ 1 root root 64 Sep 26 06:04 8 -> anon_inode:[signalfd]
lrwx------ 1 root root 64 Sep 26 06:04 9 -> socket:[8664]
total 0
lrwx------ 1 root root 64 Sep 26 06:04 0 -> /dev/null
lrwx------ 1 root root 64 Sep 26 06:04 1 -> /dev/null
lrwx------ 1 root root 64 Sep 26 06:04 2 -> /dev/null
lr-x------ 1 root root 64 Sep 26 06:06 4 -> /proc/mdstat
lrwx------ 1 root root 64 Sep 26 06:06 5 -> socket:[10038]
Personalities : [raid1]
unused devices: <none>
array_size array_state bitmap/ chunk_size component_size layout
level max_read_errors metadata_version new_dev raid_disks
reshape_direction reshape_position resync_start safe_mode_delay
--- >% ---
So in my understanding, only mdadm and udevd are opening the MD devices
and mdamd was the last to open the device. For some unknown reasons,
md_release() is never called.
- kernel 3.14.19
- mdadm 3.3.2
- systemd 208
Can you see something wrong here ?
Thanks.
--
Hi,

I have also been debugging this issue and I came up with this
fix/workaround. It works for me. Can you take a look a this?

Thanks,
Artur
Post by NeilBrown
From c547e39789cde93d4a7ea1d3f845d61b82e4f0ed Mon Sep 17 00:00:00 2001
From: Artur Paszkiewicz <***@intel.com>
Date: Fri, 26 Sep 2014 12:20:46 +0200
Subject: [PATCH] md: avoid creating new devices for stopped arrays in
md_open()

When an array is about to be destroyed, set mddev->gendisk->private_data
to NULL as it is no longer needed and check it in md_open(). If
bdev->bd_disk->private_data is NULL, then this indicates that the array
is stopped and return -ENODEV.

Signed-off-by: Artur Paszkiewicz <***@intel.com>
---
drivers/md/md.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 1294238..7109d48 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -449,6 +449,7 @@ static void mddev_put(struct mddev *mddev)
bs = mddev->bio_set;
mddev->bio_set = NULL;
if (mddev->gendisk) {
+ mddev->gendisk->private_data = NULL;
/* We did a probe so need to clean up. Call
* queue_work inside the spinlock so that
* flush_workqueue() after mddev_find will
@@ -6693,9 +6694,14 @@ static int md_open(struct block_device *bdev, fmode_t mode)
* Succeed if we can lock the mddev, which confirms that
* it isn't being stopped right now.
*/
- struct mddev *mddev = mddev_find(bdev->bd_dev);
+ struct mddev *mddev;
int err;

+ if (!bdev->bd_disk->private_data)
+ return -ENODEV;
+
+ mddev = mddev_find(bdev->bd_dev);
+
if (!mddev)
return -ENODEV;
--
1.8.4.5
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
NeilBrown
2014-09-29 04:19:02 UTC
Permalink
On Fri, 26 Sep 2014 13:23:41 +0200 Artur Paszkiewicz
Post by Artur Paszkiewicz
Post by NeilBrown
Post by Francis Moreau
Hello Neil,
[...]
Post by NeilBrown
Post by Francis Moreau
I tried to find out what could have opened the md device by using fuser,
but fuser reports no users.
It is probably a transient open/close.
If it's open/close wouldn't the 'close' part make the device disapear ?
No. It's ... complicated.
Post by Francis Moreau
Post by NeilBrown
Post by Francis Moreau
I took a look to the udev rules which are the one shipped by mdadm 3.3.2
but nothing keep the device opened during the remove event.
Could you give me some hints here to debug this ?
Modify md_open in drivers/md/md.c to add
printk("Opened by %s\n", current->comm);
and build a new kernel. That will tell you the name of the process which
opened the device.
I did that I also added a trace in md_release() but strangely no trace
were outputed from there.
Without seeing your patch I can't guess what it happening, but I am *certain*
that md_release() would get called providing md_open didn't return an error.
It might be helpful to print out the pid and the md device number too
task_tgid_vnr(current)
will give you the pid.
mdname(mddev)
give the name of the device.
Probably there is a 'change' event happening just before the 'remove' event,
and udev runs "mdadm" on the 'change' event, and that ends up happening after
the device has been removed.
Is this really a problem? Can't you just ignore it and pretend it isn't
there?
NeilBrown
Post by Francis Moreau
--- %< ---
Personalities : [raid1]
md125 : active raid1 vdc1[1] vdb1[0]
65472 blocks super 1.0 [2/2] [UU]
md126 : active raid1 vdc2[1] vdb2[0]
209536 blocks super 1.2 [2/2] [UU]
md127 : active raid1 vdb3[0] vdc3[1]
1819584 blocks super 1.2 [2/2] [UU]
unused devices: <none>
[ 1.474207] md_open(): opened by mdadm
[ 1.475316] md_open(): opened by mdadm
[ 1.492880] md_open(): opened by mdadm
[ 1.493201] md_open(): opened by mdadm
[ 1.494690] md_open(): opened by mdadm
[ 1.499369] md_open(): opened by mdadm
[ 1.533566] md_open(): opened by mdadm
[ 1.533697] md_open(): opened by mdadm
[ 1.554419] md_open(): opened by mdadm
[ 1.574451] md_open(): opened by mdadm
[ 1.574666] md_open(): opened by mdadm
[ 1.574877] md_open(): opened by mdadm
[ 1.576822] md_open(): opened by systemd-udevd
[ 1.576895] md_open(): opened by systemd-udevd
[ 1.577029] md_open(): opened by systemd-udevd
[ 1.581850] md_open(): opened by mdadm
[ 1.584054] md_open(): opened by systemd-udevd
[ 1.584770] md_open(): opened by mdadm
[ 1.585175] md_open(): opened by mdadm
[ 1.586328] md_open(): opened by systemd-udevd
[ 1.586933] md_open(): opened by systemd-udevd
[ 1.651265] md_open(): opened by mdadm
[ 1.651320] md_open(): opened by mdadm
[ 1.651364] md_open(): opened by mdadm
[ 1.651437] md_open(): opened by mdadm
[ 1.652376] md_open(): opened by mdadm
[ 1.652452] md_open(): opened by mdadm
[ 33.486704] md_open(): opened by mdadm
[ 33.489259] md_open(): opened by mdadm
[ 33.491000] md_open(): opened by mdadm
[ 33.491767] md_open(): opened by systemd-udevd
[ 33.692255] md_open(): opened by mdadm
[ 33.692288] md_open(): opened by mdadm
[ 33.692606] md_open(): opened by mdadm
[ 33.692858] md_open(): opened by mdadm
[ 33.692942] md_open(): opened by mdadm
[ 33.693237] md_open(): opened by mdadm
[ 33.694254] md_open(): opened by mdadm
[ 33.694275] md_open(): opened by mdadm
[ 33.694373] md_open(): opened by mdadm
[ 33.695558] md_open(): opened by mdadm
[ 33.695679] md_open(): opened by mdadm
[ 33.695855] md_open(): opened by mdadm
[ 33.695894] md_open(): opened by mdadm
/dev/md125
root 366 0.0 0.1 38172 1696 ? Ss 06:04 0:00
/usr/lib/systemd/systemd-udevd
root 465 0.0 0.0 4964 924 ? Ss 06:04 0:00
/sbin/mdadm --monitor --scan --daemonise --syslog
--pid-file=/run/mdadm/mdadm.pid
total 0
lrwx------ 1 root root 64 Sep 26 06:04 0 -> /dev/null
lrwx------ 1 root root 64 Sep 26 06:04 1 -> /dev/null
lrwx------ 1 root root 64 Sep 26 06:04 10 -> socket:[8665]
lr-x------ 1 root root 64 Sep 26 06:04 11 -> /etc/udev/hwdb.bin
lrwx------ 1 root root 64 Sep 26 06:04 12 -> anon_inode:[eventpoll]
lrwx------ 1 root root 64 Sep 26 06:04 2 -> /dev/null
lrwx------ 1 root root 64 Sep 26 06:04 3 -> socket:[8144]
lrwx------ 1 root root 64 Sep 26 06:04 4 -> socket:[8103]
lrwx------ 1 root root 64 Sep 26 06:04 5 -> socket:[8660]
lrwx------ 1 root root 64 Sep 26 06:04 6 -> /run/udev/queue.bin
lr-x------ 1 root root 64 Sep 26 06:04 7 -> anon_inode:inotify
lrwx------ 1 root root 64 Sep 26 06:04 8 -> anon_inode:[signalfd]
lrwx------ 1 root root 64 Sep 26 06:04 9 -> socket:[8664]
total 0
lrwx------ 1 root root 64 Sep 26 06:04 0 -> /dev/null
lrwx------ 1 root root 64 Sep 26 06:04 1 -> /dev/null
lrwx------ 1 root root 64 Sep 26 06:04 2 -> /dev/null
lr-x------ 1 root root 64 Sep 26 06:06 4 -> /proc/mdstat
lrwx------ 1 root root 64 Sep 26 06:06 5 -> socket:[10038]
Personalities : [raid1]
unused devices: <none>
array_size array_state bitmap/ chunk_size component_size layout
level max_read_errors metadata_version new_dev raid_disks
reshape_direction reshape_position resync_start safe_mode_delay
--- >% ---
So in my understanding, only mdadm and udevd are opening the MD devices
and mdamd was the last to open the device. For some unknown reasons,
md_release() is never called.
- kernel 3.14.19
- mdadm 3.3.2
- systemd 208
Can you see something wrong here ?
Thanks.
--
Hi,
I have also been debugging this issue and I came up with this
fix/workaround. It works for me. Can you take a look a this?
Thanks,
Artur
Post by NeilBrown
From c547e39789cde93d4a7ea1d3f845d61b82e4f0ed Mon Sep 17 00:00:00 2001
Date: Fri, 26 Sep 2014 12:20:46 +0200
Subject: [PATCH] md: avoid creating new devices for stopped arrays in
md_open()
When an array is about to be destroyed, set mddev->gendisk->private_data
to NULL as it is no longer needed and check it in md_open(). If
bdev->bd_disk->private_data is NULL, then this indicates that the array
is stopped and return -ENODEV.
---
drivers/md/md.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 1294238..7109d48 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -449,6 +449,7 @@ static void mddev_put(struct mddev *mddev)
bs = mddev->bio_set;
mddev->bio_set = NULL;
if (mddev->gendisk) {
+ mddev->gendisk->private_data = NULL;
/* We did a probe so need to clean up. Call
* queue_work inside the spinlock so that
* flush_workqueue() after mddev_find will
@@ -6693,9 +6694,14 @@ static int md_open(struct block_device *bdev, fmode_t mode)
* Succeed if we can lock the mddev, which confirms that
* it isn't being stopped right now.
*/
- struct mddev *mddev = mddev_find(bdev->bd_dev);
+ struct mddev *mddev;
int err;
+ if (!bdev->bd_disk->private_data)
+ return -ENODEV;
+
+ mddev = mddev_find(bdev->bd_dev);
+
if (!mddev)
return -ENODEV;
Thanks, but I don't think this is a complete fix.
It creates a small window after an array is stopped during which an attempt
to open the device will fail. Once mddev_delayed_delete() completes, the
device can be opened again.
So it might occasionally fix the symptom, but it is very dependant on timing
and won't always work.

NeilBrown
Francis Moreau
2014-09-26 12:21:04 UTC
Permalink
Post by NeilBrown
Post by Francis Moreau
Hello Neil,
[...]
Post by NeilBrown
Post by Francis Moreau
I tried to find out what could have opened the md device by using fuser,
but fuser reports no users.
It is probably a transient open/close.
If it's open/close wouldn't the 'close' part make the device disapear ?
No. It's ... complicated.
Post by Francis Moreau
Post by NeilBrown
Post by Francis Moreau
I took a look to the udev rules which are the one shipped by mdadm 3.3.2
but nothing keep the device opened during the remove event.
Could you give me some hints here to debug this ?
Modify md_open in drivers/md/md.c to add
printk("Opened by %s\n", current->comm);
and build a new kernel. That will tell you the name of the process which
opened the device.
I did that I also added a trace in md_release() but strangely no trace
were outputed from there.
Without seeing your patch I can't guess what it happening, but I am *certain*
that md_release() would get called providing md_open didn't return an error.
Here's the patch:

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 73aedcb..08ead8d 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -6703,6 +6703,8 @@ static int md_open(struct block_device *bdev,
fmode_t mode)
struct mddev *mddev = mddev_find(bdev->bd_dev);
int err;

+ printk("md_open(): opened by %s\n", current->comm);
+
if (!mddev)
return -ENODEV;

@@ -6735,6 +6737,8 @@ static void md_release(struct gendisk *disk,
fmode_t mode)
{
struct mddev *mddev = disk->private_data;

+ printk("md_release(): released by %s\n", current->comm);
+
BUG_ON(!mddev);
atomic_dec(&mddev->openers);
mddev_put(mddev);
Post by NeilBrown
It might be helpful to print out the pid and the md device number too
task_tgid_vnr(current)
will give you the pid.
mdname(mddev)
give the name of the device.
Here's the new trace, this time md_release() was called, so I probably
did something wrong the first time, sorry for that.

[ 1.470744] md_open(): md127 opened by mdadm [388]
[ 1.485437] md_release(): md127 released by mdadm [388]
[ 1.486888] md_open(): md126 opened by mdadm [381]
[ 1.487468] md_release(): md126 released by mdadm [381]
[ 1.488646] md_open(): md125 opened by mdadm [383]
[ 1.489074] md_release(): md125 released by mdadm [383]
[ 1.490555] md_open(): md127 opened by mdadm [385]
[ 1.512556] md_release(): md127 released by mdadm [385]
[ 1.512582] md_open(): md127 opened by mdadm [385]
[ 1.512682] md_open(): md126 opened by mdadm [384]
[ 1.553414] md_release(): md126 released by mdadm [384]
[ 1.553442] md_open(): md126 opened by mdadm [384]
[ 1.553549] md_open(): md125 opened by mdadm [382]
[ 1.573263] md_release(): md125 released by mdadm [382]
[ 1.573288] md_open(): md125 opened by mdadm [382]
[ 1.601034] md_open(): md125 opened by mdadm [459]
[ 1.601041] md_release(): md125 released by mdadm [459]
[ 1.601065] md_open(): md126 opened by mdadm [459]
[ 1.601067] md_release(): md126 released by mdadm [459]
[ 1.601090] md_open(): md127 opened by mdadm [459]
[ 1.601092] md_release(): md127 released by mdadm [459]
[ 1.601130] md_open(): md127 opened by mdadm [459]
[ 1.601220] md_release(): md127 released by mdadm [459]
[ 1.601633] md_open(): md126 opened by mdadm [459]
[ 1.601661] md_release(): md126 released by mdadm [459]
[ 1.601673] md_open(): md125 opened by mdadm [459]
[ 1.601695] md_release(): md125 released by mdadm [459]
[ 1.606127] md_open(): md125 opened by mdadm [454]
[ 1.608682] md_open(): md126 opened by mdadm [453]
[ 1.609514] md_open(): md127 opened by mdadm [448]
[ 1.622512] md_release(): md126 released by mdadm [453]
[ 1.623028] md_release(): md127 released by mdadm [448]
[ 1.625288] md_open(): md126 opened by systemd-udevd [363]
[ 1.625391] md_release(): md125 released by mdadm [454]
[ 1.625619] md_open(): md127 opened by systemd-udevd [368]
[ 1.625737] md_open(): md125 opened by systemd-udevd [366]
[ 1.637137] md_release(): md125 released by systemd-udevd [366]
[ 1.643982] md_open(): md125 opened by mdadm [476]
[ 1.644071] md_release(): md127 released by systemd-udevd [368]
[ 1.647787] md_release(): md125 released by mdadm [382]
[ 1.648171] md_release(): md126 released by systemd-udevd [363]
[ 1.651629] md_open(): md126 opened by mdadm [479]
[ 1.656666] md_open(): md127 opened by mdadm [480]
[ 1.657771] md_release(): md125 released by mdadm [476]
[ 1.659312] md_open(): md125 opened by systemd-udevd [365]
[ 1.663193] md_release(): md127 released by mdadm [385]
[ 1.673669] md_release(): md125 released by systemd-udevd [365]
[ 1.685527] md_release(): md127 released by mdadm [480]
[ 1.685599] md_release(): md126 released by mdadm [479]
[ 1.686058] md_open(): md126 opened by systemd-udevd [366]
[ 1.686282] md_release(): md126 released by systemd-udevd [366]
[ 1.691024] md_open(): md127 opened by systemd-udevd [363]
[ 1.695415] md_release(): md126 released by mdadm [384]
[ 1.707163] md_release(): md127 released by systemd-udevd [363]
Post by NeilBrown
Post by Francis Moreau
Post by NeilBrown
mdadm --stop --scan <<<
[ 89.975162] md_open(): md125 opened by mdadm [930]
[ 89.975305] md_release(): md125 released by mdadm [930]
[ 89.977434] md_open(): md125 opened by mdadm [932]
[ 89.978813] md_open(): md125 opened by mdadm [930]
[ 89.979365] md_release(): md125 released by mdadm [932]
[ 89.979693] md_open(): md125 opened by systemd-udevd [931]
[ 89.985790] md_release(): md125 released by systemd-udevd [931]
[ 90.179911] md_release(): md125 released by mdadm [930]
[ 90.180168] md_open(): md127 opened by mdadm [459]
[ 90.180187] md_release(): md127 released by mdadm [459]
[ 90.180199] md_open(): md126 opened by mdadm [459]
[ 90.180205] md_release(): md126 released by mdadm [459]
[ 90.180556] md_open(): md126 opened by mdadm [930]
[ 90.180653] md_release(): md126 released by mdadm [930]
[ 90.180690] md_open(): md126 opened by mdadm [930]
[ 90.180758] md_open(): mdX opened by mdadm [459]
[ 90.180995] md_open(): md125 opened by mdadm [459]
[ 90.181056] md_release(): md125 released by mdadm [459]
[ 90.182717] md_open(): md127 opened by mdadm [459]
[ 90.182725] md_release(): md127 released by mdadm [459]
[ 90.182732] md_open(): md126 opened by mdadm [459]
[ 90.182761] md_release(): md126 released by mdadm [459]
[ 90.182770] md_open(): md125 opened by mdadm [459]
[ 90.182775] md_release(): md125 released by mdadm [459]
[ 90.182940] md_release(): md126 released by mdadm [930]
[ 90.183167] md_open(): md127 opened by mdadm [930]
[ 90.183257] md_release(): md127 released by mdadm [930]
[ 90.183288] md_open(): md127 opened by mdadm [930]
[ 90.183461] md_open(): md127 opened by mdadm [459]
[ 90.183488] md_release(): md127 released by mdadm [459]
[ 90.183499] md_open(): md125 opened by mdadm [459]
[ 90.183505] md_release(): md125 released by mdadm [459]
[ 90.183686] md_release(): md127 released by mdadm [930]
Post by NeilBrown
Probably there is a 'change' event happening just before the 'remove' event,
and udev runs "mdadm" on the 'change' event, and that ends up happening after
the device has been removed.
Is this really a problem? Can't you just ignore it and pretend it isn't
there?
Well, if you list the block devices that the kernel detected in order to
operate on them, it could. I don't know exactly what would be the result
to use it but it could confuse some tools.

Is there a way to check that the 'ghost' device has been removed by
poking sysfs ?

Thanks
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Francis Moreau
2014-09-26 12:50:08 UTC
Permalink
On 09/26/2014 02:21 PM, Francis Moreau wrote:
[...]
Post by Francis Moreau
Post by NeilBrown
Post by NeilBrown
mdadm --stop --scan <<<
[ 89.975162] md_open(): md125 opened by mdadm [930]
[ 89.975305] md_release(): md125 released by mdadm [930]
[ 89.977434] md_open(): md125 opened by mdadm [932]
[ 89.978813] md_open(): md125 opened by mdadm [930]
[ 89.979365] md_release(): md125 released by mdadm [932]
[ 89.979693] md_open(): md125 opened by systemd-udevd [931]
[ 89.985790] md_release(): md125 released by systemd-udevd [931]
[ 90.179911] md_release(): md125 released by mdadm [930]
[ 90.180168] md_open(): md127 opened by mdadm [459]
[ 90.180187] md_release(): md127 released by mdadm [459]
[ 90.180199] md_open(): md126 opened by mdadm [459]
[ 90.180205] md_release(): md126 released by mdadm [459]
[ 90.180556] md_open(): md126 opened by mdadm [930]
[ 90.180653] md_release(): md126 released by mdadm [930]
[ 90.180690] md_open(): md126 opened by mdadm [930]
[ 90.180758] md_open(): mdX opened by mdadm [459]
What is this 'mdX' device that mdadm operates on ?

It also doesn't have a counterpart release() call.
Post by Francis Moreau
[ 90.180995] md_open(): md125 opened by mdadm [459]
[ 90.181056] md_release(): md125 released by mdadm [459]
[ 90.182717] md_open(): md127 opened by mdadm [459]
[ 90.182725] md_release(): md127 released by mdadm [459]
[ 90.182732] md_open(): md126 opened by mdadm [459]
[ 90.182761] md_release(): md126 released by mdadm [459]
[ 90.182770] md_open(): md125 opened by mdadm [459]
[ 90.182775] md_release(): md125 released by mdadm [459]
[ 90.182940] md_release(): md126 released by mdadm [930]
[ 90.183167] md_open(): md127 opened by mdadm [930]
[ 90.183257] md_release(): md127 released by mdadm [930]
[ 90.183288] md_open(): md127 opened by mdadm [930]
[ 90.183461] md_open(): md127 opened by mdadm [459]
[ 90.183488] md_release(): md127 released by mdadm [459]
[ 90.183499] md_open(): md125 opened by mdadm [459]
[ 90.183505] md_release(): md125 released by mdadm [459]
[ 90.183686] md_release(): md127 released by mdadm [930]
Post by NeilBrown
Probably there is a 'change' event happening just before the 'remove' event,
and udev runs "mdadm" on the 'change' event, and that ends up happening after
the device has been removed.
Is this really a problem? Can't you just ignore it and pretend it isn't
there?
Well, if you list the block devices that the kernel detected in order to
operate on them, it could. I don't know exactly what would be the result
to use it but it could confuse some tools.
Is there a way to check that the 'ghost' device has been removed by
poking sysfs ?
Thanks
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
NeilBrown
2014-09-29 04:47:07 UTC
Permalink
Post by Francis Moreau
[...]
Post by Francis Moreau
Post by NeilBrown
Post by NeilBrown
mdadm --stop --scan <<<
[ 89.975162] md_open(): md125 opened by mdadm [930]
[ 89.975305] md_release(): md125 released by mdadm [930]
[ 89.977434] md_open(): md125 opened by mdadm [932]
[ 89.978813] md_open(): md125 opened by mdadm [930]
[ 89.979365] md_release(): md125 released by mdadm [932]
[ 89.979693] md_open(): md125 opened by systemd-udevd [931]
[ 89.985790] md_release(): md125 released by systemd-udevd [931]
[ 90.179911] md_release(): md125 released by mdadm [930]
[ 90.180168] md_open(): md127 opened by mdadm [459]
[ 90.180187] md_release(): md127 released by mdadm [459]
[ 90.180199] md_open(): md126 opened by mdadm [459]
[ 90.180205] md_release(): md126 released by mdadm [459]
[ 90.180556] md_open(): md126 opened by mdadm [930]
[ 90.180653] md_release(): md126 released by mdadm [930]
[ 90.180690] md_open(): md126 opened by mdadm [930]
[ 90.180758] md_open(): mdX opened by mdadm [459]
What is this 'mdX' device that mdadm operates on ?
It also doesn't have a counterpart release() call.
'mdX' is the name used if mddev->gendisk is NULL.
In that case, md_open() will return an error (ERESTARTSYS).
As the 'open' failed, we wouldn't expect a matching close/release.

NeilBrown
Post by Francis Moreau
Post by Francis Moreau
[ 90.180995] md_open(): md125 opened by mdadm [459]
[ 90.181056] md_release(): md125 released by mdadm [459]
[ 90.182717] md_open(): md127 opened by mdadm [459]
[ 90.182725] md_release(): md127 released by mdadm [459]
[ 90.182732] md_open(): md126 opened by mdadm [459]
[ 90.182761] md_release(): md126 released by mdadm [459]
[ 90.182770] md_open(): md125 opened by mdadm [459]
[ 90.182775] md_release(): md125 released by mdadm [459]
[ 90.182940] md_release(): md126 released by mdadm [930]
[ 90.183167] md_open(): md127 opened by mdadm [930]
[ 90.183257] md_release(): md127 released by mdadm [930]
[ 90.183288] md_open(): md127 opened by mdadm [930]
[ 90.183461] md_open(): md127 opened by mdadm [459]
[ 90.183488] md_release(): md127 released by mdadm [459]
[ 90.183499] md_open(): md125 opened by mdadm [459]
[ 90.183505] md_release(): md125 released by mdadm [459]
[ 90.183686] md_release(): md127 released by mdadm [930]
Post by NeilBrown
Probably there is a 'change' event happening just before the 'remove' event,
and udev runs "mdadm" on the 'change' event, and that ends up happening after
the device has been removed.
Is this really a problem? Can't you just ignore it and pretend it isn't
there?
Well, if you list the block devices that the kernel detected in order to
operate on them, it could. I don't know exactly what would be the result
to use it but it could confuse some tools.
Is there a way to check that the 'ghost' device has been removed by
poking sysfs ?
Thanks
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
More majordomo info at http://vger.kernel.org/majordomo-info.html
NeilBrown
2014-09-29 04:37:35 UTC
Permalink
Post by Artur Paszkiewicz
Post by NeilBrown
Post by Francis Moreau
Hello Neil,
[...]
Post by NeilBrown
Post by Francis Moreau
I tried to find out what could have opened the md device by using fuser,
but fuser reports no users.
It is probably a transient open/close.
If it's open/close wouldn't the 'close' part make the device disapear ?
No. It's ... complicated.
Post by Francis Moreau
Post by NeilBrown
Post by Francis Moreau
I took a look to the udev rules which are the one shipped by mdadm 3.3.2
but nothing keep the device opened during the remove event.
Could you give me some hints here to debug this ?
Modify md_open in drivers/md/md.c to add
printk("Opened by %s\n", current->comm);
and build a new kernel. That will tell you the name of the process which
opened the device.
I did that I also added a trace in md_release() but strangely no trace
were outputed from there.
Without seeing your patch I can't guess what it happening, but I am *certain*
that md_release() would get called providing md_open didn't return an error.
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 73aedcb..08ead8d 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -6703,6 +6703,8 @@ static int md_open(struct block_device *bdev,
fmode_t mode)
struct mddev *mddev = mddev_find(bdev->bd_dev);
int err;
+ printk("md_open(): opened by %s\n", current->comm);
+
if (!mddev)
return -ENODEV;
@@ -6735,6 +6737,8 @@ static void md_release(struct gendisk *disk,
fmode_t mode)
{
struct mddev *mddev = disk->private_data;
+ printk("md_release(): released by %s\n", current->comm);
+
BUG_ON(!mddev);
atomic_dec(&mddev->openers);
mddev_put(mddev);
Post by NeilBrown
It might be helpful to print out the pid and the md device number too
task_tgid_vnr(current)
will give you the pid.
mdname(mddev)
give the name of the device.
Here's the new trace, this time md_release() was called, so I probably
did something wrong the first time, sorry for that.
[ 1.470744] md_open(): md127 opened by mdadm [388]
[ 1.485437] md_release(): md127 released by mdadm [388]
[ 1.486888] md_open(): md126 opened by mdadm [381]
[ 1.487468] md_release(): md126 released by mdadm [381]
[ 1.488646] md_open(): md125 opened by mdadm [383]
[ 1.489074] md_release(): md125 released by mdadm [383]
[ 1.490555] md_open(): md127 opened by mdadm [385]
[ 1.512556] md_release(): md127 released by mdadm [385]
[ 1.512582] md_open(): md127 opened by mdadm [385]
[ 1.512682] md_open(): md126 opened by mdadm [384]
[ 1.553414] md_release(): md126 released by mdadm [384]
[ 1.553442] md_open(): md126 opened by mdadm [384]
[ 1.553549] md_open(): md125 opened by mdadm [382]
[ 1.573263] md_release(): md125 released by mdadm [382]
[ 1.573288] md_open(): md125 opened by mdadm [382]
[ 1.601034] md_open(): md125 opened by mdadm [459]
[ 1.601041] md_release(): md125 released by mdadm [459]
[ 1.601065] md_open(): md126 opened by mdadm [459]
[ 1.601067] md_release(): md126 released by mdadm [459]
[ 1.601090] md_open(): md127 opened by mdadm [459]
[ 1.601092] md_release(): md127 released by mdadm [459]
[ 1.601130] md_open(): md127 opened by mdadm [459]
[ 1.601220] md_release(): md127 released by mdadm [459]
[ 1.601633] md_open(): md126 opened by mdadm [459]
[ 1.601661] md_release(): md126 released by mdadm [459]
[ 1.601673] md_open(): md125 opened by mdadm [459]
[ 1.601695] md_release(): md125 released by mdadm [459]
[ 1.606127] md_open(): md125 opened by mdadm [454]
[ 1.608682] md_open(): md126 opened by mdadm [453]
[ 1.609514] md_open(): md127 opened by mdadm [448]
[ 1.622512] md_release(): md126 released by mdadm [453]
[ 1.623028] md_release(): md127 released by mdadm [448]
[ 1.625288] md_open(): md126 opened by systemd-udevd [363]
[ 1.625391] md_release(): md125 released by mdadm [454]
[ 1.625619] md_open(): md127 opened by systemd-udevd [368]
[ 1.625737] md_open(): md125 opened by systemd-udevd [366]
[ 1.637137] md_release(): md125 released by systemd-udevd [366]
[ 1.643982] md_open(): md125 opened by mdadm [476]
[ 1.644071] md_release(): md127 released by systemd-udevd [368]
[ 1.647787] md_release(): md125 released by mdadm [382]
[ 1.648171] md_release(): md126 released by systemd-udevd [363]
[ 1.651629] md_open(): md126 opened by mdadm [479]
[ 1.656666] md_open(): md127 opened by mdadm [480]
[ 1.657771] md_release(): md125 released by mdadm [476]
[ 1.659312] md_open(): md125 opened by systemd-udevd [365]
[ 1.663193] md_release(): md127 released by mdadm [385]
[ 1.673669] md_release(): md125 released by systemd-udevd [365]
[ 1.685527] md_release(): md127 released by mdadm [480]
[ 1.685599] md_release(): md126 released by mdadm [479]
[ 1.686058] md_open(): md126 opened by systemd-udevd [366]
[ 1.686282] md_release(): md126 released by systemd-udevd [366]
[ 1.691024] md_open(): md127 opened by systemd-udevd [363]
[ 1.695415] md_release(): md126 released by mdadm [384]
[ 1.707163] md_release(): md127 released by systemd-udevd [363]
Post by NeilBrown
Post by Francis Moreau
Post by NeilBrown
mdadm --stop --scan <<<
[ 89.975162] md_open(): md125 opened by mdadm [930]
[ 89.975305] md_release(): md125 released by mdadm [930]
[ 89.977434] md_open(): md125 opened by mdadm [932]
[ 89.978813] md_open(): md125 opened by mdadm [930]
[ 89.979365] md_release(): md125 released by mdadm [932]
[ 89.979693] md_open(): md125 opened by systemd-udevd [931]
[ 89.985790] md_release(): md125 released by systemd-udevd [931]
[ 90.179911] md_release(): md125 released by mdadm [930]
[ 90.180168] md_open(): md127 opened by mdadm [459]
[ 90.180187] md_release(): md127 released by mdadm [459]
[ 90.180199] md_open(): md126 opened by mdadm [459]
[ 90.180205] md_release(): md126 released by mdadm [459]
[ 90.180556] md_open(): md126 opened by mdadm [930]
[ 90.180653] md_release(): md126 released by mdadm [930]
[ 90.180690] md_open(): md126 opened by mdadm [930]
[ 90.180758] md_open(): mdX opened by mdadm [459]
[ 90.180995] md_open(): md125 opened by mdadm [459]
[ 90.181056] md_release(): md125 released by mdadm [459]
[ 90.182717] md_open(): md127 opened by mdadm [459]
[ 90.182725] md_release(): md127 released by mdadm [459]
[ 90.182732] md_open(): md126 opened by mdadm [459]
[ 90.182761] md_release(): md126 released by mdadm [459]
[ 90.182770] md_open(): md125 opened by mdadm [459]
[ 90.182775] md_release(): md125 released by mdadm [459]
[ 90.182940] md_release(): md126 released by mdadm [930]
[ 90.183167] md_open(): md127 opened by mdadm [930]
[ 90.183257] md_release(): md127 released by mdadm [930]
[ 90.183288] md_open(): md127 opened by mdadm [930]
[ 90.183461] md_open(): md127 opened by mdadm [459]
[ 90.183488] md_release(): md127 released by mdadm [459]
[ 90.183499] md_open(): md125 opened by mdadm [459]
[ 90.183505] md_release(): md125 released by mdadm [459]
[ 90.183686] md_release(): md127 released by mdadm [930]
So what were pids 930 and 459?
One was presumably the "mdadm -Ss" - probably 930.
Is 459 the "mdadm --monitor" ?? That might be useful hint.
Post by Artur Paszkiewicz
Post by NeilBrown
Probably there is a 'change' event happening just before the 'remove' event,
and udev runs "mdadm" on the 'change' event, and that ends up happening after
the device has been removed.
Is this really a problem? Can't you just ignore it and pretend it isn't
there?
Well, if you list the block devices that the kernel detected in order to
operate on them, it could. I don't know exactly what would be the result
to use it but it could confuse some tools.
Is there a way to check that the 'ghost' device has been removed by
poking sysfs ?
If you look at /sys/block/md*/md/array_state, those that contain 'inactive'
or 'clear' might be 'ghosts', or might be in the process of being assembled.
If you write 'clear' to the same file they should disappear.... unless udev
does something to re-create them.

NeilBrown
Francis Moreau
2014-09-29 08:45:17 UTC
Permalink
Hello Neil,
Post by NeilBrown
Post by Artur Paszkiewicz
Post by NeilBrown
Post by Francis Moreau
Hello Neil,
[...]
Post by NeilBrown
Post by Francis Moreau
I tried to find out what could have opened the md device by using fuser,
but fuser reports no users.
It is probably a transient open/close.
If it's open/close wouldn't the 'close' part make the device disapear ?
No. It's ... complicated.
Post by Francis Moreau
Post by NeilBrown
Post by Francis Moreau
I took a look to the udev rules which are the one shipped by mdadm 3.3.2
but nothing keep the device opened during the remove event.
Could you give me some hints here to debug this ?
Modify md_open in drivers/md/md.c to add
printk("Opened by %s\n", current->comm);
and build a new kernel. That will tell you the name of the process which
opened the device.
I did that I also added a trace in md_release() but strangely no trace
were outputed from there.
Without seeing your patch I can't guess what it happening, but I am *certain*
that md_release() would get called providing md_open didn't return an error.
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 73aedcb..08ead8d 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -6703,6 +6703,8 @@ static int md_open(struct block_device *bdev,
fmode_t mode)
struct mddev *mddev = mddev_find(bdev->bd_dev);
int err;
+ printk("md_open(): opened by %s\n", current->comm);
+
if (!mddev)
return -ENODEV;
@@ -6735,6 +6737,8 @@ static void md_release(struct gendisk *disk,
fmode_t mode)
{
struct mddev *mddev = disk->private_data;
+ printk("md_release(): released by %s\n", current->comm);
+
BUG_ON(!mddev);
atomic_dec(&mddev->openers);
mddev_put(mddev);
Post by NeilBrown
It might be helpful to print out the pid and the md device number too
task_tgid_vnr(current)
will give you the pid.
mdname(mddev)
give the name of the device.
Here's the new trace, this time md_release() was called, so I probably
did something wrong the first time, sorry for that.
[ 1.470744] md_open(): md127 opened by mdadm [388]
[ 1.485437] md_release(): md127 released by mdadm [388]
[ 1.486888] md_open(): md126 opened by mdadm [381]
[ 1.487468] md_release(): md126 released by mdadm [381]
[ 1.488646] md_open(): md125 opened by mdadm [383]
[ 1.489074] md_release(): md125 released by mdadm [383]
[ 1.490555] md_open(): md127 opened by mdadm [385]
[ 1.512556] md_release(): md127 released by mdadm [385]
[ 1.512582] md_open(): md127 opened by mdadm [385]
[ 1.512682] md_open(): md126 opened by mdadm [384]
[ 1.553414] md_release(): md126 released by mdadm [384]
[ 1.553442] md_open(): md126 opened by mdadm [384]
[ 1.553549] md_open(): md125 opened by mdadm [382]
[ 1.573263] md_release(): md125 released by mdadm [382]
[ 1.573288] md_open(): md125 opened by mdadm [382]
[ 1.601034] md_open(): md125 opened by mdadm [459]
[ 1.601041] md_release(): md125 released by mdadm [459]
[ 1.601065] md_open(): md126 opened by mdadm [459]
[ 1.601067] md_release(): md126 released by mdadm [459]
[ 1.601090] md_open(): md127 opened by mdadm [459]
[ 1.601092] md_release(): md127 released by mdadm [459]
[ 1.601130] md_open(): md127 opened by mdadm [459]
[ 1.601220] md_release(): md127 released by mdadm [459]
[ 1.601633] md_open(): md126 opened by mdadm [459]
[ 1.601661] md_release(): md126 released by mdadm [459]
[ 1.601673] md_open(): md125 opened by mdadm [459]
[ 1.601695] md_release(): md125 released by mdadm [459]
[ 1.606127] md_open(): md125 opened by mdadm [454]
[ 1.608682] md_open(): md126 opened by mdadm [453]
[ 1.609514] md_open(): md127 opened by mdadm [448]
[ 1.622512] md_release(): md126 released by mdadm [453]
[ 1.623028] md_release(): md127 released by mdadm [448]
[ 1.625288] md_open(): md126 opened by systemd-udevd [363]
[ 1.625391] md_release(): md125 released by mdadm [454]
[ 1.625619] md_open(): md127 opened by systemd-udevd [368]
[ 1.625737] md_open(): md125 opened by systemd-udevd [366]
[ 1.637137] md_release(): md125 released by systemd-udevd [366]
[ 1.643982] md_open(): md125 opened by mdadm [476]
[ 1.644071] md_release(): md127 released by systemd-udevd [368]
[ 1.647787] md_release(): md125 released by mdadm [382]
[ 1.648171] md_release(): md126 released by systemd-udevd [363]
[ 1.651629] md_open(): md126 opened by mdadm [479]
[ 1.656666] md_open(): md127 opened by mdadm [480]
[ 1.657771] md_release(): md125 released by mdadm [476]
[ 1.659312] md_open(): md125 opened by systemd-udevd [365]
[ 1.663193] md_release(): md127 released by mdadm [385]
[ 1.673669] md_release(): md125 released by systemd-udevd [365]
[ 1.685527] md_release(): md127 released by mdadm [480]
[ 1.685599] md_release(): md126 released by mdadm [479]
[ 1.686058] md_open(): md126 opened by systemd-udevd [366]
[ 1.686282] md_release(): md126 released by systemd-udevd [366]
[ 1.691024] md_open(): md127 opened by systemd-udevd [363]
[ 1.695415] md_release(): md126 released by mdadm [384]
[ 1.707163] md_release(): md127 released by systemd-udevd [363]
Post by NeilBrown
Post by Francis Moreau
Post by NeilBrown
mdadm --stop --scan <<<
[ 89.975162] md_open(): md125 opened by mdadm [930]
[ 89.975305] md_release(): md125 released by mdadm [930]
[ 89.977434] md_open(): md125 opened by mdadm [932]
[ 89.978813] md_open(): md125 opened by mdadm [930]
[ 89.979365] md_release(): md125 released by mdadm [932]
[ 89.979693] md_open(): md125 opened by systemd-udevd [931]
[ 89.985790] md_release(): md125 released by systemd-udevd [931]
[ 90.179911] md_release(): md125 released by mdadm [930]
[ 90.180168] md_open(): md127 opened by mdadm [459]
[ 90.180187] md_release(): md127 released by mdadm [459]
[ 90.180199] md_open(): md126 opened by mdadm [459]
[ 90.180205] md_release(): md126 released by mdadm [459]
[ 90.180556] md_open(): md126 opened by mdadm [930]
[ 90.180653] md_release(): md126 released by mdadm [930]
[ 90.180690] md_open(): md126 opened by mdadm [930]
[ 90.180758] md_open(): mdX opened by mdadm [459]
[ 90.180995] md_open(): md125 opened by mdadm [459]
[ 90.181056] md_release(): md125 released by mdadm [459]opened by mdadm [968]
[ 90.182717] md_open(): md127 opened by mdadm [459]
[ 90.182725] md_release(): md127 released by mdadm [459]
[ 90.182732] md_open(): md126 opened by mdadm [459]
[ 90.182761] md_release(): md126 released by mdadm [459]
[ 90.182770] md_open(): md125 opened by mdadm [459]
[ 90.182775] md_release(): md125 released by mdadm [459]
[ 90.182940] md_release(): md126 released by mdadm [930]
[ 90.183167] md_open(): md127 opened by mdadm [930]
[ 90.183257] md_release(): md127 released by mdadm [930]
[ 90.183288] md_open(): md127 opened by mdadm [930]
[ 90.183461] md_open(): md127 opened by mdadm [459]
[ 90.183488] md_release(): md127 released by mdadm [459]
[ 90.183499] md_open(): md125 opened by mdadm [459]
[ 90.183505] md_release(): md125 released by mdadm [459]
[ 90.183686] md_release(): md127 released by mdadm [930]
So what were pids 930 and 459?
One was presumably the "mdadm -Ss" - probably 930.
Is 459 the "mdadm --monitor" ?? That might be useful hint.
yes.

[456] is: /sbin/mdadm --monitor --scan --daemonise --syslog
--pid-file=/run/mdadm/mdadm.pid

and [930] is 'mdamd -Ss'.
Post by NeilBrown
Post by Artur Paszkiewicz
Post by NeilBrown
Probably there is a 'change' event happening just before the 'remove' event,
and udev runs "mdadm" on the 'change' event, and that ends up happening after
the device has been removed.
Is this really a problem? Can't you just ignore it and pretend it isn't
there?
Well, if you list the block devices that the kernel detected in order to
operate on them, it could. I don't know exactly what would be the result
to use it but it could confuse some tools.
Is there a way to check that the 'ghost' device has been removed by
poking sysfs ?
If you look at /sys/block/md*/md/array_state, those that contain 'inactive'
or 'clear' might be 'ghosts', or might be in the process of being assembled.
If you write 'clear' to the same file they should disappear.... unless udev
does something to re-create them.
It's in 'clear' state, and writing 'clear' doesn't make the device disapear.

[***@localhost ~]# dmesg -c >/dev/null
[***@localhost ~]# echo clear >/sys/block/md125/md/array_state
[***@localhost ~]# dmesg
[ 254.106252] md: md125 stopped.
[ 254.108182] md_open(): mdX opened by mdadm [968]

[ 254.109103] md_open(): md125 opened by mdadm [459]
[ 254.109127] md_open(): md125 opened by mdadm [459]
[ 254.109281] md_release(): md125 released by mdadm [459]

[ 254.109337] md_open(): md125 opened by mdadm [968]
[ 254.109572] md_release(): md125 released by mdadm [968]

[ 254.109847] md_open(): md125 opened by systemd-udevd [967]
[ 254.109986] md_release(): md125 released by systemd-udevd [967]

In that sequence, it seems that mdadm [459] is missing a md_release()
here. Is this expected ?

Thanks
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
NeilBrown
2014-09-29 21:56:43 UTC
Permalink
Post by Francis Moreau
Post by NeilBrown
So what were pids 930 and 459?
One was presumably the "mdadm -Ss" - probably 930.
Is 459 the "mdadm --monitor" ?? That might be useful hint.
yes.
[456] is: /sbin/mdadm --monitor --scan --daemonise --syslog
--pid-file=/run/mdadm/mdadm.pid
and [930] is 'mdamd -Ss'.
Good. Please try the patch below.
Post by Francis Moreau
Post by NeilBrown
Post by Francis Moreau
Post by NeilBrown
Probably there is a 'change' event happening just before the 'remove' event,
and udev runs "mdadm" on the 'change' event, and that ends up happening after
the device has been removed.
Is this really a problem? Can't you just ignore it and pretend it isn't
there?
Well, if you list the block devices that the kernel detected in order to
operate on them, it could. I don't know exactly what would be the result
to use it but it could confuse some tools.
Is there a way to check that the 'ghost' device has been removed by
poking sysfs ?
If you look at /sys/block/md*/md/array_state, those that contain 'inactive'
or 'clear' might be 'ghosts', or might be in the process of being assembled.
If you write 'clear' to the same file they should disappear.... unless udev
does something to re-create them.
It's in 'clear' state, and writing 'clear' doesn't make the device disapear.
[ 254.106252] md: md125 stopped.
[ 254.108182] md_open(): mdX opened by mdadm [968]
[ 254.109103] md_open(): md125 opened by mdadm [459]
[ 254.109127] md_open(): md125 opened by mdadm [459]
[ 254.109281] md_release(): md125 released by mdadm [459]
[ 254.109337] md_open(): md125 opened by mdadm [968]
[ 254.109572] md_release(): md125 released by mdadm [968]
[ 254.109847] md_open(): md125 opened by systemd-udevd [967]
[ 254.109986] md_release(): md125 released by systemd-udevd [967]
In that sequence, it seems that mdadm [459] is missing a md_release()
here. Is this expected ?
Presumably the first md_open returned an error. You could add another printk
at each 'return' to check.

Thanks,
NeilBrown


diff --git a/Monitor.c b/Monitor.c
index 5cb24fab8f2a..971d2ecbea72 100644
--- a/Monitor.c
+++ b/Monitor.c
@@ -460,7 +460,7 @@ static int check_array(struct state *st, struct mdstat_ent *mdstat,
mdu_array_info_t array;
struct mdstat_ent *mse = NULL, *mse2;
char *dev = st->devname;
- int fd;
+ int fd = -1;
int i;
int remaining_disks;
int last_disk;
@@ -468,6 +468,27 @@ static int check_array(struct state *st, struct mdstat_ent *mdstat,

if (test)
alert("TestMessage", dev, NULL, ainfo);
+ if (st->devnm[0])
+ fd = open("/sys/block", O_RDONLY|O_DIRECTORY);
+ if (fd >= 0) {
+ /* Don't open the device unless it is present and
+ * active in sysfs.
+ */
+ char buf[10];
+ close(fd);
+ fd = sysfs_open(st->devnm, NULL, "array_state");
+ if (fd < 0 ||
+ read(fd, buf, 10) < 5 ||
+ strncmp(buf,"clear",5) == 0 ||
+ strncmp(buf,"inact",5) == 0) {
+ if (fd >= 0)
+ close(fd);
+ if (!st->err)
+ alert("DeviceDisappeared", dev, NULL, ainfo);
+ st->err++;
+ return 0;
+ }
+ }
fd = open(dev, O_RDONLY);
if (fd < 0) {
if (!st->err)
Francis Moreau
2014-09-30 07:43:17 UTC
Permalink
Hi Neil,
Post by NeilBrown
Post by Francis Moreau
Post by NeilBrown
So what were pids 930 and 459?
One was presumably the "mdadm -Ss" - probably 930.
Is 459 the "mdadm --monitor" ?? That might be useful hint.
yes.
[456] is: /sbin/mdadm --monitor --scan --daemonise --syslog
--pid-file=/run/mdadm/mdadm.pid
and [930] is 'mdamd -Ss'.
Good. Please try the patch below.
After applying your patch, this is what I'm getting in syslog:

Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by mdadm [970]
Sep 30 03:40:07 localhost kernel: md_release(): md125 released by mdadm
[970]
Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by mdadm [972]
Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by mdadm [970]
Sep 30 03:40:07 localhost kernel: md_release(): md125 released by mdadm
[972]
Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by
systemd-udevd [971]
Sep 30 03:40:07 localhost systemd[1]: Cannot add dependency job for unit
mdmonitor-takeover.service, ignoring: Invalid argument
Sep 30 03:40:07 localhost systemd[1]: Started Software RAID monitoring
and management.
Sep 30 03:40:07 localhost kernel: md_release(): md125 released by
systemd-udevd [971]
Sep 30 03:40:08 localhost mdadm[466]: DeviceDisappeared event detected
on md device /dev/md125
Sep 30 03:40:08 localhost mdadm[466]: DeviceDisappeared event detected
on md device /dev/md126
Sep 30 03:40:08 localhost mdadm[466]: DeviceDisappeared event detected
on md device /dev/md127
Sep 30 03:40:08 localhost kernel: md125: detected capacity change from
1863254016 to 0
Sep 30 03:40:08 localhost kernel: md: md125 stopped.
Sep 30 03:40:08 localhost kernel: md: unbind<vdc3>
Sep 30 03:40:08 localhost kernel: md: export_rdev(vdc3)
Sep 30 03:40:08 localhost kernel: md: unbind<vdb3>
Sep 30 03:40:08 localhost kernel: md: export_rdev(vdb3)
Sep 30 03:40:08 localhost kernel: md_release(): md125 released by mdadm
[970]
Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [466]
Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
[466]
Sep 30 03:40:08 localhost kernel: md_open(): md126 opened by mdadm [466]
Sep 30 03:40:08 localhost kernel: md_release(): md126 released by mdadm
[466]
Sep 30 03:40:08 localhost kernel: md_open(): md126 opened by mdadm [970]
Sep 30 03:40:08 localhost kernel: md_release(): md126 released by mdadm
[970]
Sep 30 03:40:08 localhost kernel: md_open(): md126 opened by mdadm [970]
Sep 30 03:40:08 localhost kernel: md126: detected capacity change from
67043328 to 0
Sep 30 03:40:08 localhost kernel: md: md126 stopped.
Sep 30 03:40:08 localhost kernel: md: unbind<vdc1>
Sep 30 03:40:08 localhost kernel: md: export_rdev(vdc1)
Sep 30 03:40:08 localhost kernel: md: unbind<vdb1>
Sep 30 03:40:08 localhost kernel: md: export_rdev(vdb1)
Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [466]
Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
[466]
Sep 30 03:40:08 localhost kernel: md_release(): md126 released by mdadm
[970]
Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [970]
Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
[970]
Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [970]
Sep 30 03:40:08 localhost kernel: md127: detected capacity change from
214564864 to 0
Sep 30 03:40:08 localhost kernel: md: md127 stopped.
Sep 30 03:40:08 localhost kernel: md: unbind<vdc2>
Sep 30 03:40:08 localhost kernel: md: export_rdev(vdc2)
Sep 30 03:40:08 localhost kernel: md: unbind<vdb2>
Sep 30 03:40:08 localhost kernel: md: export_rdev(vdb2)
Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
[970]

The ghost device is no more present so your patch seems to have fixed my
issue. But I must admit I don't really understand what's going on :-/

Thanks
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Francis Moreau
2014-10-07 07:05:43 UTC
Permalink
Hi Neil,
Post by Sebastian Parschauer
Hi Neil,
Post by NeilBrown
Post by Francis Moreau
Post by NeilBrown
So what were pids 930 and 459?
One was presumably the "mdadm -Ss" - probably 930.
Is 459 the "mdadm --monitor" ?? That might be useful hint.
yes.
[456] is: /sbin/mdadm --monitor --scan --daemonise --syslog
--pid-file=/run/mdadm/mdadm.pid
and [930] is 'mdamd -Ss'.
Good. Please try the patch below.
Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by mdadm [970]
Sep 30 03:40:07 localhost kernel: md_release(): md125 released by mdadm
[970]
Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by mdadm [972]
Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by mdadm [970]
Sep 30 03:40:07 localhost kernel: md_release(): md125 released by mdadm
[972]
Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by
systemd-udevd [971]
Sep 30 03:40:07 localhost systemd[1]: Cannot add dependency job for unit
mdmonitor-takeover.service, ignoring: Invalid argument
Sep 30 03:40:07 localhost systemd[1]: Started Software RAID monitoring
and management.
Sep 30 03:40:07 localhost kernel: md_release(): md125 released by
systemd-udevd [971]
Sep 30 03:40:08 localhost mdadm[466]: DeviceDisappeared event detected
on md device /dev/md125
Sep 30 03:40:08 localhost mdadm[466]: DeviceDisappeared event detected
on md device /dev/md126
Sep 30 03:40:08 localhost mdadm[466]: DeviceDisappeared event detected
on md device /dev/md127
Sep 30 03:40:08 localhost kernel: md125: detected capacity change from
1863254016 to 0
Sep 30 03:40:08 localhost kernel: md: md125 stopped.
Sep 30 03:40:08 localhost kernel: md: unbind<vdc3>
Sep 30 03:40:08 localhost kernel: md: export_rdev(vdc3)
Sep 30 03:40:08 localhost kernel: md: unbind<vdb3>
Sep 30 03:40:08 localhost kernel: md: export_rdev(vdb3)
Sep 30 03:40:08 localhost kernel: md_release(): md125 released by mdadm
[970]
Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [466]
Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
[466]
Sep 30 03:40:08 localhost kernel: md_open(): md126 opened by mdadm [466]
Sep 30 03:40:08 localhost kernel: md_release(): md126 released by mdadm
[466]
Sep 30 03:40:08 localhost kernel: md_open(): md126 opened by mdadm [970]
Sep 30 03:40:08 localhost kernel: md_release(): md126 released by mdadm
[970]
Sep 30 03:40:08 localhost kernel: md_open(): md126 opened by mdadm [970]
Sep 30 03:40:08 localhost kernel: md126: detected capacity change from
67043328 to 0
Sep 30 03:40:08 localhost kernel: md: md126 stopped.
Sep 30 03:40:08 localhost kernel: md: unbind<vdc1>
Sep 30 03:40:08 localhost kernel: md: export_rdev(vdc1)
Sep 30 03:40:08 localhost kernel: md: unbind<vdb1>
Sep 30 03:40:08 localhost kernel: md: export_rdev(vdb1)
Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [466]
Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
[466]
Sep 30 03:40:08 localhost kernel: md_release(): md126 released by mdadm
[970]
Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [970]
Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
[970]
Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [970]
Sep 30 03:40:08 localhost kernel: md127: detected capacity change from
214564864 to 0
Sep 30 03:40:08 localhost kernel: md: md127 stopped.
Sep 30 03:40:08 localhost kernel: md: unbind<vdc2>
Sep 30 03:40:08 localhost kernel: md: export_rdev(vdc2)
Sep 30 03:40:08 localhost kernel: md: unbind<vdb2>
Sep 30 03:40:08 localhost kernel: md: export_rdev(vdb2)
Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
[970]
The ghost device is no more present so your patch seems to have fixed my
issue. But I must admit I don't really understand what's going on :-/
Since those 'ghost' devices are expected from the MD implementation
point of view, I'm wondering how am I supposed to detect them or maybe
how an application is supposed to recognized online arrays.

My application uses udev to detect et to get information about new
devices. I don't think the information exported by udev is enough to
figure this out. Also please note that since I rely on udev, I can't
really read information on /sys since this information may be out of
sync with the one returned by udev.

Thanks.

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
NeilBrown
2014-10-07 23:54:25 UTC
Permalink
Post by Sebastian Parschauer
Hi Neil,
Post by Sebastian Parschauer
Hi Neil,
Post by NeilBrown
Post by Francis Moreau
Post by NeilBrown
So what were pids 930 and 459?
One was presumably the "mdadm -Ss" - probably 930.
Is 459 the "mdadm --monitor" ?? That might be useful hint.
yes.
[456] is: /sbin/mdadm --monitor --scan --daemonise --syslog
--pid-file=/run/mdadm/mdadm.pid
and [930] is 'mdamd -Ss'.
Good. Please try the patch below.
Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by mdadm [970]
Sep 30 03:40:07 localhost kernel: md_release(): md125 released by mdadm
[970]
Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by mdadm [972]
Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by mdadm [970]
Sep 30 03:40:07 localhost kernel: md_release(): md125 released by mdadm
[972]
Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by
systemd-udevd [971]
Sep 30 03:40:07 localhost systemd[1]: Cannot add dependency job for unit
mdmonitor-takeover.service, ignoring: Invalid argument
Sep 30 03:40:07 localhost systemd[1]: Started Software RAID monitoring
and management.
Sep 30 03:40:07 localhost kernel: md_release(): md125 released by
systemd-udevd [971]
Sep 30 03:40:08 localhost mdadm[466]: DeviceDisappeared event detected
on md device /dev/md125
Sep 30 03:40:08 localhost mdadm[466]: DeviceDisappeared event detected
on md device /dev/md126
Sep 30 03:40:08 localhost mdadm[466]: DeviceDisappeared event detected
on md device /dev/md127
Sep 30 03:40:08 localhost kernel: md125: detected capacity change from
1863254016 to 0
Sep 30 03:40:08 localhost kernel: md: md125 stopped.
Sep 30 03:40:08 localhost kernel: md: unbind<vdc3>
Sep 30 03:40:08 localhost kernel: md: export_rdev(vdc3)
Sep 30 03:40:08 localhost kernel: md: unbind<vdb3>
Sep 30 03:40:08 localhost kernel: md: export_rdev(vdb3)
Sep 30 03:40:08 localhost kernel: md_release(): md125 released by mdadm
[970]
Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [466]
Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
[466]
Sep 30 03:40:08 localhost kernel: md_open(): md126 opened by mdadm [466]
Sep 30 03:40:08 localhost kernel: md_release(): md126 released by mdadm
[466]
Sep 30 03:40:08 localhost kernel: md_open(): md126 opened by mdadm [970]
Sep 30 03:40:08 localhost kernel: md_release(): md126 released by mdadm
[970]
Sep 30 03:40:08 localhost kernel: md_open(): md126 opened by mdadm [970]
Sep 30 03:40:08 localhost kernel: md126: detected capacity change from
67043328 to 0
Sep 30 03:40:08 localhost kernel: md: md126 stopped.
Sep 30 03:40:08 localhost kernel: md: unbind<vdc1>
Sep 30 03:40:08 localhost kernel: md: export_rdev(vdc1)
Sep 30 03:40:08 localhost kernel: md: unbind<vdb1>
Sep 30 03:40:08 localhost kernel: md: export_rdev(vdb1)
Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [466]
Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
[466]
Sep 30 03:40:08 localhost kernel: md_release(): md126 released by mdadm
[970]
Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [970]
Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
[970]
Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [970]
Sep 30 03:40:08 localhost kernel: md127: detected capacity change from
214564864 to 0
Sep 30 03:40:08 localhost kernel: md: md127 stopped.
Sep 30 03:40:08 localhost kernel: md: unbind<vdc2>
Sep 30 03:40:08 localhost kernel: md: export_rdev(vdc2)
Sep 30 03:40:08 localhost kernel: md: unbind<vdb2>
Sep 30 03:40:08 localhost kernel: md: export_rdev(vdb2)
Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
[970]
The ghost device is no more present so your patch seems to have fixed my
issue. But I must admit I don't really understand what's going on :-/
Since those 'ghost' devices are expected from the MD implementation
point of view, I'm wondering how am I supposed to detect them or maybe
how an application is supposed to recognized online arrays.
If your application is looking in /proc/mdstat, then the "ghost" devices will
be either "inactive" or not present at all.
If your application is looking in /sys/block/md*, then the "ghost" devices
will have "clear" or "inactive" in /sys/block/mdXX/md/array_state.

If you use the new "CREATE names=yes" line in mdadm.conf (mdadm 3.3 or
later), and use kernel 3.17 or later, and use names rather than numbers to
identify your arrays (/dev/md/home, /dev/md_root), then the "ghost" problem
will be gone, and names in /proc/mdstat will be e.g. "md_home", or "md_root"
rather than "md4" or "md127".
Post by Sebastian Parschauer
My application uses udev to detect et to get information about new
devices. I don't think the information exported by udev is enough to
figure this out. Also please note that since I rely on udev, I can't
really read information on /sys since this information may be out of
sync with the one returned by udev.
If udev reports that an array exists, then it really did exist when udev got
the message. By the time your program gets run by udev, it might not exist
any more.
i.e. udev is always racy.
You should always treat any event from udev as a hint:

"Something happened to this device in the recent past. Lots of other
things might have happened since. The device might not exist any more, or
it might have been replaced with a completely different device. So you
might want to do something, or you might not, but whatever you do - be
careful and don't blame me if things go wrong 'cause I'm just the
messenger."

NeilBrown
Francis Moreau
2014-10-09 09:40:25 UTC
Permalink
Post by NeilBrown
Post by Sebastian Parschauer
Hi Neil,
Post by Sebastian Parschauer
Hi Neil,
Post by NeilBrown
Post by Francis Moreau
Post by NeilBrown
So what were pids 930 and 459?
One was presumably the "mdadm -Ss" - probably 930.
Is 459 the "mdadm --monitor" ?? That might be useful hint.
yes.
[456] is: /sbin/mdadm --monitor --scan --daemonise --syslog
--pid-file=/run/mdadm/mdadm.pid
and [930] is 'mdamd -Ss'.
Good. Please try the patch below.
Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by mdadm [970]
Sep 30 03:40:07 localhost kernel: md_release(): md125 released by mdadm
[970]
Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by mdadm [972]
Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by mdadm [970]
Sep 30 03:40:07 localhost kernel: md_release(): md125 released by mdadm
[972]
Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by
systemd-udevd [971]
Sep 30 03:40:07 localhost systemd[1]: Cannot add dependency job for unit
mdmonitor-takeover.service, ignoring: Invalid argument
Sep 30 03:40:07 localhost systemd[1]: Started Software RAID monitoring
and management.
Sep 30 03:40:07 localhost kernel: md_release(): md125 released by
systemd-udevd [971]
Sep 30 03:40:08 localhost mdadm[466]: DeviceDisappeared event detected
on md device /dev/md125
Sep 30 03:40:08 localhost mdadm[466]: DeviceDisappeared event detected
on md device /dev/md126
Sep 30 03:40:08 localhost mdadm[466]: DeviceDisappeared event detected
on md device /dev/md127
Sep 30 03:40:08 localhost kernel: md125: detected capacity change from
1863254016 to 0
Sep 30 03:40:08 localhost kernel: md: md125 stopped.
Sep 30 03:40:08 localhost kernel: md: unbind<vdc3>
Sep 30 03:40:08 localhost kernel: md: export_rdev(vdc3)
Sep 30 03:40:08 localhost kernel: md: unbind<vdb3>
Sep 30 03:40:08 localhost kernel: md: export_rdev(vdb3)
Sep 30 03:40:08 localhost kernel: md_release(): md125 released by mdadm
[970]
Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [466]
Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
[466]
Sep 30 03:40:08 localhost kernel: md_open(): md126 opened by mdadm [466]
Sep 30 03:40:08 localhost kernel: md_release(): md126 released by mdadm
[466]
Sep 30 03:40:08 localhost kernel: md_open(): md126 opened by mdadm [970]
Sep 30 03:40:08 localhost kernel: md_release(): md126 released by mdadm
[970]
Sep 30 03:40:08 localhost kernel: md_open(): md126 opened by mdadm [970]
Sep 30 03:40:08 localhost kernel: md126: detected capacity change from
67043328 to 0
Sep 30 03:40:08 localhost kernel: md: md126 stopped.
Sep 30 03:40:08 localhost kernel: md: unbind<vdc1>
Sep 30 03:40:08 localhost kernel: md: export_rdev(vdc1)
Sep 30 03:40:08 localhost kernel: md: unbind<vdb1>
Sep 30 03:40:08 localhost kernel: md: export_rdev(vdb1)
Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [466]
Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
[466]
Sep 30 03:40:08 localhost kernel: md_release(): md126 released by mdadm
[970]
Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [970]
Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
[970]
Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [970]
Sep 30 03:40:08 localhost kernel: md127: detected capacity change from
214564864 to 0
Sep 30 03:40:08 localhost kernel: md: md127 stopped.
Sep 30 03:40:08 localhost kernel: md: unbind<vdc2>
Sep 30 03:40:08 localhost kernel: md: export_rdev(vdc2)
Sep 30 03:40:08 localhost kernel: md: unbind<vdb2>
Sep 30 03:40:08 localhost kernel: md: export_rdev(vdb2)
Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
[970]
The ghost device is no more present so your patch seems to have fixed my
issue. But I must admit I don't really understand what's going on :-/
Since those 'ghost' devices are expected from the MD implementation
point of view, I'm wondering how am I supposed to detect them or maybe
how an application is supposed to recognized online arrays.
If your application is looking in /proc/mdstat, then the "ghost" devices will
be either "inactive" or not present at all.
If your application is looking in /sys/block/md*, then the "ghost" devices
will have "clear" or "inactive" in /sys/block/mdXX/md/array_state.
If you use the new "CREATE names=yes" line in mdadm.conf (mdadm 3.3 or
later), and use kernel 3.17 or later, and use names rather than numbers to
identify your arrays (/dev/md/home, /dev/md_root), then the "ghost" problem
will be gone, and names in /proc/mdstat will be e.g. "md_home", or "md_root"
rather than "md4" or "md127".
Post by Sebastian Parschauer
My application uses udev to detect et to get information about new
devices. I don't think the information exported by udev is enough to
figure this out. Also please note that since I rely on udev, I can't
really read information on /sys since this information may be out of
sync with the one returned by udev.
If udev reports that an array exists, then it really did exist when udev got
the message. By the time your program gets run by udev, it might not exist
any more. i.e. udev is always racy.
Yes, but reading sysfs is also racy. I was thinking that the advantage
of using udev is that it gives me a *consistent* (perhaps outdated)
snapshot of the device state.

Thanks
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
NeilBrown
2014-10-09 09:55:09 UTC
Permalink
Post by Francis Moreau
Post by NeilBrown
Post by Sebastian Parschauer
Hi Neil,
Post by Sebastian Parschauer
Hi Neil,
Post by NeilBrown
Post by Francis Moreau
Post by NeilBrown
So what were pids 930 and 459?
One was presumably the "mdadm -Ss" - probably 930.
Is 459 the "mdadm --monitor" ?? That might be useful hint.
yes.
[456] is: /sbin/mdadm --monitor --scan --daemonise --syslog
--pid-file=/run/mdadm/mdadm.pid
and [930] is 'mdamd -Ss'.
Good. Please try the patch below.
Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by mdadm [970]
Sep 30 03:40:07 localhost kernel: md_release(): md125 released by mdadm
[970]
Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by mdadm [972]
Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by mdadm [970]
Sep 30 03:40:07 localhost kernel: md_release(): md125 released by mdadm
[972]
Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by
systemd-udevd [971]
Sep 30 03:40:07 localhost systemd[1]: Cannot add dependency job for unit
mdmonitor-takeover.service, ignoring: Invalid argument
Sep 30 03:40:07 localhost systemd[1]: Started Software RAID monitoring
and management.
Sep 30 03:40:07 localhost kernel: md_release(): md125 released by
systemd-udevd [971]
Sep 30 03:40:08 localhost mdadm[466]: DeviceDisappeared event detected
on md device /dev/md125
Sep 30 03:40:08 localhost mdadm[466]: DeviceDisappeared event detected
on md device /dev/md126
Sep 30 03:40:08 localhost mdadm[466]: DeviceDisappeared event detected
on md device /dev/md127
Sep 30 03:40:08 localhost kernel: md125: detected capacity change from
1863254016 to 0
Sep 30 03:40:08 localhost kernel: md: md125 stopped.
Sep 30 03:40:08 localhost kernel: md: unbind<vdc3>
Sep 30 03:40:08 localhost kernel: md: export_rdev(vdc3)
Sep 30 03:40:08 localhost kernel: md: unbind<vdb3>
Sep 30 03:40:08 localhost kernel: md: export_rdev(vdb3)
Sep 30 03:40:08 localhost kernel: md_release(): md125 released by mdadm
[970]
Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [466]
Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
[466]
Sep 30 03:40:08 localhost kernel: md_open(): md126 opened by mdadm [466]
Sep 30 03:40:08 localhost kernel: md_release(): md126 released by mdadm
[466]
Sep 30 03:40:08 localhost kernel: md_open(): md126 opened by mdadm [970]
Sep 30 03:40:08 localhost kernel: md_release(): md126 released by mdadm
[970]
Sep 30 03:40:08 localhost kernel: md_open(): md126 opened by mdadm [970]
Sep 30 03:40:08 localhost kernel: md126: detected capacity change from
67043328 to 0
Sep 30 03:40:08 localhost kernel: md: md126 stopped.
Sep 30 03:40:08 localhost kernel: md: unbind<vdc1>
Sep 30 03:40:08 localhost kernel: md: export_rdev(vdc1)
Sep 30 03:40:08 localhost kernel: md: unbind<vdb1>
Sep 30 03:40:08 localhost kernel: md: export_rdev(vdb1)
Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [466]
Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
[466]
Sep 30 03:40:08 localhost kernel: md_release(): md126 released by mdadm
[970]
Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [970]
Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
[970]
Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [970]
Sep 30 03:40:08 localhost kernel: md127: detected capacity change from
214564864 to 0
Sep 30 03:40:08 localhost kernel: md: md127 stopped.
Sep 30 03:40:08 localhost kernel: md: unbind<vdc2>
Sep 30 03:40:08 localhost kernel: md: export_rdev(vdc2)
Sep 30 03:40:08 localhost kernel: md: unbind<vdb2>
Sep 30 03:40:08 localhost kernel: md: export_rdev(vdb2)
Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
[970]
The ghost device is no more present so your patch seems to have fixed my
issue. But I must admit I don't really understand what's going on :-/
Since those 'ghost' devices are expected from the MD implementation
point of view, I'm wondering how am I supposed to detect them or maybe
how an application is supposed to recognized online arrays.
If your application is looking in /proc/mdstat, then the "ghost" devices will
be either "inactive" or not present at all.
If your application is looking in /sys/block/md*, then the "ghost" devices
will have "clear" or "inactive" in /sys/block/mdXX/md/array_state.
If you use the new "CREATE names=yes" line in mdadm.conf (mdadm 3.3 or
later), and use kernel 3.17 or later, and use names rather than numbers to
identify your arrays (/dev/md/home, /dev/md_root), then the "ghost" problem
will be gone, and names in /proc/mdstat will be e.g. "md_home", or "md_root"
rather than "md4" or "md127".
Post by Sebastian Parschauer
My application uses udev to detect et to get information about new
devices. I don't think the information exported by udev is enough to
figure this out. Also please note that since I rely on udev, I can't
really read information on /sys since this information may be out of
sync with the one returned by udev.
If udev reports that an array exists, then it really did exist when udev got
the message. By the time your program gets run by udev, it might not exist
any more. i.e. udev is always racy.
Yes, but reading sysfs is also racy. I was thinking that the advantage
of using udev is that it gives me a *consistent* (perhaps outdated)
snapshot of the device state.
In what sense do you think sysfs is racy? What exactly do you want to do
with the udev event?

The event from the kernel to udev only contains the identity of the device
and the type of event (add,change,remove).

Some drivers add extra 'environment' information.
- 'dm' adds a 'cookie'.
- bcache add a 'CACHED_UUID' and 'CACHED_LABEL'
- libata-acpi adds a 'BAY_EVENT'

but in general there is nothing extra.

If udev adds stuff (which is probably does), it is just as racy as anything
that you might determine and add yourself.

NeilBrown
Francis Moreau
2014-10-10 19:34:18 UTC
Permalink
Hello Neil,
[...]
Post by NeilBrown
Post by Francis Moreau
Yes, but reading sysfs is also racy. I was thinking that the advantage
of using udev is that it gives me a *consistent* (perhaps outdated)
snapshot of the device state.
In what sense do you think sysfs is racy? What exactly do you want to do
with the udev event?
The event from the kernel to udev only contains the identity of the device
and the type of event (add,change,remove).
Some drivers add extra 'environment' information.
- 'dm' adds a 'cookie'.
- bcache add a 'CACHED_UUID' and 'CACHED_LABEL'
- libata-acpi adds a 'BAY_EVENT'
but in general there is nothing extra.
If udev adds stuff (which is probably does), it is just as racy as anything
that you might determine and add yourself.
I was wrongly assuming that if I queried the udev DB through its library
it would have returned a consistent image of the state of the device
meaning that each property of the device would be coherent with the others.

But even if that was possible, it would be pointless since all accesses
to the device could still fail.

Thanks for enlighting me :)
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Loading...