Examining another hang at bootup
Easy 7.0.6 has a mechanism to log the cause of hanging after the message "Loading kernel modules..." in the /etc/rc.d/rc.sysinit script, described here:
"Mechanism to detect kernel hang at bootup"
https://bkhome.org/news/202508/mechanism-to-detect-kernel-hang-at-bootup.html
Just now, i have got another hang, booting on my Lenovo Ideacentre PC. The great thing is, I waited a couple of minutes, then rebooted, and found /mnt/wkg/.syslogd.log.18756, opened it in geany, and found this:
Sep 4 16:36:57 (none) daemon.info kernel: [ 14.904152] udevd[415]: starting eudev-3.2.14
...
Sep 4 16:37:18 (none) daemon.err kernel: [ 36.241134] udevd[429]: timeout '/usr/local/pup_event/bluetooth-add'
Sep 4 16:37:18 (none) daemon.warn kernel: [ 36.241151] udevd[429]: slow: '/usr/local/pup_event/bluetooth-add' [496]
Sep 4 16:37:19 (none) daemon.err kernel: [ 37.242415] udevd[429]: timeout: killing '/usr/local/pup_event/bluetooth-add' [496]
Sep 4 16:37:19 (none) daemon.warn kernel: [ 37.242425] udevd[429]: slow: '/usr/local/pup_event/bluetooth-add' [496]
Sep 4 16:37:19 (none) daemon.err kernel: [ 37.242601] udevd[429]: '/usr/local/pup_event/bluetooth-add' [496] terminated by signal 9 (Killed)
...I don't understand the last line, as I waited a couple of minutes before holding down the power button, and it was still hung when I did that.
Notice the times. udevd daemon starts at 16:36;57, and the timout happens at 16:37:18; that is 20 seconds, which is specified in rc.sysinit when udevd is launched (line 707):
udevd --daemon --resolve-names=early --children-max=32 --event-timeout=20 >/tmp/udevd-debug.log 2>&1
Here is the content of /usr/local/pup_event/bluetooth-add:
#!/usr/bin/bash
BTGUI="$(cat /var/local/bluepup/gui-choice)" #default "bluepup"
if [ "${BTGUI}" != "bluepup" ];then
exit 0
fi
#apparently, script may be holding up udev, so get out quick...
#ref: https://bkhome.org/news/202009/detach-child-process-from-parent.html
#setsid --fork /usr/local/pup_event/bluetooth-add-run
cttyhack sh /usr/local/pup_event/bluetooth-add-run &
...there's nothing much in it, it exits almost immediately!
So what is going on here? Why does udevd think that it hasn't
finished? We have had the same thing happen with other processes
called by a udev rule, such as the binary executable 'ata_id' Yet,
it does work most of the time, just hits this timeout problem
sometimes, maybe after several successful boots.
It looks like there is something seriously wrong with udevd. Well, it is actually the eudev project. I might go to the github project page and examine the commits.
EDIT:
A hack, for now, is some code that will at least, hopefully,
allow bootup to continue. In /etc/rc.d/rc.sysinit, changes in
bold:
sleep 0.1
udevadm trigger --action=add --attr-nomatch=driver --type=subsystems
udevadm settle --timeout=22 #--timeout=5 #20221018 20250811 rem timeout. 20250905 back again.
udevadm trigger --action=add --attr-nomatch=driver --type=devices
udevadm settle --timeout=22 #20250905
#20250905 don't know if eudev udevd reurns non-zero if timeout...
if [ $? -ne 0 ];then
udevadm control --exit --timeout=1
sleep 0.1
killall udevd #just in case.
udevd --daemon --resolve-names=early --children-max=48 --event-timeout=20 >/tmp/udevd-debug2.log 2>&1
fi
In the meantime, I will keep exploring why
this problem is happening.
Tags: easy