Dragonboard 410c wifi problems

Hi all,
For the last two years I have been facing WiFi problems on my DB410c that seriously affect our development.
We are trying to work around them by resetting the WiFi from time to time and reconnecting, but there are still serious problems.

I am using Linaro Debian 9.4 (stretch), and hostapd -v reports 2.4.

The current problem is that some processes get completely stuck after running hotspotd (to put the WiFi into access point mode).

I have the dmesg log and the /var/log/messages log if it helps. How can I attach them as files here?

Debian stretch implies you are using the 2017.09 release (or earlier). How many of your issues reproduce on 2019.01?

To be clear, there remain some bugs on 2019.01, especially https://bugs.96boards.org/show_bug.cgi?id=538 but running an old release is going to give you additional troubles with WiFi.

Hi Daniel, and thanks for the great help!

  1. “How many of your issues reproduce on 2019.01?” - I don’t know. To find out, I need to build the 2019.01 operating system, install everything, deploy our code, and build OpenCV, our custom drivers, etc. That takes around a full week of work (quite an effort). Is there a way to easily upgrade the system from the command line, or should I invest this full week of work?

  2. Are you interested in the dmesg and /var/log logs?

  3. Isn’t there a way to upgrade individual packages, like hostapd, instead of rebuilding the kernel every time and investing a week of work?

One discovery: I have two dragonboards running in the same room with the same MAC address. Could that be the reason?

Multiple boards with the same MAC address - Very Bad…

Network protocols are based on the concept that every device has a globally unique MAC address.

I am not sure what the current process is to change the MAC address, but having the MAC address permanently stored on the eMMC is something I asked for years ago. I don’t know if it has been implemented. I had also asked Arrow to place a sticker on the board with the MAC address (even if the MAC address is not programmed onto the board). Does your board have a sticker with a MAC address on it?

What is the current state of MAC address storage and stickers on the board? I do know that at one time when Debian first booted it was supposed to generate a random MAC address to prevent the collisions you seem to be seeing.

-L-
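
One rough way to check for colliding MAC addresses from another Linux machine on the same network is to look for a MAC that shows up under more than one IPv4 address in the neighbour table. The sketch below is only an illustration: `duplicate_macs` is a hypothetical helper name, and it assumes the usual `ip neigh` line layout (`<ip> dev <iface> lladdr <mac> <state>`). A router or multi-homed host can legitimately appear with several addresses, so treat a hit as a hint, not proof.

```shell
#!/bin/sh
# duplicate_macs (hypothetical helper): reads `ip neigh` output on stdin
# and prints every MAC address that is listed with more than one IP.
duplicate_macs() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "lladdr") {
                mac = $(i + 1)
                # Count each (MAC, IP) pair only once.
                if (!seen[mac, $1]++) n[mac]++
            }
    }
    END { for (m in n) if (n[m] > 1) print m }' | sort
}
```

A possible invocation is `ip -4 neigh | duplicate_macs`; restricting to IPv4 avoids flagging a host merely because it has both an IPv4 and an IPv6 address.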

So this is the thing: We always create one dragonboard (build all the packages there, install the code, etc, etc), and then simply duplicate it by copying the eMMC (bitwise).
So, I guess, the MAC address was copied as well…

This resulted in serious problems, and also caused some processes on the board (especially those run with sudo) to hang, with no indication of the duplicate MAC problem. (Below is the stack dump from the dmesg error, but nothing in it mentions the MAC.)

Do you know how I can permanently change the MAC address from within the code itself, on the board (without flashing or using external cables)?

=====

[23840.426254] Process hostapd (pid: 25898, stack limit = 0xffff800035208020)
[23840.427823] Stack: (0xffff80003520bad0 to 0xffff80003520c000)
[23840.434508] bac0: ffff80003520bb00 ffff000000ce0358
[23840.440328] bae0: ffff800031fab520 ffff800035284400 0000000000000001 ffff800031fab5f0
[23840.448145] bb00: ffff80003520bb40 ffff000000cddee8 0000000000000080 ffff000000ce5410
[23840.455952] bb20: ffff800031fab54c ffff800031fab520 ffff80003520bb40 0000000000cddecc
[23840.463766] bb40: ffff80003520bb80 ffff000000c4cd98 ffff800031faa700 0000000000000000
[23840.471578] bb60: ffff800031faa700 ffff80003513c910 0000000000000001 0000000000008914
[23840.479390] bb80: ffff80003520bba0 ffff000000c5df5c ffff80003513c900 ffff80003513c000
[23840.487202] bba0: ffff80003520bc00 ffff000000c5e60c ffff80003513c000 ffff000000c8a8c8
[23840.495021] bbc0: 0000000000000001 ffff80003513c048 0000000000001002 0000000000008914
[23840.502831] bbe0: ffff80003594c200 0000ffffdbac3a00 0000000000000914 ffff0000088b9eec
[23840.510639] bc00: ffff80003520bc20 ffff0000088b9f68 ffff80003513c000 ffff0000088b9ea0
[23840.518452] bc20: ffff80003520bc60 ffff0000088ba22c ffff80003513c000 0000000000001003
[23840.526263] bc40: 0000000000000001 0000000000000000 ffff80003513c000 ffff80003513c000
[23840.534075] bc60: ffff80003520bca0 ffff0000088ba310 ffff80003513c000 00000000ffffff9d
[23840.541888] bc80: 0000000000001002 0000000000000000 0000000000000001 0000000000000000
[23840.549710] bca0: ffff80003520bcd0 ffff000008927978 ffff80003520bd48 00000000ffffff9d
[23840.557514] bcc0: ffff80003594c210 0000000000000000 ffff80003520bd70 ffff000008929a3c
[23840.565325] bce0: 0000000000008914 ffff000009039080 ffff000009039080 0000000000000007
[23840.573139] bd00: 0000ffffdbac3a00 0000000000008914 0000000000000123 000000000000001d
[23840.580951] bd20: ffff0000089f2000 ffff800035208000 ffff80003594c210 ffff80003513c000
[23840.588765] bd40: ffff0000089f2000 000000306e616c77 0000000000000000 0000000000001003
[23840.596575] bd60: 0000000000000000 0000000000000000 ffff80003520bd80 ffff0000088977b0
[23840.604398] bd80: ffff80003520bdb0 ffff000008898530 0000000000008914 0000ffffdbac3a00
[23840.612213] bda0: 0000000000000010 0000ffffdbac3a00 ffff80003520bdf0 ffff0000081ee7b0
[23840.620014] bdc0: ffff800036107300 0000ffffdbac3a00 ffff800032e7c030 0000000000000007
[23840.627826] bde0: 0000ffffdbac3a00 ffff0000081f9eec ffff80003520be80 ffff0000081eef04
[23840.635638] be00: ffff800036107300 0000000000000000 ffff800036107300 0000000000000007
[23840.643455] be20: 0000ffffdbac3a00 ffff00000889b93c 0000000000000000 0000000000000000
[23840.651264] be40: ffff80003520be80 ffff0000081eeec0 ffff800036107300 0000aaaad1129000
[23840.659079] be60: ffff800036107300 0000000000000007 0000ffffdbac3a00 ffff0000081eeea4
[23840.666889] be80: 0000000000000000 ffff000008082ef0 0000000000000000 0000aaaad1129000
[23840.674701] bea0: ffffffffffffffff 0000ffff8e557eac 0000000000000000 0000000000000015
[23840.682512] bec0: 0000000000000007 0000000000008914 0000ffffdbac3a00 0000aaaafab444f6
[23840.690326] bee0: 0000ffffdbac3a0f 0000ffff8e5db9b8 0000000000000000 0000000000000000
[23840.698150] bf00: 000000000000001d 0000aaaafab44910 5d9bd09800000002 0000000029006528
[23840.705950] bf20: 000500160000001c 290065285d9bd098 0000000000000023 0000000000000000
[23840.713763] bf40: 0000aaaad1129ad8 0000ffff8e557ea0 0000000000000000 0000000000000000
[23840.721575] bf60: 0000aaaad1129000 0000000000000007 0000ffffdbac3a00 0000aaaafab444f0
[23840.729389] bf80: 0000000000000001 0000ffffdbac3b38 0000aaaad10ca0f0 0000aaaafab45140
[23840.737201] bfa0: 0000aaaad112b2b0 0000ffffdbac39c0 0000aaaad107ea54 0000ffffdbac39c0
[23840.745013] bfc0: 0000ffff8e557eac 0000000000000000 0000000000000007 000000000000001d
[23840.752833] bfe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[23840.760644] Call trace:
[23840.768444] Exception stack(0xffff80003520b900 to 0xffff80003520ba30)
[23840.770706] b900: ffff8000211cec00 0001000000000000 ffff80003520bad0 ffff000000cdf1ac
[23840.777305] b920: 0000000000000004 0000000000008000 ffff0000090959f0 0000000000000003
[23840.785117] b940: ffff000008f896d0 0000000000000004 000000040000070f 00000000000bfc00
[23840.792930] b960: 0000000000000000 0000000000000000 ffff80003520b980 ffff000008394d78
[23840.800743] b980: ffff80003520b9d0 ffff0000081d6d90 ffff0000090959d0 0000000000007e00
[23840.808554] b9a0: 00000000a77fa000 ffff8000277faf00 0000000000000040 000000000000003f
[23840.816372] b9c0: 0000000000000002 ffff7dfffe000000 ffff80003610f208 0000000000000000
[23840.824179] b9e0: 0000000000000001 0000000000000040 ffff0000090477c0 0000000000000001
[23840.831992] ba00: 0000000000000068 ffff800037f34450 0000000000000000 0000000000000000
[23840.839804] ba20: ffff800031de2b40 0000000000000000

Well, a fix has been merged in the second-stage bootloader (LK) to derive the MAC address from the eMMC controller hardware ID, which should give a unique MAC address (and Bluetooth BD address). You need (at least) to flash the latest version of LK.

The newest firmware package is available at:
http://snapshots.linaro.org/96boards/dragonboard410c/linaro/rescue/latest/dragonboard-410c-bootloader-emmc-linux-*.zip
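
To illustrate the general idea of deriving a stable per-board address from a hardware identifier, here is a minimal userspace sketch. It is not the LK code and makes no claim about how LK computes the address. `mac_from_id` is a hypothetical helper that hashes any unique string and formats the result as a locally administered unicast MAC (first octet `02`).

```shell
#!/bin/sh
# mac_from_id (hypothetical helper): map a unique ID string to a
# locally administered, unicast MAC address. Illustration only.
mac_from_id() {
    # Hash the ID and keep the first 10 hex digits (5 bytes).
    h=$(printf '%s' "$1" | sha256sum | cut -c1-10)
    # Insert a colon after every byte, drop the trailing colon,
    # and prefix the locally administered octet 02.
    printf '02:%s\n' "$(printf '%s' "$h" | sed 's/../&:/g; s/:$//')"
}
```

The ID could come from something like the eMMC CID in sysfs (the exact path, e.g. `/sys/block/mmcblk0/device/cid`, is an assumption about the board), and the result could be tried with `ip link set dev wlan0 address <mac>` while the interface is down. Whether the WiFi driver on a given release accepts a runtime address change is not confirmed anywhere in this thread.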

@Loic, so that means that with this fix, if we copy the eMMC of one dragonboard to many others, each will have a different MAC address? If so, that’s good.
So I basically need to upgrade our DB from the 2017 kernel to the 2019 kernel, plus the bootloader update, in order to solve the WiFi issues? Another question: is it possible to upgrade Debian from the 2017 release to the 2019 release from the command line, without flashing? If we must flash, can the ‘user space’ stay the same, or do we have to start ‘fresh’ and build everything from scratch?

Yes, it would fix this problem. However, doing a bitwise copy of a Linux system can cause other problems: there are other files which are not supposed to be duplicated across Linux systems. For example:

  • /etc/machine-id
  • the SSH keys in /etc/ssh

so generally speaking you could have other problems once you fix the MAC address issue…
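
As a rough sketch of what cleaning up these two items on a cloned image could look like: `clear_clone_identity` is a hypothetical helper, and its root-directory argument is an assumption added so it can be tried on a mounted image or a scratch directory before touching a live system. It empties the machine ID (systemd treats an empty /etc/machine-id as uninitialized and generates a new one at boot) and deletes the SSH host keys.

```shell
#!/bin/sh
# clear_clone_identity (hypothetical helper): remove per-machine identity
# files below the given root directory (defaults to /). Run as root.
clear_clone_identity() {
    root="${1:-/}"
    # Truncate rather than delete, so the file still exists but is empty.
    : > "$root/etc/machine-id"
    # Remove SSH host key pairs; sshd_config and other files are kept.
    rm -f "$root"/etc/ssh/ssh_host_*_key "$root"/etc/ssh/ssh_host_*_key.pub
}
```

After this, the host keys have to be recreated before sshd will accept connections; running `ssh-keygen -A` as root is one way to do that. Other per-machine state may exist as well (for example /var/lib/dbus/machine-id on some Debian installs), so this is a starting point and not a complete list.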

@anon91830841 what’s the importance of /etc/machine-id ?

@danielt,

Changing Linaro from 17.09 to 19.01 is hard, since all our driver patches to the kernel (for the camera) need to be rewritten.
However, when I run sudo apt-get upgrade I can get to Debian 9.11 (instead of the current 9.4).

Would the debian change 9.4->9.11 be enough to solve the wifi instability problems?

No. AFAIR most of the important changes are in the kernel.

Dear all,

I have been facing the same issue with unstable WiFi on the dragonboard 410c for several years. There is no straight answer to this topic in any forum. I have 2 dragonboards and both show the same behavior on several releases of the Linaro OS: WiFi disconnects multiple times per day. If there is a fix for this problem, a straight answer with a solution would be appreciated.
Many thanks in advance,
brgds
Bernhard

@Bernhard, do you reproduce the issue with the latest release [1] (and firmware)? Do you have any logs from the dragonboard side, e.g. showing the disconnect reason?

[1] Linaro Releases

I am using the latest release: https://releases.linaro.org/96boards/dragonboard410c/linaro/debian/latest/

Here are the relevant lines from syslog:
Jan 3 03:04:14 ISE-DEV-008 NetworkManager[1726]: [1578017054.7327] dhcp4 (wlan0): address 10.0.0.6
Jan 3 03:04:14 ISE-DEV-008 NetworkManager[1726]: [1578017054.7328] dhcp4 (wlan0): plen 24 (255.255.255.0)
Jan 3 03:04:14 ISE-DEV-008 NetworkManager[1726]: [1578017054.7329] dhcp4 (wlan0): gateway 10.0.0.1
Jan 3 03:04:14 ISE-DEV-008 NetworkManager[1726]: [1578017054.7330] dhcp4 (wlan0): lease time 86400
Jan 3 03:04:14 ISE-DEV-008 NetworkManager[1726]: [1578017054.7330] dhcp4 (wlan0): nameserver ‘10.0.0.1’
Jan 3 03:04:14 ISE-DEV-008 NetworkManager[1726]: [1578017054.7331] dhcp4 (wlan0): nameserver ‘10.0.0.1’
Jan 3 03:04:14 ISE-DEV-008 NetworkManager[1726]: [1578017054.7331] dhcp4 (wlan0): domain name ‘home’
Jan 3 03:04:14 ISE-DEV-008 NetworkManager[1726]: [1578017054.7332] dhcp4 (wlan0): state changed unknown → bound
Jan 3 03:04:14 ISE-DEV-008 NetworkManager[1726]: [1578017054.7380] device (wlan0): state change: ip-config → ip-check (reason ‘none’, sys-iface-state: ‘managed’)
Jan 3 03:04:14 ISE-DEV-008 NetworkManager[1726]: [1578017054.7415] device (wlan0): state change: ip-check → secondaries (reason ‘none’, sys-iface-state: ‘managed’)
Jan 3 03:04:14 ISE-DEV-008 NetworkManager[1726]: [1578017054.7427] device (wlan0): state change: secondaries → activated (reason ‘none’, sys-iface-state: ‘managed’)
Jan 3 03:04:14 ISE-DEV-008 NetworkManager[1726]: [1578017054.7449] manager: NetworkManager state is now CONNECTED_LOCAL
Jan 3 03:04:14 ISE-DEV-008 dhclient[1880]: bound to 10.0.0.6 – renewal in 35064 seconds.
Jan 3 03:04:14 ISE-DEV-008 NetworkManager[1726]: [1578017054.7594] manager: NetworkManager state is now CONNECTED_SITE
Jan 3 03:04:14 ISE-DEV-008 NetworkManager[1726]: [1578017054.7602] policy: set ‘WiFi’ (wlan0) as default for IPv4 routing and DNS
Jan 3 03:04:14 ISE-DEV-008 NetworkManager[1726]: [1578017054.7613] dns-mgr: Writing DNS information to /sbin/resolvconf
Jan 3 03:04:14 ISE-DEV-008 NetworkManager[1726]: [1578017054.8521] device (wlan0): Activation: successful, device activated.
Jan 3 03:04:14 ISE-DEV-008 NetworkManager[1726]: [1578017054.8555] manager: NetworkManager state is now CONNECTED_GLOBAL
Jan 3 03:04:14 ISE-DEV-008 nm-dispatcher: req:3 ‘up’ [wlan0]: new request (1 scripts)
Jan 3 03:04:14 ISE-DEV-008 nm-dispatcher: req:3 ‘up’ [wlan0]: start running ordered scripts…
Jan 3 03:04:14 ISE-DEV-008 nm-dispatcher: req:4 ‘connectivity-change’: new request (1 scripts)
Jan 3 03:04:15 ISE-DEV-008 nm-dispatcher: req:4 ‘connectivity-change’: start running ordered scripts…
Jan 3 03:04:45 ISE-DEV-008 systemd-timesyncd[1652]: Synchronized to time server 185.175.58.235:123 (2.debian.pool.ntp.org).
Jan 3 03:09:01 ISE-DEV-008 CRON[1950]: (root) CMD ( [ -x /usr/lib/php/sessionclean ] && if [ ! -d /run/systemd/system ]; then /usr/lib/php/sessionclean; fi)
Jan 3 03:09:11 ISE-DEV-008 kernel: [ 306.157652] wlan0: Connection to AP 8c:eb:c6:4d:b3:38 lost
Jan 3 03:09:11 ISE-DEV-008 wpa_supplicant[1725]: wlan0: CTRL-EVENT-DISCONNECTED bssid=8c:eb:c6:4d:b3:38 reason=4 locally_generated=1
Jan 3 03:09:11 ISE-DEV-008 NetworkManager[1726]: [1578017351.5493] sup-iface[0xaaaaeb490190,wlan0]: connection disconnected (reason -4)
Jan 3 03:09:11 ISE-DEV-008 NetworkManager[1726]: [1578017351.5669] device (wlan0): supplicant interface state: completed → disconnected
Jan 3 03:09:11 ISE-DEV-008 wpa_supplicant[1725]: wlan0: CTRL-EVENT-REGDOM-CHANGE init=CORE type=WORLD
Jan 3 03:09:11 ISE-DEV-008 NetworkManager[1726]: [1578017351.6547] device (wlan0): supplicant interface state: disconnected → scanning
Jan 3 03:09:12 ISE-DEV-008 wpa_supplicant[1725]: wlan0: SME: Trying to authenticate with 8c:eb:c6:4d:b3:38 (SSID=‘iseDMZ’ freq=2412 MHz)
Jan 3 03:09:12 ISE-DEV-008 kernel: [ 307.131323] wlan0: authenticate with 8c:eb:c6:4d:b3:38
Jan 3 03:09:12 ISE-DEV-008 kernel: [ 307.176376] wlan0: send auth to 8c:eb:c6:4d:b3:38 (try 1/3)
Jan 3 03:09:12 ISE-DEV-008 NetworkManager[1726]: [1578017352.4945] device (wlan0): supplicant interface state: scanning → authenticating
Jan 3 03:09:12 ISE-DEV-008 kernel: [ 307.266626] wlan0: authenticated
Jan 3 03:09:12 ISE-DEV-008 kernel: [ 307.270556] wlan0: associate with 8c:eb:c6:4d:b3:38 (try 1/3)
Jan 3 03:09:12 ISE-DEV-008 wpa_supplicant[1725]: wlan0: Trying to associate with 8c:eb:c6:4d:b3:38 (SSID=‘iseDMZ’ freq=2412 MHz)
Jan 3 03:09:12 ISE-DEV-008 NetworkManager[1726]: [1578017352.5903] device (wlan0): supplicant interface state: authenticating → associating
Jan 3 03:09:12 ISE-DEV-008 kernel: [ 307.286372] wlan0: RX AssocResp from 8c:eb:c6:4d:b3:38 (capab=0x1411 status=0 aid=1)
Jan 3 03:09:12 ISE-DEV-008 wpa_supplicant[1725]: wlan0: Associated with 8c:eb:c6:4d:b3:38
Jan 3 03:09:12 ISE-DEV-008 wpa_supplicant[1725]: wlan0: CTRL-EVENT-SUBNET-STATUS-UPDATE status=0
Jan 3 03:09:12 ISE-DEV-008 kernel: [ 307.319724] wlan0: associated
Jan 3 03:09:12 ISE-DEV-008 NetworkManager[1726]: [1578017352.6413] device (wlan0): supplicant interface state: associating → 4-way handshake
Jan 3 03:09:12 ISE-DEV-008 wpa_supplicant[1725]: wlan0: WPA: Key negotiation completed with 8c:eb:c6:4d:b3:38 [PTK=CCMP GTK=TKIP]
Jan 3 03:09:12 ISE-DEV-008 wpa_supplicant[1725]: wlan0: CTRL-EVENT-CONNECTED - Connection to 8c:eb:c6:4d:b3:38 completed [id=0 id_str=]
Jan 3 03:09:12 ISE-DEV-008 wpa_supplicant[1725]: wlan0: CTRL-EVENT-REGDOM-CHANGE init=COUNTRY_IE type=COUNTRY alpha2=AT
Jan 3 03:09:12 ISE-DEV-008 NetworkManager[1726]: [1578017352.6895] device (wlan0): supplicant interface state: 4-way handshake → completed
Jan 3 03:09:12 ISE-DEV-008 kernel: [ 307.387738] wlan0: Limiting TX power to 20 (20 - 0) dBm as advertised by 8c:eb:c6:4d:b3:38
Jan 3 03:09:14 ISE-DEV-008 wpa_supplicant[1725]: wlan0: CTRL-EVENT-SIGNAL-CHANGE above=1 signal=-58 noise=9999 txrate=1000
Jan 3 03:15:26 ISE-DEV-008 kernel: [ 680.740647] wlan0: Connection to AP 8c:eb:c6:4d:b3:38 lost

How reproducible is this issue? Are you disconnected every 5 minutes, every few hours? Is the connection always re-established successfully? I can’t reproduce it on my side. From your log, the disconnection reason code is ‘4’, which normally means ‘disconnected due to inactivity’, though it could also be a bug. Can you please increase the driver verbosity and re-share the syslog (e.g. via pastebin) when the error happens:

echo 0x2504 > /sys/module/wcn36xx/parameters/debug_mask

Also, can you try running a ping command (indefinitely) on the dragonboard side to see if it changes anything:

ping 8.8.8.8

Is it possible to also enable AES-CCMP on your access point (instead of TKIP), or to test with another AP to check the reproducibility?

Another test is to disable power saving:

iw dev wlan0 set power_save off
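
When collecting logs over several days, it can help to summarize which disconnect reason codes actually occur and how often. The sketch below assumes the wpa_supplicant line format seen in the syslog excerpt above (`CTRL-EVENT-DISCONNECTED ... reason=N ...`); `disconnect_reasons` is a hypothetical helper name.

```shell
#!/bin/sh
# disconnect_reasons (hypothetical helper): reads syslog text on stdin
# and prints one "count reason=N" line per disconnect reason code.
disconnect_reasons() {
    grep 'CTRL-EVENT-DISCONNECTED' |
        sed -n 's/.*reason=\([0-9]*\).*/reason=\1/p' |
        sort | uniq -c | awk '{print $1, $2}'
}
```

For example, `disconnect_reasons < /var/log/syslog` would print a line such as `12 reason=4` if reason 4 appeared twelve times in that file.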

I have found some hints in other blogs about the log entry “device not managed” for wlan0.

The following command (run from the command line after reboot) has kept the interface working for the past few hours without a disconnect:

$ sudo nmcli dev set wlan0 managed yes

At the moment it works for the first time without disconnecting. Let’s wait and see for a few days…

On the other points:

What is the reproducibility of this issue? are you disconnected every 5 minutes, hours?

It seems to disconnect about every 20 minutes, and then reconnects after some time (10 to 15 minutes).

PING 8.8.8.8

drops with the same syslog entries

Driver:

I will send you more log data once it disconnects again. I have applied the “managed yes” setting, so there is currently no disconnection.

Power saving:

It is off in the settings, as I am running a GPS/GNSS skyview monitor.

Weird, because the interface is supposed to already be in managed mode… but let me know if it fixes your issue.

Even after setting wlan0 as managed, the WiFi connection keeps disconnecting, though less frequently: about every 2 hours now.

I have written a small daemon which connects to a host on the internet every 5 minutes, to see if the issue is related to wlan0 falling asleep…

Will monitor the board for some days,
brgds
Bernhard
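
For readers who want to run a similar test, a keepalive probe of the kind described above could look roughly like this. It is a sketch, not the daemon used in this thread: `probe_once`, `keepalive_loop`, and the `PROBE_CMD` and `INTERVAL` variables are all hypothetical names. By default it sends a single ping with a 5 second timeout and logs a timestamped up/DOWN line.

```shell
#!/bin/sh
# probe_once (hypothetical helper): probe host $1 once and log the result.
# PROBE_CMD can override the probe command (default: one ping, 5 s timeout).
probe_once() {
    if ${PROBE_CMD:-ping -c 1 -W 5} "$1" >/dev/null 2>&1; then
        echo "$(date '+%F %T') up $1"
    else
        echo "$(date '+%F %T') DOWN $1"
        return 1
    fi
}

# keepalive_loop: probe host $1 every INTERVAL seconds (default 300) forever.
keepalive_loop() {
    while :; do
        probe_once "$1" || true
        sleep "${INTERVAL:-300}"
    done
}
```

Running `keepalive_loop 8.8.8.8 >> keepalive.log` in the background and comparing the DOWN timestamps with the disconnect lines in syslog would show whether periodic traffic changes the disconnect pattern.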
