Inconsistent WiFi

The wcn36xx driver does not read WCNSS_qcom_cfg.ini and I’m not able to figure out exactly what “Idle Scan” means in the downstream driver, so I’m not able to say if we’re doing something similar or not.

I am unsure whether this is the official definition, but to me Idle Scan means that the radio scans for networks after a given period of inactivity. That resembles our issue with high latency whenever the radio scans for networks.

So I might have a possible fix, but it’s more like a band-aid -

On a whim I bypassed network manager completely by disabling its service in systemd. I then defined my network in /etc/network/interfaces, put its associated wpa-conf file in /etc/wpa_supplicant, and connected to the network without the use of network-manager.

Much like the workaround procedure in: https://bugs.96boards.org/show_bug.cgi?id=272#c17

Then I started my ping tests and saw that after ~3 hours of sustained pinging I had 0% packet loss in both directions, desktop->dragonboard and dragonboard->desktop.
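For anyone repeating the long ping runs, here is a minimal sketch for pulling the loss figure out of a saved ping log instead of reading it by eye. It assumes the summary line format printed by iputils ping; the function name and the sample line are made up for illustration and are not output from the tests in this thread.

```python
import re

# Assumed format: the summary line printed by iputils ping, e.g.
# "5 packets transmitted, 3 received, 40% packet loss, time 4005ms"
_SUMMARY = re.compile(
    r"(\d+) packets transmitted, (\d+) (?:packets )?received,"
    r".*?([\d.]+)% packet loss"
)


def parse_ping_summary(output):
    """Return (transmitted, received, loss_percent), or None if absent."""
    match = _SUMMARY.search(output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), float(match.group(3))


if __name__ == "__main__":
    # Hypothetical sample line, only to show the call.
    sample = "5 packets transmitted, 3 received, 40% packet loss, time 4005ms"
    print(parse_ping_summary(sample))
```

Running it over the log from each direction gives two tuples that can be compared directly between runs with and without network-manager.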

This pointed me towards an issue in network-manager! By default, network-manager periodically scans for networks while the adapter is in the connected state, probably to aid roaming from AP to AP in a corporate wifi environment.

So I did a little digging and fell upon this page in the archlinux wiki - https://wiki.archlinux.org/index.php/NetworkManager#Regular_network_disconnects.2C_latency_and_lost_packets_.28WiFi.29

This page pointed me to a repo with a patch for network-manager where we disable the periodic scan behavior -
https://aur.archlinux.org/cgit/aur.git/tree/disable_wifi_scan_when_connected.patch?h=networkmanager-noscan&id=0957c5941c95a8fc166379e36b1fbee85bfddbab

I then got the source for our network manager via apt and applied the above patch, compiled, and installed the patched version of network-manager after reverting my changes to /etc/network/interfaces and enabling network-manager via systemd.

Now after about ~20 minutes of pinging to and from the dragonboard, I have 0% packet loss. I am hesitant to say that this fixes the issue entirely, but it seems to be a lot more stable than it was with the periodic scanning enabled. I encourage others to give this a shot to see if things are more stable for you while using WIFI and network-manager.

So after testing this for over a week now, I can say that this just makes things a little better in terms of stability, but the WIFI is still very inconsistent and can even cause the board to reboot when running specific iperf tests.

I am running the latest release build #252.

Try this test -

  1. Start iperf server in UDP mode: iperf -s -u -p 12345

  2. Start passing packets to server using this command on the dragonboard: iperf -c 10.1.102.206 -u -p 12345 -t 30 -b 54M -i 1

  3. You’ll see a few packets transfer across the link, and then stop and reboot. Do this with the serial UART cable attached for maximum feedback, since at this point the device is probably unreachable over the network.

Here’s a log:

Has anyone tried the test in the previous post? If so what happened?

When I run this test on my DB410c, using the recent linux-next kernel I have running, it finishes cleanly with:
[ 3] 0.0-236.0 sec 35.6 MBytes 1.27 Mbits/sec 5.732 ms 11291/36701 (31%)

There’s no (relevant) delta in the WiFi driver itself, so I suspect the reboot is something unrelated.

That said, running the same test from my laptop I get 0% packet loss…

Regards,
Bjorn

Hello Bjorn,

I’d like to test out the recent linux-next kernel that you’re referring to. Does this map to a snapshot, or do I need to compile everything from sources (kernel, initramfs, and rootfs)?

Thanks again,

Rob

It’s been quite a while since we last talked about this WIFI issue. Have there been any new developments?

I’m very eager to assist in testing any potential fixes for this issue. Is there a snapshot that I can test that might provide any significant improvements to the performance of the WCN3620?

Sorry if this is getting annoying, it’s just that I’m dead in the water with this board if I cannot use the embedded WIFI adapter in a reliable fashion.

@Rob_Gries

Now after about ~20 minutes of pinging to and from the dragonboard, I have 0% packet loss. I am hesitant to say that this fixes the issue entirely, but it seems to be a lot more stable than it was with the periodic scanning enabled. I encourage others to give this a shot to see if things are more stable for you while using WIFI and network-manager.

I’ve built most of the underlying software from various point(s) - yes, there are some major issues with NetworkManager. I’m using a v1.4.4 build that I forked from upstream.

There’s a sweet spot between hw-scans and an updated version of NM which seems to work without failures for prolonged periods of time. However, if you bypass this and go with a straight up network interface and wpa - it also works great.

Cheers!

Would you mind doing some iperf tests? Check the above posts for the commands.

I’m not sure that my test results would apply directly to linaro’s support scenario, but (externally) I can confirm that NM presents unnecessary problems in versions lower than 1.4

I have forked a lot of stuff and frozen it for my system… so, my test results wouldn’t be totally applicable to v17+ from linaro.

But I can tell you that the commands on my forked version work between 20-50Mbits/sec. Bandwidth on this chip will never be perfect - is my experience.

Off hand do you remember what was changed other than network manager?

Also what kernel version are you on?

I need to preface my response by saying that getting into the specifics of my system is a long, long thread… and probably less useful for current users…

I’m using:
network-manager 1.4.4 at 3c70a0…
wpasupplicant 2.3-1+deb8u4

I noticed immediate improvements with this combo.

I think the kernel was forked around linaro release 16.04 and selectively patched into a frankinkern thereafter.

I understand linaro has done substantial merging upstream since… however, wcnss has been mostly consistent… but the underlying driver support has changed quite a lot as things have migrated forward on linux side.

I tried the earlier versions of network-manager (1.4.4@3c70a0) and wpa_supplicant (2.3-1+deb8u4) with no measurable amount of success.

I’m running linaro release 17.04 #233 - Perhaps there are differences in the driver between 16.04 and 17.04 that make this occur?

Can you post your kernel sources to github by any chance? Or do they not differ from 16.04?

Hello,

So I have now confirmed that the watchdog bite that occurred in the #252 build while running the iperf test no longer occurs when running the latest #260 build.

However, the intermittent packet loss/loss of connectivity issue still exists.

I figured I’d help update this post and see if it was useful to anyone else out there looking for a straw to grasp when it comes to the downright depressing performance of our lovely WCN3620 adapter.

-Rob

Hi Rob,

Just did some testing as well, since wifi is an important part for me too.

I can confirm the reboots when doing the iperf tests. However I also tried it with wifi in AP mode. The curious thing is that it works well with 802.11n+wmm enabled (in AP mode), throughput is also great:

root@dragonboard-410c:~# ./iperf -c 192.168.42.102 -u -p 12345 -t 30 -b 100M -i 1
...
[  3] 29.0-30.0 sec  8.93 MBytes  74.9 Mbits/sec
[  3]  0.0-30.0 sec   260 MBytes  72.6 Mbits/sec
[  3] Sent 185323 datagrams
[  3] Server Report:
[  3]  0.0-29.9 sec   195 MBytes  54.7 Mbits/sec   0.497 ms 46262/185322 (25%)

(But unfortunately for me this mode is troublesome for tcp connections somehow…)
Without the 802.11n/wmm the reboot also happens when you try to send faster than the available bandwidth.

I didn’t do any testing in STA mode while connected to a 802.11n wifi network. Maybe that might work for you.

All tests were done using an OE 17.06.1 build in a setup without NetworkManager.

It can’t be working all that well, you’ve got 25% packet loss. You’re dropping a quarter of the total datagrams sent. I’m not sure that is all that great… -_-
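To make the loss figures easier to compare between runs, here is a small sketch that reads the lost/total datagram counts from an iperf UDP server report line and recomputes the percentage. It assumes the iperf 2 style report line quoted in this thread; the function name is made up for illustration.

```python
import re

# Assumed format: the tail of an iperf 2 UDP server report line,
# e.g. "... 0.497 ms 46262/185322 (25%)"
_REPORT = re.compile(r"(\d+)\s*/\s*(\d+)\s+\(([\d.]+)%\)")


def udp_loss(report_line):
    """Return (lost, total, loss_percent), or None if the line does not match."""
    match = _REPORT.search(report_line)
    if match is None:
        return None
    lost, total = int(match.group(1)), int(match.group(2))
    if total == 0:
        return None
    return lost, total, 100.0 * lost / total


if __name__ == "__main__":
    # The AP-mode server report line posted above.
    line = "[  3]  0.0-29.9 sec   195 MBytes  54.7 Mbits/sec   0.497 ms 46262/185322 (25%)"
    lost, total, pct = udp_loss(line)
    print(f"{lost} of {total} datagrams lost ({pct:.2f}%)")
```

For the AP-mode run that is 46262 of 185322 datagrams, which rounds to the 25% iperf prints; the earlier linux-next run (11291/36701) rounds to 31% the same way.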

Correct, still ~54Mbit/s is not bad. When lowering specified send rate from 100M to the 54M you used, the drop was 1%.

Did some more testing, and with 802.11n/wmm enabled it will also eventually reboot… It just seems to take longer and occur much less frequently.

I am also looking for a work around for the wifi issue. Here are some tests that I completed today.

  1. Open Embedded build (rpb-console-image)
    Replicated high ping times (some > 1000ms)
    Ping time seems to get worse as router signal strength is lower

  2. I added a USB WiFI adapter to the dragonboard on the openembedded rpb-console-image
    rtl8188cus chipset (enabled driver in menuconfig)
    I see the same ping times as the 3620. (>1000ms at times)
    This indicates the issue is not with the driver

  3. I added a USB to ethernet adapter and tried ping again
    1ms ping times to router
    20ms to 8.8.8.8 (similar to my laptop)

  4. Disabled network manager service and retested 3620 ping
    Connected using iw and wpa_supplicant commands
    Significantly better results
    <5ms pings to router
    20ms average to 8.8.8.8 with a few up to 50ms

  5. Repeated results on latest debian image with newer network manager
    OE network manager version: 1.0.12
    Debian network manager version: 1.6.x (forgot the exact version)
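If anyone wants to compare runs like the ones above with numbers rather than eyeballing the terminal, here is a rough sketch that summarises the per-reply round-trip times from a saved ping log and counts the spikes. It assumes the "time=... ms" field that iputils ping prints on each reply; the function name, the 1000 ms default threshold and the sample lines are all made up for illustration.

```python
import re
import statistics

# Assumed format: each reply line from iputils ping carries "time=<value> ms".
_RTT = re.compile(r"time=([\d.]+) ms")


def rtt_stats(output, spike_ms=1000.0):
    """Summarise round-trip times found in ping output.

    Returns a dict with the reply count, median, maximum, and the number
    of replies slower than spike_ms, or None if no replies were found.
    """
    rtts = [float(value) for value in _RTT.findall(output)]
    if not rtts:
        return None
    return {
        "count": len(rtts),
        "median_ms": statistics.median(rtts),
        "max_ms": max(rtts),
        "spikes": sum(1 for rtt in rtts if rtt > spike_ms),
    }


if __name__ == "__main__":
    # Hypothetical sample replies, not measurements from this thread.
    sample = (
        "64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=4.2 ms\n"
        "64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=1310 ms\n"
        "64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=3.8 ms\n"
    )
    print(rtt_stats(sample))
```

The median stays low even when a scan causes a few very slow replies, so looking at the spike count next to the median separates "generally slow link" from "periodic stalls".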

Can somebody confirm with another USB to WiFi adapter?

Takeaway: It appears to be less about the 36xx driver and more about networkmanager.

For sure it is easy to have unrealistic expectations of wireless, and it’s definitely worth running checks with other well supported adapters to make sure we don’t start looking for ethernet levels of reliability over a wireless link.

In particular if the WiFi device is built around a single tuner then it will have high ping times during an SSID scan. It is part of the hardware design and NetworkManager cannot change that: NetworkManager and WiFi Scans – Dan Williams’ blog