Inconsistent WiFi

Do you have the downstream driver set up so that you can confirm that there are no traces of the scan in the latency numbers during a scan?

I am running the most recent snapshot on my DB410C, but can flash to any version that is helpful for diagnosing this issue. What is a good version to use as a test for this issue?

To confirm this you should be able to trigger this behavior by just issuing:
iw dev wlan0 scan

I have been able to confirm that network scanning while pinging increases latency each time the command is executed. Is there a way to disable this automatic scan for networks so as to lower the amount of latency experienced during use?
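One rough way to put numbers on this is to compare the worst ping round-trip time with and without scans being triggered in the background. The following is only a sketch: the function names are made up for illustration, it assumes the interface is wlan0, that `ping` is the iputils version (which prints `time=… ms` and supports `-w`), and that it runs as root so `iw` is allowed to scan.

```shell
#!/bin/sh
# Sketch: compare ping RTT with and without manually triggered scans.
# Assumptions (not from the thread): iputils ping, interface wlan0, root.

# Pull the RTT value (in ms) out of ping reply lines read on stdin.
extract_rtt() {
    sed -n 's/.*time=\([0-9.]*\) ms.*/\1/p'
}

# Ping host $1 for $2 seconds and print the highest RTT seen.
max_rtt() {
    ping -w "$2" "$1" | extract_rtt | sort -n | tail -n 1
}

# Same measurement, but trigger a scan every 5 seconds while pinging.
max_rtt_while_scanning() {
    ( while :; do iw dev wlan0 scan >/dev/null 2>&1; sleep 5; done ) &
    scanner=$!
    max_rtt "$1" "$2"
    kill "$scanner"
}
```

Running `max_rtt 192.168.1.1 30` followed by `max_rtt_while_scanning 192.168.1.1 30` (the address is a placeholder for your own gateway) should show whether the scans account for the latency spikes.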

I started seeing this issue (auth timeout) recently as well, but I thought it was related to me switching out my older Apple AP for a Google WiFi.

Glad to help narrow down the issue. WIFI problems can make you question even the slightest change in configuration! :smile:

Our OE build config was updated on 6/23 with this fix, and no build has been triggered since, so the next build will indeed have the fix!

In Android builds there is a file named WCNSS_qcom_cfg.ini that contains user preferences that override the factory defaults in the WLAN driver. This file is usually found in the /data/misc/wifi directory. Does the driver on Debian look for the parameter overrides in this file?

I ask because one of the issues we see is that the radio’s idle scan mode ends up causing latency issues when the radio is in use. In the WCNSS_qcom_cfg.ini file there is a parameter that might be able to turn this idle scan mode off:

# Enable/Disable Idle Scan
gEnableIdleScan=0

Is this possible in the Debian distribution?

The wcn36xx driver does not read WCNSS_qcom_cfg.ini and I’m not able to figure out exactly what “Idle Scan” means in the downstream driver, so I’m not able to say if we’re doing something similar or not.

I am unsure whether this is the official definition, but to me Idle Scan means that the radio will scan for networks after a given period of inactivity. This is somewhat like our issue with high latency whenever the radio scans for networks.

So I might have a possible fix, but it’s more like a band-aid -

On a whim I went ahead and basically bypassed network manager completely by disabling the service in systemd, then I created my network in /etc/network/interfaces and its associated wpa-conf file in /etc/wpa_supplicant and connected to the network without the use of network-manager.

Much like the workaround procedure in: https://bugs.96boards.org/show_bug.cgi?id=272#c17
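For anyone wanting to try the same bypass, the two files might look roughly like the fragment below once the NetworkManager service has been stopped and disabled. This is a sketch, not the poster's actual configuration: the wpa_supplicant.conf file name, the SSID, and the passphrase are placeholders, and it assumes DHCP and the `wpa-conf` hook that Debian's wpasupplicant package provides for ifupdown.

```
# /etc/network/interfaces (fragment; wlan0 and DHCP are assumptions)
auto wlan0
iface wlan0 inet dhcp
    wpa-conf /etc/wpa_supplicant/wpa_supplicant.conf

# /etc/wpa_supplicant/wpa_supplicant.conf (placeholder credentials)
network={
    ssid="ExampleSSID"
    psk="example-passphrase"
}
```

With this in place, `ifup wlan0` should bring the interface up without NetworkManager and therefore without its periodic scans.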

Then I started my ping tests and saw that after ~3 hours of sustained pinging I had 0% packet loss from both desktop->dragonboard and dragonboard->desktop.

This pointed me towards an issue in network-manager! The default behavior for network-manager is to scan for networks periodically even while the adapter is connected, probably to aid in roaming from AP to AP in a corporate WiFi environment.

So I did a little digging and fell upon this page in the archlinux wiki - https://wiki.archlinux.org/index.php/NetworkManager#Regular_network_disconnects.2C_latency_and_lost_packets_.28WiFi.29

This page pointed me to a repo with a patch for network-manager where we disable the periodic scan behavior -
https://aur.archlinux.org/cgit/aur.git/tree/disable_wifi_scan_when_connected.patch?h=networkmanager-noscan&id=0957c5941c95a8fc166379e36b1fbee85bfddbab

I then got the source for our network manager via apt and applied the above patch, compiled, and installed the patched version of network-manager after reverting my changes to /etc/network/interfaces and enabling network-manager via systemd.
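The rebuild described above could be scripted along these lines. Treat it as a sketch under assumptions the thread does not confirm: the function name is invented, `deb-src` entries must be enabled for `apt-get source` to work, the patch is assumed to have been downloaded from the AUR link beforehand, and `-p1` is a guess at the strip level (the dry run is there to catch a patch that does not apply to Debian's version of the source).

```shell
# Sketch: rebuild Debian's network-manager package with a local patch.
# Assumptions: deb-src entries enabled, build tools installed, patch
# applies with -p1. The body runs in a subshell so the caller's working
# directory is left alone.
rebuild_nm_with_patch() (
    patch_file=$1
    if [ ! -f "$patch_file" ]; then
        echo "patch not found: $patch_file" >&2
        exit 1
    fi
    patch_file=$(readlink -f "$patch_file")

    workdir=$(mktemp -d) || exit 1
    cd "$workdir" || exit 1

    # Fetch build dependencies and the packaged source tree.
    sudo apt-get build-dep -y network-manager || exit 1
    apt-get source network-manager || exit 1
    cd network-manager-*/ || exit 1

    # Check that the patch applies before modifying the tree.
    patch -p1 --dry-run < "$patch_file" || exit 1
    patch -p1 < "$patch_file" || exit 1

    # Build unsigned binary packages; the .deb files land in $workdir.
    dpkg-buildpackage -us -uc -b || exit 1
    echo "packages built in $workdir"
)
```

A call such as `rebuild_nm_with_patch ./disable_wifi_scan_when_connected.patch` leaves the resulting packages in the temporary directory it prints, from where they can be installed with `sudo dpkg -i`.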

Now after ~20 minutes of pinging to and from the dragonboard, I have 0% packet loss. I am hesitant to say that this fixes the issue entirely, but it seems to be a lot more stable than it was with the periodic scanning enabled. I encourage others to give this a shot to see if things are more stable for you while using WIFI and network-manager.

So after testing this for over a week now, I can say that this just makes things a little better in terms of stability, but the WIFI is still very inconsistent and can even cause the board to reboot when running specific iperf tests.

I am running the latest release build #252.

Try this test -

  1. Start iperf server in UDP mode: iperf -s -u -p 12345

  2. Start passing packets to server using this command on the dragonboard: iperf -c 10.1.102.206 -u -p 12345 -t 30 -b 54M -i 1

  3. You’ll see a few packets transfer across the link, and then stop and reboot. Do this with the serial UART cable attached for maximum feedback, since at this point the device is probably unreachable over the network.

Here’s a log:

Has anyone tried the test in the previous post? If so what happened?

When I run this test on my DB410c using the recent linux-next kernel I have running, it finishes cleanly with:
[ 3] 0.0-236.0 sec 35.6 MBytes 1.27 Mbits/sec 5.732 ms 11291/36701 (31%)

There’s no (relevant) delta in the WiFi driver itself, so I suspect the reboot is something unrelated.

That said, running the same test from my laptop I get 0% packet loss…

Regards,
Bjorn

Hello Bjorn,

I’d like to test out the recent linux-next kernel that you’re referring to. Does this map to a snapshot, or do I need to compile everything from sources (kernel, initramfs, and rootfs)?

Thanks again,

Rob

It’s been quite a while since we last talked about this WIFI issue. Have there been any new developments?

I’m very eager to assist in testing any potential fixes for this issue. Is there a snapshot that I can test that might provide any significant improvements to the performance of the WCN3620?

Sorry if this is getting annoying, it’s just that I’m dead in the water with this board if I cannot use the embedded WIFI adapter in a reliable fashion.

@Rob_Gries

Now after ~20 minutes of pinging to and from the dragonboard, I have 0% packet loss. I am hesitant to say that this fixes the issue entirely, but it seems to be a lot more stable than it was with the periodic scanning enabled. I encourage others to give this a shot to see if things are more stable for you while using WIFI and network-manager.

I’ve built most of the underlying software from various point(s) - yes, there are some major issues with NetworkManager. I’m using a v1.4.4 build that I forked from upstream.

There’s a sweet spot between hw-scans and an updated version of NM which seems to work without failures for prolonged periods of time. However, if you bypass this and go with a straight up network interface and wpa - it also works great.

Cheers!

Would you mind doing some iperf tests? Check the above posts for the commands.

I’m not sure that my test results would apply directly to linaro’s support scenario, but (externally) I can confirm that NM presents unnecessary problems in versions lower than 1.4.

I have forked a lot of stuff and frozen it for my system… so, my test results wouldn’t be totally applicable to v17+ from linaro.

But I can tell you that the commands on my forked version work between 20-50Mbits/sec. Bandwidth on this chip will never be perfect, in my experience.

Off hand do you remember what was changed other than network manager?

Also what kernel version are you on?

I need to preface my response by saying that getting into the specifics of my system is a long, long thread… and probably less useful for current users…

I’m using:
network-manager 1.4.4 at 3c70a0…
wpasupplicant 2.3-1+deb8u4

I noticed immediate improvements with this combo.

I think the kernel was forked around linaro release 16.04 and selectively patched into a frankinkern thereafter.

I understand linaro has done substantial merging upstream since… however, wcnss has been mostly consistent… but the underlying driver support has changed quite a lot as things have migrated forward on linux side.

I tried the earlier versions of network-manager (1.4.4@3c70a0) and wpa_supplicant (2.3-1+deb8u4) with no measurable success.

I’m running linaro release 17.04 #233 - perhaps there are differences in the driver between 16.04 and 17.04 that make this occur?

Can you post your kernel sources to github by any chance? Or do they not differ from 16.04?

Hello,

So I have now confirmed that the watchdog bite that occurred in the #252 build while running the iperf test no longer occurs when running the latest #260 build.

However, the intermittent packet loss/loss of connectivity issue still exists.

I figured I’d help update this post and see if it was useful to anyone else out there looking for a straw to grasp when it comes to the downright depressing performance of our lovely WCN3620 adapter.

-Rob

Hi Rob,

I just did some testing too, since WiFi is important for me as well.

I can confirm the reboots when doing the iperf tests. However I also tried it with wifi in AP mode. The curious thing is that it works well with 802.11n+wmm enabled (in AP mode), throughput is also great:

root@dragonboard-410c:~# ./iperf -c 192.168.42.102 -u -p 12345 -t 30 -b 100M -i 1
...
[  3] 29.0-30.0 sec  8.93 MBytes  74.9 Mbits/sec
[  3]  0.0-30.0 sec   260 MBytes  72.6 Mbits/sec
[  3] Sent 185323 datagrams
[  3] Server Report:
[  3]  0.0-29.9 sec   195 MBytes  54.7 Mbits/sec   0.497 ms 46262/185322 (25%)

(But unfortunately for me this mode is troublesome for tcp connections somehow…)
Without the 802.11n/wmm the reboot also happens when you try to send faster than the available bandwidth.

I didn’t do any testing in STA mode while connected to a 802.11n wifi network. Maybe that might work for you.

All tests were done using an OE 17.06.1 build in a setup without NetworkManager.

It can’t be working all that well, you’ve got 25% packet loss. You’re dropping a quarter of the total datagrams sent. I’m not sure that is all that great… -_-
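The loss figures quoted in this thread come from the "lost/total" datagram pair at the end of each iperf server report, and they can be recomputed directly. The small helper below is a sketch; `loss_pct` is a name made up for illustration, and the two pairs of numbers are copied from the server reports posted earlier.

```shell
# Datagram loss as a percentage, from the "lost/total" pair in an
# iperf UDP server report.
loss_pct() {
    awk -v lost="$1" -v total="$2" 'BEGIN { printf "%.1f\n", 100 * lost / total }'
}

loss_pct 11291 36701    # STA-mode run on linux-next: prints 30.8
loss_pct 46262 185322   # AP-mode run with 802.11n + WMM: prints 25.0
```

The AP-mode figure also lines up with the byte counts in that report: the client sent 260 MBytes and the server received 195 MBytes, which is 75% of what was sent.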