WCN36xx - Tput drop

Hello,

I am using the Linaro 18_01 release and I am seeing a throughput drop with a specific AP (tested with a TP-Link). We do not see the throughput issue with other STA devices.

I checked the sniffer captures, where the data rate drops as follows:

  1. Data is retransmitted from the AP at 65 Mbps
  2. RTS is sent from the AP at 24 Mbps
  3. The RTS rate drops to 12 Mbps
  4. The RTS rate drops to 1 Mbps
  5. After this point the data rate drops to 11 Mbps and never picks up again.

If we run wifi_scan during an iperf3 test, we also see the throughput drop below 3 Mbps and never pick up again.

Some investigation:

The STA went to scan without informing the AP that it is going to sleep (the Power Management bit in the frame control field of a data packet should be set to 1, i.e. power save). It could be a sniffer issue, but neither the AP nor the sniffer sees this packet, so it is most probably a station issue.

Since the AP did not get the information that the STA went to sleep, it assumes the STA can receive packets and continues to send them. Eventually, since the station does not respond, the AP sends a Block Ack request to verify the last packets successfully received by the station. It gets no response because the station is still scanning, and it most probably closes aggregation. After that, all packets are sent at lower legacy rates and without aggregation, so throughput stays significantly lower after the station returns from the scan.

The STA does not inform the AP that it is going to sleep before the scan starts by setting the power save bit in a data packet (if the STA has no data to send to the AP, it must send a QoS Null or Null data packet with the power save bit set, i.e. sleep).
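For reference, here is a minimal sketch of where the power save indication sits in a Null or QoS Null frame. The bit values follow the 802.11 Frame Control layout; the macro and function names are made up for this illustration and are not taken from the wcn36xx driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* 802.11 Frame Control bit masks. The values follow the standard layout;
 * the names are local to this sketch, not driver identifiers. */
#define FC_TYPE_DATA        0x0008u /* type = 2 (data) */
#define FC_SUBTYPE_NULL     0x0040u /* subtype = 4, Null function */
#define FC_SUBTYPE_QOS_NULL 0x00C0u /* subtype = 12, QoS Null */
#define FC_TO_DS            0x0100u /* frame goes from STA to AP */
#define FC_PWR_MGT          0x1000u /* Power Management bit */

/* Build the Frame Control value of a Null / QoS Null frame sent STA -> AP. */
uint16_t null_frame_fc(bool qos, bool power_save)
{
    uint16_t fc = FC_TYPE_DATA | FC_TO_DS;

    fc |= qos ? FC_SUBTYPE_QOS_NULL : FC_SUBTYPE_NULL;
    if (power_save)
        fc |= FC_PWR_MGT; /* "STA will go to sleep" */
    return fc;
}

/* True if a captured Frame Control value has the Power Management bit set. */
bool fc_announces_sleep(uint16_t fc)
{
    return (fc & FC_PWR_MGT) != 0;
}
```

A sniffer trace can be checked the same way: if no frame from the STA has this bit set before it leaves the channel, the AP has no reason to buffer traffic.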

[FYI - Also tested with DB410 and release 20_02 as well, same behavior observed]

Do you have any suggestions?

Thanks,
Darshak

Are you using a modified wcn36xx driver? I suspect you have a modified wcn36xx driver using software scanning instead of offload scanning.

@Loic,

Yes, we made the changes below for the WCN3660B:

--- a/arch/arm64/boot/dts/qcom/msm8916.dtsi
+++ b/arch/arm64/boot/dts/qcom/msm8916.dtsi
@@ -1083,7 +1083,7 @@
 			status = "disabled";
 
 			iris {
-				compatible = "qcom,wcn3620";
+				compatible = "qcom,wcn3680";
 
 				clocks = <&rpmcc RPM_SMD_RF_CLK2>;
 				clock-names = "xo";
diff --git a/drivers/net/wireless/ath/wcn36xx/main.c b/drivers/net/wireless/ath/wcn36xx/main.c
index ab5be6d..4b4e233 100644
--- a/drivers/net/wireless/ath/wcn36xx/main.c
+++ b/drivers/net/wireless/ath/wcn36xx/main.c
@@ -652,7 +652,7 @@ static int wcn36xx_hw_scan(struct ieee80211_hw *hw,
 
 	mutex_unlock(&wcn->scan_lock);
 
-	if (!get_feat_caps(wcn->fw_feat_caps, SCAN_OFFLOAD)) {
+	if (get_feat_caps(wcn->fw_feat_caps, SCAN_OFFLOAD)) {
 		/* legacy manual/sw scan */
 		schedule_work(&wcn->scan_work);
 		return 0;

FYI - We have also tested with DB410 and the Linaro 20_02 release and see the same behavior.

This change disables offload scanning and falls back to software scanning, which can cause connection troubles (we had some issues with db410c in the past).
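To make the effect of the inverted condition explicit, here is a small standalone sketch of the decision. It assumes that the code following the `if` block in `wcn36xx_hw_scan` is the offload path, which the posted hunk does not show. The enum and function names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical names for this sketch only. */
enum scan_path { SCAN_PATH_SOFTWARE, SCAN_PATH_OFFLOAD };

/* Original condition: software scan only when the firmware lacks offload. */
enum scan_path pick_scan_path_original(bool fw_has_scan_offload)
{
    if (!fw_has_scan_offload)
        return SCAN_PATH_SOFTWARE;
    return SCAN_PATH_OFFLOAD; /* assumed: the rest of the function offloads */
}

/* Patched condition from the diff above, with the '!' removed. */
enum scan_path pick_scan_path_patched(bool fw_has_scan_offload)
{
    if (fw_has_scan_offload)
        return SCAN_PATH_SOFTWARE;
    return SCAN_PATH_OFFLOAD; /* assumed: the rest of the function offloads */
}
```

Under that assumption, firmware that advertises SCAN_OFFLOAD now always takes the software scan path, and firmware without it would fall through to the offload path.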

At least on my side, the DB410C STA sets the PS flag on all data packets during scanning. Is it only reproducible with your AP?

Yes @Loic. We have tested with multiple APs and see this behaviour with the TP-Link router only, and the same TP-Link AP works fine with other STAs.

We were not able to see 5 GHz APs in our scan results with offload scanning. That is the reason we changed it to software scanning.

I will revert the changes, check the throughput, and update you.

@Loic After re-enabling offload scanning, we still see the same behavior: low throughput.

We also checked with power save off:

iwconfig wlan0 power off

Throughput is good compared to power save ON, but when we run the scan command in between, throughput goes down and never comes back up.

I want to attach/share the log here. How can I? It is a txt or log file only.

@Loic Any update on this case?

Let me know if any specific log is required.

Hi @Loic,

After further debugging we have seen the behaviour below and have a query regarding it.

The DUT sends a Block Ack request with “…1 … = PWR MGT: STA will go to sleep”, and after that the DUT sends a Null function frame with “…1 … = PWR MGT: STA will go to sleep”. Then the throughput suddenly goes down and never recovers.

Here, the AP is sending RTS to the DUT continuously and the DUT is not responding with CTS (it seems the DUT is asleep and therefore not responding; packet 67494).

In the end, the AP ends up sending data to the DUT with the lower bandwidth and never recovers.

Can you please help us to understand this scenario?

Regards,
Darshak

I assume it’s only reproducible while scanning?

Yes that’s the point, the AP tries to communicate with the DUT while it’s in sleep. The AP shouldn’t do that, and from that perspective, it’s an issue on AP side.

Yes, @Loic

We initiate scanning during the iperf test. Now the throughput drop does not happen on every scan.

We send a QoS Null with PS ON before the scan starts:

@@ -649,21 +668,33 @@ static int wcn36xx_hw_scan(struct ieee80211_hw *hw,
        struct wcn36xx *wcn = hw->priv;
        int i;

+       vif->bss_conf.ps = 1;
+       printk("%s == NULL in scan ps\n", __func__);
+       wcn36xx_send_nullfunc(wcn, vif, true);
+       mdelay(50);

We also send a QoS Null with PS OFF after the scan completes:

@@ -708,11 +749,25 @@ static void wcn36xx_sw_scan_complete(struct ieee80211_hw *hw,
 {
        struct wcn36xx *wcn = hw->priv;
+       wcn36xx_dbg(WCN36XX_DBG_SCAN, "%s START\n", __func__);
        /* ensure that any scan session is finished */
        wcn36xx_smd_finish_scan(wcn, HAL_SYS_MODE_SCAN, wcn->sw_scan_vif,
                                wcn->sw_scan_opchannel);
+       vif->bss_conf.ps = 0;
+       printk("%s PS set to off ======\n", __func__);
+       wcn36xx_send_nullfunc(wcn, vif, true);

The device goes into sleep mode while scanning. But after scan completion, data-rate aggregation does not happen between the AP and the STA.

Is there any workaround :wink: for such a buggy AP, such as (I’m not a WiFi expert):

  1. Sending Null packet with power save OFF
  2. Sending response to RTS

Can you please suggest where we can add debug prints to keep checking the Null func / device power status?

Hey @Loic,

Is there any way to completely disable the power save functionality that is initiated during a WiFi scan? I mean keeping the STA always in power save OFF → “STA will stay up”.

I have checked with a OnePlus mobile (STA) connected to the TP-Link AP: throughput never goes down, because the mobile (STA) never sends “STA will go to sleep”. Even when we ran a WiFi scan in the background on the mobile, we saw some throughput drop, but it recovered very quickly, within a few seconds.

I’m afraid it cannot work: you cannot set the PS flag from the driver side, since the firmware seems to always overwrite it. So you will end up sending a non-PS-ON QoS Null data packet, which is counterproductive.

Well, TP-Link’s input would be nice here, to know what the AP expects.

When the controller scans other channels, it cannot listen on the ‘operational’ channel, so we cannot reply to RTS. What would be possible is reducing the listening interval on each channel to get back to the operating channel faster and avoid non-recoverable throughput degradation. Or prevent scanning while the link is busy.
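As a rough illustration of these two ideas, the sketch below estimates how long the STA is away from the operating channel and shows a trivial "defer scan while busy" check. The function names, the channel count, the dwell times, and the byte threshold are all assumptions for the example, not values from the wcn36xx driver or firmware. Channel switch time is ignored.

```c
#include <stdbool.h>

/* Longest continuous time spent away from the operating channel.
 * If the scan returns to the operating channel between foreign channels,
 * the absence is one dwell period. Otherwise it is the whole scan. */
unsigned int longest_absence_ms(unsigned int n_channels,
                                unsigned int dwell_ms,
                                bool return_home_between_channels)
{
    if (n_channels == 0)
        return 0;
    if (return_home_between_channels)
        return dwell_ms;
    return n_channels * dwell_ms;
}

/* Hypothetical policy: postpone a scan while the link carried at least
 * busy_threshold bytes during the last second. */
bool should_defer_scan(unsigned long bytes_last_second,
                       unsigned long busy_threshold)
{
    return bytes_last_second >= busy_threshold;
}
```

With these assumed numbers, 13 channels at 100 ms each keep the STA off-channel for 1300 ms in one stretch, while a 30 ms dwell cuts that to 390 ms, and returning home between channels cuts it to a single dwell period.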

We can certainly do that, but I don’t think it will help in our case. I would be interested in a sniffer capture of the OnePlus scanning.

Maybe you can try this patch meanwhile: loic.poulain/linux.git - [no description]

If it’s a question of “just make it work”, have you checked whether there is OpenWrt for that TP-Link? Most TP-Links I’ve seen are supported by OpenWrt.

Alternatively, you may be able to install the buggy firmware into openwrt and debug from that side.

After applying this patch, the throughput issue does not reproduce as quickly, but once the throughput goes down it never recovers. We see the following error after applying the patch:

root@linaro-developer:~# [ 77.994720] wcn36xx: ERROR hal_init_scan response failed err=29
[ 78.048663] wcn36xx: ERROR hal_init_scan response failed err=29

I tried to flash the OpenWrt firmware on the AP, but it is not getting updated. I followed the steps from:
https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=bf39d5594b3c8f9409e6d9408a1f370c9f18d0dd

I’m using an archer c7 both here at the office, as well as at home. Easy stuff, just go into the firmware update in their web UI and select the “factory” file. It should do the rest. You might need to rename the file however. If I recall correctly, it might be too long or have characters that make it choke. It does work though.

According to the actual instructions page, looks like the v5 may need to have the factory firmware updated to a newer version before it works:

https://openwrt.org/toh/tp-link/archer-c7-1750