Hello,
While working with the DragonBoard 410c I ran into an interesting issue with the audio pipeline when using alsaloop.
I moved to Linux kernel 5.1, based on the mainline 5.1 branch. The same issue happens with the Linaro branch (release/db845c/qcomlt-5.1).
The problem is that at some unpredictable moment alsaloop hangs, starts skipping, and never seems to recover. I use this command ( alsaloop -C hw:0,1 -P hw:0,0 -U -S 0 -l 2400 -v -v -v -v -v -v -v -v -v -v -v -v -v -v -v -v ) and after some time the console output stops and I get a message along the lines of “poll took 65431256421us”.
I posted about this before (Kernel 5.1: Audio hanging on Alsaloop Poll), and more work has since been done to find the cause. Those efforts have not been successful, but a lot was learned. It is important to note that this patch is applied (Real Time Audio and DragonBoard 410c). The patch allows the buffer sizes to be adjusted, which is a nice feature, but it does make the issue more prevalent. The issue happens with or without the patch, however. Another strange thing that seems to make the hang occur sooner is running alsaloop through PuTTY connected to the serial terminal of the DragonBoard with the verbosity level at 16.
During testing we used devmem2 to check register values and see what is going on with the hardware.
When alsaloop hangs, it is waiting for the period interrupt, which never arrives. The interesting thing is that the DMACURR register continues to change as though nothing is wrong. Below are captures of the LPASS control registers taken with devmem2 while alsaloop is running and after it has hung. All the registers have values indicating that it should be running, but no interrupt is firing. We have also traced into the IRQ functions in lpass-platform.c and the GIC irq chip and confirmed that they are not executing for lpass-irq-lpaif while it is in this state.
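For anyone who wants to check the missing interrupt without tracing the kernel, one option is to sample /proc/interrupts and see whether the count still increases. This is only a minimal Python sketch. It assumes the interrupt is listed there under the name lpass-irq-lpaif and that the per-CPU counters directly follow the IRQ number, as in the usual layout. irq_count and watch are hypothetical helper names, not part of any existing tool.

```python
import time


def irq_count(interrupts_text, name):
    """Sum the per-CPU counters of every /proc/interrupts line mentioning `name`."""
    total = 0
    for line in interrupts_text.splitlines():
        if name not in line:
            continue
        # Skip the leading "NN:" field, then add counters until the first
        # non-numeric field (the interrupt controller name).
        for field in line.split()[1:]:
            if not field.isdigit():
                break
            total += int(field)
    return total


def watch(name="lpass-irq-lpaif", interval=1.0, samples=5):
    """Print how many interrupts arrived between consecutive samples."""
    previous = None
    for _ in range(samples):
        with open("/proc/interrupts") as f:
            current = irq_count(f.read(), name)
        if previous is not None:
            print(f"{name}: {current - previous} interrupts in {interval:.1f}s")
        previous = current
        time.sleep(interval)
```

Calling watch() on the board while alsaloop is running and again once it has hung should show the delta dropping to zero in the hung state, if the observation above holds.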
Running:
// Capture registers
Read at address 0x07713004 (0xffff81568004) : 0xBEC80000 // LPAIF_DMABASE_REG
Read at address 0x07713008 (0xffff95c3c008) : 0x00000FFF // LPAIF_DMABUFF_REG
Read at address 0x07713010 (0xffff976fe010) : 0x000001FF // LPAIF_DMAPER_REG
Read at address 0x07713000 (0xffffb980b000) : 0x0000083F // LPAIF_DMACTL_REG
Read at address 0x0770E00C (0xffff9af4d00c) : 0x00000000 // LPAIF_IRQCLEAR_REG
Read at address 0x0770E000 (0xffff971a6000) : 0x00038007 // LPAIF_IRQEN_REG
Read at address 0x0771300C (0xffff9450800c) : 0xBEC811A0 // LPAIF_DMACURR_REG
Read at address 0x0770B000 (0xffffb314d000) : 0x00000110 // LPAIF_I2SCTL_REG
// Playback registers
Read at address 0x07710404 (0xffffbda36404) : 0xBEC7C000 // LPAIF_DMABASE_REG
Read at address 0x07710408 (0xffff8eb96408) : 0x00000FFF // LPAIF_DMABUFF_REG
Read at address 0x07710410 (0xffffa5a33410) : 0x000001FF // LPAIF_DMAPER_REG
Read at address 0x07710400 (0xffffb2866400) : 0x0000081F // LPAIF_DMACTL_REG
Read at address 0x0770E00C (0xffffa2dce00c) : 0x00000000 // LPAIF_IRQCLEAR_REG
Read at address 0x0770E000 (0xffff95300000) : 0x00038007 // LPAIF_IRQEN_REG
Read at address 0x0771040C (0xffffbdc6540c) : 0xBEC7F130 // LPAIF_DMACURR_REG
Read at address 0x07709000 (0xffffbf164000) : 0x00004400 // LPAIF_I2SCTL_REG
Hung:
// Capture registers
Read at address 0x07713004 (0xffffb1afe004) : 0xBEC80000 // LPAIF_DMABASE_REG
Read at address 0x07713008 (0xffffa1798008) : 0x00000FFF // LPAIF_DMABUFF_REG
Read at address 0x07713010 (0xffff9283b010) : 0x000001FF // LPAIF_DMAPER_REG
Read at address 0x07713000 (0xffff9b0a7000) : 0x0000083F // LPAIF_DMACTL_REG
Read at address 0x0770E00C (0xffffb8e2e00c) : 0x00000000 // LPAIF_IRQCLEAR_REG
Read at address 0x0770E000 (0xffff8a3bb000) : 0x00038007 // LPAIF_IRQEN_REG
Read at address 0x0771300C (0xffff9a58700c) : 0xBEC83CC0 // LPAIF_DMACURR_REG
Read at address 0x0770B000 (0xffffaa90d000) : 0x00000110 // LPAIF_I2SCTL_REG
// Playback registers
Read at address 0x07710404 (0xffffb403b404) : 0xBEC7C000 // LPAIF_DMABASE_REG
Read at address 0x07710408 (0xffffa6e91408) : 0x00000FFF // LPAIF_DMABUFF_REG
Read at address 0x07710410 (0xffff8a6ae410) : 0x000001FF // LPAIF_DMAPER_REG
Read at address 0x07710400 (0xffff83d48400) : 0x0000081F // LPAIF_DMACTL_REG
Read at address 0x0770E00C (0xffffb09a600c) : 0x00000000 // LPAIF_IRQCLEAR_REG
Read at address 0x0770E000 (0xffff83f41000) : 0x00038007 // LPAIF_IRQEN_REG
Read at address 0x0771040C (0xffff979da40c) : 0xBEC7D950 // LPAIF_DMACURR_REG
Read at address 0x07709000 (0xffffbf164000) : 0x00004400 // LPAIF_I2SCTL_REG
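To make the comparison between the two captures easier to repeat, the dumps can be diffed with a small script. The following is a rough Python sketch that assumes the dump lines have exactly the "Read at address ... : value" shape shown above. parse_dump, changed_registers and dma_offset are hypothetical helper names, and the sample data is just three of the capture lines from each state.

```python
import re

# Matches: Read at address 0x07713004 (0xffff81568004) : 0xBEC80000
LINE_RE = re.compile(
    r"Read at address\s+(0x[0-9A-Fa-f]+)\s+\(0x[0-9A-Fa-f]+\)\s*:\s*(0x[0-9A-Fa-f]+)"
)


def parse_dump(text):
    """Return {physical_address: value} for every dump line found in `text`."""
    regs = {}
    for match in LINE_RE.finditer(text):
        regs[int(match.group(1), 16)] = int(match.group(2), 16)
    return regs


def changed_registers(before, after):
    """Addresses present in both dumps whose values differ."""
    return sorted(a for a in before if a in after and before[a] != after[a])


def dma_offset(base, curr):
    """Byte offset of the DMA position (DMACURR) from the buffer start (DMABASE)."""
    return curr - base


RUNNING = """
Read at address 0x07713004 (0xffff81568004) : 0xBEC80000 // LPAIF_DMABASE_REG
Read at address 0x07713000 (0xffffb980b000) : 0x0000083F // LPAIF_DMACTL_REG
Read at address 0x0771300C (0xffff9450800c) : 0xBEC811A0 // LPAIF_DMACURR_REG
"""
HUNG = """
Read at address 0x07713004 (0xffffb1afe004) : 0xBEC80000 // LPAIF_DMABASE_REG
Read at address 0x07713000 (0xffff9b0a7000) : 0x0000083F // LPAIF_DMACTL_REG
Read at address 0x0771300C (0xffff9a58700c) : 0xBEC83CC0 // LPAIF_DMACURR_REG
"""

running = parse_dump(RUNNING)
hung = parse_dump(HUNG)
for addr in changed_registers(running, hung):
    print(f"0x{addr:08X}: 0x{running[addr]:08X} -> 0x{hung[addr]:08X}")
```

Keying on the physical address rather than the mapped virtual address matters here, because the virtual address in parentheses is different on every devmem2 invocation.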
Since ALSA is in a wait state and doesn’t seem to recognize that there is a problem, we tried jump-starting the hardware by executing the following sequence for each substream. ALSA then comes back to life and runs happily for some time until it hangs in the same way again. We can repeat this jump start each time it hangs, but eventually it comes back to life and quickly errors out with a broken pipe.
lpass_platform_pcmops_trigger with command SNDRV_PCM_TRIGGER_SUSPEND
lpass_platform_pcmops_trigger with command SNDRV_PCM_TRIGGER_RESUME
lpass_cpu_daiops_trigger with command SNDRV_PCM_TRIGGER_SUSPEND
lpass_cpu_daiops_trigger with command SNDRV_PCM_TRIGGER_RESUME
We have also repeated our testing with various kernel versions from 5.3 back to 4.14, and the issue occurs in all of them.
We are currently testing further back, all the way to 4.4.9.
Any input or ideas are welcomed!