How to make I2S0(on LF connector) a PCM slave?

danielt · August 15, 2018, 11:40am

This verges on off topic, but are you sure about these sign extensions?

Simple zero extension will map the values around zero (16-bit on left, 24-bit on the right) as follows:

2 → 0x200 (512)
1 → 0x100 (256)
0 → 0x000 (0)
-1 → 0xffff00 (-256)
-2 → 0xfffe00 (-512)

This looks like a correct mapping to me: the difference between neighbouring samples will remain constant before and after conversion. Won’t padding with 0xff cause distortion every time we cross the zero point or did I make a mistake in the 24-bit to decimal conversion?
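A minimal C sketch of the two mappings being compared, for anyone who wants to check the arithmetic. It assumes the 24-bit value is carried in an int32_t, and it assumes the "0xff padding" in question fills the new low byte with 0xff for negative samples only. The function names are illustrative and do not come from the HAL.

```c
#include <stdint.h>

/* Widen a signed 16-bit sample to 24 bits by appending a zero low byte.
 * Multiplying by 256 has the same effect as a left shift by 8, but is
 * well defined for negative values. */
static int32_t s16_to_s24_zero_pad(int16_t s)
{
    return (int32_t)s * 256;
}

/* Assumed alternative: low byte 0xff for negative samples, 0x00 otherwise. */
static int32_t s16_to_s24_sign_fill(int16_t s)
{
    int32_t v = (int32_t)s * 256;
    return s < 0 ? v + 0xff : v;
}

/* 24-bit two's-complement bit pattern, for comparison with the hex column. */
static uint32_t s24_bits(int32_t v)
{
    return (uint32_t)v & 0xffffffu;
}
```

With zero padding, every step of 1 in the 16-bit domain becomes a step of 256. With the sign-dependent fill, the step from -1 to 0 shrinks to 1 while all other steps stay at 256, which is the discontinuity at the zero crossing.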

doitright · August 15, 2018, 2:38pm

Hmm, yep. You seem to be right.
Did I mention that I hate sound?

danielt · August 16, 2018, 10:31am

I can certainly imagine you’ve had enough of it for a while!

In the past I’ve been “the audio guy” in various projects. Even now I
have a slight nervous twitch when I run some programs. It owes much to
the “rapid headphone removal reflex” that develops among audio engineers
who don’t always get things right first time!

dey_arnab · August 17, 2018, 6:59am

Dear Mr. doitright, danielt,

Interesting conversation…

Back to the problem: I think it might lie somewhere else in my hardware or driver compatibility, as loopback on the USB headset did not work. I always get a jittering sound that bears no resemblance to the original. Where else could the problem be? Frame block size? Endianness? I do not think there is any dependency on the I2S sound DAI, as I am using I2S0. Please guide me.

By the way, I am trying to get the Vantec USB audio adapter also.

Thanking you,

Regards,
Arnab Dey

dey_arnab · August 17, 2018, 12:50pm

Loopback is now working.

I changed the pcm_write() and pcm_read() byte sizes to the full buffer size (i.e. period_size * period_count * number_of_channels * sample_size_in_bytes). Getting the available buffer space from pcm_get_htimestamp() was somehow causing the problem. I will explore the usage of that function further once I get HFP audio working.
The next step is to check BT HFP audio.

Thanking you,

Regards,
Arnab Dey

doitright · August 17, 2018, 1:00pm

Did you say that you have a logic analyzer? What does the data actually look like on the wires? The best advice I was given when I was working out the i2s audio was to look at the actual data on the wires. Of course, I was working with the built-in bluetooth chip, so hooking up to the i2s pins was extremely inconvenient. You’ve got it easy.

Now if you’re doing complete buffer read/writes instead of partial, you’re going to have to keep in mind that once you are passing data between two devices with separate clock sources, you are going to have to do something to compensate for clock drift.

dey_arnab · August 17, 2018, 2:53pm

Dear Mr. doitright,

I have already verified that BT PCM is working fine both by tapping I2S0 lines and tinycap on BT PCM card. Therefore, I think the problem is somewhere in the hal.

With my buffer-related change, I am again getting buffer overruns on USB after disabling loopback.

Is something causing a huge delay between two successive pcm_read() calls on USB? Please help.

Thanking you,

Regards,
Arnab Dey

doitright · August 17, 2018, 3:45pm

Of course you’re getting buffer problems. If you aren’t doing anything to compensate for clock drift, one of your audio devices will run faster than the other (a simple matter of physics: no two crystals are exactly the same, so there will always be a difference between the clocks), and the faster one’s buffer will overrun while it waits for the slower one to empty its buffer enough to allow you to write your data. This is why you need to check how much room is available in the buffer you are writing to, and ONLY WRITE THAT MUCH (discard excess), and check how much data is available to read, and ONLY READ THAT MUCH (pad the difference).

The bluetooth device READ is the only one you will WAIT on. The USB device write must be immediate, the USB device read must be immediate, and no matter what, you can write the same number of bytes back to the bluetooth as you previously read (i.e., in the time it takes to supply X bytes to one buffer, it will consume X bytes from the other buffer).

Note that there are smarter ways of compensating for clock drift than brutally dropping data and padding it with probably zeros, in particular, resampling the block. In my own observations, however, a phone call isn’t hurt too badly with the brutal approach.
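As a rough illustration of the brutal approach, here is a minimal C sketch of the two rules "only write what fits" and "only read what is there, then pad". The helper names are hypothetical, and how the available frame counts are obtained is left to the caller (the thread mentions pcm_get_htimestamp() for that).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Frames to hand to the non-blocking device: never more than it has
 * room for. Whatever does not fit is simply dropped by the caller. */
static size_t frames_to_write(size_t block_frames, size_t avail_frames)
{
    return block_frames < avail_frames ? block_frames : avail_frames;
}

/* After reading got_frames into buf, zero-fill the remainder so the
 * caller always ends up with a full block. buf must hold block_frames
 * frames. Returns the number of frames that were padded. */
static size_t pad_block_with_silence(uint8_t *buf, size_t got_frames,
                                     size_t block_frames, size_t frame_bytes)
{
    if (got_frames >= block_frames)
        return 0;
    memset(buf + got_frames * frame_bytes, 0,
           (block_frames - got_frames) * frame_bytes);
    return block_frames - got_frames;
}
```

The dropped or padded frames are where the clock drift gets absorbed; a resampler would spread the same correction over the whole block instead.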

It probably also wouldn’t hurt for you to check what is going on with respect to this part of the code;
car96audio/audio_hal.c · master_apr2018_car96 · HiKey960-Car / android_device_linaro_hikey · GitLab through to line 1542. If it’s running the rejig function on your data when it shouldn’t be (the code was written to deal with the built-in bluetooth chip, and a problem with the order the data sometimes comes in), then that would certainly make a mess of your audio.

Something that I did for debugging is to write out 4 wave files during a phone call:

16k stereo read from the bluetooth,
48k stereo written to the usb,
48k stereo read from the usb,
16k stereo written to the bluetooth.

Open these files in a text editor to make sure that they (a) have data, and (b) the data makes sense, then open in audacity to actually listen to them. This can help you to track down the point where the data gets messed up and hopefully why. This example can help you writing the data to files;

github.com

tinyalsa/tinyalsa/blob/master/examples/pcm-writei.c

#include <stdio.h>
#include <stdlib.h>

#include <tinyalsa/pcm.h>

static long int file_size(FILE * file)
{
    if (fseek(file, 0, SEEK_END) < 0) {
        return -1;
    }
    long int file_size = ftell(file);
    if (fseek(file, 0, SEEK_SET) < 0) {
        return -1;
    }
    return file_size;
}

static size_t read_file(void ** frames){

    FILE * input_file = fopen("audio.raw", "rb");

This file has been truncated.
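For the four debug files suggested above, Audacity can open the data directly if it is wrapped in a WAV header. Below is a minimal sketch of a helper that writes a canonical 44-byte PCM header followed by the samples. It assumes the captured block is already in memory; the function name is illustrative. During a live call one would more likely append raw data and fix up the header sizes when the call ends.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Store values little-endian, independent of host byte order. */
static void put_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)(v >> 8);
}

static void put_le32(uint8_t *p, uint32_t v)
{
    put_le16(p, (uint16_t)(v & 0xffff));
    put_le16(p + 2, (uint16_t)(v >> 16));
}

/* Write interleaved PCM data as a WAV file. Returns 0 on success. */
static int write_wav(const char *path, const void *pcm, uint32_t data_bytes,
                     uint32_t rate, uint16_t channels, uint16_t bits)
{
    uint8_t h[44];
    uint16_t block_align = (uint16_t)(channels * (bits / 8));
    FILE *f = fopen(path, "wb");
    int ok;

    if (!f)
        return -1;
    memcpy(h, "RIFF", 4);
    put_le32(h + 4, 36 + data_bytes);     /* bytes following this field */
    memcpy(h + 8, "WAVEfmt ", 8);
    put_le32(h + 16, 16);                 /* size of the fmt chunk */
    put_le16(h + 20, 1);                  /* format tag 1 = integer PCM */
    put_le16(h + 22, channels);
    put_le32(h + 24, rate);
    put_le32(h + 28, rate * block_align); /* bytes per second */
    put_le16(h + 32, block_align);        /* bytes per frame */
    put_le16(h + 34, bits);
    memcpy(h + 36, "data", 4);
    put_le32(h + 40, data_bytes);

    ok = fwrite(h, 1, sizeof h, f) == sizeof h &&
         fwrite(pcm, 1, data_bytes, f) == data_bytes;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

For example, the 16k stereo 16-bit bluetooth capture would be written with rate 16000, channels 2 and bits 16, and the 48k USB side with rate 48000 and whichever sample width that buffer actually holds.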

dey_arnab · August 20, 2018, 5:23pm

Dear Mr. doitright,

Thank you very much for your detailed answer.
I managed to get audio out of my USB headset. These are the changes I had to make in your driver code to make it compatible with my setup:

1. Disabled the broken SCO detection part.
2. Disabled the audio processing related code.
3. In stereo_to_mono(), discard stereo[2*i] instead of discarding stereo[2*i+1].
4. Make the block length for USB read/write >= 3 x 1.5 x the block length for BT SCO read/write, but less than the full buffer size (factor 3 → USB@48k, BT@16k; factor 1.5 → USB@24 bits, BT@16 bits). I am taking blocks of 800 frames for USB and 160 frames for BT SCO.
5. For S16_LE to S24_3LE conversion:

for (i=0; i<length; i++){
    s24_3le[3*i] = 0x00;
    s24_3le[(3*i)+1] = s16_le[i] & 0x00ff;
    s24_3le[(3*i)+2] = (s16_le[i]>>8) & 0x00ff;
}

6. Dummy ‘pcm_read(USB)’ before entering the while loop. (This is a bad workaround I had to add, as I found that sometimes, before answering and after hanging up the call, buffer overrun errors were being thrown. I will explore the real cause later.)
7. Taking f_write_avail as the frames available for the application to write, rather than taking (4096-f_write_avail). Same for f_read_avail.
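For reference, a self-contained sketch of the S16_LE to S24_3LE packing and its inverse. It is not the code from the HAL; the names are illustrative, and n counts individual samples, so a stereo block of F frames needs n = 2 * F.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack n signed 16-bit samples as 24-bit little-endian, 3 bytes each.
 * out must hold 3 * n bytes. */
static void s16_to_s24_3le(const int16_t *in, uint8_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* Work on the raw bit pattern to avoid shifting a negative value. */
        uint16_t u = (uint16_t)in[i];
        out[3 * i]     = 0x00;                 /* new least significant byte */
        out[3 * i + 1] = (uint8_t)(u & 0xff);  /* old low byte */
        out[3 * i + 2] = (uint8_t)(u >> 8);    /* old high byte, carries sign */
    }
}

/* Inverse: drop the least significant byte of each 24-bit sample. */
static void s24_3le_to_s16(const uint8_t *in, int16_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint16_t u = (uint16_t)(in[3 * i + 1] | (in[3 * i + 2] << 8));
        out[i] = (int16_t)u;  /* two's-complement reinterpretation */
    }
}
```

Every output index advances by exactly 3 per input sample. Any extra byte or stride mismatch between samples stretches the signal in time and lowers its pitch on playback.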

I am getting sound out of the USB headset, but the pitch of the sound is somehow completely changed, i.e. a female voice sounds like a male voice. No idea why.

Now I should try processing the audio properly I think as you have done with webrtc.

Thanking you,

Regards,
Arnab Dey

doitright · August 20, 2018, 7:00pm

Was your buffer overrun problem actually causing a failure? Or was it just unsightly in the log? I also notice that you mention a problem “before answering”. In actual fact, the SCO stream is started as soon as the call is signalled, and long before the call is actually answered. This is for a feature called “in band ring tone”, which means that it will actually play back your PHONE’s ringtone through the car speakers, rather than the radio play back its own ringtone.

For point 3… is your microphone connected to the RIGHT channel? By convention, the left channel comes before the right channel, and my code discards the second channel, which is right, and retains the LEFT channel. Also by convention, mono corresponds to LEFT. Very strange.

and the 24-to-16 conversion immediately after this line;

gitlab.com

HiKey960-Car/android_device_linaro_hikey/blob/master_apr2018_car96/car96audio/audio_hal.c#L1565

    
      
        }
        if (!swapendian && sumofdiff_le > sumofdiff_be){
            swapendian = true;
            temp_b = 0x00;
            ALOGD("%s: Detected broken SCO INPUT, correcting", __func__);
        } else if (swapendian && sumofdiff_le <= sumofdiff_be){
            swapendian = false; // swap back in case the first few samples were just acting weird.
            ALOGD("%s: Possible misdetection in brokenness calculation, reverting", __func__);
        }
        loopcounter++;
    }

    if (swapendian){
        temp_b = rejig((uint8_t*)framebuf_far_mono, (uint8_t*)framebuf_far_mono, block_len_bytes_far_mono, temp_b);
    }

    // AudioProcessing: Analyze reverse stream
    audioframe_setdata(frame, framebuf_far_mono, frames_per_block_far);
    audioproc_aec_echo_ref(apm, frame);

    memset(framebuf_near_mono, 0, block_len_bytes_near_mono);

Changes in the pitch suggest that the playback speed differs from the capture speed. Lower frequency output (female voice sounds male) would be caused by a slower playback. Female to male frequency change would be approximately equal to reducing the playback speed by 1/3, which interestingly, corresponds to the change between 16 and 24 bits.

I’m a bit confused by your point 4. The block length I’ve chosen is specifically set to 10 ms, since this is a mandatory input specification for WebRTC’s audio processing library. For stereo 16 kHz sample rate at 16 bits per sample (2 bytes), this will yield a block length of 16000 / 100 = 160 frames per block, 2 * 2 * 160 = 640 bytes per block. For your 48 kHz 24 bit device, that would be 48000 / 100 = 480 frames per block, 480 * 3 * 2 = 2880 bytes per block.

The simplest implementation would require only that you add one additional buffer, which is the 2880 bytes for stereo 48 kHz 24 bit. After you have filled the stereo 48 kHz 16 bit buffer, you would then translate that into the 48 kHz 24 bit buffer, and then write it. Then you would read into the same buffer you just wrote from, and immediately translate that into the 48 kHz 16 bit buffer, and carry on as you would before. Just don’t write any more bytes than the space available, and don’t read any more bytes than what is in the buffer waiting to be read, and the entire change really should be this simple.
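The block-size arithmetic can be written down as a small helper, shown here as a sketch with illustrative names. A 10 ms block is rate / 100 frames, and each frame is channels times bytes per sample.

```c
#include <stddef.h>

/* Frames in one block of block_ms milliseconds at rate_hz. */
static size_t frames_per_block(unsigned rate_hz, unsigned block_ms)
{
    return (size_t)rate_hz * block_ms / 1000;
}

/* Bytes in one block of interleaved PCM. */
static size_t bytes_per_block(unsigned rate_hz, unsigned channels,
                              unsigned bytes_per_sample, unsigned block_ms)
{
    return frames_per_block(rate_hz, block_ms) * channels * bytes_per_sample;
}
```

For 10 ms this gives 160 frames and 640 bytes for stereo 16 kHz 16 bit, 480 frames and 1920 bytes for stereo 48 kHz 16 bit, and 480 frames and 2880 bytes for stereo 48 kHz 24 bit packed in 3 bytes.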

dey_arnab · August 20, 2018, 8:19pm

Dear Mr. doitright,

Yes. It was causing my USB headset, adb and the entire system to stall, and the sound stream was getting stopped. After some time, the USB device was no longer recognized and I had to unplug and replug it each time.

I am not using the in-band ringtone feature. iPhones have this feature implemented and enabled by default, but Android included it only recently (Android 8.0.0), and there is a setting from which in-band ringing can be disabled. I have disabled in-band ringtone. Therefore, the SCO channel should be opened once the call is active, not in the alerting phase.
Instead of the dummy USB read, I was wondering whether the sequence of pcm_open() calls matters. Does it make any difference if I call pcm_open(USB PCM IN) and pcm_open(USB PCM OUT) inside the while loop, just before pcm_read() and pcm_write() respectively, inside some if(!isPcmOpen_flag)? Currently in the code the sequence is - pcm_open(both USB IN+OUT)->pcm_open(both BT IN+OUT) → while(pcm_read(BT)==0). I think the need for a dummy read indicates some delay before the first USB pcm_read.

True. I am not sure what is causing this. I will verify again from my logic analyzer capture.

Thank you for pointing this out. I am not sure why it threw a buffer overrun when I was using a 10 ms block for both. Now, as you said regarding pitch, my block sizes could also cause this, right? The BT block is 160 frames = 10 ms but the USB block is 800 frames ≈ 17 ms, i.e. 10 ms of voice gets mapped to 17 ms? Actually, to avoid buffer overrun, I wanted to read more from USB than required. I don’t know if that is allowed.

Yes. My implementation is exactly like this. I have additional buffer stereo_s24[] for this purpose along with 48k,16bit buffer stereo_s16[]

Thanking you,

Regards,
Arnab Dey

doitright · August 20, 2018, 9:48pm

I still am thinking that your issues may come down to clock drift between the two audio cards.

Maybe what you want to do is try this;

1. Flush all of the buffers immediately before going into the loop.
2. For each iteration of the loop, check the state of the usb device buffers.
   a) If the usb write buffer starts filling, reduce the size of your writes by a whole number of frames (i.e., some multiple of 6 bytes, since each frame is 6 bytes).
   b) If the write buffer is emptying, pad the data up by duplicating the last frame.
   c) Always read the exact amount that you wrote and the buffers should remain in sync.
3. Regardless of what amount you actually read from the usb, pad or trim it to precisely match the block size.
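A rough sketch of steps a) and b) above: one helper picks the write size from the device's fill level, and another trims a block or pads it by repeating the last frame. The names and the low/high water marks are assumptions made for illustration; suitable thresholds would have to be found by experiment.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Choose this iteration's write size in frames. queued is how many
 * frames are still waiting in the device's write buffer. */
static size_t pick_write_frames(size_t nominal, size_t queued,
                                size_t low_water, size_t high_water)
{
    if (queued > high_water && nominal > 0)
        return nominal - 1;   /* buffer filling: write one frame less */
    if (queued < low_water)
        return nominal + 1;   /* buffer draining: write one frame more */
    return nominal;
}

/* Resize a block in place. Shrinking drops trailing frames; growing
 * repeats the last frame (or writes silence if the block was empty).
 * buf must have room for target frames. Returns target. */
static size_t resize_block(uint8_t *buf, size_t frames, size_t target,
                           size_t frame_bytes)
{
    if (frames == 0) {
        memset(buf, 0, target * frame_bytes);
        return target;
    }
    for (size_t f = frames; f < target; f++)
        memcpy(buf + f * frame_bytes,
               buf + (frames - 1) * frame_bytes, frame_bytes);
    return target;
}
```

With 6-byte frames (stereo, 24 bit), a nominal 480-frame block would become 479 or 481 frames as needed, and the same resize_block() call can bring whatever was read back to exactly 480 frames.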

Are you absolutely certain that you have the i2s0 clock running at the expected rate when running these tests? And that your Bluetooth codec is configured correctly to match this rate?

dey_arnab · August 21, 2018, 9:49am

Dear Mr. doitright,

Thank you very much for your insightful answer. I am trying this and will update.

Yes. I have verified that before.

On another note, you mentioned that you used the VANTEC NBA-200U (http://www.vantecusa.com/products_detail.php?p_id=155#tab-2). Is it a completely plug and play device? I do not need any other external hardware to work with it, right?

Thanking you,

Regards,
Arnab Dey

doitright · August 21, 2018, 10:12am

There is alsa support for it, if that’s what you mean.

dey_arnab · August 21, 2018, 10:26am

Dear Mr. doitright,

Alright, that means, once plugged in to Hikey960 board, it will be detected and initialized automatically. That is good.

Thanking you,

Regards,
Arnab Dey

dey_arnab · August 21, 2018, 12:47pm

Dear Mr. doitright,

Finally I could get proper audio. My S16 to S24 conversion was getting padded with extra zeros in between two consecutive samples due to a silly bug in my code.
That is why it was interpreted as lower frequency audio, which caused the pitch change.

I have reverted to a 10 ms block size for both BT and USB and, as you suggested, kept a minimum available buffer restriction of 10 frames to read/write proper data.

Thank you very much for your help. Now I should move to the real issue of making I2S0 slave or at least to make it behave like slave. My goal is to do dynamic clock switching between 16k/8k when the call is ongoing. Had hikey been slave, my BT controller would have taken care of it. Now, I need some mechanism where hikey should be able to change clock not only when sco is established but also whenever user needs. I will try sending something like ‘hfp_set_sampling_rate’ as per user need.

Thanking you,

Regards,
Arnab Dey

doitright · August 21, 2018, 6:12pm

I’m not sure what you mean by “behave like slave”. The ONLY distinction between master and slave, is in which of the multiple devices is driving the clock lines, which means that a device can either be a slave, or the master. There are no other possibilities.

This also has absolutely no bearing on whether or not you can perform “dynamic clock switching” – which I’ve never even heard of. Are you suggesting that you want to change the sample rate after a call is already established? Does HFP even support this? Why would you want to do this? HFP negotiates a common sample rate while the call is being set up, the HFP client will offer a list of available sample rates that it supports to the phone, and the phone will respond to that with a selection (which should be the highest of the offered rates that it is capable of), following which, the audio devices will be opened with that rate.

Once the highest common sample rate is selected, there would be no advantage to changing the sample rate, but assuming that HFP can even do that, the process would require that the audio devices be closed, and then re-opened with the new sample rate.

Now here is the thing, from your point of view, the only thing that might have to be changed (assuming that you haven’t implemented this already), is that you will have to send the new sample rate to your bluetooth chip at the same time as the alsa device is opened. In my case, it is implemented here;

gitlab.com

HiKey960-Car/android_kernel_linaro_hikey/blob/android-hikey-linaro-4.9/sound/soc/codecs/wilink8.c#L31

    
      
static const struct snd_soc_dapm_widget bt_sco_widgets[] = {
    SND_SOC_DAPM_INPUT("RX"),
    SND_SOC_DAPM_OUTPUT("TX"),
};

static const struct snd_soc_dapm_route bt_sco_routes[] = {
    { "Capture", NULL, "RX" },
    { "TX", NULL, "Playback" },
};

static int bt_sco_hw_params(struct snd_pcm_substream *substream,
                            struct snd_pcm_hw_params *params,
                            struct snd_soc_dai *cpu_dai)
{
    u8 pcm8k[34] = {0x00, 0x02, 0x01, 0x40, 0x1f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x10, 0x00, 0x01, 0x00, 0x01, 0x10, 0x00, 0x01, 0x00, 0x00, 0x00, 0x10, 0x00, 0x21, 0x00, 0x01, 0x10, 0x00, 0x21, 0x00, 0x00, 0x00};
    u8 pcm16k[34] = {0x00, 0x04, 0x01, 0x80, 0x3e, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x10, 0x00, 0x01, 0x00, 0x01, 0x10, 0x00, 0x01, 0x00, 0x00, 0x00, 0x10, 0x00, 0x21, 0x00, 0x01, 0x10, 0x00, 0x21, 0x00, 0x00, 0x00};
    u8 *pcm;
    struct hci_dev *hdev;

    switch (params_rate(params)) {
    case 8000:

IF the bluetooth stack is capable of renegotiating sample rates during an HFP call, then the process where it would be implemented would involve these steps;

1. One device or the other would send a request to the other for a new sample rate.
2. The other device would either accept or refuse the change.
3. If the change was accepted, then…
   a) CLOSE the audio (send parameter “hfp_enable=false” to audio HAL),
   b) Set new rate (send parameter “hfp_set_sampling_rate=X000” to audio HAL),
   c) OPEN the audio (send parameter “hfp_enable=true” to audio HAL).
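The close / set rate / open sequence can be sketched as follows. The parameter strings are the ones named above. Everything else is hypothetical: set_param_fn stands in for whatever mechanism actually delivers key=value strings to the audio HAL, log_param merely records what would be sent, and the restriction to 8000 and 16000 Hz is an assumption based on the rates discussed in this thread.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical transport: delivers one "key=value" string to the HAL. */
typedef int (*set_param_fn)(const char *kv, void *ctx);

/* Close the HFP audio, set the new rate, then reopen. Returns 0 on success. */
static int hfp_switch_rate(set_param_fn set_param, void *ctx, unsigned rate_hz)
{
    char kv[48];

    if (rate_hz != 8000 && rate_hz != 16000)   /* assumed supported rates */
        return -1;
    if (set_param("hfp_enable=false", ctx) != 0)
        return -1;
    snprintf(kv, sizeof kv, "hfp_set_sampling_rate=%u", rate_hz);
    if (set_param(kv, ctx) != 0)
        return -1;
    return set_param("hfp_enable=true", ctx);
}

/* Stand-in transport that appends each parameter to a text log. */
struct param_log {
    char text[128];
    size_t len;
};

static int log_param(const char *kv, void *ctx)
{
    struct param_log *log = ctx;
    size_t room = sizeof log->text - log->len;
    int n = snprintf(log->text + log->len, room, "%s;", kv);

    if (n < 0 || (size_t)n >= room)
        return -1;
    log->len += (size_t)n;
    return 0;
}
```

Calling hfp_switch_rate(log_param, &log, 16000) on an empty log records the three parameters in order. The point of the ordering is that the rate is only ever changed while the audio is closed.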

Now the way you are talking about sending the sample rate parameter suggests to me that you are thinking of just randomly sending that parameter whenever you feel like it, and having the audio streams adapt to it somehow. THAT would NOT be possible, for a number of reasons;

The sample rate can only be set when OPENING the audio device.
The sample rate that is negotiated during the call setup phase corresponds to the actual audio rate that the bluetooth devices will be sending OVER THE AIR.
The i2s audio rate MUST match the over the air audio rate.

Also extremely important. When you say “Had hikey been slave, my BT controller would have taken care of it.” – that is NOT correct. Even if the hikey is slave, you STILL have to close the alsa device and re-open it with the new sample rate.

dey_arnab · August 23, 2018, 6:32pm

Dear Mr. doitright,

What I meant was that the hikey would not be involved in changing the clock dividers. It would only close and reopen the PCM in/out.

Dynamic clock switching is applicable to Bluetooth multi-connection with dual simultaneous SCO. This is an upcoming feature where, say, two different phones are connected to the same car kit. Suppose an NBS (8k) call is going on on phone1 while an incoming WBS (16k) call arrives on phone2 with in-band ringtone (the 16k SCO will be opened before the call is answered). Now consider that the user just wants to say “Hey, will call you after 5 min” to the phone2 caller without actually disconnecting the phone1 call. It is like putting the phone1 call on hold. (It is different from a second incoming call on phone1.) As the first opened SCO was using the 8k clock, you need dynamic clock switching to 16k for phone2 and then a switch back to 8k. My implementation of this is already working on a setup where my BT controller is master. Now I want to have the same functionality on the Hikey…

I am trying to do something similar to this. Will update. I am now stuck on a MIC issue.

Actually simultaneous dual SCO is about having two different SCO channels running in parallel. Now from UI, whenever user wants to switch from one phone to another(say touching a button ‘switch call’) internally PCM clock should change to match with the opened SCO. It should match only with the active sco(i.e. active call = which is actually routed to speaker/headphone).

Currently android stack does not handle dual sco in case of outgoing phone1+incoming phone2 with in-band. But in case of outgoing phone1+outgoing phone2, BT stack just disconnects the previous one explicitly and connects to the 2nd. With dual SCO, these scenarios can be customized.

Thanking you,

Regards,
Arnab Dey

doitright · August 23, 2018, 7:52pm

You misunderstand what I mean by “THAT would NOT be possible”. “THAT” being to change the alsa parameter without closing and reopening the alsa stream.

In any case, the distinction of which device is master and which is slave is completely irrelevant, since alsa will take care of it in either case.

https://www.kernel.org/doc/html/v4.17/sound/kernel-api/writing-an-alsa-driver.html