[SPI] Can sg_alloc_table get correct pages?

Hi,
I have one question about SPI dma mode.
If the receiving buffer is NOT multiple of PAGE_SIZE (4KB), can sg_alloc_table get correct pages?
In the configure_dma() function, there is below description.
(driver/spi/spi-pl022.c)

    /* Create sglists for the transfers */
    pages = DIV_ROUND_UP(pl022->cur_transfer->len, PAGE_SIZE);

This decides the number of pages corresponding to the size of transferred data.
Then sg_alloc_table is called with pages shown above.
(driver/spi/spi-pl022.c)

  ret = sg_alloc_table(&pl022->sgt_rx, pages, GFP_ATOMIC);
  ret = sg_alloc_table(&pl022->sgt_tx, pages, GFP_ATOMIC);

But, in setup_dma_scatter() which is called after sg_alloc_table(),
the offset between the receiving buffer and PAGE boundary is considered.
(driver/spi/spi-pl022.c)

    if (buffer) {
            for_each_sg(sgtab->sgl, sg, sgtab->nents, i) {
                    /*
                     * If there are less bytes left than what fits
                     * in the current page (plus page alignment offset)
                     * we just feed in this, else we stuff in as much
                     * as we can.
                     */
                    if (bytesleft < (PAGE_SIZE - offset_in_page(bufp)))
                            mapbytes = bytesleft;
                    else
                            mapbytes = PAGE_SIZE - offset_in_page(bufp);
                    sg_set_page(sg, virt_to_page(bufp),
                                mapbytes, offset_in_page(bufp));

As a result, there is the case that BUG_ON is executed because byteleft is NOT equal to zero.
For example,
PAGE_SIZE: 4096 byte
data size: 4096+2048 byte
offset: 4000 byte (This is calculated by offset_in_page() function.)
in this case,
pages= DIV_ROUND_UP(4096+2048, 4096)=2
mapped bytes at page 1: 4096-4000 = 96
mapped bytes at page 2: 4096
remaining byte: (4096+2048)-(96+4096) = 1952 !!
so byteleft is equal to 1952, BUG_ON is executed, and the hardware is restarted.

Is this true?

My environment is shown below.
-Hikey board(LeMaker)
-SPI0@f7106000
-android 6.0 published in AOSP (android-hikey-linaro-4.1)
(I referred to https://source.android.com/source/devices.html)

is this something that you can reproduce? offset can’t be 4000 in the first page (ie, you can’t leave a hole while doing a DMA transfer).

On top of that, note that you are executing a loop (for_each_sg) for all the pages mapped (2 pages in the case you pointed to).

Thank you for your reply.

Above things actually occurred in my environment.
Now we develop the kernel loadable module that is SPI slave driver.
Hisilicon’s SPI is master and the SPI slave device is connected to Hisilicon.

At first, my driver gets the buffer(2,069,504 bytes) by kmalloc with GFP_KERNEL
and this is used as the ring buffer to receive.
My driver checks periodically whether or not data are prepared in the SPI slave device.
If data are prepared in the SPI slave device, my driver will get data from the SPI slave device.
To get the data, my driver call spi_sync() function with spi_transfer structure
whose *rx_buf member indicates somewhere in the ring buffer.
Therefore the offset between the ring buffer pointer and PAGE boundary is variable.

I show the error case below.
Length of data: 64672 bytes
Page size: 4096 bytes
So, the number of pages was calculated as 16 (=DIV_ROUND_UP(64672, 4096);
However the left of the 1st page was 864 bytes, so 864bytes + 4,096bytes * 15pages = 62,304bytes < 64,672 bytes.
Then 2,368 bytes were remaining, so BUG_ON was executed.

7,1357,128437017,-;ssp-pl022 f7106000.spi: [RX]16 pages(len:64672)
SUBSYSTEM=amba
DEVICE=+amba:f7106000.spi
7,1358,128437075,-;ssp-pl022 f7106000.spi: set RX/TX target page @ ffffffc02ee10000, 864 bytes, 63808 left
SUBSYSTEM=amba
DEVICE=+amba:f7106000.spi
7,1359,128437118,-;ssp-pl022 f7106000.spi: set RX/TX target page @ ffffffc02ee11000, 4096 bytes, 59712 left
SUBSYSTEM=amba
DEVICE=+amba:f7106000.spi
7,1360,128437160,-;ssp-pl022 f7106000.spi: set RX/TX target page @ ffffffc02ee12000, 4096 bytes, 55616 left
SUBSYSTEM=amba
DEVICE=+amba:f7106000.spi
7,1361,128437200,-;ssp-pl022 f7106000.spi: set RX/TX target page @ ffffffc02ee13000, 4096 bytes, 51520 left
SUBSYSTEM=amba
DEVICE=+amba:f7106000.spi
7,1362,128437241,-;ssp-pl022 f7106000.spi: set RX/TX target page @ ffffffc02ee14000, 4096 bytes, 47424 left
SUBSYSTEM=amba
DEVICE=+amba:f7106000.spi
7,1363,128437281,-;ssp-pl022 f7106000.spi: set RX/TX target page @ ffffffc02ee15000, 4096 bytes, 43328 left
SUBSYSTEM=amba
DEVICE=+amba:f7106000.spi
7,1364,128437321,-;ssp-pl022 f7106000.spi: set RX/TX target page @ ffffffc02ee16000, 4096 bytes, 39232 left
SUBSYSTEM=amba
DEVICE=+amba:f7106000.spi
7,1365,128437421,-;ssp-pl022 f7106000.spi: set RX/TX target page @ ffffffc02ee17000, 4096 bytes, 35136 left
SUBSYSTEM=amba
DEVICE=+amba:f7106000.spi
7,1366,128437482,-;ssp-pl022 f7106000.spi: set RX/TX target page @ ffffffc02ee18000, 4096 bytes, 31040 left
SUBSYSTEM=amba
DEVICE=+amba:f7106000.spi
7,1367,128437522,-;ssp-pl022 f7106000.spi: set RX/TX target page @ ffffffc02ee19000, 4096 bytes, 26944 left
SUBSYSTEM=amba
DEVICE=+amba:f7106000.spi
7,1368,128437562,-;ssp-pl022 f7106000.spi: set RX/TX target page @ ffffffc02ee1a000, 4096 bytes, 22848 left
SUBSYSTEM=amba
DEVICE=+amba:f7106000.spi
7,1369,128437617,-;ssp-pl022 f7106000.spi: set RX/TX target page @ ffffffc02ee1b000, 4096 bytes, 18752 left
SUBSYSTEM=amba
DEVICE=+amba:f7106000.spi
7,1370,128437756,-;ssp-pl022 f7106000.spi: set RX/TX target page @ ffffffc02ee1c000, 4096 bytes, 14656 left
SUBSYSTEM=amba
DEVICE=+amba:f7106000.spi
7,1371,128437781,-;ssp-pl022 f7106000.spi: set RX/TX target page @ ffffffc02ee1d000, 4096 bytes, 10560 left
SUBSYSTEM=amba
DEVICE=+amba:f7106000.spi
7,1372,128437819,-;ssp-pl022 f7106000.spi: set RX/TX target page @ ffffffc02ee1e000, 4096 bytes, 6464 left
SUBSYSTEM=amba
DEVICE=+amba:f7106000.spi
7,1373,128437866,-;ssp-pl022 f7106000.spi: set RX/TX target page @ ffffffc02ee1f000, 4096 bytes, 2368 left
SUBSYSTEM=amba
DEVICE=+amba:f7106000.spi
4,1374,128437915,-;BUG: failure at drivers/spi/spi-pl022.c:939/setup_dma_scatter()!

** Some debug print are appended, so the page number 939 shown above is different from the original source code.

And below is OK case.
The data length and page size are same.
In this case, the pointer match to the page boundary.
So, the last 3,232 bytes are fitted in one page.

7,1319,128401790,-;ssp-pl022 f7106000.spi: [RX]16 pages(len:64672)
SUBSYSTEM=amba
DEVICE=+amba:f7106000.spi
7,1320,128401805,-;ssp-pl022 f7106000.spi: set RX/TX target page @ ffffffc02ee01000, 4096 bytes, 60576 left
SUBSYSTEM=amba
DEVICE=+amba:f7106000.spi
7,1321,128401818,-;ssp-pl022 f7106000.spi: set RX/TX target page @ ffffffc02ee02000, 4096 bytes, 56480 left
SUBSYSTEM=amba
DEVICE=+amba:f7106000.spi
7,1322,128401830,-;ssp-pl022 f7106000.spi: set RX/TX target page @ ffffffc02ee03000, 4096 bytes, 52384 left
SUBSYSTEM=amba
DEVICE=+amba:f7106000.spi
7,1323,128401842,-;ssp-pl022 f7106000.spi: set RX/TX target page @ ffffffc02ee04000, 4096 bytes, 48288 left
SUBSYSTEM=amba
DEVICE=+amba:f7106000.spi
7,1324,128401854,-;ssp-pl022 f7106000.spi: set RX/TX target page @ ffffffc02ee05000, 4096 bytes, 44192 left
SUBSYSTEM=amba
DEVICE=+amba:f7106000.spi
7,1325,128401866,-;ssp-pl022 f7106000.spi: set RX/TX target page @ ffffffc02ee06000, 4096 bytes, 40096 left
SUBSYSTEM=amba
DEVICE=+amba:f7106000.spi
7,1326,128401877,-;ssp-pl022 f7106000.spi: set RX/TX target page @ ffffffc02ee07000, 4096 bytes, 36000 left
SUBSYSTEM=amba
DEVICE=+amba:f7106000.spi
7,1327,128401889,-;ssp-pl022 f7106000.spi: set RX/TX target page @ ffffffc02ee08000, 4096 bytes, 31904 left
SUBSYSTEM=amba
DEVICE=+amba:f7106000.spi
7,1328,128401901,-;ssp-pl022 f7106000.spi: set RX/TX target page @ ffffffc02ee09000, 4096 bytes, 27808 left
SUBSYSTEM=amba
DEVICE=+amba:f7106000.spi
7,1329,128401913,-;ssp-pl022 f7106000.spi: set RX/TX target page @ ffffffc02ee0a000, 4096 bytes, 23712 left
SUBSYSTEM=amba
DEVICE=+amba:f7106000.spi
7,1330,128401925,-;ssp-pl022 f7106000.spi: set RX/TX target page @ ffffffc02ee0b000, 4096 bytes, 19616 left
SUBSYSTEM=amba
DEVICE=+amba:f7106000.spi
7,1331,128401937,-;ssp-pl022 f7106000.spi: set RX/TX target page @ ffffffc02ee0c000, 4096 bytes, 15520 left
SUBSYSTEM=amba
DEVICE=+amba:f7106000.spi
7,1332,128401949,-;ssp-pl022 f7106000.spi: set RX/TX target page @ ffffffc02ee0d000, 4096 bytes, 11424 left
SUBSYSTEM=amba
DEVICE=+amba:f7106000.spi
7,1333,128401961,-;ssp-pl022 f7106000.spi: set RX/TX target page @ ffffffc02ee0e000, 4096 bytes, 7328 left
SUBSYSTEM=amba
DEVICE=+amba:f7106000.spi
7,1334,128401973,-;ssp-pl022 f7106000.spi: set RX/TX target page @ ffffffc02ee0f000, 4096 bytes, 3232 left

Ah, so your buffer is not page-aligned.
Have you tried doing something like this? (I haven’t tested or compiled it btw)

[jramirez@igloo ~ (hikey-mainline-rebase *%)]$ git diff spi-pl022.c
diff --git a/drivers/spi/spi-pl022.c b/drivers/spi/spi-pl022.c
index 5e5fd77..a3688e8 100644
--- a/drivers/spi/spi-pl022.c
+++ b/drivers/spi/spi-pl022.c
@@ -942,7 +942,7 @@ static int configure_dma(struct pl022 *pl022)
                .direction = DMA_MEM_TO_DEV,
                .device_fc = false,
        };
-   unsigned int pages;
+ unsigned int pages, tx_pages, rx_pages;
        int ret;
        int rx_sglen, tx_sglen;
        struct dma_chan *rxchan = pl022-&gt;dma_rx_channel;
@@ -1033,7 +1033,7 @@ static int configure_dma(struct pl022 *pl022)
                tx_conf.dst_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
                break;
        }
-
+2
        /* SPI pecularity: we need to read and write the same width */
        if (rx_conf.src_addr_width == DMA_SLAVE_BUSWIDTH_UNDEFINED)
                rx_conf.src_addr_width = tx_conf.dst_addr_width;
@@ -1046,13 +1046,16 @@ static int configure_dma(struct pl022 *pl022)
 
        /* Create sglists for the transfers */
        pages = DIV_ROUND_UP(pl022-&gt;cur_transfer-&gt;len, PAGE_SIZE);
+ rx_pages = offset_in_page(pl022-&gt;rx) ? pages + 1 : pages;
+ tx_pages = offset_in_page(pl022-&gt;tx) ? pages + 1 : pages;
+
        dev_dbg(&amp;pl022-&gt;adev-&gt;dev, &quot;using %d pages for transfer\n&quot;, pages);
 
-   ret = sg_alloc_table(&amp;pl022-&gt;sgt_rx, pages, GFP_ATOMIC);
+ ret = sg_alloc_table(&amp;pl022-&gt;sgt_rx, rx_pages, GFP_ATOMIC);
        if (ret)
                goto err_alloc_rx_sg;
 
-   ret = sg_alloc_table(&amp;pl022-&gt;sgt_tx, pages, GFP_ATOMIC);
+ ret = sg_alloc_table(&amp;pl022-&gt;sgt_tx, tx_pages, GFP_ATOMIC);
        if (ret)
                goto err_alloc_tx_sg;

Thank you for reply.

I modified as follows, then sg_alloc_table was succeeded.

--- spi-pl022.c.org	2016-08-22 20:43:10.341880500 +0900
+++ spi-pl022.c	2016-08-22 20:41:54.571304200 +0900
@@ -942,7 +942,7 @@ static int configure_dma(struct pl022 *p
 		.direction = DMA_MEM_TO_DEV,
 		.device_fc = false,
 	};
-	unsigned int pages;
+	unsigned int pages, txpages, rxpages;
 	int ret;
 	int rx_sglen, tx_sglen;
 	struct dma_chan *rxchan = pl022-&gt;dma_rx_channel;
@@ -1038,11 +1038,13 @@ static int configure_dma(struct pl022 *p
 	dmaengine_slave_config(txchan, &amp;tx_conf);
 	/* Create sglists for the transfers */
 	pages = DIV_ROUND_UP(pl022-&gt;cur_transfer-&gt;len, PAGE_SIZE);
-	dev_dbg(&amp;pl022-&gt;adev-&gt;dev, "using %d pages for transfer\n", pages);
-	ret = sg_alloc_table(&amp;pl022-&gt;sgt_rx, pages, GFP_ATOMIC);
+	rxpages= pages; txpages= pages;
+	if (pl022-&gt;cur_transfer-&gt;len &gt; (pages* PAGE_SIZE- offset_in_page(pl022-&gt;rx))) rxpages++;
+	if (pl022-&gt;cur_transfer-&gt;len &gt; (pages* PAGE_SIZE- offset_in_page(pl022-&gt;tx))) txpages++;
+	ret = sg_alloc_table(&amp;pl022-&gt;sgt_rx, rxpages, GFP_ATOMIC);
 	if (ret)
 		goto err_alloc_rx_sg;
-	ret = sg_alloc_table(&amp;pl022-&gt;sgt_tx, pages, GFP_ATOMIC);
+	ret = sg_alloc_table(&amp;pl022-&gt;sgt_tx, txpages, GFP_ATOMIC);
 	if (ret)
 		goto err_alloc_tx_sg;
 	/* Fill in the scatterlists for the RX+TX buffers */

Should I get the page-aligned buffer? It’s possible to get the page-aligned buffer,
but there are some waste of memory.
Moreover, I need NOT to consider page-align when I use dragon410C board.
Please teach me!

yes, the driver requires a page aligned start of buffer so if that is something you can do, please do so you can continue to work upstream (I suppose you could also submit a patch to the maintainer with the proposed change to allow for your use case)

btw just curious, didn’t my patch work?

Thank you for your reply.

Your patch caused other problem.
“DMA ERR” message printed out to /dev/kmsg by the dev_warn() in k3_dma_inthandler() function.
I think that the extra page allocation may be problem, but I didn’t analyze the error.
Even if offset is slight(Not eqaul to zero), pages are calculated as 17.
I suppose the extra one page cause the problem because 16 pages are in sufficient.

btw, who are the maintainer? Do they belong to Linaro or AOSP? I don’t know the basic things.
Could you tell me?

At last, I paste my patch again. Because the line number in previous patch is little strange.

diff --git a/drivers/spi/spi-pl022.c b/drivers/spi/spi-pl022.c
index 94af806..1e7ba50 100644
--- a/drivers/spi/spi-pl022.c
+++ b/drivers/spi/spi-pl022.c
@@ -942,7 +942,7 @@ static int configure_dma(struct pl022 *pl022)
 		.direction = DMA_MEM_TO_DEV,
 		.device_fc = false,
 	};
-	unsigned int pages;
+	unsigned int pages, txpages, rxpages;
 	int ret;
 	int rx_sglen, tx_sglen;
 	struct dma_chan *rxchan = pl022-&gt;dma_rx_channel;
@@ -1046,13 +1046,16 @@ static int configure_dma(struct pl022 *pl022)
 
 	/* Create sglists for the transfers */
 	pages = DIV_ROUND_UP(pl022-&gt;cur_transfer-&gt;len, PAGE_SIZE);
-	dev_dbg(&amp;pl022-&gt;adev-&gt;dev, "using %d pages for transfer\n", pages);
+	rxpages= pages; txpages= pages;
+	if (pl022-&gt;cur_transfer-&gt;len &gt; (pages* PAGE_SIZE- offset_in_page(pl022-&gt;rx))) rxpages++;
+	if (pl022-&gt;cur_transfer-&gt;len &gt; (pages* PAGE_SIZE- offset_in_page(pl022-&gt;tx))) txpages++;
+	dev_dbg(&amp;pl022-&gt;adev-&gt;dev, "using RX:%d / TX:%d pages for transfer\n", rxpages, txpages);
 
-	ret = sg_alloc_table(&amp;pl022-&gt;sgt_rx, pages, GFP_ATOMIC);
+	ret = sg_alloc_table(&amp;pl022-&gt;sgt_rx, rxpages, GFP_ATOMIC);
 	if (ret)
 		goto err_alloc_rx_sg;
 
-	ret = sg_alloc_table(&amp;pl022-&gt;sgt_tx, pages, GFP_ATOMIC);
+	ret = sg_alloc_table(&amp;pl022-&gt;sgt_tx, txpages, GFP_ATOMIC);
 	if (ret)
 		goto err_alloc_tx_sg;
 

Seems strange, both patches are supposed to do the same thing (ie, increase the number of entries in the table if there is an offset in the page).

In any case, this is the standard process for submitting changes to the AOSP; since the changes are in the linux kernel, the following fragment is of relevance.

[...] Android makes use of a number of other open source projects, such as the Linux kernel and WebKit, as described in Codelines, Branches, and Releases. For most projects under external/, changes should be made upstream and then the Android maintainers informed of the new upstream release containing these changes. [...]

For the linux kernel, you should follow the instructions in the following documentation. You can find the list of maintainers in the following file in the Linux kernel sources.

[…]

SPI SUBSYSTEM
M: Mark Brown broonie@kernel.org
L: linux-spi@vger.kernel.org
T: git git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git
Q: http://patchwork.kernel.org/project/spi-devel-general/list/
S: Maintained
F: Documentation/devicetree/bindings/spi/
F: Documentation/spi/
F: drivers/spi/
F: include/linux/spi/
F: include/uapi/linux/spi/
[…]

A probably faster way to get your changes merged in the AOSP - yuo should still have the discussion with the MAINTAINERS in the spi mailing list- is to follow the process documented by Akira in this post

Thank you for the detail information!

Hmm…
I think that it’s difficult to submit patch by a beginner like me.
But, I will try it with your advice!

I appreciate your eager support!