mm/page_alloc: use only one PCP list for THP-sized allocations

The per_cpu_pages structure is cache-aligned on a standard x86-64
distribution configuration, but a later patch will add a new field that
would push the structure into the next cache line.  Use only one list to
store THP-sized pages on the per-cpu list.  This assumes that the vast
majority of THP-sized allocations are GFP_MOVABLE, but even if they were
of another type, they would not contribute to serious fragmentation that
could cause a later THP allocation failure.  Align per_cpu_pages on the
cacheline boundary to ensure there is no false cache sharing.
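
As an illustration only, the single shared THP list can be sketched as
an index mapping.  The constant values below (three per-cpu migrate
types, a costly order of 3, a THP order of 9) and the helper name
pcp_list_index() are assumptions made for this sketch, not code taken
from the patch:

```c
#include <assert.h>

/* Assumed values for illustration; they are not quoted from this patch. */
#define MIGRATE_PCPTYPES        3  /* e.g. unmovable, movable, reclaimable */
#define PAGE_ALLOC_COSTLY_ORDER 3  /* orders 0..3 get per-migratetype lists */

#define NR_LOWORDER_PCP_LISTS (MIGRATE_PCPTYPES * (PAGE_ALLOC_COSTLY_ORDER + 1))
#define NR_PCP_THP            1    /* one shared list for THP-sized pages */
#define NR_PCP_LISTS          (NR_LOWORDER_PCP_LISTS + NR_PCP_THP)

/* 12 low-order lists plus one THP list matches lists[13] in the layout. */
static_assert(NR_PCP_LISTS == 13, "expected 13 per-cpu lists");

/*
 * Hypothetical helper: map (migratetype, order) to a slot in lists[].
 * Any order above the costly order is treated as THP-sized and lands on
 * the single shared list, regardless of migratetype.
 */
static unsigned int pcp_list_index(int migratetype, int order)
{
	if (order > PAGE_ALLOC_COSTLY_ORDER)
		return NR_LOWORDER_PCP_LISTS;

	return MIGRATE_PCPTYPES * order + migratetype;
}
```

With these assumed values, every migratetype at order 9 maps to slot 12,
while orders 0 to 3 keep one list per migratetype in slots 0 to 11.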

After this patch, the structure sizing is:

struct per_cpu_pages {
        int                        count;                /*     0     4 */
        int                        high;                 /*     4     4 */
        int                        batch;                /*     8     4 */
        short int                  free_factor;          /*    12     2 */
        short int                  expire;               /*    14     2 */
        struct list_head           lists[13];            /*    16   208 */

        /* size: 256, cachelines: 4, members: 6 */
        /* padding: 32 */
} __attribute__((__aligned__(64)));
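
The sizing above can be cross-checked with a small userspace sketch.
It assumes an LP64 target (8-byte pointers) and a GCC-compatible
compiler; struct list_head_sketch and struct pcp_sketch are stand-ins
written for this example, not the kernel definitions:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct list_head: two pointers, 16 bytes on LP64. */
struct list_head_sketch {
	struct list_head_sketch *next, *prev;
};

/* Stand-in mirroring the member order and types shown in the layout dump. */
struct pcp_sketch {
	int count;
	int high;
	int batch;
	short free_factor;
	short expire;
	struct list_head_sketch lists[13];
} __attribute__((__aligned__(64)));

/* 4 + 4 + 4 + 2 + 2 = 16 bytes of scalars precede the list array. */
static_assert(offsetof(struct pcp_sketch, lists) == 16, "lists offset");

/* 16 + 13 * 16 = 224 bytes of members, rounded up to 256 by the alignment. */
static_assert(sizeof(struct pcp_sketch) == 256, "four 64-byte cache lines");
static_assert(_Alignof(struct pcp_sketch) == 64, "cache line aligned");
```

The 224 bytes of members round up to 256 under the 64-byte alignment,
which accounts for the 32 bytes of padding reported above.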

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Mel Gorman <[email protected]>
Tested-by: Minchan Kim <[email protected]>
Acked-by: Minchan Kim <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Tested-by: Yu Zhao <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Marcelo Tosatti <[email protected]>
Cc: Marek Szyprowski <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Nicolas Saenz Julienne <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
2 files changed