aboutsummaryrefslogtreecommitdiff
path: root/arch/arm/lib/copy_page.S
Commit message (Collapse)AuthorAgeFilesLines
* Optimize copy_page for modern ARM platformsHarm Hanemaaijer2017-04-111-24/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | The existing implementation of copy_page for ARM appears to be optimized for older platforms. Benchmark testing in a sandbox environment shows suboptimal performance on modern platforms like armv6 and armv7, with speed-ups ranging from 10% (Cortex A8) to 80% (armv6 used in Raspberry Pi) being achievable. This commit optimizes copy_page and introduces the new compile-time constant PREFETCH_DISTANCE, defined in cache.h, which when multiplied by L1_CACHE_BYTES is equal to the offset used for prefetches performed with the PLD instruction. For platforms where L1_CACHE_BYTES is 32 (armv5 and armv6), copy_page processes 32 bytes at a time while doing one prefetch per iteration, while for armv7 (with L1_CACHE_BYTES equal to 64), 64 bytes are processed at at time with one prefetch per iteration. When no preload instruction is available (platforms earlier than armv5), no preload instructions are generated and 32 bytes are processed at at time. To facilitate specifying instructions for architectures with no preload instruction, the NO_PLD macro is added to assembler.h, augmenting the PLD macro. Signed-off-by: Harm Hanemaaijer <fgenfb@yahoo.com> Signed-off-by: RyTek <rytek1128@outlook.com>
* first commitMeizu OpenSource2016-08-151-0/+47