path: root/arch/arm/include/asm/assembler.h
Commit message | Author | Age | Files | Lines
* ARM: spectre-v1: add speculation barrier (csdb) macros | Russell King | 2019-05-03 | 1 | -0/+8

    Add assembly and C macros for the new CSDB instruction.

    Change-Id: Iff3490a0ebc290edf22128eba9e367dc5134fb3e
    Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
    Acked-by: Mark Rutland <mark.rutland@arm.com>
    Boot-tested-by: Tony Lindgren <tony@atomide.com>
    Reviewed-by: Tony Lindgren <tony@atomide.com>
* Optimize ARM memset and memzero functions | Harm Hanemaaijer | 2017-04-11 | 1 | -0/+27

    The ARM memset and memzero functions are optimized with lower overhead
    for small requests, generation of more 16-bit Thumb2 instructions when
    compiled in Thumb2 mode, and configurable destination alignment before
    the main block-copying loop.

    The new compile-time constant MEMSET_WRITE_ALIGN_BYTES is introduced in
    assembler.h, to augment the CALGN macro that previously regulated
    32-byte write alignment but was only used on the Feroceon platform.
    MEMSET_WRITE_ALIGN_BYTES can have values of 0 (no write alignment), 8,
    or 32. Apart from Feroceon, memset write alignment of 32 bytes appears
    to benefit the armv6 platform, while the armv7 platform seems to benefit
    from alignment to 8 bytes for memset/memzero.

    The CALGN macro is renamed to MEMSET_CALGN for memset and memzero; the
    original CALGN macro is reserved for the memcpy family of functions
    (memcpy, copy_from_user, copy_to_user) currently implemented in
    copy_template.S, and the associated compile-time constant
    WRITE_ALIGN_BYTES defines the write alignment for memcpy-related
    functions that will be utilized in subsequent contributions. Because the
    current CALGN implementation in copy_template.S only implements 32-byte
    write alignment and is broken on Thumb2, it is only enabled when
    WRITE_ALIGN_BYTES is equal to 32 and Thumb2 mode is not enabled.

    Finally, memset and memzero now include a directive to enable unified
    ARM assembler syntax.

    Signed-off-by: Harm Hanemaaijer <fgenfb@yahoo.com>
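Note: the write-alignment idea in the commit above can be illustrated in C. The sketch below is not the kernel's assembly; it only computes how many leading bytes would have to be stored before a destination pointer reaches the configured alignment. The function name bytes_until_aligned and its parameters are hypothetical, and the permitted alignment values (0, 8, 32) are taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the kernel): number of bytes to write
 * before `dest` is aligned to `align_bytes`. A value of 0 means "no write
 * alignment", mirroring the 0 / 8 / 32 choices named in the commit message.
 */
static size_t bytes_until_aligned(uintptr_t dest, unsigned align_bytes)
{
    /* Only the values listed for MEMSET_WRITE_ALIGN_BYTES are accepted. */
    assert(align_bytes == 0 || align_bytes == 8 || align_bytes == 32);

    if (align_bytes == 0)
        return 0;

    /* (-dest) mod align_bytes, valid because align_bytes is a power of two. */
    return (size_t)(((uintptr_t)0 - dest) & (uintptr_t)(align_bytes - 1));
}
```

For a destination address ending in 0x03, this gives 5 leading bytes for 8-byte alignment and 29 for 32-byte alignment, which is the overhead the main block-writing loop would have to absorb before running on aligned addresses.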
* Optimize copy_page for modern ARM platforms | Harm Hanemaaijer | 2017-04-11 | 1 | -0/+2

    The existing implementation of copy_page for ARM appears to be optimized
    for older platforms. Benchmark testing in a sandbox environment shows
    suboptimal performance on modern platforms like armv6 and armv7, with
    speed-ups ranging from 10% (Cortex A8) to 80% (armv6 used in Raspberry
    Pi) being achievable.

    This commit optimizes copy_page and introduces the new compile-time
    constant PREFETCH_DISTANCE, defined in cache.h, which when multiplied by
    L1_CACHE_BYTES is equal to the offset used for prefetches performed with
    the PLD instruction. For platforms where L1_CACHE_BYTES is 32 (armv5 and
    armv6), copy_page processes 32 bytes at a time while doing one prefetch
    per iteration, while for armv7 (with L1_CACHE_BYTES equal to 64), 64
    bytes are processed at a time with one prefetch per iteration. When no
    preload instruction is available (platforms earlier than armv5), no
    preload instructions are generated and 32 bytes are processed at a time.

    To facilitate specifying instructions for architectures with no preload
    instruction, the NO_PLD macro is added to assembler.h, augmenting the
    PLD macro.

    Signed-off-by: Harm Hanemaaijer <fgenfb@yahoo.com>
    Signed-off-by: RyTek <rytek1128@outlook.com>
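Note: the arithmetic described in the commit above (prefetch offset = PREFETCH_DISTANCE * L1_CACHE_BYTES, one cache line copied per loop iteration) can be sketched in C. The commit message does not state the actual value of PREFETCH_DISTANCE, so the value 3 below is purely an assumption for illustration, as are the helper names and the 4096-byte page size used in the example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed value for illustration only; the real constant lives in cache.h. */
#define EXAMPLE_PREFETCH_DISTANCE 3

/* Byte offset ahead of the current source pointer that a PLD would target. */
static size_t prefetch_offset(size_t l1_cache_bytes, size_t prefetch_distance)
{
    /* The commit message mentions cache line sizes of 32 and 64 bytes. */
    assert(l1_cache_bytes == 32 || l1_cache_bytes == 64);
    return prefetch_distance * l1_cache_bytes;
}

/* Loop iterations needed to copy one page, one cache line per iteration. */
static size_t copy_page_iterations(size_t page_size, size_t l1_cache_bytes)
{
    assert(l1_cache_bytes != 0 && page_size % l1_cache_bytes == 0);
    return page_size / l1_cache_bytes;
}
```

With the assumed distance of 3, a 32-byte cache line gives a prefetch offset of 96 bytes and a 64-byte line gives 192 bytes; a 4096-byte page takes 128 iterations in the first case and 64 in the second.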
* Rename ARM assembler push/pull macros | Harm Hanemaaijer | 2017-04-11 | 1 | -4/+4

    The ARM assembler library functions use a macro called "push" that,
    along with a macro called "pull", is used to shift bytes around in a
    word in an endian-independent way. However, the modern unified ARM
    assembler syntax also defines the instruction "push" to push data onto
    the stack, which has specific encodings in the Thumb2 instruction set.

    To prevent possible conflicts going forward, and to allow the use of the
    more transparent "push" instruction along with the modern unified
    assembler syntax, this patch renames all occurrences of the "push" macro
    to "pushbits", as well as renaming the macro argument, when also called
    "push", to "pushshift". For consistency, the macro called "pull" and its
    argument name "pull" are also renamed to "pullbits" and "pullshift",
    respectively.

    Signed-off-by: Harm Hanemaaijer <fgenfb@yahoo.com>
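Note: the endian-independent byte shifting that pushbits/pullbits provide can be modelled in C. This is a sketch, not the kernel macros: it assumes that "pull" corresponds to a logical shift right on little-endian and a logical shift left on big-endian (and "push" to the opposite), and the function realign together with its big_endian flag is hypothetical. It combines two aligned words into the word that starts byte_offset bytes into the first one.

```c
#include <assert.h>
#include <stdint.h>

/* C model of pullbits: shift right on little-endian, left on big-endian. */
static uint32_t pullbits(uint32_t w, unsigned pullshift, int big_endian)
{
    return big_endian ? (w << pullshift) : (w >> pullshift);
}

/* C model of pushbits: the opposite direction of pullbits. */
static uint32_t pushbits(uint32_t w, unsigned pushshift, int big_endian)
{
    return big_endian ? (w >> pushshift) : (w << pushshift);
}

/*
 * Hypothetical helper: given two consecutive aligned words w0 and w1 as
 * loaded from memory, return the 32-bit word that begins byte_offset
 * bytes into w0. pullshift + pushshift is always 32.
 */
static uint32_t realign(uint32_t w0, uint32_t w1, unsigned byte_offset,
                        int big_endian)
{
    /* Offsets 1..3 only; 0 would require an undefined 32-bit shift. */
    assert(byte_offset >= 1 && byte_offset <= 3);

    unsigned pullshift = 8 * byte_offset;
    unsigned pushshift = 32 - pullshift;

    return pullbits(w0, pullshift, big_endian) |
           pushbits(w1, pushshift, big_endian);
}
```

For memory bytes 11 22 33 44 55 66 77 88, the same call with byte_offset 1 yields the word covering bytes 22 33 44 55 on either endianness; only the shift directions differ, which is what lets a single assembly source serve both byte orders.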
* first commit | Meizu OpenSource | 2016-08-15 | 1 | -0/+364