aboutsummaryrefslogtreecommitdiff
path: root/kernel
Commit message (Collapse)AuthorAgeFilesLines
* kernel: Only expose su when daemon is runningTom Marshall2017-05-213-0/+39
| | | | | | | | | | | | | | It has been claimed that the PG implementation of 'su' has security vulnerabilities even when disabled. Unfortunately, the people that find these vulnerabilities often like to keep them private so they can profit from exploits while leaving users exposed to malicious hackers. In order to reduce the attack surface for vulnerabilites, it is therefore necessary to make 'su' completely inaccessible when it is not in use (except by the root and system users). Change-Id: I79716c72f74d0b7af34ec3a8054896c6559a181d
* UPSTREAM: tracing: Fix trace_printk() to print when not using bprintk()Joe Maples2017-05-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The trace_printk() code will allocate extra buffers if the compile detects that a trace_printk() is used. To do this, the format of the trace_printk() is saved to the __trace_printk_fmt section, and if that section is bigger than zero, the buffers are allocated (along with a message that this has happened). If trace_printk() uses a format that is not a constant, and thus something not guaranteed to be around when the print happens, the compiler optimizes the fmt out, as it is not used, and the __trace_printk_fmt section is not filled. This means the kernel will not allocate the special buffers needed for the trace_printk() and the trace_printk() will not write anything to the tracing buffer. Adding a "__used" to the variable in the __trace_printk_fmt section will keep it around, even though it is set to NULL. This will keep the string from being printed in the debugfs/tracing/printk_formats section as it is not needed. Reported-by: Vlastimil Babka <vbabka@suse.cz> Fixes: 07d777fe8c398 "tracing: Add percpu buffers for trace_printk()" Cc: stable@vger.kernel.org # v3.5+ Bug: 34277115 Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Change-Id: I10ce56caa41c7644d9d290d9ed272a6d156c938c Signed-off-by: Joe Maples <joe@frap129.org>
* perf: don't leave group_entry on sibling list (use-after-free)John Dias2017-05-071-0/+7
| | | | | | | | | | | | | When perf_group_detach is called on a group leader, it should empty its sibling list. Otherwise, when a sibling is later deallocated, list_del_event() removes the sibling's group_entry from its current list, which can be the now-deallocated group leader's sibling list (use-after-free bug). Bug: 32402548 Change-Id: I99f6bc97c8518df1cb0035814368012ba72ab1f1 Signed-off-by: John Dias <joaodias@google.com>
* um: fix buildMister Oyster2017-05-041-0/+2
|
* BACKPORT: trace: resolve stack corruption due to string copyAmey Telawane2017-05-021-1/+1
| | | | | | | | | | | Strcpy has no limit on string being copied which causes stack corruption leading to kernel panic. Use strlcpy to resolve the issue by providing length of string to be copied. CRs-fixed: 1048480 Bug: 35399704 Change-Id: Ib290b25f7e0ff96927b8530e5c078869441d409f Signed-off-by: Amey Telawane <ameyt@codeaurora.org>
* tracing: do not leak kernel addressesNick Desaulniers2017-05-021-1/+1
| | | | | | | | This likely breaks tracing tools like trace-cmd. It logs in the same format but now addresses are all 0x0. Bug: 34277115 Change-Id: Ifb0d4d2a184bf0d95726de05b1acee0287a375d9
* sched: Disabled Gentle Fair Sleepers for better UI performanceCristoforo Cataldo2017-04-251-1/+1
|
* drivers:lmk: Fix double delete issueHong-Mei Li2017-04-171-0/+3
| | | | | | | | | | | | | | | | | someone may change a process's oom_score_adj by proc fs, even though the process has exited. In that case, the task was deleted from the rb tree already, and the redundant deleting would trigger rb_erase panic finally. In this patch, we make sure to clear the node after deteting and check its empty status before rb_erase. Change-Id: I26098ca3350f111e94567f9e65ec3dce413197aa Signed-off-by: Hong-Mei Li <a21834@motorola.com> Reviewed-on: http://gerrit.mot.com/727760 SME-Granted: SME Approvals Granted SLTApproved: Slta Waiver <sltawvr@motorola.com> Tested-by: Jira Key <jirakey@motorola.com> Reviewed-by: Sheng-Zhe Zhao <a18689@motorola.com> Submit-Approved: Jira Key <jirakey@motorola.com>
* alarmtimer: fix log and indentMister Oyster2017-04-161-3/+6
|
* mtk: alarmtimer: get rid of xlog and debugMister Oyster2017-04-131-29/+4
|
* Set the device to wake up 1 1/2 minutes beforeMarcos Marado2017-04-131-1/+2
| | | | | | | | | | | | | Setting RTC_PWRON_SEC to -90, so it will wake up 1 and 1/2 minutes before the requested time, giving time for the system to boot up and AlarmManager to be ready for when the time comes. Note: Original commit was for 120 seconds, reduced because of faster hardware. Change-Id: Ie579529f417ad2f6044e347a55e88d1b72503731 Ticket: PORRIDGE-12 Signed-off-by: Mister Oyster <oysterized@gmail.com>
* power: Remove HAS_WAKELOCK config and document WAKELOCKDylan Reid2017-04-131-5/+6
| | | | | | | | | | Remove the HAS_WAKELOCK config as it doesn't seem to have been used in the 3.10 or 3.14 kernels. Add some Documentation to CONFIG_WAKELOCK so that it is selectable and can be disabled is desired. Signed-off-by: Dylan Reid <dgreid@chromium.org>
* wakeup_reason: use vsnprintf instead of snsprintf for vargs.Ruchi Kandoi2017-04-131-1/+1
| | | | | Bug: 22368519 Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>
* power: wakeup_reason: fix suspend time reportingAmit Pundir2017-04-131-16/+25
| | | | | | | | | | | | | | | | | Suspend time reporting Change-Id: I2cb9a9408a5fd12166aaec11b935a0fd6a408c63 (Power: Report suspend times from last_suspend_time), is broken on 3.16+ kernels because get_xtime_and_monotonic_and_sleep_offset() hrtimer helper routine is removed from kernel timekeeping. The replacement helper routines ktime_get_update_offsets_{tick,now}() are private to core kernel timekeeping so we can't use them, hence using ktime_get() and ktime_get_boottime() instead and sampling the time twice. Idea is to use Monotonic boottime offset to calculate total time spent in last suspend state and CLOCK_MONOTONIC to calculate time spent in last suspend-resume process. Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
* alarmtimer: Export symbols of functions declared in linux/alarmtimer.hMarcus Gelderie2017-04-131-1/+9
| | | | | | | | | | | Export symbols so they can be used by drivers/staging/android/alarm-dev.c if it is built as a module. So far alarm-dev is built-in but module support is planned (see drivers/staging/android/TODO). Signed-off-by: Marcus Gelderie <redmnic@gmail.com> [jstultz: tweaked commit message, also export newly added functions] Signed-off-by: John Stultz <john.stultz@linaro.org>
* alarmtimer: Export symbols of alarmtimer_get_rtcdevPramod Gurav2017-04-131-1/+1
| | | | | | | | | | | Export symbol of alarmtimer_get_rtcdev so that it is used by any driver when built as module like, drivers/staging/android/alarm-dev.c. CC: John Stultz <john.stultz@linaro.org> CC: Marcus Gelderie <redmnic@gmail.com> Signed-off-by: Pramod Gurav <pramod.gurav.etc@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* UPSTREAM: seccomp: always propagate NO_NEW_PRIVS on tsyncJann Horn2017-04-131-11/+11
| | | | | | | | | | | | | | | | Before this patch, a process with some permissive seccomp filter that was applied by root without NO_NEW_PRIVS was able to add more filters to itself without setting NO_NEW_PRIVS by setting the new filter from a throwaway thread with NO_NEW_PRIVS. Signed-off-by: Jann Horn <jann@thejh.net> Cc: stable@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Bug: 36656103 (cherry-picked from commit 103502a35cfce0710909da874f092cb44823ca03) Signed-off-by: Paul Lawrence <paullawrence@google.com> Change-Id: I5abd7daab9172f1dfd53e11706b7c7f331f2f4f1
* f2fs: avoid hungtask problem caused by losing wake_upYunlei He2017-04-131-0/+1
| | | | | | | | | | | | | | | | | | | The D state of wait_on_all_pages_writeback should be waken by function f2fs_write_end_io when all writeback pages have been succesfully written to device. It's possible that wake_up comes between get_pages and io_schedule. Maybe in this case it will lost wake_up and still in D state even if all pages have been write back to device, and finally, the whole system will be into the hungtask state. if (!get_pages(sbi, F2FS_WRITEBACK)) break; <--------- wake_up io_schedule(); Signed-off-by: Yunlei He <heyunlei@huawei.com> Signed-off-by: Biao He <hebiao6@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* perf: Cure event->pending_disable racePeter Zijlstra2017-04-131-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 28a967c3a2f99fa3b5f762f25cb2a319d933571b upstream. Because event_sched_out() checks event->pending_disable _before_ actually disabling the event, it can happen that the event fires after it checks but before it gets disabled. This would leave event->pending_disable set and the queued irq_work will try and process it. However, if the event trigger was during schedule(), the event might have been de-scheduled by the time the irq_work runs, and perf_event_disable_local() will fail. Fix this by checking event->pending_disable _after_ we call event->pmu->del(). This depends on the latter being a compiler barrier, such that the compiler does not lift the load and re-creates the problem. Tested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dvyukov@google.com Cc: eranian@google.com Cc: oleg@redhat.com Cc: panand@redhat.com Cc: sasha.levin@oracle.com Cc: vince@deater.net Link: http://lkml.kernel.org/r/20160224174948.040469884@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: mydongistiny <jaysonedson@gmail.com>
* sched/fair: Optimize find_idlest_cpu() when there is no choiceMorten Rasmussen2017-04-131-0/+4
| | | | | | | | | | | | | | | | | | | | | | In the current find_idlest_group()/find_idlest_cpu() search we end up calling find_idlest_cpu() in a sched_group containing only one CPU in the end. Checking idle-states becomes pointless when there is no alternative, so bail out instead. Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dietmar.eggemann@arm.com Cc: linux-kernel@vger.kernel.org Cc: mgalbraith@suse.de Cc: vincent.guittot@linaro.org Cc: yuyang.du@intel.com Link: http://lkml.kernel.org/r/1466615004-3503-4-git-send-email-morten.rasmussen@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: RyTek <rytek1128@outlook.com>
* idle: Implement a per-cpu idle-polling modeVikram Mulukutla2017-04-131-2/+25
| | | | | | | | | | | | | | | | cpu_idle_poll_ctrl provides a way of switching the idle thread to use cpu_idle_poll instead of the arch specific lower power mode callbacks (arch_cpu_idle). cpu_idle_poll spins on a flag in a tight loop with interrupts enabled. In some cases it may be useful to enter the tight loop polling mode only on a particular CPU. This allows other CPUs to continue using the arch specific low power mode callbacks. Provide an API that allows this. Change-Id: I7c47c3590eb63345996a1c780faa79dbd1d9fdb4 Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org>
* idle: exit the cpu_idle_poll loop if cpu_idle_force_poll is clearedVikram Mulukutla2017-04-131-1/+1
| | | | | | | | | | | | | | | | | | | | cpu_idle_poll_ctrl allows the enabling/disabling of the idle polling mode; this mode allows a CPU to spin waiting for a new task to be scheduled rather than having to execute the arch specific idle code. However, the loop that checks for a new task does not look at the flag that enables idle polling mode. So, the CPU may continue to spin even though the aforementioned flag has been cleared. Since the CPU is already in idle, it may be a while before a task is scheduled, precluding potential power savings. Modify the while loop conditional in question to also check if the cpu_idle_force_poll flag is set. Change-Id: Ia2e83af97890dc399b86e090459a41d31ce28b6c Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org>
* idle: Add a memory barrier after setting cpu_idle_force_pollVikram Mulukutla2017-04-131-0/+3
| | | | | | | | | To ensure that CPUs see cpu_idle_force_poll flag updates, add a memory barrier after writing to the flag. Change-Id: Ic3fdef7d17b673247bce5093530ce8aa08694632 Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org>
* kernel: trace: fix misleading-indentation warningNathan Chancellor2017-04-131-1/+1
| | | | | | | | | | | | kernel/trace/trace_output.c: In function 'trace_graph_ret_raw': kernel/trace/trace_output.c:1198:2: warning: this 'if' clause does not guard... [-Wmisleading-indentation] if (!trace_seq_printf(&iter->seq, "%lx %lld %lld %ld %d\n", ^~ kernel/trace/trace_output.c:1204:3: note: ...this statement, but the latter is misleadingly indented as if it is guarded by the 'if' return TRACE_TYPE_PARTIAL_LINE; ^~~~~~ Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
* sysctl: fix maybe-uninitialized warningsNathan Chancellor2017-04-131-2/+2
| | | | | | | | | | | | | | | | | | | | | kernel/sysctl.c: In function '__do_proc_dointvec.isra.3': kernel/sysctl.c:2030:8: warning: 'kbuf' may be used uninitialized in this function [-Wmaybe-uninitialized] char *tmp = skip_spaces(*buf); ^~~ kernel/sysctl.c:2183:8: note: 'kbuf' was declared here char *kbuf; ^~~~ kernel/sysctl.c: In function '__do_proc_doulongvec_minmax': kernel/sysctl.c:2030:8: warning: 'kbuf' may be used uninitialized in this function [-Wmaybe-uninitialized] char *tmp = skip_spaces(*buf); ^~~ kernel/sysctl.c:2433:8: note: 'kbuf' was declared here char *kbuf; ^~~~ This will be initialized to NULL normally. Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
* kernel/panic.c: add missing \nJiri Slaby2017-04-131-1/+1
| | | | | | | | | | When a system panics, the "Rebooting in X seconds.." message is never printed because it lacks a new line. Fix it. Link: http://lkml.kernel.org/r/20170119114751.2724-1-jslaby@suse.cz Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mtk: squashed security updatesMoyster2017-04-111-1/+1
|
* Get rid of __cpuinitMoyster2017-04-1119-42/+42
| | | | | | | | | | | | | | | | | | | | | This commit is the result of find . -name '*.c' | xargs sed -i 's/ __cpuinit / /g' find . -name '*.c' | xargs sed -i 's/ __cpuexit / /g' find . -name '*.c' | xargs sed -i 's/ __cpuinitdata / /g' find . -name '*.c' | xargs sed -i 's/ __cpuinit$//g' find ./arch/ -name '*.h' | xargs sed -i 's/ __cpuinit//g' find . -name '*.c' | xargs sed -i 's/^__cpuinit //g' find . -name '*.c' | xargs sed -i 's/^__cpuinitdata //g' find . -name '*.c' | xargs sed -i 's/\*__cpuinit /\*/g' find . -name '*.c' | xargs sed -i 's/ __cpuinitconst / /g' find . -name '*.h' | xargs sed -i 's/ __cpuinit / /g' find . -name '*.h' | xargs sed -i 's/ __cpuinitdata / /g' git add . git reset include/linux/init.h git checkout -- include/linux/init.h based off : https://github.com/jollaman999/jolla-kernel_bullhead/commit/bc15db84a622eed7d61d3ece579b577154d0ec29
* Power: Report suspend times from last_suspend_timejinqian2017-04-111-0/+36
| | | | | | | | | This node epxorts two values separated by space. From left to right: 1. time spent in suspend/resume process 2. time spent sleep in suspend state Change-Id: I2cb9a9408a5fd12166aaec11b935a0fd6a408c63
* perf: Do not double freePeter Zijlstra2017-04-111-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | In case of: err_file: fput(event_file), we'll end up calling perf_release() which in turn will free the event. Do not then free the event _again_. Change-Id: Ic1de33d0e29e577a1fc2e00c35bf44df26d96ab6 Tested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dvyukov@google.com Cc: eranian@google.com Cc: oleg@redhat.com Cc: panand@redhat.com Cc: sasha.levin@oracle.com Cc: vince@deater.net Link: http://lkml.kernel.org/r/20160224174947.697350349@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* kernel: support task's adj rbtreeYi-wei Zhao2017-04-112-0/+2
| | | | | | | | | | | | | Add (or del) a task to (or from) task's adj rbtree when a task is created or exit. Change-Id: Ic63e03355a1fed8c500097bad223c59c742a2346 Signed-off-by: Hong-Mei Li <a21834@motorola.com> Signed-off-by: Yi-wei Zhao <gbjc64@motorola.com> Reviewed-on: http://gerrit.mot.com/701207 SLTApproved: Slta Waiver <sltawvr@motorola.com> Tested-by: Jira Key <jirakey@motorola.com> Submit-Approved: Jira Key <jirakey@motorola.com>
* perf/core: Fix concurrent sys_perf_event_open() vs. 'move_group' racePeter Zijlstra2017-04-111-4/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 321027c1fe77f892f4ea07846aeae08cefbbb290 upstream. Di Shen reported a race between two concurrent sys_perf_event_open() calls where both try and move the same pre-existing software group into a hardware context. The problem is exactly that described in commit: f63a8daa5812 ("perf: Fix event->ctx locking") ... where, while we wait for a ctx->mutex acquisition, the event->ctx relation can have changed under us. That very same commit failed to recognise sys_perf_event_context() as an external access vector to the events and thereby didn't apply the established locking rules correctly. So while one sys_perf_event_open() call is stuck waiting on mutex_lock_double(), the other (which owns said locks) moves the group about. So by the time the former sys_perf_event_open() acquires the locks, the context we've acquired is stale (and possibly dead). Apply the established locking rules as per perf_event_ctx_lock_nested() to the mutex_lock_double() for the 'move_group' case. This obviously means we need to validate state after we acquire the locks. Change-Id: I83d360303e812232ae7aae492350813f0e79cc71 Reported-by: Di Shen (Keen Lab) Tested-by: John Dias <joaodias@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Min Chong <mchong@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: f63a8daa5812 ("perf: Fix event->ctx locking") Link: http://lkml.kernel.org/r/20170106131444.GZ3174@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org> [bwh: Backported to 3.2: - Use ACCESS_ONCE() instead of READ_ONCE() - Test perf_event::group_flags instead of group_caps - Add the err_locked cleanup block, which we didn't need before - Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* BACKPORT: perf: Fix event->ctx lockingJohn Dias2017-04-111-37/+207
| | | | | | | | | | | | | | | | | | | | | | | | | There have been a few reported issues wrt. the lack of locking around changing event->ctx. This patch tries to address those. It avoids the whole rwsem thing; and while it appears to work, please give it some thought in review. What I did fail at is sensible runtime checks on the use of event->ctx, the RCU use makes it very hard. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20150123125834.209535886@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit f63a8daa5812afef4f06c962351687e1ff9ccb2b) Bug: 30955111 Bug: 31095224 Change-Id: I5bab713034e960fad467637e98e914440de5666d
* perf: protect group_leader from races that cause ctx double-freeJohn Dias2017-04-111-0/+15
| | | | | | | | | | | | | | | | | | When moving a group_leader perf event from a software-context to a hardware-context, there's a race in checking and updating that context. The existing locking solution doesn't work; note that it tries to grab a lock inside the group_leader's context object, which you can only get at by going through a pointer that should be protected from these races. To avoid that problem, and to produce a simple solution, we can just use a lock per group_leader to protect all checks on the group_leader's context. The new lock is grabbed and released when no context locks are held. Bug: 30955111 Bug: 31095224 Change-Id: If37124c100ca6f4aa962559fba3bd5dbbec8e052
* rcu: Merge rcu_sched_force_quiescent_state() with rcu_force_quiescent_state()Andreea-Cristina Bernat2017-04-112-19/+9
| | | | | | | | | | | | | | | This patch merges the function rcu_force_quiescent_state() with rcu_sched_force_quiescent_state(), using the rcu_state pointer. Firstly, the rcu_sched_force_quiescent_state() function is deleted from the file kernel/rcu/tree.c. Also, the rcu_force_quiescent_state() function that was calling force_quiescent_state with the argument rcu_preempt_state pointer was deleted as well. The new function that combines the old ones uses the rcu_state pointer and is located after rcu_batches_completed_bh() in kernel/rcu/tree.c. Signed-off-by: Andreea-Cristina Bernat <bernat.ada@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
* rcu: Consolidate kfree_call_rcu() to use rcu_state pointerAndreea-Cristina Bernat2017-04-112-30/+14
| | | | | | | | | | | | kfree_call_rcu is defined two times. When defined under CONFIG_TREE_PREEMPT_RCU, it uses rcu_preempt_state. Otherwise, it uses rcu_sched_state. This patch uses the rcu_state_pointer to combine the two definitions into one. The resulting function is placed after the closing of the preprocessor conditional CONFIG_TREE_PREEMPT_RCU. Signed-off-by: Andreea-Cristina Bernat <bernat.ada@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
* rcu: Replace NR_CPUS with nr_cpu_idsHimangi Saraogi2017-04-111-2/+2
| | | | | | | | | This patch replaces NR_CPUS with nr_cpu_ids as NR_CPUS should consider cpumask_var_t. Signed-off-by: Himangi Saraogi <himangi774@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
* rcu: Print negatives for stall-warning counter wraparoundPaul E. McKenney2017-04-111-4/+5
| | | | | | | | | | | The print_other_cpu_stall() and print_cpu_stall() functions print grace-period numbers using an unsigned format, which means that the number one less than zero is a very large number. This commit therefore causes these numbers to be printed with a signed format in order to improve readability of the RCU CPU stall-warning output. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
* rcu: Stop tracking FSF's postal addressPaul E. McKenney2017-04-1110-20/+20
| | | | | | | | | | | | All of the RCU source files have the usual GPL header, which contains a long-obsolete postal address for FSF. To avoid the need to track the FSF office's movements, this commit substitutes the URL where GPL may be found. Reported-by: Greg KH <gregkh@linuxfoundation.org> Reported-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
* rcu: Add ACCESS_ONCE() to ->n_force_qs_lh accessesPaul E. McKenney2017-04-112-3/+3
| | | | | | | | | | | The ->n_force_qs_lh field is accessed without the benefit of any synchronization, so this commit adds the needed ACCESS_ONCE() wrappers. Yes, increments to ->n_force_qs_lh can be lost, but contention should be low and the field is strictly statistical in nature, so this is not a problem. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
* rcu: Check both root and current rcu_node when setting up future grace periodPranith Kumar2017-04-111-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The rcu_start_future_gp() function checks the current rcu_node's ->gpnum and ->completed twice, once without ACCESS_ONCE() and once with it. Which is pointless because we hold that rcu_node's ->lock at that point. The intent was to check the current rcu_node structure and the root rcu_node structure, the latter locklessly with ACCESS_ONCE(). This commit therefore makes that change. The reason that it is safe to locklessly check the root rcu_nodes's ->gpnum and ->completed fields is that we hold the current rcu_node's ->lock, which constrains the root rcu_node's ability to change its ->gpnum and ->completed fields. Of course, if there is a single rcu_node structure, then rnp_root==rnp, and holding the lock prevents all changes. If there is more than one rcu_node structure, then the code updates the fields in the following order: 1. Increment rnp_root->gpnum to start new grace period. 2. Increment rnp->gpnum to initialize the current rcu_node, continuing initialization for the new grace period. 3. Increment rnp_root->completed to end the current grace period. 4. Increment rnp->completed to continue cleaning up after the old grace period. So there are four possible combinations of relative values of these four fields: N N N N: RCU idle, new grace period must be initiated. Although rnp_root->gpnum might be incremented immediately after we check, that will just result in unnecessary work. The grace period already started, and we try to start it. N+1 N N N: RCU grace period just started. No further change is possible because we hold rnp->lock, so the checks of rnp_root->gpnum and rnp_root->completed are stable. We know that our request for a future grace period will be seen during grace-period cleanup. N+1 N N+1 N: RCU grace period is ongoing. Because rnp->gpnum is different than rnp->completed, we won't even look at rnp_root->gpnum and rnp_root->completed, so the possible concurrent change to rnp_root->completed does not matter. We know that our request for a future grace period will be seen during grace-period cleanup, which cannot pass this rcu_node because we hold its ->lock. N+1 N+1 N+1 N: RCU grace period has ended, but not yet been cleaned up. Because rnp->gpnum is different than rnp->completed, we won't look at rnp_root->gpnum and rnp_root->completed, so the possible concurrent change to rnp_root->completed does not matter. We know that our request for a future grace period will be seen during grace-period cleanup, which cannot pass this rcu_node because we hold its ->lock. Therefore, despite initial appearances, the lockless check is safe. Signed-off-by: Pranith Kumar <bobby.prani@gmail.com> [ paulmck: Update comment to say why the lockless check is safe. ] Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git Git-commit: 48bd8e9b82a750b983823f391c67e70553757afa Signed-off-by: Kishan Kumar <kishank@codeaurora.org> Signed-off-by: Pranav Vashi <neobuddy89@gmail.com> Change-Id: I2ce0b10e34e5183ffcd6810cada86962ecf85d8f
* rcu: Update cpu_needs_another_gp() for futures from non-NOCB CPUsPaul E. McKenney2017-04-113-29/+29
| | | | | | | | | | | | | | | | | | | In the old days, the only source of requests for future grace periods was NOCB CPUs. This has changed: CPUs routinely post requests for future grace periods in order to promote power efficiency and reduce OS jitter with minimal impact on grace-period latency. This commit therefore updates cpu_needs_another_gp() to invoke rcu_future_needs_gp() instead of rcu_nocb_needs_gp(). The latter is no longer used, so is now removed. This commit also adds tracing for the irq_work_queue() wakeup case. Change-Id: Ifafd85017d358804b0b7a757ef68c1aebf435a99 Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git Git-commit: 365187fbc04fd55766bf6a94e37e558505bf480a Signed-off-by: Kishan Kumar <kishank@codeaurora.org> Signed-off-by: Pranav Vashi <neobuddy89@gmail.com>
* rcu: Protect ->gp_flags accesses with ACCESS_ONCE()Paul E. McKenney2017-04-111-7/+6
| | | | | | | | | | | | | | | | | A number of ->gp_flags accesses don't have ACCESS_ONCE(), but all of the can race against other loads or stores. This commit therefore applies ACCESS_ONCE() to the unprotected ->gp_flags accesses. Reported-by: Alexey Roytman <alexey.roytman@oracle.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git Git-commit: 91dc95427a0d30ac2c58d6e943c7f40a3f25d908 [kishank@codeaurora.org resolve trivial conflicts] Signed-off-by: Kishan Kumar <kishank@codeaurora.org> Signed-off-by: Pranav Vashi <neobuddy89@gmail.com> Change-Id: I2ed581b545fb9c93468658fa621e71c091ec250b
* rcu: Prevent spurious-wakeup DoS attack on rcu_gp_kthread()Paul E. McKenney2017-04-111-3/+8
| | | | | | | | | | | | | | | | | | | | | Spurious wakeups in the force-quiescent-state loop in rcu_gp_kthread() cause the timeout to be recalculated, which would prevent rcu_gp_fqs() from ever being called. This would in turn would prevent the grace period from ever ending for as long as there was at least one CPU in an extended quiescent state that had not yet passed through a quiescent state. This commit therefore avoids recalculating the timeout unless the previous pass's call to wait_event_interruptible_timeout() actually did time out, thus preventing the above scenario. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git Git-commit: 88d6df612cc3c99f56cc18461fcc531c3a145544 [kishank@codeaurora.org resolve trivial conflicts] Signed-off-by: Kishan Kumar <kishank@codeaurora.org> Signed-off-by: Pranav Vashi <neobuddy89@gmail.com> Change-Id: I43f22a80d4334ea5a7105a6da6f929239df76a11
* rcu: Avoid redundant grace-period kthread wakeupsPaul E. McKenney2017-04-111-3/+5
| | | | | | | | | | | | | | | | | | | | When setting up an in-the-future "advanced" grace period, the code needs to wake up the relevant grace-period kthread, which it currently does unconditionally. However, this results in needless wakeups in the case where the advanced grace period is being set up by the grace-period kthread itself, which is a non-uncommon situation. This commit therefore checks to see if the running thread is the grace-period kthread, and avoids doing the irq_work_queue()-mediated wakeup in that case. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git Git-commit: 1eafd31c640d6799c63136246a59d608bed93c74 [kishank@codeaurora.org resolve trivial conflicts] Signed-off-by: Kishan Kumar <kishank@codeaurora.org> Signed-off-by: Pranav Vashi <neobuddy89@gmail.com> Change-Id: I1fdcf6664fdc7dce188488f05f33931e3614c853
* rcu: Switch callers from rcu_process_gp_end() to note_gp_changes()Paul E. McKenney2017-04-111-28/+3
| | | | | | | | | | | | | Because note_gp_changes() now incorporates rcu_process_gp_end() function, this commit switches to the former and eliminates the latter. In addition, this commit changes external calls from __rcu_process_gp_end() to __note_gp_changes(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git Git-commit: 470716fc043aba2fea832334e58d5cd5d82288a3 Signed-off-by: Kishan Kumar <kishank@codeaurora.org>
* rcu: Flag lockless access to ->gp_flags with ACCESS_ONCE()Paul E. McKenney2017-04-111-1/+1
| | | | | | | | | | | | | This commit applies ACCESS_ONCE() to an outside-of-lock access to ->gp_flags. Although it is hard to imagine any sane compiler messing this particular case up, the documentation benefits are substantial. Plus the definition of "sane compiler" grows ever looser. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git Git-commit: 591c6d1710cd73824057d08eda302cf2a7cfd18a [kishank@codeaurora.org resolve trivial conflicts] Signed-off-by: Kishan Kumar <kishank@codeaurora.org>
* rcu: Merge __rcu_process_gp_end() into __note_gp_changes()Paul E. McKenney2017-04-111-42/+6
| | | | | | | | | | | This commit eliminates some duplicated code by merging __rcu_process_gp_end() into __note_gp_changes(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git Git-commit: ba9fbe955f026780e6b27c279dba7c86dfdcb7d5 Signed-off-by: Kishan Kumar <kishank@codeaurora.org>
* rcu: Rename note_new_gpnum() to note_gp_changes()Paul E. McKenney2017-04-111-6/+7
| | | | | | | | | | | | Because note_new_gpnum() now also checks for the ends of old grace periods, this commit changes its name to note_gp_changes(). Later commits will merge rcu_process_gp_end() into note_gp_changes(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git Git-commit: d34ea3221a0f34ed42eadabf054604bbcc7ecd27 Signed-off-by: Kishan Kumar <kishank@codeaurora.org>
* rcu: Make __note_new_gpnum() check for ends of prior grace periodsPaul E. McKenney2017-04-111-0/+3
| | | | | | | | | | | | | | | | | | | | | | | The current implementation can detect the beginning of a new grace period before noting the end of a previous grace period. Although the current implementation correctly handles this sort of nonsense, it would be good to reduce RCU's state space by making such nonsense unnecessary, which is now possible thanks to the fact that RCU's callback groups are now numbered. This commit therefore makes __note_new_gpnum() invoke __rcu_process_gp_end() in order to note the ends of prior grace periods before noting the beginnings of new grace periods. Of course, this now means that note_new_gpnum() notes both the beginnings and ends of grace periods, and could therefore be used in place of rcu_process_gp_end(). But that is a job for later commits. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git Git-commit: 398ebe6000c16135d12ce2ff64318f306ffb20b0 Signed-off-by: Kishan Kumar <kishank@codeaurora.org>