aboutsummaryrefslogtreecommitdiff
path: root/kernel/sched
diff options
context:
space:
mode:
authorShoaib0597 <Shoaib0595@gmail.com>2016-01-25 16:23:49 +0530
committerMister Oyster <oysterized@gmail.com>2017-04-11 10:59:06 +0200
commite0110f3f8c29b078a1ace6498f41d11068ee399c (patch)
tree666136862c564e788e78a32261f38ab2ea2063b4 /kernel/sched
parent1b29dc0c745da6c47ba673a196fd2f9dc8975b09 (diff)
Optimized task_sched_runtime for up to 20% increase in performance
Diffstat (limited to 'kernel/sched')
-rw-r--r--kernel/sched/core.c14
1 file changed, 14 insertions, 0 deletions
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 4fb64bd67..6acc7385a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2987,6 +2987,20 @@ unsigned long long task_sched_runtime(struct task_struct *p)
struct rq *rq;
u64 ns = 0;
+#if defined(CONFIG_64BIT) && defined(CONFIG_SMP)
+ /*
+ * 64-bit doesn't need locks to atomically read a 64-bit value.
+ * So we have an optimization chance when the task's delta_exec is 0.
+ * Reading ->on_cpu is racy, but this is ok.
+ *
+ * If we race with it leaving cpu, we'll take a lock. So we're correct.
+ * If we race with it entering cpu, unaccounted time is 0. This is
+ * indistinguishable from the read occurring a few cycles earlier.
+ */
+ if (!p->on_cpu)
+ return p->se.sum_exec_runtime;
+#endif
+
rq = task_rq_lock(p, &flags);
ns = p->se.sum_exec_runtime + do_task_delta_exec(p, rq);
task_rq_unlock(rq, p, &flags);