<feed xmlns='http://www.w3.org/2005/Atom'>
<title>xavi/android_kernel_m2note/mm, branch lp-5.1</title>
<subtitle>Unnamed repository; edit this file 'description' to name the repository.
</subtitle>
<id>https://gitea.privatedns.org/xavi/android_kernel_m2note/atom?h=lp-5.1</id>
<link rel='self' href='https://gitea.privatedns.org/xavi/android_kernel_m2note/atom?h=lp-5.1'/>
<link rel='alternate' type='text/html' href='https://gitea.privatedns.org/xavi/android_kernel_m2note/'/>
<updated>2016-11-07T12:44:42+00:00</updated>
<entry>
<title>proc: much faster /proc/vmstat</title>
<updated>2016-11-07T12:44:42+00:00</updated>
<author>
<name>Francisco Franco</name>
<email>franciscofranco.1990@gmail.com</email>
</author>
<published>2016-10-21T04:50:05+00:00</published>
<link rel='alternate' type='text/html' href='https://gitea.privatedns.org/xavi/android_kernel_m2note/commit/?id=772797eb360d83167fc85359e7082380446c86f6'/>
<id>urn:sha1:772797eb360d83167fc85359e7082380446c86f6</id>
<content type='text'>
Every current KDE system has a process named ksysguardd polling the files
below once every several seconds:

	$ strace -e trace=open -p $(pidof ksysguardd)
	Process 1812 attached
	open("/etc/mtab", O_RDONLY|O_CLOEXEC)   = 8
	open("/etc/mtab", O_RDONLY|O_CLOEXEC)   = 8
	open("/proc/net/dev", O_RDONLY)         = 8
	open("/proc/net/wireless", O_RDONLY)    = -1 ENOENT (No such file or directory)
	open("/proc/stat", O_RDONLY)            = 8
	open("/proc/vmstat", O_RDONLY)          = 8

Hell knows what it is doing but speed up reading /proc/vmstat by 33%!

Benchmark is open+read+close 1.000.000 times.
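
The ./proc-vmstat program itself is not shown here.  A minimal sketch of
an equivalent open+read+close loop, assuming a hypothetical helper named
bench_open_read_close() and written in Python rather than C, could look
like this:

```python
import time


def bench_open_read_close(path, iterations):
    """Open, read and close `path` `iterations` times; return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(iterations):
        # One pass is one open + read + close, as in the description above.
        with open(path, "rb") as f:
            f.read()
    return time.perf_counter() - start
```

Calling bench_open_read_close("/proc/vmstat", 1_000_000) on a Linux
machine would mirror the iteration count above, but interpreter overhead
means the timings are not comparable with the perf numbers below.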

			BEFORE
$ perf stat -r 10 taskset -c 3 ./proc-vmstat

 Performance counter stats for 'taskset -c 3 ./proc-vmstat' (10 runs):

      13146.768464      task-clock (msec)         #    0.960 CPUs utilized            ( +-  0.60% )
                15      context-switches          #    0.001 K/sec                    ( +-  1.41% )
                 1      cpu-migrations            #    0.000 K/sec                    ( +- 11.11% )
               104      page-faults               #    0.008 K/sec                    ( +-  0.57% )
    45,489,799,349      cycles                    #    3.460 GHz                      ( +-  0.03% )
     9,970,175,743      stalled-cycles-frontend   #   21.92% frontend cycles idle     ( +-  0.10% )
     2,800,298,015      stalled-cycles-backend    #   6.16% backend cycles idle       ( +-  0.32% )
    79,241,190,850      instructions              #    1.74  insn per cycle
                                                  #    0.13  stalled cycles per insn  ( +-  0.00% )
    17,616,096,146      branches                  # 1339.956 M/sec                    ( +-  0.00% )
       176,106,232      branch-misses             #    1.00% of all branches          ( +-  0.18% )

      13.691078109 seconds time elapsed                                          ( +-  0.03% )
      ^^^^^^^^^^^^

			AFTER
$ perf stat -r 10 taskset -c 3 ./proc-vmstat

 Performance counter stats for 'taskset -c 3 ./proc-vmstat' (10 runs):

       8688.353749      task-clock (msec)         #    0.950 CPUs utilized            ( +-  1.25% )
                10      context-switches          #    0.001 K/sec                    ( +-  2.13% )
                 1      cpu-migrations            #    0.000 K/sec
               104      page-faults               #    0.012 K/sec                    ( +-  0.56% )
    30,384,010,730      cycles                    #    3.497 GHz                      ( +-  0.07% )
    12,296,259,407      stalled-cycles-frontend   #   40.47% frontend cycles idle     ( +-  0.13% )
     3,370,668,651      stalled-cycles-backend    #  11.09% backend cycles idle       ( +-  0.69% )
    28,969,052,879      instructions              #    0.95  insn per cycle
                                                  #    0.42  stalled cycles per insn  ( +-  0.01% )
     6,308,245,891      branches                  #  726.058 M/sec                    ( +-  0.00% )
       214,685,502      branch-misses             #    3.40% of all branches          ( +-  0.26% )

       9.146081052 seconds time elapsed                                          ( +-  0.07% )
       ^^^^^^^^^^^

vsnprintf() is slow because:

1. format_decode() is busy looking for a format specifier: 2 branches
   per character (not in this case, but in others)

2. approximately a million branches while parsing the format mini language
   and everywhere else

3. just look at what string() does; /proc/vmstat is a good case because
   most of its content is strings

Link: http://lkml.kernel.org/r/20160806125455.GA1187@p183.telecom.by
Signed-off-by: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: Joe Perches &lt;joe@perches.com&gt;
Cc: Andi Kleen &lt;andi@firstfloor.org&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Francisco Franco &lt;franciscofranco.1990@gmail.com&gt;
</content>
</entry>
<entry>
<title>mm: remove gup_flags FOLL_WRITE games from __get_user_pages()</title>
<updated>2016-11-07T12:44:37+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2016-10-13T20:07:36+00:00</published>
<link rel='alternate' type='text/html' href='https://gitea.privatedns.org/xavi/android_kernel_m2note/commit/?id=bc646beccf8aa8aba7138584af19f49ef2236f7a'/>
<id>urn:sha1:bc646beccf8aa8aba7138584af19f49ef2236f7a</id>
<content type='text'>
commit 19be0eaffa3ac7d8eb6784ad9bdbc7d67ed8e619 upstream.

This is an ancient bug that was actually attempted to be fixed once
(badly) by me eleven years ago in commit 4ceb5db9757a ("Fix
get_user_pages() race for write access") but that was then undone due to
problems on s390 by commit f33ea7f404e5 ("fix get_user_pages bug").

In the meantime, the s390 situation has long been fixed, and we can now
fix it by checking the pte_dirty() bit properly (and do it better).  The
s390 dirty bit was implemented in abf09bed3cce ("s390/mm: implement
software dirty bits") which made it into v3.9.  Earlier kernels will
have to look at the page state itself.

Also, the VM has become more scalable, and what used to be a purely
theoretical race back then has become easier to trigger.

To fix it, we introduce a new internal FOLL_COW flag to mark the "yes,
we already did a COW" rather than play racy games with FOLL_WRITE that
is very fundamental, and then use the pte dirty flag to validate that
the FOLL_COW flag is still valid.
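
The idea can be sketched as follows.  This is a simplified model in
Python, not the kernel code: the Foll enum, its values and both helper
names are assumptions made for illustration, and the real check may
involve conditions that are left out here.

```python
import enum


class Foll(enum.Flag):
    # Flag names mirror the text above; the values are arbitrary here.
    WRITE = enum.auto()
    COW = enum.auto()


def can_follow_write(pte_writable, pte_dirty, flags):
    """Decide whether a write lookup may use the page behind this pte."""
    if pte_writable:
        return True
    # A read-only pte is acceptable only if a COW was already done for this
    # lookup and the pte is still dirty, i.e. the COWed page is still there.
    return Foll.COW in flags and pte_dirty


def after_cow_fault(flags):
    """Record that COW was broken, leaving the WRITE flag untouched."""
    return flags | Foll.COW
```

The point of the sketch is that WRITE is never cleared: the "COW already
done" state lives in its own flag and is re-validated against the dirty
bit on every lookup.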

Reported-and-tested-by: Phil "not Paul" Oester &lt;kernel@linuxace.com&gt;
Acked-by: Hugh Dickins &lt;hughd@google.com&gt;
Reviewed-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Willy Tarreau &lt;w@1wt.eu&gt;
Cc: Nick Piggin &lt;npiggin@gmail.com&gt;
Cc: Greg Thelen &lt;gthelen@google.com&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
[wt: s/gup.c/memory.c; s/follow_page_pte/follow_page_mask;
     s/faultin_page/__get_user_page]
Signed-off-by: Willy Tarreau &lt;w@1wt.eu&gt;
</content>
</entry>
<entry>
<title>mm: thp: fix SMP race condition between THP page fault and MADV_DONTNEED</title>
<updated>2016-11-07T12:44:36+00:00</updated>
<author>
<name>Andrea Arcangeli</name>
<email>aarcange@redhat.com</email>
</author>
<published>2016-02-26T23:19:28+00:00</published>
<link rel='alternate' type='text/html' href='https://gitea.privatedns.org/xavi/android_kernel_m2note/commit/?id=c9008aab18483f2586c78b902eb57f1a99b8b4d5'/>
<id>urn:sha1:c9008aab18483f2586c78b902eb57f1a99b8b4d5</id>
<content type='text'>
commit ad33bb04b2a6cee6c1f99fabb15cddbf93ff0433 upstream.

pmd_trans_unstable()/pmd_none_or_trans_huge_or_clear_bad() were
introduced to locklessly (but atomically) detect when a pmd is a regular
(stable) pmd or when the pmd is unstable and can infinitely transition
between pmd_none() and pmd_trans_huge() from under us, while only holding
the mmap_sem for reading (not for writing).

While holding the mmap_sem only for reading, MADV_DONTNEED can run from
under us and so before we can assume the pmd to be a regular stable pmd
we need to compare it against pmd_none() and pmd_trans_huge() in an
atomic way, with pmd_trans_unstable().  The old pmd_trans_huge() left a
tiny window for a race.

Useful applications are unlikely to notice the difference as doing
MADV_DONTNEED concurrently with a page fault would lead to undefined
behavior.

[js] 3.12 backport: no pmd_devmap in 3.12 yet.

[akpm@linux-foundation.org: tidy up comment grammar/layout]
Signed-off-by: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Reported-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Willy Tarreau &lt;w@1wt.eu&gt;
</content>
</entry>
<entry>
<title>mm, vmalloc: remove useless variable in vmap_block</title>
<updated>2016-09-28T13:15:58+00:00</updated>
<author>
<name>Joonsoo Kim</name>
<email>iamjoonsoo.kim@lge.com</email>
</author>
<published>2013-09-11T21:21:39+00:00</published>
<link rel='alternate' type='text/html' href='https://gitea.privatedns.org/xavi/android_kernel_m2note/commit/?id=87f22c44bb5c7b917143b7ea1f9fbda198c11fd4'/>
<id>urn:sha1:87f22c44bb5c7b917143b7ea1f9fbda198c11fd4</id>
<content type='text'>
vbq in vmap_block isn't used. So remove it.

Signed-off-by: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Reviewed-by: Wanpeng Li &lt;liwanp@linux.vnet.ibm.com&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Acked-by: Zhang Yanfei &lt;zhangyanfei@cn.fujitsu.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Pranav Vashi &lt;neobuddy89@gmail.com&gt;
Signed-off-by: franciscofranco &lt;franciscofranco.1990@gmail.com&gt;
Signed-off-by: engstk &lt;eng.stk@sapo.pt&gt;
</content>
</entry>
<entry>
<title>mm, vmalloc: use well-defined find_last_bit() func</title>
<updated>2016-09-28T13:15:57+00:00</updated>
<author>
<name>Joonsoo Kim</name>
<email>iamjoonsoo.kim@lge.com</email>
</author>
<published>2013-09-11T21:21:40+00:00</published>
<link rel='alternate' type='text/html' href='https://gitea.privatedns.org/xavi/android_kernel_m2note/commit/?id=1c33224568f7a92bb549df47ba8a11342ca504ff'/>
<id>urn:sha1:1c33224568f7a92bb549df47ba8a11342ca504ff</id>
<content type='text'>
Our intention here is to find last_bit within the region to flush.
There is a well-defined function, find_last_bit(), for this purpose, and
its performance may be slightly better than the current implementation.
So use it.
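
For illustration, the behaviour expected from such a helper can be
modelled in a few lines of Python.  Representing the bitmap as a plain
integer and returning size when no bit is set are assumptions of this
sketch, not a copy of the kernel implementation.

```python
def find_last_bit(bitmap, size):
    """Return the index of the highest set bit below `size`, or `size` if none."""
    # Keep only the low `size` bits so that higher bits are ignored.
    masked = bitmap % (2 ** size)
    if masked == 0:
        return size
    return masked.bit_length() - 1
```

With the bitmap 0b0101100 and a size of 7 this returns 5, the position
of the highest set bit.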

Signed-off-by: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Reviewed-by: Wanpeng Li &lt;liwanp@linux.vnet.ibm.com&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Acked-by: Zhang Yanfei &lt;zhangyanfei@cn.fujitsu.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Pranav Vashi &lt;neobuddy89@gmail.com&gt;
Signed-off-by: franciscofranco &lt;franciscofranco.1990@gmail.com&gt;
Signed-off-by: engstk &lt;eng.stk@sapo.pt&gt;
</content>
</entry>
<entry>
<title>proc/maps: make vm_is_stack() logic namespace-friendly</title>
<updated>2016-09-28T13:15:16+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2014-10-09T22:25:54+00:00</published>
<link rel='alternate' type='text/html' href='https://gitea.privatedns.org/xavi/android_kernel_m2note/commit/?id=740561452df6f151d608d336aaf53bc0c6602e75'/>
<id>urn:sha1:740561452df6f151d608d336aaf53bc0c6602e75</id>
<content type='text'>
- Rename vm_is_stack() to task_of_stack() and change it to return
  "struct task_struct *" rather than the global (and thus wrong in
  general) pid_t.

- Add the new pid_of_stack() helper which calls task_of_stack() and
  uses the right namespace to report the correct pid_t.

  Unfortunately we need to define this helper twice, in task_mmu.c
  and in task_nommu.c.  Perhaps it makes sense to add fs/proc/util.c
  and move at least pid_of_stack/task_of_stack there to avoid the
  code duplication.

- Change show_map_vma() and show_numa_map() to use the new helper.

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Cyrill Gorcunov &lt;gorcunov@openvz.org&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Greg Ungerer &lt;gerg@uclinux.org&gt;
Cc: "Kirill A. Shutemov" &lt;kirill.shutemov@linux.intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: W4TCH0UT &lt;ateekujjawal@gmail.com&gt;

Conflicts:
	fs/proc/task_nommu.c
	mm/util.c
</content>
</entry>
<entry>
<title>readahead: make context readahead more conservative</title>
<updated>2016-09-28T13:14:50+00:00</updated>
<author>
<name>Fengguang Wu</name>
<email>fengguang.wu@intel.com</email>
</author>
<published>2013-09-11T21:21:47+00:00</published>
<link rel='alternate' type='text/html' href='https://gitea.privatedns.org/xavi/android_kernel_m2note/commit/?id=d71d5a8a5c17fc48f65a06422c49d25899efbdf9'/>
<id>urn:sha1:d71d5a8a5c17fc48f65a06422c49d25899efbdf9</id>
<content type='text'>
This helps performance on moderately dense random reads on SSD.

Transaction-Per-Second numbers provided by Taobao:

		QPS	case
		-------------------------------------------------------
		7536	disable context readahead totally
w/ patch:	7129	slower size rampup and start RA on the 3rd read
		6717	slower size rampup
w/o patch:	5581	unmodified context readahead

Before, readahead was started whenever page N+1 was read and page N
happened to have been read recently.  After the patch, we'll only start
readahead when *three* random reads happen to access pages N, N+1, N+2.
The probability of this happening is extremely low for pure random reads,
unless they are very dense, which actually deserves some readahead.
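
A rough model of that trigger condition, written in Python with a
hypothetical function name and a set of cached page indices standing in
for the page cache, might be:

```python
def should_start_context_readahead(cached_pages, offset, required_history):
    """Return True if the `required_history` pages just before `offset` are cached."""
    # Every one of the immediately preceding pages must already be present.
    return all(offset - k in cached_pages for k in range(1, required_history + 1))
```

In this model the old behaviour corresponds to required_history=1 (page
N cached when N+1 is read) and the new behaviour to required_history=2
(pages N and N+1 cached when N+2 is read).  It is only a simplified
reading of the description above, not the heuristic as implemented.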

Also start with a smaller readahead window.  The impact on interleaved
sequential reads should be small, because for a long-running stream, the
small readahead window rampup phase is negligible.

The context readahead actually benefits clustered random reads on HDD
whose seek cost is pretty high.  However as SSD is increasingly used for
random read workloads it's better for the context readahead to concentrate
on interleaved sequential reads.

Another SSD rand read test from Miao

        # file size:        2GB
        # read IO amount: 625MB
        sysbench --test=fileio          \
                --max-requests=10000    \
                --num-threads=1         \
                --file-num=1            \
                --file-block-size=64K   \
                --file-test-mode=rndrd  \
                --file-fsync-freq=0     \
                --file-fsync-end=off    run

shows the performance of btrfs growing from 69MB/s to 121MB/s, and ext4
from 104MB/s to 121MB/s.

Signed-off-by: Wu Fengguang &lt;fengguang.wu@intel.com&gt;
Tested-by: Tao Ma &lt;tm@tao.ma&gt;
Tested-by: Miao Xie &lt;miaox@cn.fujitsu.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: flar2 &lt;asegaert@gmail.com&gt;
Signed-off-by: Brandon Berhent &lt;bbedward@gmail.com&gt;
Signed-off-by: W4TCH0UT &lt;ateekujjawal@gmail.com&gt;
</content>
</entry>
<entry>
<title>slub: do not assert not having lock in removing freed partial</title>
<updated>2016-09-28T13:14:32+00:00</updated>
<author>
<name>Steven Rostedt</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2014-02-10T22:25:46+00:00</published>
<link rel='alternate' type='text/html' href='https://gitea.privatedns.org/xavi/android_kernel_m2note/commit/?id=1d0dc70511de207883ce0252801a77b796ec196f'/>
<id>urn:sha1:1d0dc70511de207883ce0252801a77b796ec196f</id>
<content type='text'>
Vladimir reported the following issue:

Commit c65c1877bd68 ("slub: use lockdep_assert_held") requires
remove_partial() to be called with n-&gt;list_lock held, but free_partial()
called from kmem_cache_close() on cache destruction does not follow this
rule, leading to a warning:

  WARNING: CPU: 0 PID: 2787 at mm/slub.c:1536 __kmem_cache_shutdown+0x1b2/0x1f0()
  Modules linked in:
  CPU: 0 PID: 2787 Comm: modprobe Tainted: G        W    3.14.0-rc1-mm1+ #1
  Hardware name:
   0000000000000600 ffff88003ae1dde8 ffffffff816d9583 0000000000000600
   0000000000000000 ffff88003ae1de28 ffffffff8107c107 0000000000000000
   ffff880037ab2b00 ffff88007c240d30 ffffea0001ee5280 ffffea0001ee52a0
  Call Trace:
    __kmem_cache_shutdown+0x1b2/0x1f0
    kmem_cache_destroy+0x43/0xf0
    xfs_destroy_zones+0x103/0x110 [xfs]
    exit_xfs_fs+0x38/0x4e4 [xfs]
    SyS_delete_module+0x19a/0x1f0
    system_call_fastpath+0x16/0x1b

His solution was to add a spinlock in order to quiet lockdep.  Although
there would be no contention to adding the lock, that lock also requires
disabling of interrupts which will have a larger impact on the system.

Instead of adding a spinlock to a location where it is not needed for
lockdep, make a __remove_partial() function that does not test if the
list_lock is held, as no one should have it due to it being freed.

Also added a __add_partial() function that does not do the lock
validation either, as it is not needed for the creation of the cache.
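
The split between a checked and an unchecked helper can be illustrated
with a toy Python class.  PartialList and its members are hypothetical
names, and Lock.locked() only approximates lockdep_assert_held(), since
it cannot tell who holds the lock.

```python
import threading


class PartialList:
    """Toy stand-in for a per-node partial list guarded by list_lock."""

    def __init__(self):
        self.list_lock = threading.Lock()
        self.slabs = []

    def _remove_partial(self, slab):
        # Unchecked variant: no lock assertion, for paths where no other
        # user can exist, such as while the cache is being destroyed.
        self.slabs.remove(slab)

    def remove_partial(self, slab):
        # Checked variant: insist that the lock is held before touching
        # the list, then share the actual removal with the unchecked path.
        assert self.list_lock.locked(), "list_lock must be held"
        self._remove_partial(slab)
```

Regular callers keep the assertion, while the teardown path calls the
unchecked variant directly and so avoids taking a lock it does not need.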

Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Reported-by: Vladimir Davydov &lt;vdavydov@parallels.com&gt;
Suggested-by: David Rientjes &lt;rientjes@google.com&gt;
Acked-by: David Rientjes &lt;rientjes@google.com&gt;
Acked-by: Vladimir Davydov &lt;vdavydov@parallels.com&gt;
Acked-by: Christoph Lameter &lt;cl@linux.com&gt;
Cc: Pekka Enberg &lt;penberg@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: W4TCH0UT &lt;ateekujjawal@gmail.com&gt;
</content>
</entry>
<entry>
<title>mm: slub: work around unneeded lockdep warning</title>
<updated>2016-09-28T13:14:31+00:00</updated>
<author>
<name>Dave Hansen</name>
<email>dave.hansen@linux.intel.com</email>
</author>
<published>2014-01-24T15:20:23+00:00</published>
<link rel='alternate' type='text/html' href='https://gitea.privatedns.org/xavi/android_kernel_m2note/commit/?id=6d920febf5a78394fb5da2ffa4b0cf6636b707d8'/>
<id>urn:sha1:6d920febf5a78394fb5da2ffa4b0cf6636b707d8</id>
<content type='text'>
The slub code does some setup during early boot in
early_kmem_cache_node_alloc() with some local data.  There is no
possible way that another CPU can see this data, so the slub code
doesn't unnecessarily lock it.  However, some new lockdep asserts
check to make sure that add_partial() _always_ has the list_lock
held.

Just add the locking, even though it is technically unnecessary.

Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Russell King &lt;linux@arm.linux.org.uk&gt;
Acked-by: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Signed-off-by: Pekka Enberg &lt;penberg@kernel.org&gt;
Signed-off-by: W4TCH0UT &lt;ateekujjawal@gmail.com&gt;
Signed-off-by: Anik1199 &lt;anik9280@gmail.com&gt;

Conflicts:
	mm/slub.c
</content>
</entry>
<entry>
<title>slab: do not panic if we fail to create memcg cache</title>
<updated>2016-09-28T13:14:28+00:00</updated>
<author>
<name>Vladimir Davydov</name>
<email>vdavydov@parallels.com</email>
</author>
<published>2014-01-23T23:53:05+00:00</published>
<link rel='alternate' type='text/html' href='https://gitea.privatedns.org/xavi/android_kernel_m2note/commit/?id=3c24bbc144755b300e75d802d3cabc882e78c729'/>
<id>urn:sha1:3c24bbc144755b300e75d802d3cabc882e78c729</id>
<content type='text'>
There is no point in flooding logs with warnings or especially crashing
the system if we fail to create a cache for a memcg.  In this case we
will be accounting the memcg allocation to the root cgroup until we
succeed in creating its own cache, but it isn't that critical.

Signed-off-by: Vladimir Davydov &lt;vdavydov@parallels.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.cz&gt;
Cc: Glauber Costa &lt;glommer@gmail.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Pekka Enberg &lt;penberg@kernel.org&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: W4TCH0UT &lt;ateekujjawal@gmail.com&gt;
</content>
</entry>
</feed>
