path: root/fs/fs-writeback.c
Commit message | Author | Age | Files | Lines
* debug: remove: remove some dmesg logspam from Linux mainline 3.4 | fsktoonsez | 2016-09-13 | 1 | -1/+1

    Signed-off-by: engstk <eng.stk@sapo.pt>
* writeback: Fix occasional slow sync(1) | Jan Kara | 2016-09-01 | 1 | -4/+2

    In case when system contains no dirty pages, wakeup_flusher_threads()
    will submit WB_SYNC_NONE writeback for 0 pages so wb_writeback() exits
    immediately without doing anything. Thus sync(1) will write all the
    dirty inodes from a WB_SYNC_ALL writeback pass which is slow.

    Fix the problem by using get_nr_dirty_pages() in
    wakeup_flusher_threads() instead of calculating number of dirty pages
    manually. That function also takes number of dirty inodes into account.

    CC: stable@vger.kernel.org
    Reported-by: Paul Taysom <taysom@chromium.org>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Cristoforo Cataldo <cristoforo.cataldo@gmail.com>
    Signed-off-by: flar2 <asegaert@gmail.com>
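The effect described in this message can be sketched in user space. The following is a minimal model, not kernel code: `struct dirty_stats` and the three helper functions are hypothetical names introduced for illustration, and the only behaviour taken from the message is that the old count ignored dirty inodes, the new count includes them, and a WB_SYNC_NONE request for 0 pages does nothing.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the global counters (illustrative field names). */
struct dirty_stats {
    unsigned long file_dirty_pages;   /* stands in for dirty page-cache pages */
    unsigned long unstable_nfs_pages; /* stands in for unstable NFS pages */
    unsigned long dirty_inodes;       /* inodes that are dirty without dirty pages */
};

/* Old behaviour per the message: the page count is computed manually, pages only. */
static unsigned long nr_pages_manual(const struct dirty_stats *s)
{
    assert(s);
    return s->file_dirty_pages + s->unstable_nfs_pages;
}

/* Behaviour after the fix: dirty inodes are taken into account as well. */
static unsigned long nr_pages_with_inodes(const struct dirty_stats *s)
{
    return nr_pages_manual(s) + s->dirty_inodes;
}

/* A WB_SYNC_NONE pass asked to write 0 pages exits without doing anything. */
static bool sync_none_pass_does_work(unsigned long nr_pages)
{
    return nr_pages > 0;
}
```

With no dirty pages but some dirty inodes, the manual count is 0 and the cheap pass is skipped, which leaves all the work to the slow WB_SYNC_ALL pass; the inode-aware count is non-zero, so the cheap pass runs first.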
* writeback: fix writeback cache thrashing | Namjae Jeon | 2016-09-01 | 1 | -4/+0

    Consider Process A: huge I/O on sda
            doing heavy write operation - dirty memory becomes more
            than dirty_background_ratio
            on HDD - flusher thread flush-8:0

    Consider Process B: small I/O on sdb
            doing while [1]; read 1024K + rewrite 1024K + sleep 2sec
            on Flash device - flusher thread flush-8:16

    As Process A is a heavy dirtier, dirty memory becomes more than
    dirty_background_thresh. Due to this, below check becomes true(checking
    global_page_state in over_bground_thresh) for all bdi devices(even for
    very small dirtied bdi - sdb):

    In this case, even small cached data on 'sdb' is forced to flush and
    writeback cache thrashing happens.

    When we added debug prints inside above 'if' condition and ran above
    Process A(heavy dirtier on bdi with flush-8:0) and Process B(1024K
    frequent read/rewrite on bdi with flush-8:16) we got below prints:

    [Test setup: ARM dual core CPU, 512 MB RAM]

    [over_bground_thresh]: wakeup flush-8:0 : BDI_RECLAIMABLE = 56064 KB
    [over_bground_thresh]: wakeup flush-8:0 : BDI_RECLAIMABLE = 56704 KB
    [over_bground_thresh]: wakeup flush-8:0 : BDI_RECLAIMABLE = 84720 KB
    [over_bground_thresh]: wakeup flush-8:0 : BDI_RECLAIMABLE = 94720 KB
    [over_bground_thresh]: wakeup flush-8:16 : BDI_RECLAIMABLE = 384 KB
    [over_bground_thresh]: wakeup flush-8:16 : BDI_RECLAIMABLE = 960 KB
    [over_bground_thresh]: wakeup flush-8:16 : BDI_RECLAIMABLE = 64 KB
    [over_bground_thresh]: wakeup flush-8:0 : BDI_RECLAIMABLE = 92160 KB
    [over_bground_thresh]: wakeup flush-8:16 : BDI_RECLAIMABLE = 256 KB
    [over_bground_thresh]: wakeup flush-8:16 : BDI_RECLAIMABLE = 768 KB
    [over_bground_thresh]: wakeup flush-8:16 : BDI_RECLAIMABLE = 64 KB
    [over_bground_thresh]: wakeup flush-8:16 : BDI_RECLAIMABLE = 256 KB
    [over_bground_thresh]: wakeup flush-8:16 : BDI_RECLAIMABLE = 320 KB
    [over_bground_thresh]: wakeup flush-8:16 : BDI_RECLAIMABLE = 0 KB
    [over_bground_thresh]: wakeup flush-8:0 : BDI_RECLAIMABLE = 92032 KB
    [over_bground_thresh]: wakeup flush-8:0 : BDI_RECLAIMABLE = 91968 KB
    [over_bground_thresh]: wakeup flush-8:16 : BDI_RECLAIMABLE = 192 KB
    [over_bground_thresh]: wakeup flush-8:16 : BDI_RECLAIMABLE = 1024 KB
    [over_bground_thresh]: wakeup flush-8:16 : BDI_RECLAIMABLE = 64 KB
    [over_bground_thresh]: wakeup flush-8:16 : BDI_RECLAIMABLE = 192 KB
    [over_bground_thresh]: wakeup flush-8:16 : BDI_RECLAIMABLE = 576 KB
    [over_bground_thresh]: wakeup flush-8:16 : BDI_RECLAIMABLE = 0 KB
    [over_bground_thresh]: wakeup flush-8:0 : BDI_RECLAIMABLE = 84352 KB
    [over_bground_thresh]: wakeup flush-8:16 : BDI_RECLAIMABLE = 192 KB
    [over_bground_thresh]: wakeup flush-8:16 : BDI_RECLAIMABLE = 512 KB
    [over_bground_thresh]: wakeup flush-8:16 : BDI_RECLAIMABLE = 0 KB
    [over_bground_thresh]: wakeup flush-8:0 : BDI_RECLAIMABLE = 92608 KB
    [over_bground_thresh]: wakeup flush-8:0 : BDI_RECLAIMABLE = 92544 KB

    As mentioned in above log, when global dirty memory > global
    background_thresh small cached data is also forced to flush by
    flush-8:16.

    If removing global background_thresh checking code, we can reduce cache
    thrashing of frequently used small data. And It will be great if we can
    reserve a portion of writeback cache using min_ratio.

    After applying patch:

    $ echo 5 > /sys/block/sdb/bdi/min_ratio
    $ cat /sys/block/sdb/bdi/min_ratio
    5

    [over_bground_thresh]: wakeup flush-8:0 : BDI_RECLAIMABLE = 56064 KB
    [over_bground_thresh]: wakeup flush-8:0 : BDI_RECLAIMABLE = 56704 KB
    [over_bground_thresh]: wakeup flush-8:0 : BDI_RECLAIMABLE = 84160 KB
    [over_bground_thresh]: wakeup flush-8:0 : BDI_RECLAIMABLE = 96960 KB
    [over_bground_thresh]: wakeup flush-8:0 : BDI_RECLAIMABLE = 94080 KB
    [over_bground_thresh]: wakeup flush-8:0 : BDI_RECLAIMABLE = 93120 KB
    [over_bground_thresh]: wakeup flush-8:0 : BDI_RECLAIMABLE = 93120 KB
    [over_bground_thresh]: wakeup flush-8:0 : BDI_RECLAIMABLE = 91520 KB
    [over_bground_thresh]: wakeup flush-8:0 : BDI_RECLAIMABLE = 89600 KB
    [over_bground_thresh]: wakeup flush-8:0 : BDI_RECLAIMABLE = 93696 KB
    [over_bground_thresh]: wakeup flush-8:0 : BDI_RECLAIMABLE = 93696 KB
    [over_bground_thresh]: wakeup flush-8:0 : BDI_RECLAIMABLE = 72960 KB
    [over_bground_thresh]: wakeup flush-8:0 : BDI_RECLAIMABLE = 90624 KB
    [over_bground_thresh]: wakeup flush-8:0 : BDI_RECLAIMABLE = 90624 KB
    [over_bground_thresh]: wakeup flush-8:0 : BDI_RECLAIMABLE = 90688 KB

    As mentioned in the above logs, once cache is reserved for Process B,
    and patch is applied there is less writeback cache thrashing on sdb by
    frequent forced writeback by flush-8:16 in over_bground_thresh.

    After all, small cached data will be flushed by periodic writeback once
    every dirty_writeback_interval.

    Suggested-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
    Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
    Signed-off-by: Vivek Trivedi <t.vivek@samsung.com>
    Signed-off-by: flar2 <asegaert@gmail.com>
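The decision this message changes can be illustrated with a small user-space model. This is a sketch under stated assumptions rather than the kernel's code: `struct bdi_model`, `bdi_limit_kb()`, `over_thresh_old()` and `over_thresh_new()` are hypothetical names, the per-device limit is simplified to a fixed percentage of the background threshold, and the threshold value used in the usage note is made up. Only the idea comes from the message: the old check fired for every device once global dirty memory exceeded the background threshold, while the new check looks at the device's own reclaimable data alone.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-device snapshot, sizes in KB (illustrative field names). */
struct bdi_model {
    unsigned long reclaimable_kb; /* stands in for BDI_RECLAIMABLE */
    unsigned int  share_percent;  /* assumed share of the background threshold */
};

/* Simplified stand-in for the per-device limit: a percentage of the global threshold. */
static unsigned long bdi_limit_kb(const struct bdi_model *bdi,
                                  unsigned long background_thresh_kb)
{
    assert(bdi->share_percent <= 100);
    return background_thresh_kb * bdi->share_percent / 100;
}

/* Before: the global check makes every device look "over threshold". */
static bool over_thresh_old(const struct bdi_model *bdi,
                            unsigned long global_dirty_kb,
                            unsigned long background_thresh_kb)
{
    if (global_dirty_kb > background_thresh_kb)
        return true;
    return bdi->reclaimable_kb > bdi_limit_kb(bdi, background_thresh_kb);
}

/* After: only the device's own reclaimable data is compared to its limit. */
static bool over_thresh_new(const struct bdi_model *bdi,
                            unsigned long background_thresh_kb)
{
    return bdi->reclaimable_kb > bdi_limit_kb(bdi, background_thresh_kb);
}
```

Taking 92544 KB for sda and 384 KB for sdb from the log, an assumed background threshold of 50000 KB and an assumed 5% share for sdb, the old check wakes both flushers, while the new check wakes only the one for sda.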
* writeback: fix race that cause writeback hung | Junxiao Bi | 2016-09-01 | 1 | -2/+2

    There is a race between mark inode dirty and writeback thread, see the
    following scenario. In this case, writeback thread will not run though
    there is dirty_io.

    __mark_inode_dirty()                                              bdi_writeback_workfn()
      ...                                                               ...
      spin_lock(&inode->i_lock);
      ...
      if (bdi_cap_writeback_dirty(bdi)) {
          <<< assume wb has dirty_io, so wakeup_bdi is false.
          <<< the following inode_dirty also have wakeup_bdi false.
          if (!wb_has_dirty_io(&bdi->wb))
              wakeup_bdi = true;
      }
      spin_unlock(&inode->i_lock);
                                                                        <<< assume last dirty_io is removed here.
                                                                        pages_written = wb_do_writeback(wb);
                                                                        ...
                                                                        <<< work_list empty and wb has no dirty_io,
                                                                        <<< delayed_work will not be queued.
                                                                        if (!list_empty(&bdi->work_list) ||
                                                                            (wb_has_dirty_io(wb) && dirty_writeback_interval))
                                                                            queue_delayed_work(bdi_wq, &wb->dwork,
                                                                                msecs_to_jiffies(dirty_writeback_interval * 10));
      spin_lock(&bdi->wb.list_lock);
      inode->dirtied_when = jiffies;
      <<< new dirty_io is added.
      list_move(&inode->i_wb_list, &bdi->wb.b_dirty);
      spin_unlock(&bdi->wb.list_lock);

      <<< though there is dirty_io, but wakeup_bdi is false,
      <<< so writeback thread will not be waked up and
      <<< the new dirty_io will not be flushed.
      if (wakeup_bdi)
          bdi_wakeup_thread_delayed(bdi);

    Writeback will not run until there is a new flush work queued. This may
    cause a lot of dirty pages to stay in memory for a long time.

    Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Cc: Fengguang Wu <fengguang.wu@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: flar2 <asegaert@gmail.com>
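The interleaving in this message can be replayed deterministically in user space. The sketch below is a single-threaded model with hypothetical names (`struct wb_model`, `flusher_pass()`, `add_dirty()` and so on), not the kernel's data structures. The second function shows one plausible way to close the window, assuming the "does the writeback thread need waking" check is made atomically with the list insertion; the log entry does not show the actual diff, so that ordering is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one writeback context. */
struct wb_model {
    int  nr_dirty;    /* number of dirty inodes on the list */
    bool work_queued; /* true if the flusher will run again */
};

/* The marker wakes the flusher only if nothing was dirty when it looked. */
static bool needs_wakeup(const struct wb_model *wb)
{
    return wb->nr_dirty == 0;
}

/* Flusher writes up to 'budget' inodes and requeues itself only if dirty inodes remain. */
static void flusher_pass(struct wb_model *wb, int budget)
{
    assert(budget >= 0);
    wb->nr_dirty = budget >= wb->nr_dirty ? 0 : wb->nr_dirty - budget;
    wb->work_queued = wb->nr_dirty > 0;
}

/* Marker adds a dirty inode and wakes the flusher if it decided to earlier. */
static void add_dirty(struct wb_model *wb, bool wakeup)
{
    wb->nr_dirty++;
    if (wakeup)
        wb->work_queued = true;
}

/* Dirty data exists but nobody is scheduled to flush it. */
static bool stranded(const struct wb_model *wb)
{
    return wb->nr_dirty > 0 && !wb->work_queued;
}

/* Ordering from the diagram: check, flusher runs, then insert. */
static bool racy_interleaving(void)
{
    struct wb_model wb = { .nr_dirty = 1, .work_queued = true };
    bool wakeup = needs_wakeup(&wb); /* false: one inode is still dirty */
    flusher_pass(&wb, 16);           /* last dirty inode is written here */
    add_dirty(&wb, wakeup);          /* new dirty inode, no wakeup */
    return stranded(&wb);
}

/* Assumed fix: the check and the insertion happen with no flusher pass in between. */
static bool atomic_check_and_add(void)
{
    struct wb_model wb = { .nr_dirty = 1, .work_queued = true };
    flusher_pass(&wb, 16);
    bool wakeup = needs_wakeup(&wb); /* true: the list is empty now */
    add_dirty(&wb, wakeup);
    return stranded(&wb);
}
```

In the racy ordering the model ends with a dirty inode and no queued work, which matches the hang described above; with the check and insertion kept together, the flusher is always requeued.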
* first commit | Meizu OpenSource | 2016-08-15 | 1 | -0/+1473