aboutsummaryrefslogtreecommitdiff
path: root/drivers/s390
Commit message (Collapse)AuthorAgeFilesLines
* Replace <asm/uaccess.h> with <linux/uaccess.h> globallyLinus Torvalds2018-11-2929-29/+29
| | | | | | | | | | | | | | This was entirely automated, using the script by Al: PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>' sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \ $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h) to do the replacement at the end of the merge window. Requested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Moyster <oysterized@gmail.com>
* UPSTREAM: block: disable entropy contributions for nonrot devicesMike Snitzer2017-12-272-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | (cherry picked from commit b277da0a8a594308e17881f4926879bd5fca2a2d) Clear QUEUE_FLAG_ADD_RANDOM in all block drivers that set QUEUE_FLAG_NONROT. Historically, all block devices have automatically made entropy contributions. But as previously stated in commit e2e1a148 ("block: add sysfs knob for turning off disk entropy contributions"): - On SSD disks, the completion times aren't as random as they are for rotational drives. So it's questionable whether they should contribute to the random pool in the first place. - Calling add_disk_randomness() has a lot of overhead. There are more reliable sources for randomness than non-rotational block devices. From a security perspective it is better to err on the side of caution than to allow entropy contributions from unreliable "random" sources. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Mister Oyster <oysterized@gmail.com>
* scsi: zfcp: trace HBA FSF response by default on dismiss or timedout late ↵Steffen Maier2017-11-061-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | response commit fdb7cee3b9e3c561502e58137a837341f10cbf8b upstream. At the default trace level, we only trace unsuccessful events including FSF responses. zfcp_dbf_hba_fsf_response() only used protocol status and FSF status to decide on an unsuccessful response. However, this is only one of multiple possible sources determining a failed struct zfcp_fsf_req. An FSF request can also "fail" if its response runs into an ERP timeout or if it gets dismissed because a higher level recovery was triggered [trace tags "erscf_1" or "erscf_2" in zfcp_erp_strategy_check_fsfreq()]. FSF requests with ERP timeout are: FSF_QTCB_EXCHANGE_CONFIG_DATA, FSF_QTCB_EXCHANGE_PORT_DATA, FSF_QTCB_OPEN_PORT_WITH_DID or FSF_QTCB_CLOSE_PORT or FSF_QTCB_CLOSE_PHYSICAL_PORT for target ports, FSF_QTCB_OPEN_LUN, FSF_QTCB_CLOSE_LUN. One example is slow queue processing which can cause follow-on errors, e.g. FSF_PORT_ALREADY_OPEN after FSF_QTCB_OPEN_PORT_WITH_DID timed out. In order to see the root cause, we need to see late responses even if the channel presented them successfully with FSF_PROT_GOOD and FSF_GOOD. Example trace records formatted with zfcpdbf from the s390-tools package: Timestamp : ... Area : REC Subarea : 00 Level : 1 Exception : - CPU ID : .. Caller : ... Record ID : 1 Tag : fcegpf1 LUN : 0xffffffffffffffff WWPN : 0x<WWPN> D_ID : 0x00<D_ID> Adapter status : 0x5400050b Port status : 0x41200000 LUN status : 0x00000000 Ready count : 0x00000001 Running count : 0x... ERP want : 0x02 ZFCP_ERP_ACTION_REOPEN_PORT ERP need : 0x02 ZFCP_ERP_ACTION_REOPEN_PORT | Timestamp : ... 30 seconds later Area : REC Subarea : 00 Level : 1 Exception : - CPU ID : .. Caller : ... Record ID : 2 Tag : erscf_2 LUN : 0xffffffffffffffff WWPN : 0x<WWPN> D_ID : 0x00<D_ID> Adapter status : 0x5400050b Port status : 0x41200000 LUN status : 0x00000000 Request ID : 0x<request_ID> ERP status : 0x10000000 ZFCP_STATUS_ERP_TIMEDOUT ERP step : 0x0800 ZFCP_ERP_STEP_PORT_OPENING ERP action : 0x02 ZFCP_ERP_ACTION_REOPEN_PORT ERP count : 0x00 | Timestamp : ... later than previous record Area : HBA Subarea : 00 Level : 5 > default level => 3 <= default level Exception : - CPU ID : 00 Caller : ... Record ID : 1 Tag : fs_qtcb => fs_rerr Request ID : 0x<request_ID> Request status : 0x00001010 ZFCP_STATUS_FSFREQ_DISMISSED | ZFCP_STATUS_FSFREQ_CLEANUP FSF cmnd : 0x00000005 FSF sequence no: 0x... FSF issued : ... > 30 seconds ago FSF stat : 0x00000000 FSF_GOOD FSF stat qual : 00000000 00000000 00000000 00000000 Prot stat : 0x00000001 FSF_PROT_GOOD Prot stat qual : 00000000 00000000 00000000 00000000 Port handle : 0x... LUN handle : 0x00000000 QTCB log length: ... QTCB log info : ... In case of problems detecting that new responses are waiting on the input queue, we sooner or later trigger adapter recovery due to an FSF request timeout (trace tag "fsrth_1"). FSF requests with FSF request timeout are: typically FSF_QTCB_ABORT_FCP_CMND; but theoretically also FSF_QTCB_EXCHANGE_CONFIG_DATA or FSF_QTCB_EXCHANGE_PORT_DATA via sysfs, FSF_QTCB_OPEN_PORT_WITH_DID or FSF_QTCB_CLOSE_PORT for WKA ports, FSF_QTCB_FCP_CMND for task management function (LUN / target reset). One or more pending requests can meanwhile have FSF_PROT_GOOD and FSF_GOOD because the channel filled in the response via DMA into the request's QTCB. In a theroretical case, inject code can create an erroneous FSF request on purpose. If data router is enabled, it uses deferred error reporting. A READ SCSI command can succeed with FSF_PROT_GOOD, FSF_GOOD, and SAM_STAT_GOOD. But on writing the read data to host memory via DMA, it can still fail, e.g. if an intentionally wrong scatter list does not provide enough space. Rather than getting an unsuccessful response, we get a QDIO activate check which in turn triggers adapter recovery. One or more pending requests can meanwhile have FSF_PROT_GOOD and FSF_GOOD because the channel filled in the response via DMA into the request's QTCB. Example trace records formatted with zfcpdbf from the s390-tools package: Timestamp : ... Area : HBA Subarea : 00 Level : 6 > default level => 3 <= default level Exception : - CPU ID : .. Caller : ... Record ID : 1 Tag : fs_norm => fs_rerr Request ID : 0x<request_ID2> Request status : 0x00001010 ZFCP_STATUS_FSFREQ_DISMISSED | ZFCP_STATUS_FSFREQ_CLEANUP FSF cmnd : 0x00000001 FSF sequence no: 0x... FSF issued : ... FSF stat : 0x00000000 FSF_GOOD FSF stat qual : 00000000 00000000 00000000 00000000 Prot stat : 0x00000001 FSF_PROT_GOOD Prot stat qual : ........ ........ 00000000 00000000 Port handle : 0x... LUN handle : 0x... | Timestamp : ... Area : SCSI Subarea : 00 Level : 3 Exception : - CPU ID : .. Caller : ... Record ID : 1 Tag : rsl_err Request ID : 0x<request_ID2> SCSI ID : 0x... SCSI LUN : 0x... SCSI result : 0x000e0000 DID_TRANSPORT_DISRUPTED SCSI retries : 0x00 SCSI allowed : 0x05 SCSI scribble : 0x<request_ID2> SCSI opcode : 28... Read(10) FCP rsp inf cod: 0x00 FCP rsp IU : 00000000 00000000 00000000 00000000 ^^ SAM_STAT_GOOD 00000000 00000000 Only with luck in both above cases, we could see a follow-on trace record of an unsuccesful event following a successful but late FSF response with FSF_PROT_GOOD and FSF_GOOD. Typically this was the case for I/O requests resulting in a SCSI trace record "rsl_err" with DID_TRANSPORT_DISRUPTED [On ZFCP_STATUS_FSFREQ_DISMISSED, zfcp_fsf_protstatus_eval() sets ZFCP_STATUS_FSFREQ_ERROR seen by the request handler functions as failure]. However, the reason for this follow-on trace was invisible because the corresponding HBA trace record was missing at the default trace level (by default hidden records with tags "fs_norm", "fs_qtcb", or "fs_open"). On adapter recovery, after we had shut down the QDIO queues, we perform unsuccessful pseudo completions with flag ZFCP_STATUS_FSFREQ_DISMISSED for each pending FSF request in zfcp_fsf_req_dismiss_all(). In order to find the root cause, we need to see all pseudo responses even if the channel presented them successfully with FSF_PROT_GOOD and FSF_GOOD. Therefore, check zfcp_fsf_req.status for ZFCP_STATUS_FSFREQ_DISMISSED or ZFCP_STATUS_FSFREQ_ERROR and trace with a new tag "fs_rerr". It does not matter that there are numerous places which set ZFCP_STATUS_FSFREQ_ERROR after the location where we trace an FSF response early. These cases are based on protocol status != FSF_PROT_GOOD or == FSF_PROT_FSF_STATUS_PRESENTED and are thus already traced by default as trace tag "fs_perr" or "fs_ferr" respectively. NB: The trace record with tag "fssrh_1" for status read buffers on dismiss all remains. zfcp_fsf_req_complete() handles this and returns early. All other FSF request types are handled separately and as described above. Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Fixes: 8a36e4532ea1 ("[SCSI] zfcp: enhancement of zfcp debug features") Fixes: 2e261af84cdb ("[SCSI] zfcp: Only collect FSF/HBA debug data for matching trace levels") Cc: <stable@vger.kernel.org> #2.6.38+ Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
* scsi: zfcp: fix payload with full FCP_RSP IU in SCSI trace recordsSteffen Maier2017-11-061-4/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 12c3e5754c8022a4f2fd1e9f00d19e99ee0d3cc1 upstream. If the FCP_RSP UI has optional parts (FCP_SNS_INFO or FCP_RSP_INFO) and thus does not fit into the fsp_rsp field built into a SCSI trace record, trace the full FCP_RSP UI with all optional parts as payload record instead of just FCP_SNS_INFO as payload and a 1 byte RSP_INFO_CODE part of FCP_RSP_INFO built into the SCSI record. That way we would also get the full FCP_SNS_INFO in case a target would ever send more than min(SCSI_SENSE_BUFFERSIZE==96, ZFCP_DBF_PAY_MAX_REC==256)==96. The mandatory part of FCP_RSP IU is only 24 bytes. PAYload costs at least one full PAY record of 256 bytes anyway. We cap to the hardware response size which is only FSF_FCP_RSP_SIZE==128. So we can just put the whole FCP_RSP IU with any optional parts into PAYload similarly as we do for SAN PAY since v4.9 commit aceeffbb59bb ("zfcp: trace full payload of all SAN records (req,resp,iels)"). This does not cause any additional trace records wasting memory. Decoded trace records were confusing because they showed a hard-coded sense data length of 96 even if the FCP_RSP_IU field FCP_SNS_LEN showed actually less. Since the same commit, we set pl_len for SAN traces to the full length of a request/response even if we cap the corresponding trace. In contrast, here for SCSI traces we set pl_len to the pre-computed length of FCP_RSP IU considering SNS_LEN or RSP_LEN if valid. Nonetheless we trace a hardcoded payload of length FSF_FCP_RSP_SIZE==128 if there were optional parts. This makes it easier for the zfcpdbf tool to format only the relevant part of the long FCP_RSP UI buffer. And any trailing information is still available in the payload trace record just in case. Rename the payload record tag from "fcp_sns" to "fcp_riu" to make the new content explicit to zfcpdbf which can then pick a suitable field name such as "FCP rsp IU all:" instead of "Sense info :" Also, the same zfcpdbf can still be backwards compatible with "fcp_sns". Old example trace record before this fix, formatted with the tool zfcpdbf from s390-tools: Timestamp : ... Area : SCSI Subarea : 00 Level : 3 Exception : - CPU id : .. Caller : 0x... Record id : 1 Tag : rsl_err Request id : 0x<request_id> SCSI ID : 0x... SCSI LUN : 0x... SCSI result : 0x00000002 SCSI retries : 0x00 SCSI allowed : 0x05 SCSI scribble : 0x<request_id> SCSI opcode : 00000000 00000000 00000000 00000000 FCP rsp inf cod: 0x00 FCP rsp IU : 00000000 00000000 00000202 00000000 ^^==FCP_SNS_LEN_VALID 00000020 00000000 ^^^^^^^^==FCP_SNS_LEN==32 Sense len : 96 <==min(SCSI_SENSE_BUFFERSIZE,ZFCP_DBF_PAY_MAX_REC) Sense info : 70000600 00000018 00000000 29000000 00000400 00000000 00000000 00000000 00000000 00000000 00000000 00000000<==superfluous 00000000 00000000 00000000 00000000<==superfluous 00000000 00000000 00000000 00000000<==superfluous 00000000 00000000 00000000 00000000<==superfluous New example trace records with this fix: Timestamp : ... Area : SCSI Subarea : 00 Level : 3 Exception : - CPU ID : .. Caller : 0x... Record ID : 1 Tag : rsl_err Request ID : 0x<request_id> SCSI ID : 0x... SCSI LUN : 0x... SCSI result : 0x00000002 SCSI retries : 0x00 SCSI allowed : 0x03 SCSI scribble : 0x<request_id> SCSI opcode : a30c0112 00000000 02000000 00000000 FCP rsp inf cod: 0x00 FCP rsp IU : 00000000 00000000 00000a02 00000200 00000020 00000000 FCP rsp IU len : 56 FCP rsp IU all : 00000000 00000000 00000a02 00000200 ^^=FCP_RESID_UNDER|FCP_SNS_LEN_VALID 00000020 00000000 70000500 00000018 ^^^^^^^^==FCP_SNS_LEN ^^^^^^^^^^^^^^^^^ 00000000 240000cb 00011100 00000000 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 00000000 00000000 ^^^^^^^^^^^^^^^^^==FCP_SNS_INFO Timestamp : ... Area : SCSI Subarea : 00 Level : 1 Exception : - CPU ID : .. Caller : 0x... Record ID : 1 Tag : lr_okay Request ID : 0x<request_id> SCSI ID : 0x... SCSI LUN : 0x... SCSI result : 0x00000000 SCSI retries : 0x00 SCSI allowed : 0x05 SCSI scribble : 0x<request_id> SCSI opcode : <CDB of unrelated SCSI command passed to eh handler> FCP rsp inf cod: 0x00 FCP rsp IU : 00000000 00000000 00000100 00000000 00000000 00000008 FCP rsp IU len : 32 FCP rsp IU all : 00000000 00000000 00000100 00000000 ^^==FCP_RSP_LEN_VALID 00000000 00000008 00000000 00000000 ^^^^^^^^==FCP_RSP_LEN ^^^^^^^^^^^^^^^^^==FCP_RSP_INFO Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Fixes: 250a1352b95e ("[SCSI] zfcp: Redesign of the debug tracing for SCSI records.") Cc: <stable@vger.kernel.org> #2.6.38+ Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
* scsi: zfcp: fix missing trace records for early returns in TMF eh handlersSteffen Maier2017-11-061-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 1a5d999ebfc7bfe28deb48931bb57faa8e4102b6 upstream. For problem determination we need to see that we were in scsi_eh as well as whether and why we were successful or not. The following commits introduced new early returns without adding a trace record: v2.6.35 commit a1dbfddd02d2 ("[SCSI] zfcp: Pass return code from fc_block_scsi_eh to scsi eh") on fc_block_scsi_eh() returning != 0 which is FAST_IO_FAIL, v2.6.30 commit 63caf367e1c9 ("[SCSI] zfcp: Improve reliability of SCSI eh handlers in zfcp") on not having gotten an FSF request after the maximum number of retry attempts and thus could not issue a TMF and has to return FAILED. Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Fixes: a1dbfddd02d2 ("[SCSI] zfcp: Pass return code from fc_block_scsi_eh to scsi eh") Fixes: 63caf367e1c9 ("[SCSI] zfcp: Improve reliability of SCSI eh handlers in zfcp") Cc: <stable@vger.kernel.org> #2.6.38+ Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
* scsi: zfcp: add handling for FCP_RESID_OVER to the fcp ingress pathBenjamin Block2017-11-061-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit a099b7b1fc1f0418ab8d79ecf98153e1e134656e upstream. Up until now zfcp would just ignore the FCP_RESID_OVER flag in the FCP response IU. When this flag is set, it is possible, in regards to the FCP standard, that the storage-server processes the command normally, up to the point where data is missing and simply ignores those. In this case no CHECK CONDITION would be set, and because we ignored the FCP_RESID_OVER flag we resulted in at least a data loss or even -corruption as a follow-up error, depending on how the applications/layers on top behave. To prevent this, we now set the host-byte of the corresponding scsi_cmnd to DID_ERROR. Other storage-behaviors, where the same condition results in a CHECK CONDITION set in the answer, don't need to be changed as they are handled in the mid-layer already. Following is an example trace record decoded with zfcpdbf from the s390-tools package. We forcefully injected a fc_dl which is one byte too small: Timestamp : ... Area : SCSI Subarea : 00 Level : 3 Exception : - CPU ID : .. Caller : 0x... Record ID : 1 Tag : rsl_err Request ID : 0x... SCSI ID : 0x... SCSI LUN : 0x... SCSI result : 0x00070000 ^^DID_ERROR SCSI retries : 0x.. SCSI allowed : 0x.. SCSI scribble : 0x... SCSI opcode : 2a000000 00000000 08000000 00000000 FCP rsp inf cod: 0x00 FCP rsp IU : 00000000 00000000 00000400 00000001 ^^fr_flags==FCP_RESID_OVER ^^fr_status==SAM_STAT_GOOD ^^^^^^^^fr_resid 00000000 00000000 As of now, we don't actively handle to possibility that a response IU has both flags - FCP_RESID_OVER and FCP_RESID_UNDER - set at once. Reported-by: Luke M. Hopkins <lmhopkin@us.ibm.com> Reviewed-by: Steffen Maier <maier@linux.vnet.ibm.com> Fixes: 553448f6c483 ("[SCSI] zfcp: Message cleanup") Fixes: ea127f975424 ("[PATCH] s390 (7/7): zfcp host adapter.") (tglx/history.git) Cc: <stable@vger.kernel.org> #2.6.33+ Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
* scsi: zfcp: fix queuecommand for scsi_eh commands when DIX enabledSteffen Maier2017-11-061-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 71b8e45da51a7b64a23378221c0a5868bd79da4f upstream. Since commit db007fc5e20c ("[SCSI] Command protection operation"), scsi_eh_prep_cmnd() saves scmd->prot_op and temporarily resets it to SCSI_PROT_NORMAL. Other FCP LLDDs such as qla2xxx and lpfc shield their queuecommand() to only access any of scsi_prot_sg...() if (scsi_get_prot_op(cmd) != SCSI_PROT_NORMAL). Do the same thing for zfcp, which introduced DIX support with commit ef3eb71d8ba4 ("[SCSI] zfcp: Introduce experimental support for DIF/DIX"). Otherwise, TUR SCSI commands as part of scsi_eh likely fail in zfcp, because the regular SCSI command with DIX protection data, that scsi_eh re-uses in scsi_send_eh_cmnd(), of course still has (scsi_prot_sg_count() != 0) and so zfcp sends down bogus requests to the FCP channel hardware. This causes scsi_eh_test_devices() to have (finish_cmds == 0) [not SCSI device is online or not scsi_eh_tur() failed] so regular SCSI commands, that caused / were affected by scsi_eh, are moved to work_q and scsi_eh_test_devices() itself returns false. In turn, it unnecessarily escalates in our case in scsi_eh_ready_devs() beyond host reset to finally scsi_eh_offline_sdevs() which sets affected SCSI devices offline with the following kernel message: "kernel: sd H:0:T:L: Device offlined - not ready after error recovery" Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Fixes: ef3eb71d8ba4 ("[SCSI] zfcp: Introduce experimental support for DIF/DIX") Cc: <stable@vger.kernel.org> #2.6.36+ Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
* s390/vmlogrdr: fix IUCV buffer allocationGerald Schaefer2017-07-041-1/+1
| | | | | | | | | | | | | | | commit 5457e03de918f7a3e294eb9d26a608ab8a579976 upstream. The buffer for iucv_message_receive() needs to be below 2 GB. In __iucv_message_receive(), the buffer address is casted to an u32, which would result in either memory corruption or an addressing exception when using addresses >= 2 GB. Fix this by using GFP_DMA for the buffer allocation. Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
* s390/qdio: clear DSCI prior to scanning multiple input queuesJulian Wiedmann2017-06-171-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 1e4a382fdc0ba8d1a85b758c0811de3a3631085e upstream. For devices with multiple input queues, tiqdio_call_inq_handlers() iterates over all input queues and clears the device's DSCI during each iteration. If the DSCI is re-armed during one of the later iterations, we therefore do not scan the previous queues again. The re-arming also raises a new adapter interrupt. But its handler does not trigger a rescan for the device, as the DSCI has already been erroneously cleared. This can result in queue stalls on devices with multiple input queues. Fix it by clearing the DSCI just once, prior to scanning the queues. As the code is moved in front of the loop, we also need to access the DSCI directly (ie irq->dsci) instead of going via each queue's parent pointer to the same irq. This is not a functional change, and a follow-up patch will clean up the other users. In practice, this bug only affects CQ-enabled HiperSockets devices, ie. devices with sysfs-attribute "hsuid" set. Setting a hsuid is needed for AF_IUCV socket applications that use HiperSockets communication. Fixes: 104ea556ee7f ("qdio: support asynchronous delivery of storage blocks") Reviewed-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
* scsi: zfcp: fix use-after-free by not tracing WKA port open/close on failed sendSteffen Maier2017-06-171-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 2dfa6688aafdc3f74efeb1cf05fb871465d67f79 upstream. Dan Carpenter kindly reported: <quote> The patch d27a7cb91960: "zfcp: trace on request for open and close of WKA port" from Aug 10, 2016, leads to the following static checker warning: drivers/s390/scsi/zfcp_fsf.c:1615 zfcp_fsf_open_wka_port() warn: 'req' was already freed. drivers/s390/scsi/zfcp_fsf.c 1609 zfcp_fsf_start_timer(req, ZFCP_FSF_REQUEST_TIMEOUT); 1610 retval = zfcp_fsf_req_send(req); 1611 if (retval) 1612 zfcp_fsf_req_free(req); ^^^ Freed. 1613 out: 1614 spin_unlock_irq(&qdio->req_q_lock); 1615 if (req && !IS_ERR(req)) 1616 zfcp_dbf_rec_run_wka("fsowp_1", wka_port, req->req_id); ^^^^^^^^^^^ Use after free. 1617 return retval; 1618 } Same thing for zfcp_fsf_close_wka_port() as well. </quote> Rather than relying on req being NULL (or ERR_PTR) for all cases where we don't want to trace or should not trace, simply check retval which is unconditionally initialized with -EIO != 0 and it can only become 0 on successful retval = zfcp_fsf_req_send(req). With that we can also remove the then again unnecessary unconditional initialization of req which was introduced with that earlier commit. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Suggested-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Fixes: d27a7cb91960 ("zfcp: trace on request for open and close of WKA port") Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Reviewed-by: Jens Remus <jremus@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
* scsi: zfcp: fix rport unblock race with LUN recoverySteffen Maier2017-06-174-9/+77
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 6f2ce1c6af37191640ee3ff6e8fc39ea10352f4c upstream. It is unavoidable that zfcp_scsi_queuecommand() has to finish requests with DID_IMM_RETRY (like fc_remote_port_chkready()) during the time window when zfcp detected an unavailable rport but fc_remote_port_delete(), which is asynchronous via zfcp_scsi_schedule_rport_block(), has not yet blocked the rport. However, for the case when the rport becomes available again, we should prevent unblocking the rport too early. In contrast to other FCP LLDDs, zfcp has to open each LUN with the FCP channel hardware before it can send I/O to a LUN. So if a port already has LUNs attached and we unblock the rport just after port recovery, recoveries of LUNs behind this port can still be pending which in turn force zfcp_scsi_queuecommand() to unnecessarily finish requests with DID_IMM_RETRY. This also opens a time window with unblocked rport (until the followup LUN reopen recovery has finished). If a scsi_cmnd timeout occurs during this time window fc_timed_out() cannot work as desired and such command would indeed time out and trigger scsi_eh. This prevents a clean and timely path failover. This should not happen if the path issue can be recovered on FC transport layer such as path issues involving RSCNs. Fix this by only calling zfcp_scsi_schedule_rport_register(), to asynchronously trigger fc_remote_port_add(), after all LUN recoveries as children of the rport have finished and no new recoveries of equal or higher order were triggered meanwhile. Finished intentionally includes any recovery result no matter if successful or failed (still unblock rport so other successful LUNs work). For simplicity, we check after each finished LUN recovery if there is another LUN recovery pending on the same port and then do nothing. We handle the special case of a successful recovery of a port without LUN children the same way without changing this case's semantics. For debugging we introduce 2 new trace records written if the rport unblock attempt was aborted due to still unfinished or freshly triggered recovery. The records are only written above the default trace level. Benjamin noticed the important special case of new recovery that can be triggered between having given up the erp_lock and before calling zfcp_erp_action_cleanup() within zfcp_erp_strategy(). We must avoid the following sequence: ERP thread rport_work other context ------------------------- -------------- -------------------------------- port is unblocked, rport still blocked, due to pending/running ERP action, so ((port->status & ...UNBLOCK) != 0) and (port->rport == NULL) unlock ERP zfcp_erp_action_cleanup() case ZFCP_ERP_ACTION_REOPEN_LUN: zfcp_erp_try_rport_unblock() ((status & ...UNBLOCK) != 0) [OLD!] zfcp_erp_port_reopen() lock ERP zfcp_erp_port_block() port->status clear ...UNBLOCK unlock ERP zfcp_scsi_schedule_rport_block() port->rport_task = RPORT_DEL queue_work(rport_work) zfcp_scsi_rport_work() (port->rport_task != RPORT_ADD) port->rport_task = RPORT_NONE zfcp_scsi_rport_block() if (!port->rport) return zfcp_scsi_schedule_rport_register() port->rport_task = RPORT_ADD queue_work(rport_work) zfcp_scsi_rport_work() (port->rport_task == RPORT_ADD) port->rport_task = RPORT_NONE zfcp_scsi_rport_register() (port->rport == NULL) rport = fc_remote_port_add() port->rport = rport; Now the rport was erroneously unblocked while the zfcp_port is blocked. This is another situation we want to avoid due to scsi_eh potential. This state would at least remain until the new recovery from the other context finished successfully, or potentially forever if it failed. In order to close this race, we take the erp_lock inside zfcp_erp_try_rport_unblock() when checking the status of zfcp_port or LUN. With that, the possible corresponding rport state sequences would be: (unblock[ERP thread],block[other context]) if the ERP thread gets erp_lock first and still sees ((port->status & ...UNBLOCK) != 0), (block[other context],NOP[ERP thread]) if the ERP thread gets erp_lock after the other context has already cleard ...UNBLOCK from port->status. Since checking fields of struct erp_action is unsafe because they could have been overwritten (re-used for new recovery) meanwhile, we only check status of zfcp_port and LUN since these are only changed under erp_lock elsewhere. Regarding the check of the proper status flags (port or port_forced are similar to the shown adapter recovery): [zfcp_erp_adapter_shutdown()] zfcp_erp_adapter_reopen() zfcp_erp_adapter_block() * clear UNBLOCK ---------------------------------------+ zfcp_scsi_schedule_rports_block() | write_lock_irqsave(&adapter->erp_lock, flags);-------+ | zfcp_erp_action_enqueue() | | zfcp_erp_setup_act() | | * set ERP_INUSE -----------------------------------|--|--+ write_unlock_irqrestore(&adapter->erp_lock, flags);--+ | | .context-switch. | | zfcp_erp_thread() | | zfcp_erp_strategy() | | write_lock_irqsave(&adapter->erp_lock, flags);------+ | | ... | | | zfcp_erp_strategy_check_target() | | | zfcp_erp_strategy_check_adapter() | | | zfcp_erp_adapter_unblock() | | | * set UNBLOCK -----------------------------------|--+ | zfcp_erp_action_dequeue() | | * clear ERP_INUSE ---------------------------------|-----+ ... | write_unlock_irqrestore(&adapter->erp_lock, flags);-+ Hence, we should check for both UNBLOCK and ERP_INUSE because they are interleaved. Also we need to explicitly check ERP_FAILED for the link down case which currently does not clear the UNBLOCK flag in zfcp_fsf_link_down_info_eval(). Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Fixes: 8830271c4819 ("[SCSI] zfcp: Dont fail SCSI commands when transitioning to blocked fc_rport") Fixes: a2fa0aede07c ("[SCSI] zfcp: Block FC transport rports early on errors") Fixes: 5f852be9e11d ("[SCSI] zfcp: Fix deadlock between zfcp ERP and SCSI") Fixes: 338151e06608 ("[SCSI] zfcp: make use of fc_remote_port_delete when target port is unavailable") Fixes: 3859f6a248cb ("[PATCH] zfcp: add rports to enable scsi_add_device to work again") Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
* scsi: zfcp: do not trace pure benign residual HBA responses at default levelSteffen Maier2017-06-172-3/+30
| | | | | | | | | | | | | | | | | | | | | | | commit 56d23ed7adf3974f10e91b643bd230e9c65b5f79 upstream. Since quite a while, Linux issues enough SCSI commands per scsi_device which successfully return with FCP_RESID_UNDER, FSF_FCP_RSP_AVAILABLE, and SAM_STAT_GOOD. This floods the HBA trace area and we cannot see other and important HBA trace records long enough. Therefore, do not trace HBA response errors for pure benign residual under counts at the default trace level. This excludes benign residual under count combined with other validity bits set in FCP_RSP_IU, such as FCP_SNS_LEN_VAL. For all those other cases, we still do want to see both the HBA record and the corresponding SCSI record by default. Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Fixes: a54ca0f62f95 ("[SCSI] zfcp: Redesign of the debug tracing for HBA records.") Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
* scsi: zfcp: fix use-after-"free" in FC ingress path after TMFBenjamin Block2017-06-173-3/+95
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit dac37e15b7d511e026a9313c8c46794c144103cd upstream. When SCSI EH invokes zFCP's callbacks for eh_device_reset_handler() and eh_target_reset_handler(), it expects us to relent the ownership over the given scsi_cmnd and all other scsi_cmnds within the same scope - LUN or target - when returning with SUCCESS from the callback ('release' them). SCSI EH can then reuse those commands. We did not follow this rule to release commands upon SUCCESS; and if later a reply arrived for one of those supposed to be released commands, we would still make use of the scsi_cmnd in our ingress tasklet. This will at least result in undefined behavior or a kernel panic because of a wrong kernel pointer dereference. To fix this, we NULLify all pointers to scsi_cmnds (struct zfcp_fsf_req *)->data in the matching scope if a TMF was successful. This is done under the locks (struct zfcp_adapter *)->abort_lock and (struct zfcp_reqlist *)->lock to prevent the requests from being removed from the request-hashtable, and the ingress tasklet from making use of the scsi_cmnd-pointer in zfcp_fsf_fcp_cmnd_handler(). For cases where a reply arrives during SCSI EH, but before we get a chance to NULLify the pointer - but before we return from the callback -, we assume that the code is protected from races via the CAS operation in blk_complete_request() that is called in scsi_done(). The following stacktrace shows an example for a crash resulting from the previous behavior: Unable to handle kernel pointer dereference at virtual kernel address fffffee17a672000 Oops: 0038 [#1] SMP CPU: 2 PID: 0 Comm: swapper/2 Not tainted task: 00000003f7ff5be0 ti: 00000003f3d38000 task.ti: 00000003f3d38000 Krnl PSW : 0404d00180000000 00000000001156b0 (smp_vcpu_scheduled+0x18/0x40) R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 EA:3 Krnl GPRS: 000000200000007e 0000000000000000 fffffee17a671fd8 0000000300000015 ffffffff80000000 00000000005dfde8 07000003f7f80e00 000000004fa4e800 000000036ce8d8f8 000000036ce8d9c0 00000003ece8fe00 ffffffff969c9e93 00000003fffffffd 000000036ce8da10 00000000003bf134 00000003f3b07918 Krnl Code: 00000000001156a2: a7190000 lghi %r1,0 00000000001156a6: a7380015 lhi %r3,21 #00000000001156aa: e32050000008 ag %r2,0(%r5) >00000000001156b0: 482022b0 lh %r2,688(%r2) 00000000001156b4: ae123000 sigp %r1,%r2,0(%r3) 00000000001156b8: b2220020 ipm %r2 00000000001156bc: 8820001c srl %r2,28 00000000001156c0: c02700000001 xilf %r2,1 Call Trace: ([<0000000000000000>] 0x0) [<000003ff807bdb8e>] zfcp_fsf_fcp_cmnd_handler+0x3de/0x490 [zfcp] [<000003ff807be30a>] zfcp_fsf_req_complete+0x252/0x800 [zfcp] [<000003ff807c0a48>] zfcp_fsf_reqid_check+0xe8/0x190 [zfcp] [<000003ff807c194e>] zfcp_qdio_int_resp+0x66/0x188 [zfcp] [<000003ff80440c64>] qdio_kick_handler+0xdc/0x310 [qdio] [<000003ff804463d0>] __tiqdio_inbound_processing+0xf8/0xcd8 [qdio] [<0000000000141fd4>] tasklet_action+0x9c/0x170 [<0000000000141550>] __do_softirq+0xe8/0x258 [<000000000010ce0a>] do_softirq+0xba/0xc0 [<000000000014187c>] irq_exit+0xc4/0xe8 [<000000000046b526>] do_IRQ+0x146/0x1d8 [<00000000005d6a3c>] io_return+0x0/0x8 [<00000000005d6422>] vtime_stop_cpu+0x4a/0xa0 ([<0000000000000000>] 0x0) [<0000000000103d8a>] arch_cpu_idle+0xa2/0xb0 [<0000000000197f94>] cpu_startup_entry+0x13c/0x1f8 [<0000000000114782>] smp_start_secondary+0xda/0xe8 [<00000000005d6efe>] restart_int_handler+0x56/0x6c [<0000000000000000>] 0x0 Last Breaking-Event-Address: [<00000000003bf12e>] arch_spin_lock_wait+0x56/0xb0 Suggested-by: Steffen Maier <maier@linux.vnet.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com> Fixes: ea127f9754 ("[PATCH] s390 (7/7): zfcp host adapter.") (tglx/history.git) Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
* scsi: zfcp: spin_lock_irqsave() is not nestableDan Carpenter2017-04-111-1/+1
| | | | | | | | | | | | | | | commit e7cb08e894a0b876443ef8fdb0706575dc00a5d2 upstream. We accidentally overwrite the original saved value of "flags" so that we can't re-enable IRQs at the end of the function. Presumably this function is mostly called with IRQs disabled or it would be obvious in testing. Fixes: aceeffbb59bb ("zfcp: trace full payload of all SAN records (req,resp,iels)") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
* zfcp: trace full payload of all SAN records (req,resp,iels)Steffen Maier2017-04-112-13/+104
| | | | | | | | | | | | | | | | | | | | commit aceeffbb59bb91404a0bda32a542d7ebf878433a upstream. This was lost with commit 2c55b750a884b86dea8b4cc5f15e1484cc47a25c ("[SCSI] zfcp: Redesign of the debug tracing for SAN records.") but is necessary for problem determination, e.g. to see the currently active zone set during automatic port scan. For the large GPN_FT response (4 pages), save space by not dumping any empty residual entries. Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Fixes: 2c55b750a884 ("[SCSI] zfcp: Redesign of the debug tracing for SAN records.") Reviewed-by: Alexey Ishchuk <aishchuk@linux.vnet.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
* zfcp: fix payload trace length for SAN request&responseSteffen Maier2017-04-111-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 94db3725f049ead24c96226df4a4fb375b880a77 upstream. commit 2c55b750a884b86dea8b4cc5f15e1484cc47a25c ("[SCSI] zfcp: Redesign of the debug tracing for SAN records.") started to add FC_CT_HDR_LEN which made zfcp dump random data out of bounds for RSPN GS responses because u.rspn.rsp is the largest and last field in the union of struct zfcp_fc_req. Other request/response types only happened to stay within bounds due to the padding of the union or due to the trace capping of u.gspn.rsp to ZFCP_DBF_SAN_MAX_PAYLOAD. Timestamp : ... Area : SAN Subarea : 00 Level : 1 Exception : - CPU id : .. Caller : ... Record id : 2 Tag : fsscth2 Request id : 0x... Destination ID : 0x00fffffc Payload short : 01000000 fc020000 80020000 00000000 xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx <=== 00000000 00000000 00000000 00000000 Payload length : 32 <=== struct zfcp_fc_req { [0] struct zfcp_fsf_ct_els ct_els; [56] struct scatterlist sg_req; [96] struct scatterlist sg_rsp; union { struct {req; rsp;} adisc; SIZE: 28+28= 56 struct {req; rsp;} gid_pn; SIZE: 24+20= 44 struct {rspsg; req;} gpn_ft; SIZE: 40*4+20=180 struct {req; rsp;} gspn; SIZE: 20+273= 293 struct {req; rsp;} rspn; SIZE: 277+16= 293 [136] } u; } SIZE: 432 Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Fixes: 2c55b750a884 ("[SCSI] zfcp: Redesign of the debug tracing for SAN records.") Reviewed-by: Alexey Ishchuk <aishchuk@linux.vnet.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
* zfcp: fix D_ID field with actual value on tracing SAN responsesSteffen Maier2017-04-113-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 771bf03537ddfa4a4dde62ef9dfbc82e4f77ab20 upstream. With commit 2c55b750a884b86dea8b4cc5f15e1484cc47a25c ("[SCSI] zfcp: Redesign of the debug tracing for SAN records.") we lost the N_Port-ID where an ELS response comes from. With commit 7c7dc196814b9e1d5cc254dc579a5fa78ae524f7 ("[SCSI] zfcp: Simplify handling of ct and els requests") we lost the N_Port-ID where a CT response comes from. It's especially useful if the request SAN trace record with D_ID was already lost due to trace buffer wrap. GS uses an open WKA port handle and ELS just a D_ID, and only for ELS we could get D_ID from QTCB bottom via zfcp_fsf_req. To cover both cases, add a new field to zfcp_fsf_ct_els and fill it in on request to use in SAN response trace. Strictly speaking the D_ID on SAN response is the FC frame's S_ID. We don't need a field for the other end which is always us. Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Fixes: 2c55b750a884 ("[SCSI] zfcp: Redesign of the debug tracing for SAN records.") Fixes: 7c7dc196814b ("[SCSI] zfcp: Simplify handling of ct and els requests") Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
* zfcp: restore tracing of handle for port and LUN with HBA recordsSteffen Maier2017-04-112-0/+4
| | | | | | | | | | | | | | | | commit 7c964ffe586bc0c3d9febe9bf97a2e4b2866e5b7 upstream. This information was lost with commit a54ca0f62f953898b05549391ac2a8a4dad6482b ("[SCSI] zfcp: Redesign of the debug tracing for HBA records.") but is required to debug e.g. invalid handle situations. Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Fixes: a54ca0f62f95 ("[SCSI] zfcp: Redesign of the debug tracing for HBA records.") Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
* zfcp: trace on request for open and close of WKA portSteffen Maier2017-04-113-2/+39
| | | | | | | | | | | | | | | | | | | | commit d27a7cb91960cf1fdd11b10071e601828cbf4b1f upstream. Since commit a54ca0f62f953898b05549391ac2a8a4dad6482b ("[SCSI] zfcp: Redesign of the debug tracing for HBA records.") HBA records no longer contain WWPN, D_ID, or LUN to reduce duplicate information which is already in REC records. In contrast to "regular" target ports, we don't use recovery to open WKA ports such as directory/nameserver, so we don't get REC records. Therefore, introduce pseudo REC running records without any actual recovery action but including D_ID of WKA port on open/close. Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Fixes: a54ca0f62f95 ("[SCSI] zfcp: Redesign of the debug tracing for HBA records.") Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
* zfcp: restore: Dont use 0 to indicate invalid LUN in rec traceSteffen Maier2017-04-111-1/+2
| | | | | | | | | | | | | | | | | | commit 0102a30a6ff60f4bb4c07358ca3b1f92254a6c25 upstream. bring back commit d21e9daa63e009ce5b87bbcaa6d11ce48e07bbbe ("[SCSI] zfcp: Dont use 0 to indicate invalid LUN in rec trace") which was lost with commit ae0904f60fab7cb20c48d32eefdd735e478b91fb ("[SCSI] zfcp: Redesign of the debug tracing for recovery actions.") Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Fixes: ae0904f60fab ("[SCSI] zfcp: Redesign of the debug tracing for recovery actions.") Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
* zfcp: retain trace level for SCSI and HBA FSF response recordsSteffen Maier2017-04-113-10/+12
| | | | | | | | | | | | | | | | | | | | | | commit 35f040df97fa0e94c7851c054ec71533c88b4b81 upstream. While retaining the actual filtering according to trace level, the following commits started to write such filtered records with a hardcoded record level of 1 instead of the actual record level: commit 250a1352b95e1db3216e5c5d4f4365bea5122f4a ("[SCSI] zfcp: Redesign of the debug tracing for SCSI records.") commit a54ca0f62f953898b05549391ac2a8a4dad6482b ("[SCSI] zfcp: Redesign of the debug tracing for HBA records.") Now we can distinguish written records again for offline level filtering. Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Fixes: 250a1352b95e ("[SCSI] zfcp: Redesign of the debug tracing for SCSI records.") Fixes: a54ca0f62f95 ("[SCSI] zfcp: Redesign of the debug tracing for HBA records.") Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
* zfcp: close window with unblocked rport during rport goneSteffen Maier2017-04-113-5/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 4eeaa4f3f1d6c47b69f70e222297a4df4743363e upstream. On a successful end of reopen port forced, zfcp_erp_strategy_followup_success() re-uses the port erp_action and the subsequent zfcp_erp_action_cleanup() now sees ZFCP_ERP_SUCCEEDED with erp_action->action==ZFCP_ERP_ACTION_REOPEN_PORT instead of ZFCP_ERP_ACTION_REOPEN_PORT_FORCED but must not perform zfcp_scsi_schedule_rport_register(). We can detect this because the fresh port reopen erp_action is in its very first step ZFCP_ERP_STEP_UNINITIALIZED. Otherwise this opens a time window with unblocked rport (until the followup port reopen recovery would block it again). If a scsi_cmnd timeout occurs during this time window fc_timed_out() cannot work as desired and such command would indeed time out and trigger scsi_eh. This prevents a clean and timely path failover. This should not happen if the path issue can be recovered on FC transport layer such as path issues involving RSCNs. Also, unnecessary and repeated DID_IMM_RETRY for pending and undesired new requests occur because internally zfcp still has its zfcp_port blocked. As follow-on errors with scsi_eh, it can cause, in the worst case, permanently lost paths due to one of: sd <scsidev>: [<scsidisk>] Medium access timeout failure. Offlining disk! sd <scsidev>: Device offlined - not ready after error recovery For fix validation and to aid future debugging with other recoveries we now also trace (un)blocking of rports. Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Fixes: 5767620c383a ("[SCSI] zfcp: Do not unblock rport from REOPEN_PORT_FORCED") Fixes: a2fa0aede07c ("[SCSI] zfcp: Block FC transport rports early on errors") Fixes: 5f852be9e11d ("[SCSI] zfcp: Fix deadlock between zfcp ERP and SCSI") Fixes: 338151e06608 ("[SCSI] zfcp: make use of fc_remote_port_delete when target port is unavailable") Fixes: 3859f6a248cb ("[PATCH] zfcp: add rports to enable scsi_add_device to work again") Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
* zfcp: fix ELS/GS request&response length for hardware data routerSteffen Maier2017-04-111-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 70369f8e15b220f50a16348c79a61d3f7054813c upstream. In the hardware data router case, introduced with kernel 3.2 commit 86a9668a8d29 ("[SCSI] zfcp: support for hardware data router") the ELS/GS request&response length needs to be initialized as in the chained SBAL case. Otherwise, the FCP channel rejects ELS requests with FSF_REQUEST_SIZE_TOO_LARGE. Such ELS requests can be issued by user space through BSG / HBA API, or zfcp itself uses ADISC ELS for remote port link test on RSCN. The latter can cause a short path outage due to unnecessary remote target port recovery because the always failing ADISC cannot detect extremely short path interruptions beyond the local FCP channel. Below example is decoded with zfcpdbf from s390-tools: Timestamp : ... Area : SAN Subarea : 00 Level : 1 Exception : - CPU id : .. Caller : zfcp_dbf_san_req+0408 Record id : 1 Tag : fssels1 Request id : 0x<reqid> Destination ID : 0x00<target d_id> Payload info : 52000000 00000000 <our wwpn > [ADISC] <our wwnn > 00<s_id> 00000000 00000000 00000000 00000000 00000000 Timestamp : ... Area : HBA Subarea : 00 Level : 1 Exception : - CPU id : .. Caller : zfcp_dbf_hba_fsf_res+0740 Record id : 1 Tag : fs_ferr Request id : 0x<reqid> Request status : 0x00000010 FSF cmnd : 0x0000000b [FSF_QTCB_SEND_ELS] FSF sequence no: 0x... FSF issued : ... FSF stat : 0x00000061 [FSF_REQUEST_SIZE_TOO_LARGE] FSF stat qual : 00000000 00000000 00000000 00000000 Prot stat : 0x00000100 Prot stat qual : 00000000 00000000 00000000 00000000 Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Fixes: 86a9668a8d29 ("[SCSI] zfcp: support for hardware data router") Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
* zfcp: fix fc_host port_type with NPIVSteffen Maier2017-04-111-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit bd77befa5bcff8c51613de271913639edf85fbc2 upstream. For an NPIV-enabled FCP device, zfcp can erroneously show "NPort (fabric via point-to-point)" instead of "NPIV VPORT" for the port_type sysfs attribute of the corresponding fc_host. s390-tools that can be affected are dbginfo.sh and ziomon. zfcp_fsf_exchange_config_evaluate() ignores fsf_qtcb_bottom_config.connection_features indicating NPIV and only sets fc_host_port_type to FC_PORTTYPE_NPORT if fsf_qtcb_bottom_config.fc_topology is FSF_TOPO_FABRIC. Only the independent zfcp_fsf_exchange_port_evaluate() evaluates connection_features to overwrite fc_host_port_type to FC_PORTTYPE_NPIV in case of NPIV. Code was introduced with upstream kernel 2.6.30 commit 0282985da5923fa6365adcc1a1586ae0c13c1617 ("[SCSI] zfcp: Report fc_host_port_type as NPIV"). This works during FCP device recovery (such as set online) because it performs FSF_QTCB_EXCHANGE_CONFIG_DATA followed by FSF_QTCB_EXCHANGE_PORT_DATA in sequence. However, the zfcp-specific scsi host sysfs attributes "requests", "megabytes", or "seconds_active" trigger only zfcp_fsf_exchange_config_evaluate() resetting fc_host port_type to FC_PORTTYPE_NPORT despite NPIV. The zfcp-specific scsi host sysfs attribute "utilization" triggers only zfcp_fsf_exchange_port_evaluate() correcting the fc_host port_type again in case of NPIV. Evaluate fsf_qtcb_bottom_config.connection_features in zfcp_fsf_exchange_config_evaluate() where it belongs to. Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Fixes: 0282985da592 ("[SCSI] zfcp: Report fc_host_port_type as NPIV") Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
* s390/dasd: fix hanging device after clear subchannelStefan Haberland2017-04-111-1/+9
| | | | | | | | | | | | | | | | | | commit 9ba333dc55cbb9523553df973adb3024d223e905 upstream. When a device is in a status where CIO has killed all I/O by itself the interrupt for a clear request may not contain an irb to determine the clear function. Instead it contains an error pointer -EIO. This was ignored by the DASD int_handler leading to a hanging device waiting for a clear interrupt. Handle -EIO error pointer correctly for requests that are clear pending and treat the clear as successful. Signed-off-by: Stefan Haberland <sth@linux.vnet.ibm.com> Reviewed-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
* 3.10.102-> 3.10.103Jan Engelmohr2016-09-102-0/+2
|
* Linux 3.10.99 (accumulative patch)Stefan Guendhoer2016-08-261-6/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit e39c17904aadf3a107b2bc292c03bfd9f850fd08 Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Date: Thu Mar 3 15:07:51 2016 -0800 Linux 3.10.99 commit d012f71377e1dad1165a0926c2920043e4047438 Author: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Date: Thu Feb 11 16:10:26 2016 -0500 xen/pcifront: Fix mysterious crashes when NUMA locality information was extracted. commit 4d8c8bd6f2062c9988817183a91fe2e623c8aa5e upstream. Occasionaly PV guests would crash with: pciback 0000:00:00.1: Xen PCI mapped GSI0 to IRQ16 BUG: unable to handle kernel paging request at 0000000d1a8c0be0 .. snip.. <ffffffff8139ce1b>] find_next_bit+0xb/0x10 [<ffffffff81387f22>] cpumask_next_and+0x22/0x40 [<ffffffff813c1ef8>] pci_device_probe+0xb8/0x120 [<ffffffff81529097>] ? driver_sysfs_add+0x77/0xa0 [<ffffffff815293e4>] driver_probe_device+0x1a4/0x2d0 [<ffffffff813c1ddd>] ? pci_match_device+0xdd/0x110 [<ffffffff81529657>] __device_attach_driver+0xa7/0xb0 [<ffffffff815295b0>] ? __driver_attach+0xa0/0xa0 [<ffffffff81527622>] bus_for_each_drv+0x62/0x90 [<ffffffff8152978d>] __device_attach+0xbd/0x110 [<ffffffff815297fb>] device_attach+0xb/0x10 [<ffffffff813b75ac>] pci_bus_add_device+0x3c/0x70 [<ffffffff813b7618>] pci_bus_add_devices+0x38/0x80 [<ffffffff813dc34e>] pcifront_scan_root+0x13e/0x1a0 [<ffffffff817a0692>] pcifront_backend_changed+0x262/0x60b [<ffffffff814644c6>] ? xenbus_gather+0xd6/0x160 [<ffffffff8120900f>] ? put_object+0x2f/0x50 [<ffffffff81465c1d>] xenbus_otherend_changed+0x9d/0xa0 [<ffffffff814678ee>] backend_changed+0xe/0x10 [<ffffffff81463a28>] xenwatch_thread+0xc8/0x190 [<ffffffff810f22f0>] ? woken_wake_function+0x10/0x10 which was the result of two things: When we call pci_scan_root_bus we would pass in 'sd' (sysdata) pointer which was an 'pcifront_sd' structure. However in the pci_device_add it expects that the 'sd' is 'struct sysdata' and sets the dev->node to what is in sd->node (offset 4): set_dev_node(&dev->dev, pcibus_to_node(bus)); __pcibus_to_node(const struct pci_bus *bus) { const struct pci_sysdata *sd = bus->sysdata; return sd->node; } However our structure was pcifront_sd which had nothing at that offset: struct pcifront_sd { int domain; /* 0 4 */ /* XXX 4 bytes hole, try to pack */ struct pcifront_device * pdev; /* 8 8 */ } That is an hole - filled with garbage as we used kmalloc instead of kzalloc (the second problem). This patch fixes the issue by: 1) Use kzalloc to initialize to a well known state. 2) Put 'struct pci_sysdata' at the start of 'pcifront_sd'. That way access to the 'node' will access the right offset. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 1b153db5ac6c1598022b2b11ffb3be1303e3bffb Author: Al Viro <viro@zeniv.linux.org.uk> Date: Sat Feb 27 19:17:33 2016 -0500 do_last(): don't let a bogus return value from ->open() et.al. to confuse us commit c80567c82ae4814a41287618e315a60ecf513be6 upstream. ... into returning a positive to path_openat(), which would interpret that as "symlink had been encountered" and proceed to corrupt memory, etc. It can only happen due to a bug in some ->open() instance or in some LSM hook, etc., so we report any such event *and* make sure it doesn't trick us into further unpleasantness. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit b19e7a870c21104186d5834d469b6e4a60d5cc6a Author: Simon Guinot <simon.guinot@sequanux.org> Date: Thu Sep 10 00:15:18 2015 +0200 kernel/resource.c: fix muxed resource handling in __request_region() commit 59ceeaaf355fa0fb16558ef7c24413c804932ada upstream. In __request_region, if a conflict with a BUSY and MUXED resource is detected, then the caller goes to sleep and waits for the resource to be released. A pointer on the conflicting resource is kept. At wake-up this pointer is used as a parent to retry to request the region. A first problem is that this pointer might well be invalid (if for example the conflicting resource have already been freed). Another problem is that the next call to __request_region() fails to detect a remaining conflict. The previously conflicting resource is passed as a parameter and __request_region() will look for a conflict among the children of this resource and not at the resource itself. It is likely to succeed anyway, even if there is still a conflict. Instead, the parent of the conflicting resource should be passed to __request_region(). As a fix, this patch doesn't update the parent resource pointer in the case we have to wait for a muxed region right after. Reported-and-tested-by: Vincent Pelletier <plr.vincent@gmail.com> Signed-off-by: Simon Guinot <simon.guinot@sequanux.org> Tested-by: Vincent Donnefort <vdonnefort@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit a7d9970fb5419b78310fb827615bd14bf7d94963 Author: Stefan Hajnoczi <stefanha@redhat.com> Date: Thu Feb 18 18:55:54 2016 +0000 sunrpc/cache: fix off-by-one in qword_get() commit b7052cd7bcf3c1478796e93e3dff2b44c9e82943 upstream. The qword_get() function NUL-terminates its output buffer. If the input string is in hex format \xXXXX... and the same length as the output buffer, there is an off-by-one: int qword_get(char **bpp, char *dest, int bufsize) { ... while (len < bufsize) { ... *dest++ = (h << 4) | l; len++; } ... *dest = '\0'; return len; } This patch ensures the NUL terminator doesn't fall outside the output buffer. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit eb63a905ff5f0d258693fd9991a94bc49188dcc3 Author: Steven Rostedt (Red Hat) <rostedt@goodmis.org> Date: Wed Feb 24 09:04:24 2016 -0500 tracing: Fix showing function event in available_events commit d045437a169f899dfb0f6f7ede24cc042543ced9 upstream. The ftrace:function event is only displayed for parsing the function tracer data. It is not used to enable function tracing, and does not include an "enable" file in its event directory. Originally, this event was kept separate from other events because it did not have a ->reg parameter. But perf added a "reg" parameter for its use which caused issues, because it made the event available to functions where it was not compatible for. Commit 9b63776fa3ca9 "tracing: Do not enable function event with enable" added a TRACE_EVENT_FL_IGNORE_ENABLE flag that prevented the function event from being enabled by normal trace events. But this commit missed keeping the function event from being displayed by the "available_events" directory, which is used to show what events can be enabled by set_event. One documented way to enable all events is to: cat available_events > set_event But because the function event is displayed in the available_events, this now causes an INVALID error: cat: write error: Invalid argument Reported-by: Chunyu Hu <chuhu@redhat.com> Fixes: 9b63776fa3ca9 "tracing: Do not enable function event with enable" Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit bd54a801362f13ac756b4de4bb65d1a48c7d5fad Author: Christian Borntraeger <borntraeger@de.ibm.com> Date: Fri Feb 19 13:11:46 2016 +0100 KVM: async_pf: do not warn on page allocation failures commit d7444794a02ff655eda87e3cc54e86b940e7736f upstream. In async_pf we try to allocate with NOWAIT to get an element quickly or fail. This code also handle failures gracefully. Lets silence potential page allocation failures under load. qemu-system-s39: page allocation failure: order:0,mode:0x2200000 [...] Call Trace: ([<00000000001146b8>] show_trace+0xf8/0x148) [<000000000011476a>] show_stack+0x62/0xe8 [<00000000004a36b8>] dump_stack+0x70/0x98 [<0000000000272c3a>] warn_alloc_failed+0xd2/0x148 [<000000000027709e>] __alloc_pages_nodemask+0x94e/0xb38 [<00000000002cd36a>] new_slab+0x382/0x400 [<00000000002cf7ac>] ___slab_alloc.constprop.30+0x2dc/0x378 [<00000000002d03d0>] kmem_cache_alloc+0x160/0x1d0 [<0000000000133db4>] kvm_setup_async_pf+0x6c/0x198 [<000000000013dee8>] kvm_arch_vcpu_ioctl_run+0xd48/0xd58 [<000000000012fcaa>] kvm_vcpu_ioctl+0x372/0x690 [<00000000002f66f6>] do_vfs_ioctl+0x3be/0x510 [<00000000002f68ec>] SyS_ioctl+0xa4/0xb8 [<0000000000781c5e>] system_call+0xd6/0x264 [<000003ffa24fa06a>] 0x3ffa24fa06a Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 9c8e8b9e50d8e19ae8cbc721cd0a4da8af17fa26 Author: Christoph Hellwig <hch@lst.de> Date: Mon Feb 8 21:11:50 2016 +0100 nfs: fix nfs_size_to_loff_t commit 50ab8ec74a153eb30db26529088bc57dd700b24c upstream. See http: //www.infradead.org/rpr.html X-Evolution-Source: 1451162204.2173.11@leira.trondhjem.org Content-Transfer-Encoding: 8bit Mime-Version: 1.0 We support OFFSET_MAX just fine, so don't round down below it. Also switch to using min_t to make the helper more readable. Signed-off-by: Christoph Hellwig <hch@lst.de> Fixes: 433c92379d9c ("NFS: Clean up nfs_size_to_loff_t()") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit dcc121e02f773b4b8fc88522810214af39ea8313 Author: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Date: Mon Jan 25 10:08:00 2016 -0600 PCI/AER: Flush workqueue on device remove to avoid use-after-free commit 4ae2182b1e3407de369f8c5d799543b7db74221b upstream. A Root Port's AER structure (rpc) contains a queue of events. aer_irq() enqueues AER status information and schedules aer_isr() to dequeue and process it. When we remove a device, aer_remove() waits for the queue to be empty, then frees the rpc struct. But aer_isr() references the rpc struct after dequeueing and possibly emptying the queue, which can cause a use-after-free error as in the following scenario with two threads, aer_isr() on the left and a concurrent aer_remove() on the right: Thread A Thread B -------- -------- aer_irq(): rpc->prod_idx++ aer_remove(): wait_event(rpc->prod_idx == rpc->cons_idx) # now blocked until queue becomes empty aer_isr(): # ... rpc->cons_idx++ # unblocked because queue is now empty ... kfree(rpc) mutex_unlock(&rpc->rpc_mutex) To prevent this problem, use flush_work() to wait until the last scheduled instance of aer_isr() has completed before freeing the rpc struct in aer_remove(). I reproduced this use-after-free by flashing a device FPGA and re-enumerating the bus to find the new device. With SLUB debug, this crashes with 0x6b bytes (POISON_FREE, the use-after-free magic number) in GPR25: pcieport 0000:00:00.0: AER: Multiple Corrected error received: id=0000 Unable to handle kernel paging request for data at address 0x27ef9e3e Workqueue: events aer_isr GPR24: dd6aa000 6b6b6b6b 605f8378 605f8360 d99b12c0 604fc674 606b1704 d99b12c0 NIP [602f5328] pci_walk_bus+0xd4/0x104 [bhelgaas: changelog, stable tag] Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 43a349917d77250e1fb7dbcb44e50a80b8cab026 Author: Tejun Heo <tj@kernel.org> Date: Mon Feb 1 11:33:21 2016 -0500 libata: fix sff host state machine locking while polling commit 8eee1d3ed5b6fc8e14389567c9a6f53f82bb7224 upstream. The bulk of ATA host state machine is implemented by ata_sff_hsm_move(). The function is called from either the interrupt handler or, if polling, a work item. Unlike from the interrupt path, the polling path calls the function without holding the host lock and ata_sff_hsm_move() selectively grabs the lock. This is completely broken. If an IRQ triggers while polling is in progress, the two can easily race and end up accessing the hardware and updating state machine state at the same time. This can put the state machine in an illegal state and lead to a crash like the following. kernel BUG at drivers/ata/libata-sff.c:1302! invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN Modules linked in: CPU: 1 PID: 10679 Comm: syz-executor Not tainted 4.5.0-rc1+ #300 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 task: ffff88002bd00000 ti: ffff88002e048000 task.ti: ffff88002e048000 RIP: 0010:[<ffffffff83a83409>] [<ffffffff83a83409>] ata_sff_hsm_move+0x619/0x1c60 ... Call Trace: <IRQ> [<ffffffff83a84c31>] __ata_sff_port_intr+0x1e1/0x3a0 drivers/ata/libata-sff.c:1584 [<ffffffff83a85611>] ata_bmdma_port_intr+0x71/0x400 drivers/ata/libata-sff.c:2877 [< inline >] __ata_sff_interrupt drivers/ata/libata-sff.c:1629 [<ffffffff83a85bf3>] ata_bmdma_interrupt+0x253/0x580 drivers/ata/libata-sff.c:2902 [<ffffffff81479f98>] handle_irq_event_percpu+0x108/0x7e0 kernel/irq/handle.c:157 [<ffffffff8147a717>] handle_irq_event+0xa7/0x140 kernel/irq/handle.c:205 [<ffffffff81484573>] handle_edge_irq+0x1e3/0x8d0 kernel/irq/chip.c:623 [< inline >] generic_handle_irq_desc include/linux/irqdesc.h:146 [<ffffffff811a92bc>] handle_irq+0x10c/0x2a0 arch/x86/kernel/irq_64.c:78 [<ffffffff811a7e4d>] do_IRQ+0x7d/0x1a0 arch/x86/kernel/irq.c:240 [<ffffffff86653d4c>] common_interrupt+0x8c/0x8c arch/x86/entry/entry_64.S:520 <EOI> [< inline >] rcu_lock_acquire include/linux/rcupdate.h:490 [< inline >] rcu_read_lock include/linux/rcupdate.h:874 [<ffffffff8164b4a1>] filemap_map_pages+0x131/0xba0 mm/filemap.c:2145 [< inline >] do_fault_around mm/memory.c:2943 [< inline >] do_read_fault mm/memory.c:2962 [< inline >] do_fault mm/memory.c:3133 [< inline >] handle_pte_fault mm/memory.c:3308 [< inline >] __handle_mm_fault mm/memory.c:3418 [<ffffffff816efb16>] handle_mm_fault+0x2516/0x49a0 mm/memory.c:3447 [<ffffffff8127dc16>] __do_page_fault+0x376/0x960 arch/x86/mm/fault.c:1238 [<ffffffff8127e358>] trace_do_page_fault+0xe8/0x420 arch/x86/mm/fault.c:1331 [<ffffffff8126f514>] do_async_page_fault+0x14/0xd0 arch/x86/kernel/kvm.c:264 [<ffffffff86655578>] async_page_fault+0x28/0x30 arch/x86/entry/entry_64.S:986 Fix it by ensuring that the polling path is holding the host lock before entering ata_sff_hsm_move() so that all hardware accesses and state updates are performed under the host lock. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-and-tested-by: Dmitry Vyukov <dvyukov@google.com> Link: http://lkml.kernel.org/g/CACT4Y+b_JsOxJu2EZyEf+mOXORc_zid5V1-pLZSroJVxyWdSpw@mail.gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 49662cfccb3f4ade7beb77a1f67f2c2db0027edc Author: Tejun Heo <tj@kernel.org> Date: Tue Feb 9 16:11:26 2016 -0500 Revert "workqueue: make sure delayed work run in local cpu" commit 041bd12e272c53a35c54c13875839bcb98c999ce upstream. This reverts commit 874bbfe600a660cba9c776b3957b1ce393151b76. Workqueue used to implicity guarantee that work items queued without explicit CPU specified are put on the local CPU. Recent changes in timer broke the guarantee and led to vmstat breakage which was fixed by 176bed1de5bf ("vmstat: explicitly schedule per-cpu work on the CPU we need it to run on"). vmstat is the most likely to expose the issue and it's quite possible that there are other similar problems which are a lot more difficult to trigger. As a preventive measure, 874bbfe600a6 ("workqueue: make sure delayed work run in local cpu") was applied to restore the local CPU guarnatee. Unfortunately, the change exposed a bug in timer code which got fixed by 22b886dd1018 ("timers: Use proper base migration in add_timer_on()"). Due to code restructuring, the commit couldn't be backported beyond certain point and stable kernels which only had 874bbfe600a6 started crashing. The local CPU guarantee was accidental more than anything else and we want to get rid of it anyway. As, with the vmstat case fixed, 874bbfe600a6 is causing more problems than it's fixing, it has been decided to take the chance and officially break the guarantee by reverting the commit. A debug feature will be added to force foreign CPU assignment to expose cases relying on the guarantee and fixes for the individual cases will be backported to stable as necessary. Signed-off-by: Tejun Heo <tj@kernel.org> Fixes: 874bbfe600a6 ("workqueue: make sure delayed work run in local cpu") Link: http://lkml.kernel.org/g/20160120211926.GJ10810@quack.suse.cz Cc: Mike Galbraith <umgwanakikbuti@gmail.com> Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Cc: Daniel Bilik <daniel.bilik@neosystem.cz> Cc: Jan Kara <jack@suse.cz> Cc: Shaohua Li <shli@fb.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Daniel Bilik <daniel.bilik@neosystem.cz> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 01e527fd4c73d05125d04ce3bfd4413eb5af2581 Author: Johannes Berg <johannes.berg@intel.com> Date: Tue Jan 26 11:29:03 2016 +0100 rfkill: fix rfkill_fop_read wait_event usage commit 6736fde9672ff6717ac576e9bba2fd5f3dfec822 upstream. The code within wait_event_interruptible() is called with !TASK_RUNNING, so mustn't call any functions that can sleep, like mutex_lock(). Since we re-check the list_empty() in a loop after the wait, it's safe to simply use list_empty() without locking. This bug has existed forever, but was only discovered now because all userspace implementations, including the default 'rfkill' tool, use poll() or select() to get a readable fd before attempting to read. Fixes: c64fb01627e24 ("rfkill: create useful userspace interface") Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit eb80decbf08882c0dc6573d329fd9b5c1aff3c31 Author: Oliver Neukum <oneukum@suse.com> Date: Mon Jan 18 15:45:18 2016 +0100 cdc-acm:exclude Samsung phone 04e8:685d commit e912e685f372ab62a2405a1acd923597f524e94a upstream. This phone needs to be handled by a specialised firmware tool and is reported to crash irrevocably if cdc-acm takes it. Signed-off-by: Oliver Neukum <oneukum@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 636a9c8a87da5056b4254ff9eaf67cf52c8c2d1d Author: Ilya Dryomov <idryomov@gmail.com> Date: Wed Feb 17 20:04:08 2016 +0100 libceph: don't bail early from try_read() when skipping a message commit e7a88e82fe380459b864e05b372638aeacb0f52d upstream. The contract between try_read() and try_write() is that when called each processes as much data as possible. When instructed by osd_client to skip a message, try_read() is violating this contract by returning after receiving and discarding a single message instead of checking for more. try_write() then gets a chance to write out more requests, generating more replies/skips for try_read() to handle, forcing the messenger into a starvation loop. Reported-by: Varada Kari <Varada.Kari@sandisk.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Tested-by: Varada Kari <Varada.Kari@sandisk.com> Reviewed-by: Alex Elder <elder@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit b6c92a436f3e930f7d7d8d456cf3ac36602039cf Author: Mike Marciniszyn <mike.marciniszyn@intel.com> Date: Thu Jan 7 16:44:10 2016 -0500 IB/qib: fix mcast detach when qp not attached commit 09dc9cd6528f5b52bcbd3292a6312e762c85260f upstream. The code produces the following trace: [1750924.419007] general protection fault: 0000 [#3] SMP [1750924.420364] Modules linked in: nfnetlink autofs4 rpcsec_gss_krb5 nfsv4 dcdbas rfcomm bnep bluetooth nfsd auth_rpcgss nfs_acl dm_multipath nfs lockd scsi_dh sunrpc fscache radeon ttm drm_kms_helper drm serio_raw parport_pc ppdev i2c_algo_bit lpc_ich ipmi_si ib_mthca ib_qib dca lp parport ib_ipoib mac_hid ib_cm i3000_edac ib_sa ib_uverbs edac_core ib_umad ib_mad ib_core ib_addr tg3 ptp dm_mirror dm_region_hash dm_log psmouse pps_core [1750924.420364] CPU: 1 PID: 8401 Comm: python Tainted: G D 3.13.0-39-generic #66-Ubuntu [1750924.420364] Hardware name: Dell Computer Corporation PowerEdge 860/0XM089, BIOS A04 07/24/2007 [1750924.420364] task: ffff8800366a9800 ti: ffff88007af1c000 task.ti: ffff88007af1c000 [1750924.420364] RIP: 0010:[<ffffffffa0131d51>] [<ffffffffa0131d51>] qib_mcast_qp_free+0x11/0x50 [ib_qib] [1750924.420364] RSP: 0018:ffff88007af1dd70 EFLAGS: 00010246 [1750924.420364] RAX: 0000000000000001 RBX: ffff88007b822688 RCX: 000000000000000f [1750924.420364] RDX: ffff88007b822688 RSI: ffff8800366c15a0 RDI: 6764697200000000 [1750924.420364] RBP: ffff88007af1dd78 R08: 0000000000000001 R09: 0000000000000000 [1750924.420364] R10: 0000000000000011 R11: 0000000000000246 R12: ffff88007baa1d98 [1750924.420364] R13: ffff88003ecab000 R14: ffff88007b822660 R15: 0000000000000000 [1750924.420364] FS: 00007ffff7fd8740(0000) GS:ffff88007fc80000(0000) knlGS:0000000000000000 [1750924.420364] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [1750924.420364] CR2: 00007ffff597c750 CR3: 000000006860b000 CR4: 00000000000007e0 [1750924.420364] Stack: [1750924.420364] ffff88007b822688 ffff88007af1ddf0 ffffffffa0132429 000000007af1de20 [1750924.420364] ffff88007baa1dc8 ffff88007baa0000 ffff88007af1de70 ffffffffa00cb313 [1750924.420364] 00007fffffffde88 0000000000000000 0000000000000008 ffff88003ecab000 [1750924.420364] Call Trace: [1750924.420364] [<ffffffffa0132429>] qib_multicast_detach+0x1e9/0x350 [ib_qib] [1750924.568035] [<ffffffffa00cb313>] ? ib_uverbs_modify_qp+0x323/0x3d0 [ib_uverbs] [1750924.568035] [<ffffffffa0092d61>] ib_detach_mcast+0x31/0x50 [ib_core] [1750924.568035] [<ffffffffa00cc213>] ib_uverbs_detach_mcast+0x93/0x170 [ib_uverbs] [1750924.568035] [<ffffffffa00c61f6>] ib_uverbs_write+0xc6/0x2c0 [ib_uverbs] [1750924.568035] [<ffffffff81312e68>] ? apparmor_file_permission+0x18/0x20 [1750924.568035] [<ffffffff812d4cd3>] ? security_file_permission+0x23/0xa0 [1750924.568035] [<ffffffff811bd214>] vfs_write+0xb4/0x1f0 [1750924.568035] [<ffffffff811bdc49>] SyS_write+0x49/0xa0 [1750924.568035] [<ffffffff8172f7ed>] system_call_fastpath+0x1a/0x1f [1750924.568035] Code: 66 2e 0f 1f 84 00 00 00 00 00 31 c0 5d c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb 48 8b 7f 10 <f0> ff 8f 40 01 00 00 74 0e 48 89 df e8 8e f8 06 e1 5b 5d c3 0f [1750924.568035] RIP [<ffffffffa0131d51>] qib_mcast_qp_free+0x11/0x50 [ib_qib] [1750924.568035] RSP <ffff88007af1dd70> [1750924.650439] ---[ end trace 73d5d4b3f8ad4851 ] The fix is to note the qib_mcast_qp that was found. If none is found, then return EINVAL indicating the error. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reported-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit a81bfc00a3f9eb778bad0d97551d91c332ec2f1b Author: Rasmus Villemoes <linux@rasmusvillemoes.dk> Date: Mon Feb 15 19:41:47 2016 +0100 drm/radeon: use post-decrement in error handling commit bc3f5d8c4ca01555820617eb3b6c0857e4df710d upstream. We need to use post-decrement to get the pci_map_page undone also for i==0, and to avoid some very unpleasant behaviour if pci_map_page failed already at i==0. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 8d5e1e5af0c667545c202e8f4051f77aa3bf31b7 Author: Nicolai Hähnle <nicolai.haehnle@amd.com> Date: Fri Feb 5 14:35:53 2016 -0500 drm/radeon: hold reference to fences in radeon_sa_bo_new commit f6ff4f67cdf8455d0a4226eeeaf5af17c37d05eb upstream. An arbitrary amount of time can pass between spin_unlock and radeon_fence_wait_any, so we need to ensure that nobody frees the fences from under us. Based on the analogous fix for amdgpu. Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit cb071fb6ace47b8422edd02ee91116d4f31c2c92 Author: Alex Deucher <alexander.deucher@amd.com> Date: Thu Dec 17 12:52:17 2015 -0500 drm/radeon: clean up fujitsu quirks commit 0eb1c3d4084eeb6fb3a703f88d6ce1521f8fcdd1 upstream. Combine the two quirks. bug: https://bugzilla.kernel.org/show_bug.cgi?id=109481 Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit a1c393a2a324f2a1210a72fa60e5c9da03b1f1ce Author: Rob Clark <robdclark@gmail.com> Date: Wed Oct 15 15:00:47 2014 -0400 drm/vmwgfx: respect 'nomodeset' commit 96c5d076f0a5e2023ecdb44d8261f87641ee71e0 upstream. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>. Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 39e88dd4da3ddc4c07150fe75d9590a648d0eb0f Author: Dmitry V. Levin <ldv@altlinux.org> Date: Sun Dec 27 02:13:27 2015 +0300 sparc64: fix incorrect sign extension in sys_sparc64_personality commit 525fd5a94e1be0776fa652df5c687697db508c91 upstream. The value returned by sys_personality has type "long int". It is saved to a variable of type "int", which is not a problem yet because the type of task_struct->pesonality is "unsigned int". The problem is the sign extension from "int" to "long int" that happens on return from sys_sparc64_personality. For example, a userspace call personality((unsigned) -EINVAL) will result to any subsequent personality call, including absolutely harmless read-only personality(0xffffffff) call, failing with errno set to EINVAL. Signed-off-by: Dmitry V. Levin <ldv@altlinux.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit bf64271c53bfc54deeebf7ffa69d2df03ae58cf5 Author: Linus Walleij <linus.walleij@linaro.org> Date: Mon Jan 4 02:21:55 2016 +0100 mmc: mmci: fix an ages old detection error commit 0bcb7efdff63564e80fe84dd36a9fbdfbf6697a4 upstream. commit 4956e10903fd ("ARM: 6244/1: mmci: add variant data and default MCICLOCK support") added variant data for ARM, U300 and Ux500 variants. The Nomadik NHK8815/8820 variant was erroneously labeled as a U300 variant, and when the proper Nomadik variant was later introduced in commit 34fd421349ff ("ARM: 7378/1: mmci: add support for the Nomadik MMCI variant") this was not fixes. Let's say this fixes the latter commit as there was no proper Nomadik support until then. Fixes: 34fd421349ff ("ARM: 7378/1: mmci: add support for the Nomadik...") Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 794c33bd215bbd1f7a3513f9c70a8e0afcbfcd7a Author: Richard Cochran <richardcochran@gmail.com> Date: Tue Dec 22 22:19:58 2015 +0100 posix-clock: Fix return code on the poll method's error path commit 1b9f23727abb92c5e58f139e7d180befcaa06fe0 upstream. The posix_clock_poll function is supposed to return a bit mask of POLLxxx values. However, in case the hardware has disappeared (due to hot plugging for example) this code returns -ENODEV in a futile attempt to throw an error at the file descriptor level. The kernel's file_operations interface does not accept such error codes from the poll method. Instead, this function aught to return POLLERR. The value -ENODEV does, in fact, contain the POLLERR bit (and almost all the other POLLxxx bits as well), but only by chance. This patch fixes code to return a proper bit mask. Credit goes to Markus Elfring for pointing out the suspicious signed/unsigned mismatch. Reported-by: Markus Elfring <elfring@users.sourceforge.net> igned-off-by: Richard Cochran <richardcochran@gmail.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Julia Lawall <julia.lawall@lip6.fr> Link: http://lkml.kernel.org/r/1450819198-17420-1-git-send-email-richardcochran@gmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit eeb7b0e01658684a743ccfd66a668e8a56d5ebb9 Author: Mikulas Patocka <mpatocka@redhat.com> Date: Fri Jan 8 19:07:55 2016 -0500 dm snapshot: fix hung bios when copy error occurs commit 385277bfb57faac44e92497104ba542cdd82d5fe upstream. When there is an error copying a chunk dm-snapshot can incorrectly hold associated bios indefinitely, resulting in hung IO. The function copy_callback sets pe->error if there was error copying the chunk, and then calls complete_exception. complete_exception calls pending_complete on error, otherwise it calls commit_exception with commit_callback (and commit_callback calls complete_exception). The persistent exception store (dm-snap-persistent.c) assumes that calls to prepare_exception and commit_exception are paired. persistent_prepare_exception increases ps->pending_count and persistent_commit_exception decreases it. If there is a copy error, persistent_prepare_exception is called but persistent_commit_exception is not. This results in the variable ps->pending_count never returning to zero and that causes some pending exceptions (and their associated bios) to be held forever. Fix this by unconditionally calling commit_exception regardless of whether the copy was successful. A new "valid" parameter is added to commit_exception -- when the copy fails this parameter is set to zero so that the chunk that failed to copy (and all following chunks) is not recorded in the snapshot store. Also, remove commit_callback now that it is merely a wrapper around pending_complete. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 2cd529968dd0647bf75e24f4e36fef99fc536b58 Author: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Date: Wed Feb 3 17:33:48 2016 -0200 tda1004x: only update the frontend properties if locked commit e8beb02343e7582980c6705816cd957cf4f74c7a upstream. The tda1004x was updating the properties cache before locking. If the device is not locked, the data at the registers are just random values with no real meaning. This caused the driver to fail with libdvbv5, as such library calls GET_PROPERTY from time to time, in order to return the DVB stats. Tested with a saa7134 card 78: ASUSTeK P7131 Dual, vendor PCI ID: 1043:4862 Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 590d6fd45445a32e6efecd8158a96e8dd09f8281 Author: Antonio Ospite <ao2@ao2.it> Date: Fri Oct 2 17:33:13 2015 -0300 gspca: ov534/topro: prevent a division by 0 commit dcc7fdbec53a960588f2c40232db2c6466c09917 upstream. v4l2-compliance sends a zeroed struct v4l2_streamparm in v4l2-test-formats.cpp::testParmType(), and this results in a division by 0 in some gspca subdrivers: divide error: 0000 [#1] SMP Modules linked in: gspca_ov534 gspca_main ... CPU: 0 PID: 17201 Comm: v4l2-compliance Not tainted 4.3.0-rc2-ao2 #1 Hardware name: System manufacturer System Product Name/M2N-E SLI, BIOS ASUS M2N-E SLI ACPI BIOS Revision 1301 09/16/2010 task: ffff8800818306c0 ti: ffff880095c4c000 task.ti: ffff880095c4c000 RIP: 0010:[<ffffffffa079bd62>] [<ffffffffa079bd62>] sd_set_streamparm+0x12/0x60 [gspca_ov534] RSP: 0018:ffff880095c4fce8 EFLAGS: 00010296 RAX: 0000000000000000 RBX: ffff8800c9522000 RCX: ffffffffa077a140 RDX: 0000000000000000 RSI: ffff880095e0c100 RDI: ffff8800c9522000 RBP: ffff880095e0c100 R08: ffffffffa077a100 R09: 00000000000000cc R10: ffff880067ec7740 R11: 0000000000000016 R12: ffffffffa07bb400 R13: 0000000000000000 R14: ffff880081b6a800 R15: 0000000000000000 FS: 00007fda0de78740(0000) GS:ffff88012fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000014630f8 CR3: 00000000cf349000 CR4: 00000000000006f0 Stack: ffffffffa07a6431 ffff8800c9522000 ffffffffa077656e 00000000c0cc5616 ffff8800c9522000 ffffffffa07a5e20 ffff880095e0c100 0000000000000000 ffff880067ec7740 ffffffffa077a140 ffff880067ec7740 0000000000000016 Call Trace: [<ffffffffa07a6431>] ? v4l_s_parm+0x21/0x50 [videodev] [<ffffffffa077656e>] ? vidioc_s_parm+0x4e/0x60 [gspca_main] [<ffffffffa07a5e20>] ? __video_do_ioctl+0x280/0x2f0 [videodev] [<ffffffffa07a5ba0>] ? video_ioctl2+0x20/0x20 [videodev] [<ffffffffa07a59b9>] ? video_usercopy+0x319/0x4e0 [videodev] [<ffffffff81182dc1>] ? page_add_new_anon_rmap+0x71/0xa0 [<ffffffff811afb92>] ? mem_cgroup_commit_charge+0x52/0x90 [<ffffffff81179b18>] ? handle_mm_fault+0xc18/0x1680 [<ffffffffa07a15cc>] ? v4l2_ioctl+0xac/0xd0 [videodev] [<ffffffff811c846f>] ? do_vfs_ioctl+0x28f/0x480 [<ffffffff811c86d4>] ? SyS_ioctl+0x74/0x80 [<ffffffff8154a8b6>] ? entry_SYSCALL_64_fastpath+0x16/0x75 Code: c7 93 d9 79 a0 5b 5d e9 f1 f3 9a e0 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 53 31 d2 48 89 fb 48 83 ec 08 8b 46 10 <f7> 76 0c 80 bf ac 0c 00 00 00 88 87 4e 0e 00 00 74 09 80 bf 4f RIP [<ffffffffa079bd62>] sd_set_streamparm+0x12/0x60 [gspca_ov534] RSP <ffff880095c4fce8> ---[ end trace 279710c2c6c72080 ]--- Following what the doc says about a zeroed timeperframe (see http://www.linuxtv.org/downloads/v4l-dvb-apis/vidioc-g-parm.html): ... To reset manually applications can just set this field to zero. fix the issue by resetting the frame rate to a default value in case of an unusable timeperframe. The fix is done in the subdrivers instead of gspca.c because only the subdrivers have notion of a default frame rate to reset the camera to. Signed-off-by: Antonio Ospite <ao2@ao2.it> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 1b29bcfbae9971d931fd65f6ae075d394907517b Author: Malcolm Priestley <tvboxspy@gmail.com> Date: Mon Aug 31 06:13:45 2015 -0300 media: dvb-core: Don't force CAN_INVERSION_AUTO in oneshot mode commit c9d57de6103e343f2d4e04ea8d9e417e10a24da7 upstream. When in FE_TUNE_MODE_ONESHOT the frontend must report the actual capabilities so user can take appropriate action. With frontends that can't do auto inversion this is done by dvb-core automatically so CAN_INVERSION_AUTO is valid. However, when in FE_TUNE_MODE_ONESHOT this is not true. So only set FE_CAN_INVERSION_AUTO in modes other than FE_TUNE_MODE_ONESHOT Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 4c77a51256e0b05b0948c1f5dd06dfb2b5abe489 Author: Vegard Nossum <vegard.nossum@oracle.com> Date: Wed Dec 16 21:59:56 2015 +0100 uml: fix hostfs mknod() commit 9f2dfda2f2f1c6181c3732c16b85c59ab2d195e0 upstream. An inverted return value check in hostfs_mknod() caused the function to return success after handling it as an error (and cleaning up). It resulted in the following segfault when trying to bind() a named unix socket: Pid: 198, comm: a.out Not tainted 4.4.0-rc4 RIP: 0033:[<0000000061077df6>] RSP: 00000000daae5d60 EFLAGS: 00010202 RAX: 0000000000000000 RBX: 000000006092a460 RCX: 00000000dfc54208 RDX: 0000000061073ef1 RSI: 0000000000000070 RDI: 00000000e027d600 RBP: 00000000daae5de0 R08: 00000000da980ac0 R09: 0000000000000000 R10: 0000000000000003 R11: 00007fb1ae08f72a R12: 0000000000000000 R13: 000000006092a460 R14: 00000000daaa97c0 R15: 00000000daaa9a88 Kernel panic - not syncing: Kernel mode fault at addr 0x40, ip 0x61077df6 CPU: 0 PID: 198 Comm: a.out Not tainted 4.4.0-rc4 #1 Stack: e027d620 dfc54208 0000006f da981398 61bee000 0000c1ed daae5de0 0000006e e027d620 dfcd4208 00000005 6092a460 Call Trace: [<60dedc67>] SyS_bind+0xf7/0x110 [<600587be>] handle_syscall+0x7e/0x80 [<60066ad7>] userspace+0x3e7/0x4e0 [<6006321f>] ? save_registers+0x1f/0x40 [<6006c88e>] ? arch_prctl+0x1be/0x1f0 [<60054985>] fork_handler+0x85/0x90 Let's also get rid of the "cosmic ray protection" while we're at it. Fixes: e9193059b1b3 "hostfs: fix races in dentry_name() and inode_name()" Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 9cbb43b99bf138e44deef9957678bc464f3bfd82 Author: Vegard Nossum <vegard.nossum@oracle.com> Date: Fri Dec 18 21:28:53 2015 +0100 uml: flush stdout before forking commit 0754fb298f2f2719f0393491d010d46cfb25d043 upstream. I was seeing some really weird behaviour where piping UML's output somewhere would cause output to get duplicated: $ ./vmlinux | head -n 40 Checking that ptrace can change system call numbers...Core dump limits : soft - 0 hard - NONE OK Checking syscall emulation patch for ptrace...Core dump limits : soft - 0 hard - NONE OK Checking advanced syscall emulation patch for ptrace...Core dump limits : soft - 0 hard - NONE OK Core dump limits : soft - 0 hard - NONE This is because these tests do a fork() which duplicates the non-empty stdout buffer, then glibc flushes the duplicated buffer as each child exits. A simple workaround is to flush before forking. Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 509a4bccfecaaf92af4bc27b9b212fa4a24e756c Author: Stefan Haberland <stefan.haberland@de.ibm.com> Date: Tue Dec 15 10:45:05 2015 +0100 s390/dasd: fix refcount for PAV reassignment commit 9d862ababb609439c5d6987f6d3ddd09e703aa0b upstream. Add refcount to the DASD device when a summary unit check worker is scheduled. This prevents that the device is set offline with worker in place. Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 9155d3f258f83643bad27297e7539a67bb81122c Author: Stefan Haberland <stefan.haberland@de.ibm.com> Date: Tue Dec 15 10:16:43 2015 +0100 s390/dasd: prevent incorrect length error under z/VM after PAV changes commit 020bf042e5b397479c1174081b935d0ff15d1a64 upstream. The channel checks the specified length and the provided amount of data for CCWs and provides an incorrect length error if the size does not match. Under z/VM with simulation activated the length may get changed. Having the suppress length indication bit set is stated as good CCW coding practice and avoids errors under z/VM. Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 396a61bef1418705af82ab7b5d1e1a193a699dd2 Author: Ard Biesheuvel <ard.biesheuvel@linaro.org> Date: Fri Jan 1 13:39:22 2016 +0100 s390: fix normalization bug in exception table sorting commit bcb7825a77f41c7dd91da6f7ac10b928156a322e upstream. The normalization pass in the sorting routine of the relative exception table serves two purposes: - it ensures that the address fields of the exception table entries are fully ordered, so that no ambiguities arise between entries with identical instruction offsets (i.e., when two instructions that are exactly 8 bytes apart each have an exception table entry associated with them) - it ensures that the offsets of both the instruction and the fixup fields of each entry are relative to their final location after sorting. Commit eb608fb366de ("s390/exceptions: switch to relative exception table entries") ported the relative exception table format from x86, but modified the sorting routine to only normalize the instruction offset field and not the fixup offset field. The result is that the fixup offset of each entry will be relative to the original location of the entry before sorting, likely leading to crashes when those entries are dereferenced. Fixes: eb608fb366de ("s390/exceptions: switch to relative exception table entries") Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit dbf7fab09e9e0612a23ec7fee9570663b21fdfc9 Author: Filipe Manana <fdmanana@suse.com> Date: Thu Dec 31 18:16:29 2015 +0000 Btrfs: fix number of transaction units required to create symlink commit 9269d12b2d57d9e3d13036bb750762d1110d425c upstream. We weren't accounting for the insertion of an inline extent item for the symlink inode nor that we need to update the parent inode item (through the call to btrfs_add_nondir()). So fix this by including two more transaction units. Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 1f81573daa963299aaf5692de7dfc39185f8a96d Author: Filipe Manana <fdmanana@suse.com> Date: Thu Dec 31 18:07:59 2015 +0000 Btrfs: send, don't BUG_ON() when an empty symlink is found commit a879719b8c90e15c9e7fa7266d5e3c0ca962f9df upstream. When a symlink is successfully created it always has an inline extent containing the source path. However if an error happens when creating the symlink, we can leave in the subvolume's tree a symlink inode without any such inline extent item - this happens if after btrfs_symlink() calls btrfs_end_transaction() and before it calls the inode eviction handler (through the final iput() call), the transaction gets committed and a crash happens before the eviction handler gets called, or if a snapshot of the subvolume is made before the eviction handler gets called. Sadly we can't just avoid this by making btrfs_symlink() call btrfs_end_transaction() after it calls the eviction handler, because the later can commit the current transaction before it removes any items from the subvolume tree (if it encounters ENOSPC errors while reserving space for removing all the items). So make send fail more gracefully, with an -EIO error, and print a message to dmesg/syslog informing that there's an empty symlink inode, so that the user can delete the empty symlink or do something else about it. Reported-by: Stephen R. van den Berg <srb@cuci.nl> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit d6ecb7ece33b2da65dffa50ce13f28fec392ec54 Author: Josef Bacik <jbacik@fb.com> Date: Thu Oct 22 15:05:09 2015 -0400 Btrfs: igrab inode in writepage commit be7bd730841e69fe8f70120098596f648cd1f3ff upstream. We hit this panic on a few of our boxes this week where we have an ordered_extent with an NULL inode. We do an igrab() of the inode in writepages, but weren't doing it in writepage which can be called directly from the VM on dirty pages. If the inode has been unlinked then we could have I_FREEING set which means igrab() would return NULL and we get this panic. Fix this by trying to igrab in btrfs_writepage, and if it returns NULL then just redirty the page and return AOP_WRITEPAGE_ACTIVATE; so the VM knows it wasn't successful. Thanks, Signed-off-by: Josef Bacik <jbacik@fb.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit bf00d124e07fda629e60709c6187437d66848a9d Author: Anand Jain <anand.jain@oracle.com> Date: Wed Oct 7 17:23:23 2015 +0800 Btrfs: add missing brelse when superblock checksum fails commit b2acdddfad13c38a1e8b927d83c3cf321f63601a upstream. Looks like oversight, call brelse() when checksum fails. Further down the code, in the non error path, we do call brelse() and so we don't see brelse() in the goto error paths. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 099e9d4f2c57fba73d62db6764e09644636cbbc2 Author: Russell King <rmk+kernel@arm.linux.org.uk> Date: Fri Dec 11 12:09:03 2015 +0000 scripts: recordmcount: break hardlinks commit dd39a26538e37f6c6131e829a4a510787e43c783 upstream. recordmcount edits the file in-place, which can cause problems when using ccache in hardlink mode. Arrange for recordmcount to break a hardlinked object. Link: http://lkml.kernel.org/r/E1a7MVT-0000et-62@rmk-PC.arm.linux.org.uk Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit f80e6add955c84db83cc5a230c967635f0a808b9 Author: James Bottomley <James.Bottomley@HansenPartnership.com> Date: Fri Dec 11 09:16:38 2015 -0800 ses: fix additional element traversal bug commit 5e1033561da1152c57b97ee84371dba2b3d64c25 upstream. KASAN found that our additional element processing scripts drop off the end of the VPD page into unallocated space. The reason is that not every element has additional information but our traversal routines think they do, leading to them expecting far more additional information than is present. Fix this by adding a gate to the traversal routine so that it only processes elements that are expected to have additional information (list is in SES-2 section 6.1.13.1: Additional Element Status diagnostic page overview) Reported-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Tested-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit b8569305e453645f1227a627ec1ced1c68291d63 Author: James Bottomley <James.Bottomley@HansenPartnership.com> Date: Tue Dec 8 09:00:31 2015 -0800 ses: Fix problems with simple enclosures commit 3417c1b5cb1fdc10261dbed42b05cc93166a78fd upstream. Simple enclosure implementations (mostly USB) are allowed to return only page 8 to every diagnostic query. That really confuses our implementation because we assume the return is the page we asked for and end up doing incorrect offsets based on bogus information leading to accesses outside of allocated ranges. Fix that by checking the page code of the return and giving an error if it isn't the one we asked for. This should fix reported bugs with USB storage by simply refusing to attach to enclosures that behave like this. It's also good defensive practise now that we're starting to see more USB enclosures. Reported-by: Andrea Gelmini <andrea.gelmini@gelma.net> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit ffb785e178acd0d965e4338c561f82bdb2d054b6 Author: Johannes Berg <johannes.berg@intel.com> Date: Thu Dec 10 10:37:51 2015 +0100 rfkill: copy the name into the rfkill struct commit b7bb110008607a915298bf0f47d25886ecb94477 upstream. Some users of rfkill, like NFC and cfg80211, use a dynamic name when allocating rfkill, in those cases dev_name(). Therefore, the pointer passed to rfkill_alloc() might not be valid forever, I specifically found the case that the rfkill name was quite obviously an invalid pointer (or at least garbage) when the wiphy had been renamed. Fix this by making a copy of the rfkill name in rfkill_alloc(). Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 9cec78832326106ebd03bca2235a461aa6fac804 Author: Kirill A. Shutemov <kirill@shutemov.name> Date: Mon Nov 30 04:17:31 2015 +0200 vgaarb: fix signal handling in vga_get() commit 9f5bd30818c42c6c36a51f93b4df75a2ea2bd85e upstream. There are few defects in vga_get() related to signal hadning: - we shouldn't check for pending signals for TASK_UNINTERRUPTIBLE case; - if we found pending signal we must remove ourself from wait queue and change task state back to running; - -ERESTARTSYS is more appropriate, I guess. Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name> Reviewed-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 859ac05006374e785820017deba204859bbdf1ad Author: Joe Thornber <ejt@redhat.com> Date: Thu Dec 10 14:37:53 2015 +0000 dm btree: fix bufio buffer leaks in dm_btree_del() error path commit ed8b45a3679eb49069b094c0711b30833f27c734 upstream. If dm_btree_del()'s call to push_frame() fails, e.g. due to btree_node_validator finding invalid metadata, the dm_btree_del() error path must unlock all frames (which have active dm-bufio buffers) that were pushed onto the del_stack. Otherwise, dm_bufio_client_destroy() will BUG_ON() because dm-bufio buffers have leaked, e.g.: device-mapper: bufio: leaked buffer 3, hold count 1, list 0 Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit d6eab6f74f15e0f38d875d523f74e6bc55492ba7 Author: Mikulas Patocka <mpatocka@redhat.com> Date: Thu Nov 26 12:00:59 2015 -0500 sata_sil: disable trim commit d98f1cd0a3b70ea91f1dfda3ac36c3b2e1a4d5e2 upstream. When I connect an Intel SSD to SATA SIL controller (PCI ID 1095:3114), any TRIM command results in I/O errors being reported in the log. There is other similar error reported with TRIM and the SIL controller: https://bugs.centos.org/view.php?id=5880 Apparently the controller doesn't support TRIM commands. This patch disables TRIM support on the SATA SIL controller. ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 ata7.00: BMDMA2 stat 0x50001 ata7.00: failed command: DATA SET MANAGEMENT ata7.00: cmd 06/01:01:00:00:00/00:00:00:00:00/a0 tag 0 dma 512 out res 51/04:01:00:00:00/00:00:00:00:00/a0 Emask 0x1 (device error) ata7.00: status: { DRDY ERR } ata7.00: error: { ABRT } ata7.00: device reported invalid CHS sector 0 sd 8:0:0:0: [sdb] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE sd 8:0:0:0: [sdb] tag#0 Sense Key : Illegal Request [current] [descriptor] sd 8:0:0:0: [sdb] tag#0 Add. Sense: Unaligned write command sd 8:0:0:0: [sdb] tag#0 CDB: Write same(16) 93 08 00 00 00 00 00 21 95 88 00 20 00 00 00 00 blk_update_request: I/O error, dev sdb, sector 2200968 Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 6bf97b05008739ad7d644758301a427460479537 Author: Sasha Levin <sasha.levin@oracle.com> Date: Mon Nov 30 20:34:20 2015 -0500 sched/core: Remove false-positive warning from wake_up_process() commit 119d6f6a3be8b424b200dcee56e74484d5445f7e upstream. Because wakeups can (fundamentally) be late, a task might not be in the expected state. Therefore testing against a task's state is racy, and can yield false positives. Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: oleg@redhat.com Fixes: 9067ac85d533 ("wake_up_process() should be never used to wakeup a TASK_STOPPED/TRACED task") Link: http://lkml.kernel.org/r/1448933660-23082-1-git-send-email-sasha.levin@oracle.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 4af5f2490fb0cab9e29f8b45b81cd69aa4d9babe Author: Mirza Krak <mirza.krak@hostmobility.com> Date: Tue Nov 10 14:59:34 2015 +0100 can: sja1000: clear interrupts on start commit 7cecd9ab80f43972c056dc068338f7bcc407b71c upstream. According to SJA1000 data sheet error-warning (EI) interrupt is not cleared by setting the controller in to reset-mode. Then if we have the following case: - system is suspended (echo mem > /sys/power/state) and SJA1000 is left in operating state - A bus error condition occurs which activates EI interrupt, system is still suspended which means EI interrupt will be not be handled nor cleared. If the above two events occur, on resume there is no way to return the SJA1000 to operating state, except to cycle power to it. By simply reading the IR register on start we will clear any previous conditions that could be present. Signed-off-by: Mirza Krak <mirza.krak@hostmobility.com> Reported-by: Christian Magnusson <Christian.Magnusson@semcon.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit e2f3d50558b8e5deeaacf4990e478fc844444de5 Author: Quentin Casasnovas <quentin.casasnovas@oracle.com> Date: Tue Nov 24 17:13:21 2015 -0500 RDS: fix race condition when sending a message on unbound socket commit 8c7188b23474cca017b3ef354c4a58456f68303a upstream. Sasha's found a NULL pointer dereference in the RDS connection code when sending a message to an apparently unbound socket. The problem is caused by the code checking if the socket is bound in rds_sendmsg(), which checks the rs_bound_addr field without taking a lock on the socket. This opens a race where rs_bound_addr is temporarily set but where the transport is not in rds_bind(), leading to a NULL pointer dereference when trying to dereference 'trans' in __rds_conn_create(). Vegard wrote a reproducer for this issue, so kindly ask him to share if you're interested. I cannot reproduce the NULL pointer dereference using Vegard's reproducer with this patch, whereas I could without. Complete earlier incomplete fix to CVE-2015-6937: 74e98eb08588 ("RDS: verify the underlying transport exists before creating a connection") Reviewed-by: Vegard Nossum <vegard.nossum@oracle.com> Reviewed-by: Sasha Levin <sasha.levin@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: Quentin Casasnovas <quentin.casasnovas@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 1467bec9a74c7369fbf38c9fbd51f1f61a3f14c7 Author: Johannes Berg <johannes.berg@intel.com> Date: Tue Nov 17 14:25:21 2015 +0100 mac80211: mesh: fix call_rcu() usage commit c2e703a55245bfff3db53b1f7cbe59f1ee8a4339 upstream. When using call_rcu(), the called function may be delayed quite significantly, and without a matching rcu_barrier() there's no way to be sure it has finished. Therefore, global state that could be gone/freed/reused should never be touched in the callback. Fix this in mesh by moving the atomic_dec() into the caller; that's not really a problem since we already unlinked the path and it will be destroyed anyway. This fixes a crash Jouni observed when running certain tests in a certain order, in which the mesh interface was torn down, the memory reused for a function pointer (work struct) and running that then crashed since the pointer had been decremented by 1, resulting in an invalid instruction byte stream. Fixes: eb2b9311fd00 ("mac80211: mesh path table implementation") Reported-by: Jouni Malinen <j@w1.fi> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 963e16256e30f627f5c105814a8d9658f2107b7e Author: Suman Anna <s-anna@ti.com> Date: Wed Sep 16 19:29:17 2015 -0500 virtio: fix memory leak of virtio ida cache layers commit c13f99b7e945dad5273a8b7ee230f4d1f22d3354 upstream. The virtio core uses a static ida named virtio_index_ida for assigning index numbers to virtio devices during registration. The ida core may allocate some internal idr cache layers and an ida bitmap upon any ida allocation, and all these layers are truely freed only upon the ida destruction. The virtio_index_ida is not destroyed at present, leading to a memory leak when using the virtio core as a module and atleast one virtio device is registered and unregistered. Fix this by invoking ida_destroy() in the virtio core module exit. Signed-off-by: Suman Anna <s-anna@ti.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 6039f028a9bbf7cb34ecfac31d5a9df68453221d Author: Steven Rostedt (Red Hat) <rostedt@goodmis.org> Date: Mon Nov 23 10:35:36 2015 -0500 ring-buffer: Update read stamp with first real commit on page commit b81f472a208d3e2b4392faa6d17037a89442f4ce upstream. Do not update the read stamp after swapping out the reader page from the write buffer. If the reader page is swapped out of the buffer before an event is written to it, then the read_stamp may get an out of date timestamp, as the page timestamp is updated on the first commit to that page. rb_get_reader_page() only returns a page if it has an event on it, otherwise it will return NULL. At that point, check if the page being returned has events and has not been read yet. Then at that point update the read_stamp to match the time stamp of the reader page. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit b63a96fada140801597b62a2a6a6818a96ae39e9 Author: Jan Kara <jack@suse.cz> Date: Mon Nov 23 13:09:51 2015 +0100 vfs: Avoid softlockups with sendfile(2) commit c2489e07c0a71a56fb2c84bc0ee66cddfca7d068 upstream. The following test program from Dmitry can cause softlockups or RCU stalls as it copies 1GB from tmpfs into eventfd and we don't have any scheduling point at that path in sendfile(2) implementation: int r1 = eventfd(0, 0); int r2 = memfd_create("", 0); unsigned long n = 1<<30; fallocate(r2, 0, 0, n); sendfile(r1, r2, 0, n); Add cond_resched() into __splice_from_pipe() to fix the problem. CC: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 677eea664a18f46ec7186d1b0a1595f8ac2cb959 Author: Vineet Gupta <vgupta@synopsys.com> Date: Mon Nov 23 19:32:51 2015 +0530 ARC: dw2 unwind: Remove falllback linear search thru FDE entries commit 2e22502c080f27afeab5e6f11e618fb7bc7aea53 upstream. Fixes STAR 9000953410: "perf callgraph profiling causing RCU stalls" | perf record -g -c 15000 -e cycles /sbin/hackbench | | INFO: rcu_preempt self-detected stall on CPU | 1: (1 GPs behind) idle=609/140000000000002/0 softirq=2914/2915 fqs=603 | Task dump for CPU 1: in-kernel dwarf unwinder has a fast binary lookup and a fallback linear search (which iterates thru each of ~11K entries) thus takes 2 orders of magnitude longer (~3 million cycles vs. 2000). Routines written in hand assembler lack dwarf info (as we don't support assembler CFI pseudo-ops yet) fail the unwinder binary lookup, hit linear search, failing nevertheless in the end. However the linear search is pointless as binary lookup tables are created from it in first place. It is impossible to have binary lookup fail while succeed the linear search. It is pure waste of cycles thus removed by this patch. This manifested as RCU stalls / NMI watchdog splat when running hackbench under perf with callgraph profiling. The triggering condition was perf counter overflowing in routine lacking dwarf info (like memset) leading to patheic 3 million cycle unwinder slow path and by the time it returned new interrupts were already pending (Timer, IPI) and taken rightaway. The original memset didn't make forward progress, system kept accruing more interrupts and more unwinder delayes in a vicious feedback loop, ultimately triggering the NMI diagnostic. Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 2a27f61bd411e564eb4651c18d225f6e9e1de534 Author: Kees Cook <keescook@chromium.org> Date: Thu Nov 19 17:18:54 2015 -0800 mac: validate mac_partition is within sector commit 02e2a5bfebe99edcf9d694575a75032d53fe1b73 upstream. If md->signature == MAC_DRIVER_MAGIC and md->block_size == 1023, a single 512 byte sector would be read (secsize / 512). However the partition structure would be located past the end of the buffer (secsize % 512). Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit e3dda035201bd6299d95414c7d9943f721b7be87 Author: Luca Porzio <lporzio@micron.com> Date: Fri Nov 6 15:12:26 2015 +0000 mmc: remove bondage between REQ_META and reliable write commit d3df0465db00cf4ed9f90d0bfc3b827d32b9c796 upstream. Anytime a write operation is performed with Reliable Write flag enabled, the eMMC device is enforced to bypass the cache and do a write to the underling NVM device by Jedec specification; this causes a performance penalty since write operations can't be optimized by the device cache. In our tests, we replayed a typical mobile daily trace pattern and found ~9% overall time reduction in trace replay by using this patch. Also the write ops within 4KB~64KB chunk size range get a 40~60% performance improvement by using the patch (as this range of write chunks are the ones affected by REQ_META). This patch has been discussed in the Mobile & Embedded Linux Storage Forum and it's the results of feedbacks from many people. We also checked with fsdevl and f2fs mailing list developers that this change in the usage of REQ_META is not affecting FS behavior and we got positive feedbacks. Reporting here the feedbacks: http://comments.gmane.org/gmane.linux.file-systems/97219 http://thread.gmane.org/gmane.linux.file-systems.f2fs/3178/focus=3183 Signed-off-by: Bruce Ford <bford@micron.com> Signed-off-by: Luca Porzio <lporzio@micron.com> Fixes: ce39f9d17c14 ("mmc: support packed write command for eMMC4.5 devices") Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit eef125560fd129761cd4842ae061d543e81b533a Author: sumit.saxena@avagotech.com <sumit.saxena@avagotech.com> Date: Thu Oct 15 13:40:54 2015 +0530 megaraid_sas : SMAP restriction--do not access user memory from IOCTL code commit 323c4a02c631d00851d8edc4213c4d184ef83647 upstream. This is an issue on SMAP enabled CPUs and 32 bit apps running on 64 bit OS. Do not access user memory from kernel code. The SMAP bit restricts accessing user memory from kernel code. Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 64896131a6e3c60133360b2ff2a70487eb35f721 Author: sumit.saxena@avagotech.com <sumit.saxena@avagotech.com> Date: Thu Oct 15 13:40:04 2015 +0530 megaraid_sas: Do not use PAGE_SIZE for max_sectors commit 357ae967ad66e357f78b5cfb5ab6ca07fb4a7758 upstream. Do not use PAGE_SIZE marco to calculate max_sectors per I/O request. Driver code assumes PAGE_SIZE will be always 4096 which can lead to wrongly calculated value if PAGE_SIZE is not 4096. This issue was reported in Ubuntu Bugzilla Bug #1475166. Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 621264c8898d4e0a5d14919ea900d82c2eab6262 Author: Valentin Rothberg <valentinrothberg@gmail.com> Date: Tue Sep 22 19:00:40 2015 +0200 wm831x_power: Use IRQF_ONESHOT to request threaded IRQs commit 90adf98d9530054b8e665ba5a928de4307231d84 upstream. Since commit 1c6c69525b40 ("genirq: Reject bogus threaded irq requests") threaded IRQs without a primary handler need to be requested with IRQF_ONESHOT, otherwise the request will fail. scripts/coccinelle/misc/irqf_oneshot.cocci detected this issue. Fixes: b5874f33bbaf ("wm831x_power: Use genirq") Signed-off-by: Valentin Rothberg <valentinrothberg@gmail.com> Signed-off-by: Sebastian Reichel <sre@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 736652169c60e99df027befd07f9217bcaba840d Author: Dan Carpenter <dan.carpenter@oracle.com> Date: Mon Sep 21 19:21:51 2015 +0300 devres: fix a for loop bounds check commit 1f35d04a02a652f14566f875aef3a6f2af4cb77b upstream. The iomap[] array has PCIM_IOMAP_MAX (6) elements and not DEVICE_COUNT_RESOURCE (16). This bug was found using a static checker. It may be that the "if (!(mask & (1 << i)))" check means we never actually go past the end of the array in real life. Fixes: ec04b075843d ('iomap: implement pcim_iounmap_regions()') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit db13625b0968e8ccaa14dfa0fcb2b347524be05e Author: Andrey Ryabinin <aryabinin@virtuozzo.com> Date: Wed Sep 23 15:49:29 2015 +0300 lockd: create NSM handles per net namespace commit 0ad95472bf169a3501991f8f33f5147f792a8116 upstream. Commit cb7323fffa85 ("lockd: create and use per-net NSM RPC clients on MON/UNMON requests") introduced per-net NSM RPC clients. Unfortunately this doesn't make any sense without per-net nsm_handle. E.g. the following scenario could happen Two hosts (X and Y) in different namespaces (A and B) share the same nsm struct. 1. nsm_monitor(host_X) called => NSM rpc client created, nsm->sm_monitored bit set. 2. nsm_mointor(host-Y) called => nsm->sm_monitored already set, we just exit. Thus in namespace B ln->nsm_clnt == NULL. 3. host X destroyed => nsm->sm_count decremented to 1 4. host Y destroyed => nsm_unmonitor() => nsm_mon_unmon() => NULL-ptr dereference of *ln->nsm_clnt So this could be fixed by making per-net nsm_handles list, instead of global. Thus different net namespaces will not be able share the same nsm_handle. Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 1644fe6cc1567ecde034ea8acd5f4d6146e395b5 Author: Roman Volkov <rvolkov@v1ros.org> Date: Fri Jan 1 16:24:41 2016 +0300 clocksource/drivers/vt8500: Increase the minimum delta commit f9eccf24615672896dc13251410c3f2f33a14f95 upstream. The vt8500 clocksource driver declares itself as capable to handle the minimum delay of 4 cycles by passing the value into clockevents_config_and_register(). The vt8500_timer_set_next_event() requires the passed cycles value to be at least 16. The impact is that userspace hangs in nanosleep() calls with small delay intervals. This problem is reproducible in Linux 4.2 starting from: c6eb3f70d448 ('hrtimer: Get rid of hrtimer softirq') From Russell King, more detailed explanation: "It's a speciality of the StrongARM/PXA hardware. It takes a certain number of OSCR cycles for the value written to hit the compare registers. So, if a very small delta is written (eg, the compare register is written with a value of OSCR + 1), the OSCR will have incremented past this value before it hits the underlying hardware. The result is, that you end up waiting a very long time for the OSCR to wrap before the event fires. So, we introduce a check in set_next_event() to detect this and return -ETIME if the calculated delta is too small, which causes the generic clockevents code to retry after adding the min_delta specified in clockevents_config_and_register() to the current time value. min_delta must be sufficient that we don't re-trip the -ETIME check - if we do, we will return -ETIME, forward the next event time, try to set it, return -ETIME again, and basically lock the system up. So, min_delta must be larger than the check inside set_next_event(). A factor of two was chosen to ensure that this situation would never occur. The PXA code worked on PXA systems for years, and I'd suggest no one changes this mechanism without access to a wide range of PXA systems, otherwise they're risking breakage." Cc: Russell King <linux@arm.linux.org.uk> Acked-by: Alexey Charkov <alchark@gmail.com> Signed-off-by: Roman Volkov <rvolkov@v1ros.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit bf5cd0c632e49ca583cd3531b55693e615a2b332 Author: Thomas Gleixner <tglx@linutronix.de> Date: Sun Dec 13 18:12:30 2015 +0100 genirq: Prevent chip buslock deadlock commit abc7e40c81d113ef4bacb556f0a77ca63ac81d85 upstream. If a interrupt chip utilizes chip->buslock then free_irq() can deadlock in the following way: CPU0 CPU1 interrupt(X) (Shared or spurious) free_irq(X) interrupt_thread(X) chip_bus_lock(X) irq_finalize_oneshot(X) chip_bus_lock(X) synchronize_irq(X) synchronize_irq() waits for the interrupt thread to complete, i.e. forever. Solution is simple: Drop chip_bus_lock() before calling synchronize_irq() as we do with the irq_desc lock. There is nothing to be protected after the point where irq_desc lock has been released. This adds chip_bus_lock/unlock() to the remove_irq() code path, but that's actually correct in the case where remove_irq() is called on such an interrupt. The current users of remove_irq() are not affected as none of those interrupts is on a chip which requires buslock. Reported-by: Fredrik Markström <fredrik.markstrom@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 341f09c01a7f26a030f3bedb08e4ce91e3ca24d3 Author: Hannes Frederic Sowa <hannes@stressinduktion.org> Date: Wed Feb 3 02:11:03 2016 +0100 unix: correctly track in-flight fds in sending process user_struct commit 415e3d3e90ce9e18727e8843ae343eda5a58fad6 upstream. The commit referenced in the Fixes tag incorrectly accounted the number of in-flight fds over a unix domain socket to the original opener of the file-descriptor. This allows another process to arbitrary deplete the original file-openers resource limit for the maximum of open files. Instead the sending processes and its struct cred should be credited. To do so, we add a reference counted struct user_struct pointer to the scm_fp_list and use it to account for the number of inflight unix fds. Fixes: 712f4aad406bb1 ("unix: properly account for FDs passed over unix sockets") Reported-by: David Herrmann <dh.herrmann@gmail.com> Cc: David Herrmann <dh.herrmann@gmail.com> Cc: Willy Tarreau <w@1wt.eu> Cc: Linus Torvalds <torvalds@linux-foundation.org> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 59ae7b1c13bd615b09bff9d03aaa335559af604a Author: Olga Kornievskaia <aglo@umich.edu> Date: Mon Sep 14 19:54:36 2015 -0400 Failing to send a CLOSE if file is opened WRONLY and server reboots on a 4.x mount commit a41cbe86df3afbc82311a1640e20858c0cd7e065 upstream. A test case is as the description says: open(foobar, O_WRONLY); sleep() --> reboot the server close(foobar) The bug is because in nfs4state.c in nfs4_reclaim_open_state() a few line before going to restart, there is clear_bit(NFS4CLNT_RECLAIM_NOGRACE, &state->flags). NFS4CLNT_RECLAIM_NOGRACE is a flag for the client states not open owner states. Value of NFS4CLNT_RECLAIM_NOGRACE is 4 which is the value of NFS_O_WRONLY_STATE in nfs4_state->flags. So clearing it wipes out state and when we go to close it, “call_close” doesn’t get set as state flag is not set and CLOSE doesn’t go on the wire. Signed-off-by: Olga Kornievskaia <aglo@umich.edu> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 4c67196fd14f5f53eb865719bfca1908fa963618 Author: Christophe Leroy <christophe.leroy@c-s.fr> Date: Wed May 6 17:26:47 2015 +0200 splice: sendfile() at once fails for big files commit 0ff28d9f4674d781e492bcff6f32f0fe48cf0fed upstream. Using sendfile with below small program to get MD5 sums of some files, it appear that big files (over 64kbytes with 4k pages system) get a wrong MD5 sum while small files get the correct sum. This program uses sendfile() to send a file to an AF_ALG socket for hashing. /* md5sum2.c */ #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <string.h> #include <fcntl.h> #include <sys/socket.h> #include <sys/stat.h> #include <sys/types.h> #include <linux/if_alg.h> int main(int argc, char **argv) { int sk = socket(AF_ALG, SOCK_SEQPACKET, 0); struct stat st; struct sockaddr_alg sa = { .salg_family = AF_ALG, .salg_type = "hash", .salg_name = "md5", }; int n; bind(sk, (struct sockaddr*)&sa, sizeof(sa)); for (n = 1; n < argc; n++) { int size; int offset = 0; char buf[4096]; int fd; int sko; int i; fd = open(argv[n], O_RDONLY); sko = accept(sk, NULL, 0); fstat(fd, &st); size = st.st_size; sendfile(sko, fd, &offset, size); size = read(sko, buf, sizeof(buf)); for (i = 0; i < size; i++) printf("%2.2x", buf[i]); printf(" %s\n", argv[n]); close(fd); close(sko); } exit(0); } Test below is done using official linux patch files. First result is with a software based md5sum. Second result is with the program above. root@vgoip:~# ls -l patch-3.6.* -rw-r--r-- 1 root root 64011 Aug 24 12:01 patch-3.6.2.gz -rw-r--r-- 1 root root 94131 Aug 24 12:01 patch-3.6.3.gz root@vgoip:~# md5sum patch-3.6.* b3ffb9848196846f31b2ff133d2d6443 patch-3.6.2.gz c5e8f687878457db77cb7158c38a7e43 patch-3.6.3.gz root@vgoip:~# ./md5sum2 patch-3.6.* b3ffb9848196846f31b2ff133d2d6443 patch-3.6.2.gz 5fd77b24e68bb24dcc72d6e57c64790e patch-3.6.3.gz After investivation, it appears that sendfile() sends the files by blocks of 64kbytes (16 times PAGE_SIZE). The problem is that at the end of each block, the SPLICE_F_MORE flag is missing, therefore the hashing operation is reset as if it was the end of the file. This patch adds SPLICE_F_MORE to the flags when more data is pending. With the patch applied, we get the correct sums: root@vgoip:~# md5sum patch-3.6.* b3ffb9848196846f31b2ff133d2d6443 patch-3.6.2.gz c5e8f687878457db77cb7158c38a7e43 patch-3.6.3.gz root@vgoip:~# ./md5sum2 patch-3.6.* b3ffb9848196846f31b2ff133d2d6443 patch-3.6.2.gz c5e8f687878457db77cb7158c38a7e43 patch-3.6.3.gz Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Jens Axboe <axboe@fb.com> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 23a9a7dc6f75393a21c0378d13b380e46907b877 Author: James Hogan <james.hogan@imgtec.com> Date: Wed Nov 11 14:21:20 2015 +0000 MIPS: KVM: Uninit VCPU in vcpu_create error path commit 585bb8f9a5e592f2ce7abbe5ed3112d5438d2754 upstream. If either of the memory allocations in kvm_arch_vcpu_create() fail, the vcpu which has been allocated and kvm_vcpu_init'd doesn't get uninit'd in the error handling path. Add a call to kvm_vcpu_uninit() to fix this. Fixes: 669e846e6c4e ("KVM/MIPS32: MIPS arch specific APIs for KVM") Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: James Hogan <james.hogan@imgtec.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 7b0fc4511317bbc71ca8bbcf032013ffbfe0a2bc Author: James Hogan <james.hogan@imgtec.com> Date: Wed Nov 11 14:21:19 2015 +0000 MIPS: KVM: Fix CACHE immediate offset sign extension commit c5c2a3b998f1ff5a586f9d37e154070b8d550d17 upstream. The immediate field of the CACHE instruction is signed, so ensure that it gets sign extended by casting it to an int16_t rather than just masking the low 16 bits. Fixes: e685c689f3a8 ("KVM/MIPS32: Privileged instruction/target branch emulation.") Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: James Hogan <james.hogan@imgtec.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 5525dd65cd8e4f80ede26993f6f665df7eeec1dc Author: James Hogan <james.hogan@imgtec.com> Date: Wed Nov 11 14:21:18 2015 +0000 MIPS: KVM: Fix ASID restoration logic commit 002374f371bd02df864cce1fe85d90dc5b292837 upstream. ASID restoration on guest resume should determine the guest execution mode based on the guest Status register rather than bit 30 of the guest PC. Fix the two places in locore.S that do this, loading the guest status from the cop0 area. Note, this assembly is specific to the trap & emulate implementation of KVM, so it doesn't need to check the supervisor bit as that mode is not implemented in the guest. Fixes: b680f70fc111 ("KVM/MIPS32: Entry point for trampolining to...") Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: James Hogan <james.hogan@imgtec.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 1630624d5387a837112fb3663fe5daa22c680267 Author: Hariprasad S <hariprasad@chelsio.com> Date: Fri Dec 11 13:59:17 2015 +0530 iw_cxgb3: Fix incorrectly returning error on success commit 67f1aee6f45059fd6b0f5b0ecb2c97ad0451f6b3 upstream. The cxgb3_*_send() functions return NET_XMIT_ values, which are positive integers values. So don't treat positive return values as an error. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com> [a pox on developers and maintainers who do not cc: stable for bug fixes like this - gregkh] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit b048b93f0e5b249b0add0bde6ec7ec75a07eac9c Author: Corey Wright <undefined@pobox.com> Date: Sun Feb 28 02:42:39 2016 -0600 proc: Fix ptrace-based permission checks for accessing task maps Modify mm_access() calls in fs/proc/task_mmu.c and fs/proc/task_nommu.c to have the mode include PTRACE_MODE_FSCREDS so accessing /proc/pid/maps and /proc/pid/pagemap is not denied to all users. In backporting upstream commit caaee623 to pre-3.18 kernel versions it was overlooked that mm_access() is used in fs/proc/task_*mmu.c as those calls were removed in 3.18 (by upstream commit 29a40ace) and did not exist at the time of the original commit. Signed-off-by: Corey Wright <undefined@pobox.com> Acked-by: Jann Horn <jann@thejh.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit db9b6792736a57291901032e4a2036dfc9f0ab95 Author: Bjørn Mork <bjorn@mork.no> Date: Fri Feb 12 16:40:00 2016 +0100 USB: option: add "4G LTE usb-modem U901" commit d061c1caa31d4d9792cfe48a2c6b309a0e01ef46 upstream. Thomas reports: T: Bus=01 Lev=01 Prnt=01 Port=03 Cnt=01 Dev#= 4 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=05c6 ProdID=6001 Rev=00.00 S: Manufacturer=USB Modem S: Product=USB Modem S: SerialNumber=1234567890ABCDEF C: #Ifs= 5 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=option I: If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan I: If#= 4 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=usb-storage Reported-by: Thomas Schäfer <tschaefer@t-online.de> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit b920f51b68e4d965b51cab8060fae66eeb21f218 Author: Andrey Skvortsov <andrej.skvortzov@gmail.com> Date: Fri Jan 29 00:07:30 2016 +0300 USB: option: add support for SIM7100E commit 3158a8d416f4e1b79dcc867d67cb50013140772c upstream. $ lsusb: Bus 001 Device 101: ID 1e0e:9001 Qualcomm / Option $ usb-devices: T: Bus=01 Lev=02 Prnt=02 Port=00 Cnt=01 Dev#=101 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 2 P: Vendor=1e0e ProdID=9001 Rev= 2.32 S: Manufacturer=SimTech, Incorporated S: Product=SimTech, Incorporated S: SerialNumber=0123456789ABCDEF C:* #Ifs= 7 Cfg#= 1 Atr=80 MxPwr=500mA I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option I:* If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I:* If#= 5 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan I:* If#= 6 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none) The last interface (6) is used for Android Composite ADB interface. Serial port layout: 0: QCDM/DIAG 1: NMEA 2: AT 3: AT/PPP 4: audio Signed-off-by: Andrey Skvortsov <andrej.skvortzov@gmail.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 9195b27396cbf8459e19593ec34586e66857ab01 Author: Ken Lin <ken.lin@advantech.com.tw> Date: Mon Feb 1 14:57:25 2016 -0500 USB: cp210x: add IDs for GE B650V3 and B850V3 boards commit 6627ae19385283b89356a199d7f03c75ba35fb29 upstream. Add USB ID for cp2104/5 devices on GE B650v3 and B850v3 boards. Signed-off-by: Ken Lin <ken.lin@advantech.com.tw> Signed-off-by: Akshay Bhat <akshay.bhat@timesys.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit fadc5c1769ba24cd8498359d93e31408b0e0fbb9 Author: Gerhard Uttenthaler <uttenthaler@ems-wuensche.com> Date: Tue Dec 22 17:29:16 2015 +0100 can: ems_usb: Fix possible tx overflow commit 90cfde46586d2286488d8ed636929e936c0c9ab2 upstream. This patch fixes the problem that more CAN messages could be sent to the interface as could be send on the CAN bus. This was more likely for slow baud rates. The sleeping _start_xmit was woken up in the _write_bulk_callback. Under heavy TX load this produced another bulk transfer without checking the free_slots variable and hence caused the overflow in the interface. Signed-off-by: Gerhard Uttenthaler <uttenthaler@ems-wuensche.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 86d80ecd4f96b63103abfd1269855a8a3b9d47cc Author: Nikolay Borisov <kernel@kyup.com> Date: Thu Dec 17 18:03:35 2015 +0200 dm thin: fix race condition when destroying thin pool workqueue commit 18d03e8c25f173f4107a40d0b8c24defb6ed69f3 upstream. When a thin pool is being destroyed delayed work items are cancelled using cancel_delayed_work(), which doesn't guarantee that on return the delayed item isn't running. This can cause the work item to requeue itself on an already destroyed workqueue. Fix this by using cancel_delayed_work_sync() which guarantees that on return the work item is not running anymore. Fixes: 905e51b39a555 ("dm thin: commit outstanding data every second") Fixes: 85ad643b7e7e5 ("dm thin: add timeout to stop out-of-data-space mode holding IO forever") Signed-off-by: Nikolay Borisov <kernel@kyup.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 9bb86db161a368f423958981869d17c53ae5f395 Author: Joe Thornber <ejt@redhat.com> Date: Wed Dec 9 16:23:24 2015 +0000 dm thin metadata: fix bug when taking a metadata snapshot commit 49e99fc717f624aa75ca755d6e7bc029efd3f0e9 upstream. When you take a metadata snapshot the btree roots for the mapping and details tree need to have their reference counts incremented so they persist for the lifetime of the metadata snap. The roots being incremented were those currently written in the superblock, which could possibly be out of date if concurrent IO is triggering new mappings, breaking of sharing, etc. Fix this by performing a commit with the metadata lock held while taking a metadata snapshot. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 9ee0d9ad9309385fd877bf7f5a762d4d3b5a6462 Author: Ingo Molnar <mingo@kernel.org> Date: Tue Mar 3 07:34:33 2015 +0100 efi: Disable interrupts around EFI calls, not in the epilog/prolog calls commit 23a0d4e8fa6d3a1d7fb819f79bcc0a3739c30ba9 upstream. Tapasweni Pathak reported that we do a kmalloc() in efi_call_phys_prolog() on x86-64 while having interrupts disabled, which is a big no-no, as kmalloc() can sleep. Solve this by removing the irq disabling from the prolog/epilog calls around EFI calls: it's unnecessary, as in this stage we are single threaded in the boot thread, and we don't ever execute this from interrupt contexts. Reported-by: Tapasweni Pathak <tapaswenipathak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Matt Fleming <matt.fleming@intel.com> [ luis: backported to 3.10: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit f2419de63e77bf6f63b5e67a57b1fa7cf535c4ca Author: Dave Airlie <airlied@redhat.com> Date: Thu Aug 20 10:13:55 2015 +1000 drm/radeon: fix hotplug race at startup commit 7f98ca454ad373fc1b76be804fa7138ff68c1d27 upstream. We apparantly get a hotplug irq before we've initialised modesetting, [drm] Loading R100 Microcode BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<c125f56f>] __mutex_lock_slowpath+0x23/0x91 *pde = 00000000 Oops: 0002 [#1] Modules linked in: radeon(+) drm_kms_helper ttm drm i2c_algo_bit backlight pcspkr psmouse evdev sr_mod input_leds led_class cdrom sg parport_pc parport floppy intel_agp intel_gtt lpc_ich acpi_cpufreq processor button mfd_core agpgart uhci_hcd ehci_hcd rng_core snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm usbcore usb_common i2c_i801 i2c_core snd_timer snd soundcore thermal_sys CPU: 0 PID: 15 Comm: kworker/0:1 Not tainted 4.2.0-rc7-00015-gbf67402 #111 Hardware name: MicroLink /D850MV , BIOS MV85010A.86A.0067.P24.0304081124 04/08/2003 Workqueue: events radeon_hotplug_work_func [radeon] task: f6ca5900 ti: f6d3e000 task.ti: f6d3e000 EIP: 0060:[<c125f56f>] EFLAGS: 00010282 CPU: 0 EIP is at __mutex_lock_slowpath+0x23/0x91 EAX: 00000000 EBX: f5e900fc ECX: 00000000 EDX: fffffffe ESI: f6ca5900 EDI: f5e90100 EBP: f5e90000 ESP: f6d3ff0c DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 CR0: 8005003b CR2: 00000000 CR3: 36f61000 CR4: 000006d0 Stack: f5e90100 00000000 c103c4c1 f6d2a5a0 f5e900fc f6df394c c125f162 f8b0faca f6d2a5a0 c138ca00 f6df394c f7395600 c1034741 00d40000 00000000 f6d2a5a0 c138ca00 f6d2a5b8 c138ca10 c1034b58 00000001 f6d40000 f6ca5900 f6d0c940 Call Trace: [<c103c4c1>] ? dequeue_task_fair+0xa4/0xb7 [<c125f162>] ? mutex_lock+0x9/0xa [<f8b0faca>] ? radeon_hotplug_work_func+0x17/0x57 [radeon] [<c1034741>] ? process_one_work+0xfc/0x194 [<c1034b58>] ? worker_thread+0x18d/0x218 [<c10349cb>] ? rescuer_thread+0x1d5/0x1d5 [<c103742a>] ? kthread+0x7b/0x80 [<c12601c0>] ? ret_from_kernel_thread+0x20/0x30 [<c10373af>] ? init_completion+0x18/0x18 Code: 42 08 e8 8e a6 dd ff c3 57 56 53 83 ec 0c 8b 35 48 f7 37 c1 8b 10 4a 74 1a 89 c3 8d 78 04 8b 40 08 89 63 Reported-and-Tested-by: Meelis Roos <mroos@linux.ee> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 8719b00b6bcbf3acf3d8c1efebf9e9743d1d2511 Author: Kamal Mostafa <kamal@canonical.com> Date: Wed Nov 11 14:25:34 2015 -0800 tools: Add a "make all" rule commit f6ba98c5dc78708cb7fd29950c4a50c4c7e88f95 upstream. Signed-off-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jonathan Cameron <jic23@kernel.org> Cc: Pali Rohar <pali.rohar@gmail.com> Cc: Roberta Dobrescu <roberta.dobrescu@gmail.com> Link: http://lkml.kernel.org/r/1447280736-2161-2-git-send-email-kamal@canonical.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> [ kamal: backport to 3.10-stable: build all tools for this version ] Signed-off-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 9fac660099785accf7c99670c9cceb096098e820 Author: Zheng Liu <wenqing.lz@taobao.com> Date: Sun Nov 29 17:21:57 2015 -0800 bcache: unregister reboot notifier if bcache fails to unregister device commit 2ecf0cdb2b437402110ab57546e02abfa68a716b upstream. In bcache_init() function it forgot to unregister reboot notifier if bcache fails to unregister a block device. This commit fixes this. Signed-off-by: Zheng Liu <wenqing.lz@taobao.com> Tested-by: Joshua Schmid <jschmid@suse.com> Tested-by: Eric Wheeler <bcache@linux.ewheeler.net> Cc: Kent Overstreet <kmo@daterainc.com> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit f79d019c4b9692afe3a383df849b927097ff4e1a Author: Andrey Vagin <avagin@openvz.org> Date: Wed Jan 29 19:34:14 2014 +0100 netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get commit c6825c0976fa7893692e0e43b09740b419b23c09 upstream. Lets look at destroy_conntrack: hlist_nulls_del_rcu(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode); ... nf_conntrack_free(ct) kmem_cache_free(net->ct.nf_conntrack_cachep, ct); net->ct.nf_conntrack_cachep is created with SLAB_DESTROY_BY_RCU. The hash is protected by rcu, so readers look up conntracks without locks. A conntrack is removed from the hash, but in this moment a few readers still can use the conntrack. Then this conntrack is released and another thread creates conntrack with the same address and the equal tuple. After this a reader starts to validate the conntrack: * It's not dying, because a new conntrack was created * nf_ct_tuple_equal() returns true. But this conntrack is not initialized yet, so it can not be used by two threads concurrently. In this case BUG_ON may be triggered from nf_nat_setup_info(). Florian Westphal suggested to check the confirm bit too. I think it's right. task 1 task 2 task 3 nf_conntrack_find_get ____nf_conntrack_find destroy_conntrack hlist_nulls_del_rcu nf_conntrack_free kmem_cache_free __nf_conntrack_alloc kmem_cache_alloc memset(&ct->tuplehash[IP_CT_DIR_MAX], if (nf_ct_is_dying(ct)) if (!nf_ct_tuple_equal() I'm not sure, that I have ever seen this race condition in a real life. Currently we are investigating a bug, which is reproduced on a few nodes. In our case one conntrack is initialized from a few tasks concurrently, we don't have any other explanation for this. <2>[46267.083061] kernel BUG at net/ipv4/netfilter/nf_nat_core.c:322! ... <4>[46267.083951] RIP: 0010:[<ffffffffa01e00a4>] [<ffffffffa01e00a4>] nf_nat_setup_info+0x564/0x590 [nf_nat] ... <4>[46267.085549] Call Trace: <4>[46267.085622] [<ffffffffa023421b>] alloc_null_binding+0x5b/0xa0 [iptable_nat] <4>[46267.085697] [<ffffffffa02342bc>] nf_nat_rule_find+0x5c/0x80 [iptable_nat] <4>[46267.085770] [<ffffffffa0234521>] nf_nat_fn+0x111/0x260 [iptable_nat] <4>[46267.085843] [<ffffffffa0234798>] nf_nat_out+0x48/0xd0 [iptable_nat] <4>[46267.085919] [<ffffffff814841b9>] nf_iterate+0x69/0xb0 <4>[46267.085991] [<ffffffff81494e70>] ? ip_finish_output+0x0/0x2f0 <4>[46267.086063] [<ffffffff81484374>] nf_hook_slow+0x74/0x110 <4>[46267.086133] [<ffffffff81494e70>] ? ip_finish_output+0x0/0x2f0 <4>[46267.086207] [<ffffffff814b5890>] ? dst_output+0x0/0x20 <4>[46267.086277] [<ffffffff81495204>] ip_output+0xa4/0xc0 <4>[46267.086346] [<ffffffff814b65a4>] raw_sendmsg+0x8b4/0x910 <4>[46267.086419] [<ffffffff814c10fa>] inet_sendmsg+0x4a/0xb0 <4>[46267.086491] [<ffffffff814459aa>] ? sock_update_classid+0x3a/0x50 <4>[46267.086562] [<ffffffff81444d67>] sock_sendmsg+0x117/0x140 <4>[46267.086638] [<ffffffff8151997b>] ? _spin_unlock_bh+0x1b/0x20 <4>[46267.086712] [<ffffffff8109d370>] ? autoremove_wake_function+0x0/0x40 <4>[46267.086785] [<ffffffff81495e80>] ? do_ip_setsockopt+0x90/0xd80 <4>[46267.086858] [<ffffffff8100be0e>] ? call_function_interrupt+0xe/0x20 <4>[46267.086936] [<ffffffff8118cb10>] ? ub_slab_ptr+0x20/0x90 <4>[46267.087006] [<ffffffff8118cb10>] ? ub_slab_ptr+0x20/0x90 <4>[46267.087081] [<ffffffff8118f2e8>] ? kmem_cache_alloc+0xd8/0x1e0 <4>[46267.087151] [<ffffffff81445599>] sys_sendto+0x139/0x190 <4>[46267.087229] [<ffffffff81448c0d>] ? sock_setsockopt+0x16d/0x6f0 <4>[46267.087303] [<ffffffff810efa47>] ? audit_syscall_entry+0x1d7/0x200 <4>[46267.087378] [<ffffffff810ef795>] ? __audit_syscall_exit+0x265/0x290 <4>[46267.087454] [<ffffffff81474885>] ? compat_sys_setsockopt+0x75/0x210 <4>[46267.087531] [<ffffffff81474b5f>] compat_sys_socketcall+0x13f/0x210 <4>[46267.087607] [<ffffffff8104dea3>] ia32_sysret+0x0/0x5 <4>[46267.087676] Code: 91 20 e2 01 75 29 48 89 de 4c 89 f7 e8 56 fa ff ff 85 c0 0f 84 68 fc ff ff 0f b6 4d c6 41 8b 45 00 e9 4d fb ff ff e8 7c 19 e9 e0 <0f> 0b eb fe f6 05 17 91 20 e2 80 74 ce 80 3d 5f 2e 00 00 00 74 <1>[46267.088023] RIP [<ffffffffa01e00a4>] nf_nat_setup_info+0x564/0x590 Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Florian Westphal <fw@strlen.de> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Cc: "David S. Miller" <davem@davemloft.net> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Andrey Vagin <avagin@openvz.org> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 9f5bc010e9feef21555b9870ee4b5b06c9feae73 Author: Egbert Eich <eich@suse.de> Date: Wed Jun 11 14:59:55 2014 +0200 drm/ast: Initialized data needed to map fbdev memory commit 28fb4cb7fa6f63dc2fbdb5f2564dcbead8e3eee0 upstream. Due to a missing initialization there was no way to map fbdev memory. Thus for example using the Xserver with the fbdev driver failed. This fix adds initialization for fix.smem_start and fix.smem_len in the fb_info structure, which fixes this problem. Requested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Egbert Eich <eich@suse.de> [pulled from SuSE tree by me - airlied] Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit c3e07084a8e08c4d4a02ef352e20419ba9835149 Author: Steven Rostedt (Red Hat) <rostedt@goodmis.org> Date: Mon Feb 15 12:36:14 2016 -0500 tracepoints: Do not trace when cpu is offline commit f37755490fe9bf76f6ba1d8c6591745d3574a6a6 upstream. The tracepoint infrastructure uses RCU sched protection to enable and disable tracepoints safely. There are some instances where tracepoints are used in infrastructure code (like kfree()) that get called after a CPU is going offline, and perhaps when it is coming back online but hasn't been registered yet. This can probuce the following warning: [ INFO: suspicious RCU usage. ] 4.4.0-00006-g0fe53e8-dirty #34 Tainted: G S ------------------------------- include/trace/events/kmem.h:141 suspicious rcu_dereference_check() usage! other info that might help us debug this: RCU used illegally from offline CPU! rcu_scheduler_active = 1, debug_locks = 1 no locks held by swapper/8/0. stack backtrace: CPU: 8 PID: 0 Comm: swapper/8 Tainted: G S 4.4.0-00006-g0fe53e8-dirty #34 Call Trace: [c0000005b76c78d0] [c0000000008b9540] .dump_stack+0x98/0xd4 (unreliable) [c0000005b76c7950] [c00000000010c898] .lockdep_rcu_suspicious+0x108/0x170 [c0000005b76c79e0] [c00000000029adc0] .kfree+0x390/0x440 [c0000005b76c7a80] [c000000000055f74] .destroy_context+0x44/0x100 [c0000005b76c7b00] [c0000000000934a0] .__mmdrop+0x60/0x150 [c0000005b76c7b90] [c0000000000e3ff0] .idle_task_exit+0x130/0x140 [c0000005b76c7c20] [c000000000075804] .pseries_mach_cpu_die+0x64/0x310 [c0000005b76c7cd0] [c000000000043e7c] .cpu_die+0x3c/0x60 [c0000005b76c7d40] [c0000000000188d8] .arch_cpu_idle_dead+0x28/0x40 [c0000005b76c7db0] [c000000000101e6c] .cpu_startup_entry+0x50c/0x560 [c0000005b76c7ed0] [c000000000043bd8] .start_secondary+0x328/0x360 [c0000005b76c7f90] [c000000000008a6c] start_secondary_prolog+0x10/0x14 This warning is not a false positive either. RCU is not protecting code that is being executed while the CPU is offline. Instead of playing "whack-a-mole(TM)" and adding conditional statements to the tracepoints we find that are used in this instance, simply add a cpu_online() test to the tracepoint code where the tracepoint will be ignored if the CPU is offline. Use of raw_smp_processor_id() is fine, as there should never be a case where the tracepoint code goes from running on a CPU that is online and suddenly gets migrated to a CPU that is offline. Link: http://lkml.kernel.org/r/1455387773-4245-1-git-send-email-kda@linux-powerpc.org Reported-by: Denis Kirjanov <kda@linux-powerpc.org> Fixes: 97e1c18e8d17b ("tracing: Kernel Tracepoints") Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* 3.10.66 -> 3.10.67Jan Engelmohr2016-08-261-0/+1
|
* 3.10.65 -> 3.10.66Jan Engelmohr2016-08-261-18/+34
|
* first commitMeizu OpenSource2016-08-15194-0/+118102