path: root/http.c
Commit message | Author | Age | Files | Lines
* http.c: Always set SameSite=Strict to cookiesXavier Del Campo Romero2025-09-231-6/+7
| | | | | | | This cookie attribute helps mitigate CSRF attacks without requiring the server to store additional data. [1] [1]: https://owasp.org/www-community/SameSite
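A minimal sketch of how a response header carrying this attribute could be built. `format_cookie` is a hypothetical helper, not a libweb function, and the exact set of attributes libweb emits is not shown in this log.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of libweb): writes a Set-Cookie header
 * line with SameSite=Strict into buf.
 * Returns 0 on success, -1 on unusable input or if buf is too small. */
static int format_cookie(char *const buf, const size_t n,
    const char *const name, const char *const value)
{
    int res;

    /* ';' would end the value and start a new cookie attribute. */
    if (strchr(name, ';') || strchr(value, ';'))
        return -1;

    res = snprintf(buf, n, "Set-Cookie: %s=%s; SameSite=Strict\r\n",
        name, value);

    /* snprintf(3) returns the untruncated length, so treat a result
     * that does not fit into buf as a failure. */
    return res < 0 || (size_t)res >= n ? -1 : 0;
}
```

With `SameSite=Strict`, browsers only attach the cookie to same-site requests, which is why no server-side token storage is needed.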
* Implement HTTP byte servingXavier Del Campo Romero2024-11-111-26/+259
| | | | | | | | | | | | | | This commit allows the HTTP server to return partial content to clients, rather than returning the whole resource. This can be particularly useful for applications such as audio/video playback or showing large PDF files. Notes: - Applications must not care about partial contents i.e., if a valid user request was made, applications must still return HTTP status 200 ("OK"), as usual. The HTTP server will then translate the status code to 206 ("Partial Content") if required.
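As an illustration of what byte serving involves on the server side, the sketch below parses a single `Range: bytes=<first>-[<last>]` value against a known resource size. The names `struct byte_range` and `parse_range` are assumptions made for this example, and suffix ranges (`bytes=-500`) and multiple ranges are deliberately left out. It does not claim to mirror the code in http.c.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type: inclusive byte offsets into a resource. */
struct byte_range
{
    unsigned long long first, last;
};

/* Parses "bytes=<first>-[<last>]" for a resource of the given size.
 * Returns 0 on success, 1 if the value is invalid or unsatisfiable. */
static int parse_range(const char *s, const unsigned long long size,
    struct byte_range *const out)
{
    static const char prefix[] = "bytes=";
    char *end;

    if (!size || strncmp(s, prefix, strlen(prefix)))
        return 1;

    s += strlen(prefix);

    /* strtoull(3) accepts signs and leading spaces; reject them here. */
    if (*s < '0' || *s > '9')
        return 1;

    errno = 0;
    out->first = strtoull(s, &end, 10);

    if (errno || *end != '-' || out->first >= size)
        return 1;

    s = end + 1;

    if (!*s)
    {
        /* Open-ended range: serve up to the last byte. */
        out->last = size - 1;
        return 0;
    }
    else if (*s < '0' || *s > '9')
        return 1;

    errno = 0;
    out->last = strtoull(s, &end, 10);

    if (errno || *end || out->last < out->first)
        return 1;
    else if (out->last >= size)
        out->last = size - 1;

    return 0;
}
```

A server following the note above would let the application answer with status 200 and only switch to 206 when a range such as this one was successfully parsed.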
* http.c: Always call ctx_to_payloadXavier Del Campo Romero2024-10-041-46/+10
| | | | | | | | | Defining each struct http_payload manually had the risk of missing some member on the initializer. This was in fact the case for `n_headers` and `headers`, which were only assigned by ctx_to_payload, and therefore some specific HTTP requests would mistakenly not reflect such information to users.
* http.c: Avoid isspace(3) in get_boundaryXavier Del Campo Romero2024-10-041-2/+1
| | | | | | | | | According to POSIX.1-2008, this function is sensitive to the system locale, which might then have different definitions for a whitespace character. Therefore, it is safer to only check against ' ' so as to remove such a dependency.
* http.c: Fix ending boundaries not followed by CRLFXavier Del Campo Romero2024-08-221-41/+84
| | | | | | | According to RFC 2046, section 5.1.1, end boundaries might not be followed by CRLF. However, so far libweb naively relied on this behaviour as major implementations, such as cURL, Chromium or Gecko always add the optional CRLF, whereas Dillo does not.
* http.c: Accept double quotes on boundariesXavier Del Campo Romero2024-08-221-7/+66
| | | | | | | | | | | "multipart/form-data"-encoded POST requests might use double quotes for their boundaries. While this is required when invalid characters are otherwise used (e.g.: ':'), some web clients always insert double quotes. Additionally, according to RFC 2046 section 5.1.1, the boundary parameter consists of 1 to 70 characters, but libweb was not imposing such restrictions.
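The two rules described above (optional double quotes, 1 to 70 characters) could be checked as in the following sketch. `get_boundary_value` and `MAX_BOUNDARY` are hypothetical names, and the input is assumed to be the already isolated value of the `boundary=` parameter.

```c
#include <stddef.h>
#include <string.h>

enum {MAX_BOUNDARY = 70}; /* Upper limit from RFC 2046, section 5.1.1. */

/* Hypothetical helper: copies the boundary value into out, stripping
 * optional surrounding double quotes.
 * Returns 0 on success, 1 on invalid input. */
static int get_boundary_value(const char *s, char out[MAX_BOUNDARY + 1])
{
    size_t len = strlen(s);

    if (*s == '"')
    {
        /* A quoted value needs a matching closing quote. */
        if (len < 2 || s[len - 1] != '"')
            return 1;

        s++;
        len -= 2;
    }

    if (len < 1 || len > MAX_BOUNDARY)
        return 1;
    /* Quotes are not valid inside the boundary itself. */
    else if (memchr(s, '"', len))
        return 1;

    memcpy(out, s, len);
    out[len] = '\0';
    return 0;
}
```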
* http.c: Remove unneeded parameterXavier Del Campo Romero2024-08-221-17/+17
| | | | | | | | | | This parameter was rendered obsolete after the following commit: commit b0accd099fa8c5110d4c3c68830ad6fd810ca3ec Author: Xavier Del Campo Romero <xavi.dcr@tutanota.com> Date: Fri Nov 24 00:52:50 2023 +0100 http.c: Unify read operations
* http.c: Remove unused variableXavier Del Campo Romero2024-08-221-1/+1
|
* http.c: Fix memory leak on read failureXavier Del Campo Romero2024-08-221-9/+9
| | | | | | | For some unknown reason, ctx_free was only called by update_lstate, but this is not the only function that modifies a struct ctx instance. Since struct ctx is related to read operations, ctx_free must instead be called whenever http_read fails.
* http.c: Fix wrong checkXavier Del Campo Romero2024-08-221-1/+1
| | | | | | | | | | | | | p->f is a FILE *, so it is invalid to check against negative values. This bug was introduced when p->fd, a file descriptor, was replaced with p->f, a FILE *, by the following commit: commit b0accd099fa8c5110d4c3c68830ad6fd810ca3ec Author: Xavier Del Campo Romero <xavi.dcr@tutanota.com> Date: Fri Nov 24 00:52:50 2023 +0100 http.c: Unify read operations
* Limit maximum multipart/form-data pairs and filesXavier Del Campo Romero2024-02-191-1/+17
| | | | | | A malicious user could inject an infinite number of empty files or key/value pairs into a request in order to exhaust the device's resources.
* http.c: Solve performance issues on POST uploadsXavier Del Campo Romero2024-01-201-47/+91
| | | | | | | Profiling showed that reading multipart/form-data POST uploads byte-by-byte was too slow and typically led to maximum CPU usage. Therefore, the older approach (as done up to commit 7efc2b3a) was more efficient, even if the resulting code was a bit uglier.
* http.c: Unify read operationsXavier Del Campo Romero2023-11-241-159/+178
| | | | | | | | | | | | | | | | | So far, libweb would perform different read operations depending on its state: - For HTTP headers or request bodies, one byte at a time was read. - For multipart/form-data, up to BUFSIZ bytes at a time were read. However, this caused a significant number of extra syscalls for no reason and increased code complexity, especially when parsing multipart/form-data boundaries. Now, http_read always reads up to BUFSIZ bytes at a time and processes them in a loop. Apart from reducing code complexity, this should increase performance due to the (much) lower number of syscalls required.
* http.c. Limit multipart/form-data to POSTXavier Del Campo2023-11-201-0/+6
|
* http: Add support for PUTXavier Del Campo2023-11-201-38/+224
| | | | | | | | | | | | | | Notes: - Since curl would use the "Expect: 100-continue" header field for PUT operations, this was a good operation to fix the existing issues in its implementation. Breaking changes: - expect_continue is no longer exclusive to struct http_post. Now, it has been moved into struct http_payload and it is up to users to check it.
* Send HTTP headers to payload callbackXavier Del Campo Romero2023-11-181-3/+65
| | | | | | | | | | | | Even if libweb already parses some common headers, such as Content-Length, some users might find it interesting to inspect which headers were received from a request. Since HTTP/1.1 does not define a limit on the number of headers a client can send, for security reasons a maximum value must be provided by the user. Any extra headers shall then be discarded by libweb. An example application showing this new feature is also provided.
* http.c: Fix more issues with partial boundariesXavier Del Campo Romero2023-11-121-19/+36
| | | | | | | | | - http_memmem must not check strlen(a) > n because, in case of a partial boundary, it would wrongfully return NULL. - If one or more characters from a partial boundary are found at the end of a buffer, but the next buffer does not start with the rest of the boundary, the accumulated boundary must be reset, and then look for a new boundary.
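The behaviour described in the first point can be sketched as a search that reports either a full match or a prefix of the boundary that runs up to the end of the buffer. `find_boundary` is a hypothetical stand-in written for this example; it is not the actual http_memmem.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical search: returns the first position in buf[0..n) where
 * needle matches completely, or where a prefix of needle runs up to
 * the end of buf (a partial boundary). Returns NULL otherwise. */
static const char *find_boundary(const char *const needle,
    const char *const buf, const size_t n)
{
    const size_t len = strlen(needle);

    if (!len)
        return NULL;

    for (size_t i = 0; i < n; i++)
    {
        const size_t rem = n - i, cmp = rem < len ? rem : len;

        /* Near the end of buf, only the available bytes are compared,
         * so a needle longer than the remaining data can still match. */
        if (!memcmp(needle, buf + i, cmp))
            return buf + i;
    }

    return NULL;
}
```

A caller can tell both cases apart by comparing the number of bytes left after the returned pointer with `strlen(needle)`. If the next buffer does not continue the boundary, the accumulated partial match has to be discarded, as the second point explains.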
* http.c: Fix several issues with partial boundariesXavier Del Campo Romero2023-11-121-11/+17
| | | | | | | | | | - Writing to m->boundary[len] did not make any sense, as len is not meant to change between calls to read_mf_boundary_byte. - For the same reason, memset(3)ing "len + 1" did not make any sense. - When a partial boundary is found, http_memmem must still return st. - Calling reset_boundary with prev == 0 did not make sense, since that case typically means a partial boundary was found on a previous iteration, so m->blen must not be reset.
* http: Make http_decode_url return intXavier Del Campo Romero2023-11-121-32/+42
| | | | | So far, it was not possible for callers to distinguish decoding errors, as caused by ill-formed input, from fatal errors.
* http.c: Avoid use of dynstr_append_or_ret_nonzeroXavier Del Campo Romero2023-11-121-10/+46
| | | | | | | This macro would return a positive integer on failure. However, functions called by http_update should only return a positive integer for user input-related errors, not fatal errors such as those related to failed memory allocations.
* http.c: Avoid writing body for HEAD requestsXavier Del Campo Romero2023-11-121-1/+8
| | | | As opposed to GET or POST requests, HEAD must not write any body bytes.
* Rename project from slweb to libwebv0.1.0-rc3Xavier Del Campo Romero2023-10-111-1/+1
| | | | | | | | | | | | It was found out there was another project of the same name around (https://git.sr.ht/~strahinja/slweb/), also related to website generation. In order to avoid confusion, a new name has been chosen for this project. Surprisingly, libweb was not in use by any distributions (according to https://repology.org and AUR index), and it should reflect well the intention behind this project i.e., being a library to build web-related stuff.
* http: Support HEADXavier Del Campo Romero2023-10-101-0/+4
|
* http: Use null-terminated string for POST dataXavier Del Campo Romero2023-09-091-3/+3
| | | | | | | | | application/x-www-form-urlencoded data is (or should be) always text, so it is preferable to define struct http_post member "data" as a null-terminated string. For applications already making this assumption, this change should now remove the need for string duplication.
* http: Insert name into http_post_fileXavier Del Campo Romero2023-09-091-0/+1
| | | | | Whereas slcl, the project where slweb started, ignored this field, some applications might require it.
* http: Allow multiple non-file Content-DispositionXavier Del Campo Romero2023-09-091-12/+40
| Now, slweb accepts requests such as:
|
| --boundary
| Content-Disposition: form-data; name="field1"
|
| value1
| --boundary
| Content-Disposition: form-data; name="field2"
|
| value2
| --boundary
| Content-Disposition: form-data; name="field3"; filename="example.txt"
|
| The following breaking changes have been introduced:
|
| Member "dir" from struct http_post was a leftover from the days where slcl and slweb were one project. It did not make sense for slweb, since it should not decide which Content-Disposition names are allowed. In other words, "dir" was only relevant in the scope of slcl.
|
| Member "n" from struct http_post used to have two meanings:
| - The length of a URL-encoded request.
| - The number of files on a multipart/form-data request.
|
| Since "npairs" had to be introduced to struct http_post, it did not make sense to keep this dual meaning any more. Therefore, "n" has been restricted to the former, whereas a new member, called "nfiles", has been introduced for the latter.
* http.c: Use BUFSIZ instead of arbitrary valueXavier Del Campo Romero2023-09-071-1/+1
| | | | | | | | | | | | | | According to C99 7.19.1p3: BUFSIZ is a macro that expands to an integer constant expression that is the size of the buffer used by the setbuf function. In other words, this means BUFSIZ is the most optimal length for a buffer that reads a file into memory in chunks using fread(3). Note: the number of bytes sent to the client might be less than BUFSIZ, so this would act as a bottleneck, no matter how large the buffer passed to fread(3) is.
* http.c: Return error if check_length failsXavier Del Campo Romero2023-09-071-0/+7
| | | | | Otherwise, fatal errors coming from h->cfg.length would go unnoticed, causing slweb to attempt to send a response.
* http.c: Merge payload_{get,post} into process_payloadXavier Del Campo Romero2023-08-131-17/+3
| | | | | Both functions were in fact identical, so there was no reason to keep two definitions rather than one.
* http.c: Remove useless explicit castXavier Del Campo Romero2023-08-011-1/+1
|
* Move header files to subdirectoryXavier Del Campo Romero2023-07-211-1/+1
| | | | | | | | | | | Since slweb is meant as a library, it is advisable to keep public header files under their own directory in order to avoid name clashing i.e., #include "something.h" Now becomes: #include "slweb/something.h"
* http.c: Disallow forbidden filenames during uploadXavier Del Campo Romero2023-07-201-0/+8
| | | | | | | - '.' or '..' must not be used for filenames. - Filenames must not contain forward slashes ('/'). - Filenames must not contain asterisks ('*') to avoid confusion with wildcard expressions.
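The three rules above translate almost directly into code. The sketch below uses a hypothetical `filename_allowed` function. Rejecting empty names is an extra assumption of this example, not something stated in the commit.

```c
#include <string.h>

/* Hypothetical check: returns 1 if name is acceptable for an upload,
 * 0 otherwise. */
static int filename_allowed(const char *const name)
{
    /* Empty names are rejected too (assumption of this sketch). */
    if (!*name || !strcmp(name, ".") || !strcmp(name, ".."))
        return 0;

    /* No directory separators and no wildcard characters. */
    return !strpbrk(name, "/*");
}
```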
* http.c: Use case-insensitive compare for Content-DispositionXavier Del Campo Romero2023-07-201-1/+1
| | | | | HTTP headers are case-insensitive, so the implementation must accept Content-Disposition, content-disposition or any other variation.
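A short sketch of such a comparison using strncasecmp(3) from `<strings.h>`. `is_header` is a hypothetical helper that assumes the header line has not been split into name and value yet.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <string.h>
#include <strings.h>

/* Hypothetical helper: returns nonzero if line starts with the given
 * header name, compared case-insensitively, followed by ':'. */
static int is_header(const char *const line, const char *const name)
{
    const size_t n = strlen(name);

    /* If all n bytes match, line is at least n bytes long, so reading
     * line[n] is safe. */
    return !strncasecmp(line, name, n) && line[n] == ':';
}
```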
* http.c: Accept resources with '&' or '?'Xavier Del Campo Romero2023-07-201-2/+4
| | | | | Otherwise, client requests to resources such as '/me & you', '/?' or '/??preview=1' would fail.
* Avoid crashing on SIGPIPEXavier Del Campo Romero2023-07-201-0/+2
| | | | | | | | | Under some circumstances, clients could cause slcl to receive SIGPIPE. Since this signal was not handled by server.c (i.e., via sigaction(3)), slcl would crash without any error messages printed to stderr. In such a situation, SIGPIPE should not usually be considered a fatal error, so it is preferable to close the connection and keep working.
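One common way to avoid the crash is to ignore the signal, so that a write to a closed connection fails with EPIPE instead of terminating the process. The sketch below shows that approach. Whether the commit ignores SIGPIPE or installs a handler is not visible in this log, and `ignore_sigpipe` is a hypothetical name.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <signal.h>
#include <string.h>

/* Hypothetical helper: makes the process ignore SIGPIPE.
 * Returns 0 on success, -1 on failure (as reported by sigaction). */
static int ignore_sigpipe(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = SIG_IGN;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGPIPE, &sa, NULL);
}
```

After this call, the write functions must check their return values, since a broken connection is then only reported through them.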
* http.c: Decode URL resource and parameters separatelyXavier Del Campo Romero2023-07-201-23/+41
| | | | | | | | | Given the following contrived example request: /example%FB%DC&arg%DE1=examplevalue%AA slcl must decode each token separately, so that percent-encoded characters '&', '=' or '?' do not get accidentally interpreted.
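To make the ordering concrete: the raw request is first split on '?', '&' and '=', and only then is each resulting token percent-decoded, so that an encoded `%26` ends up as a literal '&' inside a token. The decoder below is a sketch with hypothetical names (`hexval`, `decode_token`). Rejecting `%00` is an assumption of this example.

```c
/* Hypothetical helper: value of a hexadecimal digit, or -1. */
static int hexval(const char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    else if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    else if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;

    return -1;
}

/* Hypothetical decoder: percent-decodes one token in place.
 * Returns 0 on success, 1 on ill-formed input. */
static int decode_token(char *s)
{
    char *out = s;

    while (*s)
    {
        if (*s == '%')
        {
            /* Short-circuiting keeps s[2] from being read past the
             * terminating null byte. */
            const int hi = hexval(s[1]);
            const int lo = hi < 0 ? -1 : hexval(s[2]);
            const int v = lo < 0 ? -1 : (hi << 4 | lo);

            /* Reject bad escapes and embedded null bytes. */
            if (v <= 0)
                return 1;

            *out++ = (char)v;
            s += 3;
        }
        else
            *out++ = *s++;
    }

    *out = '\0';
    return 0;
}
```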
* Fix missing error checks for strtoul(3)Xavier Del Campo Romero2023-07-201-1/+9
|
* Return error if write_ctx_free failsXavier Del Campo Romero2023-07-201-4/+10
| | | | | | | | | | | | | | | Otherwise, write_body_mem and write_body_mem would silently fail, causing undefined behaviour. Notes: The return value for write_ctx_free is currently assigned to that of fclose(3), which can be either 0 on success or EOF on failure. However, it makes sense for write_body_mem and write_body_mem to simply check against non-zero. Also, it would not be sensible to return EOF to caller functions, which expect either 0 (success), -1 (fatal error) or 1 (input error).
* Remove HTTP/1.0 supportXavier Del Campo Romero2023-07-201-33/+5
| | | | | | | | | | | Considering http.h defined HTTP/1.1-only responses such as "303 See Other", as well as incoming HTTP/1.1-only features (e.g.: byte serving), it did not make much sense to keep a somewhat broken compatibility against HTTP/1.0. Unfortunately, this breaks support with some existing clients such as lynx(1), even if HTTP/1.0 was already deprecated many years ago. However, even lynx(1) can be configured to support HTTP/1.1.
* Support URL parametersXavier Del Campo Romero2023-07-201-19/+234
| Now, http_payload includes a list of human-readable parameters that can be read (but not modified) by users.
|
| Given the following example link:
|
| /test?key1=value1&key2=value2
|
| This will generate two parameters, with the following values:
|
| {
|     .args =
|     {
|         [0] = {.key = "key1", .value = "value1"},
|         [1] = {.key = "key2", .value = "value2"}
|     },
|
|     .n_args = 2
| }
|
| As expected, if any URL parameters are given, struct http_payload member "resource" is accordingly trimmed so as not to include any parameters. Therefore, considering the example above:
|
| {.args = {...}, .resource = "/test"}
|
| Limitations:
|
| - Since the definition of struct http_arg is both shared by http.h (as a read-only pointer within struct http_payload) and http.c (as a read/write pointer within struct ctx), its members (namely key and value) must remain as read/write pointers, even if they must not be modified by users of http.h.
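The splitting described in this commit can be sketched as follows. `struct arg` is a hypothetical stand-in for struct http_arg, and `split_url` is an invented name. The sketch works in place on a writable copy of the URL and, unlike the library, uses a fixed-size output array.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct http_arg. */
struct arg
{
    char *key, *value;
};

/* Hypothetical splitter: trims url to the resource and stores up to
 * max key/value pairs into args, pointing into url itself.
 * Returns the number of pairs, or -1 on malformed or excess input. */
static int split_url(char *const url, struct arg *const args,
    const size_t max)
{
    size_t n = 0;
    char *p = strchr(url, '?');

    if (!p)
        return 0;

    /* Trim the resource so that it excludes the parameters. */
    *p++ = '\0';

    while (*p)
    {
        char *const amp = strchr(p, '&'), *eq;

        /* Terminate this pair first so '=' is only searched inside it. */
        if (amp)
            *amp = '\0';

        eq = strchr(p, '=');

        if (!eq || eq == p || n >= max)
            return -1;

        *eq = '\0';
        args[n].key = p;
        args[n].value = eq + 1;
        n++;

        if (!amp)
            break;

        p = amp + 1;
    }

    return (int)n;
}
```

For "/test?key1=value1&key2=value2" this yields two pairs and leaves "/test" as the resource, matching the example in the commit message.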
* Define _POSIX_C_SOURCEXavier Del Campo Romero2023-07-201-0/+2
| | | | | This allows using the default compiler defined by make(1) (i.e., c99(1)), thus improving POSIX compatibility.
* http.c: Add missing #includeXavier Del Campo Romero2023-07-201-0/+1
| | | | As required by strncasecmp(3).
* Send response on quota exceededXavier Del Campo Romero2023-07-201-7/+26
| | | | | | | | | | | | | | | | | So far, slcl would just close the connection with a client when the Content-Length of an incoming request exceeded the user quota, without any meaningful information given back to the user. Now, slcl responds with an HTML file with meaningful information about the error. Limitations: - While this commit has been successfully tested on ungoogled-chromium, LibreWolf (and I assume Firefox and any other derivatives too) does not seem to receive the response from the server. - However, this issue only occurred during local testing, but not on remote instances.
* http.c: Minor formatting changeXavier Del Campo Romero2023-07-201-2/+1
|
* Remove(3) f->tmpname from ctx_freeXavier Del Campo Romero2023-07-201-5/+10
| | | | | | Until now, f->tmpname was removed by move_file when the move operation succeeded. However, since an HTTP operation can fail before move_file is called, the temporary file must also be removed.
* Implement user quotaXavier Del Campo Romero2023-07-201-0/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This feature allows admins to set a specific quota for each user, in MiB. This feature is particularly useful for shared instances, where unlimited user storage might be unfeasible or even dangerous for the server. Also, a nice HTML5 <progress> element has been added to the site that shows how much of the quota has been consumed. If no quota is set, slcl falls back to the default behaviour i.e., assume unlimited storage. Limitations: - While HTTP does specify a Content-Length, which determines the length of the whole request, it does not specify how many files are involved or their individual sizes. - Because of this, if multiple files are uploaded simultaneously, the whole request would be dropped if user quota is exceeded, even if not all files exceeded it. - Also, Content-Length adds the length of some HTTP boilerplate (e.g.: boundaries), but slcl must rely on this before accepting the whole request. In other words, this means some requests might be rejected by slcl because of the extra bytes caused by such boilerplate. - When the quota is exceeded, slcl must close the connection so that the rest of the transfer is cancelled. Unfortunately, this means no HTML can be sent back to the customer to inform about the situation.
* http.c: Compare headers as case-insensitiveXavier Del Campo Romero2023-07-201-1/+1
| | | | | Web browsers such as lynx send "Content-length" instead of "Content-Length" (as done by LibreWolf and Chromium).
* http.c: Use persistent cookiesXavier Del Campo Romero2023-07-201-0/+40
| | | | | | Cookies without "Expires" are considered non-persistent and thus can be removed by the web browser. Instead, slcl now sets persistent cookies that last for 1 year.
* http.c: Improve error detection for strtoull(3)Xavier Del Campo Romero2023-07-201-1/+12
| | | | | set_length relies on user input to determine Content-Length, so it should be considered unreliable.
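A sketch of defensive use of strtoull(3) on an untrusted Content-Length value. `parse_length` is a hypothetical function. The actual checks performed by set_length are not shown in this log.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical parser: stores the value of a decimal string into *out.
 * Returns 0 on success, 1 on invalid user input. */
static int parse_length(const char *const s, unsigned long long *const out)
{
    char *end;
    unsigned long long v;

    /* strtoull(3) silently accepts leading whitespace and signs, and
     * would convert "-1" to a huge value, so require a digit first. */
    if (*s < '0' || *s > '9')
        return 1;

    errno = 0;
    v = strtoull(s, &end, 10);

    /* errno catches overflow (ERANGE); *end catches trailing garbage. */
    if (errno || *end)
        return 1;

    *out = v;
    return 0;
}
```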
* Fix memory leak on failed realloc(3)Xavier Del Campo Romero2023-07-201-13/+36
| | | | | | | | | | According to C99 §7.20.3.4: If memory for the new object cannot be allocated, the old object is not deallocated and its value is unchanged. Therefore, a temporary pointer must be used to ensure the original object can still be deallocated should realloc(3) return a null pointer.
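The temporary-pointer pattern described above, as a small sketch. `grow` is a hypothetical helper operating on an array of int. The overflow check is an addition of this example.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: resizes *buf to hold n elements (n > 0).
 * Returns 0 on success. On failure, returns -1 and leaves *buf
 * untouched, so the caller can still use or free(3) it. */
static int grow(int **const buf, const size_t n)
{
    int *tmp;

    /* Avoid overflow in the size computation. */
    if (!n || n > SIZE_MAX / sizeof **buf)
        return -1;

    /* Assigning to *buf directly would lose the only pointer to the
     * original object if realloc(3) failed. */
    if (!(tmp = realloc(*buf, n * sizeof **buf)))
        return -1;

    *buf = tmp;
    return 0;
}
```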