path: root/http.c
Commit message, Author, Age, Files, Lines
* Add http_strcasecmp(3)HEADmasterXavier Del Campo Romero2026-02-271-0/+26
| | | | | | | | | POSIX.1-2008 does not define any locale-specific version of strcasecmp(3), so conversions to lowercase depend on the system locale. Since HTTP header fields must be compared without case sensitivity and must not depend on the system locale, a specialised function that forces the "POSIX" locale is required.
* Add http_strncasecmp(3)Xavier Del Campo Romero2026-02-271-4/+37
| | | | | | | | | POSIX.1-2008 does not define any locale-specific version of strncasecmp(3), so conversions to lowercase depend on the system locale. Since HTTP header fields must be compared without case sensitivity and must not depend on the system locale, a specialised function that forces the "POSIX" locale is required.
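As an illustration of a comparison that cannot be affected by the system locale, here is a minimal sketch that folds only the ASCII letters by hand. This is an alternative technique, not the approach the commits describe (they force the "POSIX" locale), and the names ascii_lower, ascii_strncasecmp and ascii_strcasecmp are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: folds only ASCII 'A'..'Z', ignoring the locale. */
static int ascii_lower(int c)
{
    return (c >= 'A' && c <= 'Z') ? c - 'A' + 'a' : c;
}

/* Sketch of a locale-independent stand-in for strncasecmp(3). */
int ascii_strncasecmp(const char *a, const char *b, size_t n)
{
    for (; n; a++, b++, n--)
    {
        const int ca = ascii_lower((unsigned char)*a);
        const int cb = ascii_lower((unsigned char)*b);

        if (ca != cb)
            return ca - cb;
        else if (!ca)
            break; /* Both strings ended at the same position. */
    }

    return 0;
}

/* Sketch of a locale-independent stand-in for strcasecmp(3). */
int ascii_strcasecmp(const char *a, const char *b)
{
    return ascii_strncasecmp(a, b, (size_t)-1);
}
```

Since HTTP field names are restricted to ASCII, folding only 'A' to 'Z' is enough for header matching, e.g. ascii_strcasecmp("Content-Length", "content-length") returns 0.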
* http.c: Ensure valid object on freelocale(3)Xavier Del Campo Romero2026-02-271-1/+3
| | | | | | | According to POSIX.1-2008, the behaviour is undefined if freelocale(3) is called with an invalid object. [1] [1]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/freelocale.html
* http.c: Break on found headerXavier Del Campo Romero2026-02-271-2/+7
| | | | | Once a given HTTP header from the list has been found, it makes no sense to keep reading the rest from it.
* http.c: Remove unused variableXavier Del Campo Romero2026-02-121-1/+0
|
* http.c: Use expected timezone abbreviationXavier Del Campo Romero2026-02-121-2/+2
| | | | | | | | | | The struct tm instance consumed by append_expire is provided by users and could refer to any timezone, rather than GMT only. According to Wikipedia [1], timezone abbreviations are either 3 or 4 characters long, or use numeric UTC offsets. [1]: https://en.wikipedia.org/wiki/List_of_tz_database_time_zones#Time_zone_abbreviations
* Add HTTP op and resource to length callbackXavier Del Campo Romero2026-02-121-1/+2
| | | | | | Users might want to know which HTTP operation (i.e., POST or PUT) and/or resource is being requested before determining whether the request should be accepted or not.
* http.c: Force POSIX locale on append_expireXavier Del Campo Romero2026-02-121-3/+15
| | | | | | Otherwise, strftime(3) could return different strings depending on the system configuration, and therefore return 0 if the resulting string does not fit into buf.
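To show why the locale matters here, the following sketch formats a date with fixed English day and month names through snprintf(3), so the output length never depends on the system configuration. It is an alternative to forcing the POSIX locale, not the method used by append_expire. The function name format_expires and its zone parameter are assumptions made for the example. Like strftime(3), it returns 0 when the result does not fit.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: writes e.g. "Thu, 01 Jan 2026 00:00:00 GMT" into buf.
 * Returns the number of characters written (excluding the terminating null
 * byte), or 0 if tm is out of range or the string does not fit into buf. */
size_t format_expires(char *buf, size_t n, const struct tm *tm,
    const char *zone)
{
    static const char *const days[] =
        {"Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"};
    static const char *const months[] =
        {"Jan", "Feb", "Mar", "Apr", "May", "Jun",
         "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"};
    int res;

    if (tm->tm_wday < 0 || tm->tm_wday > 6
        || tm->tm_mon < 0 || tm->tm_mon > 11)
        return 0;

    res = snprintf(buf, n, "%s, %02d %s %04d %02d:%02d:%02d %s",
        days[tm->tm_wday], tm->tm_mday, months[tm->tm_mon],
        tm->tm_year + 1900, tm->tm_hour, tm->tm_min, tm->tm_sec, zone);

    /* snprintf(3) reports the length it needed, even when truncated. */
    return res < 0 || (size_t)res >= n ? 0 : (size_t)res;
}
```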
* Add optional expiration date to http_cookie_createXavier Del Campo Romero2026-02-121-23/+13
| | | | | | | So far, libweb had been arbitrarily appending a 1-year expiration date to all HTTP cookies. While good enough for some contexts, libweb should allow users to set up their own, if any, so this arbitrary decision has been eventually removed.
* http.c: Fix attack vector on PUT requestsXavier Del Campo Romero2026-02-091-1/+5
| | | | | | | Without the fix, a malicious user could perform a large number of PUT requests to any endpoint, regardless of being correct or not, so that libweb would allocate a large number of temporary files without removing them, eventually exhausting the system resources.
* Free chunk/step user data on context freeXavier Del Campo Romero2025-10-081-1/+1
| | | | | | | | | So far, users had no way to free user-defined data allocated inside the chunk/step function pointers whenever an error occurred. Now, the free callback can be also used in conjunction with chunk/step, so that user-defined data is now deallocated when the operation finishes (in the case of chunk-encoded data) or an error occurs.
* Implement HTTP chunk encodingXavier Del Campo Romero2025-10-081-5/+133
| | | | | | A new function pointer, namely chunk, has been added to struct http_response so that library users can generate their message bodies dynamically.
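For reference, this is how a single chunk is framed on the wire under the HTTP/1.1 chunked transfer coding: the payload size in hexadecimal, CRLF, the payload, and a final CRLF. The sketch below is not taken from libweb, and the name encode_chunk is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: frames len bytes from data as one HTTP/1.1 chunk.
 * Returns the number of bytes written to out, or 0 if out is too small.
 * The output is not null-terminated. A len of 0 produces "0\r\n\r\n",
 * i.e. the last chunk followed by an empty trailer section. */
size_t encode_chunk(char *out, size_t outsz, const void *data, size_t len)
{
    /* Chunk header: payload size in hexadecimal, followed by CRLF. */
    const int hdr = snprintf(out, outsz, "%zx\r\n", len);
    size_t room;

    if (hdr < 0 || (size_t)hdr >= outsz)
        return 0;

    room = outsz - (size_t)hdr;

    /* The payload and its trailing CRLF must fit in the remaining space. */
    if (room < 2 || len > room - 2)
        return 0;

    if (len)
        memcpy(out + hdr, data, len);

    memcpy(out + hdr + len, "\r\n", 2);
    return (size_t)hdr + len + 2;
}
```

A body generated piece by piece would call such a helper once per piece and once more with a length of zero to terminate the response.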
* Fix design issues with async responses, add async exampleXavier Del Campo Romero2025-10-061-42/+110
| | | | | | | | | | | | | | | | | | struct http_response did not provide users any void * that could be used to maintain a state between calls to an asynchronous HTTP response. On the other hand, the user pointer could not be used for this purpose, since it is shared among all HTTP clients for a given struct handler instance. Moreover, the length callback was still not supporting this feature, which in fact might be required by some users. Implementing this was particularly challenging, as this broke the current assumption that all bytes on a call to http_read were being processed. Now, since a client request can only be partially processed because of the length callback, http_read must take this into account so that the remaining bytes are still available for future calls, before reading again from the file descriptor.
* Implement async HTTP responsesXavier Del Campo Romero2025-09-241-43/+106
| | | | | | | | | | | | Sometimes, library users cannot return an HTTP response as soon as the request is received, or the operations that are required to generate it can take a long time. In order to solve this, libweb adds a new member to struct http_response, namely step, which must be assigned to a function whenever an HTTP response should be generated in a non-blocking manner. Leaving the function pointer as null will fall back to the default behaviour.
* http.c: Always set SameSite=Strict to cookiesXavier Del Campo Romero2025-09-231-6/+7
| | | | | | | This cookie attribute helps mitigate CSRF attacks, while not requiring the server to store additional data. [1] [1]: https://owasp.org/www-community/SameSite
* Implement HTTP byte servingXavier Del Campo Romero2024-11-111-26/+259
| | | | | | | | | | | | | | This commit allows the HTTP server to return partial content to clients, rather than returning the whole resource. This can be particularly useful for applications such as audio/video playback or showing large PDF files. Notes: - Applications must not care about partial contents i.e., if a valid user request was made, applications must still return HTTP status 200 ("OK"), as usual. The HTTP server will then translate the status code to 206 ("Partial Content") if required.
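A server translating a 200 response into 206 first needs the byte interval requested by the client. The sketch below parses the two simplest forms of the Range header value, "bytes=first-last" and "bytes=first-". It is an assumption-laden illustration rather than libweb's parser: the name parse_range is hypothetical, and suffix ranges ("bytes=-N") and multiple ranges are deliberately rejected.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parses a Range header value against a resource of
 * size bytes. Returns 0 and sets *first and *last (inclusive) on success,
 * or 1 if the value is ill-formed, unsupported or unsatisfiable. */
int parse_range(const char *value, unsigned long long size,
    unsigned long long *first, unsigned long long *last)
{
    static const char unit[] = "bytes=";
    char *end;

    if (!size || strncmp(value, unit, sizeof unit - 1))
        return 1;

    value += sizeof unit - 1;

    /* strtoull(3) accepts signs and whitespace, so require a digit. */
    if (*value < '0' || *value > '9')
        return 1;

    errno = 0;
    *first = strtoull(value, &end, 10);

    if (errno || *end != '-')
        return 1;

    value = end + 1;

    if (!*value)
        *last = size - 1; /* Open-ended range: serve up to the last byte. */
    else if (*value < '0' || *value > '9')
        return 1;
    else
    {
        errno = 0;
        *last = strtoull(value, &end, 10);

        if (errno || *end)
            return 1;
        else if (*last >= size)
            *last = size - 1; /* Clamp to the end of the resource. */
    }

    return *first > *last;
}
```

For a 1000-byte resource, "bytes=500-" yields the interval 500 to 999, which would be reported to the client as "Content-Range: bytes 500-999/1000".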
* http.c: Always call ctx_to_payloadXavier Del Campo Romero2024-10-041-46/+10
| | | | | | | | | Defining each struct http_payload manually had the risk of missing some member on the initializer. This was in fact the case for `n_headers` and `headers`, which were only assigned by ctx_to_payload, and therefore some specific HTTP requests would mistakenly not reflect such information to users.
* http.c: Avoid isspace(3) in get_boundaryXavier Del Campo Romero2024-10-041-2/+1
| | | | | | | | | According to POSIX.1-2008, this function is sensitive to the system locale, which might then have different definitions for a whitespace character. Therefore, it is safer to only check against ' ' so as to remove such a dependency.
* http.c: Fix ending boundaries not followed by CRLFXavier Del Campo Romero2024-08-221-41/+84
| | | | | | | According to RFC 2046, section 5.1.1, end boundaries might not be followed by CRLF. However, so far libweb naively relied on this behaviour as major implementations, such as cURL, Chromium or Gecko always add the optional CRLF, whereas Dillo does not.
* http.c: Accept double quotes on boundariesXavier Del Campo Romero2024-08-221-7/+66
| | | | | | | | | | | "multipart/form-data"-encoded POST requests might use double quotes for their boundaries. While this is required when invalid characters are otherwise used (e.g.: ':'), some web clients always insert double quotes. Additionally, according to RFC 2046 section 5.1.1, the boundary parameter consists of 1 to 70 characters, but libweb was not imposing such restrictions.
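The two rules above (optional double quotes, 1 to 70 characters) can be sketched as follows. The function name extract_boundary is hypothetical, and the sketch assumes that the caller has already located the text after "boundary=" and that nothing follows the parameter value.

```c
#include <stddef.h>
#include <string.h>

enum {BOUNDARY_MAX = 70};

/* Hypothetical helper: copies the boundary from the text that follows
 * "boundary=" into out, stripping optional double quotes.
 * Returns 0 on success, or 1 if the value is ill-formed. */
int extract_boundary(const char *value, char out[BOUNDARY_MAX + 1])
{
    size_t len;

    if (*value == '"')
    {
        const char *const end = strchr(++value, '"');

        /* A closing quote is required. This sketch assumes that no
         * other parameter follows it. */
        if (!end || end[1])
            return 1;

        len = (size_t)(end - value);
    }
    else
        len = strlen(value);

    /* RFC 2046, section 5.1.1: boundaries are 1 to 70 characters long. */
    if (!len || len > BOUNDARY_MAX)
        return 1;

    memcpy(out, value, len);
    out[len] = '\0';
    return 0;
}
```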
* http.c: Remove unneeded parameterXavier Del Campo Romero2024-08-221-17/+17
| | | | | | | | | | This parameter was rendered obsolete after the following commit: commit b0accd099fa8c5110d4c3c68830ad6fd810ca3ec Author: Xavier Del Campo Romero <xavi.dcr@tutanota.com> Date: Fri Nov 24 00:52:50 2023 +0100 http.c: Unify read operations
* http.c: Remove unused variableXavier Del Campo Romero2024-08-221-1/+1
|
* http.c: Fix memory leak on read failureXavier Del Campo Romero2024-08-221-9/+9
| | | | | | | For some unknown reason, ctx_free was only called by update_lstate, but this is not the only function that modifies a struct ctx instance. Since struct ctx is related to read operations, ctx_free must instead be called whenever http_read fails.
* http.c: Fix wrong checkXavier Del Campo Romero2024-08-221-1/+1
| | | | | | | | | | | | | p->f is a FILE *, so it is invalid to check against negative values. This bug was introduced when p->fd, a file descriptor, was replaced with p->f, a FILE *, by the following commit: commit b0accd099fa8c5110d4c3c68830ad6fd810ca3ec Author: Xavier Del Campo Romero <xavi.dcr@tutanota.com> Date: Fri Nov 24 00:52:50 2023 +0100 http.c: Unify read operations
* Limit maximum multipart/form-data pairs and filesXavier Del Campo Romero2024-02-191-1/+17
| | | | | | A malicious user could inject an infinite number of empty files or key/value pairs into a request in order to exhaust the device's resources.
* http.c: Solve performance issues on POST uploadsXavier Del Campo Romero2024-01-201-47/+91
| | | | | | | Profiling showed that reading multipart/form POST uploads byte-by-byte was too slow and typically led to maximum CPU usage. Therefore, the older approach (as done up to commit 7efc2b3a) was more efficient, even if the resulting code was a bit uglier.
* http.c: Unify read operationsXavier Del Campo Romero2023-11-241-159/+178
| | | | | | | | | | | | | | | | | So far, libweb would perform different read operations depending on its state: - For HTTP headers or request bodies, one byte at a time was read. - For multipart/form-data, up to BUFSIZ bytes at a time were read. However, this caused a significant extra number of syscalls for no reason and would increase code complexity, especially when parsing multipart/form-data boundaries. Now, http_read always reads up to BUFSIZ bytes at a time and processes them in a loop. Apart from reducing code complexity, this should increase performance due to the (much) lower number of syscalls required.
* http.c: Limit multipart/form-data to POSTXavier Del Campo2023-11-201-0/+6
|
* http: Add support for PUTXavier Del Campo2023-11-201-38/+224
| | | | | | | | | | | | | | Notes: - Since curl would use the "Expect: 100-continue" header field for PUT operations, this was a good operation to fix the existing issues in its implementation. Breaking changes: - expect_continue is no longer exclusive to struct http_post. Now, it has been moved into struct http_payload and it is up to users to check it.
* Send HTTP headers to payload callbackXavier Del Campo Romero2023-11-181-3/+65
| | | | | | | | | | | | Even if libweb already parses some common headers, such as Content-Length, some users might find it interesting to inspect which headers were received from a request. Since HTTP/1.1 does not define a limit on the maximum number of headers a client can send, for security reasons a maximum value must be provided by the user. Any extra headers shall then be discarded by libweb. An example application showing this new feature is also provided.
* http.c: Fix more issues with partial boundariesXavier Del Campo Romero2023-11-121-19/+36
| | | | | | | | | - http_memmem must not check strlen(a) > n because, in case of a partial boundary, it would wrongfully return NULL. - If one or more characters from a partial boundary are found at the end of a buffer, but the next buffer does not start with the rest of the boundary, the accumulated boundary must be reset, and then look for a new boundary.
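The behaviour described in these two points can be illustrated with a search routine that reports either a full match or a partial match running up to the end of the buffer. This is a simplified sketch and not the actual http_memmem. The name find_boundary and the matched output parameter are assumptions, and needle is assumed to be non-empty.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: looks for needle in the n bytes at buf.
 * Returns the start of a full match, or of a partial match, i.e. a proper
 * prefix of needle that runs up to the end of buf. *matched receives the
 * number of needle bytes that matched. Returns NULL if neither is found. */
const char *find_boundary(const char *needle, const char *buf, size_t n,
    size_t *matched)
{
    const size_t len = strlen(needle);

    for (size_t i = 0; i < n; i++)
    {
        /* Near the end of buf, compare only the bytes that are left,
         * instead of giving up because len exceeds them. */
        const size_t rem = n - i, cmp = rem < len ? rem : len;

        if (!memcmp(buf + i, needle, cmp))
        {
            *matched = cmp;
            return buf + i;
        }
    }

    return NULL;
}
```

When *matched is smaller than strlen(needle), the caller would check whether the next buffer starts with needle + *matched. If it does not, the accumulated bytes are ordinary data and the search starts over.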
* http.c: Fix several issues with partial boundariesXavier Del Campo Romero2023-11-121-11/+17
| | | | | | | | | | - Writing to m->boundary[len] did not make any sense, as len is not meant to change between calls to read_mf_boundary_byte. - For the same reason, memset(3)ing "len + 1" did not make any sense. - When a partial boundary is found, http_memmem must still return st. - Calling reset_boundary with prev == 0 did not make sense, since that case typically means a partial boundary was found on a previous iteration, so m->blen must not be reset.
* http: Make http_decode_url return intXavier Del Campo Romero2023-11-121-32/+42
| | | | | So far, it was not possible for callers to distinguish decoding errors, as caused by ill-formed input, from fatal errors.
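A percent-decoder with a three-way return value might look like the sketch below. The name decode_url and the exact codes (0 for success, 1 for ill-formed input, -1 for fatal errors) are assumptions, not the signature of http_decode_url. Rejecting "%00" is a design choice of this sketch only, so that the result remains a usable C string.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns the value of a hex digit, or -1. */
static int hexval(int c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    else if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    else if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;

    return -1;
}

/* Percent-decodes url into a newly allocated string stored in *out.
 * Returns 0 on success, 1 on ill-formed input, -1 on fatal errors. */
int decode_url(const char *url, char **out)
{
    char *const buf = malloc(strlen(url) + 1);
    char *p = buf;

    if (!buf)
        return -1; /* Fatal: not caused by user input. */

    while (*url)
        if (*url != '%')
            *p++ = *url++;
        else
        {
            const int hi = hexval((unsigned char)url[1]);
            /* url[2] is only read if url[1] was a valid hex digit. */
            const int lo = hi < 0 ? -1 : hexval((unsigned char)url[2]);

            if (lo < 0 || !(hi | lo))
            {
                free(buf);
                return 1; /* Ill-formed input: the client's fault. */
            }

            *p++ = (char)(hi << 4 | lo);
            url += 3;
        }

    *p = '\0';
    *out = buf;
    return 0;
}
```

With this split, a caller can answer a positive result with a client error response and treat a negative one as a reason to abort.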
* http.c: Avoid use of dynstr_append_or_ret_nonzeroXavier Del Campo Romero2023-11-121-10/+46
| | | | | | | This macro would return a positive integer on failure. However, functions called by http_update should only return a positive integer for user input-related errors, not fatal errors such as those related to failed memory allocations.
* http.c: Avoid writing body for HEAD requestsXavier Del Campo Romero2023-11-121-1/+8
| | | | As opposed to GET or POST requests, HEAD must not write any body bytes.
* Rename project from slweb to libwebv0.1.0-rc3Xavier Del Campo Romero2023-10-111-1/+1
| | | | | | | | | | | | It was found out there was another project of the same name around (https://git.sr.ht/~strahinja/slweb/), also related to website generation. In order to avoid confusion, a new name has been chosen for this project. Surprisingly, libweb was not in use by any distributions (according to https://repology.org and AUR index), and it should reflect well the intention behind this project i.e., being a library to build web-related stuff.
* http: Support HEADXavier Del Campo Romero2023-10-101-0/+4
|
* http: Use null-terminated string for POST dataXavier Del Campo Romero2023-09-091-3/+3
| | | | | | | | | application/x-www-form-urlencoded-data is (or should be) always text, so it is preferable to define struct http_post member "data" as a null-terminated string. For applications already making this assumption, this change should now remove the need for string duplication.
* http: Insert name into http_post_fileXavier Del Campo Romero2023-09-091-0/+1
| | | | | Whereas slcl, the project where slweb started, ignored this field, some applications might require it.
* http: Allow multiple non-file Content-DispositionXavier Del Campo Romero2023-09-091-12/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now, slweb accepts requests such as: --boundary Content-Disposition: form-data; name="field1" value1 --boundary Content-Disposition: form-data; name="field2" value2 --boundary Content-Disposition: form-data; name="field3"; filename="example.txt" The following breaking changes have been introduced: Member "dir" from struct http_post was a leftover from the days where slcl and slweb were one project. It did not make sense for slweb, since it should not decide which Content-Disposition names are allowed. In other words, "dir" was only relevant in the scope of slcl. Member "n" from struct http_post used to have two meanings: - The length of a URL-encoded request. - The number of files on a multipart/form-data request. Since "npairs" had to be introduced to struct http_post, it did not make sense to keep this dual meaning any more. Therefore, "n" has been restricted to the former, whereas a new member, called "nfiles", has been introduced for the latter.
* http.c: Use BUFSIZ instead of arbitrary valueXavier Del Campo Romero2023-09-071-1/+1
| | | | | | | | | | | | | | According to C99 7.19.1p3: BUFSIZ is a macro that expands to an integer constant expression that is the size of the buffer used by the setbuf function. In other words, this means BUFSIZ is the most optimal length for a buffer that reads a file into memory in chunks using fread(3). Note: the number of bytes sent to the client might be less than BUFSIZ, so this would act as a bottleneck, no matter how large the buffer passed to fread(3) is.
* http.c: Return error if check_length failsXavier Del Campo Romero2023-09-071-0/+7
| | | | | Otherwise, fatal errors coming from h->cfg.length would go unnoticed, causing slweb to attempt to send a response.
* http.c: Merge payload_{get,post} into process_payloadXavier Del Campo Romero2023-08-131-17/+3
| | | | | Both functions were in fact identical, so there was no reason to keep two definitions rather than one.
* http.c: Remove useless explicit castXavier Del Campo Romero2023-08-011-1/+1
|
* Move header files to subdirectoryXavier Del Campo Romero2023-07-211-1/+1
| | | | | | | | | | | Since slweb is meant as a library, it is advisable to keep public header files under their own directory in order to avoid name clashing i.e., #include "something.h" Now becomes: #include "slweb/something.h"
* http.c: Disallow forbidden filenames during uploadXavier Del Campo Romero2023-07-201-0/+8
| | | | | | | - '.' or '..' must not be used for filenames. - Filenames must not contain forward slashes ('/'). - Filenames must not contain asterisks ('*') to avoid confusion with wildcard expressions.
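The three rules above fit in a single expression. In this sketch, the name filename_ok is hypothetical, and rejecting empty names is an extra assumption not stated in the commit message.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if name is acceptable as an upload
 * filename, or 0 otherwise. */
int filename_ok(const char *name)
{
    return *name                  /* Assumption: empty names are rejected. */
        && strcmp(name, ".")      /* Current directory. */
        && strcmp(name, "..")     /* Parent directory. */
        && !strpbrk(name, "/*");  /* Path separators and wildcards. */
}
```

Names that merely start with dots, such as "..a", are still accepted, since only the exact names "." and ".." are special.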
* http.c: Use case-insensitive compare for Content-DispositionXavier Del Campo Romero2023-07-201-1/+1
| | | | | HTTP headers are case-insensitive, so the implementation must accept Content-Disposition, content-disposition or any other variation.
* http.c: Accept resources with '&' or '?'Xavier Del Campo Romero2023-07-201-2/+4
| | | | | Otherwise, client requests to resources such as '/me & you', '/?' or '/??preview=1' would fail.
* Avoid crashing on SIGPIPEXavier Del Campo Romero2023-07-201-0/+2
| | | | | | | | | Under some circumstances, clients could cause SIGPIPE to slcl. Since this signal was not handled by server.c (i.e., via sigaction(3)), slcl would crash without any error messages printed to stderr. In such a situation, SIGPIPE should not usually be considered a fatal error, so it is preferable to close the connection and keep working.
* http.c: Decode URL resource and parameters separatelyXavier Del Campo Romero2023-07-201-23/+41
| | | | | | | | | Given the following contrived example request: /example%FB%DC&arg%DE1=examplevalue%AA slcl must decode each token separately, so that percent-encoded characters '&', '=' or '?' do not get accidentally interpreted.
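The ordering issue can be shown with a small sketch: the request target is split on the literal separator first, and only then is each token decoded, so an encoded "%3F" never acts as a separator. The names split_target and decode_inplace are hypothetical, and splitting at the first '?' is an assumption of this example rather than a description of slcl's actual rules.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the value of a hex digit, or -1. */
static int hexval(int c)
{
    static const char digits[] = "0123456789abcdef";
    const char *const p = c ? strchr(digits, c | 0x20) : NULL;

    /* "c | 0x20" lowercases ASCII letters and leaves ASCII digits as is. */
    return p && (c >= 'A' || (c >= '0' && c <= '9')) ? (int)(p - digits) : -1;
}

/* Percent-decodes s in place. Returns 0 on success, 1 if ill-formed. */
static int decode_inplace(char *s)
{
    char *w = s;

    while (*s)
        if (*s != '%')
            *w++ = *s++;
        else
        {
            const int hi = hexval((unsigned char)s[1]);
            const int lo = hi < 0 ? -1 : hexval((unsigned char)s[2]);

            if (lo < 0)
                return 1;

            *w++ = (char)(hi << 4 | lo);
            s += 3;
        }

    *w = '\0';
    return 0;
}

/* Splits target (modified in place) at the first literal '?', then decodes
 * the resource alone. *args is left encoded so that the caller can split
 * it on '&' and '=' before decoding each token. */
int split_target(char *target, char **resource, char **args)
{
    char *const q = strchr(target, '?');

    if (q)
        *q = '\0';

    *resource = target;
    *args = q ? q + 1 : NULL;
    return decode_inplace(target);
}
```

Decoding the whole target before splitting would turn "/me%3Fyou" into "/me?you" and wrongly cut the resource at the decoded '?'.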