| Commit message | Author | Age | Files | Lines |
|
POSIX.1-2008 does not define any locale-independent version of
strcasecmp(3), so conversions to lowercase depend on the system locale.
Since HTTP header fields must be compared case-insensitively and must
not depend on the system locale, a specialised function that forces the
"POSIX" locale is required.
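A locale-free comparison can also be sketched by folding only the ASCII range; the function names below are illustrative, not libweb's actual API:

```c
#include <assert.h>

/* Hypothetical sketch: case-insensitive comparison that folds only
 * ASCII 'A'..'Z', so the result never depends on the system locale. */
static int ascii_lower(int c)
{
    return (c >= 'A' && c <= 'Z') ? c - 'A' + 'a' : c;
}

int ascii_casecmp(const char *a, const char *b)
{
    while (*a && ascii_lower((unsigned char)*a) == ascii_lower((unsigned char)*b)) {
        a++;
        b++;
    }

    return ascii_lower((unsigned char)*a) - ascii_lower((unsigned char)*b);
}
```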
|
POSIX.1-2008 does not define any locale-independent version of
strncasecmp(3), so conversions to lowercase depend on the system locale.
Since HTTP header fields must be compared case-insensitively and must
not depend on the system locale, a specialised function that forces the
"POSIX" locale is required.
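One way to build such a specialised function is to pair newlocale(3) with strncasecmp_l(3), both specified by POSIX.1-2008; a minimal sketch with a hypothetical name:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <strings.h>

/* Hypothetical sketch of the specialised function, built on top of
 * newlocale(3) and strncasecmp_l(3). Creating the locale object per
 * call keeps the example short; a real implementation would cache it. */
int posix_strncasecmp(const char *a, const char *b, size_t n)
{
    locale_t l = newlocale(LC_ALL_MASK, "POSIX", (locale_t)0);
    int r;

    if (l == (locale_t)0)
        /* Never freelocale(3) an invalid object: POSIX.1-2008 leaves
         * that behaviour undefined. Fall back to the current locale. */
        return strncasecmp(a, b, n);

    r = strncasecmp_l(a, b, n, l);
    freelocale(l);
    return r;
}
```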
|
According to POSIX.1-2008, the behaviour is undefined if freelocale(3)
is called with an invalid object. [1]
[1]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/freelocale.html
|
Once a given HTTP header from the list has been found, it makes no sense
to keep reading the rest of the list.
|
The struct tm instance consumed by append_expire is provided by users
and could refer to any timezone, rather than GMT only.
According to Wikipedia [1], timezone abbreviations are either 3 or 4
characters long, or use numeric UTC offsets.
[1]: https://en.wikipedia.org/wiki/List_of_tz_database_time_zones#Time_zone_abbreviations
|
Users might want to know which HTTP operation (i.e., POST or PUT) and/or
resource is being requested before determining whether the request
should be accepted or not.
|
Otherwise, strftime(3) could return different strings depending on the
system configuration, and might therefore return 0 if the resulting
string does not fit into buf.
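One locale-proof approach, shown here only as a sketch (http_date is a hypothetical name), is to format RFC 7231's IMF-fixdate by hand with fixed English names:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Sketch: format an IMF-fixdate ("Sun, 06 Nov 1994 08:49:37 GMT")
 * with fixed English names, so the output never depends on the
 * locale. Returns 0 on success, -1 if buf is too small. */
static const char *const days[] = {
    "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"
};
static const char *const months[] = {
    "Jan", "Feb", "Mar", "Apr", "May", "Jun",
    "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
};

int http_date(char *buf, size_t n, const struct tm *tm)
{
    int r = snprintf(buf, n, "%s, %02d %s %04d %02d:%02d:%02d GMT",
        days[tm->tm_wday], tm->tm_mday, months[tm->tm_mon],
        tm->tm_year + 1900, tm->tm_hour, tm->tm_min, tm->tm_sec);

    return r < 0 || (size_t)r >= n ? -1 : 0;
}
```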
|
So far, libweb had been arbitrarily appending a 1-year expiration date
to all HTTP cookies. While good enough for some contexts, libweb should
allow users to set their own expiration date, if any, so this arbitrary
decision has been removed.
|
Without the fix, a malicious user could perform a large number of PUT
requests to any endpoint, whether valid or not, causing libweb to
allocate a large number of temporary files without removing them,
eventually exhausting the system's resources.
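One common mitigation, sketched below under the assumption that temporary files come from mkstemp(3), is to unlink(2) each file immediately after creation, so the kernel reclaims it when the descriptor is closed, even on error paths:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Sketch: create an anonymous temporary file. Because the name is
 * unlinked immediately, the file cannot leak even if the process
 * aborts before cleaning up. */
int anon_tmpfile(void)
{
    char path[] = "/tmp/libweb-XXXXXX";
    int fd = mkstemp(path);

    if (fd >= 0)
        unlink(path);

    return fd;
}
```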
|
So far, users had no way to free user-defined data allocated inside the
chunk/step function pointers whenever an error occurred.
Now, the free callback can also be used in conjunction with chunk/step,
so that user-defined data is deallocated when the operation finishes
(in the case of chunk-encoded data) or an error occurs.
|
A new function pointer, namely chunk, has been added to struct
http_response so that library users can generate their message bodies
dynamically.
|
struct http_response did not provide users with any void * that could be
used to maintain state between calls of an asynchronous HTTP response.
The existing user pointer could not serve this purpose, since it is
shared among all HTTP clients of a given struct handler instance.
Moreover, the length callback still did not support this feature, which
some users might in fact require. Implementing it was particularly
challenging, as it broke the previous assumption that all bytes were
processed on a call to http_read.
Now that a client request can be only partially processed because of
the length callback, http_read must take this into account so that the
remaining bytes stay available to future calls, before reading again
from the file descriptor.
|
Sometimes, library users cannot return an HTTP response as soon as the
request is received, or the operations required to generate it can take
a long time.
To solve this, libweb adds a new member to struct http_response, namely
step, which must be assigned a function whenever an HTTP response
should be generated in a non-blocking manner.
Leaving the function pointer null falls back to the default behaviour.
|
This cookie attribute helps mitigate CSRF attacks, while not requiring
the server to store additional data. [1]
[1]: https://owasp.org/www-community/SameSite
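For illustration, a Set-Cookie field carrying the attribute could be built as follows (the helper and cookie names are made up):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Sketch: build a Set-Cookie header with the SameSite attribute.
 * Returns 0 on success, -1 if buf is too small. */
int set_cookie(char *buf, size_t n, const char *name, const char *value)
{
    int r = snprintf(buf, n,
        "Set-Cookie: %s=%s; SameSite=Strict; HttpOnly\r\n", name, value);

    return r < 0 || (size_t)r >= n ? -1 : 0;
}
```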
|
This commit allows the HTTP server to return partial content to clients,
rather than always returning the whole resource. This can be
particularly useful for applications such as audio/video playback or
the display of large PDF files.
Notes:
- Applications need not care about partial content, i.e., if a valid
user request was made, applications must still return HTTP status 200
("OK"), as usual. The HTTP server then translates the status code to
206 ("Partial Content") if required.
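For reference, a 206 response differs from a 200 mainly in the status line and the Content-Range field; a hedged sketch of that translation (partial_head is a hypothetical name, and the surrounding Range parsing is omitted):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Sketch: status line plus Content-Range for a ranged response.
 * Returns 0 on success, -1 if buf is too small. */
int partial_head(char *buf, size_t n,
    unsigned long start, unsigned long end, unsigned long total)
{
    int r = snprintf(buf, n,
        "HTTP/1.1 206 Partial Content\r\n"
        "Content-Range: bytes %lu-%lu/%lu\r\n",
        start, end, total);

    return r < 0 || (size_t)r >= n ? -1 : 0;
}
```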
|
Defining each struct http_payload manually ran the risk of missing some
member in the initializer.
This was in fact the case for `n_headers` and `headers`, which were only
assigned by ctx_to_payload, so some HTTP requests would mistakenly not
expose that information to users.
|
According to POSIX.1-2008, this function is sensitive to the system
locale, which might then define whitespace characters differently.
Therefore, it is safer to check only against ' ', so as to remove that
dependency.
|
According to RFC 2046, section 5.1.1, end boundaries might not be
followed by a CRLF. However, so far libweb naively relied on the CRLF
being present: major implementations, such as cURL, Chromium or Gecko,
always add the optional CRLF, whereas Dillo does not.
|
"multipart/form-data"-encoded POST requests might use double quotes
around their boundaries. While this is only required when
otherwise-invalid characters are used (e.g. ':'), some web clients
always insert double quotes.
Additionally, according to RFC 2046, section 5.1.1, the boundary
parameter consists of 1 to 70 characters, but libweb was not enforcing
this restriction.
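The two rules can be sketched in one normalisation helper (parse_boundary is a hypothetical name, not libweb's API): strip one pair of surrounding double quotes and reject lengths outside RFC 2046's 1..70 range.

```c
#include <assert.h>
#include <string.h>

/* Sketch: copy a boundary parameter into out, dropping one pair of
 * surrounding double quotes and enforcing RFC 2046's 1..70 length.
 * Returns 0 on success, -1 on an invalid boundary or small buffer. */
int parse_boundary(const char *v, char *out, size_t outsz)
{
    size_t len = strlen(v);

    if (len >= 2 && v[0] == '"' && v[len - 1] == '"') {
        v++;
        len -= 2;
    }

    if (len < 1 || len > 70 || len >= outsz)
        return -1;

    memcpy(out, v, len);
    out[len] = '\0';
    return 0;
}
```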
|
This parameter was rendered obsolete after the following commit:
commit b0accd099fa8c5110d4c3c68830ad6fd810ca3ec
Author: Xavier Del Campo Romero <xavi.dcr@tutanota.com>
Date: Fri Nov 24 00:52:50 2023 +0100
http.c: Unify read operations
|
For some unknown reason, ctx_free was only called by update_lstate, even
though this is not the only function that modifies a struct ctx
instance. Since struct ctx is related to read operations, ctx_free must
instead be called whenever http_read fails.
|
p->f is a FILE *, so it is invalid to check against negative values.
This bug was introduced when p->fd, a file descriptor, was replaced with
p->f, a FILE *, by the following commit:
commit b0accd099fa8c5110d4c3c68830ad6fd810ca3ec
Author: Xavier Del Campo Romero <xavi.dcr@tutanota.com>
Date: Fri Nov 24 00:52:50 2023 +0100
http.c: Unify read operations
|
A malicious user could inject an unbounded number of empty files or
key/value pairs into a request in order to exhaust the device's
resources.
|
Profiling showed that reading multipart/form POST uploads byte by byte
was too slow and typically led to maximum CPU usage. The older approach
(as done up to commit 7efc2b3a) was therefore more efficient, even if
the resulting code was a bit uglier.
|
So far, libweb would perform different read operations depending on its
state:
- For HTTP headers or request bodies, one byte at a time was read.
- For multipart/form-data, up to BUFSIZ bytes at a time were read.
However, this caused a significant number of extra syscalls for no good
reason and increased code complexity, especially when parsing
multipart/form-data boundaries.
Now, http_read always reads up to BUFSIZ bytes at a time and processes
them in a loop. Apart from reducing code complexity, this should
increase performance due to the (much) lower number of syscalls
required.
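The scheme boils down to the classic chunked read loop; a simplified sketch in which process() stands in for libweb's parsing state machine:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Stand-in for libweb's parsing state machine. */
static void process(const char *buf, ssize_t n)
{
    (void)buf;
    (void)n;
}

/* Sketch: read up to BUFSIZ bytes per syscall and hand each chunk to
 * the parser, instead of issuing one read(2) per byte. */
int read_all(int fd)
{
    char buf[BUFSIZ];
    ssize_t n;

    while ((n = read(fd, buf, sizeof buf)) > 0)
        process(buf, n);

    return n == 0 ? 0 : -1;
}
```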
|
Notes:
- Since curl uses the "Expect: 100-continue" header field for PUT
operations, this was a good operation for fixing the existing issues in
that implementation.
Breaking changes:
- expect_continue is no longer exclusive to struct http_post. It has
been moved into struct http_payload, and it is up to users to check it.
|
Even if libweb already parses some common headers, such as
Content-Length, some users might find it useful to inspect which
headers were received with a request.
Since HTTP/1.1 does not define a limit on the maximum number of headers
a client can send, for security reasons a maximum value must be provided
by the user. Any extra headers are then discarded by libweb.
An example application showing this new feature is also provided.
|
- http_memmem must not check strlen(a) > n because, in the case of a
partial boundary, it would wrongly return NULL.
- If one or more characters of a partial boundary are found at the end
of a buffer, but the next buffer does not start with the rest of the
boundary, the accumulated boundary must be reset before looking for a
new one.
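The second point can be illustrated with a helper that reports how many leading bytes of the boundary match the tail of a buffer; a caller resets that count when the next buffer does not continue the match. The helper below is an illustration, not http_memmem itself:

```c
#include <assert.h>
#include <string.h>

/* Illustration: length of the longest prefix of needle that is also a
 * suffix of buf[0..n). A non-zero result means a boundary *might*
 * continue in the next buffer. */
size_t partial_suffix(const char *buf, size_t n, const char *needle)
{
    size_t nl = strlen(needle), k = n < nl ? n : nl;

    for (; k > 0; k--)
        if (!memcmp(buf + n - k, needle, k))
            return k;

    return 0;
}
```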
|
- Writing to m->boundary[len] did not make any sense, as len is not
meant to change between calls to read_mf_boundary_byte.
- For the same reason, memset(3)ing "len + 1" bytes did not make any
sense.
- When a partial boundary is found, http_memmem must still return st.
- Calling reset_boundary with prev == 0 did not make sense, since that
case typically means a partial boundary was found on a previous
iteration, so m->blen must not be reset.
|
So far, it was not possible for callers to distinguish decoding errors,
as caused by ill-formed input, from fatal errors.
|
This macro would return a positive integer on failure. However,
functions called by http_update should only return a positive integer
for user input-related errors, not for fatal errors such as failed
memory allocations.
|
As opposed to GET or POST requests, HEAD must not write any body bytes.
|
It was found out that another project of the same name existed
(https://git.sr.ht/~strahinja/slweb/), also related to website
generation.
In order to avoid confusion, a new name has been chosen for this
project. Surprisingly, "libweb" was not in use by any distribution
(according to https://repology.org and the AUR index), and it should
reflect well the intention behind this project, i.e., being a library
for building web-related software.
|
application/x-www-form-urlencoded data is (or should be) always text, so
it is preferable to define struct http_post member "data" as a null-
terminated string.
For applications already making this assumption, this change should
remove the need for string duplication.
|
Whereas slcl, the project where slweb started, ignored this field, some
applications might require it.
|
Now, slweb accepts requests such as:

--boundary
Content-Disposition: form-data; name="field1"

value1
--boundary
Content-Disposition: form-data; name="field2"

value2
--boundary
Content-Disposition: form-data; name="field3"; filename="example.txt"

The following breaking changes have been introduced:
Member "dir" from struct http_post was a leftover from the days when
slcl and slweb were one project. It did not make sense for slweb, since
it should not decide which Content-Disposition names are allowed. In
other words, "dir" was only relevant in the scope of slcl.
Member "n" from struct http_post used to have two meanings:
- The length of a URL-encoded request.
- The number of files in a multipart/form-data request.
Since "npairs" had to be introduced to struct http_post, it did not make
sense to keep this dual meaning any longer. Therefore, "n" has been
restricted to the former, whereas a new member, called "nfiles", has
been introduced for the latter.
|
According to C99 7.19.1p3:
BUFSIZ is a macro that expands to an integer constant expression that is
the size of the buffer used by the setbuf function.
In other words, BUFSIZ is an optimal length for a buffer that reads a
file into memory in chunks using fread(3).
Note: the number of bytes sent to the client might be less than BUFSIZ,
and that would act as the bottleneck, no matter how large the buffer
passed to fread(3) is.
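A minimal sketch of the intended pattern (copy_file is illustrative, not libweb's API):

```c
#include <assert.h>
#include <stdio.h>

/* Sketch: stream a file in BUFSIZ-sized chunks, the buffer size that
 * C99 ties to stdio's own buffering. Returns bytes copied or -1. */
long copy_file(FILE *in, FILE *out)
{
    char buf[BUFSIZ];
    size_t n;
    long total = 0;

    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n)
            return -1;

        total += (long)n;
    }

    return ferror(in) ? -1 : total;
}
```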
|
Otherwise, fatal errors coming from h->cfg.length would go unnoticed,
causing slweb to attempt to send a response.
|
Both functions were in fact identical, so there was no reason to keep
two definitions rather than one.
|
Since slweb is meant to be a library, it is advisable to keep public
header files under their own directory in order to avoid name clashes,
i.e.:
#include "something.h"
now becomes:
#include "slweb/something.h"
|
- '.' or '..' must not be used for filenames.
- Filenames must not contain forward slashes ('/').
- Filenames must not contain asterisks ('*') to avoid confusion with
wildcard expressions.
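The rules above can be collected into a single predicate; a sketch whose details may differ from libweb's actual checks:

```c
#include <assert.h>
#include <string.h>

/* Sketch of the rules: reject empty names, "." and "..", and any name
 * containing '/' or '*'. Returns 1 when the filename is acceptable. */
int valid_filename(const char *s)
{
    if (!*s || !strcmp(s, ".") || !strcmp(s, ".."))
        return 0;

    return strpbrk(s, "/*") == NULL;
}
```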
|
HTTP headers are case-insensitive, so the implementation must accept
Content-Disposition, content-disposition or any other variation.
|
Otherwise, client requests to resources such as '/me & you', '/?' or
'/??preview=1' would fail.
|
Under some circumstances, clients could cause slcl to receive SIGPIPE.
Since this signal was not handled by server.c (i.e., via sigaction(3)),
slcl would crash without any error message printed to stderr.
In such a situation, SIGPIPE should usually not be considered a fatal
error, so it is preferable to close the connection and keep working.
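A typical remedy, sketched below, is to ignore SIGPIPE process-wide so that writes to a closed connection fail with EPIPE instead of terminating the process:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Sketch: ignore SIGPIPE so that writing to a closed connection makes
 * write(2) fail with EPIPE instead of terminating the server. */
int ignore_sigpipe(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = SIG_IGN;
    return sigaction(SIGPIPE, &sa, NULL);
}
```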
|
Given the following contrived example request:
/example%FB%DC&arg%DE1=examplevalue%AA
slcl must decode each token separately, so that percent-encoded '&',
'=' or '?' characters do not get accidentally interpreted.
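A sketch of decoding one already-split token (pct_decode is a hypothetical name); because splitting on '&', '=' and '?' happens first, a decoded '&' remains plain data:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static int hexval(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Sketch: decode %XX sequences in a single, already-split token.
 * Returns 0 on success, -1 on ill-formed input or a small buffer. */
int pct_decode(const char *in, char *out, size_t outsz)
{
    size_t o = 0;

    while (*in) {
        int c = (unsigned char)*in++;

        if (c == '%') {
            int hi = hexval((unsigned char)in[0]);
            int lo = in[0] ? hexval((unsigned char)in[1]) : -1;

            if (hi < 0 || lo < 0)
                return -1;

            c = hi << 4 | lo;
            in += 2;
        }

        if (o + 1 >= outsz)
            return -1;

        out[o++] = (char)c;
    }

    out[o] = '\0';
    return 0;
}
```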
|