The previous implementation would leave half-initialised objects if one
of the calls to strdup(3) failed. Now, n->attrs is only modified when
all previous memory allocations were successful.
Under some specific circumstances, poll(2) would return a positive
integer, but do_exit might had been previously set. This caused libweb
to ignore SIGTERM, with the potential risk for an endless loop.
Profiling showed that reading multipart/form POST uploads byte-by-byte
was too slow and typically led to maximum CPU usage. Therefore, the
older approach (as done up to commit 7efc2b3a) was more efficient, even
if the resulting code was a bit uglier.
So far, libweb would perform different read operations depending on its
state:
- For HTTP headers or request bodies, one byte at a time was read.
- For multipart/form-data, up to BUFSIZ bytes at a time were read.
However, this caused a significant extra number of syscalls for no
reason and would increase code complexity, specially when parsing
multiform/form-data boundaries.
Now, http_read always reads up to BUFSIZ bytes at a time and process
them on a loop. Apart from reducing code complexity, this should
increase performance due to the (much) lower number of syscalls
required.
Notes:
- Since curl would use the "Expect: 100-continue" header field for PUT
operations, this was a good operation to fix the existing issues in its
implementation.
Breaking changes:
- expect_continue is no longer exclusive to struct http_post. Now, it
has been moved into struct http_payload and it is up to users to check
it.
Some applications might set up a struct handler object to listen on any
port i.e., 0, but still need a way to determine which port number was
eventually selected by the implementation.
Therefore, handler_listen has been reduced to the server initialization
bit, whereas the main loop has been split into its own function, namely
handler_loop.
Because of these changes, it no longer made sense for libweb to write
the selected port to standard output, as this is something now
applications can do on their own.
Even if libweb already parses some common headers, such as
Content-Length, some users might find it interesting to inspect which
headers were received from a request.
Since HTTP/1.1 does not define a limit on the number of maximum headers
a client can send, for security reasons a maximum value must be provided
by the user. Any extra headers shall be then discarded by libweb.
An example application showing this new feature is also provided.
- http_memmem must not check strlen(a) > n because, in case of a partial
boundary, it would wrongfully return NULL.
- If one or more characters from a partial boundary are found at the end
of a buffer, but the next buffer does not start with the rest of the
boundary, the accumulated boundary must be reset, and then look for a
new boundary.
- Writing to m->boundary[len] did not make any sense, as len is not
meant to change between calls to read_mf_boundary_byte.
- For the same reason, memset(3)ing "len + 1" did not make any sense.
- When a partial boundary is found, http_memmem must still return st.
- Calling reset_boundary with prev == 0 did not make sense, since that
case typically means a partial boundary was found on a previous
iteration, so m->blen must not be reset.
This macro would return a positive integer on failure. However,
functions called by http_update should only return a positive integer
for user input-related errors, not fatal errors such as those related to
failed memory allocations.
The commit below is relevant to fix CMake builds:
Author: Xavier Del Campo Romero <xavi.dcr@tutanota.com>
Date: Fri Nov 10 14:43:39 2023 +0100
CMakeLists.txt: Fix missing parameter names
VERSION must be indicated when passing a version string to project().
Also, LANGUAGES must be also be passed when the language name is not the
only argument to project() (apart from the project name itself).
It was found out there was another project of the same name around
(https://git.sr.ht/~strahinja/slweb/), also related to website
generation.
In order to avoid confusion, a new name has been chosen for this
project. Surprisingly, libweb was not in use by any distributions
(according to https://repology.org and AUR index), and it should
reflect well the intention behind this project i.e., being a library
to build web-related stuff.