.TH LIBWEB_HTTP 7 2024-02-12 0.2.0 "libweb Library Reference"

.SH NAME
libweb_http \- libweb HTTP connection handling and utilities

.SH SYNOPSIS
.LP
.nf
#include <libweb/http.h>
.fi

.SH DESCRIPTION
As one of its key features,
\fIlibweb\fR
provides an HTTP/1.1-compatible server implementation that can be
embedded into applications as a library. While not a complete HTTP/1.1
server implementation, the following features are supported:

.IP \(bu 2
.BR GET .
.IP \(bu 2
.BR HEAD .
.IP \(bu 2
.BR PUT .
.IP \(bu 2
.BR POST .
.IR multipart/form-data -encoded
data is also supported.
.IP \(bu 2
Cookies.

An optional payload size limit can be defined for
.B PUT
and
.B POST
requests (see section
.BR "HTTP server configuration" ).

.SS Utility functions
The functions listed below are meant for library users:

.IP \(bu 2
.IR http_response_add_header (3).
.IP \(bu 2
.IR http_cookie_create (3).
.IP \(bu 2
.IR http_encode_url (3).
.IP \(bu 2
.IR http_decode_url (3).

.SS HTTP connection-related functions

The functions listed below are meant for internal use by
.IR libweb :

.IP \(bu 2
.IR http_alloc (3).
.IP \(bu 2
.IR http_free (3).
.IP \(bu 2
.IR http_update (3).

However, this component alone does not provide a working web server.
For example, a list of endpoints is required to define its behaviour,
and
.I "struct http_ctx"
objects must be stored somewhere as long as the connections are active.
.IR libweb_handler (7)
is the component meant to provide the missing pieces that make up a
working web server.

.SS HTTP server configuration

An HTTP server is contained in a
.IR "struct http_ctx" ,
and can be allocated by calling
.IR http_alloc (3).
This function requires a valid pointer to a
.I "struct http_cfg"
object. This flexible configuration allows the library to run on top of
any reliable transport layer, including TCP.
.I "struct http_cfg"
is defined as:

.PP
.in +4n
.EX
struct http_cfg
{
    int (*\fIread\fP)(void *\fIbuf\fP, size_t \fIn\fP, void *\fIuser\fP);
    int (*\fIwrite\fP)(const void *\fIbuf\fP, size_t \fIn\fP, void *\fIuser\fP);
    int (*\fIpayload\fP)(const struct http_payload *\fIp\fP, struct http_response *\fIr\fP, void *\fIuser\fP);
    int (*\fIlength\fP)(unsigned long long \fIlen\fP, const struct http_cookie *\fIc\fP, struct http_response *\fIr\fP, void *\fIuser\fP);
    const char *\fItmpdir\fP;
    void *\fIuser\fP;
    size_t \fImax_headers\fP;

    struct http_cfg_post
    {
        size_t \fImax_pairs\fP, \fImax_files\fP;
    } \fIpost\fP;
};
.EE
.in
.PP

All of the function pointers listed above take
.I user
as a parameter, an opaque pointer to user-defined data set by member
.I user
(see definition below). Unless noted otherwise, all pointers must be
valid.

.I read
is a function pointer to a
.IR read (2)-like
function that must read up to
.I n
bytes
from the client into a buffer pointed to by
.IR buf .
The function pointed to by
.I read
returns the number of bytes that could be read from the client,
which could be from zero up to
.IR n .
On error, a negative integer is returned.

.I write
is a function pointer to a
.IR write (2)-like
function that must write up to
.I n
bytes
to the client from a buffer pointed to by
.IR buf .
It returns the number of bytes that could be written to the client,
which could be from zero to
.IR n .
On error, a negative integer is returned.

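.PP
As a minimal sketch of the two callbacks described above, assuming a
POSIX system where each client is represented by a connected file
descriptor, the functions below wrap
.IR read (2)
and
.IR write (2).
.I "struct conn"
and both function names are hypothetical and not part of
.IR libweb ;
an application would assign the functions to the
.I read
and
.I write
members of
.IR "struct http_cfg" ,
with
.I user
pointing to a
.I "struct conn"
object.
.PP
.in +4n
.EX
```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical per-connection state, passed to libweb through "user". */
struct conn
{
    int fd;
};

/* read(2)-like callback: bytes read (zero to n), or negative on error. */
static int conn_read(void *const buf, size_t n, void *const user)
{
    const struct conn *const c = user;
    ssize_t res;

    assert(c != NULL);

    /* The callback returns int, so never request more than INT_MAX. */
    if (n > INT_MAX)
        n = INT_MAX;

    res = read(c->fd, buf, n);
    return res < 0 ? -1 : (int)res;
}

/* write(2)-like callback: bytes written (zero to n), or negative on error. */
static int conn_write(const void *const buf, size_t n, void *const user)
{
    const struct conn *const c = user;
    ssize_t res;

    assert(c != NULL);

    if (n > INT_MAX)
        n = INT_MAX;

    res = write(c->fd, buf, n);
    return res < 0 ? -1 : (int)res;
}
```
.EE
.in
.PP
Both functions may transfer fewer than
.I n
bytes, which matches the return value semantics described above.
.PP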
.I payload
is a function pointer called by
.I libweb
when a new HTTP request has been received.
.I p
is a read-only pointer to a
.I "struct http_payload"
object, which describes the HTTP request (see section
.BR "HTTP payload" ).
.I r
is a pointer to a
.I "struct http_response"
object that must be initialized by the function pointed to by
.IR payload ,
which includes the HTTP response parameters to be returned to the
client.
This function returns zero on success. On error, a negative integer is
returned.

.I length
is a function pointer called by
.I libweb
when an incoming HTTP request from a client requires storing one or
more files on the server.
In the case of a
.B POST
request,
.I len
defines the length of the
.I multipart/form-data
body, as reported by the
.B Content-Length
header (see section
.BR "Content-Length design limitations for multipart/form-data" ).
In the case of a
.B PUT
request,
.I len
defines the length of the file body.
.I c
is a read-only pointer to a
.I "struct http_cookie"
object, containing at most
.B one
HTTP cookie (see section
.BR "Limitations on the number of HTTP cookies" ).
If no cookies are defined, its members shall contain null
pointers.
.I r
is a pointer to a
.I "struct http_response"
object that must be initialized by the function pointed to by
.I length
only when the function returns a positive integer.
This function returns zero on success, a negative integer in case
of a fatal error or a positive integer in case of a non-fatal error
caused by a malformed request, or to indicate a lack of support for
this feature. When a positive integer is returned, the connection
to the client shall be closed.

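.PP
The return value convention described above for
.I length
could be used to enforce an upload size limit, as in the sketch below.
So that the example compiles on its own, it declares local stand-ins for
.I "enum http_status"
and
.IR "struct http_response" ;
real code would include
.I <libweb/http.h>
instead. The enumerator
.IR HTTP_STATUS_PAYLOAD_TOO_LARGE ,
.I "struct app"
and its
.I max_upload
member are assumptions made for illustration only.
.PP
.in +4n
.EX
```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins; the enumerator name is an assumption. */
enum http_status { HTTP_STATUS_PAYLOAD_TOO_LARGE = 413 };
struct http_cookie;
struct http_response { enum http_status status; };

/* Hypothetical application state, passed through "user". */
struct app
{
    unsigned long long max_upload;
};

static int on_length(const unsigned long long len,
    const struct http_cookie *const c, struct http_response *const r,
    void *const user)
{
    const struct app *const a = user;

    (void)c;
    assert(a != NULL);

    if (len > a->max_upload)
    {
        /* Non-fatal error: r must be initialized before returning > 0. */
        *r = (const struct http_response)
        {
            .status = HTTP_STATUS_PAYLOAD_TOO_LARGE
        };

        return 1;
    }

    /* Accept the upload; r is left untouched in this case. */
    return 0;
}
```
.EE
.in
.PP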
.I tmpdir
is a null-terminated string defining the path to a directory where
files uploaded by clients shall be stored temporarily.
.I tmpdir
can be a null pointer if this feature is not supported by the
application.

.I user
is an opaque pointer to a user-defined object that shall be passed to
other function pointers defined by
.IR "struct http_cfg" .
.I user
can be a null pointer.

.I max_headers
refers to the maximum number of header fields that shall be passed to the
.IR "struct http_payload"
object passed to the function pointed to by
.IR payload .
Any extra headers sent by the client beyond this maximum shall be
silently ignored by
.IR libweb .

.I post
contains configuration parameters specific to
.B POST
requests:

.I max_pairs
refers to the maximum number of key/value pairs that shall be accepted by
.I libweb
on
.B POST
.IR multipart/form-data -encoded
requests. If the maximum number of pairs is exceeded by the request,
.I libweb
shall terminate the connection.

.I max_files
refers to the maximum number of files that shall be accepted by
.I libweb
on
.B POST
.IR multipart/form-data -encoded
requests. If the maximum number of files is exceeded by the request,
.I libweb
shall terminate the connection.

.SS HTTP payload

When a client submits a request to the server,
.I libweb
prepares a high-level data structure, called
.IR "struct http_payload" ,
and passes it to the function pointer defined by
.I "struct http_cfg"
member
.IR payload .
.I "struct http_payload"
is defined as:

.PP
.in +4n
.EX
struct http_payload
{
    enum http_op \fIop\fP;
    const char *\fIresource\fP;
    struct http_cookie \fIcookie\fP;

    union
    {
        struct http_post \fIpost\fP;
        struct http_put \fIput\fP;
    } \fIu\fP;

    const struct http_arg *\fIargs\fP;
    size_t \fIn_args\fP, \fIn_headers\fP;
    const struct http_header *\fIheaders\fP;
    bool \fIexpect_continue\fP;
};
.EE
.in
.PP

.I op
describes the HTTP/1.1 operation. See the definition for
.I "enum http_op"
for an exhaustive list of supported operations.

.I resource
describes which resource is being requested by the client. For example:
.IR /index.html .

.I cookie
contains at most
.B one
HTTP cookie, defined as a key-value pair. Its members shall be null
pointers if no cookie is present. Also, see section
.BR "Limitations on the number of HTTP cookies" .

.I u
defines a tagged union with operation-specific data. For example,
.I post
refers to data sent by a client on a
.B POST
request (see section
.BR "HTTP POST payload" ),
whereas
.I put
refers to data sent by a client on a
.B PUT
request (see section
.BR "HTTP PUT payload" ).

.I args
defines a list of key-value pairs containing URL parameters. Its length
is defined by
.IR n_args .

.I headers
defines a list of key-value pairs containing header fields. Its length
is defined by
.IR n_headers .

.I expect_continue
shall be set to
.I true
if an
.B "Expect: 100-continue"
HTTP header is received,
.I false
otherwise (see section
.BR "Expect: 100-continue handling" ).
This field is only relevant for
.B PUT
or
.B POST
requests.

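.PP
As an illustration of how the
.I headers
list described above might be consumed, the sketch below looks up a
header field by name, ignoring case. It declares a local stand-in for
.I "struct http_header"
that mirrors the definition shown in section
.BR "HTTP responses" ,
so that it compiles on its own; real code would include
.I <libweb/http.h>
instead. The helper name
.I find_header
is hypothetical and not part of
.IR libweb .
.PP
.in +4n
.EX
```c
#include <assert.h>
#include <stddef.h>
#include <strings.h>

/* Local stand-in mirroring the definition in "HTTP responses". */
struct http_header
{
    char *header, *value;
};

/* Hypothetical helper: returns the value of the first header called
 * "name" (case-insensitive), or a null pointer if it is not present. */
static const char *find_header(const struct http_header *const headers,
    const size_t n, const char *const name)
{
    assert(name != NULL);

    for (size_t i = 0; i < n; i++)
        if (!strcasecmp(headers[i].header, name))
            return headers[i].value;

    return NULL;
}
```
.EE
.in
.PP
Inside a
.I payload
callback, this could be called as
.IR "find_header(p->headers, p->n_headers, name)" .
.PP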
.SS HTTP POST payload

As opposed to payload-less HTTP/1.1 operations, such as
.BR GET ,
.B POST
operations might or might not include payload data. Moreover, such
payload can be encoded in two different ways, which
.I libweb
handles differently:

.IP \(bu 2
.IR application/x-www-form-urlencoded :
suggested for smaller payloads.
.I libweb
shall store the payload in memory, limiting its maximum size to
.BR "7999 octets" .

.IP \(bu 2
.IR multipart/form-data :
suggested for larger and/or binary payloads.
.I libweb
shall store each non-file name-value pair in memory, limiting the value
length to
.BR "8000 octets" .
On the other hand,
.I libweb
shall store each file into the temporary directory defined by
.I struct http_cfg
member
.IR tmpdir .

This information is contained in a
.I struct http_post
object, defined as:

.PP
.in +4n
.EX
struct http_post
{
    const char *\fIdata\fP;
    size_t \fInfiles\fP, \fInpairs\fP;

    const struct http_post_pair
    {
        const char *\fIname\fP, *\fIvalue\fP;
    } *\fIpairs\fP;

    const struct http_post_file
    {
        const char *\fIname\fP, *\fItmpname\fP, *\fIfilename\fP;
    } *\fIfiles\fP;
};
.EE
.in
.PP

When
.IR application/x-www-form-urlencoded -data
is included,
.I data
shall contain a null-terminated string with the user payload. Data must
be decoded by applications (see section
.BR "Handling application/x-www-form-urlencoded data" ).
Otherwise,
.I data
shall be a null pointer.

In the case of
.IR multipart/form-data ,
.I files
shall contain a list of files that were uploaded by the client, each
one stored by the server to a temporary file, defined by
.IR tmpname .
The final name for the uploaded file is defined by
.IR filename .
The key
.B name
used for each requested file is defined by
.IR name .
The length of this list is defined by
.IR nfiles .
If no files are defined,
.I files
shall be a null pointer.

In the case of
.IR multipart/form-data ,
.I pairs
shall contain a list of name-value pairs that were uploaded by the
client, defined by
.I name
and
.IR value ,
respectively. The length of this list is defined by
.IR npairs .
If no name-value pairs are defined,
.I pairs
shall be a null pointer.

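.PP
The sketch below shows one way an application might search the
.I pairs
list described above. It repeats the
.I "struct http_post"
definition as a local stand-in so that it compiles on its own; real code
would include
.I <libweb/http.h>
instead. The helper name
.I post_value
is hypothetical, and the sketch assumes
.I npairs
is zero whenever
.I pairs
is a null pointer.
.PP
.in +4n
.EX
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Local stand-in copied from the definition above. */
struct http_post
{
    const char *data;
    size_t nfiles, npairs;

    const struct http_post_pair
    {
        const char *name, *value;
    } *pairs;

    const struct http_post_file
    {
        const char *name, *tmpname, *filename;
    } *files;
};

/* Hypothetical helper: returns the value of the pair called "name",
 * or a null pointer if no such pair was sent. */
static const char *post_value(const struct http_post *const po,
    const char *const name)
{
    assert(po != NULL && name != NULL);

    /* The loop body never runs, and pairs is never read, if npairs is 0. */
    for (size_t i = 0; i < po->npairs; i++)
        if (!strcmp(po->pairs[i].name, name))
            return po->pairs[i].value;

    return NULL;
}
```
.EE
.in
.PP
The same loop structure would apply to
.I files
and
.IR nfiles .
.PP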
.SS HTTP PUT payload

For
.B PUT
requests, and as opposed to
.B POST
requests,
.I libweb
shall always store the request body into a temporary file, defined by
.I struct http_put
member
.IR tmpname .

This information is contained in a
.I struct http_put
object, defined as:

.PP
.in +4n
.EX
struct http_put
{
    const char *\fItmpname\fP;
};
.EE
.in
.PP

.SS Expect: 100-continue handling

As opposed to other HTTP headers, the
.B Expect: 100-continue
HTTP header requires the server to respond before the request body is
sent by the client. Typically, this is meant to allow the server to
check the request resource and headers beforehand, without waiting to
receive the request body, which can be long. This is common for resource
uploads, as typically done with
.BR PUT
requests.

Therefore, when this header is received,
.I libweb
shall trigger the user-defined callback as defined by
.I struct http_cfg
member
.I payload
at least once. Then, users are expected to check the
.I expect_continue
flag in
.I struct http_payload
and act accordingly.

The example below shows a user-defined callback that replies with
.I HTTP_STATUS_CONTINUE
when
.I struct http_payload
member
.I expect_continue
is set:

.PP
.in +4n
.EX
static int on_payload(const struct http_payload *const p,
    struct http_response *const r, void *const user)
{
    if (p->expect_continue)
    {
        *r = (const struct http_response)
        {
            .status = HTTP_STATUS_CONTINUE
        };

        return 0;
    }

    /* Handle the request body and initialize *r here. */
    return 0;
}
.EE
.in
.PP

Then, once the request body is received,
.I libweb
shall trigger the callback again with
.I struct http_payload
member
.I expect_continue
assigned to
.IR false .

.SS HTTP responses

Some function pointers used by
.I libweb
require initializing a
.I "struct http_response"
object that defines the response that must be sent to the client.
This structure is defined as:

.PP
.in +4n
.EX
struct http_response
{
    enum http_status \fIstatus\fP;

    struct http_header
    {
        char *\fIheader\fP, *\fIvalue\fP;
    } *\fIheaders\fP;

    union
    {
        const void *\fIro\fP;
        void *\fIrw\fP;
    } \fIbuf\fP;

    FILE *\fIf\fP;
    unsigned long long \fIn\fP;
    size_t \fIn_headers\fP;
    void (*\fIfree\fP)(void *);
};
.EE
.in
.PP

.I status
is the response code to be returned to the client. A list of possible
values is defined by
.IR "enum http_status" .

.I headers
is a pointer to an array of
.I "struct http_header"
whose length is defined by
.IR n_headers ,
containing the HTTP headers to be included in the response. Note that
.I headers
is not meant to be modified directly by library users. Instead, the
.IR http_response_add_header (3)
utility function shall update the
.I "struct http_response"
object accordingly.

.I buf
is a union containing two possible values, with minor semantic
differences:

.I ro
is a read-only opaque pointer to a buffer in memory, whose length is
defined by
.I n
(see definition below).
.I libweb
shall select
.I ro
as the output payload if both
.I f
and
.I free
are null pointers, and
.I n
is non-zero.

.I rw
is an opaque pointer to a buffer in memory, whose length is defined by
.I n
(see definition below).
.I libweb
shall select
.I rw
as the output payload if
.I f
is a null pointer,
.I free
is a valid pointer to a function that frees the memory used by
.IR rw ,
and
.I n
is non-zero.

.I f
is a
.I FILE
pointer opened for reading that defines the payload to be sent to the
client, whose length is defined by
.IR n .
.I libweb
shall select
.I f
as the output payload if
.IR ro ,
.I rw
and
.I free
are null pointers, and
.I n
is non-zero.

.I n
is the length of the output payload, which can be either a buffer in
memory (see definitions for
.I ro
and
.IR rw )
or a file (see definition for
.IR f ).
If
.I n
equals zero, no payload shall be sent.

.I n_headers
defines the number of HTTP headers contained in the response. This
field is not meant to be manipulated directly. Instead, the
.IR http_response_add_header (3)
utility function shall update the
.I "struct http_response"
object accordingly.

.I free
is a pointer to a function that frees the memory used by
.I rw
.B only if
.I rw
is a valid pointer. Otherwise,
.I free
must be a null pointer.

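.PP
The selection rules above for
.I ro
and
.I rw
can be sketched as follows. So that the example compiles on its own, it
declares local stand-ins mirroring
.I "struct http_response"
as defined above; real code would include
.I <libweb/http.h>
instead. The enumerator
.I HTTP_STATUS_OK
and the function names
.I respond_static
and
.I respond_dynamic
are assumptions made for illustration only.
.PP
.in +4n
.EX
```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Local stand-ins; HTTP_STATUS_OK is an assumed enumerator. */
enum http_status { HTTP_STATUS_OK = 200 };

struct http_response
{
    enum http_status status;
    struct http_header { char *header, *value; } *headers;
    union { const void *ro; void *rw; } buf;
    FILE *f;
    unsigned long long n;
    size_t n_headers;
    void (*free)(void *);
};

/* Static data: set buf.ro and leave both f and free as null pointers. */
static void respond_static(struct http_response *const r)
{
    static const char body[] = "<html>hello</html>";

    *r = (const struct http_response)
    {
        .status = HTTP_STATUS_OK,
        .buf.ro = body,
        .n = sizeof body - 1
    };
}

/* Heap data: set buf.rw together with the function that releases it. */
static int respond_dynamic(struct http_response *const r,
    const char *const s)
{
    assert(s != NULL);

    const size_t n = strlen(s);
    char *const copy = malloc(n + 1);

    if (!copy)
        return -1;

    memcpy(copy, s, n + 1);
    *r = (const struct http_response)
    {
        .status = HTTP_STATUS_OK,
        .buf.rw = copy,
        .n = n,
        .free = free
    };

    return 0;
}
```
.EE
.in
.PP
In both cases, the compound literal leaves every member that is not
named as a null pointer or zero, which keeps
.I f
unset.
.PP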
.SS Transport Layer Security (TLS)
By design,
.I libweb
does
.B not
implement TLS (Transport Layer Security). It is assumed this should
be provided by a reverse proxy instead, a kind of project that is
usually maintained by a larger community than
.I libweb
and audited for security vulnerabilities.

.SH NOTES
.SS Comparing against other HTTP server implementations
While it is well understood that other solutions provide fully-fledged
server implementations as standalone executables,
.I libweb
strives to be as small and easy to use as possible, intentionally
limiting its scope while covering a good range of use cases.

.SS Content-Length design limitations for multipart/form-data
HTTP/1.1 defines the Content-Length for a
.I multipart/form-data
.B POST
request as the sum of:

.IP \(bu 2
The length of all files.
.IP \(bu 2
The length of all boundaries.
.IP \(bu 2
The length of all headers included on each part.
.IP \(bu 2
All separator tokens, such as
.B CRLF
or
.BR -- .

This means it is not possible for
.I libweb
to determine the number of files or their lengths in an HTTP request
unless the whole request is read, which might not be possible for large
requests. Therefore, the
.B Content-Length
is the only rough estimate
.I libweb
can rely on, and therefore is the value passed to the
.I length
function pointer in
.IR "struct http_cfg" .

.SH FUTURE DIRECTIONS
.SS Limitations on the number of HTTP cookies
So far,
.I libweb
appends at most
.B one
HTTP cookie to a
.I "struct http_payload"
object. This is due to arbitrary design limitations in the library.
Future versions of this library shall replace the
.I "struct http_cookie"
object inside
.I "struct http_payload"
with a pointer to an array of
.IR "struct http_cookie" ,
plus a
.I size_t
object containing the number of HTTP cookies in the request.

.SS Handling application/x-www-form-urlencoded data
For historical reasons,
.I libweb
used to treat
.IR application/x-www-form-urlencoded -data
as a binary blob. While this was changed to a null-terminated string in
order to allow applications to avoid unnecessary memory allocations,
.I libweb
still does not decode the data, instead forcing applications to do so.
Future versions of this library shall replace
.I "struct http_post"
member
.I data
with an array of structures containing key-value pairs, so that
applications no longer need to decode payload data by themselves.

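.PP
Until then, applications have to decode the data themselves. As a rough
sketch of what that involves, the helper below decodes one key or value
in place, after the application has split the string on the
.B &
and
.B =
separators. It turns
.B +
into a space and
.B %XX
into the corresponding octet. The names
.I form_decode
and
.I hex_value
are hypothetical and not part of
.IR libweb ;
.IR http_decode_url (3)
may be a better fit where its behaviour matches the application's needs.
.PP
.in +4n
.EX
```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Value of one hexadecimal digit; callers check isxdigit(3) first. */
static int hex_value(const unsigned char c)
{
    if (isdigit(c))
        return c - '0';

    return tolower(c) - 'a' + 10;
}

/* Hypothetical helper: decodes "%XX" sequences and '+' in place.
 * Returns 0 on success, -1 on a malformed escape sequence. */
static int form_decode(char *const s)
{
    char *out = s;

    assert(s != NULL);

    for (const char *in = s; *in; in++)
    {
        /* Only look past '%', and never past the terminator. */
        const unsigned char a = in[0] == '%' ? in[1] : 0,
            b = a ? in[2] : 0;

        if (*in == '+')
            *out++ = ' ';
        else if (*in != '%')
            *out++ = *in;
        else if (isxdigit(a) && isxdigit(b))
        {
            *out++ = (char)(hex_value(a) * 16 + hex_value(b));
            in += 2;
        }
        else
            return -1;
    }

    *out = 0;
    return 0;
}
```
.EE
.in
.PP
Decoding in place is safe because the output is never longer than the
input. It does, however, require a writable copy of the data.
.PP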
.SH SEE ALSO
.BR handler_alloc (3),
.BR http_alloc (3),
.BR http_free (3),
.BR http_update (3),
.BR http_response_add_header (3),
.BR http_cookie_create (3),
.BR http_encode_url (3),
.BR http_decode_url (3).

.SH COPYRIGHT
Copyright (C) 2023-2024 libweb contributors
.P
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.