1
0
Fork 0
libweb/doc/man7/libweb_http.7

667 lines
15 KiB
Groff

.TH LIBWEB_HTTP 7 2023-11-18 0.2.0 "libweb Library Reference"
.SH NAME
libweb_http \- libweb HTTP connection handling and utilities
.SH SYNOPSIS
.LP
.nf
#include <libweb/http.h>
.fi
.SH DESCRIPTION
As one of its key features,
\fIlibweb\fR
provides a HTTP/1.1-compatible server implementation that can be
embedded into applications as a library. While not a complete HTTP/1.1
server implementation, the following features are supported:
.IP \(bu 2
.BR GET .
.IP \(bu 2
.BR POST .
.IP \(bu 2
.IR multipart/form-data -encoded
data. An optional payload size limit can be defined (see section
.BR "HTTP server configuration" ).
.IP \(bu 2
Cookies.
.SS Utility functions
The functions listed below are meant for library users:
.IP \(bu 2
.IR http_response_add_header (3).
.IP \(bu 2
.IR http_cookie_create (3).
.IP \(bu 2
.IR http_encode_url (3).
.IP \(bu 2
.IR http_decode_url (3).
.SS HTTP connection-related functions
The functions listed below are meant for internal use by
.IR libweb :
.IP \(bu 2
.IR http_alloc (3).
.IP \(bu 2
.IR http_free (3).
.IP \(bu 2
.IR http_update (3).
However, this component alone does not provide a working web server.
For example, a list of endpoints is required to define its behaviour,
and
.I struct http
objects must be stored somewhere as long as the connections are active.
.IR libweb_handler (7)
is the component meant to provide the missing pieces that conform a
working web server.
.SS HTTP server configuration
A HTTP server is contained into a
.IR "struct http_ctx" ,
and can be allocated by calling
.IR http_alloc (3).
This function requires a valid pointer to a
.I "struct http_cfg"
object. This flexible configuration allows the library to run on top of
any reliable transport layer, including TCP.
.I "struct http_cfg"
is defined as:
.PP
.in +4n
.EX
struct http_cfg
{
int (*\fIread\fP)(void *\fIbuf\fP, size_t \fIn\fP, void *\fIuser\fP);
int (*\fIwrite\fP)(const void *\fIbuf\fP, size_t \fIn\fP, void *\fIuser\fP);
int (*\fIpayload\fP)(const struct http_payload *\fIp\fP, struct http_response *\fIr\fP, void *\fIuser\fP);
int (*\fIlength\fP)(unsigned long long \fIlen\fP, const struct http_cookie *\fIc\fP, struct http_response *\fIr\fP, void *\fIuser\fP);
const char *\fItmpdir\fP;
void *\fIuser\fP;
size_t \fImax_headers\fP;
};
.EE
.in
.PP
All of the function pointers listed above define
.I user
as a parameter, an opaque pointer to user-defined data previously
defined by member
.I user
(see definition below). Unless noted otherwise, all pointers must be
valid.
.I read
is a function pointer to a
.IR read (2)-like
function that must read up to
.I n
bytes
from the client into a buffer pointed to by
.IR buf .
The function pointed to by
.I read
returns the number of bytes that could be read from the client,
which could be from zero up to
.IR n .
On error, a negative integer is returned.
.I write
is a function pointer to a
.IR write (2)-like
function that must write up to
.I n
bytes
to the client from a buffer pointed to by
.IR buf .
It returns the number of bytes that could be written to the client,
which could be from zero to
.IR n .
On error, a negative integer is returned.
.I payload
is a function pointer called by
.I libweb
when a new HTTP request has been received.
.I p
is a read-only pointer to a
.I "struct http_payload"
object, which describes the HTTP request (see section
.BR "HTTP payload" ).
.I r
is a pointer to a
.I "struct http_response"
object that must be initialized by the function pointed to by
.IR payload ,
which includes the HTTP response parameters to be returned to the
client.
The function pointed to by
.I read
returns the number of bytes that could be read from the client,
which could be from zero to
.IR n .
This function returns zero on success. On error, a negative integer is
returned.
.I length
is a function pointer called by
.I libweb
when an incoming HTTP request from a client requires to store one or
more files on the server, encoded as
.IR multipart/form-data .
.I len
defines the length of the
.IR multipart/form-data
(see section
.BR "Content-Length design limitations for multipart/form-data" ).
.I c
is a read-only pointer to a
.I "struct http_cookie"
object, containing at most
.B one
(see section
.BR "Limitations on the number of HTTP cookies" )
HTTP cookie. If no cookies are defined, its members shall contain null
pointers.
.I r
is a pointer to a
.I "struct http_response"
object that must be initialized by the function pointed to by
.I payload
only when the function returns a positive integer.
This function returns zero on success, a negative integer in case
of a fatal error or a positive integer in case of a non-fatal error
caused by a malformed request, or to indicate a lack of support for
this feature. When a positive integer is returned, the connection
against the client shall be closed.
.I tmpdir
is a null-terminated string defining the path to a directory where
files uploaded by clients shall be stored temporarily.
.I tmpdir
can be a null pointer if this feature is not supported by the
application.
.I user
is an opaque pointer to a user-defined object that shall be passed to
other function pointers defined by
.IR "struct http_cfg" .
.I user
can be a null pointer.
.I max_headers
refers to the maximum number of header fields that shall be passed to the
.IR "struct http_payload"
object passed to the function pointed to by
.IR payload .
Any extra headers sent by the client outside this maximum value shall be
silently ignored by
.IR libweb .
.SS HTTP payload
When a client submits a request to the server,
.I libweb
prepares a high-level data structure, called
.IR "struct http_payload" ,
and passes it to the function pointer defined by
.I "struct http_cfg"
member
.IR payload .
.I "struct http_payload"
is defined as:
.PP
.in +4n
.EX
struct http_payload
{
enum http_op \fIop\fP;
const char *\fIresource\fP;
struct http_cookie \fIcookie\fP;
union
{
struct http_post \fIpost\fP;
} \fIu\fP;
const struct http_arg *\fIargs\fP;
size_t \fIn_args\fP;
const struct http_header *\fIheaders\fP;
size_t \fIn_headers\fP;
};
.EE
.in
.PP
.I op
describes the HTTP/1.1 operation. See the definition for
.I "enum http_op"
for an exhaustive list of supported operations.
.I resource
describes which resource is being requested by the client. For example:
.IR /index.html .
.I cookie
contains at most
.B one
HTTP cookie, defined as a key-value pair. Its members shall be null
pointers if no cookie is present. Also, see section
.BR "Limitations on the number of HTTP cookies" .
.I u
defines a tagged union with operation-specific data. For example,
.I post
refers to data sent by a client on a
.B POST
request (see section
.BR "HTTP POST payload" ).
Also, see section
.BR "Future supported HTTP/1.1 operations" .
.I args
defines a list of key-value pairs containing URL parameters. Its length
is defined by
.IR n_args .
.I headers
defines a list of key-value pairs containing header fields. Its length
is defined by
.IR n_headers .
.SS HTTP POST payload
As opposed to payload-less HTTP/1.1 operations, such as
.BR GET ,
.B POST
operations might or might not include payload data. Moreover, such
payload can be encoded in two different ways, which
.I slcl
handles differently:
.IP \(bu 2
.IR application/x-www-form-urlencoded :
suggested for smaller payloads.
.I libweb
shall store the payload in memory, limiting its maximum size to
.BR "7999 octets" .
.IP \(bu 2
.IR multipart/form-data :
suggested for larger and/or binary payloads.
.I libweb
shall store each non-file name-value pair in memory, limiting the value
length to
.BR "8000 octets" .
On the other hand,
.I libweb
shall store each file into the temporary directory defined by
.I struct http_cfg
member
.IR tmpdir .
This information is contained into a
.B "struct http_post"
object, defined as:
.PP
.in +4n
.EX
struct http_post
{
bool \fIexpect_continue\fP;
const char *\fIdata\fP;
size_t \fInfiles\fP, \fInpairs\fP;
const struct http_post_pair
{
const char *\fIname\fP, *\fIvalue\fP;
} *\fIpairs\fP;
const struct http_post_file
{
const char *\fIname\fP, *\fItmpname\fP, *\fIfilename\fP;
} *\fIfiles\fP;
};
.EE
.in
.PP
.I expect_continue
shall be set to
.I true
if an
.B "Expect: 100-continue"
HTTP header is received,
.I false
otherwise (see
section
.B Handling of 100-continue requests
in
.BR BUGS ).
When
.IR application/x-www-form-urlencoded -data
is included,
.I data
shall contain a null-terminated string with the user payload. Data must
be decoded by applications (see section
.BR "Handling application/x-www-form-urlencoded data" ).
Otherwise,
.I data
shall be a null pointer.
In the case of
.IR multipart/form-data ,
.I files
shall contain a list of files that were uploaded by the client, each
one stored by the server to a temporary file, defined by
.IR tmpname .
The final name for the uploaded file is defined by
.IR filename .
The key
.B name
used for each requested file is defined by
.IR name .
The length of this list is defined by
.IR nfiles .
If no files are defined,
.I files
shall be a null pointer.
In the case of
.IR multipart/form-data ,
.I pairs
shall contain a list of name-value pairs that were uploaded by the
client, defined by
.I name
and
.IR value ,
respectively. The length of this list is defined by
.IR npairs .
If no name-value pairs are defined,
.I pairs
shall be a null pointer.
.SS HTTP responses
Some function pointers used by
.I libweb
require to initialize a
.I "struct http_response"
object that defines the response that must be sent to the client.
This structure is defined as:
.PP
.in +4n
.EX
struct http_response
{
enum http_status \fIstatus\fP;
struct http_header
{
char *\fIheader\fP, *\fIvalue\fP;
} *\fIheaders\fP;
union
{
const void *\fIro\fP;
void *\fIrw\fP;
} \fIbuf\fP;
FILE *\fIf\fP;
unsigned long long \fIn\fP;
size_t \fIn_headers\fP;
void (*\fIfree\fP)(void *);
};
.EE
.in
.PP
.I status
is the response code to be returned to the client. A list of possible
values is defined by
.IR "enum http_status" .
.I headers
is a pointer to an array of
.I "struct http_header"
whose length is defined by
.IR n_headers ,
containing the HTTP headers to be included into the response. Note that
.I headers
is not meant to be modified directly by library users. Instead, the
.IR http_response_add_header (3)
utility function shall update the
.I "struct http_response"
object accordingly.
.I buf
is a union containing two possible values, with minor semantic
differences:
.I ro
is a read-only opaque pointer to a buffer in memory, whose length is
defined by
.I n
(see definition below).
.I libweb
shall select
.I ro
as the output payload if both
.I f
and
.I free
are null pointers, and
.I n
is non-zero.
.I rw
is an opaque pointer to a buffer in memory, whose length is defined by
.I n
(see definition below).
.I libweb
shall select
.I rw
as the output payload if both
.I f
is a null pointer and
.I free
is a valid pointer to a function that frees the memory used by
.IR rw ,
and
.I n
is non-zero.
.I f
is a
.I FILE
pointer opened for reading that defines the payload to be sent to the
client, whose length is defined by
.IR n .
.I libweb
shall select
.I f
as the output payload if
.IR ro ,
.I rw
and
.I free
are null pointers, and
.I n
is non-zero.
.I n
is the length of the output payload, which can be either a buffer in
memory (see definitions for
.I ro
and
.IR rw )
or a file (see definition for
.IR f ).
If
.I n
equals zero, no payload shall be sent.
.I n_headers
defines the number of HTTP headers contained in the response. This
field is not meant to be manipulated directly. Instead, the
.IR http_response_add_header (3)
utility function shall update the
.I "struct http_response"
object accordingly.
.I free
is a pointer to a function that frees the memory used by
.I rw
.B only if
.I rw
is a valid pointer. Otherwise,
.I free
must be a null pointer.
.SS Transport Layer Security (TLS)
By design,
.I libweb
does
.BI not
implement TLS (Transport Layer Security). It is assumed this should
be provided by a reverse proxy instead, a kind of project that is
usually maintained by a larger community than
.I libweb
and audited for security vulnerabilities.
.SH NOTES
.SS Comparing against other HTTP server implementations
While it is well understood that other solutions provide fully-fledged
server implementations as standalone executables,
.I libweb
strives to be as small and easy to use as possible, intentionally
limiting its scope while covering a good range of use cases.
.SS Content-Length design limitations for multipart/form-data
HTTP/1.1 defines the Content-Length for a
.I multipart/form-data
.B POST
request as the sum of:
.IP \(bu 2
The length of all files.
.IP \(bu 2
The length of all boundaries.
.IP \(bu 2
The length of all headers included on each part.
.IP \(bu 2
All separator tokens, such as
.B LFCR
or
.BR -- .
This means it is not possible for
.I libweb
to determine the number of files or their lengths in a HTTP request
unless the whole request is read, which not might be possible for large
requests. Therefore, the
.B Content-Length
is the only rough estimation
.I libweb
can rely on, and therefore is the value passed to the
.I length
function pointer in
.IR "struct http_cfg" .
.SH BUGS
.SS Handling of 100-continue requests
The handling of
.B 100-continue
requests is not done correctly:
.I libweb
calls the function pointed to by
.I "struct http_cfg"
member
.I payload
as soon as it encounters the
.B Expected:
header. However, a response should only be sent to the client once all
headers are processed.
.SH FUTURE DIRECTIONS
.SS Limitations on the number of HTTP cookies
So far,
.I libweb
shall only append at most
.B one
HTTP cookie to a
.I "struct http_payload"
object. This is due to arbitrary design limitations on the library.
Future versions of this library shall replace the
.I "struct http_cookie"
object inside
.I "struct http_payload"
with a pointer to an array of
.IR "struct http_cookie" ,
plus a
.I size_t
object containing the number of HTTP cookies in the request.
.SS Handling application/x-www-form-urlencoded data
Due to historical reasons,
.I libweb
treated
.IR application/x-www-form-urlencoded -data
as a binary blob. While this was changed to a null-terminated string in
order to allow applications to avoid unnecessary memory allocations,
.I libweb
still does not decode the data, instead forcing applications to do so.
Future versions of this library shall replace
.I "struct http_post"
member
.I data
with an array of structures containing key-value pairs, so that
applications no longer need to decode payload data by themselves.
.SS Future supported HTTP/1.1 operations
So far,
.I struct http_payload
defines
.I u
as a union that only holds one possible data type. While this might
look counterintuitive, this is because
.B POST
is the only HTTP/1.1 operation
.I libweb
supports that requires to store a payload. However, future versions of
this library might extend its support for other HTTP/1.1 operations
that could require to store a payload, while keeping the memory
footprint for
.I struct http_payload
small.
.SH SEE ALSO
.BR handler_alloc (3),
.BR http_alloc (3),
.BR http_free (3),
.BR http_update (3),
.BR http_response_add_header (3),
.BR http_cookie_create (3),
.BR http_encode_url (3),
.BR http_decode_url (3).
.SH COPYRIGHT
Copyright (C) 2023 Xavier Del Campo Romero.
.P
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.