QUIC FUTURE: Add concurrency architecture design document
Reviewed-by: Neil Horman <nhorman@openssl.org>
Reviewed-by: Saša Nedvědický <sashan@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/26025)
Parent: 15f859403e
Commit: 3686d215fe

2 changed files with 413 additions and 0 deletions

(image file, 174 KiB; diff suppressed because one or more lines are too long)

doc/designs/quic-design/quic-concurrency.md (new file, 412 lines)

@@ -0,0 +1,412 @@

QUIC Concurrency Architecture
=============================

Introduction
------------

Most QUIC implementations in C are offered as a simple state machine without any
included I/O solution. Applications must do significant integration work to
provide the necessary infrastructure for a QUIC implementation to integrate
with. Moreover, blocking I/O at an application level may not be supported.

OpenSSL QUIC seeks to offer a QUIC solution which can serve multiple use cases:

- Firstly, it seeks to offer the simple state machine model and a fully
  customisable network path (via a BIO) for those who want it;

- Secondly, it seeks to offer a turnkey solution with an in-the-box I/O
  and polling solution which can support blocking API calls in a Berkeley
  sockets-like way.

These usage modes are largely opposed to one another. One involves libssl
consuming no resources but those it is given, with an application responsible
for synchronisation and a potentially custom network I/O path. This usage model
is not “smart”. Network traffic is connected to the state machine and state is
input and output from the state machine as needed by an application on a purely
non-blocking basis. Determining *when* to do anything is largely the
application's responsibility.

The other usage mode involves libssl managing more things internally to provide
an easier-to-use solution. For example, it may involve spinning up background
threads to ensure connections are serviced regularly (as in our existing
client-side thread assisted mode).

In order to provide for these different use cases, the concept of concurrency
models is introduced. A concurrency model defines how “cleverly” the QUIC engine
will operate and how many background resources (e.g. threads, other OS
resources) will be established to support operation.

Concurrency Models
------------------

- **Unsynchronised Concurrency Model (UCM):** In the Unsynchronised Concurrency
  Model, calls to SSL objects are not synchronised. There is no locking on any
  API Personality Layer (APL) call (the omission of which is purely an
  optimisation). The application is either single-threaded or is otherwise
  responsible for doing synchronisation itself.

  Blocking API calls are not supported under this model. This model is intended
  primarily for single-threaded use as a simple state machine by advanced
  applications, and many such applications are likely to disable autoticking.

- **Contentive Concurrency Model (CCM):** In the
  Contentive Concurrency Model, calls to SSL objects are wrapped in locks and
  multi-threaded usage of a QUIC connection (for example, parallel writes to
  different QUIC stream SSL objects belonging to the same QUIC connection) is
  synchronised by a mutex.

  This is contentive in the sense that if a large number of threads are trying
  to write to different streams on the same connection, a large amount of lock
  contention will occur. As such, this concurrency model will neither scale nor
  provide good performance, at least within the context of concurrent use
  of a single connection.

  Under this model, APL calls by the application result in lock-wrapped
  mutations of QUIC core objects (`QUIC_CHANNEL`, `QUIC_STREAM`, etc.) on the
  same thread.

  This model may be used either in a variant which does not support blocking
  (NB-CCM) or in one which does support blocking (B-CCM). The blocking variant
  must spin up additional OS resources to correctly support blocking semantics.

- **Thread Assisted Contentive Concurrency Model (TA-CCM):** This is currently
  implemented by our thread assisted mode for client-side QUIC usage. It does
  not realise the full state separation or performance of the Worker Concurrency
  Model (WCM) below. Instead, it simply spawns a background thread which ensures
  QUIC timer events are handled as needed. It makes use of the Contentive
  Concurrency Model for performing that handling, in that it obtains a lock when
  ticking a QUIC connection just as any call by an application would.

  This mode is likely to be deprecated in favour of the full Worker Concurrency
  Model (WCM), by which it will naturally be subsumed.

- **Worker Concurrency Model (WCM):** In the Worker Concurrency Model,
  a background worker thread is spawned to manage connection processing. All
  interaction with an SSL object goes through this thread in some way.
  Interactions with SSL objects are essentially translated into commands and
  handled by the worker thread. To optimise performance and minimise lock
  contention, there is an emphasis on message passing over locking.
  Internal dataflow for application data can be managed in a zero-copy way to
  minimise the costs of this message passing.

  Under this model, QUIC core objects (`QUIC_CHANNEL`, `QUIC_STREAM`, etc.) will
  live solely on the worker thread and access to these objects by an application
  thread will be entirely forbidden.

  Blocking API calls are supported under this model.

These concurrency models are summarised as follows:

| Model   | Sophistication | Concurrency           | Blocking Supported | OS Resources              | Timer Events    | RX Steering | Core State Affinity  |
|---------|----------------|-----------------------|--------------------|---------------------------|-----------------|-------------|----------------------|
| UCM     | Lowest         | ST only               | No                 | None                      | App Responsible | None        | App Thread           |
| CCM     |                | MT (Contentive)       | Optional           | Mutex, (Notifier)         | App Responsible | TBD         | App Threads          |
| TA-CCM† |                | MT (Contentive)       | Optional           | Mutex, Thread, (Notifier) | Managed         | TBD         | App & Assist Threads |
| WCM     | Highest        | MT (High Performance) | Yes                | Mutex, Thread, Notifier   | Managed         | Futureproof | Worker Thread        |

† To eventually be deprecated in favour of WCM.

Legend:

- **Blocking Supported:** Whether blocking calls to e.g. `SSL_read` can be
  supported. If this is listed as “optional”, extra resources are required to
  support this under the listed model and these resources could be omitted if an
  application indicates it does not need this functionality at initialisation
  time.

- **OS Resources:** “Mutex” refers to mutex and condition variable resources.
  “Notifier” refers to a kind of OS resource needed to allow one thread to wake
  another thread which is currently blocking in an OS socket polling call such
  as poll(2) (e.g. an eventfd or socketpair). Resources listed in parentheses in
  the table above are required only if blocking support is desired.

- **Timer Events:** Whether the application is responsible for ensuring QUIC
  timeout events are handled in a timely manner, or whether this is managed
  internally.

- **RX Steering:** The matter of RX steering will be discussed in detail in a
  future document. Broadly speaking, RX steering concerns whether incoming
  traffic for multiple different QUIC connections on the same local port (e.g.
  for a server) can be vectored *by the OS* to different threads or whether the
  demuxing of incoming traffic for different connections has to be done manually
  on an in-process basis.

  The WCM model most readily supports RX steering and is futureproof in this
  regard. The feasibility of having the UCM and CCM models support RX steering
  is left for future analysis.

- **Core State Affinity:** Which threads are allowed to touch the QUIC core
  objects (`QUIC_CHANNEL`, `QUIC_STREAM`, etc.).

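The “OS Resources” and “Blocking Supported” columns of the summary table can be read as a small lookup function. The following is only an illustrative sketch: the enum, the flag macros and `model_resources` are hypothetical names invented for this example and are not part of any OpenSSL API.

```c
/* Hypothetical identifiers for the four models in the summary table. */
typedef enum {
    MODEL_UCM, MODEL_CCM, MODEL_TA_CCM, MODEL_WCM
} CONCURRENCY_MODEL;

/* Hypothetical bit flags for the "OS Resources" column. */
#define RES_NONE     0u
#define RES_MUTEX    (1u << 0) /* mutex and condition variable */
#define RES_THREAD   (1u << 1) /* background assist or worker thread */
#define RES_NOTIFIER (1u << 2) /* e.g. an eventfd or socketpair */

/*
 * Writes the OS resources needed by model to *res. Returns 0 if the requested
 * combination is unsupported (blocking under UCM) and 1 otherwise.
 */
int model_resources(CONCURRENCY_MODEL model, int want_blocking,
                    unsigned int *res)
{
    /* Parenthesised table entries are needed only when blocking is wanted. */
    unsigned int notifier = want_blocking ? RES_NOTIFIER : RES_NONE;

    switch (model) {
    case MODEL_UCM:
        if (want_blocking)
            return 0; /* blocking API calls are not supported under UCM */
        *res = RES_NONE;
        return 1;
    case MODEL_CCM:
        *res = RES_MUTEX | notifier;
        return 1;
    case MODEL_TA_CCM:
        *res = RES_MUTEX | RES_THREAD | notifier;
        return 1;
    case MODEL_WCM:
        *res = RES_MUTEX | RES_THREAD | RES_NOTIFIER; /* always required */
        return 1;
    }
    return 0;
}
```

For instance, under these assumptions a non-blocking CCM needs only a mutex, whereas WCM always needs all three kinds of resource.
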
Architecture
------------

To recap, the API Personality Layer (APL) refers to the code in `quic_impl.c`
which implements the libssl API personality (`SSL_write`, etc.). The APL is
cleanly separated from the QUIC core implementation (`QUIC_CHANNEL`, etc.).

Since UCM is basically a slight optimisation of CCM in which unnecessary locking
is elided, discussion from here on will focus on CCM and WCM except where
there are specific differences between CCM and UCM.

Supporting both CCM and WCM creates significant architectural challenges. Under
CCM, QUIC core objects have their state mutated under lock by arbitrary
application threads and these mutations happen during APL calls. By contrast, a
performant WCM architecture requires that APL calls be recorded and serviced in
an asynchronous fashion involving message passing to a worker thread. This
threatens to require highly divergent dispatch architectures for the two
concurrency models.

As such, the concept of a **Concurrency Management Layer (CML)** is introduced.
The CML lives between the APL and the QUIC core code. It is responsible for
dispatching in-thread mutations of QUIC core objects when operating under CCM,
and for dispatching messages to a worker thread under WCM.



There are two different CMLs:

- **Direct CML (DCML)**, in which core objects are worked on in the same thread
  which made an APL call, under lock;

- **Worker CML (WCML)**, in which core objects are managed by a worker thread
  with communication via message passing. This CML is split into a front end
  (WCML-FE) and back end (WCML-BE).

The legacy thread assisted mode uses a bespoke method which is similar to the
approach used by the DCML.

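The difference between the two dispatch strategies can be illustrated with a toy model. This is a minimal sketch under loud assumptions: every name in it is hypothetical, the “core state” is a single integer, locking is omitted, and the worker thread is simulated by an explicit `worker_drain` call rather than a real thread.

```c
#include <stddef.h>

#define QUEUE_CAP 8

/* Stand-in for QUIC core state; a real engine would hold channels etc. */
typedef struct { int value; } CORE;

typedef struct cml_st CML;
struct cml_st {
    CORE core;
    int (*submit)(CML *cml, int delta); /* dispatch strategy */
    int queue[QUEUE_CAP];               /* pending messages (worker only) */
    size_t queue_len;
};

/* Direct style: mutate core state in the calling thread (lock omitted). */
int direct_submit(CML *cml, int delta)
{
    cml->core.value += delta;
    return 1;
}

/* Worker front end: record the operation as a message for the worker. */
int worker_submit(CML *cml, int delta)
{
    if (cml->queue_len == QUEUE_CAP)
        return 0; /* queue full; caller must retry later */
    cml->queue[cml->queue_len++] = delta;
    return 1;
}

/* Worker back end: what the worker thread would do with queued messages. */
void worker_drain(CML *cml)
{
    size_t i;

    for (i = 0; i < cml->queue_len; ++i)
        cml->core.value += cml->queue[i];
    cml->queue_len = 0;
}

void cml_init(CML *cml, int use_worker)
{
    cml->core.value = 0;
    cml->queue_len = 0;
    cml->submit = use_worker ? worker_submit : direct_submit;
}
```

The caller uses `cml->submit` in both cases; only the point at which the core state actually changes differs, which is the property the CML is intended to hide from the APL.
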
CML Design
----------

The CML is designed to have as small an API surface area as possible to enable
unified handling of as many kinds of (APL) API operations as possible. The idea
is that complex APL calls are translated into simple operations on the CML.

At its core, the CML exposes some number of *pipes*. The number of pipes which
can be accessed via the CML varies as connections and streams are created and
destroyed. A pipe is a *unidirectional* transport for byte streams. Zero-copy
optimisations are expected to be implemented in future but are deferred.

The CML (`QUIC_CML`) allows the caller to refer to a pipe by providing an opaque
pipe handle (`QUIC_CML_PIPE`). If the pipe is a sending pipe, the caller can use
`ossl_cml_write` to try to add bytes to it. Conversely, if it is a receiving
pipe, the caller can use `ossl_cml_read` to try to read bytes from it.

The method `ossl_cml_block_until` allows the caller to block until at least one
of the provided pipe handles is ready. Ready means that at least one byte can be
written (for a sending pipe) or at least one byte can be read (for a receiving
pipe).

Note that there is only expected to be one `QUIC_CML` instance per QUIC event
processing domain (i.e., per `QUIC_DOMAIN` / `QUIC_ENGINE` instance). The CML
fully abstracts the QUIC core objects such as `QUIC_ENGINE` or `QUIC_CHANNEL` so
that the APL never sees them.

The caller retrieves a pipe handle using `ossl_cml_get_pipe`. This function
retrieves a pipe based on two values:

- a CML pipe class;
- a CML *selector*.

The CML selector is a tagged union structure which specifies what pipe is to be
retrieved. Abstractly, examples of selectors include:

```text
Domain ()
Listener (listener_id: uint)
Conn (conn_id: uint)
Stream (conn_id: uint, stream_id: u64)
```

In other words, the CML selector selects the “object” to retrieve a pipe from.

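In C, the abstract selectors listed above might be expressed as a tagged union along the following lines. This is only a sketch: the type and function names (`SELECTOR`, `selector_stream`, `selector_equal`) and the field layout are assumptions made for illustration and are not the definition of `QUIC_CML_SELECTOR`.

```c
#include <stdint.h>

/* Hypothetical tag identifying which kind of object is selected. */
typedef enum {
    SEL_DOMAIN, SEL_LISTENER, SEL_CONN, SEL_STREAM
} SELECTOR_TYPE;

typedef struct {
    SELECTOR_TYPE type; /* tag: says which union member is valid */
    union {
        struct { unsigned int listener_id; } listener;
        struct { unsigned int conn_id; } conn;
        struct { unsigned int conn_id; uint64_t stream_id; } stream;
    } u;                /* SEL_DOMAIN carries no payload */
} SELECTOR;

/* Builds a selector naming one stream on one connection. */
SELECTOR selector_stream(unsigned int conn_id, uint64_t stream_id)
{
    SELECTOR sel;

    sel.type = SEL_STREAM;
    sel.u.stream.conn_id = conn_id;
    sel.u.stream.stream_id = stream_id;
    return sel;
}

/* Returns 1 if both selectors name the same logical object, else 0. */
int selector_equal(const SELECTOR *a, const SELECTOR *b)
{
    if (a->type != b->type)
        return 0;

    switch (a->type) {
    case SEL_DOMAIN:
        return 1;
    case SEL_LISTENER:
        return a->u.listener.listener_id == b->u.listener.listener_id;
    case SEL_CONN:
        return a->u.conn.conn_id == b->u.conn.conn_id;
    case SEL_STREAM:
        return a->u.stream.conn_id == b->u.stream.conn_id
            && a->u.stream.stream_id == b->u.stream.stream_id;
    }
    return 0;
}
```

Comparing by tag first and then by the active member avoids reading union members which were never written.
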
The CML pipe class is one of the following values:

- Request
- Notification
- App Send
- App Recv

The pipe classes available for a given selector vary. For example, the “App
Send” and “App Recv” pipes only exist on a stream, so it is invalid to request
such a pipe in conjunction with a different type of selector.

The “Request” and “App Send” classes expose send-only pipes, and the
“Notification” and “App Recv” classes expose receive-only pipes.

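These rules can be captured as two small predicates. The sketch below reuses the `QUIC_CML_CLASS_*` names from this document, while the selector kind enum and both function names are hypothetical. It also assumes that the Request and Notification pipes exist for every kind of selector, which the text implies but does not state outright.

```c
/* Pipe classes, named as in this document's API listing. */
enum {
    QUIC_CML_CLASS_REQUEST,      /* control; send */
    QUIC_CML_CLASS_NOTIFICATION, /* control; recv */
    QUIC_CML_CLASS_APP_SEND,     /* data; send */
    QUIC_CML_CLASS_APP_RECV      /* data; recv */
};

/* Hypothetical selector kinds, mirroring the abstract selector list. */
enum { SEL_KIND_DOMAIN, SEL_KIND_LISTENER, SEL_KIND_CONN, SEL_KIND_STREAM };

/* Returns 1 if class_ names a send-only pipe, 0 if it is receive-only. */
int pipe_class_is_sending(int class_)
{
    return class_ == QUIC_CML_CLASS_REQUEST
        || class_ == QUIC_CML_CLASS_APP_SEND;
}

/* Returns 1 if a pipe of class_ may be requested for the selector kind. */
int pipe_class_valid_for(int sel_kind, int class_)
{
    switch (class_) {
    case QUIC_CML_CLASS_REQUEST:
    case QUIC_CML_CLASS_NOTIFICATION:
        /* Assumption: control pipes exist for every selector kind. */
        return sel_kind >= SEL_KIND_DOMAIN && sel_kind <= SEL_KIND_STREAM;
    case QUIC_CML_CLASS_APP_SEND:
    case QUIC_CML_CLASS_APP_RECV:
        /* Application data pipes only exist on a stream. */
        return sel_kind == SEL_KIND_STREAM;
    }
    return 0; /* unknown class */
}
```
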
For any given CML selector, the Request pipe is used to send serialized commands
for asynchronous processing in relation to the entity selected by that selector.
Conversely, the Notification pipe returns asynchronous notifications. These
could be in relation to a previous command (e.g. indicating whether a command
succeeded), or unprompted notifications about other events.

The underlying pattern here is that there is a bidirectional channel for control
messages, and a bidirectional channel for application data, each composed of
two unidirectional pipes.

Pipe handles are stable for as long as the pipe they reference exists, so an APL
object can cache a pipe handle if desired.

All CML methods are thread safe. The CML implementation handles any necessary
locking internally.

The `ossl_cml_write_available` and `ossl_cml_read_available` calls determine the
number of bytes which can currently be written to a send-only pipe, or read from
a receive-only pipe, respectively.

**Race conditions.** Because these calls are separate from `ossl_cml_write` and
`ossl_cml_read`, the values returned by these functions may become out of date
before the caller has a chance to call `ossl_cml_write` or `ossl_cml_read`.
However, such changes are guaranteed to be monotonically in favour of the
caller; for example, the value returned by `ossl_cml_write_available` will only
ever increase asynchronously (and only decrease as a result of an
`ossl_cml_write` call). Likewise, the value returned by
`ossl_cml_read_available` will only ever increase asynchronously (and only
decrease as a result of an `ossl_cml_read` call). Assuming that only one thread
makes calls to CML functions at a given time *for a given pipe*, this therefore
poses no issue for callers.

Concurrent use of `ossl_cml_write` or `ossl_cml_read` for a given pipe is not
intended (and would not make sense in any case). The caller is responsible for
synchronising such calls.

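The monotonicity guarantee described above can be demonstrated with a toy sending pipe. This is a single-threaded sketch in which every name (`TOY_PIPE`, `toy_write`, `toy_consume`, etc.) is hypothetical; `toy_consume` stands in for whatever drains the pipe asynchronously and is not a model of any real CML implementation.

```c
#include <stddef.h>
#include <string.h>

#define TOY_CAP 16

typedef struct {
    unsigned char data[TOY_CAP];
    size_t used; /* bytes queued and not yet consumed */
} TOY_PIPE;

/* Number of bytes the writer may currently add. */
size_t toy_write_available(const TOY_PIPE *p)
{
    return TOY_CAP - p->used;
}

/* Copies as much of buf as fits and reports the count via *written. */
int toy_write(TOY_PIPE *p, const void *buf, size_t len, size_t *written)
{
    size_t n = toy_write_available(p);

    if (len < n)
        n = len;
    memcpy(p->data + p->used, buf, n);
    p->used += n;
    *written = n;
    return 1;
}

/* Stands in for the asynchronous consumer; it can only free up space. */
void toy_consume(TOY_PIPE *p, size_t n)
{
    if (n > p->used)
        n = p->used;
    memmove(p->data, p->data + n, p->used - n);
    p->used -= n;
}
```

If the consumer runs between a call to `toy_write_available` and the following `toy_write`, the earlier result is stale but still a safe lower bound, so a write of that many bytes is accepted in full.
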
**Examples of pipe usage.** The application data pipes are used to serialize the
actual application data sent or received on a QUIC stream. The request and
notification pipes have more varied uses and carry control activity. There
is therefore a “control/data” separation here. The request and notification
pipes transport tagged unions. Abstractly, requests and notifications might
include:

- Request: Reset Stream (error code: u64)
- Notification: Connection Terminated by Peer

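As an illustration of how a tagged union might be serialized onto a request pipe, the sketch below encodes and decodes the “Reset Stream” request. The wire format (one tag byte followed by a big-endian 64-bit error code), the tag value and all names are assumptions invented for this example; the design does not specify an encoding.

```c
#include <stddef.h>
#include <stdint.h>

enum { REQ_RESET_STREAM = 1 }; /* hypothetical tag value */

typedef struct {
    int tag;
    union {
        struct { uint64_t error_code; } reset_stream;
    } u;
} REQUEST;

#define REQ_RESET_STREAM_LEN 9 /* 1 tag byte + 8 byte error code */

/* Encodes req into out. Returns bytes written, or 0 on failure. */
size_t request_encode(const REQUEST *req, unsigned char *out, size_t out_len)
{
    int i;

    if (req->tag != REQ_RESET_STREAM || out_len < REQ_RESET_STREAM_LEN)
        return 0;

    out[0] = (unsigned char)req->tag;
    for (i = 0; i < 8; ++i) /* most significant byte first */
        out[1 + i]
            = (unsigned char)(req->u.reset_stream.error_code >> (56 - 8 * i));
    return REQ_RESET_STREAM_LEN;
}

/* Decodes one request from in. Returns bytes consumed, or 0 on failure. */
size_t request_decode(REQUEST *req, const unsigned char *in, size_t in_len)
{
    int i;
    uint64_t v = 0;

    if (in_len < REQ_RESET_STREAM_LEN || in[0] != REQ_RESET_STREAM)
        return 0;

    for (i = 0; i < 8; ++i)
        v = (v << 8) | in[1 + i];
    req->tag = REQ_RESET_STREAM;
    req->u.reset_stream.error_code = v;
    return REQ_RESET_STREAM_LEN;
}
```

Under this scheme the APL would pass the encoded bytes to the request pipe, and the consumer on the other side would decode them before acting on the core objects.
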
**Example implementation of `SSL_write`.** An `SSL_write`-like API might be
implemented in the APL like this:

```c
int do_write(QUIC_CML *cml,
             QUIC_CML_PIPE notification_pipe,
             QUIC_CML_PIPE app_send_pipe,
             const void *buf, size_t buf_len)
{
    const unsigned char *p = buf; /* byte pointer so it can be advanced */
    size_t bytes_written = 0;
    QUIC_CML_PIPE pipes[2];

    pipes[0] = notification_pipe;
    pipes[1] = app_send_pipe;

    for (;;) {
        /* e.g. connection termination */
        process_any_notifications(notification_pipe);

        /* state checks, etc. */
        if (...->conn_terminated)
            return 0;

        if (buf_len == 0)
            return 1;

        if (!ossl_cml_write(cml, app_send_pipe, p, buf_len, &bytes_written))
            return 0;

        if (bytes_written == 0) {
            if (!should_block())
                break;

            /* wait for a notification or for the pipe to accept more data */
            ossl_cml_block_until(cml, pipes, 2, ossl_time_infinite());
            continue; /* try again */
        }

        p       += bytes_written;
        buf_len -= bytes_written;
    }

    return 1;
}
```

The CML functions described above might be declared as follows:

```c
/*
 * Creates a new CML using the Direct CML (DCML) implementation. need_locking
 * may be 0 to elide mutex usage if the application is guaranteed to synchronise
 * access or is purely single-threaded.
 */
QUIC_CML *ossl_cml_new_direct(int need_locking);

/* Creates a new CML using the Worker CML (WCML) implementation. */
QUIC_CML *ossl_cml_new_worker(size_t num_worker_threads);

/*
 * Starts the CML operating. Idempotent after it returns successfully. For the
 * WCML this might e.g. start background threads; for the DCML it is likely to
 * be a no-op (but must still be called).
 */
int ossl_cml_start(QUIC_CML *cml);

/*
 * Begins the CML shutdown process. Returns 1 once shutdown is complete; may
 * need to be called multiple times until shutdown is done.
 */
int ossl_cml_shutdown(QUIC_CML *cml);

/*
 * Immediate free of the CML. This is always safe but may cause handling
 * of a connection to be aborted abruptly as it is an immediate teardown
 * of all state.
 */
void ossl_cml_free(QUIC_CML *cml);

/* Pipe classes (QUIC_CML_CLASS values) accepted by ossl_cml_get_pipe. */
enum {
    QUIC_CML_CLASS_REQUEST,      /* control; send */
    QUIC_CML_CLASS_NOTIFICATION, /* control; recv */
    QUIC_CML_CLASS_APP_SEND,     /* data; send */
    QUIC_CML_CLASS_APP_RECV      /* data; recv */
};

/*
 * Retrieves a pipe for a logical CML object described by selector. The pipe
 * handle, which is stable over the life of the logical CML object, is written
 * to *pipe_handle. class_ is a QUIC_CML_CLASS value.
 */
int ossl_cml_get_pipe(QUIC_CML *cml,
                      int class_,
                      const QUIC_CML_SELECTOR *selector,
                      QUIC_CML_PIPE *pipe_handle);

/*
 * Returns the number of bytes a sending pipe can currently accept. The returned
 * value may increase over time asynchronously but will only decrease in
 * response to an ossl_cml_write call.
 */
size_t ossl_cml_write_available(QUIC_CML *cml, QUIC_CML_PIPE pipe_handle);

/*
 * Appends bytes into a sending pipe by copying them. The buffer can be freed
 * as soon as this call returns. The number of bytes accepted, which may be
 * less than buf_len (or zero if the pipe is full), is written to
 * *bytes_written, as used by the do_write example above.
 */
int ossl_cml_write(QUIC_CML *cml, QUIC_CML_PIPE pipe_handle,
                   const void *buf, size_t buf_len,
                   size_t *bytes_written);

/*
 * Returns the number of bytes a receiving pipe currently has waiting to be
 * read. The returned value may increase over time asynchronously but will only
 * decrease in response to an ossl_cml_read call.
 */
size_t ossl_cml_read_available(QUIC_CML *cml, QUIC_CML_PIPE pipe_handle);

/*
 * Reads bytes from a receiving pipe by copying them. By symmetry with
 * ossl_cml_write, the number of bytes read is written to *bytes_read.
 */
int ossl_cml_read(QUIC_CML *cml, QUIC_CML_PIPE pipe_handle,
                  void *buf, size_t buf_len,
                  size_t *bytes_read);

/*
 * Blocks until at least one of the pipes in the array specified by
 * pipe_handles is ready, or until the deadline given is reached.
 *
 * A pipe is ready if:
 *
 *   - it is a sending pipe and one or more bytes can now be written;
 *   - it is a receiving pipe and one or more bytes can now be read.
 */
int ossl_cml_block_until(QUIC_CML *cml,
                         const QUIC_CML_PIPE *pipe_handles,
                         size_t num_pipe_handles,
                         OSSL_TIME deadline);
```