mirror of
https://libwebsockets.org/repo/libwebsockets
synced 2025-05-06 08:50:02 +00:00

This provides very memory-efficient CBOR stream parsing and writing. The parser converts pieces of CBOR into callbacks that define the structure, and collates strings and blobs into buffer chunks for extensible and easy access. It is fragmentation-safe and does not need all the CBOR in the same place at one time; chunks of CBOR are parsed and discarded as provided. It does not allocate and just needs a few hundred bytes of stack for even huge CBOR objects. Huge strings and blobs are handled without needing memory to hold them atomically. Includes ./minimal-examples/api-tests/api-test-lecp that unit tests it against 82 official example CBORs and 26 additional test vectors from COSE (just checking the CBOR parsing). The writing apis allow printf style semantics with a variety of CBOR-aware %-formats. The apis write into a context that manages output buffer usage; if the output buffer fills, the apis return with an AGAIN code that lets you issue and reset the output buffer and repeat the api call to issue more output. The subsequent calls can occur much later or from a different function context, so this is perfect for WRITEABLE-mediated output from the network parts of lws. See ./READMEs/README.cbor-lecp.md
107 lines
4.1 KiB
Markdown
# LEJP JSON Stream Parser

|||
|---|---|
|cmake|`LWS_WITH_LEJP`|
|Header|./include/libwebsockets/lws-lejp.h|
|api-test|./minimal-examples/api-tests/api-test-lejp/|
|test app|./test-apps/test-lejp.c -> libwebsockets-test-lejp|

LEJP is a lightweight JSON stream parser.

The features are:

- completely immune to input fragmentation: give it blocks of JSON of any
  size as they become available; 1 byte or 100K at a time gives identical
  parsing results
- input chunks are discarded as they are parsed, so the whole JSON is never
  needed in memory
- nonrecursive, with fixed stack usage of a few dozen bytes
- no heap allocations at all, just requires a ~500 byte context, usually on
  the caller's stack
- issues callbacks to a user-provided handler as members are parsed out
- no payload size limit, supports huge / endless strings bigger than
  system memory
- collates utf-8 text payloads into a 250-byte chunk buffer in the json parser
  context object for ease of access

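The fragmentation immunity in the feature list comes from keeping all parse state in a small context between calls instead of on the call stack. The following is a minimal sketch of that idea only. It is not LEJP code: `demo_scan`, `demo_scan_feed()` and `demo_scan_chunked()` are hypothetical names invented for illustration, and the scanner tracks nothing but nesting depth.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in, not the LEJP API: every bit of scanner state lives
 * in this struct, so the input may arrive in fragments of any size. */
struct demo_scan {
	int depth;	/* current nesting of {} and [] */
	int max_depth;	/* deepest nesting seen so far */
	int in_string;	/* nonzero while inside a "..." string */
	int escaped;	/* nonzero if the previous char was a backslash */
};

/* Consume one fragment; state carries over to the next call */
static void
demo_scan_feed(struct demo_scan *s, const char *chunk, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		char c = chunk[i];

		if (s->in_string) {
			if (s->escaped)
				s->escaped = 0;
			else if (c == '\\')
				s->escaped = 1;
			else if (c == '"')
				s->in_string = 0;
			continue;
		}
		if (c == '"') {
			s->in_string = 1;
		} else if (c == '{' || c == '[') {
			if (++s->depth > s->max_depth)
				s->max_depth = s->depth;
		} else if (c == '}' || c == ']') {
			s->depth--;
		}
	}
}

/* Feed `json` in fragments of `step` bytes. Returns the deepest nesting
 * seen, or -1 if the brackets were not balanced at the end. */
static int
demo_scan_chunked(const char *json, size_t step)
{
	struct demo_scan s = { 0, 0, 0, 0 };
	size_t len = strlen(json), pos = 0;

	if (!step)
		step = 1;

	while (pos < len) {
		size_t n = len - pos < step ? len - pos : step;

		demo_scan_feed(&s, json + pos, n);
		pos += n;
	}

	return s.depth == 0 ? s.max_depth : -1;
}
```

Calling `demo_scan_chunked()` on the same document with a `step` of 1, 7 or 1000 gives the same answer, because nothing about the result depends on where the fragment boundaries fall.
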
## Type handling

LEJP leaves all numbers in text form. Integers and floats are signalled by
different callbacks, but both are delivered as text in the first
`ctx->npos` chars of `ctx->buf`.

For numeric types, you would typically use `atoi()` or similar to recover the
number as a host type.

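As a sketch of the conversion step, the code below recovers host types using `strtol()` and `strtod()`, which report malformed or out-of-range input where `atoi()` cannot. `struct demo_ctx` and the `demo_*` functions are hypothetical stand-ins that only mirror the `buf` and `npos` members named above; the real `struct lejp_ctx` is assumed to differ in size and layout.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the two members named in the text; the real
 * struct lejp_ctx has more members and its own buf size. */
struct demo_ctx {
	char buf[64];
	unsigned int npos;	/* number of valid chars in buf */
};

/* Copy the valid chars out and NUL-terminate them, so the conversion never
 * reads past npos. Returns 0 on success, -1 if they do not fit. */
static int
demo_number_text(const struct demo_ctx *ctx, char *dst, size_t dst_len)
{
	if (ctx->npos > sizeof(ctx->buf) || ctx->npos >= dst_len)
		return -1;

	memcpy(dst, ctx->buf, ctx->npos);
	dst[ctx->npos] = '\0';

	return 0;
}

/* Integer case: returns 0 and sets *out, or -1 on bad input */
static int
demo_ctx_to_long(const struct demo_ctx *ctx, long *out)
{
	char tmp[sizeof(ctx->buf) + 1], *end;

	if (demo_number_text(ctx, tmp, sizeof(tmp)))
		return -1;

	errno = 0;
	*out = strtol(tmp, &end, 10);

	return (end == tmp || *end || errno == ERANGE) ? -1 : 0;
}

/* Float case: same approach using strtod() */
static int
demo_ctx_to_double(const struct demo_ctx *ctx, double *out)
{
	char tmp[sizeof(ctx->buf) + 1], *end;

	if (demo_number_text(ctx, tmp, sizeof(tmp)))
		return -1;

	errno = 0;
	*out = strtod(tmp, &end);

	return (end == tmp || *end || errno == ERANGE) ? -1 : 0;
}
```

Only the first `npos` chars are converted, so stale bytes left later in the buffer from an earlier, longer payload cannot leak into the result.
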
## Callback reasons

The user callback does not have to handle every callback reason; it only needs
to process the data for the ones it is interested in.

|Callback reason|JSON structure|Associated data|
|---|---|---|
|`LEJPCB_CONSTRUCTED`|Created the parse context||
|`LEJPCB_DESTRUCTED`|Destroyed the parse context||
|`LEJPCB_COMPLETE`|The parsing completed OK||
|`LEJPCB_FAILED`|The parsing failed||
|`LEJPCB_VAL_TRUE`|boolean true||
|`LEJPCB_VAL_FALSE`|boolean false||
|`LEJPCB_VAL_NULL`|explicit `null`||
|`LEJPCB_VAL_NUM_INT`|An integer number|`ctx->npos` chars in `ctx->buf`|
|`LEJPCB_VAL_NUM_FLOAT`|A floating-point number|`ctx->npos` chars in `ctx->buf`|
|`LEJPCB_PAIR_NAME`|The name part of a JSON `key: value` map pair|`ctx->buf`|
|`LEJPCB_VAL_STR_START`|A UTF-8 string is starting||
|`LEJPCB_VAL_STR_CHUNK`|The next string chunk|`ctx->npos` bytes in `ctx->buf`|
|`LEJPCB_VAL_STR_END`|The last string chunk|`ctx->npos` bytes in `ctx->buf`|
|`LEJPCB_ARRAY_START`|An array is starting||
|`LEJPCB_ARRAY_END`|An array has ended||
|`LEJPCB_OBJECT_START`|A JSON object is starting||
|`LEJPCB_OBJECT_END`|A JSON object has ended||

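The "handle only what you need" shape of a callback can be sketched as below. Everything here is a hypothetical stand-in rather than the definitions from `lws-lejp.h`: the `DEMO_CB_*` enum, `struct demo_ctx`, its `user` member and the convention that returning 0 lets parsing continue are all assumptions made for illustration.

```c
/* Hypothetical stand-ins, not the definitions from lws-lejp.h: only the
 * shape of "handle a few reasons, ignore the rest" is the point here. */
enum demo_reason {
	DEMO_CB_OBJECT_START,
	DEMO_CB_OBJECT_END,
	DEMO_CB_ARRAY_START,
	DEMO_CB_ARRAY_END,
	DEMO_CB_VAL_TRUE,
	DEMO_CB_VAL_FALSE,
	DEMO_CB_VAL_NULL
};

/* Application state this sketch wants to collect */
struct demo_stats {
	int trues;
	int falses;
};

struct demo_ctx {
	void *user;	/* opaque pointer to the application's own state */
};

static signed char
demo_cb(struct demo_ctx *ctx, char reason)
{
	struct demo_stats *st = (struct demo_stats *)ctx->user;

	switch (reason) {
	case DEMO_CB_VAL_TRUE:
		st->trues++;
		break;
	case DEMO_CB_VAL_FALSE:
		st->falses++;
		break;
	default:
		/* every other reason is simply ignored */
		break;
	}

	return 0;	/* assumed here to mean "carry on parsing" */
}
```

The `default:` case is what keeps the handler short: structure events such as object and array start or end pass through without any code written for them.
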
## Handling JSON UTF-8 strings

When a string is parsed, an advisory callback of `LEJPCB_VAL_STR_START` occurs
first. No payload is delivered with the START callback.

Payload is collated into `ctx->buf[]`, the valid length is in `ctx->npos`.

For short strings that fit in `ctx->buf[]`, the whole payload is delivered in
a single `LEJPCB_VAL_STR_END` callback.

For payloads larger than the size of `ctx->buf[]`, `LEJPCB_VAL_STR_CHUNK`
callbacks occur delivering each sequential bufferload.

The last chunk (which may be zero length) is delivered by `LEJPCB_VAL_STR_END`.

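The START / CHUNK / END sequence can be reassembled into one bounded string as sketched below. This is a standalone illustration, not LEJP code: the `DEMO_CB_*` reasons, `struct demo_ctx` with its deliberately tiny `buf`, the extra third callback argument and the `demo_deliver()` driver that plays the role of the parser are all hypothetical.

```c
#include <string.h>

/* Hypothetical stand-ins mirroring the three string callback reasons */
enum demo_reason {
	DEMO_CB_VAL_STR_START,
	DEMO_CB_VAL_STR_CHUNK,
	DEMO_CB_VAL_STR_END
};

struct demo_ctx {
	char buf[8];		/* deliberately tiny to force chunking */
	unsigned int npos;	/* valid bytes in buf */
};

/* Destination for the reassembled string */
struct demo_str {
	char text[64];
	size_t len;
	int complete;		/* set once END has been seen */
	int truncated;		/* set if the payload did not fit in text */
};

static signed char
demo_str_cb(const struct demo_ctx *ctx, char reason, struct demo_str *s)
{
	size_t n, room;

	switch (reason) {
	case DEMO_CB_VAL_STR_START:	/* no payload, just reset */
		s->len = 0;
		s->complete = 0;
		s->truncated = 0;
		s->text[0] = '\0';
		break;
	case DEMO_CB_VAL_STR_CHUNK:
	case DEMO_CB_VAL_STR_END:	/* both carry npos bytes in buf */
		n = ctx->npos;
		room = sizeof(s->text) - 1 - s->len;
		if (n > room) {
			n = room;
			s->truncated = 1;
		}
		memcpy(s->text + s->len, ctx->buf, n);
		s->len += n;
		s->text[s->len] = '\0';
		if (reason == DEMO_CB_VAL_STR_END)
			s->complete = 1;
		break;
	default:
		break;
	}

	return 0;
}

/* Stand-in for the parser: delivers str in bufferloads as the text describes */
static void
demo_deliver(struct demo_ctx *ctx, const char *str, struct demo_str *s)
{
	size_t left = strlen(str);

	demo_str_cb(ctx, DEMO_CB_VAL_STR_START, s);
	while (left > sizeof(ctx->buf)) {
		memcpy(ctx->buf, str, sizeof(ctx->buf));
		ctx->npos = (unsigned int)sizeof(ctx->buf);
		demo_str_cb(ctx, DEMO_CB_VAL_STR_CHUNK, s);
		str += sizeof(ctx->buf);
		left -= sizeof(ctx->buf);
	}
	memcpy(ctx->buf, str, left);
	ctx->npos = (unsigned int)left;
	demo_str_cb(ctx, DEMO_CB_VAL_STR_END, s);
}
```

Treating CHUNK and END identically apart from the completion flag means short strings, long strings and a zero-length final chunk all go through the same code path. Copying into a fixed destination gives up the "endless string" property, so a handler for genuinely huge payloads would process each bufferload in place instead.
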
## Parsing paths

LEJP maintains a "parsing path" in `ctx->path` that represents the context of
the callback events. As a convenience, at LEJP context creation time, you can
pass in an array of path strings you want to match on, and check for a match
in the callback using `ctx->path_match`: it is 0 if there is no active match,
or the index of the matching entry in your path array, counting from 1 for
the first entry.

|JSON element|Representation in path|
|---|---|
|JSON Array|`[]`|
|JSON Map|`.`|
|JSON Map entry key string|`keystring`|

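The 1-based match index convention can be illustrated with a small standalone sketch. `demo_path_match()` and `demo_paths` are hypothetical names, the example paths are inferred from the representation table above for a document like `{"name":"x","ports":[80,443]}`, and the comparison is a plain exact string match, which is a simplification and may not reflect how LEJP itself matches paths.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical paths of interest for {"name":"x","ports":[80,443]} */
static const char * const demo_paths[] = {
	"name",		/* value of the top-level key "name" */
	"ports[]",	/* each element of the array under "ports" */
};

/*
 * Returns 0 if `path` matches none of the entries, else the index of the
 * matching entry counting from 1, mirroring the convention described for
 * ctx->path_match. Exact comparison only.
 */
static unsigned char
demo_path_match(const char *path, const char * const *paths,
		unsigned char count)
{
	unsigned char n;

	for (n = 0; n < count; n++)
		if (!strcmp(path, paths[n]))
			return (unsigned char)(n + 1);

	return 0;
}
```

Reserving 0 for "no match" is what lets a callback test the result directly as a boolean before using it to select which path was hit.
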
## Comparison with LECP (CBOR parser)

LECP is based on the same principles as LEJP and shares most of the callbacks.
The major differences:

- LEJP values are all delivered in `ctx->buf[]`, ie, floating-point is
  provided to the callback in ascii form like `"1.0"`. CBOR provides a more
  strict typing system, and the different type values are provided either in
  `ctx->buf[]` for blobs or utf-8 text strings, or the `item.u` union for
  converted types, with additional callback reasons specific to each type.

- CBOR "maps" use `_OBJECT_START` and `_END` parsing callbacks around the
  key / value pairs. LEJP has a special callback type `PAIR_NAME` for the
  key string, but in LECP keys, which may be strings or integers, are
  provided as generic callbacks dependent on type, ie, generic string
  callbacks or integer ones, and the value part is represented according to
  whatever comes.