 LibCppBor: A Modern C++ CBOR Parser and Generator
 ==============================================

 LibCppBor provides a natural and easy-to-use syntax for constructing and
 parsing CBOR messages.  It does not (yet) support all features of
 CBOR, nor (yet) support validation against CDDL schemata, though both
 are planned.  CBOR features that aren't supported include:

 * Indefinite length values
 * Semantic tagging
 * Floating point

 LibCppBor requires C++17.

 ## CBOR representation

 LibCppBor represents CBOR data items as instances of the `Item` class or,
 more precisely, as instances of subclasses of `Item`, since `Item` is a
 pure interface.  The subclasses of `Item` correspond almost one-to-one
 with CBOR major types, and are named to match the CDDL names to which
 they correspond.  They are:

 * `Uint` corresponds to major type 0, and can hold unsigned integers
   up through (2^64 - 1).
 * `Nint` corresponds to major type 1.  It can only hold values from -1
   to -(2^63 - 1), since its internal representation is an int64_t.
   This can be fixed, but it seems unlikely that applications will need
   the omitted range from -(2^63) to -(2^64), since it's
   inconvenient to represent those values in many programming languages.
 * `Int` is an abstract base of `Uint` and `Nint` that facilitates
   working with all signed integers representable with int64_t.
 * `Bstr` corresponds to major type 2, a byte string.
 * `Tstr` corresponds to major type 3, a text string.
 * `Array` corresponds to major type 4, an Array.  It holds a
   variable-length array of `Item`s.
 * `Map` corresponds to major type 5, a Map.  It holds a
   variable-length array of pairs of `Item`s.
 * `Simple` corresponds to major type 7.  It's an abstract class since
   items require a more specific type.
 * `Bool` is the only currently-implemented subclass of `Simple`.

 Note that major type 6, semantic tag, is not yet implemented.
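
As a rough illustration of how the major types listed above show up in encoded data: per the CBOR specification (RFC 8949), every item begins with an initial byte whose top three bits hold the major type and whose low five bits hold the "additional information". The helper names below are hypothetical and are not part of LibCppBor.

```cpp
#include <cstdint>

// Hypothetical helpers, not part of LibCppBor: split a CBOR initial byte
// into its major type (top 3 bits) and additional information (low 5 bits).
constexpr uint8_t majorTypeOf(uint8_t initialByte) {
    return static_cast<uint8_t>(initialByte >> 5);
}

constexpr uint8_t additionalInfoOf(uint8_t initialByte) {
    return static_cast<uint8_t>(initialByte & 0x1F);
}

// A few initial bytes and the major types they carry.
static_assert(majorTypeOf(0x00) == 0, "0x00 is the unsigned integer 0");
static_assert(majorTypeOf(0x20) == 1, "0x20 is the negative integer -1");
static_assert(majorTypeOf(0xA2) == 5 && additionalInfoOf(0xA2) == 2,
              "0xA2 is the header of a map with two entries");
static_assert(majorTypeOf(0xF5) == 7 && additionalInfoOf(0xF5) == 21,
              "0xF5 is the simple value true");
```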

 In practice, users of LibCppBor will rarely use most of these classes
 when generating CBOR encodings.  This is because LibCppBor provides
 straightforward conversions from the obvious normal C++ types.
 Specifically, the following conversions are provided in appropriate
 contexts:

 * Signed and unsigned integers convert to `Uint` or `Nint`, as
   appropriate.
 * `std::string`, `std::string_view`, `const char*` and
   `std::pair<char iterator, char iterator>` convert to `Tstr`.
 * `std::vector<uint8_t>`, `std::pair<uint8_t iterator, uint8_t
   iterator>` and `std::pair<uint8_t*, size_t>` convert to `Bstr`.
 * `bool` converts to `Bool`.
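
The choice between `Uint` and `Nint` follows from how CBOR represents integers: major type 0 stores the value itself, while major type 1 stores `n = -1 - value`. The sketch below shows that mapping for an `int64_t`; the struct and function names are assumptions made for illustration, not LibCppBor's conversion code.

```cpp
#include <cstdint>

// Hypothetical types and names, not part of LibCppBor.
struct HeaderFields {
    uint8_t majorType;  // 0 for unsigned integers, 1 for negative integers
    uint64_t argument;  // the number that is actually written to the wire
};

// Map a signed integer onto the CBOR (major type, argument) pair.
constexpr HeaderFields toHeaderFields(int64_t value) {
    if (value >= 0) {
        return {0, static_cast<uint64_t>(value)};
    }
    // -1 - value is non-negative and cannot overflow for any negative int64_t.
    return {1, static_cast<uint64_t>(-1 - value)};
}

static_assert(toHeaderFields(99).majorType == 0, "99 is a Uint-style value");
static_assert(toHeaderFields(-1).majorType == 1 && toHeaderFields(-1).argument == 0,
              "-1 is stored as argument 0 of major type 1");
```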

 ## CBOR generation

 ### Complete tree generation

 The set of `encode` methods in `Item` provide the interface for
 producing encoded CBOR.  The basic process for "complete tree"
 generation (as opposed to "incremental" generation, which is discussed
 below) is to construct an `Item` which models the data to be encoded,
 and then call one of the `encode` methods, whichever is convenient for
 the encoding destination.  A trivial example:

 ```
 cppbor::Uint val(0);
 std::vector<uint8_t> encoding = val.encode();
 ```

 It's relatively rare that single values are encoded as above.  More often, the
 "root" data item will be an `Array` or `Map` which contains a more complex
 structure.  For example:

 ```
 using cppbor::Map;
 using cppbor::Array;

 std::vector<uint8_t> vec = {/* ... */};
 Map val("key1", Array(Map("key_a", 99, "key_b", vec), "foo"), "key2", true);
 std::vector<uint8_t> encoding = val.encode();
 ```

 This creates a map with two entries, with `Tstr` keys "key1" and
 "key2", respectively.  The "key1" entry has as its value an
 `Array` containing a `Map` and a `Tstr`.  The "key2" entry has a
 `Bool` value.

 This example demonstrates how automatic conversion of C++ types to
 LibCppBor `Item` subclass instances is done.  Where the caller provides a
 C++ or C string, a `Tstr` entry is added.  Where the caller provides
 an integer literal or variable, a `Uint` or `Nint` is added, depending
 on whether the value is non-negative or negative.

 As an alternative, a more fluent-style API is provided for building up
 structures.  For example:

 ```
 using cppbor::Map;
 using cppbor::Array;

 std::vector<uint8_t> vec = {/* ... */};
 Map val;  // Note: `Map val();` would declare a function, not an object.
 val.add("key1", Array().add(Map().add("key_a", 99).add("key_b", vec)).add("foo"))
     .add("key2", true);
 std::vector<uint8_t> encoding = val.encode();
 ```

 An advantage of this interface over the constructor-based creation approach
 above is that it need not be done all at once.  The `add` methods return a
 reference to the object added to, which allows calls to be chained, but
 chaining is not necessary; calls can be made sequentially, as the data to
 add becomes available.
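
The chaining behaviour comes from `add` returning a reference to the container itself. The toy class below is a minimal sketch of that pattern; `ToyMap` and `buildExample` are hypothetical stand-ins, not LibCppBor's `Map`, and the toy only stores string keys with integer values.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a fluent container (hypothetical, not LibCppBor's Map).
class ToyMap {
  public:
    // Returning *this by reference is what makes chaining possible.
    ToyMap& add(std::string key, int value) {
        entries_.emplace_back(std::move(key), value);
        return *this;
    }

    std::size_t size() const { return entries_.size(); }

  private:
    std::vector<std::pair<std::string, int>> entries_;
};

// Builds the same container partly by chaining and partly step by step.
ToyMap buildExample() {
    ToyMap m;
    m.add("a", 1).add("b", 2);  // chained calls
    m.add("c", 3);              // a later, separate call works just as well
    assert(m.size() == 3);
    return m;
}
```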

 #### `encode` methods

 There are several variations of `Item::encode`, all of which
 accomplish the same task but output the encoded data in different
 ways, and with somewhat different performance characteristics.  The
 provided options are:

 * `bool encode(uint8_t** pos, const uint8_t* end)` encodes into the
   buffer referenced by the range [`*pos`, `end`).  `*pos` is moved.  If
   the encoding runs out of buffer space before finishing, the method
   returns false.  This is the most efficient way to encode, into an
   already-allocated buffer.
 * `void encode(EncodeCallback encodeCallback)` calls `encodeCallback`
   for each encoded byte.  It's the responsibility of the implementor
   of the callback to behave safely in the event that the output buffer
   (if applicable) is exhausted.  This is less efficient than the prior
   method because it imposes an additional function call for each byte.
 * `template </*...*/> void encode(OutputIterator i)`
   encodes into the provided iterator.  SFINAE ensures that the
   template doesn't match for non-iterators.  The implementation
   actually uses the callback-based method, plus has whatever overhead
   the iterator adds.
 * `std::vector<uint8_t> encode()` creates a new `std::vector`, reserves
   sufficient capacity to hold the encoding, and inserts the encoded
   bytes with a `std::back_insert_iterator` and the previous method.
 * `std::string toString()` does the same as the previous method, but
   returns a string instead of a vector.
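
To make the `pos`/`end` contract of the buffer-based variant concrete, here is a small self-contained sketch with the same parameter shape. `toyEncodeInto` is a hypothetical helper, not a LibCppBor function, and leaving `*pos` untouched on failure is a choice made for this sketch rather than documented library behaviour.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper with the same shape as the buffer-based encode():
// copies `len` bytes to *pos and advances *pos past them, or returns false
// (leaving *pos untouched) when the range [*pos, end) is too small.
bool toyEncodeInto(const uint8_t* data, std::size_t len, uint8_t** pos, const uint8_t* end) {
    assert(pos != nullptr && *pos != nullptr && *pos <= end);
    if (static_cast<std::size_t>(end - *pos) < len) {
        return false;  // not enough room left in the caller's buffer
    }
    *pos = std::copy(data, data + len, *pos);
    return true;
}
```

A caller would typically keep a `uint8_t*` cursor into a preallocated array, pass its address on each call, and check the returned `bool` before using the bytes written so far.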

 ### Incremental generation

 Incremental generation requires deeper understanding of CBOR, because
 the library can't do as much to ensure that the output is valid.  The
 basic tool for incremental generation is the `encodeHeader`
 function.  There are two variations, one which writes into a buffer,
 and one which uses a callback.  Both simply write out the bytes of a
 header.  To construct the same map as in the above examples,
 incrementally, one might write:

 ```
 using namespace cppbor;  // For example brevity

 // `vec` is the byte vector from the previous examples.
 std::vector<uint8_t> encoding;
 auto iter = std::back_inserter(encoding);
 encodeHeader(MAP, 2 /* # of map entries */, iter);
 std::string s = "key1";
 encodeHeader(TSTR, s.size(), iter);
 std::copy(s.begin(), s.end(), iter);
 encodeHeader(ARRAY, 2 /* # of array entries */, iter);
 Map().add("key_a", 99).add("key_b", vec).encode(iter);
 s = "foo";
 encodeHeader(TSTR, s.size(), iter);
 std::copy(s.begin(), s.end(), iter);
 s = "key2";
 encodeHeader(TSTR, s.size(), iter);
 std::copy(s.begin(), s.end(), iter);
 encodeHeader(SIMPLE, TRUE, iter);
 ```

 As the above example demonstrates, the styles can be mixed.  Note the
 creation and encoding of the inner `Map` using the fluent style.
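
For readers who want to see which bytes a header consists of, the following is a minimal sketch of header encoding as described in RFC 8949: the major type goes into the top three bits, and the argument is either packed into the low five bits (values below 24) or follows in 1, 2, 4 or 8 big-endian bytes. `toyEncodeHeader` and `toyHeaderBytes` are hypothetical names; LibCppBor's own `encodeHeader` may differ in signature and details.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <vector>

// Illustrative re-implementation, not LibCppBor's encodeHeader.
template <typename OutputIt>
OutputIt toyEncodeHeader(uint8_t majorType, uint64_t value, OutputIt out) {
    assert(majorType < 8);
    const uint8_t top = static_cast<uint8_t>(majorType << 5);

    uint8_t info;    // additional information (low 5 bits of the initial byte)
    int extraBytes;  // number of argument bytes that follow the initial byte
    if (value < 24) {
        info = static_cast<uint8_t>(value);
        extraBytes = 0;
    } else if (value <= 0xFF) {
        info = 24;
        extraBytes = 1;
    } else if (value <= 0xFFFF) {
        info = 25;
        extraBytes = 2;
    } else if (value <= 0xFFFFFFFF) {
        info = 26;
        extraBytes = 4;
    } else {
        info = 27;
        extraBytes = 8;
    }

    *out++ = static_cast<uint8_t>(top | info);
    // Argument bytes are written most significant byte first.
    for (int shift = 8 * (extraBytes - 1); shift >= 0; shift -= 8) {
        *out++ = static_cast<uint8_t>(value >> shift);
    }
    return out;
}

// Convenience wrapper that collects the header bytes into a vector.
std::vector<uint8_t> toyHeaderBytes(uint8_t majorType, uint64_t value) {
    std::vector<uint8_t> bytes;
    toyEncodeHeader(majorType, value, std::back_inserter(bytes));
    return bytes;
}
```

Under these rules, the header of a two-entry map (major type 5) is the single byte `0xA2`, and the header of a four-character text string such as "key1" (major type 3) is `0x64`.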

 ## Parsing

 LibCppBor also supports parsing of encoded CBOR data, with the same
 feature set as encoding.  There are two basic approaches to parsing,
 "full" and "stream".

 ### Full parsing

 Full parsing means completely parsing a (possibly-compound) data
 item from a byte buffer.  The `parse` functions that do not take a
 `ParseClient` pointer do this.  They return a `ParseResult` which is a
 tuple of three values:

 * `std::unique_ptr<Item>` that points to the parsed item, or is `nullptr`
   if there was a parse error.
 * `const uint8_t*` that points to the byte after the end of the decoded
   item, or to the first unparseable byte in the event of an error.
 * `std::string` that is empty on success or contains an error message if
   a parse error occurred.

 Assuming a successful parse, you can then use `Item::type()` to
 discover the type of the parsed item (e.g. MAP), and then use the
 appropriate `Item::as*()` method (e.g. `Item::asMap()`) to get a
 pointer to an interface which allows you to retrieve specific values.
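
The three-part result described above pairs naturally with C++17 structured bindings. The sketch below uses toy types to show the pattern; `ToyUint`, `ToyParseResult` and `toyParse` are hypothetical, handle only single-byte unsigned integers, and are not LibCppBor's `Item`, `ParseResult` or `parse`.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <tuple>

// Toy item and result types (hypothetical; the real ones are richer).
struct ToyUint {
    uint64_t value;
};
using ToyParseResult = std::tuple<std::unique_ptr<ToyUint>, const uint8_t*, std::string>;

// Parses one small unsigned integer (0..23, encoded in a single byte).
// On failure the item is null, the pointer marks the offending byte and
// the string carries an error message.
ToyParseResult toyParse(const uint8_t* begin, const uint8_t* end) {
    assert(begin != nullptr && begin <= end);
    if (begin == end) {
        return ToyParseResult(nullptr, begin, "empty input");
    }
    const uint8_t initial = *begin;
    if ((initial >> 5) != 0) {
        return ToyParseResult(nullptr, begin, "not an unsigned integer");
    }
    if ((initial & 0x1F) >= 24) {
        return ToyParseResult(nullptr, begin, "multi-byte integers are not handled in this sketch");
    }
    return ToyParseResult(std::make_unique<ToyUint>(ToyUint{initial}), begin + 1, "");
}
```

A caller can then unpack the result with `auto [item, pos, error] = toyParse(buf, buf + len);` and test `item` for null before using it.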

 ### Stream parsing

 Stream parsing is more complex, but more flexible.  To use
 stream parsing, you must create your own subclass of `ParseClient` and
 call one of the `parse` functions that accepts it.  See the
 `ParseClient` method docstrings for details.

 One unusual feature of stream parsing is that the `ParseClient`
 callback methods not only provide the parsed `Item`, but also pointers
 to the portion of the buffer that encodes that `Item`.  This is useful
 if, for example, you want to find an element inside of a structure,
 and then copy the encoding of that sub-structure, without bothering to
 parse the rest.
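
The idea of handing the client pointers into the original buffer can be sketched with a toy callback interface. Everything below is hypothetical: `ToyClient`, `toyStreamParse` and `FirstTstrCopier` are not LibCppBor names, the real `ParseClient` has a different and richer set of methods, and the toy parser only walks a flat sequence of items whose argument fits in the initial byte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical callback interface for illustration only.
class ToyClient {
  public:
    virtual ~ToyClient() = default;
    // Called once per item with the byte range [begin, end) that encodes it.
    virtual void item(uint8_t majorType, const uint8_t* begin, const uint8_t* end) = 0;
};

// Walks a flat sequence of items with additional information < 24.
// Returns false on truncated input or on anything this sketch does not handle.
bool toyStreamParse(const uint8_t* pos, const uint8_t* end, ToyClient* client) {
    assert(client != nullptr && pos != nullptr && pos <= end);
    while (pos != end) {
        const uint8_t majorType = static_cast<uint8_t>(*pos >> 5);
        const uint8_t info = static_cast<uint8_t>(*pos & 0x1F);
        if (info >= 24) return false;
        // Only byte and text strings carry payload bytes after the header here.
        const std::size_t payload = (majorType == 2 || majorType == 3) ? info : 0;
        if (static_cast<std::size_t>(end - pos) < 1 + payload) return false;
        client->item(majorType, pos, pos + 1 + payload);
        pos += 1 + payload;
    }
    return true;
}

// Example client: copies the raw encoding of the first text string it sees.
class FirstTstrCopier : public ToyClient {
  public:
    void item(uint8_t majorType, const uint8_t* begin, const uint8_t* end) override {
        if (majorType == 3 && copy.empty()) copy.assign(begin, end);
    }
    std::vector<uint8_t> copy;
};
```

Because the client receives the raw byte range, it can copy an already-encoded sub-item verbatim instead of re-encoding it.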

 The full parser is implemented with the stream parser.

 ### Disclaimer
 This is not an officially supported Google product.