Unicode byte-order mark detection for Rust projects.
unicode-bom reads the first few bytes from an array or a file on disk, then determines whether a byte-order mark is present.
It won't check the rest of the data to determine whether it's actually valid according to the indicated encoding.
Add it to your dependencies in Cargo.toml:
[dependencies]
unicode-bom = "2"
For more detailed information see the API docs, but the general gist is as follows:
use unicode_bom::Bom;

// The BOM can be parsed from a file on disk via the `FromStr` trait...
let bom: Bom = "foo.txt".parse().unwrap();
match bom {
    Bom::Null => {
        // No BOM was detected
    }
    Bom::Bocu1 => {
        // BOCU-1 BOM was detected
    }
    Bom::Gb18030 => {
        // GB 18030 BOM was detected
    }
    Bom::Scsu => {
        // SCSU BOM was detected
    }
    Bom::UtfEbcdic => {
        // UTF-EBCDIC BOM was detected
    }
    Bom::Utf1 => {
        // UTF-1 BOM was detected
    }
    Bom::Utf7 => {
        // UTF-7 BOM was detected
    }
    Bom::Utf8 => {
        // UTF-8 BOM was detected
    }
    Bom::Utf16Be => {
        // UTF-16 (big-endian) BOM was detected
    }
    Bom::Utf16Le => {
        // UTF-16 (little-endian) BOM was detected
    }
    Bom::Utf32Be => {
        // UTF-32 (big-endian) BOM was detected
    }
    Bom::Utf32Le => {
        // UTF-32 (little-endian) BOM was detected
    }
}

// ...or you can detect the BOM in a byte array
let bytes = [0u8, 0u8, 0xfeu8, 0xffu8];
let bom = Bom::from(&bytes[0..]);
assert_eq!(bom, Bom::Utf32Be);
assert_eq!(bom.len(), 4);
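For intuition about what detecting a BOM in a byte array involves, here is a minimal, standard-library-only sketch of prefix matching for the UTF-8, UTF-16 and UTF-32 marks. It is not the crate's implementation: the `detect_bom` function and its return type are hypothetical and exist only for illustration, and the other encodings listed above are left out.

```rust
/// Hypothetical helper (not part of unicode-bom): returns the encoding name
/// and byte length of a leading BOM, or `None` if none of the marks listed
/// below is present at the start of `bytes`.
fn detect_bom(bytes: &[u8]) -> Option<(&'static str, usize)> {
    // Longer signatures come first, because the UTF-32 LE mark
    // (FF FE 00 00) begins with the UTF-16 LE mark (FF FE).
    const SIGNATURES: [(&str, &[u8]); 5] = [
        ("UTF-32BE", &[0x00, 0x00, 0xFE, 0xFF]),
        ("UTF-32LE", &[0xFF, 0xFE, 0x00, 0x00]),
        ("UTF-8", &[0xEF, 0xBB, 0xBF]),
        ("UTF-16BE", &[0xFE, 0xFF]),
        ("UTF-16LE", &[0xFF, 0xFE]),
    ];

    SIGNATURES
        .iter()
        // Only the leading bytes are inspected; the rest is never validated.
        .find(|(_, sig)| bytes.starts_with(sig))
        .map(|(name, sig)| (*name, sig.len()))
}
```

The returned length plays the same role as `bom.len()` in the example above: it tells you how many leading bytes to skip before decoding the remaining data.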
If you don't already have Rust installed, get that first using rustup:
curl https://sh.rustup.rs -sSf | sh
Then you can build the project:
cargo b
And run the tests:
cargo t