| # unicode-bom |
| |
| [](https://gitlab.com/philbooth/unicode-bom/pipelines) |
| [](https://crates.io/crates/unicode_bom) |
| [](https://crates.io/crates/unicode_bom) |
| [](https://www.apache.org/licenses/LICENSE-2.0) |
| |
| [Unicode byte-order mark](https://en.wikipedia.org/wiki/Byte_order_mark) detection |
| for Rust projects. |
| |
| * [What does it do?](#what-does-it-do) |
| * [What doesn't it do?](#what-doesnt-it-do) |
| * [How do I install it?](#how-do-i-install-it) |
| * [How do I use it?](#how-do-i-use-it) |
| * [How do I set up the build environment?](#how-do-i-set-up-the-build-environment) |
| * [Is there API documentation?](#is-there-api-documentation) |
| * [Is there a change log?](#is-there-a-change-log) |
| * [What license is it published under?](#what-license-is-it-published-under) |
| |
| ## What does it do? |
| |
| `unicode-bom` will read |
| the first few bytes from |
| an array or a file on disk, |
| then determine whether |
| a byte-order mark is present. |
| |
| ## What doesn't it do? |
| |
| It won't check the rest of the data |
| to determine whether it's actually valid |
| according to the indicated encoding. |
| |
| ## How do I install it? |
| |
| Add it to your dependencies |
| in `Cargo.toml`: |
| |
| ```toml |
| [dependencies] |
| unicode-bom = "2" |
| ``` |
| |
| ## How do I use it? |
| |
| For more detailed information |
| see the [API docs](https://philbooth.gitlab.io/unicode-bom/unicode_bom/), |
| but the general gist |
| is as follows: |
| |
| ```rust |
| use unicode_bom::Bom; |
| |
| // The BOM can be parsed from a file on disk via the `FromStr` trait... |
| let bom: Bom = "foo.txt".parse().unwrap(); |
| match bom { |
| Bom::Null => { |
| // No BOM was detected |
| } |
| Bom::Bocu1 => { |
| // BOCU-1 BOM was detected |
| } |
| Bom::Gb18030 => { |
| // GB 18030 BOM was detected |
| } |
| Bom::Scsu => { |
| // SCSU BOM was detected |
| } |
| Bom::UtfEbcdic => { |
| // UTF-EBCDIC BOM was detected |
| } |
| Bom::Utf1 => { |
| // UTF-1 BOM was detected |
| } |
| Bom::Utf7 => { |
| // UTF-7 BOM was detected |
| } |
| Bom::Utf8 => { |
| // UTF-8 BOM was detected |
| } |
| Bom::Utf16Be => { |
| // UTF-16 (big-endian) BOM was detected |
| } |
| Bom::Utf16Le => { |
| // UTF-16 (little-endian) BOM was detected |
| } |
| Bom::Utf32Be => { |
| // UTF-32 (big-endian) BOM was detected |
| } |
| Bom::Utf32Le => { |
| // UTF-32 (little-endian) BOM was detected |
| } |
| } |
| |
| // ...or you can detect the BOM in a byte array |
| let bytes = [0u8, 0u8, 0xfeu8, 0xffu8]; |
| let bom = Bom::from(&bytes[0..]); |
| assert_eq!(bom, Bom::Utf32Be); |
| assert_eq(bom.len(), 4); |
| ``` |
| |
| ## How do I set up the build environment? |
| |
| If you don't already have Rust installed, |
| get that first using [`rustup`](https://rustup.rs/): |
| |
| ``` |
| curl https://sh.rustup.rs -sSf | sh |
| ``` |
| |
| Then you can build the project: |
| |
| ``` |
| cargo b |
| ``` |
| |
| And run the tests: |
| |
| ``` |
| cargo t |
| ``` |
| |
| ## Is there API documentation? |
| |
| [Yes](https://philbooth.gitlab.io/unicode-bom/unicode_bom/). |
| |
| ## Is there a change log? |
| |
| [Yes](HISTORY.md). |
| |
| ## What license is it published under? |
| |
| [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0). |
| |