Converting a Vector of Bytes to String in Rust

1. Overview

In Rust, a vector of bytes (Vec<u8>) represents a resizable array of 8-bit unsigned integers.

In this tutorial, we'll explore different ways to convert a vector of bytes to a string.

2. Understanding bytes and strings

A byte is a basic unit of information storage, consisting of 8 bits. A string is a sequence of characters. In Rust, strings are UTF-8 encoded, which means each character is represented by one to four bytes.
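To make the one-to-four-byte range concrete, here is a minimal sketch that measures how many bytes individual characters occupy. The helper name utf8_widths and the sample characters are chosen only for this illustration; char::len_utf8() and str::len() are standard library methods.

```rust
// Hypothetical helper for this illustration: returns the number of bytes
// each character of `text` occupies when encoded as UTF-8.
fn utf8_widths(text: &str) -> Vec<usize> {
    text.chars().map(|c| c.len_utf8()).collect()
}

fn main() {
    // 'a' (U+0061), 'é' (U+00E9), '€' (U+20AC), and U+10000
    let text = "a\u{e9}\u{20ac}\u{10000}";

    println!("{:?}", utf8_widths(text)); // prints [1, 2, 3, 4]
    println!("{}", text.chars().count()); // prints 4 (characters)
    println!("{}", text.len()); // prints 10 (bytes)
}
```

Note that len() counts bytes, not characters, which is why the two numbers differ for non-ASCII text.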

Rust provides more than one way to convert a vector of bytes into a string. We’ll explore two of them in the next section.

3. Converting a vector of bytes to a string

3.1. Using from_utf8()

from_utf8() is an associated function of the String type. It takes ownership of a Vec<u8> and creates a String from it without copying the bytes. It returns a Result<String, FromUtf8Error> because the conversion fails if the vector contains bytes that aren't valid UTF-8.

Let’s see an example that converts a Vec<u8> to a String using from_utf8():

fn main() {
    // ASCII codes for "Hello World"
    let bytes: Vec<u8> = vec![0x48, 0x65, 0x6c, 0x6c, 0x6f, 0x20, 0x57, 0x6f, 0x72, 0x6c, 0x64];
    // from_utf8() consumes the vector and validates it as UTF-8
    match String::from_utf8(bytes) {
        Ok(string) => println!("{}", string),
        Err(e) => println!("Invalid UTF-8 sequence: {}", e),
    }
}

Here, we create a vector of bytes in which each byte is the ASCII code of one character of "Hello World". Then, we pass the vector to from_utf8(), which consumes it. Since the function returns a Result, we use pattern matching to handle both outcomes: we print the string on success and the error on failure.
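The example above only exercises the success path. As a rough sketch of the failure path, the following code passes in a byte that can never appear in UTF-8 (0xFF) and inspects the returned FromUtf8Error. The helper name describe and its message format are hypothetical; utf8_error(), valid_up_to(), and into_bytes() are standard library methods.

```rust
// Hypothetical helper for this illustration: summarizes the outcome of
// String::from_utf8() as a human-readable message.
fn describe(bytes: Vec<u8>) -> String {
    match String::from_utf8(bytes) {
        Ok(s) => format!("valid: {}", s),
        Err(e) => {
            // Number of leading bytes that were valid UTF-8
            let valid_up_to = e.utf8_error().valid_up_to();
            // into_bytes() hands back the original vector, so nothing is lost
            let original = e.into_bytes();
            format!(
                "invalid after {} valid byte(s) of {}",
                valid_up_to,
                original.len()
            )
        }
    }
}

fn main() {
    println!("{}", describe(vec![0x48, 0x69])); // prints "valid: Hi"
    println!("{}", describe(vec![0x48, 0x69, 0xFF])); // prints "invalid after 2 valid byte(s) of 3"
}
```

Because the error returns the original vector, the caller can still fall back to another strategy, such as a lossy conversion.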

3.2. Using from_utf8_lossy()

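The from_utf8_lossy() function converts a slice of bytes (&[u8]) to a string even if the bytes aren't valid UTF-8. It replaces each invalid sequence with the Unicode replacement character (U+FFFD). Unlike from_utf8(), it only borrows the bytes, and it returns a Cow<str> instead of a Result.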

Here’s an example that uses from_utf8_lossy() to convert a Vec<u8> to a string:

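fn main() {
    let bytes_lossy: Vec<u8> = vec![0xF0, 0x90, 0x80]; // truncated 4-byte sequence
    println!("{}", String::from_utf8_lossy(&bytes_lossy)); // prints the replacement character
}

In the code above, the three bytes form an incomplete UTF-8 sequence: 0xF0 starts a four-byte character, but only two continuation bytes follow. Since the sequence is invalid, from_utf8_lossy() replaces it with a single replacement character (U+FFFD) instead of returning an error.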
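Since from_utf8_lossy() returns a Cow<str>, it only allocates a new String when it actually has to replace something. The sketch below shows both cases. The helper name lossy_kind and its return shape are hypothetical and exist only for this illustration.

```rust
use std::borrow::Cow;

// Hypothetical helper for this illustration: reports which Cow variant
// from_utf8_lossy() produced, together with the resulting text.
fn lossy_kind(bytes: &[u8]) -> (&'static str, String) {
    match String::from_utf8_lossy(bytes) {
        // Valid input: the result borrows the original bytes, no allocation
        Cow::Borrowed(s) => ("borrowed", s.to_string()),
        // Invalid input: a new String with U+FFFD substitutions was built
        Cow::Owned(s) => ("owned", s),
    }
}

fn main() {
    let valid: Vec<u8> = vec![0x48, 0x69]; // "Hi"
    let truncated: Vec<u8> = vec![0xF0, 0x90, 0x80]; // incomplete 4-byte sequence

    println!("{:?}", lossy_kind(&valid));
    println!("{:?}", lossy_kind(&truncated));
}
```

When an owned String is needed in either case, calling into_owned() on the returned Cow<str> produces one.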

4. Conclusion

In this article, we learned two ways to convert a vector of bytes to a string. The from_utf8() function is strict: it returns an error if the bytes aren't valid UTF-8. In contrast, from_utf8_lossy() always succeeds by replacing each invalid sequence with the replacement character.

The complete example code is available on GitHub.