Generally, the default constructor should be the fastest way of making an empty container.
That’s why I was surprised to see that it’s worse than initializing to an empty string literal:
#include <string>
std::string make_default() {
return {};
}
std::string make_empty() {
return "";
}
This compiles to: (clang 16, libc++)
make_default():
mov rax, rdi
xorps xmm0, xmm0
movups xmmword ptr [rdi], xmm0
mov qword ptr [rdi + 16], 0
ret
make_empty():
mov rax, rdi
mov word ptr [rdi], 0
ret
See live example at Compiler Explorer.
Notice how returning {} is zeroing 24 bytes in total, but returning "" is only zeroing 2 bytes. How come return ""; is so much better?
1 Answer
This is an intentional decision in libc++’s implementation of std::string.
First of all, std::string has a so-called Small String Optimization (SSO), which means that for very short (or empty) strings, it stores the contents directly inside the container rather than allocating dynamic memory.
That’s why we don’t see any allocations in either case.
In libc++, the "short representation" of a std::string consists of:
Size (x86_64) | Meaning |
---|---|
1 bit | "short flag" indicating that it is a short string (zero means yes) |
7 bits | length of the string, excluding null terminator |
0 bytes | padding bytes to align string data (none for basic_string<char> ) |
23 bytes | string data, including null terminator |
For an empty string, we only need to store two bytes of information:
- one zero-byte for the "short flag" and the length
- one zero-byte for the null terminator
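The two-byte claim can be checked with a small byte-level model. This is only a sketch under stated assumptions: ShortModel and every function below are hypothetical names, not libc++ code, and the model assumes the flag sits in the lowest bit of byte 0 with the length in the remaining 7 bits (for an empty string both are zero, so the bit order does not affect the result).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical model of the 24-byte short representation from the table
// above (x86_64, char). This is NOT libc++'s actual type.
struct ShortModel {
    unsigned char bytes[24];
};

// What initializing to "" needs at minimum: flag + length, and a terminator.
inline void init_empty_minimal(ShortModel& m) {
    m.bytes[0] = 0;  // short flag = 0 (short string), length = 0
    m.bytes[1] = 0;  // first data byte holds the null terminator
}

// What the default constructor does according to the answer: zero everything.
inline void init_empty_full(ShortModel& m) {
    std::memset(m.bytes, 0, sizeof m.bytes);
}

// Assumption: flag is the lowest bit of byte 0 (zero means "short").
inline bool is_short(const ShortModel& m) { return (m.bytes[0] & 1u) == 0; }

// Assumption: length occupies the upper 7 bits of byte 0.
inline std::size_t length(const ShortModel& m) {
    assert(is_short(m));
    return m.bytes[0] >> 1;
}

// String data starts right after byte 0 (no padding for char).
inline const char* c_str(const ShortModel& m) {
    assert(is_short(m));
    return reinterpret_cast<const char*>(m.bytes + 1);
}

// Counts how many bytes no longer hold the given fill value.
inline std::size_t bytes_changed(const ShortModel& m, unsigned char fill) {
    std::size_t n = 0;
    for (unsigned char b : m.bytes) n += (b != fill) ? 1 : 0;
    return n;
}
```

Pre-filling a ShortModel with a garbage value such as 0xFF and then calling init_empty_minimal leaves 22 garbage bytes behind, yet length and c_str still describe a valid empty string; init_empty_full reaches the same observable state by touching all 24 bytes.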
The constructor accepting a const char* will only write these two bytes, the bare minimum.
The default constructor "unnecessarily" zeroes all 24 bytes that the std::string contains.
This may be better overall though, because it makes it possible for the compiler to emit std::memset or other SIMD-parallel ways of zeroing arrays of strings in bulk.
For a full explanation, see below:
Initializing to "" / Calling string(const char*)
To understand what happens, let’s look at the libc++ source code for std::basic_string:
// constraints...
/* specifiers... */ basic_string(const _CharT* __s)
: /* leave memory indeterminate */ {
// assert that __s != nullptr
__init(__s, traits_type::length(__s));
// ...
}
This ends up calling __init(__s, 0), where 0 is the length of the string, obtained from std::char_traits<char>:
// template head etc...
void basic_string</* ... */>::__init(const value_type* __s, size_type __sz)
{
// length and constexpr checks
pointer __p;
if (__fits_in_sso(__sz))
{
__set_short_size(__sz); // set size to zero, first byte
__p = __get_short_pointer();
}
else
{
// not entered
}
traits_type::copy(std::__to_address(__p), __s, __sz); // copy string, nothing happens
traits_type::assign(__p[__sz], value_type()); // add null terminator
}
__set_short_size will end up writing only a single byte, because the short representation of a string is:
struct __short
{
struct _LIBCPP_PACKED {
unsigned char __is_long_ : 1; // set to zero when active
unsigned char __size_ : 7; // set to zero for empty string
};
char __padding_[sizeof(value_type) - 1]; // zero size array
value_type __data_[__min_cap]; // null terminator goes here
};
After compiler optimizations, zeroing __is_long_, __size_, and one byte of __data_ compiles to:
mov word ptr [rdi], 0
Initializing to {} / Calling string()
The default constructor is more wasteful by comparison:
/* specifiers... */ basic_string() /* noexcept(...) */
: /* leave memory indeterminate */ {
// ...
__default_init();
}
This ends up calling __default_init(), which does:
/* specifiers... */ void __default_init() {
__r_.first() = __rep(); // set representation to value-initialized __rep
// constexpr-only stuff...
}
Value-initialization of a __rep() results in 24 zero bytes, because:
struct __rep {
union {
__long __l; // first union member gets initialized,
__short __s; // __long representation is 24 bytes large
__raw __r;
};
};
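The same effect can be reproduced outside the library with a stand-in type. This is a rough sketch, not libc++'s definition: LongRep, ShortRep and Rep are hypothetical names, and the check assumes a platform where a null pointer is represented by all-zero bits and LongRep has no padding (true on typical x86_64 targets).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-ins for the long and short representations.
struct LongRep {
    char*       data;
    std::size_t size;
    std::size_t cap;
};

struct ShortRep {
    unsigned char bytes[sizeof(LongRep)];
};

struct Rep {
    union {
        LongRep  l;  // first union member: the one zero-initialization targets
        ShortRep s;
    };
};

// Inspects the object representation through a copy, so no inactive
// union member is read.
inline bool all_bytes_zero(const Rep& r) {
    unsigned char raw[sizeof(Rep)];
    std::memcpy(raw, &r, sizeof raw);
    for (unsigned char b : raw) {
        if (b != 0) return false;
    }
    return true;
}

// Rep has no user-provided constructor, so Rep() zero-initializes the
// object: the first union member is zeroed, which covers the whole
// object here because both members have the same size.
inline Rep make_value_initialized() {
    Rep r = Rep();
    assert(all_bytes_zero(r));  // holds under the assumptions stated above
    return r;
}
```

On a 64-bit target sizeof(Rep) is 24, so the value-initialized object consists of 24 zero bytes, matching the three stores in the make_default() assembly.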
Conclusions
If you want to value-initialize everywhere for the sake of consistency, don’t let this keep you from it. Zeroing out a few bytes unnecessarily isn’t a big performance problem you need to worry about.
In fact, it is helpful when initializing large quantities of strings, because std::memset may be used, or some other SIMD way of zeroing out memory.
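To try the bulk case yourself, something like the following could be pasted into Compiler Explorer. It is a minimal sketch: kCount, make_defaults, make_empties and all_empty are made-up names, and whether the default-constructed version actually becomes a memset depends on the compiler, its version and the optimization flags.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Arbitrary element count chosen for illustration.
constexpr std::size_t kCount = 16;

// Every element is value-initialized, i.e. built with string().
inline std::array<std::string, kCount> make_defaults() {
    return {};
}

// Helper: expands to kCount initializers, each constructed from "".
template <std::size_t... I>
std::array<std::string, sizeof...(I)> make_empties_impl(std::index_sequence<I...>) {
    return {{((void)I, std::string(""))...}};
}

// Every element is built with string(const char*).
inline std::array<std::string, kCount> make_empties() {
    return make_empties_impl(std::make_index_sequence<kCount>{});
}

// Both factories must produce observably identical results.
template <std::size_t N>
bool all_empty(const std::array<std::string, N>& strings) {
    for (const std::string& s : strings) {
        if (!s.empty()) return false;
    }
    return true;
}
```

Both functions return arrays of empty strings, so the only difference to look for is in the generated code for each one.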
- To your point, I went ahead and tested this in Godbolt. Using std::string the compiler will go ahead and use memset after a certain number of elements (11 in my testing), whereas "" will use mov ptr addr, 0 as seen here – PatientPenguin 3 hours ago
- @PatientPenguin I've been getting similar results. I'm not sure if clang ever chooses to turn it into a loop for the "" version. Even at 1024 strings, it will just emit 1024 mov instructions: godbolt.org/z/5x6E7rz1s. This is probably more relevant for std::vector, which would manually begin lifetimes in a loop when resizing. Demo here: godbolt.org/z/Y7xW4j7E7 – Jan Schultke 3 hours ago
I question what exactly "efficient" actually means. People think a computer having lots of free RAM is "good", but that's stupid. Free RAM is unused RAM, and sits there not doing anything. The best state of RAM is that it is being used, but is readily available for more demanding applications.
3 hours ago