Does a const char* literal string persistently exist as long as the process is alive?



12

I have functions like the following:

const char* get_message() {
    return "This is a constant message, will NOT change forever!";
}

const char* get_message2() {
    return "message2";
}

I’m planning to use them everywhere in my app, even across different threads.

I’m wondering about the lifetime of these strings, i.e. whether it’s safe to use these const char* strings outside the function get_message.

My guess is that a hard-coded const char* string is compiled into the code segment of an app rather than a data segment, so is it safe to use them as above?

10

  • 10

    Literal strings like that have a life-time of the whole program. But note that a const char * can point to non-literal strings as well, and the life-time of those string depends on what they're pointing to.

    – Some programmer dude

    14 hours ago


  • 4

    Obligatory nitpick, string literals have type const char [N], not const char *.

    – HolyBlackCat

    14 hours ago

  • 3

    I would suggest auto& instead of const char* to not needlessly lose type information. It will also make functions requiring arrays of known bound happy – and you can use range-based for loops on the returned C strings if needed.

    – Ted Lyngmo

    14 hours ago


  • 3

    Simple solution for strings in C++: Use std::string for all your strings. Then the lifetime issue is sidestepped. There are exceptions (as for any strict rule) where std::string_view might be a better choice, but if you only use std::string you can't go wrong.

    – Some programmer dude

    13 hours ago


  • 2

    This is the bad thing with "C" style coding, and it is why C++ has containers (like std::vector and std::string) and smart pointers (like std::unique_ptr and std::shared_ptr). All these types help with describing the lifetime of objects. Your constant object is probably best modeled using a std::string_view, and a message that can change using a std::string.

    – Pepijn Kramer

    13 hours ago
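The two suggestions from the comments above (returning auto& and using std::string_view) can be sketched roughly as follows. The function names get_message2_array and get_message2_view are made up for this illustration, and the sketch assumes a C++17 compiler.

```cpp
#include <cstddef>
#include <iterator>
#include <string_view>

// Hypothetical variant of get_message2 that returns a reference to the
// literal's array. The deduced return type is const char (&)[9]:
// 8 characters plus the terminating '\0'.
auto& get_message2_array() {
    return "message2";
}

// Hypothetical variant that returns a std::string_view over the literal.
// The view stays valid because the literal has static storage duration.
constexpr std::string_view get_message2_view() {
    return "message2";
}

// Counts the elements visited by a range-based for loop over the array.
// Note that the loop also visits the terminating '\0'.
std::size_t count_array_elements() {
    std::size_t n = 0;
    for (char c : get_message2_array()) {
        (void)c;  // the value itself is not needed here
        ++n;
    }
    return n;
}
```

With these definitions, std::size(get_message2_array()) is 9, while get_message2_view().size() is 8, since the view does not include the terminator.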


4 Answers


12

To give an answer from the standard: "message" is a string literal, and string literals have static storage duration, which means that the object (the char const[] that contains the characters) lives for the entire program. (Lifetime is a bit more complicated for objects with non-trivial constructors or destructors, but that does not apply to character arrays.) So pointers to it remain valid for the lifetime of the program.
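As a rough illustration of the difference between a literal and other things a const char* can point to, consider the sketch below. get_message2 comes from the question; make_message is a hypothetical helper added only for contrast, and the dangling case is left commented out on purpose.

```cpp
#include <cstring>
#include <string>

// Safe: the literal has static storage duration, so the returned pointer
// stays valid after the function returns.
const char* get_message2() {
    return "message2";
}

// NOT safe (kept as a comment): buf is a local array that only copies the
// literal, so the pointer dangles as soon as the function returns.
//
// const char* get_dangling() {
//     char buf[] = "message2";
//     return buf;
// }

// Hypothetical helper: when the text is built at run time, returning a
// std::string by value avoids the lifetime question altogether.
std::string make_message(int n) {
    return "message" + std::to_string(n);
}
```

The pointer returned by get_message2 can be stored and used anywhere in the program; the commented-out function shows the case where the same return type would not be safe.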


8

Yes, it is safe to do that. Your assumptions are correct.

3

  • That is not quite right: the code segment is .text, but this string constant will be compiled into the .rodata section of the ELF file.

    – Drazen Grasovec

    13 hours ago

  • 6

    @DrazenGrasovec Yes, and in PE files it will usually be the .rdata segment. But this is a technical detail that is not really helpful in the context of the question (in my opinion).

    – Christian Halaszovich

    13 hours ago

  • 2

    @DrazenGrasovec The standard doesn't speak about code segments in binary files, which is an implementation detail. It does however say that string literals have static storage duration and your code is therefore safe, no matter where the actual string literals end up.

    – Ted Lyngmo

    6 hours ago



6

The short answer is that the string literal "message2" will exist in memory
for as long as the process does, but in the .rodata section (assuming we are talking about an ELF file).

We return a pointer to a string constant but, as we will see later, no separate memory is set aside anywhere to store this const char * pointer.
There is no need for it, because the address of the string is computed in code and returned in register rax every time the function is called.

But let's take a look at what happens in the code, using gdb.

[Screenshot of the gdb session; image not preserved]

We put a breakpoint in our function that returns a pointer to the constant string, and we look at the assembly code and the process map.

[Screenshot of the disassembly and process map; image not preserved]

The code obtains the string's address with the following instruction:

0x000055555555514a <+8>:    lea    0xeb3(%rip),%rax        # 0x555555556004

This instruction computes the address of "message2".
Here we see what PIC (position-independent code) means.

The address of the "message2" string is not hardcoded as an absolute value.
It is computed relative to the instruction pointer: the hardcoded offset 0xeb3 is added to the address of the next instruction (0x555555555151 + 0xeb3 = 0x555555556004) and the result is put in register rax.

Relative addressing (current address +/- offset)
means the process always gets the right address of "message2",
no matter where in memory the program is loaded.
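The address arithmetic described above can be checked with a few compile-time constants. This is only a sketch that reuses the numbers from the gdb output; the variable names are invented for the example.

```cpp
#include <cstdint>

// Values copied from the gdb session shown in this answer.
constexpr std::uint64_t lea_address      = 0x55555555514a;  // address of the lea instruction
constexpr std::uint64_t next_instruction = 0x555555555151;  // value of rip while the lea executes
constexpr std::uint64_t displacement     = 0xeb3;           // offset encoded in the instruction

// Length of the lea instruction, derived from the two addresses above.
constexpr std::uint64_t lea_length = next_instruction - lea_address;

// RIP-relative addressing: target = address of next instruction + displacement.
constexpr std::uint64_t string_address = next_instruction + displacement;

static_assert(lea_length == 7, "the two addresses are 7 bytes apart");
static_assert(string_address == 0x555555556004,
              "matches the address gdb prints in its comment");
```

If the program were loaded at a different base address, lea_address and next_instruction would both shift by the same amount, and so would the computed string address, while the displacement would stay 0xeb3.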

So here we see that the const char * you asked about doesn’t actually exist in memory as a stored pointer, because the address is calculated "on the fly" and returned in rax.

We have the address in $rax:

(gdb) i r $rax
rax   0x555555556004      93824992239620

And it holds the address of "message2":

(gdb) x/s 0x555555556004
0x555555556004: "message2"

Now let's see where address 0x555555556004 falls in the process address
map:

0x555555556000     0x555555557000     0x1000     0x2000  r--p   /home/drazen/proba/main

So this mapping is neither executable nor writable, only readable and private (r--p). The p marks a private (copy-on-write) mapping, which is how an executable's file-backed segments are normally mapped.
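The same lookup can be done from inside the program instead of from gdb. The sketch below is Linux-specific and assumes /proc/self/maps is readable; mapping_permissions is a hypothetical helper name, and on other platforms it simply returns an empty string.

```cpp
#include <cstdint>
#include <fstream>
#include <sstream>
#include <string>

const char* get_message2() {
    return "message2";
}

// Returns the permission field (for example "r--p") of the mapping that
// contains p, or an empty string if /proc/self/maps cannot be read or no
// mapping matches. Assumes the Linux /proc/self/maps line format:
//   start-end perms offset dev inode path
std::string mapping_permissions(const void* p) {
    const auto addr = reinterpret_cast<std::uintptr_t>(p);
    std::ifstream maps("/proc/self/maps");
    std::string line;
    while (std::getline(maps, line)) {
        std::istringstream fields(line);
        std::string range, perms;
        if (!(fields >> range >> perms)) continue;
        const auto dash = range.find('-');
        if (dash == std::string::npos) continue;
        const auto start = std::stoull(range.substr(0, dash), nullptr, 16);
        const auto end   = std::stoull(range.substr(dash + 1), nullptr, 16);
        if (addr >= start && addr < end) return perms;
    }
    return {};
}
```

Calling mapping_permissions(get_message2()) on a typical Linux build is expected to report a mapping without the w flag, but the exact permissions depend on the compiler and linker, which is why the standard's guarantee matters more than this observation.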

When we check with readelf, it shows that the string is in the .rodata section of the ELF file:

drazen@HP-ProBook-640G1:~/proba$ readelf  -x .rodata main

Hex dump of section '.rodata':
0x00002000 01000200 6d657373 61676532 00       ....message2.

So the answer is that this string is not hardcoded in the code section .text of the ELF file but in the read-only data section .rodata, and yes, it will exist in memory for as long as the process exists.
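To connect the process map line with the readelf dump, the sketch below copies the bytes from the hex dump and looks up the string at its offset inside the mapping. It assumes that the start of the r--p mapping corresponds to the start of .rodata in this particular binary, which is what the matching 0x2000 values suggest; the names are invented for the example.

```cpp
#include <cstdint>
#include <string>

// Bytes copied from the readelf hex dump of .rodata.
constexpr unsigned char rodata_bytes[] = {
    0x01, 0x00, 0x02, 0x00,                          // four bytes preceding the string
    0x6d, 0x65, 0x73, 0x73, 0x61, 0x67, 0x65, 0x32,  // "message2"
    0x00                                             // terminating '\0'
};

// Addresses copied from the gdb output and the process map.
constexpr std::uint64_t mapping_start  = 0x555555556000;
constexpr std::uint64_t string_address = 0x555555556004;

// Offset of the string inside the mapping (and, by assumption, inside .rodata).
constexpr std::uint64_t offset_in_mapping = string_address - mapping_start;
static_assert(offset_in_mapping == 4, "the string starts 4 bytes into the section");

// Reads the NUL-terminated text stored at that offset in the dumped bytes.
std::string text_at_offset() {
    return reinterpret_cast<const char*>(rodata_bytes + offset_in_mapping);
}
```

text_at_offset() yields "message2", which agrees with what x/s printed at 0x555555556004.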

And just to add a small detail: the string is returned to main() by address, of course, and that address is passed back in register rax, not on the stack:

(gdb) i r
rax   0x555555556004      93824992239620
rbx   0x0 

Hope it helps!

9

  • 4

    Technically you are right. But in the context of the question it doesn't really matter. Also your answer is very platform specific and not necessarily true for all platforms.

    – Christian Halaszovich

    13 hours ago

  • 4

    Please don't post pictures of code/data.

    – Ted Lyngmo

    13 hours ago

  • 6

    You can't judge whether some code is legal from the assembly alone. UB could give you one assembly on one compiler, and break things on another.

    – HolyBlackCat

    12 hours ago

  • 2

    Well, the question was whether this string is hard-coded into the code segment, and technically it isn't: the code segment is where the instructions are, and this constant string is not encoded as part of an instruction. It is just plain data, so it is placed in the .rodata section. I haven't tried this on different platforms, but I am sure it will be placed in .rodata there as well.

    – Drazen Grasovec

    12 hours ago


  • 7

    @DrazenGrasovec It doesn't matter where it is placed. The standard guarantees static lifetime, so the compiler must do something to ensure static lifetime. Different linkers and different architectures have different notions of segments. I've used compilers which placed string literals in the same segment as code, and even ones that placed it in the same segment as other data (and which allowed overwriting it!).

    – James Kanze

    12 hours ago


5

I’m wondering about the lifetime of these strings, i.e. whether it’s safe to use these const char* strings outside the function get_message.

A quick look at the standard then.

Evaluating a string-literal results in a string literal object with static storage duration, initialized from the given characters as specified above. Whether all string-literals are distinct (that is, are stored in nonoverlapping objects) and whether successive evaluations of a string-literal yield the same or a different object is unspecified. [Note: The effect of attempting to modify a string-literal is undefined. —end note]

– ISO/IEC JTC1 SC22 WG21 N4860 (section 5.13.5 [String literals])

So yes. After the function evaluates the string literal and returns a const char* to the string literal, the standard assures that this string literal will be given static storage duration.
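One practical consequence of the quoted wording is that the lifetime is guaranteed but the identity of the object is not. A minimal sketch, assuming a second, hypothetical function get_other_message2 that happens to return the same text:

```cpp
#include <cstring>

const char* get_message2() {
    return "message2";
}

// Hypothetical second function returning a literal with identical text.
const char* get_other_message2() {
    return "message2";
}

// Content comparison: well-defined no matter how the implementation
// stores or merges string literals.
bool same_text(const char* a, const char* b) {
    return std::strcmp(a, b) == 0;
}

// Address comparison: for two literals with the same text the result is
// unspecified, so program logic should not depend on it.
bool same_object(const char* a, const char* b) {
    return a == b;
}
```

same_text(get_message2(), get_other_message2()) is always true, whereas same_object may return either value depending on the compiler and its settings.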


