| |
| |
| Valgrind FAQ |
| Release 3.13.0 15 June 2017 |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| Table of Contents |
| 1. Background |
| 2. Compiling, installing and configuring |
| 3. Valgrind aborts unexpectedly |
| 4. Valgrind behaves unexpectedly |
| 5. Miscellaneous |
| 6. How To Get Further Assistance |
| |
| ------------------------------------------------------------------------ |
| 1. Background |
| ------------------------------------------------------------------------ |
| |
| 1.1. How do you pronounce "Valgrind"? |
| The "Val" as in the word "value". The "grind" is pronounced with a short |
| 'i' -- ie. "grinned" (rhymes with "tinned") rather than "grined" (rhymes |
| with "find"). |
| |
| Don't feel bad: almost everyone gets it wrong at first. |
| ------------------------------------------------------------------------ |
| |
| 1.2. Where does the name "Valgrind" come from? |
| From Nordic mythology. Originally (before release) the project was named |
| Heimdall, after the watchman of the Nordic gods. He could "see a hundred |
| miles by day or night, hear the grass growing, see the wool growing on a |
| sheep's back", etc. This would have been a great name, but it was |
| already taken by a security package "Heimdal". |
| |
| Keeping with the Nordic theme, Valgrind was chosen. Valgrind is the name |
| of the main entrance to Valhalla (the Hall of the Chosen Slain in |
| Asgard). Over this entrance there resides a wolf and over it there is |
| the head of a boar and on it perches a huge eagle, whose eyes can see to |
| the far regions of the nine worlds. Only those judged worthy by the |
| guardians are allowed to pass through Valgrind. All others are refused |
| entrance. |
| |
| It's not short for "value grinder", although that's not a bad guess. |
| |
| ------------------------------------------------------------------------ |
| 2. Compiling, installing and configuring |
| ------------------------------------------------------------------------ |
| |
| 2.1. When building Valgrind, 'make' dies partway with an assertion |
| failure, something like this: |
| |
| % make: expand.c:489: allocated_variable_append: |
| Assertion 'current_variable_set_list->next != 0' failed. |
| |
| It's probably a bug in 'make'. Some, but not all, instances of version |
| 3.79.1 have this bug, see this: |
| <http://www.mail-archive.com/[email protected]/msg01658.html>. Try |
| upgrading to a more recent version of 'make'. Alternatively, we have |
| heard that unsetting the CFLAGS environment variable avoids the problem. |
| |
| ------------------------------------------------------------------------ |
| |
| 2.2. When building Valgrind, 'make' fails with this: |
| /usr/bin/ld: cannot find -lc |
| collect2: ld returned 1 exit status |
| |
| You need to install the glibc-static-devel package. |
| |
| ------------------------------------------------------------------------ |
| 3. Valgrind aborts unexpectedly |
| ------------------------------------------------------------------------ |
| |
| 3.1. Programs run OK on Valgrind, but at exit produce a bunch of errors |
| involving __libc_freeres and then die with a segmentation fault. |
| |
| When the program exits, Valgrind runs the procedure __libc_freeres in |
| glibc. This is a hook for memory debuggers, so they can ask glibc to |
| free up any memory it has used. Doing that is needed to ensure that |
| Valgrind doesn't incorrectly report space leaks in glibc. |
| |
| The problem is that running __libc_freeres in older glibc versions |
| causes this crash. |
| |
| Workaround for 1.1.X and later versions of Valgrind: use the |
| --run-libc-freeres=no option. You may then get space leak reports for |
| glibc allocations (please don't report these to the glibc people, since |
| they are not real leaks), but at least the program runs. |
| |
| ------------------------------------------------------------------------ |
| |
| 3.2. My (buggy) program dies like this: |
| valgrind: m_mallocfree.c:248 (get_bszB_as_is): Assertion 'bszB_lo == bszB_hi' failed. |
| |
| or like this: |
| valgrind: m_mallocfree.c:442 (mk_inuse_bszB): Assertion 'bszB != 0' failed. |
| |
| or otherwise aborts or crashes in m_mallocfree.c. |
| If Memcheck (the memory checker) shows any invalid reads, invalid writes |
| or invalid frees in your program, the above may happen. Reason is that |
| your program may trash Valgrind's low-level memory manager, which then |
| dies with the above assertion, or something similar. The cure is to fix |
| your program so that it doesn't do any illegal memory accesses. The |
| above failure will hopefully go away after that. |
| |
| ------------------------------------------------------------------------ |
| |
| 3.3. My program dies, printing a message like this along the way: |
| vex x86->IR: unhandled instruction bytes: 0x66 0xF 0x2E 0x5 |
| |
| One possibility is that your program has a bug and erroneously jumps to |
| a non-code address, in which case you'll get a SIGILL signal. Memcheck |
| may issue a warning just before this happens, but it might not if the |
| jump happens to land in addressable memory. |
| |
| Another possibility is that Valgrind does not handle the instruction. If |
| you are using an older Valgrind, a newer version might handle the |
| instruction. However, all instruction sets have some obscure, rarely |
| used instructions. Also, on amd64 there are an almost limitless number |
| of combinations of redundant instruction prefixes, many of them |
| undocumented but accepted by CPUs. So Valgrind will still have decoding |
| failures from time to time. If this happens, please file a bug report. |
| |
| ------------------------------------------------------------------------ |
| |
| 3.4. I tried running a Java program (or another program that uses a |
| just-in-time compiler) under Valgrind but something went wrong. Does |
| Valgrind handle such programs? |
| |
| Valgrind can handle dynamically generated code, so long as none of the |
| generated code is later overwritten by other generated code. If this |
| happens, though, things will go wrong as Valgrind will continue running |
| its translations of the old code (this is true on x86 and amd64, on |
| PowerPC there are explicit cache flush instructions which Valgrind |
| detects and honours). You should try running with --smc-check=all in |
| this case. Valgrind will run much more slowly, but should detect the use |
| of the out-of-date code. |
| |
| Alternatively, if you have the source code to the JIT compiler you can |
| insert calls to the VALGRIND_DISCARD_TRANSLATIONS client request to mark |
| out-of-date code, saving you from using --smc-check=all. |
| |
| Apart from this, in theory Valgrind can run any Java program just fine, |
| even those that use JNI and are partially implemented in other languages |
| like C and C++. In practice, Java implementations tend to do nasty |
| things that most programs do not, and Valgrind sometimes falls over |
| these corner cases. |
| |
| If your Java programs do not run under Valgrind, even with |
| --smc-check=all, please file a bug report and hopefully we'll be able to |
| fix the problem. |
| |
| |
| ------------------------------------------------------------------------ |
| 4. Valgrind behaves unexpectedly |
| ------------------------------------------------------------------------ |
| |
| 4.1. My program uses the C++ STL and string classes. Valgrind reports |
| 'still reachable' memory leaks involving these classes at the exit of |
| the program, but there should be none. |
| |
| First of all: relax, it's probably not a bug, but a feature. Many |
| implementations of the C++ standard libraries use their own memory pool |
| allocators. Memory for quite a number of destructed objects is not |
| immediately freed and given back to the OS, but kept in the pool(s) for |
| later re-use. The fact that the pools are not freed at the exit of the |
| program cause Valgrind to report this memory as still reachable. The |
| behaviour not to free pools at the exit could be called a bug of the |
| library though. |
| |
| Using GCC, you can force the STL to use malloc and to free memory as |
| soon as possible by globally disabling memory caching. Beware! Doing so |
| will probably slow down your program, sometimes drastically. |
| |
| * With GCC 2.91, 2.95, 3.0 and 3.1, compile all source using the STL |
| with -D__USE_MALLOC. Beware! This was removed from GCC starting with |
| version 3.3. |
| |
| * With GCC 3.2.2 and later, you should export the environment variable |
| GLIBCPP_FORCE_NEW before running your program. |
| |
| * With GCC 3.4 and later, that variable has changed name to |
| GLIBCXX_FORCE_NEW. |
| |
| There are other ways to disable memory pooling: using the malloc_alloc |
| template with your objects (not portable, but should work for GCC) or |
| even writing your own memory allocators. But all this goes beyond the |
| scope of this FAQ. Start by reading |
| http://gcc.gnu.org/onlinedocs/libstdc++/faq/index.html#4_4_leak: |
| <http://gcc.gnu.org/onlinedocs/libstdc++/faq/index.html#4_4_leak> if you |
| absolutely want to do that. But beware: allocators belong to the more |
| messy parts of the STL and people went to great lengths to make the STL |
| portable across platforms. Chances are good that your solution will work |
| on your platform, but not on others. |
| |
| ------------------------------------------------------------------------ |
| |
| 4.2. The stack traces given by Memcheck (or another tool) aren't |
| helpful. How can I improve them? |
| |
| If they're not long enough, use --num-callers to make them longer. |
| If they're not detailed enough, make sure you are compiling with -g to |
| add debug information. And don't strip symbol tables (programs should be |
| unstripped unless you run 'strip' on them; some libraries ship |
| stripped). |
| |
| Also, for leak reports involving shared objects, if the shared object is |
| unloaded before the program terminates, Valgrind will discard the debug |
| information and the error message will be full of ??? entries. The |
| workaround here is to avoid calling dlclose on these shared objects. |
| |
| Also, -fomit-frame-pointer and -fstack-check can make stack traces |
| worse. |
| |
| Some example sub-traces: |
| * With debug information and unstripped (best): |
| Invalid write of size 1 |
| at 0x80483BF: really (malloc1.c:20) |
| by 0x8048370: main (malloc1.c:9) |
| |
| * With no debug information, unstripped: |
| Invalid write of size 1 |
| at 0x80483BF: really (in /auto/homes/njn25/grind/head5/a.out) |
| by 0x8048370: main (in /auto/homes/njn25/grind/head5/a.out) |
| |
| * With no debug information, stripped: |
| Invalid write of size 1 |
| at 0x80483BF: (within /auto/homes/njn25/grind/head5/a.out) |
| by 0x8048370: (within /auto/homes/njn25/grind/head5/a.out) |
| by 0x42015703: __libc_start_main (in /lib/tls/libc-2.3.2.so) |
| by 0x80482CC: (within /auto/homes/njn25/grind/head5/a.out) |
| |
| * With debug information and -fomit-frame-pointer: |
| Invalid write of size 1 |
| at 0x80483C4: really (malloc1.c:20) |
| by 0x42015703: __libc_start_main (in /lib/tls/libc-2.3.2.so) |
| by 0x80482CC: ??? (start.S:81) |
| |
| * A leak error message involving an unloaded shared object: |
| 84 bytes in 1 blocks are possibly lost in loss record 488 of 713 |
| at 0x1B9036DA: operator new(unsigned) (vg_replace_malloc.c:132) |
| by 0x1DB63EEB: ??? |
| by 0x1DB4B800: ??? |
| by 0x1D65E007: ??? |
| by 0x8049EE6: main (main.cpp:24) |
| |
| ------------------------------------------------------------------------ |
| |
| 4.3. The stack traces given by Memcheck (or another tool) seem to have |
| the wrong function name in them. What's happening? |
| |
| Occasionally Valgrind stack traces get the wrong function names. This is |
| caused by glibc using aliases to effectively give one function two |
| names. Most of the time Valgrind chooses a suitable name, but very |
| occasionally it gets it wrong. Examples we know of are printing bcmp |
| instead of memcmp, index instead of strchr, and rindex instead of |
| strrchr. |
| |
| ------------------------------------------------------------------------ |
| |
| 4.4. My program crashes normally, but doesn't under Valgrind, or vice |
| versa. What's happening? |
| |
| When a program runs under Valgrind, its environment is slightly |
| different to when it runs natively. For example, the memory layout is |
| different, and the way that threads are scheduled is different. |
| |
| Most of the time this doesn't make any difference, but it can, |
| particularly if your program is buggy. For example, if your program |
| crashes because it erroneously accesses memory that is unaddressable, |
| it's possible that this memory will not be unaddressable when run under |
| Valgrind. Alternatively, if your program has data races, these may not |
| manifest under Valgrind. |
| |
| There isn't anything you can do to change this, it's just the nature of |
| the way Valgrind works that it cannot exactly replicate a native |
| execution environment. In the case where your program crashes due to a |
| memory error when run natively but not when run under Valgrind, in most |
| cases Memcheck should identify the bad memory operation. |
| |
| ------------------------------------------------------------------------ |
| |
| 4.5. Memcheck doesn't report any errors and I know my program has |
| errors. |
| |
| There are two possible causes of this. |
| First, by default, Valgrind only traces the top-level process. So if |
| your program spawns children, they won't be traced by Valgrind by |
| default. Also, if your program is started by a shell script, Perl |
| script, or something similar, Valgrind will trace the shell, or the Perl |
| interpreter, or equivalent. |
| |
| To trace child processes, use the --trace-children=yes option. |
| If you are tracing large trees of processes, it can be less disruptive |
| to have the output sent over the network. Give Valgrind the option |
| --log-socket=127.0.0.1:12345 (if you want logging output sent to port |
| 12345 on localhost). You can use the valgrind-listener program to listen |
| on that port: |
| |
| valgrind-listener 12345 |
| |
| Obviously you have to start the listener process first. See the manual |
| for more details. |
| |
| Second, if your program is statically linked, most Valgrind tools will |
| only work well if they are able to replace certain functions, such as |
| malloc, with their own versions. By default, statically linked malloc |
| functions are not replaced. A key indicator of this is if Memcheck says: |
| All heap blocks were freed -- no leaks are possible when you know your |
| program calls malloc. The workaround is to use the option |
| --soname-synonyms=somalloc=NONE or to avoid statically linking your |
| program. |
| |
| There will also be no replacement if you use an alternative malloc |
| library such as tcmalloc, jemalloc, ... In such a case, the option |
| --soname-synonyms=somalloc=zzzz (where zzzz is the soname of the |
| alternative malloc library) will allow Valgrind to replace the |
| functions. |
| |
| ------------------------------------------------------------------------ |
| |
| 4.6. Why doesn't Memcheck find the array overruns in this program? |
| int static[5]; |
| |
| int main(void) |
| { |
| int stack[5]; |
| |
| static[5] = 0; |
| stack [5] = 0; |
| |
| return 0; |
| } |
| |
| Unfortunately, Memcheck doesn't do bounds checking on global or stack |
| arrays. We'd like to, but it's just not possible to do in a reasonable |
| way that fits with how Memcheck works. Sorry. |
| |
| However, the experimental tool SGcheck can detect errors like this. Run |
| Valgrind with the --tool=exp-sgcheck option to try it, but be aware that |
| it is not as robust as Memcheck. |
| |
| |
| ------------------------------------------------------------------------ |
| 5. Miscellaneous |
| ------------------------------------------------------------------------ |
| |
| 5.1. I tried writing a suppression but it didn't work. Can you write my |
| suppression for me? |
| |
| Yes! Use the --gen-suppressions=yes feature to spit out suppressions |
| automatically for you. You can then edit them if you like, eg. combining |
| similar automatically generated suppressions using wildcards like '*'. |
| |
| If you really want to write suppressions by hand, read the manual |
| carefully. Note particularly that C++ function names must be mangled |
| (that is, not demangled). |
| |
| ------------------------------------------------------------------------ |
| |
| 5.2. With Memcheck's memory leak detector, what's the difference between |
| "definitely lost", "indirectly lost", "possibly lost", "still |
| reachable", and "suppressed"? |
| |
| The details are in the Memcheck section of the user manual. |
| In short: |
| * "definitely lost" means your program is leaking memory -- fix those |
| leaks! |
| |
| * "indirectly lost" means your program is leaking memory in a |
| pointer-based structure. (E.g. if the root node of a binary tree is |
| "definitely lost", all the children will be "indirectly lost".) If you |
| fix the "definitely lost" leaks, the "indirectly lost" leaks should go |
| away. |
| |
| * "possibly lost" means your program is leaking memory, unless you're |
| doing unusual things with pointers that could cause them to point into |
| the middle of an allocated block; see the user manual for some possible |
| causes. Use --show-possibly-lost=no if you don't want to see these |
| reports. |
| |
| * "still reachable" means your program is probably ok -- it didn't free |
| some memory it could have. This is quite common and often reasonable. |
| Don't use --show-reachable=yes if you don't want to see these reports. |
| |
| * "suppressed" means that a leak error has been suppressed. There are |
| some suppressions in the default suppression files. You can ignore |
| suppressed errors. |
| |
| ------------------------------------------------------------------------ |
| |
| 5.3. Memcheck's uninitialised value errors are hard to track down, |
| because they are often reported some time after they are caused. Could |
| Memcheck record a trail of operations to better link the cause to the |
| effect? Or maybe just eagerly report any copies of uninitialised memory |
| values? |
| |
| Prior to version 3.4.0, the answer was "we don't know how to do it |
| without huge performance penalties". As of 3.4.0, try using the |
| --track-origins=yes option. It will run slower than usual, but will give |
| you extra information about the origin of uninitialised values. |
| |
| Or if you want to do it the old fashioned way, you can use the client |
| request VALGRIND_CHECK_VALUE_IS_DEFINED to help track these errors down |
| -- work backwards from the point where the uninitialised error occurs, |
| checking suspect values until you find the cause. This requires editing, |
| compiling and re-running your program multiple times, which is a pain, |
| but still easier than debugging the problem without Memcheck's help. |
| |
| As for eager reporting of copies of uninitialised memory values, this |
| has been suggested multiple times. Unfortunately, almost all programs |
| legitimately copy uninitialised memory values around (because compilers |
| pad structs to preserve alignment) and eager checking leads to hundreds |
| of false positives. Therefore Memcheck does not support eager checking |
| at this time. |
| |
| ------------------------------------------------------------------------ |
| |
| 5.4. Is it possible to attach Valgrind to a program that is already |
| running? |
| |
| No. The environment that Valgrind provides for running programs is |
| significantly different to that for normal programs, e.g. due to |
| different layout of memory. Therefore Valgrind has to have full control |
| from the very start. |
| |
| It is possible to achieve something like this by running your program |
| without any instrumentation (which involves a slow-down of about 5x, |
| less than that of most tools), and then adding instrumentation once you |
| get to a point of interest. Support for this must be provided by the |
| tool, however, and Callgrind is the only tool that currently has such |
| support. See the instructions on the callgrind_control program for |
| details. |
| |
| |
| ------------------------------------------------------------------------ |
| 6. How To Get Further Assistance |
| ------------------------------------------------------------------------ |
| |
| Read the appropriate section(s) of the Valgrind Documentation: |
| <http://www.valgrind.org/docs/manual/index.html>. |
| |
| Search: <http://search.gmane.org> the valgrind-users: |
| <http://news.gmane.org/gmane.comp.debugging.valgrind> mailing list |
| archives, using the group name gmane.comp.debugging.valgrind. |
| |
| If you think an answer in this FAQ is incomplete or inaccurate, please |
| e-mail [email protected]: <[email protected]>. |
| |
| If you have tried all of these things and are still stuck, you can try |
| mailing the valgrind-users mailing list: |
| <http://www.valgrind.org/support/mailing_lists.html>. Note that an email |
| has a better change of being answered usefully if it is clearly written. |
| Also remember that, despite the fact that most of the community are very |
| helpful and responsive to emailed questions, you are probably requesting |
| help from unpaid volunteers, so you have no guarantee of receiving an |
| answer. |
| |