C/C++ Debugging - Useful tips on strace

 Recently, I am making a daemon program to use at the call center. The program collects network packets, restores call center calls, and converts calls into text. However, during the test process, it was found that the program was terminated about once every few days. At first, I thought it was a Segmentation Fault, so I checked whether a coredump occurred, but it was not. 

And the log level of the program was changed and the log was added around the code suspected of error. However, because the program consists of several threads and has a lot of loop statements, it was quickly found that there was a limit to adding logs. 

I looked for suitable debugging tools to find the part where the error occurred more easily.

I was able to find the most commonly used gdb, valgrind, which is effective for checking memory leaks, and strace that can easily investigate system calls and signals, among which I decided to use trace.


Common Daemon Program Errors

The fatal errors encountered in most daemon programs are as follows.


Failed to manage heap memory 

In general, it occurs because the memory allocated using functions such as malloc and strdup is not released as a free function.  If this phenomenon occurs continuously in daemon programs, the program will one day end abnormally due to lack of heap memory.


Segmentation Fault

It is the most notorious error in C/C++. In most cases, it is caused by improper use of memory pointers or buffer overflows. In general, change the system setting to generate a coredump file, and analyze the coredump file that occurred during segmentation fault using gdb to find the location of the error.


Other errors

The remaining errors are when the program is terminated even though the segmentation fault did not occur. We sometimes link other libraries when we create programs. You have to look carefully at the code you wrote.

However, the 3rd party library can detect an error and call the exit() function to shut down the program without me knowing.

In order to determine the cause of this kind of program termination, it is helpful to determine the cause of which system calls were called and which signals were received at the end of the program. strace is useful for figuring out what caused this kind of program to exit.


strace

strace is a diagnostic, debugging and instructional userspace utility for Linux. It is used to monitor and tamper with interactions between processes and the Linux kernel, which include system callssignal deliveries, and changes of process state. The operation of strace is made possible by the kernel feature known as ptrace.

Some Unix-like systems provide other diagnostic tools similar to strace, such as truss.

<From Wiki>

Originally, strace was created by Sun Microsystems in 1991, but later ported to Linux systems, and in the 2012 version, all codes not related to Linux were removed and optimized.


strace log file problem

The biggest problem I felt while using strace was the log file. Since my program receives and analyzes network packets, a lot of network IO and file IO occurs. When I created a log file using strace, a log file of more than 1 GB was created in 10 minutes. This amount is unacceptable. The log can be reduced using the filter function, but it is not a good idea to use the filter when the cause of the initial error is unclear. And I only need the log message at the end of the program. So I'm going to make a simple program to save only the last log of strace.


Important options for strace

  • -f :  If you create a child process or thread, be sure to include this option. If this option is omitted, system calls from child processes and threads are not monitored.
  • -t: Record the time at the log point.
  • -o : Save the output to a file.
  • -p : Attach to the currently running process. Pass the pid value.


strace wrapper program to save only the last part of the log

The following program is a very simple C++ program.

Runs strace as a child process, then receives strace's standard output and stores it in a buffer. And only the specified number of lines are saved to the file. The size of the log file is always kept constant because it is newly saved each time.

One thing to note is that strace outputs to the screen through stderr as well as stdout. In fact, most output goes to stderr. Unlike stdout, output using stderr is immediately displayed on the screen, so stderr is often used for important output. So when you run strace you need to redirect stderr to stdout.


#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <iostream>
#include <list>
using namespace std;

#define QUEUE_SIZE 10
void usage();

int g_queue_size = QUEUE_SIZE;
char g_target[256];

void write_log(list<string> &q)
{
    FILE *fh = fopen("./strace.log", "w");
    if(fh){
        list<string>::iterator iter = q.begin();   
        for(iter=q.begin(); iter!=q.end(); iter++)  
        { 
            string packet = *iter;
            fwrite(packet.c_str(), 1, packet.size(), fh);
        }
        fclose(fh) ;   
    }
    
}

/********************************************************
 * to get stdout, use popen instead of system
 * ******************************************************/
void exec_command(const char *cmd)
{
    char buffer[65330];
    string str;
    list<string> strace_q;
    FILE* pipe = popen(cmd, "r");
    if (pipe){
        while (!feof(pipe)) {
            if (fgets(buffer, sizeof(buffer), pipe) != NULL){
                str = (string)buffer;
                strace_q.push_back(str);
                fprintf(stderr, "%s" , buffer);
            }
            if(strace_q.size() > g_queue_size){   //delete last queue element
                strace_q.pop_front();
            }
        }
        pclose(pipe);
        write_log(strace_q);
} return; } int main(int argc, char *argv[]) { char cmd[256]; int opt, pid = 0; while ((opt = getopt(argc, argv, "hp:q:s:")) != -1) { switch (opt) { case 'h': usage(); exit(1); case 'p': pid = atoi(optarg); break; case 'q': g_queue_size = atoi(optarg); break; case 's': strcpy(g_target, optarg); break; } } if(0 == pid && 0x00 == g_target[0]){ usage(); exit(1); } //strace output to the stderr, so redirect stderr to stdout (2>&1) if(pid){ sprintf(cmd, "strace -t -p %d -f 2>&1" , pid); } else{ sprintf(cmd, "strace -t %s -f 2>&1" , g_target); } exec_command(cmd); return 0; } void usage() { fprintf(stderr, "===== Usage =====\n"); fprintf(stderr, "-p: pid of strace parameter \n"); fprintf(stderr, "-q: log file line size (default 10). \n"); fprintf(stderr, "./strace_splillter -p 10075 -q 100 \n"); fprintf(stderr, "./strace_splillter target_app-q 100 \n"); }

<strace_wrapper.cpp>


Sample buggy program

And I made a program with the following simple bug for testing.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>


int main(int argc, char *argv[])
{
    char buf[256];
    const char *p = "Hello";
    for(int x = 0; x < 10; x++){
        memcpy(buf, p, 5);
        if(0 == memcmp(buf, p, 5)){
            //Do some jobs....
            printf("Compare success\n");
            printf("Compare success\n");
            printf("Compare success\n");
            printf("Compare success\n");
            printf("Compare success\n");
            printf("Compare success\n");
            printf("Compare success\n");
            printf("Compare success\n");
            printf("Compare success\n");
        }
        else{
            //Do some jobs....
            printf("Compare failed\n");
        }
        if(x == 9){
            memcpy(buf, (void *)100, 5);    //  ===> Segmentation Fault 
        }
        fprintf(stderr, "x = %d\n", x);
        sleep(6);
    }
    return 0;
}

<buggy.cpp>


Let's compile and test the two source codes. Numerous logs appear on the screen as shown in the following figure. However, what we need is the final log stored in a file, not screen output. In the real world, I can't wait in front of the monitor to check a daemon that may be shutting down.


[spypiggy@localhost strace]$ ./strace_wrapper -s ./buggy
03:48:29 execve("./buggy", ["./buggy", "-f"], 0x7fff14193980 /* 34 vars */) = 0
03:48:29 brk(NULL)                      = 0xf11000
03:48:29 arch_prctl(0x3001 /* ARCH_??? */, 0x7ffc11bcb160) = -1 EINVAL (Invalid argument)
03:48:29 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "glibc-hwcaps/x86-64-v3/libstdc++.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "glibc-hwcaps/x86-64-v2/libstdc++.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "tls/x86_64/x86_64/libstdc++.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "tls/x86_64/libstdc++.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "tls/x86_64/libstdc++.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "tls/libstdc++.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "x86_64/x86_64/libstdc++.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "x86_64/libstdc++.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "x86_64/libstdc++.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "libstdc++.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "/usr/local/lib/glibc-hwcaps/x86-64-v3/libstdc++.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 stat("/usr/local/lib/glibc-hwcaps/x86-64-v3", 0x7ffc11bca370) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "/usr/local/lib/glibc-hwcaps/x86-64-v2/libstdc++.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 stat("/usr/local/lib/glibc-hwcaps/x86-64-v2", 0x7ffc11bca370) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "/usr/local/lib/tls/x86_64/x86_64/libstdc++.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 stat("/usr/local/lib/tls/x86_64/x86_64", 0x7ffc11bca370) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "/usr/local/lib/tls/x86_64/libstdc++.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 stat("/usr/local/lib/tls/x86_64", 0x7ffc11bca370) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "/usr/local/lib/tls/x86_64/libstdc++.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 stat("/usr/local/lib/tls/x86_64", 0x7ffc11bca370) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "/usr/local/lib/tls/libstdc++.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 stat("/usr/local/lib/tls", 0x7ffc11bca370) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "/usr/local/lib/x86_64/x86_64/libstdc++.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 stat("/usr/local/lib/x86_64/x86_64", 0x7ffc11bca370) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "/usr/local/lib/x86_64/libstdc++.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 stat("/usr/local/lib/x86_64", 0x7ffc11bca370) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "/usr/local/lib/x86_64/libstdc++.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 stat("/usr/local/lib/x86_64", 0x7ffc11bca370) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "/usr/local/lib/libstdc++.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 stat("/usr/local/lib", {st_mode=S_IFDIR|0755, st_size=261, ...}) = 0
03:48:29 openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
03:48:29 fstat(3, {st_mode=S_IFREG|0644, st_size=51579, ...}) = 0
03:48:29 mmap(NULL, 51579, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f459299a000
03:48:29 close(3)                       = 0
03:48:29 openat(AT_FDCWD, "/lib64/libstdc++.so.6", O_RDONLY|O_CLOEXEC) = 3
03:48:29 read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\220\373\10\0\0\0\0\0"..., 832) = 832
03:48:29 lseek(3, 1594712, SEEK_SET)    = 1594712
03:48:29 read(3, "\4\0\0\0\20\0\0\0\5\0\0\0GNU\0\2\0\0\300\4\0\0\0\3\0\0\0\0\0\0\0", 32) = 32
03:48:29 fstat(3, {st_mode=S_IFREG|0755, st_size=1660936, ...}) = 0
03:48:29 mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4592998000
03:48:29 lseek(3, 1594712, SEEK_SET)    = 1594712
03:48:29 read(3, "\4\0\0\0\20\0\0\0\5\0\0\0GNU\0\2\0\0\300\4\0\0\0\3\0\0\0\0\0\0\0", 32) = 32
03:48:29 mmap(NULL, 3753504, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f45923e6000
03:48:29 mprotect(0x7f459256c000, 2093056, PROT_NONE) = 0
03:48:29 mmap(0x7f459276b000, 53248, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x185000) = 0x7f459276b000
03:48:29 mmap(0x7f4592778000, 9760, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f4592778000
03:48:29 close(3)                       = 0
03:48:29 openat(AT_FDCWD, "glibc-hwcaps/x86-64-v3/libm.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "glibc-hwcaps/x86-64-v2/libm.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "tls/x86_64/x86_64/libm.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "tls/x86_64/libm.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "tls/x86_64/libm.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "tls/libm.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "x86_64/x86_64/libm.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "x86_64/libm.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "x86_64/libm.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "libm.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "/usr/local/lib/libm.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "/lib64/libm.so.6", O_RDONLY|O_CLOEXEC) = 3
03:48:29 read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0 \305\0\0\0\0\0\0"..., 832) = 832
03:48:29 fstat(3, {st_mode=S_IFREG|0755, st_size=2191840, ...}) = 0
03:48:29 mmap(NULL, 3674432, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f4592064000
03:48:29 mprotect(0x7f45921e5000, 2093056, PROT_NONE) = 0
03:48:29 mmap(0x7f45923e4000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x180000) = 0x7f45923e4000
03:48:29 close(3)                       = 0
03:48:29 openat(AT_FDCWD, "glibc-hwcaps/x86-64-v3/libgcc_s.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "glibc-hwcaps/x86-64-v2/libgcc_s.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "tls/x86_64/x86_64/libgcc_s.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "tls/x86_64/libgcc_s.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "tls/x86_64/libgcc_s.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "tls/libgcc_s.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "x86_64/x86_64/libgcc_s.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "x86_64/libgcc_s.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "x86_64/libgcc_s.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "libgcc_s.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "/usr/local/lib/libgcc_s.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "/lib64/libgcc_s.so.1", O_RDONLY|O_CLOEXEC) = 3
03:48:29 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\0.\0\0\0\0\0\0"..., 832) = 832
03:48:29 lseek(3, 92624, SEEK_SET)      = 92624
03:48:29 read(3, "\4\0\0\0\20\0\0\0\5\0\0\0GNU\0\2\0\0\300\4\0\0\0\3\0\0\0\0\0\0\0", 32) = 32
03:48:29 fstat(3, {st_mode=S_IFREG|0755, st_size=99672, ...}) = 0
03:48:29 lseek(3, 92624, SEEK_SET)      = 92624
03:48:29 read(3, "\4\0\0\0\20\0\0\0\5\0\0\0GNU\0\2\0\0\300\4\0\0\0\3\0\0\0\0\0\0\0", 32) = 32
03:48:29 mmap(NULL, 2192080, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f4591e4c000
03:48:29 mprotect(0x7f4591e63000, 2093056, PROT_NONE) = 0
03:48:29 mmap(0x7f4592062000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x16000) = 0x7f4592062000
03:48:29 close(3)                       = 0
03:48:29 openat(AT_FDCWD, "glibc-hwcaps/x86-64-v3/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "glibc-hwcaps/x86-64-v2/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "tls/x86_64/x86_64/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "tls/x86_64/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "tls/x86_64/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "tls/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "x86_64/x86_64/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "x86_64/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "x86_64/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "/usr/local/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
03:48:29 openat(AT_FDCWD, "/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
03:48:29 read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\2405\2\0\0\0\0\0"..., 832) = 832
03:48:29 fstat(3, {st_mode=S_IFREG|0755, st_size=3167872, ...}) = 0
03:48:29 lseek(3, 808, SEEK_SET)        = 808
03:48:29 read(3, "\4\0\0\0\20\0\0\0\5\0\0\0GNU\0\2\0\0\300\4\0\0\0\3\0\0\0\0\0\0\0", 32) = 32
03:48:29 mmap(NULL, 3950400, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f4591a87000
03:48:29 mprotect(0x7f4591c43000, 2093056, PROT_NONE) = 0
03:48:29 mmap(0x7f4591e42000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1bb000) = 0x7f4591e42000
03:48:29 mmap(0x7f4591e48000, 14144, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f4591e48000
03:48:29 close(3)                       = 0
03:48:29 mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4592996000
03:48:29 arch_prctl(ARCH_SET_FS, 0x7f4592996f40) = 0
03:48:29 mprotect(0x7f4591e42000, 16384, PROT_READ) = 0
03:48:29 mprotect(0x7f4592062000, 4096, PROT_READ) = 0
03:48:29 mprotect(0x7f45923e4000, 4096, PROT_READ) = 0
03:48:29 mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4592994000
03:48:29 mprotect(0x7f459276b000, 49152, PROT_READ) = 0
03:48:29 mprotect(0x600000, 4096, PROT_READ) = 0
03:48:29 mprotect(0x7f45929a7000, 4096, PROT_READ) = 0
03:48:29 munmap(0x7f459299a000, 51579)  = 0
03:48:29 brk(NULL)                      = 0xf11000
03:48:29 brk(0xf32000)                  = 0xf32000
03:48:29 fstat(1, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
03:48:29 write(2, "x = 0\n", 6x = 0
)         = 6
03:48:29 nanosleep({tv_sec=6, tv_nsec=0}, 0x7ffc11bcb050) = 0
03:48:35 write(2, "x = 1\n", 6x = 1
)         = 6
03:48:35 nanosleep({tv_sec=6, tv_nsec=0}, 0x7ffc11bcb050) = 0
03:48:41 write(2, "x = 2\n", 6x = 2
)         = 6
03:48:41 nanosleep({tv_sec=6, tv_nsec=0}, 0x7ffc11bcb050) = 0
03:48:47 write(2, "x = 3\n", 6x = 3
)         = 6
03:48:47 nanosleep({tv_sec=6, tv_nsec=0}, 0x7ffc11bcb050) = 0
03:48:53 write(2, "x = 4\n", 6x = 4
)         = 6
03:48:53 nanosleep({tv_sec=6, tv_nsec=0}, 0x7ffc11bcb050) = 0
03:48:59 write(2, "x = 5\n", 6x = 5
)         = 6
03:48:59 nanosleep({tv_sec=6, tv_nsec=0}, 0x7ffc11bcb050) = 0
03:49:05 write(2, "x = 6\n", 6x = 6
)         = 6
03:49:05 nanosleep({tv_sec=6, tv_nsec=0}, 0x7ffc11bcb050) = 0
03:49:11 write(2, "x = 7\n", 6x = 7
)         = 6
03:49:11 nanosleep({tv_sec=6, tv_nsec=0}, 0x7ffc11bcb050) = 0
03:49:17 write(2, "x = 8\n", 6x = 8
)         = 6
03:49:17 nanosleep({tv_sec=6, tv_nsec=0}, 0x7ffc11bcb050) = 0
03:49:23 --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x65} ---
03:49:23 +++ killed by SIGSEGV (core dumped) +++
sh: line 1: 1056971 Segmentation fault      (core dumped) strace -t ./buggy -f 2>&1


Now let's check the log file. You can see that only the last 10 lines are saved. The number of lines can be adjusted by adding the -q option.

[spypiggy@localhost strace]$ cat strace.log
)         = 6
03:49:05 nanosleep({tv_sec=6, tv_nsec=0}, 0x7ffc11bcb050) = 0
03:49:11 write(2, "x = 7\n", 6x = 7
)         = 6
03:49:11 nanosleep({tv_sec=6, tv_nsec=0}, 0x7ffc11bcb050) = 0
03:49:17 write(2, "x = 8\n", 6x = 8
)         = 6
03:49:17 nanosleep({tv_sec=6, tv_nsec=0}, 0x7ffc11bcb050) = 0
03:49:23 --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x65} ---
03:49:23 +++ killed by SIGSEGV (core dumped) +++


Running strace_wrapper in the background

Programs executed in the terminal are automatically terminated when the parent process, the terminal, is terminated. I don't want to spend days in front of the terminal. Also, I don't want strace_wrapper to exit with the remote terminal's network shutting down abruptly.

My preferred method is to use tmux. tmux also provides a function to use a divided terminal, but the biggest advantage is that the session can be maintained even after the terminal is closed.

tmux can be installed with a command such as dnf, apt, yum.


Create a tmux session

[spypiggy@localhost strace]$ tmux new -s strace

When tmux is executed, a green bar appears at the bottom of the screen and the session name (strace) is displayed.


The buggy example we just tested terminating quickly. Instead run your program, which doesn't quit easily, but quits unexpectedly. Then the strace_wrapper program will also continue working until your program terminates.


Run strace_wrapper inside tmux session

[spypiggy@localhost strace]$ ./strace_wrapper -s ./buggy

Once the program is running, exit the tmux shell. The way to get out is (Ctrl + b) and press the d key.

And with the tmux ls command, you can see that the session you just created (strace) is still running.



Now you can close your remote terminal and walk away. After a while, connect the remote terminal again and reconnect to the tmux strace session as follows.

[spypiggy@localhost strace]$ tmux attach -t strace


You can access the last working window again. I restarted the terminal, but the previous working session is still there.



Wrapping up

strace is a very useful tool if you are debugging daemon programs. If you use the wrapper described above, you don't have to worry about the large log file left by strace because you can leave only the last system call and log at the moment of shutdown.

And if tmux is used together, the strace operation can continue even after the terminal is closed using the tmux session. You can log in again at the desired time and check the result.



댓글

이 블로그의 인기 게시물

MQTT - C/C++ Client

RabbitMQ - C++ Client #1 : Installing C/C++ Libraries

C/C++ - Everything about time, date