Software engineering notes

Operating System

File descriptor (FD)

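Dynamic and static libraries

Linux has both dynamic (shared) and static libraries. Dynamic libraries use the .so extension; static libraries use the .a extension.

The ldd command lists the dynamic libraries a program depends on.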

$ ldd dch_worker
linux-vdso.so.1 =>  (0x00007fff60bfe000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f6d583b2000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f6d57fec000)
/lib64/ld-linux-x86-64.so.2 (0x00007f6d585dc000)

Checking host status

top

ps

Show only specific columns

ps -Ao pid,%cpu,%mem,comm

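In the column list above, the difference between comm and command is that command shows the full command line of the process, while comm shows only the name of the program being run. For example, with command you would also see trailing arguments such as /Google Chrome Helper --type=renderer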

Top ten processes by memory usage

PID     %CPU    %MEM    COMMAND
ps -Ao pid,%cpu,%mem,comm | sort -k3rn | head -n 10

Top ten processes by CPU usage

PID     %CPU    %MEM    COMMAND
ps -Ao pid,%cpu,%mem,comm | sort -k2rn | head -n 10
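
The two pipelines above sort the ps output on a numeric column and keep the first ten rows. As a rough sketch of the same idea in Python, the function below works on already-captured text; the name top_n and the sample rows are made up for illustration and are not real measurements.

```python
def top_n(ps_output, column, n=10):
    """Sort rows of `ps -Ao pid,%cpu,%mem,comm` style text by one numeric column, descending."""
    lines = ps_output.strip().splitlines()
    header = lines[0].split()
    idx = header.index(column)
    # maxsplit keeps a command name that contains spaces in one piece
    rows = [line.split(None, len(header) - 1) for line in lines[1:]]
    rows.sort(key=lambda row: float(row[idx]), reverse=True)
    return rows[:n]


# Hypothetical sample data, not taken from a real host.
sample = """\
  PID  %CPU %MEM COMMAND
  101   1.0  5.5 alpha
  102  20.0  0.3 beta
  103   3.5  9.1 gamma
"""

print(top_n(sample, "%MEM", 2))
print(top_n(sample, "%CPU", 1))
```

Unlike the sort pipeline, this version separates the header row before sorting, so the header never ends up mixed in with the data rows.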

View process details

ps -v 97280
  PID STAT      TIME  SL  RE PAGEIN      VSZ    RSS   LIM     TSIZ  %CPU %MEM COMMAND
97280 S      1:36.19   0   0      0  4922424  10452     -        0   0.0  0.1 /Applications/Google Chrome.app/Contents/Versions/70.0.3

Calculation:

So if process A has a 500K binary and is linked to 2500K of shared libraries, has 200K of stack/heap allocations of which 100K is actually in memory (rest is swapped or unused), and it has only actually loaded 1000K of the shared libraries and 400K of its own binary then:

RSS: 400K + 1000K + 100K = 1500K
VSZ: 500K + 2500K + 200K = 3200K

If two processes were using the same shared library from the example above:
PSS: 400K + (1000K/2) + 100K = 400K + 500K + 100K = 1000K

ref: https://stackoverflow.com/questions/7880784/what-is-rss-and-vsz-in-linux-memory-management
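
The RSS, VSZ and PSS arithmetic from the example can be checked with a few lines of Python. This is only a sketch of the worked example: the variable and function names are invented here, and all sizes are in kB.

```python
# Sizes in kB, taken from the worked example for process A.
binary_total, binary_loaded = 500, 400
shlib_total, shlib_loaded = 2500, 1000
heap_stack_total, heap_stack_resident = 200, 100


def rss():
    """Resident set size: only what is actually in physical memory."""
    return binary_loaded + shlib_loaded + heap_stack_resident


def vsz():
    """Virtual size: everything mapped, whether it is resident or not."""
    return binary_total + shlib_total + heap_stack_total


def pss(sharers):
    """Proportional set size: loaded shared-library memory is split among the sharers."""
    return binary_loaded + shlib_loaded / sharers + heap_stack_resident


print(rss(), vsz(), pss(2))
```

As in the example, only the shared library is divided among the processes; the binary and the stack/heap are counted as private to process A.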

ps aux: view process status

What each column means

View a process's start time and elapsed time

ps -eo pid,comm,lstart,etime,time,args

PID COMMAND                          STARTED     ELAPSED     TIME COMMAND
  1 init            Thu Sep 21 10:38:33 2017 11-22:21:26 00:00:01 /sbin/init
  2 kthreadd        Thu Sep 21 10:38:33 2017 11-22:21:26 00:00:00 [kthreadd]
  3 ksoftirqd/0     Thu Sep 21 10:38:33 2017 11-22:21:26 00:00:00 [ksoftirqd/0]
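
The ELAPSED column uses the form [[dd-]hh:]mm:ss, which is awkward to compare directly. A small sketch of converting it to seconds follows; the function name etime_to_seconds is made up for this note.

```python
def etime_to_seconds(etime):
    """Convert ps ELAPSED text of the form [[dd-]hh:]mm:ss to seconds."""
    days, _, clock = etime.rpartition("-")
    parts = [int(p) for p in clock.split(":")]
    while len(parts) < 3:
        parts.insert(0, 0)  # pad a missing hours field with zero
    hours, minutes, seconds = parts
    return int(days or 0) * 86400 + hours * 3600 + minutes * 60 + seconds


# Value copied from the init row in the listing above.
print(etime_to_seconds("11-22:21:26"))
```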

htop

CPU:

Memory:

||||||||||||(green/blue/yellow)    57/2003MB   The 57 here is the used memory

State:

Compare this with the output of free -m:

             total       used       free     shared    buffers     cached
Mem:          2003        153       1849          0         11         85
                                                        (blue)    (yellow)

57 = 153 - 11 - 85
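
The subtraction above can be sketched in Python against the pasted free -m output. This assumes the older free layout shown here, with separate buffers and cached columns; newer versions of free print different columns, and the function name htop_used_mb is invented for this note.

```python
def htop_used_mb(free_output):
    """Return used - buffers - cached from the 'Mem:' row of old-style `free -m` text."""
    for line in free_output.splitlines():
        if line.startswith("Mem:"):
            total, used, free, shared, buffers, cached = map(int, line.split()[1:7])
            return used - buffers - cached
    raise ValueError("no 'Mem:' row found")


# The two lines pasted from free -m above.
sample = """\
             total       used       free     shared    buffers     cached
Mem:          2003        153       1849          0         11         85
"""

print(htop_used_mb(sample))
```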

Shortcuts:

Shortcut Key    Function Key    Description
h               F1              Invoke htop Help
S               F2              Htop Setup Menu
/               F3              Search for a Process
I               F4              Invert Sort Order
t               F5              Tree View
>               F6              Sort by a column
[               F7              Nice – (change priority)
]               F8              Nice + (change priority)
k               F9              Kill a Process
q               F10             Quit htop

pmap

View a process's memory usage

pmap -x 11047

Address           Kbytes     RSS   Dirty Mode  Mapping
...(omitted)...
---------------- ------- ------- -------
total kB          153776    7156    2172

The RSS shown here will not match the RSS reported by ps, because the calculation has to account for many things, e.g. kernel stack, code, data, stack

strace

Trace the system calls of a running process, or of a command from start to finish

strace -p 11047
strace echo "ff"

Process vs Thread

Process thread

When a program is running, htop may show four entries for it, each with a different PID, but that does not mean it is using four processes.

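Press t (tree view) and you can clearly see one main process with three children. Those three children are all threads, and they may be running on different CPUs. Threads of the same process share memory across CPUs, while separate processes do not share memory by default.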

Why does a thread also have a PID? Because Linux treats a thread as a process too.
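
A minimal Python sketch of the same observation: two threads of one program report the same process ID but different kernel-level thread IDs, which is what tools like htop list per row. It assumes Python 3.8 or later for threading.get_native_id(); the names seen and record are made up here.

```python
import os
import threading

seen = {}


def record(name):
    # get_native_id() returns the thread ID assigned by the kernel (Python 3.8+).
    seen[name] = (os.getpid(), threading.get_native_id())


record("main")
worker = threading.Thread(target=record, args=("worker",))
worker.start()
worker.join()

# Each value is a (process ID, native thread ID) pair.
print(seen)
```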

stack / heap

Difference between heap and stack

In Java:

The main difference between heap and stack is that stack memory stores local variables and function calls, while heap memory stores objects. No matter where an object is created in code (as a member variable, local variable, or class variable), it is always created in heap space in Java. More generally, in a garbage-collected language whose compiler performs escape analysis (Go is one example), if the compiler cannot prove that a variable is not referenced after the function returns, it must allocate the variable on the garbage-collected heap to avoid dangling pointer errors. You don't always know whether a variable is allocated on the stack or the heap.

Out of memory

When the system runs out of memory, the kernel's OOM Killer starts killing the processes that consume the most memory.

First, check the log

/var/log/messages (amz-linux)

[952079.901834] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[952079.906483] 357 total pagecache pages
[952079.908658] 0 pages in swap cache
[952079.910747] Swap cache stats: add 0, delete 0, find 0/0
[952079.913679] Free swap  = 0kB
[952079.915551] Total swap = 0kB
[952079.917343] 524189 pages RAM
[952079.919165] 0 pages HighMem/MovableOnly
[952079.921434] 11312 pages reserved
[952079.923367] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
[952079.928155] [ 1520]     0  1520     2866      232      11       3        0         -1000 udevd
[952079.932846] [ 1646]     0  1646     2865      221      10       3        0         -1000 udevd
                                        ...(omitted)...
[952080.158896] [ 5665]   500  5665   142716    41467     189       3        0             0 php-fpm-7.0
[952080.164699] [ 5693]   500  5693   142102    13936     136       3        0             0 php-fpm-7.0
[952080.169943] [ 5695]   500  5695   143401    16756     140       3        0             0 php-fpm-7.0
[952080.174979] [ 5707]   500  5707   141966    14276     136       3        0             0 php-fpm-7.0
[952080.180028] [ 5716]   500  5716   143000    41756     189       3        0             0 php-fpm-7.0
[952080.185071] [ 5753]   500  5753   143162    16590     141       3        0             0 php-fpm-7.0
[952080.190104] [ 5754]   500  5754   142978    14747     137       3        0             0 php-fpm-7.0
[952080.196286] [ 5766]   500  5766   142451    14158     135       3        0             0 php-fpm-7.0
[952080.201450] [ 5771]   500  5771   106828     5564     118       3        0             0 php-fpm-7.0
                                        ...(omitted)...
[952080.390850] Out of memory: Kill process 5716 (php-fpm-7.0) score 81 or sacrifice child
[952080.395916] Killed process 5716 (php-fpm-7.0) total-vm:572000kB, anon-rss:167024kB, file-rss:0kB
[952320.400064] INFO: task php-fpm-7.0:5716 blocked for more than 120 seconds.

The last few lines show that php-fpm was killed. The lines before that show how much memory each process was using at that moment.

What each column means

Calculating the memory used by a process

[952080.395916] Killed process 5716 (php-fpm-7.0) total-vm:572000kB, anon-rss:167024kB, file-rss:0kB

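Take process 5716 as an example: total_vm is 143000 and rss is 41756. Both are counted in 4 kB pages, so the figures in the line above are these numbers multiplied by 4: 143000 x 4 = 572000 kB and 41756 x 4 = 167024 kB. In other words, process 5716 was using about 167 MB of physical memory plus swap (swap is 0 on this host).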

ref: https://stackoverflow.com/questions/9199731/understanding-the-linux-oom-killers-logs
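
The page-to-kB conversion can be sketched as follows. The 4 kB page size is an assumption (it is the usual value on x86-64, and it matches the figures in this log); the names PAGE_KB and pages_to_kb are made up for this note.

```python
PAGE_KB = 4  # assumption: 4 kB pages, consistent with the log figures above


def pages_to_kb(pages, page_kb=PAGE_KB):
    """Convert a page count from the OOM process table to kB."""
    return pages * page_kb


# total_vm and rss for process 5716, copied from the table in the log.
total_vm_pages, rss_pages = 143000, 41756

print(pages_to_kb(total_vm_pages))  # compare with total-vm in the "Killed process" line
print(pages_to_kb(rss_pages))       # compare with anon-rss in the "Killed process" line
```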

System logs

Ubuntu: /var/log/kern.log, /var/log/syslog

amz-linux: /var/log/messages

Tools for checking memory leaks

What is /proc/self?

It is a symlink to the /proc/<pid> directory of whichever process is reading it.

$ ls -al /proc/self
lrwxrwxrwx 1 root root 0 Dec 12 06:34 /proc/self -> 11845

If you run the command below, the shell prints the PID of ls, and you can see that /proc/self pointed to that same PID at that moment.

$ ls -al /proc/self & echo $!
[1] 792
792
lrwxrwxrwx 1 root root 0 Dec 12 06:34 /proc/self -> 792
[1]+  Done                    ls --color=auto -al /proc/self
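
The same behaviour can be observed from inside a program. In this small sketch, a Python process reads the link and compares it with its own PID. It assumes a Linux-style /proc and returns None elsewhere; the function name proc_self_target is made up here.

```python
import os


def proc_self_target():
    """Return what /proc/self points to, or None where /proc is unavailable."""
    try:
        return os.readlink("/proc/self")
    except OSError:
        return None


target = proc_self_target()
if target is not None:
    # The link is resolved for the reading process, so it names this process's PID.
    print(target, os.getpid())
```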