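I. Background and Motivation
Memory on a Linux host is broadly divided into three parts: user, buffer, and cache. When checking with free -m, you will often find that cache takes up the largest share. In theory we have always known that the cache holds the hot data accessed on the system (data that is not accessed often gets marked and is then replaced by new data), which speeds up access on the host. But how can we check the hit ratio of the data held in the cache? A colleague raised this question today while we were analyzing a problem on the production network.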
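II. How the Cache Hit Ratio Is Calculated
Searching online, I found that Brendan Gregg has included a tool called cachestat in perf-tools that reports the cache hit ratio. The tool is written in shell. Reading the source shows that it works by reading the function call counts recorded in the files under /sys/kernel/debug/tracing/trace_stat/ and looking at the following four functions: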
mark_page_accessed() for measuring cache accesses
mark_buffer_dirty() for measuring cache writes
add_to_page_cache_lru() for measuring page additions
account_page_dirtied() for measuring page dirties
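The four counts are combined using the following formulas:
total cache page accesses (total) = mark_page_accessed (marked as accessed) - mark_buffer_dirty (marked as dirty)
misses = add_to_page_cache_lru (newly added) - account_page_dirtied (accounted as dirty)
hits = total - misses
ratio (hit ratio) = hits / total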
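The formulas above can be tried out on their own. The following is a minimal sketch, not part of cachestat: `hit_ratio` is a hypothetical helper name, the four input numbers are made up for illustration, and printing `n/a` when total is zero is an assumption of this sketch.

```shell
#!/bin/bash
# Hypothetical helper: apply the hit-ratio formulas to four call counts.
# Arguments, in order:
#   mark_page_accessed mark_buffer_dirty add_to_page_cache_lru account_page_dirtied
hit_ratio() {
    awk -v mpa="$1" -v mbd="$2" -v apcl="$3" -v apd="$4" 'BEGIN {
        total  = mpa - mbd
        misses = apcl - apd
        if (misses < 0) misses = 0      # clamp negative misses to zero
        hits = total - misses
        if (total <= 0) {               # assumption: avoid dividing by zero
            printf "hits=%d misses=%d ratio=n/a\n", hits, misses
            exit
        }
        printf "hits=%d misses=%d ratio=%.1f%%\n", hits, misses, 100 * hits / total
    }'
}

# Made-up counter values, for illustration only
hit_ratio 1000 100 80 30
```

With these sample numbers, total is 900 and misses is 50, so the function prints hits=850 misses=50 ratio=94.4%.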
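Here is the content of one of the files under trace_stat:
The tool also reads /proc/meminfo, but that file plays no part in the hit-ratio calculation. It is read only to work out how much buffer and cache memory is currently in use, which is then shown in the output.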
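To see the /proc/meminfo part in isolation, here is a small sketch that pulls out the Buffers and Cached values and converts them from kB to MB. `cache_size_mb` is a hypothetical helper name; it reads meminfo-format text on standard input so it can also be run against a saved copy of the file.

```shell
#!/bin/bash
# Hypothetical helper: read /proc/meminfo-format text on stdin and print
# the Buffers and Cached sizes in MB (values in the file are in kB).
cache_size_mb() {
    awk '
        $1 == "Buffers:" && $3 == "kB" { buffers_mb = $2 / 1024 }
        $1 == "Cached:"  && $3 == "kB" { cached_mb  = $2 / 1024 }
        END { printf "buffers_mb=%.0f cached_mb=%.0f\n", buffers_mb, cached_mb }
    '
}

# Run against the live file when it is readable
if [ -r /proc/meminfo ]; then
    cache_size_mb < /proc/meminfo
fi
```

Matching on the exact field names keeps SwapCached and similar lines from being counted by mistake.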
III. cachestat Code and Execution
The cachestat code is as follows:
#!/bin/bash
# 28-Dec-2014	Brendan Gregg	Created this.

### default variables
tracing=/sys/kernel/debug/tracing
interval=1; opt_timestamp=0; opt_debug=0
trap 'quit=1' INT QUIT TERM PIPE HUP	# sends execution to end tracing section

function usage {
	cat <<-END >&2
	USAGE: cachestat [-Dht] [interval]
	                 -D              # print debug counters
	                 -h              # this usage message
	                 -t              # include timestamp
	                 interval        # output interval in secs (default 1)
	  eg,
	       cachestat                 # show stats every second
	       cachestat 5               # show stats every 5 seconds

	See the man page and example file for more info.
END
	exit
}

function warn {
	if ! eval "$@"; then
		echo >&2 "WARNING: command failed \"$@\""
	fi
}

function die {
	echo >&2 "$@"
	exit 1
}

### process options
while getopts Dht opt
do
	case $opt in
	D)	opt_debug=1 ;;
	t)	opt_timestamp=1 ;;
	h|?)	usage ;;
	esac
done
shift $(( $OPTIND - 1 ))

### option logic
if (( $# )); then
	interval=$1
fi
echo "Counting cache functions... Output every $interval seconds."

### check permissions
cd $tracing || die "ERROR: accessing tracing. Root user? Kernel has FTRACE?
    debugfs mounted? (mount -t debugfs debugfs /sys/kernel/debug)"

### enable tracing
sysctl -q kernel.ftrace_enabled=1	# doesn't set exit status
printf "mark_page_accessed\nmark_buffer_dirty\nadd_to_page_cache_lru\naccount_page_dirtied\n" > set_ftrace_filter || \
    die "ERROR: tracing these four kernel functions: mark_page_accessed,"\
    "mark_buffer_dirty, add_to_page_cache_lru and account_page_dirtied (unknown kernel version?). Exiting."
warn "echo nop > current_tracer"
if ! echo 1 > function_profile_enabled; then
	echo > set_ftrace_filter
	die "ERROR: enabling function profiling. Have CONFIG_FUNCTION_PROFILER? Exiting."
fi

(( opt_timestamp )) && printf "%-8s " TIME
printf "%8s %8s %8s %8s %12s %10s" HITS MISSES DIRTIES RATIO "BUFFERS_MB" "CACHE_MB"
(( opt_debug )) && printf "  DEBUG"
echo

### summarize
quit=0; secs=0
while (( !quit && (!opt_duration || secs < duration) )); do
	(( secs += interval ))
	echo 0 > function_profile_enabled
	echo 1 > function_profile_enabled
	sleep $interval

	(( opt_timestamp )) && printf "%(%H:%M:%S)T " -1

	# cat both meminfo and trace stats, and let awk pick them apart
	cat /proc/meminfo trace_stat/function* | awk -v debug=$opt_debug '
	# match meminfo stats:
	$1 == "Buffers:" && $3 == "kB" { buffers_mb = $2 / 1024 }
	$1 == "Cached:" && $3 == "kB" { cached_mb = $2 / 1024 }
	# identify and save trace counts:
	$2 ~ /[0-9]/ && $3 != "kB" { a[$1] += $2 }
	END {
		mpa = a["mark_page_accessed"]
		mbd = a["mark_buffer_dirty"]
		apcl = a["add_to_page_cache_lru"]
		apd = a["account_page_dirtied"]

		total = mpa - mbd
		misses = apcl - apd
		if (misses < 0)
			misses = 0
		hits = total - misses

		ratio = 100 * hits / total
		printf "%8d %8d %8d %7.1f%% %12.0f %10.0f", hits, misses, mbd,
		    ratio, buffers_mb, cached_mb
		if (debug)
			printf "  (%d %d %d %d)", mpa, mbd, apcl, apd
		printf "\n"
	}'
done

### end tracing
echo 2>/dev/null
echo "Ending tracing..." 2>/dev/null
warn "echo 0 > function_profile_enabled"
warn "echo > set_ftrace_filter"
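The awk stage of the script adds up the call count of each function across all of the trace_stat/function* files it is given. The sketch below shows only that summing step. `sum_hits` is a hypothetical helper name, and the sample input is made up: its column layout (function name first, hit count second) is an assumption about what the function profiler writes, not output captured from a real system.

```shell
#!/bin/bash
# Hypothetical helper: sum the second column (hit count) per function name
# from trace_stat-style text on stdin, then print the totals sorted by name.
sum_hits() {
    awk '$2 ~ /^[0-9]+$/ { a[$1] += $2 }
         END { for (f in a) printf "%s %d\n", f, a[f] }' | sort
}

# Made-up sample standing in for two per-CPU files, not real measurements
sum_hits <<'EOF'
  Function                     Hit    Time
  --------                     ---    ----
  mark_page_accessed           120    30.500 us
  add_to_page_cache_lru         10    12.000 us
  Function                     Hit    Time
  --------                     ---    ----
  mark_page_accessed            80    20.000 us
  mark_buffer_dirty              5     1.500 us
EOF
```

Header and separator rows are skipped because their second field is not a number, which mirrors the way the script filters its input.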
After running the script, the output looks like this:
Reference:
http://www.brendangregg.com/blog/2014-12-31/linux-page-cache-hit-ratio.html (Brendan Gregg's blog)