Operations Documentation

MySQL Status Variables Explained

July 6, 2016
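Variable Scope Description
Aborted_clients Global Number of connections that were aborted because the client died without closing the connection properly.
Aborted_connects Global Number of failed attempts to connect to the MySQL server.
Binlog_cache_disk_use Global Number of transactions that used the temporary binary log cache but exceeded binlog_cache_size and used a temporary file to store the transaction's statements.
Binlog_cache_use Global Number of transactions that used the temporary binary log cache.
Bytes_received Both Number of bytes received from all clients.
Bytes_sent Both Number of bytes sent to all clients.
Com_*   Counts of the various database operations (one counter per statement type).
Compression Session Whether the client connection uses the compressed client/server protocol.
Connections Global Number of connection attempts (successful or not) to the MySQL server.
Created_tmp_disk_tables Both Number of temporary tables the server created automatically on disk while executing statements.
Created_tmp_files Global Number of temporary files mysqld has created.
Created_tmp_tables Both Number of in-memory temporary tables the server created automatically while executing statements. If Created_tmp_disk_tables is large, consider increasing tmp_table_size so that temporary tables are memory-based instead of disk-based.
Delayed_errors Global Number of rows written with INSERT DELAYED for which an error occurred (probably duplicate key).
Delayed_insert_threads Global Number of INSERT DELAYED handler threads in use.
Delayed_writes Global Number of INSERT DELAYED rows written.
Flush_commands Global Number of FLUSH statements executed.
Handler_commit Both Number of internal COMMIT statements.
Handler_delete Both Number of times rows have been deleted from tables.
Handler_discover Both The MySQL server can ask the NDB CLUSTER storage engine whether it knows about a table with a given name. This is called discovery. Handler_discover counts the number of times tables have been discovered this way.
Handler_prepare Both A counter for the prepare phase of two-phase commit operations.
Handler_read_first Both Number of times the first entry in an index was read. A high value suggests the server is doing many full index scans; for example, SELECT col1 FROM foo, assuming col1 is indexed.
Handler_read_key Both Number of requests to read a row based on a key. A high value indicates that the tables are properly indexed for the queries.
Handler_read_next Both Number of requests to read the next row in key order. It is incremented when you query an indexed column with a range constraint or when an index scan is performed.
Handler_read_prev Both Number of requests to read the previous row in key order. This read method is mainly used to optimize ORDER BY ... DESC.
Handler_read_rnd Both Number of requests to read a row based on a fixed position. The value is high if you run many queries that require sorting of the result. You probably have many queries that require MySQL to scan entire tables, or joins that do not use keys properly.
Handler_read_rnd_next Both Number of requests to read the next row in the data file. The value is high if you are doing many table scans. It generally suggests that the tables are not properly indexed or that the queries are not written to take advantage of the indexes.
Handler_rollback Both Number of internal ROLLBACK statements.
Handler_savepoint Both Number of requests for a storage engine to place a savepoint.
Handler_savepoint_rollback Both Number of requests for a storage engine to roll back to a savepoint.
Handler_update Both Number of requests to update a row in a table.
Handler_write Both Number of requests to insert a row into a table.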
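Innodb_buffer_pool_pages_data Global Number of pages containing data (dirty or clean).
Innodb_buffer_pool_pages_dirty Global Number of pages that are currently dirty.
Innodb_buffer_pool_pages_flushed Global Number of requests to flush pages from the buffer pool.
Innodb_buffer_pool_pages_free Global Number of free pages.
Innodb_buffer_pool_pages_latched Global Number of latched pages in the InnoDB buffer pool. These are pages currently being read or written, or that cannot be flushed or removed for some other reason.
Innodb_buffer_pool_pages_misc Global Number of pages that are busy because they have been allocated for administrative overhead such as row locks or the adaptive hash index. The value can also be calculated as Innodb_buffer_pool_pages_total - Innodb_buffer_pool_pages_free - Innodb_buffer_pool_pages_data.
Innodb_buffer_pool_pages_total Global Total size of the buffer pool, in pages.
Innodb_buffer_pool_read_ahead_rnd Global Number of "random" read-aheads initiated by InnoDB. This happens when a query scans a large portion of a table in random order.
Innodb_buffer_pool_read_ahead_seq Global Number of sequential read-aheads initiated by InnoDB. This happens when InnoDB does a sequential full table scan.
Innodb_buffer_pool_read_requests Global Number of logical read requests InnoDB has done.
Innodb_buffer_pool_reads Global Number of logical reads that InnoDB could not satisfy from the buffer pool and had to serve with a single-page read.
Innodb_buffer_pool_wait_free Global Normally, writes to the InnoDB buffer pool happen in the background. However, if a page must be read or created and no clean pages are available, InnoDB has to wait for pages to be flushed first. This counter counts those waits. If the buffer pool size has been set properly, the value should be small.
Innodb_buffer_pool_write_requests Global Number of writes done to the InnoDB buffer pool.
Innodb_data_fsyncs Global Number of fsync() operations so far.
Innodb_data_pending_fsyncs Global Current number of pending fsync() operations.
Innodb_data_pending_reads Global Current number of pending reads.
Innodb_data_pending_writes Global Current number of pending writes.
Innodb_data_read Global Amount of data read so far, in bytes.
Innodb_data_reads Global Total number of data reads.
Innodb_data_writes Global Total number of data writes.
Innodb_data_written Global Amount of data written so far, in bytes.
Innodb_dblwr_pages_written Global Number of pages that have been written for doublewrite operations.
Innodb_dblwr_writes Global Number of doublewrite operations that have been performed.
Innodb_log_waits Global Number of times the log buffer was too small and a wait was required for it to be flushed before continuing.
Innodb_log_write_requests Global Number of log write requests.
Innodb_log_writes Global Number of physical writes to the log file.
Innodb_os_log_fsyncs Global Number of fsync() writes done to the log file.
Innodb_os_log_pending_fsyncs Global Number of pending log file fsync() operations.
Innodb_os_log_pending_writes Global Number of pending log file writes.
Innodb_os_log_written Global Number of bytes written to the log file.
Innodb_page_size Global Compiled-in InnoDB page size (default 16KB). Many values are counted in pages; the page size makes them easy to convert to bytes.
Innodb_pages_created Global Number of pages created.
Innodb_pages_read Global Number of pages read.
Innodb_pages_written Global Number of pages written.
Innodb_row_lock_current_waits Global Number of row locks currently being waited for.
Innodb_row_lock_time Global Total time spent acquiring row locks, in milliseconds.
Innodb_row_lock_time_avg Global Average time to acquire a row lock, in milliseconds.
Innodb_row_lock_time_max Global Maximum time to acquire a row lock, in milliseconds.
Innodb_row_lock_waits Global Number of times a row lock had to be waited for.
Innodb_rows_deleted Global Number of rows deleted from InnoDB tables.
Innodb_rows_inserted Global Number of rows inserted into InnoDB tables.
Innodb_rows_read Global Number of rows read from InnoDB tables.
Innodb_rows_updated Global Number of rows updated in InnoDB tables.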
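Key_blocks_not_flushed Global Number of key blocks in the key cache that have changed but have not yet been flushed to disk.
Key_blocks_unused Global Number of unused blocks in the key cache. You can use this value to determine how much of the key cache is in use.
Key_blocks_used Global Number of used blocks in the key cache. This is a high-water mark that indicates the maximum number of blocks that have ever been in use at one time.
Key_read_requests Global Number of requests to read a key block from the cache.
Key_reads Global Number of physical reads of a key block from disk. If Key_reads is large, key_buffer_size is probably too small. The cache miss rate can be calculated as Key_reads/Key_read_requests.
Key_write_requests Global Number of requests to write a key block to the cache.
Key_writes Global Number of physical writes of a key block to disk.
Last_query_cost Session Total cost of the last compiled query as computed by the query optimizer. Useful for comparing the cost of different query plans for the same query. The default value of 0 means that no query has been compiled yet. Last_query_cost has session scope.
Max_used_connections Global Maximum number of connections that have been in use simultaneously since the server started.
Ndb_*   NDB Cluster related counters.
Not_flushed_delayed_rows Global Number of rows waiting to be written in INSERT DELAYED queues.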
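Open_files Global Number of files that are open.
Open_streams Global Number of streams that are open (used mainly for logging).
Open_table_definitions Global Number of cached .frm files.
Open_tables Both Number of tables that are currently open.
Opened_files Global Number of files that have been opened. It does not include other types of files such as sockets or pipes, nor files that storage engines open for their own internal functions.
Opened_table_definitions Both Number of .frm files that have been cached.
Opened_tables Both Number of tables that have been opened. If Opened_tables is large, the table_cache value (table_open_cache in newer versions) is probably too small.
Prepared_stmt_count Global Current number of prepared statements. (The maximum is set by the max_prepared_stmt_count system variable.)
Qcache_free_blocks Global Number of free memory blocks in the query cache.
Qcache_free_memory Global Amount of free memory for the query cache.
Qcache_hits Global Number of query cache hits.
Qcache_inserts Global Number of queries added to the query cache.
Qcache_lowmem_prunes Global Number of queries deleted from the query cache because of low memory.
Qcache_not_cached Global Number of noncached queries (not cacheable, or not cached because of the query_cache_type setting).
Qcache_queries_in_cache Global Number of queries registered in the query cache.
Qcache_total_blocks Global Total number of blocks in the query cache.
Queries Both Number of statements executed by the server, including statements executed within stored programs.
Questions Both Number of statements sent to the server by clients.
Rpl_status Global Status of fail-safe replication (not yet implemented).
Select_full_join Both Number of joins that do not use indexes. If this value is not 0, you should carefully check the indexes of your tables.
Select_full_range_join Both Number of joins that used a range search on a reference table.
Select_range Both Number of joins that used ranges on the first table. This is normally not a critical issue even if the value is quite large.
Select_range_check Both Number of joins without keys that check for key usage after each row. If this is not 0, you should carefully check the indexes of your tables.
Select_scan Both Number of joins that did a full scan of the first table.
Slave_heartbeat_period Global Replication heartbeat interval.
Slave_open_temp_tables Global Number of temporary tables the slave currently has open.
Slave_received_heartbeats Global Number of heartbeats received by the slave.
Slave_retried_transactions Global Number of times since startup that the slave replication thread has retried transactions.
Slave_running Global ON if this server is a slave that is connected to a master.
Slow_launch_threads Both Number of threads that took more than slow_launch_time seconds to create.
Slow_queries Both Number of queries that took more than long_query_time seconds.
Sort_merge_passes Both Number of merge passes the sort algorithm has had to do. If this value is large, consider increasing the sort_buffer_size system variable.
Sort_range Both Number of sorts that were done using ranges.
Sort_rows Both Number of sorted rows.
Sort_scan Both Number of sorts that were done by scanning the table.
Ssl_*   SSL connection related counters.
Table_locks_immediate Global Number of times a table lock was acquired immediately.
Table_locks_waited Global Number of times a table lock could not be acquired immediately and a wait was needed. If this is high and you have performance problems, first optimize your queries, then either split your tables or use replication.
Threads_cached Global Number of threads in the thread cache.
Threads_connected Global Number of currently open connections.
Threads_created Global Number of threads created to handle connections. If Threads_created is large, you may want to increase thread_cache_size. The cache miss rate can be calculated as Threads_created/Connections.
Threads_running Global Number of threads that are not sleeping.
Uptime Global Number of seconds the server has been up.
Uptime_since_flush_status Global Number of seconds since the most recent FLUSH STATUS statement.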
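
The table gives two ratios that are easy to compute from SHOW GLOBAL STATUS output: Key_reads/Key_read_requests and Threads_created/Connections. The sketch below shows the arithmetic only. The counter values in the status dictionary are made up for illustration, and miss_rate is a hypothetical helper name, not part of MySQL.

```python
def miss_rate(misses: int, requests: int) -> float:
    """Return misses/requests, or 0.0 when no requests were made."""
    return misses / requests if requests else 0.0


# Hypothetical counter values, as they might be copied from SHOW GLOBAL STATUS.
status = {
    "Key_reads": 120,
    "Key_read_requests": 48000,
    "Threads_created": 35,
    "Connections": 7000,
}

key_cache_miss = miss_rate(status["Key_reads"], status["Key_read_requests"])
thread_cache_miss = miss_rate(status["Threads_created"], status["Connections"])

print(f"key cache miss rate:    {key_cache_miss:.4f}")
print(f"thread cache miss rate: {thread_cache_miss:.4f}")
```

Both counters are cumulative since server start, so in practice it is usually more informative to compute the ratio over the difference between two samples.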

Category: MySQL

MySQL Handler Status Variables

June 27, 2016

#############################################################
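There are seven counters, which fall into two groups:
1) Index read counters: the first five count reads from an index.
     Handler_read_first: number of times the first entry in an index was read;
     Handler_read_key: number of times a specific index entry was read;
     Handler_read_next: number of times the next index entry was read;
     Handler_read_last: number of times the last entry in an index was read;
     Handler_read_prev: number of times the previous index entry was read;
These five typically appear in four combinations:
1. Handler_read_first with Handler_read_next: a full index scan (covering index scan)
2. Handler_read_key alone: a lookup by index value
3. Handler_read_key with Handler_read_next: an index range scan
4. Handler_read_last with Handler_read_prev: an index scan in descending order (ORDER BY ... DESC)

2) Data file counters: the last two count reads from the data file.
Handler_read_rnd:
The number of requests to read a row based on a fixed position. This value is high if you are doing a lot of queries that require sorting of the result. You probably have a lot of queries that require MySQL to scan entire tables or you have joins that do not use keys properly.

Handler_read_rnd_next:
The number of requests to read the next row in the data file. This value is high if you are doing a lot of table scans. Generally this suggests that your tables are not properly indexed or that your queries are not written to take advantage of the indexes you have.

The important point here is that index entries are ordered, which is why first, last, next and prev exist at all. That is why the first five counters describe index reads, while the last two describe reads from the data file.

It follows that:
The lower Handler_read_rnd and Handler_read_rnd_next are, the better; if they are high, the indexes need tuning. A high Handler_read_key, on the other hand, is a good sign, because it means that most reads go through an index. The other counters have to be judged case by case.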
#################

###########################################################################################
select uid from user_1;
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| Handler_read_first    | 1     |
| Handler_read_key      | 1     |    # incremented when an index is used
| Handler_read_last     | 0     |
| Handler_read_next     | 4863  |    # total number of rows read through the index
| Handler_read_prev     | 0     |
| Handler_read_rnd      | 0     |
| Handler_read_rnd_next | 0     |
+-----------------------+-------+

####################################################################

mysql> select uid from user_1 where nickname = '静静';
+-----------+
| uid       |
+-----------+
| 100000099 |
+-----------+
1 row in set (0.01 sec)

mysql> show session status like 'Handler_read%';
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| Handler_read_first    | 1     |
| Handler_read_key      | 1     |
| Handler_read_last     | 0     |
| Handler_read_next     | 0     |
| Handler_read_prev     | 0     |
| Handler_read_rnd      | 0     |
| Handler_read_rnd_next | 4864  |    # number of rows scanned; a large value means the indexes are not set up well
+-----------------------+-------+
########################################################################
mysql> select nickname,email from user_1 order by uid desc limit 5\G
*************************** 1. row ***************************
nickname: clark
   email: 
*************************** 2. row ***************************
nickname: 雨御寒
   email: 
*************************** 3. row ***************************
nickname: haha
   email: 
*************************** 4. row ***************************
nickname: 公众号不同
   email: 
*************************** 5. row ***************************
nickname: 111111eee
   email: ddd@jm.com
5 rows in set (0.00 sec)

mysql> show session status like 'Handler_read%';
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| Handler_read_first    | 0     |    # reads of the first entry in an index
| Handler_read_key      | 1     |    # reads through an index by key
| Handler_read_last     | 1     |    # reads of the last entry in an index, typical for ORDER BY ... DESC
| Handler_read_next     | 0     |    # reads of the next row in index order
| Handler_read_prev     | 4     |    # reads of the previous row in index order, mainly used to optimize ORDER BY ... DESC
| Handler_read_rnd      | 0     |    # fetches a row from a table based on a "fixed position", i.e. a random read
| Handler_read_rnd_next | 0     |    # reads of the next row in the data file; high when doing many table scans
+-----------------------+-------+
7 rows in set (0.00 sec)
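
The four counter combinations described in this post can be turned into a rough classifier. This is only a sketch of the author's rule of thumb, not something MySQL provides: classify_access and the labels it returns are hypothetical names, and it assumes the counters come from a session that ran FLUSH STATUS followed by a single query.

```python
def classify_access(h: dict) -> str:
    """Guess the access pattern of one query from its Handler_read_* counters."""
    if h.get("Handler_read_rnd_next", 0) > 0:
        return "table scan"
    if h.get("Handler_read_last", 0) and h.get("Handler_read_prev", 0):
        return "descending index scan"
    if h.get("Handler_read_first", 0) and h.get("Handler_read_next", 0):
        return "full index scan"
    if h.get("Handler_read_key", 0) and h.get("Handler_read_next", 0):
        return "index range scan"
    if h.get("Handler_read_key", 0):
        return "index lookup"
    return "unknown"


# Counter values taken from the three sessions shown in this post.
select_uid = {"Handler_read_first": 1, "Handler_read_key": 1, "Handler_read_next": 4863}
where_nickname = {"Handler_read_first": 1, "Handler_read_key": 1, "Handler_read_rnd_next": 4864}
order_desc = {"Handler_read_key": 1, "Handler_read_last": 1, "Handler_read_prev": 4}

for name, counters in [("select uid", select_uid),
                       ("where nickname", where_nickname),
                       ("order by uid desc", order_desc)]:
    print(name, "->", classify_access(counters))
```

The table scan check comes first because a nonzero Handler_read_rnd_next is the strongest signal in the examples; a query that mixes several access methods would need a finer breakdown.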

References

https://www.percona.com/blog/2010/06/15/what-does-handler_read_rnd-mean/

 

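Category: MySQL

Kernel parameter overcommit_memory

November 8, 2015

Processes usually call malloc() to allocate memory. The kernel decides whether enough memory is available and then allows or refuses the request. Linux supports memory overcommit, which allows allocations that exceed the available RAM plus swap.
The vm.overcommit_memory parameter has three possible settings: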
This value contains a flag that enables memory overcommitment.

When this flag is 0, the kernel attempts to estimate the amount of free memory left when userspace requests more memory.

When this flag is 1, the kernel pretends there is always enough memory until it actually runs out.

When this flag is 2, the kernel uses a "never overcommit" policy that attempts to prevent any overcommit of memory.

This feature can be very useful because there are a lot of programs that malloc() huge amounts of memory "just-in-case" and don't use much of it.

The default value is 0.

overcommit_ratio:

When overcommit_memory is set to 2, the committed address space is not permitted to exceed swap plus this percentage of physical RAM. See above.

The Linux kernel supports the following overcommit handling modes:

0 - Heuristic overcommit handling. Obvious overcommits of address space are refused. Used for a typical system. It ensures a seriously wild allocation fails while allowing overcommit to reduce swap usage. root is allowed to allocate slightly more memory in this mode. This is the default.

1 - Always overcommit. Appropriate for some scientific applications.

2 - Don't overcommit. The total address space commit for the system is not permitted to exceed swap + a configurable percentage (default is 50) of physical RAM. Depending on the percentage you use, in most situations this means a process will not be killed while accessing pages but will receive errors on memory allocation as appropriate.
Example:
[root@localhost ~]# grep -i commit /proc/meminfo
CommitLimit:     3330352 kB     ## the commit limit
Committed_AS:    4000764 kB     ## address space that has already been committed
[root@localhost ~]# free 
             total       used       free     shared    buffers     cached
Mem:       2531956    2438888      93068          0     117460      82020
-/+ buffers/cache:    2239408     292548
Swap:      2064376     791008    1273368
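CommitLimit = 2531956 × 0.5 + 2064376 ≈ 3330354 kB (RAM × overcommit_ratio + swap)
When vm.overcommit_memory is set to 2, the total memory Linux commits cannot exceed CommitLimit.
# show the current value
sysctl -a | grep vm.overcommit_memory
# change the value at run time (takes effect immediately, lost on reboot)
sysctl -w vm.overcommit_memory=1
# reload the settings stored in /etc/sysctl.conf; put the setting there to make it permanent
sysctl -p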
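
The CommitLimit arithmetic from the example can be checked with a few lines of Python. This is a sketch under the assumption that the limit is simply swap plus overcommit_ratio percent of RAM, as the text describes; commit_limit_kb is a hypothetical helper name, and the inputs are the totals from the free output above.

```python
def commit_limit_kb(mem_total_kb: int, swap_total_kb: int, overcommit_ratio: int = 50) -> int:
    """CommitLimit as described above: swap + overcommit_ratio% of physical RAM."""
    return swap_total_kb + mem_total_kb * overcommit_ratio // 100


# Totals from the free output in the example (in kB).
limit = commit_limit_kb(mem_total_kb=2531956, swap_total_kb=2064376)
print(limit)
```

The result differs from the 3330352 kB reported in /proc/meminfo by 2 kB, which is consistent with the kernel doing the calculation in whole pages.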

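Category: kernel

tcpdump usage example

November 2, 2015

Goal: capture the SQL statements that the web server sends to the MySQL database server.

Use tcpdump to capture the traffic sent to MySQL port 3306: tcpdump -A -s 0 -i eth1 -w dumpfile1.cap host 172.16.0.18 and port 3306. Then open the capture file in Wireshark to see the SQL statements that were sent to the MySQL server.

Options:

   -A    print each packet in ASCII (this affects printed output only, not the file written with -w)

   -s 0  capture whole packets; older tcpdump versions otherwise capture only the first part of each packet

   -i    the interface to capture packets on

   -w    write the captured packets to the given file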

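sed pattern space

October 21, 2015

    sed can edit text line by line because it works with two buffers: the active "pattern space" and the auxiliary "hold space".

    sed processes a file one line at a time and prints the result to the screen. It reads the current line into the pattern space, runs all of its commands against that line, and then prints the processed line (unless an earlier command deleted it). It then removes the line from the pattern space, reads the next line into the pattern space, and processes and prints it in the same way. sed exits after the last line of the file has been processed. Because sed works on a temporary buffer (the pattern space), the original file is not modified unless the -i option is given explicitly.

Commands related to the pattern space and the hold space:
n Print the pattern space, replace it with the next line of input, and continue with the next command rather than starting again from the first command.
N Read the next line and append it to the pattern space, which then holds two lines.
h Copy the pattern space to the hold space.
H Append the pattern space to the hold space.
g Replace the pattern space with the contents of the hold space.
G Append the contents of the hold space to the pattern space.
x Exchange the contents of the hold space and the pattern space.
! Apply the command to all lines except the selected ones.
Note: the hold space initially contains an empty line.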
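
To see how the pattern space and hold space interact, the sketch below simulates the well-known one-liner sed -n '1!G;h;$p', which prints a file with its lines in reverse order. The one-liner is not from the original post; it is used here only because it exercises G, h and ! together. sed_reverse is a hypothetical helper name.

```python
def sed_reverse(lines):
    """Simulate sed -n '1!G;h;$p': return the input lines in reverse order."""
    hold = ""                                # hold space starts out empty
    output = []
    last = len(lines)
    for lineno, line in enumerate(lines, start=1):
        pattern = line                       # read the line into the pattern space
        if lineno != 1:                      # 1!G: on every line except the first,
            pattern = pattern + "\n" + hold  #      append the hold space to the pattern space
        hold = pattern                       # h: copy the pattern space to the hold space
        if lineno == last:                   # $p: on the last line, print the pattern space
            output.extend(pattern.split("\n"))
    return output


print(sed_reverse(["one", "two", "three"]))
```

Each line is glued in front of everything seen so far, so by the last line the pattern space holds the whole input in reverse.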

buffer and cached

October 15, 2015
[root@localhost ~]# free -m
             total       used       free     shared    buffers     cached
Mem:          2472       2342        130          0        169        278
-/+ buffers/cache:       1894        578
Swap:         2015       2015          0
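The memory that is actually available is 578 MB. Copying a large file makes cached rise sharply. So what is cached, and what is buffers? Put simply,
cached holds file data that has been read or written, while buffers holds filesystem metadata blocks such as inodes and directory entries. Metadata is obviously much smaller than file contents, so
buffers is usually much smaller than cached. The cached part is the page cache and the buffers part is the buffer cache. For the difference between the page cache and the buffer cache, see: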
 http://www.quora.com/Linux-Kernel/What-is-the-major-difference-between-the-buffer-cache-and-the-page-cache
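
The "actually available" figure in the free output is just free + buffers + cached. A minimal sketch of that sum, using the numbers from the output above (available_mb is a hypothetical helper name):

```python
def available_mb(free: int, buffers: int, cached: int) -> int:
    """Memory that can be handed to applications: free plus reclaimable caches."""
    return free + buffers + cached


# Values from the Mem: row of the free -m output above.
avail = available_mb(free=130, buffers=169, cached=278)
print(avail)
```

The sum is 1 MB below the 578 shown on the -/+ buffers/cache row because free -m rounds each column separately.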

  

Linux kernel parameter Transparent Hugepages

October 5, 2015

Environment

  • Red Hat Enterprise Linux (RHEL) 6

 

Resolution

Note: Transparent Huge Pages are not available on the 32-bit version of RHEL 6.

Transparent Huge Pages (THP) are enabled by default in RHEL 6 for all applications. The kernel attempts to allocate hugepages whenever possible and any Linux process will receive 2MB pages if the mmap region is 2MB naturally aligned. The main kernel address space itself is mapped with hugepages, reducing TLB pressure from kernel code. For general information on Hugepages, see: What are Huge Pages and what are the advantages of using them?

The kernel will always attempt to satisfy a memory allocation using hugepages. If no hugepages are available (due to non availability of physically continuous memory for example) the kernel will fall back to the regular 4KB pages. THP are also swappable (unlike hugetlbfs). This is achieved by breaking the huge page to smaller 4KB pages, which are then swapped out normally.

But to use hugepages effectively, the kernel must find physically continuous areas of memory big enough to satisfy the request, and also properly aligned. For this, a khugepaged kernel thread has been added. This thread will occasionally attempt to substitute smaller pages being used currently with a hugepage allocation, thus maximizing THP usage.

In userland, no modifications to the applications are necessary (hence transparent). But there are ways to optimize its use. For applications that want to use hugepages, use of posix_memalign() can also help ensure that large allocations are aligned to huge page (2MB) boundaries.

Also, THP is only enabled for anonymous memory regions. There are plans to add support for tmpfs and page cache. THP tunables are found in the /sys tree under /sys/kernel/mm/redhat_transparent_hugepage.

The values for /sys/kernel/mm/redhat_transparent_hugepage/enabled can be one of the following:

always   -  always use THP
never    -  disable THP
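
When the enabled file is read back, the kernel lists the accepted values and marks the active one with square brackets. That bracket convention is not stated in the article above, so treat it as an assumption of this sketch, along with the sample strings. active_thp_mode is a hypothetical helper name.

```python
import re


def active_thp_mode(text: str):
    """Return the value shown in [brackets], or None if nothing is marked."""
    match = re.search(r"\[(\w+)\]", text)
    return match.group(1) if match else None


# Sample file contents (assumed format, not copied from a real system).
print(active_thp_mode("[always] never"))
print(active_thp_mode("always madvise [never]"))
```

On a real system the text would come from reading /sys/kernel/mm/redhat_transparent_hugepage/enabled.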

 

khugepaged will be automatically started when transparent_hugepage/enabled is set to "always" or "madvise", and it will be automatically shut down if it is set to "never". The redhat_transparent_hugepage/defrag parameter takes the same values and controls whether the kernel should make aggressive use of memory compaction to make more hugepages available.

 

Check system-wide THP usage

Run the following command to check system-wide THP usage:

# grep AnonHugePages /proc/meminfo 
AnonHugePages:    632832 kB
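
The AnonHugePages figure is reported in kB, so dividing by the 2 MB huge page size gives the number of transparent huge pages in use. A small sketch, assuming 2048 kB pages as stated earlier in the article (anon_huge_pages is a hypothetical helper name):

```python
def anon_huge_pages(meminfo_text: str, huge_page_kb: int = 2048) -> int:
    """Count huge pages backing anonymous memory from /proc/meminfo text."""
    for line in meminfo_text.splitlines():
        if line.startswith("AnonHugePages:"):
            return int(line.split()[1]) // huge_page_kb
    return 0


# The sample line is the output shown above.
pages = anon_huge_pages("AnonHugePages:    632832 kB")
print(pages)
```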

 

Note: Red Hat Enterprise Linux 6.2 or later publishes additional THP monitoring via /proc/vmstat:

# egrep 'trans|thp' /proc/vmstat
nr_anon_transparent_hugepages 2018
thp_fault_alloc 7302
thp_fault_fallback 0
thp_collapse_alloc 401
thp_collapse_alloc_failed 0
thp_split 21

 

 

Check THP usage per process

Run the following command to monitor which processes are using THP:

# grep -e AnonHugePages  /proc/*/smaps | awk  '{ if($2>4) print $0} ' |  awk -F "/"  '{print $0; system("ps -fp " $3)} '
/proc/7519/smaps:AnonHugePages:    305152 kB
UID        PID  PPID  C STIME TTY          TIME CMD
qemu      7519     1  1 08:53 ?        00:00:48 /usr/bin/qemu-system-x86_64 -machine accel=kvm -name rhel7 -S -machine pc-i440fx-1.6,accel=kvm,usb=of
/proc/7610/smaps:AnonHugePages:    491520 kB
UID        PID  PPID  C STIME TTY          TIME CMD
qemu      7610     1  2 08:53 ?        00:01:30 /usr/bin/qemu-system-x86_64 -machine accel=kvm -name util6vm -S -machine pc-i440fx-1.6,accel=kvm,usb=
/proc/7788/smaps:AnonHugePages:    389120 kB
UID        PID  PPID  C STIME TTY          TIME CMD
qemu      7788     1  1 08:54 ?        00:00:55 /usr/bin/qemu-system-x86_64 -machine accel=kvm -name rhel64eus -S -machine pc-i440fx-1.6,accel=kvm,us

 

 

To disable THP at boot time

Append the following to the kernel command line in grub.conf:

transparent_hugepage=never

 

Note: Certain ktune and/or tuned profiles enable THP when they are applied. If the transparent_hugepage=never parameter is set at boot time but THP does not appear to be disabled after the system is fully booted, refer to the following article:

Disabling transparent hugepages (THP) on Red Hat Enterprise Linux 6 is not taking effect


 

To disable THP at run time

Run the following commands to disable THP on-the-fly:

# echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
# echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag

 

  • NOTE: Running the above commands stops only the creation and use of new THP. THP that were already created and in use when the commands were run are not broken up into regular memory pages. To get rid of THP completely, reboot the system with THP disabled at boot time.

 

How to tell if Explicit HugePages is enabled or disabled

There can be two types of HugePages in the system: Explicit Huge Pages, which are allocated explicitly through the vm.nr_hugepages sysctl parameter, and Transparent Huge Pages, which are allocated automatically by the kernel. See below for how to tell whether Explicit HugePages is enabled or disabled.

  • Explicit HugePages DISABLED:

    • If the value of HugePages_Total is "0" it means HugePages is disabled on the system.

      # grep -i HugePages_Total /proc/meminfo 
      HugePages_Total:       0
      

       

    • Similarly, if the value in the /proc/sys/vm/nr_hugepages file or the vm.nr_hugepages sysctl parameter is "0", it means HugePages is disabled on the system:

      # cat /proc/sys/vm/nr_hugepages 
      0
      # sysctl vm.nr_hugepages
      vm.nr_hugepages = 0
      

       

  • Explicit HugePages ENABLED:

    • If the value of HugePages_Total is greater than "0", it means HugePages is enabled on the system:

      # grep -i HugePages_Total /proc/meminfo 
      HugePages_Total:       1024
      

       

    • Similarly if the value in /proc/sys/vm/nr_hugepages file or vm.nr_hugepages sysctl parameter is greater than "0", it means HugePages is enabled on the system:

      # cat /proc/sys/vm/nr_hugepages 
      1024
      # sysctl vm.nr_hugepages
      vm.nr_hugepages = 1024
      

       

 

Comments

  • RHEL 6 disables THP on systems with < 1G of ram. Refer to Red Hat Bug 618444 – disable transparent hugepages by default on small systems for more information.
  • Disadvantages of using the explicit hugepages (libhugetlbfs): Using hugetlbfs requires significant work from both application developers and system administrators; explicit hugepages must be set aside at boot time, and applications must map them explicitly. The process is fiddly enough that use of hugetlbfs is restricted to those who really care and who have the time to mess with it. Hugetlbfs is often seen as a feature for large, proprietary database management systems and little else.

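Monitoring usage

The khugepaged settings can be inspected under /sys/kernel/mm/transparent_hugepage/khugepaged:

pages_to_scan (default 4096 = 16MB)
Number of memory pages scanned in each scan pass.

scan_sleep_millisecs (default 10000 = 10sec)
How long khugepaged waits between scan passes.

alloc_sleep_millisecs (default 60000 = 60sec)
How long khugepaged waits after a failed hugepage allocation before it tries again.

Usage can also be read from /proc/meminfo:

grep Huge /proc/meminfo
AnonHugePages: 266240 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB

Notes:

1. Enabling THP is not recommended for Redis.

2. Enabling THP is not recommended for Oracle databases either.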

 

 

Category: kernel

Official Redis administration recommendations

October 2, 2015

Redis Administration

This page contains topics related to the administration of Redis instances. Every topic is self contained in form of a FAQ. New topics will be created in the future.

Redis setup hints

  • We suggest deploying Redis using the Linux operating system. Redis is also tested heavily on OS X, and tested from time to time on FreeBSD and OpenBSD systems. However Linux is where we do all the major stress testing, and where most production deployments are working.
  • Make sure to set the Linux kernel overcommit memory setting to 1. Add vm.overcommit_memory = 1 to /etc/sysctl.conf and then reboot or run the command sysctl vm.overcommit_memory=1 for this to take effect immediately.
  • Make sure to disable the Linux kernel feature transparent huge pages, as it greatly affects both memory usage and latency in a negative way. This is accomplished with the following command: echo never > /sys/kernel/mm/transparent_hugepage/enabled.
  • Make sure to set up some swap in your system (we suggest as much swap as memory). If Linux does not have swap and your Redis instance accidentally consumes too much memory, either Redis will crash for out of memory or the Linux kernel OOM killer will kill the Redis process.
  • Set an explicit maxmemory option limit in your instance in order to make sure that the instance will report errors instead of failing when the system memory limit is near to be reached.
  • If you are using Redis in a very write-heavy application, while saving an RDB file on disk or rewriting the AOF log Redis may use up to 2 times the memory normally used. The additional memory used is proportional to the number of memory pages modified by writes during the saving process, so it is often proportional to the number of keys (or aggregate types items) touched during this time. Make sure to size your memory accordingly.
  • Use daemonize no when run under daemontools.
  • Even if you have persistence disabled, Redis will need to perform RDB saves if you use replication, unless you use the new diskless replication feature, which is currently experimental.
  • If you are using replication, make sure that either your master has persistence enabled, or that it does not automatically restart on crashes: slaves will try to be an exact copy of the master, so if a master restarts with an empty data set, slaves will be wiped as well.

Running Redis on EC2

  • Use HVM based instances, not PV based instances.
  • Don't use old instances families, for example: use m3.medium with HVM instead of m1.medium with PV.
  • The use of Redis persistence with EC2 EBS volumes needs to be handled with care since sometimes EBS volumes have high latency characteristics.
  • You may want to try the new diskless replication (currently experimental) if you have issues when slaves are synchronizing with the master.

Upgrading or restarting a Redis instance without downtime

Redis is designed to be a very long running process in your server. For instance many configuration options can be modified without any kind of restart using the CONFIG SET command.

Starting from Redis 2.2 it is even possible to switch from AOF to RDB snapshots persistence or the other way around without restarting Redis. Check the output of the CONFIG GET * command for more information.

However from time to time a restart is mandatory, for instance in order to upgrade the Redis process to a newer version, or when you need to modify some configuration parameter that is currently not supported by the CONFIG command.

The following steps provide a very commonly used way in order to avoid any downtime.

  • Setup your new Redis instance as a slave for your current Redis instance. In order to do so you need a different server, or a server that has enough RAM to keep two instances of Redis running at the same time.
  • If you use a single server, make sure that the slave is started in a different port than the master instance, otherwise the slave will not be able to start at all.
  • Wait for the replication initial synchronization to complete (check the slave log file).
  • Make sure using INFO that there are the same number of keys in the master and in the slave. Check with redis-cli that the slave is working as you wish and is replying to your commands.
  • Allow writes to the slave using CONFIG SET slave-read-only no
  • Configure all your clients in order to use the new instance (that is, the slave).
  • Once you are sure that the master is no longer receiving any query (you can check this with the MONITOR command), elect the slave to master using the SLAVEOF NO ONE command, and shut down your master.
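
The key-count check in the steps above can be scripted by comparing the Keyspace section of the INFO output from both instances. The sketch below only parses text; the sample strings are invented, and total_keys is a hypothetical helper name. It assumes the usual dbN:keys=...,expires=... line format of that section.

```python
import re


def total_keys(info_text: str) -> int:
    """Sum the keys=N fields of all dbN lines in Redis INFO output."""
    counts = re.findall(r"^db\d+:keys=(\d+)", info_text, flags=re.MULTILINE)
    return sum(int(n) for n in counts)


# Invented INFO excerpts for a master and its slave.
master_info = "# Keyspace\r\ndb0:keys=1500,expires=20,avg_ttl=0\r\ndb1:keys=7,expires=0,avg_ttl=0\r\n"
slave_info = "# Keyspace\r\ndb0:keys=1500,expires=20,avg_ttl=0\r\ndb1:keys=7,expires=0,avg_ttl=0\r\n"

in_sync = total_keys(master_info) == total_keys(slave_info)
print(total_keys(master_info), in_sync)
```

Equal key counts are a necessary sanity check rather than proof that the data sets are identical, so the manual check with redis-cli from the same step still applies.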

Source: http://redis.io/topics/admin

Script to list the processes most likely to be killed by the OOM killer

September 29, 2015

#!/bin/bash
# Print the 10 processes with the highest oom_score: score, PID, start of the command line.
for proc in $(find /proc -maxdepth 1 -regex '/proc/[0-9]+'); do
    printf "%2d %5d %s\n" \
        "$(cat "$proc/oom_score")" \
        "$(basename "$proc")" \
        "$(cat "$proc/cmdline" | tr '\0' ' ' | head -c 50)"
done 2>/dev/null | sort -nr | head -n 10
Category: shell

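The pmap command

September 27, 2015

Syntax and usage

# pmap PID   or   # pmap [options] PID

The output shows the address, size in Kbytes, mode and mapping of every memory region.

Options

-x extended: show the extended format
-d device: show the device format
-q quiet: do not show the header and footer lines
-V show version information

Memory status of a single process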
[root@mail ~]# pmap  -x 9652
9652:   /usr/local/application/webserver/redis/src/redis-server *:22005                                                    
Address           Kbytes     RSS   Dirty Mode   Mapping
0000000000400000     828     480       0 r-x--  redis-server
00000000006ce000      20      20      20 rw---  redis-server
00000000006d3000     152      92      92 rw---    [ anon ]
00000000015ea000     132      52      52 rw---    [ anon ]
0000003d58c00000     128      36       0 r-x--  ld-2.12.so
0000003d58e1f000       4       4       4 r----  ld-2.12.so
0000003d58e20000       4       4       4 rw---  ld-2.12.so
0000003d58e21000       4       4       4 rw---    [ anon ]
0000003d59000000    1580     528       0 r-x--  libc-2.12.so
0000003d5918b000    2044       0       0 -----  libc-2.12.so
0000003d5938a000      16      16       8 r----  libc-2.12.so
0000003d5938e000       4       4       4 rw---  libc-2.12.so
0000003d5938f000      20      16      16 rw---    [ anon ]
0000003d59400000       8       4       0 r-x--  libdl-2.12.so
0000003d59402000    2048       0       0 -----  libdl-2.12.so
0000003d59602000       4       4       4 r----  libdl-2.12.so
0000003d59603000       4       4       4 rw---  libdl-2.12.so
0000003d59800000      92      68       0 r-x--  libpthread-2.12.so
0000003d59817000    2048       0       0 -----  libpthread-2.12.so
0000003d59a17000       4       4       4 r----  libpthread-2.12.so
0000003d59a18000       4       4       4 rw---  libpthread-2.12.so
0000003d59a19000      16       4       4 rw---    [ anon ]
0000003d5a400000     524       4       0 r-x--  libm-2.12.so
0000003d5a483000    2044       0       0 -----  libm-2.12.so
0000003d5a682000       4       4       4 r----  libm-2.12.so
0000003d5a683000       4       4       4 rw---  libm-2.12.so
00007f808d000000  688128  685848  685848 rw---    [ anon ]
00007f80b73fe000       4       0       0 -----    [ anon ]
00007f80b73ff000   10240       8       8 rw---    [ anon ]
00007f80b7dff000       4       0       0 -----    [ anon ]
00007f80b7e00000   75776   40960   40960 rw---    [ anon ]
00007f80bc800000    4096    4096    4096 rw---    [ anon ]
00007f80bcd70000   96832       0       0 r----  locale-archive
00007f80c2c00000    4096    2048    2048 rw---    [ anon ]
00007f80c3070000      16      16      16 rw---    [ anon ]
00007f80c3083000       4       4       4 rw---    [ anon ]
00007fff7e495000      84      16      16 rw---    [ stack ]
00007fff7e5c6000       4       4       0 r-x--    [ anon ]
ffffffffff600000       4       0       0 r-x--    [ anon ]
----------------  ------  ------  ------
total kB          891028  734360  733228


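The Address, Kbytes, RSS, Dirty, Mode and Mapping columns are explained below.

Fields of the extended and device formats

Address: start address of the mapping
Kbytes: size of the mapping (KB)
RSS: resident set size, the part of the mapping currently held in RAM (KB)
Dirty: size of the dirty pages, both shared and private (KB)
Mode: permissions on the mapping: read, write, execute, shared, private (copy on write)
Mapping: the file backing the mapping, or [anon] (allocated memory), or [stack] (the stack)
Offset: offset into the file
Device: device name (major:minor)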
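
The total line at the bottom of pmap -x is simply the column sums of the mapping rows. The sketch below recomputes them for a few rows from the output above; sum_pmap is a hypothetical helper name, and the parser assumes the column layout shown in this post (hex address, then Kbytes, RSS and Dirty).

```python
def sum_pmap(text: str) -> dict:
    """Sum the Kbytes, RSS and Dirty columns of pmap -x mapping rows."""
    totals = {"Kbytes": 0, "RSS": 0, "Dirty": 0}
    for line in text.splitlines():
        parts = line.split()
        # A mapping row is a hex address followed by three integer columns.
        if len(parts) >= 5 and all(p.isdigit() for p in parts[1:4]):
            try:
                int(parts[0], 16)
            except ValueError:
                continue  # header or footer line, not a mapping
            totals["Kbytes"] += int(parts[1])
            totals["RSS"] += int(parts[2])
            totals["Dirty"] += int(parts[3])
    return totals


# First three mapping rows of the output above, plus header and footer.
sample = """\
Address           Kbytes     RSS   Dirty Mode   Mapping
0000000000400000     828     480       0 r-x--  redis-server
00000000006ce000      20      20      20 rw---  redis-server
00000000006d3000     152      92      92 rw---    [ anon ]
total kB          891028  734360  733228
"""
print(sum_pmap(sample))
```

Grouping the same sums by the Mapping column would show which library or anonymous region accounts for most of the resident memory.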