CPU 飙升是一个常见的问题。

CPU100%定位的两大神器

img

想要定位到具体是哪一行的代码导致, 一般都会使用下面的两大神器

  • 通常使用的jvm自带的工具jstack,
  • 还有一种就是开源神器arthas,

一般而言,arthas还有其它的功能,所以选择它多一点.

CPU 飙升100%的解决思路****与方法论

img

使用jstack 解决CPU 100%问题实操

使用jstack 解决CPU 100%问题,在方法论上要用到两个命令,

  • top 命令查看TOP N线程,
  • jstack命令查看堆栈信息

img

1.jstack命令讲解

命令jstack是java堆栈的跟踪工具,可以打印出程序中所有线程的堆栈信息,包括线程状态,调用栈信息,锁信息等。

jstack可以诊断线程死锁、内存泄漏等问题

命令格式: jstack [options] pid

常用例子: jstack -l pid,查看线程的堆栈信息

堆栈信息解读:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
[root@192-168-65-185 ~]# jstack -l 294402024-12-17 13:48:51Full thread dump OpenJDK 64-Bit Server VM (25.412-b08 mixed mode):
"Attach Listener" #12 daemon prio=9 os_prio=0 tid=0x00007f097c001000 nid=0x763a runnable [0x0000000000000000] java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers: - None
"DestroyJavaVM" #11 prio=5 os_prio=0 tid=0x00007f09e004b800 nid=0x7301 waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers: - None
"pool-1-thread-1" #10 prio=5 os_prio=0 tid=0x00007f09e0166000 nid=0x7316 runnable [0x00007f09cc9cb000] java.lang.Thread.State: RUNNABLE at com.tuling.learnjuc.demo.JVMCPU$CPUTask.run(JVMCPU.java:16) - locked <0x0000000757d645b8> (a java.lang.Object) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750)
Locked ownable synchronizers: - <0x0000000757d65d00> (a java.util.concurrent.ThreadPoolExecutor$Worker)
"Service Thread" #9 daemon prio=9 os_prio=0 tid=0x00007f09e0140000 nid=0x7314 runnable [0x0000000000000000] java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers: - None
"C1 CompilerThread3" #8 daemon prio=9 os_prio=0 tid=0x00007f09e0133000 nid=0x7313 waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers: - None
"C2 CompilerThread2" #7 daemon prio=9 os_prio=0 tid=0x00007f09e0131000 nid=0x7312 waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers: - None
"C2 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x00007f09e012f000 nid=0x7311 waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers: - None
"C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x00007f09e012c800 nid=0x7310 waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers: - None
"Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x00007f09e011e000 nid=0x730f runnable [0x0000000000000000] java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers: - None
"Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007f09e00f2000 nid=0x730e in Object.wait() [0x00007f09cd1d3000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x0000000757d08f00> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144) - locked <0x0000000757d08f00> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:165) at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:188)
Locked ownable synchronizers: - None
"Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00007f09e00ed000 nid=0x730d in Object.wait() [0x00007f09cd2d4000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x0000000757d06b98> (a java.lang.ref.Reference$Lock) at java.lang.Object.wait(Object.java:502) at java.lang.ref.Reference.tryHandlePending(Reference.java:191) - locked <0x0000000757d06b98> (a java.lang.ref.Reference$Lock) at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)
Locked ownable synchronizers: - None
"VM Thread" os_prio=0 tid=0x00007f09e00e3800 nid=0x730c runnable
"GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007f09e005e000 nid=0x7302 runnable
"GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00007f09e0060000 nid=0x7303 runnable
"GC task thread#2 (ParallelGC)" os_prio=0 tid=0x00007f09e0062000 nid=0x7304 runnable
"GC task thread#3 (ParallelGC)" os_prio=0 tid=0x00007f09e0064000 nid=0x7305 runnable
"GC task thread#4 (ParallelGC)" os_prio=0 tid=0x00007f09e0065800 nid=0x7306 runnable
"GC task thread#5 (ParallelGC)" os_prio=0 tid=0x00007f09e0067800 nid=0x7307 runnable
"GC task thread#6 (ParallelGC)" os_prio=0 tid=0x00007f09e0069800 nid=0x7308 runnable
"GC task thread#7 (ParallelGC)" os_prio=0 tid=0x00007f09e006b800 nid=0x7309 runnable
"GC task thread#8 (ParallelGC)" os_prio=0 tid=0x00007f09e006d000 nid=0x730a runnable
"GC task thread#9 (ParallelGC)" os_prio=0 tid=0x00007f09e006f000 nid=0x730b runnable
"VM Periodic Task Thread" os_prio=0 tid=0x00007f09e0142800 nid=0x7315 waiting on condition
JNI global references: 5

你会发现上面的信息其实是一段一段的,摘取其中的一段为大家说明:

1
2
"pool-1-thread-1" #10 prio=5 os_prio=0 tid=0x00007f09e0166000 nid=0x7316 runnable [0x00007f09cc9cb000]   java.lang.Thread.State: RUNNABLE	at com.tuling.learnjuc.demo.JVMCPU$CPUTask.run(JVMCPU.java:16)	- locked <0x0000000757d645b8> (a java.lang.Object)	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)	at java.lang.Thread.run(Thread.java:750)
Locked ownable synchronizers: - <0x0000000757d65d00> (a java.util.concurrent.ThreadPoolExecutor$Worker)

pool-1-thread-1:线程名称

#1:当前线程ID,从main开始,jvm会根据线程创建的顺序为其线程编号

prio:优先级的顺序,一般默认是5

os_prio:线程对应系统的优先级

tid:java内的线程id

nid:操作系统级别的线程id,是一个十六进制

关于线程的信息:

NEW:线程新建,还没开始运行

RUNNABLE:正在java虚拟机中运行的线程

BLOCKED:被阻塞,正在等待监视器锁的线程

WAITING:无限期等待另一个线程执行特定操作的线程

TIMED_WAITING:等待另一个线程执行操作达到指定等待时间的线程

TERMINATED:已经退出的线程

我们这里关注的最多的就是nid

2.使用jstack解决CPU占用很高的问题并定位具体行数

先来看一段代码:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
package com.tuling.learnjuc.demo;
import java.util.concurrent.ExecutorService;import java.util.concurrent.Executors;
public class JVMCPU {
private static ExecutorService service = Executors.newFixedThreadPool(5);
private static Object lock = new Object();
public static class CPUTask implements Runnable{
@Override
public void run() {
synchronized (lock){ l
ong sum = 0L;
while(true){
sum +=1;
}
}
}
}
public static void main(String[] args) {
CPUTask cpuTask = new CPUTask();
service.execute(cpuTask);
}
}

将这段代码上传到linux服务器上编译

注意:如果提示javac: 未找到命令… ,执行yum install java-1.8.0-openjdk-devel

img

使用nohup java com.tuling.learnjuc.demo.JVMCPU &运行

img

使用top命令可以看到cpu被打满了

img

知道了进程的PID,如何找到进程下是哪个线程呢?可以使用命令top -Hp 32208,如下所示

img

如果能获取到线程名,下面这步可以忽略,不需要根据线程id找线程

从上面的图可以看到,cpu占用最多的线程是32230这个线程id,接下来就是使用jstack命令来查看程序的所有堆栈信息,但是,这里需要有一个注意的点,32230这个是一个十进制的,使用jstack看到的nid是十六进制,所以我们需要转换,可以使用printf “%x\n”这个命令

img

接下来使用jstack -l 32208打印堆栈信息

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
[root@192-168-65-185 ~]# jstack -l 322082024-12-17 14:04:09Full thread dump OpenJDK 64-Bit Server VM (25.412-b08 mixed mode):
"Attach Listener" #12 daemon prio=9 os_prio=0 tid=0x00007fb498001000 nid=0x4e5 waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers: - None
"DestroyJavaVM" #11 prio=5 os_prio=0 tid=0x00007fb4fc04b800 nid=0x7dd1 waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers: - None
"pool-1-thread-1" #10 prio=5 os_prio=0 tid=0x00007fb4fc16e000 nid=0x7de6 runnable [0x00007fb4e87fe000] java.lang.Thread.State: RUNNABLE at com.tuling.learnjuc.demo.JVMCPU$CPUTask.run(JVMCPU.java:16) - locked <0x0000000757d645b8> (a java.lang.Object) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750)
Locked ownable synchronizers: - <0x0000000757d65d00> (a java.util.concurrent.ThreadPoolExecutor$Worker)
"Service Thread" #9 daemon prio=9 os_prio=0 tid=0x00007fb4fc140000 nid=0x7de4 runnable [0x0000000000000000] java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers: - None
"C1 CompilerThread3" #8 daemon prio=9 os_prio=0 tid=0x00007fb4fc133000 nid=0x7de3 waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers: - None
"C2 CompilerThread2" #7 daemon prio=9 os_prio=0 tid=0x00007fb4fc131000 nid=0x7de2 waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers: - None
"C2 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x00007fb4fc12f000 nid=0x7de1 waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers: - None
"C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x00007fb4fc12c800 nid=0x7de0 waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers: - None
"Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x00007fb4fc11e000 nid=0x7ddf runnable [0x0000000000000000] java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers: - None
"Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007fb4fc0f2000 nid=0x7dde in Object.wait() [0x00007fb4e92d4000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x0000000757d08f00> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144) - locked <0x0000000757d08f00> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:165) at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:188)
Locked ownable synchronizers: - None
"Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00007fb4fc0ed000 nid=0x7ddd in Object.wait() [0x00007fb4e93d5000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x0000000757d06b98> (a java.lang.ref.Reference$Lock) at java.lang.Object.wait(Object.java:502) at java.lang.ref.Reference.tryHandlePending(Reference.java:191) - locked <0x0000000757d06b98> (a java.lang.ref.Reference$Lock) at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)
Locked ownable synchronizers: - None
"VM Thread" os_prio=0 tid=0x00007fb4fc0e3800 nid=0x7ddc runnable
"GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007fb4fc05e000 nid=0x7dd2 runnable
"GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00007fb4fc060000 nid=0x7dd3 runnable
"GC task thread#2 (ParallelGC)" os_prio=0 tid=0x00007fb4fc062000 nid=0x7dd4 runnable
"GC task thread#3 (ParallelGC)" os_prio=0 tid=0x00007fb4fc064000 nid=0x7dd5 runnable
"GC task thread#4 (ParallelGC)" os_prio=0 tid=0x00007fb4fc065800 nid=0x7dd6 runnable
"GC task thread#5 (ParallelGC)" os_prio=0 tid=0x00007fb4fc067800 nid=0x7dd7 runnable
"GC task thread#6 (ParallelGC)" os_prio=0 tid=0x00007fb4fc069800 nid=0x7dd8 runnable
"GC task thread#7 (ParallelGC)" os_prio=0 tid=0x00007fb4fc06b800 nid=0x7dd9 runnable
"GC task thread#8 (ParallelGC)" os_prio=0 tid=0x00007fb4fc06d000 nid=0x7dda runnable
"GC task thread#9 (ParallelGC)" os_prio=0 tid=0x00007fb4fc06f000 nid=0x7ddb runnable
"VM Periodic Task Thread" os_prio=0 tid=0x00007fb4fc142800 nid=0x7de5 waiting on condition
JNI global references: 5

从上面的信息中,可以看到转换的结果和nid是一致的,

从下面的信息中就可以看到问题其实是出现在JVMCPU.java的16行左右,这里给出的是16行,但是实际情况是16行的附近

1
2
"pool-1-thread-1" #10 prio=5 os_prio=0 tid=0x00007fb4fc16e000 nid=0x7de6 runnable [0x00007fb4e87fe000]   java.lang.Thread.State: RUNNABLE	at com.tuling.learnjuc.demo.JVMCPU$CPUTask.run(JVMCPU.java:16)	- locked <0x0000000757d645b8> (a java.lang.Object)	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)	at java.lang.Thread.run(Thread.java:750)
Locked ownable synchronizers: - <0x0000000757d65d00> (a java.util.concurrent.ThreadPoolExecutor$Worker)

结合代码来看一下就很容易问题

img

使用arthas解决CPU占用很高的问题,定位具体代码行

官方文档:https://arthas.aliyun.com/

使用arthas解决CPU 100%问题,在方法论上要用到两个命令,

  • dashboard 命令查看TOP N线程,
  • thread 命令查看堆栈信息

img

先来运行arthas

1
curl -O https://arthas.aliyun.com/arthas-boot.jarjava -jar arthas-boot.jar

img

输入1显示如下

img

输入dashboard命令可以看到是哪个线程占用cpu最高

img

接下来输入thread -n 1,表示最忙的线程并打印信息

img

从上面的图中可以看到arthas和jstack展示的信息差不多,都定位到了JVMCPU.java的16行程序