当前位置: 首页 > news >正文

企业级Jenkins实践

  • 1. 概述
  • 2. Jenkins部署
    • 2.1. Jenkins容器镜像构建
    • 2.2. 部署
  • 3. 其他组件集成
    • 3.1. sonarqube
    • 3.2. gitlab
    • 3.3. harbor
  • 4. 构建运行时
    • 4.1. 动态slave
    • 4.2. 定制化镜像
    • 4.3. pod启动加速
  • 5. CI构建
    • 5.1. 构建改造
    • 5.2. 缓存
    • 5.3. DID
    • 5.4. buildx
    • 5.5. DOCKER_BUILDKIT
  • 6. 落地规划
  • 7. 附录
    • 7.1. 附录1: buildx安装配置
      • 7.1.1. 前提
      • 7.1.2. 安装
      • 7.1.3. 交叉编译支持
      • 7.1.4. 创建新的builder
      • 7.1.5. 验证
      • 7.1.6. 构建命令
      • 7.1.7. 构建脚本
      • 7.1.8. 常见问题及解决方式
        • 7.1.8.1. 编译arm64架构的nodejs项目失败
          • 7.1.8.1.0.1. 现象
          • 7.1.8.1.0.2. 解决
          • 7.1.8.1.1. 参考
    • 7.2. 附录2: k9s动态slave pod启动优化
      • 7.2.1. 概述
      • 7.2.2. 环境说明
      • 7.2.3. 时间分析
      • 7.2.4. 性能优化
        • 7.2.4.1. etcd
        • 7.2.4.2. 启动链分析
        • 7.2.4.3. Jenkins remoting协议
          • 7.2.4.3.1. jenkinsfile
          • 7.2.4.3.2. 服务端日志
            • 7.2.4.3.2.1. websocket模式
            • 7.2.4.3.2.2. jnlp模式
          • 7.2.4.3.3. 客户端agent日志
            • 7.2.4.3.3.1. websocke连接
            • 7.2.4.3.3.2. jnlp连接
      • 7.2.5. 参考

1. 概述

Jenkins作为开源的自动化CICD工具,自横空出世以来,为企业提供了稳定的CICD基础服务,也提供了丰富多彩的插件,用以包罗生态内各种软件、提供大量方面的快捷操作等。
云原生自出世以来大行其道,对计算服务提供了高效便捷的编排管理,成为当前计算服务生命周期管理方案的不二之选。Jenkins的软件特性与云原生相得益彰:

  • Jenkins可以使用k8s的pod作为slave,达成了云计算按需分配的特点
  • Jenkins可以通过k8s部署,实现高可用、与生命周期管理,提高服务交付效率、降低维护成本

但Jenkins本身是一个JAVA应用,且庞大,在和K8S集成过程中、大规模使用上仍有不少挑战, 下面从部署、集成、运行、构建、落地规划几个方面记录下Jenkins、Jenkins和K8s在企业实践中遇到的比较有意思的问题。

2. Jenkins部署

2.1. Jenkins容器镜像构建

  • custom-war-packager 定制Jenkins WAR包
  • jenkins-formulas 通过formula.yaml定制自己的jenkins,针对中国区优化
  • formula kubersphere定制的jenkins镜像,方便好用

定制镜像时,需要预制几个服务

  • blueocean 提供了jenkins的restful api
  • kubernetes 提供了动态slave能力
  • 其他常用服务

2.2. 部署

helm-charts

部署时最具多样性的操作是修改 JCasC(Jenkins Configuration as Code), JCasC为Jenkins提供了配置热加载功能、一切服务皆可通过JCasC配置,例如

  • 配置Cloud,自动对接k8s集群作为动态slave、自定义pod template
  • 配置sonarqube对接
  • 初始化jdk工具
  • 初始化maven工具
  • 初始化git工具

还可以通过修改configmap的挂载来动态修改Maven settings.xml,也可以通过groovy脚本进行用户名、密码、RBAC、邮箱等的初始化,这里的大部分现在可以通过JCasC做,但有一些高级功能仍需要这种传统方式。

3. 其他组件集成

整套DevOps系统要落地的话,从代码存储到容器镜像存储、代码扫描工具、maven仓库等等,需要在部署的时候集成gitlab、sonarqube、harbor、nexus, 这里除了sonarqube,其他的集成以helm charts形式集成部署即可

3.1. sonarqube

sonarqube不仅需要部署即成,还需要在jenkins中添加集成配置,并安装sonar-scanner, 安装sonar-scanner可以通过groovy初始化脚本,也可以通过JCasC,我们采用JCasC方式比较方便,配置如下

    unclassified:
      sonarGlobalConfiguration:
        buildWrapperEnabled: false
        installations:
        - name: "sonar01"
          serverUrl: "http://sonarqube:9000" # helm部署集成之后sonarqube的svc名称和端口,如果有命名空间的话需要加上命名空间
          credentialsId: token-sonarqube # 等部署完之后创建
          triggers:
            skipScmCause: false
            skipUpstreamCause: false
    tool:
      sonarRunnerInstallation:
        installations:
        - name: "sonar-scanner"
          properties:
          - installSource:
              installers:
              - sonarRunnerInstaller:
                  id: "4.2.0.1873"

3.2. gitlab

集成gitlab的时候需要考虑到部分场景可能没有域名,只能使用IP+PORT,且ssh+http不同,需要注意相关配置和环境变量

# 环境变量 http
EXTERNAL_URL
# 配置 ssh
gitlab_rails['gitlab_shell_ssh_port'] = 30750

3.3. harbor

helm-chart

helm安装harbor遇到耗时比较长的有两个问题:

  • externalURL 这个值需要配置成需要访问的url,否则docker无法login
  • 如果使用 nginx/haproxy(https) + ingress(http) + svc(http) 的方式则需要修改 configmap harbor-registryrelativeurls: true 否则docker 无法push

4. 构建运行时

4.1. 动态slave

使用k8s动态slave创建容器,首先需要Jenkins对接K8s,这里会用到kubernetes插件,实际操作为配置cloud,使用k8s serviceacount去连接k8s,这这里分为两步

  • 使用groovy脚本在Jenkins上创建secret
// cat /var/jenkins_home/init.groovy.d/initK8sCredentials.groovy
import com.clouds.plugins.credentials.CredentialsScope
import com.clouds.plugins.credentials.SystemCredentialsProvider
import com.clouds.plugins.credentials.domains.Domain
import org.csanchez.jenkins.plugins.kubernetes.ServiceAccountCredential

def addKubeCredential(String credentialId) {
  def kubeCredential = new ServiceAccountCredential(CredentialsScope.GLOBAL, credentialId, 'Kubernetes service account')
  SystemCredentialsProvider.getInstance().getStore().addCredentials(Domain.global(), kubeCredential)
  • 使用JCasC进行Cloud配置
jenkins:
  mode: EXCLUSIVE
  numExecutors: 0
  scmCheckoutRetryCount: 2
  disableRememberMe: true

  clouds:
    - kubernetes:
        name: "kubernetes"
        serverUrl: "https://kubernetes.default"
        skipTlsVerify: true
        namespace: "jenkins-worker" # worker pod的命名空间
        credentialsId: "k8s-service-account"
        jenkinsUrl: "http://jenkins-jenkins.kubesphere-devops-system:80" // jenkins svc地址
        webSocket: true # master slave通信使用websocket协议
        containerCapStr: "200"
        connectTimeout: "60"
        readTimeout: "60"
        maxRequestsPerHostStr: "32"
        templates:
          - name: "base"
            namespace: "jenkins-worker"
            label: "base"
            nodeUsageMode: "NORMAL"
            idleMinutes: 0
            containers:
            - name: "base"
              image: "harbor.com/kubesphere/builder-base:v3.2.2"
              command: "cat"
              args: ""
              ttyEnabled: true
              privileged: false
              resourceRequestCpu: "100m"
              resourceLimitCpu: "4000m"
              resourceRequestMemory: "100Mi"
              resourceLimitMemory: "4096Mi"
            - name: "jnlp"
              image: "reg.harbor.com/jenkins/inbound-agent:4.10-3"
              args: "^${computer.jnlpmac} ^${computer.name}"
              resourceRequestCpu: "50m"
              resourceLimitCpu: "4000m"
              resourceRequestMemory: "400Mi"
              resourceLimitMemory: "4000Mi"
            workspaceVolume:
              emptyDirWorkspaceVolume:
                memory: false
            volumes:
            - hostPathVolume:
                hostPath: "/var/run/docker.sock"
                mountPath: "/var/run/docker.sock"
            - hostPathVolume:
                hostPath: "/var/data/jenkins_sonar_cache"
                mountPath: "/root/.sonar/cache"
            - hostPathVolume:
                hostPath: "/usr/bin/docker"
                mountPath: "/usr/bin/docker"
            yaml: |
              spec:
                affinity:
                  nodeAffinity:
                    preferredDuringSchedulingIgnoredDuringExecution:
                    - weight: 1
                      preference:
                        matchExpressions:
                        - key: node-role.kubernetes.io/worker
                          operator: In
                          values:
                          - ci
                tolerations:
                - key: "node.kubernetes.io/ci"
                  operator: "Exists"
                  effect: "NoSchedule"
                - key: "node.kubernetes.io/ci"
                  operator: "Exists"
                  effect: "PreferNoSchedule"
                containers:
                - name: "base"
                  resources:
                    requests:
                      ephemeral-storage: "1Gi"
                    limits:
                      ephemeral-storage: "10Gi"
                securityContext:
                  fsGroup: 1000

这里需要挂载 /usr/bin/docker.sock 目录, 使用DID构建容器镜像

4.2. 定制化镜像

针对不同的语言需要有不同的镜像,例如go、java、python、nodejs等等,对于不同的语言版本也需要不同的镜像,对于不同的jenkins任务也需要不同的镜像,例如需要在镜像中运行git命令,则需要安装git等等,除此之外还需要有一个agent镜像用于和master通信, 对于不同的镜像需要不同的Dockerfile, kubersphere

devops-agent

4.3. pod启动加速

jenkins动态slave从发起流水线请求到创建pod再到执行任务默认耗时很长,经过分析主要在以下几个地方:

  • pod启动 – 有遇到由于etcd时间偏移导致启动慢的问题,校准时间
  • agent启动 – 调整pod limit增大cpu内存
  • agent和master建立联系, 使用jnlp协议需要9-14s,使用websocket仅需2s
  • pod池 – pod在运行完任务之后不删除,等有相同配置的pod需要的时候运行在存量pod之上,限制存活pod数量,超过一定时间回收pod

这部分参考 附录2: k9s动态slave pod启动优化

5. CI构建

5.1. 构建改造

  • 改造两阶段构建的Dockerfile,将二进制构建和dockerfile构建分开,这样可以使用缓存加速,利用语言特性进行交叉编译等等

5.2. 缓存

缓存,使用挂载共享存储的方式,对不同语言挂载不同缓存目录, 可以设置项目隔离缓存与集群隔离缓存多维度

nodejs npm缓存
/root/.yarn
/root/.npm
maven maven仓库
/root/.m2
golang golang缓存
/home/jenkins/go/pkg
pip缓存
/root/.cache/pip
/root/.local/share/virtualenvs
sonar缓存
/root/.sonar/cache
docker

5.3. DID

需要在容器中挂载母机的 docker.socket, 需要注意权限文件权限,可在母机的docker启动文件 /usr/lib/systemd/system/docker.service 编辑如下

ExecStartPost=/bin/chmod 666 /var/run/docker.sock

5.4. buildx

使用buildx可以达到编译多架构镜像的功能

需要安装buildx之后,安装buildx见附录1: buildx安装配置,挂载buildx目录到docker容器

/usr/libexec/docker

5.5. DOCKER_BUILDKIT

docker build时设置 DOCKER_BUILDKIT=1 可以加速

6. 落地规划

  • etcd 使用ssd
  • docker目录使用ssd
  • 连接共享存储集群的网卡选择万兆

7. 附录

7.1. 附录1: buildx安装配置

7.1.1. 前提

https://www.docker.com/blog/multi-platform-docker-builds/
需要提前安装buildx
需要检查base 镜像是否也是多架构=

7.1.2. 安装

wget https://github.com/docker/buildx/releases/download/v0.9.1/buildx-v0.9.1.linux-amd64
mkdir -p  /usr/libexec/docker/cli-plugins/
chmod +x buildx-v0.9.1.linux-amd64
mv buildx-v0.9.1.linux-amd64 /usr/libexec/docker/cli-plugins/docker-buildx

7.1.3. 交叉编译支持

  • 升级操作系统内核到4.8以上
  • docker启用 experimental
  • 使用qemu进行架构仿真,否则不支持除本机架构以外的架构

使用binfmt镜像时,在x86平台上构建arm64的nodejs项目会报错,配置 qemu-user-static 可解决

docker run --privileged --rm docker/binfmt:a7996909642ee92942dcd6cff44b9b95f08dad64
docker run --rm --privileged multiarch/qemu-user-static --reset -p yes

7.1.4. 创建新的builder

docker buildx create --name mybuilder --driver-opt image=moby/buildkit:master,network=host --use

7.1.5. 验证

docker buildx ls

7.1.6. 构建命令

# --load 参数只适用于单个镜像,将oci load成本地docker镜像
# --push 参数适用于单个/多个镜像,将oci push到远端仓库
# x86
docker buildx build --platform linux/amd64 \
  -f build/mysql/Dockerfile -t reg.myregistry.com/my project/mysql-amd64:v1.3.2.2.gd75a3c0 . --load
docker push reg.myregistry.com/myproject/mysql-amd64:v1.3.2.2.gd75a3c0
# arm
docker buildx build --platform linux/arm64 \
  -f build/mysql/Dockerfile -t reg.myregistry.com/myproject/mysql-arm64:v1.3.2.2.gd75a3c0 . --load
docker push reg.myregistry.com/myproject/mysql-arm64:v1.3.2.2.gd75a3c0

# create manifest
docker manifest create reg.myregistry.com/myproject/mysql:v1.3.2.2.gd75a3c0 \
  reg.myregistry.com/myproject/mysql-arm64:v1.3.2.2.gd75a3c0 \
  reg.myregistry.com/myproject/mysql-amd64:v1.3.2.2.gd75a3c0

7.1.7. 构建脚本

# 抽象循环后的脚本
PLATFORMLIST="linux/amd64 linux/arm64 "
REGISTRY="reg.harbor.com"
REGISTRY_NAMESPACE="base"
TAG=$(git describe --dirty --always --tags | sed 's/-/./g')
SERVICENAME="storage-base"
DOCKERFILE_PATH="build/storage-base/Dockerfile"
MANIFEST_LIST="docker manifest create $REGISTRY/$REGISTRY_NAMESPACE/$SERVICENAME:$TAG"
for i in $PLATFORMLIST;
do {
  ARCH=$(echo $i|awk -F '/' '{print $2}')
  DOCKER_BUILDKIT=1 docker buildx build --progress plain --platform $i \
    -f $DOCKERFILE_PATH -t $REGISTRY/$REGISTRY_NAMESPACE/$SERVICENAME-$ARCH:$TAG . --load
  docker push $REGISTRY/$REGISTRY_NAMESPACE/$SERVICENAME-$ARCH:$TAG
}&
done

wait

for i in $PLATFORMLIST;
do {
  ARCH=$(echo $i|awk -F '/' '{print $2}')
  MANIFEST_LIST+=" $REGISTRY/$REGISTRY_NAMESPACE/$SERVICENAME-$ARCH:$TAG"
  echo $MANIFEST_LIST
}
done

eval $MANIFEST_LIST
docker manifest push $REGISTRY/$REGISTRY_NAMESPACE/$SERVICENAME:$TAG

7.1.8. 常见问题及解决方式

7.1.8.1. 编译arm64架构的nodejs项目失败

7.1.8.1.0.1. 现象
Unknown host QEMU_IFLA type: 47
Unknown host QEMU_IFLA type: 48
Unknown host QEMU_IFLA type: 43
#12 89.59 warning compression-webpack-plugin > cacache > @npmcli/move-file@1.1.2: This functionality has n moved to @npmcli/fs
#12 95.21 [2/4] Fetching packages...
#12 128.0 Unknown QEMU_IFLA_INFO_KIND ipip
#12 128.1 info There appears to be trouble with your network connection. Retrying...
#12 162.6 Unknown QEMU_IFLA_INFO_KIND ipip
#12 162.6 info There appears to be trouble with your network connection. Retrying...
#12 196.2 Unknown QEMU_IFLA_INFO_KIND ipip
#12 196.2 info There appears to be trouble with your network connection. Retrying...
#12 230.2 Unknown QEMU_IFLA_INFO_KIND ipip
#12 230.2 info There appears to be trouble with your network connection. Retrying...
#12 263.6 Unknown QEMU_IFLA_INFO_KIND ipip
#12 266.1 error An unexpected error occurred: "http://nexus.myregistry.cn/repository/lls-npm/link-ui-web/-/link-ui-web-2.1.7.tgz: ESOCKETTIMEDOUT".
#12 266.1 info If you think this is a bug, please open a bug report with the information provided in "/opt/yarn-error.log".
#12 266.1 info Visit https://yarnpkg.com/en/docs/cli/install for documentation about this command.
#12 ERROR: executor failed running [/bin/sh -c yarn install]: exit code: 1
------
 > [web-builder 4/6] RUN yarn install:
#12 162.6 Unknown QEMU_IFLA_INFO_KIND ipip
#12 162.6 info There appears to be trouble with your network connection. Retrying...
#12 196.2 Unknown QEMU_IFLA_INFO_KIND ipip
#12 196.2 info There appears to be trouble with your network connection. Retrying...
#12 230.2 Unknown QEMU_IFLA_INFO_KIND ipip
#12 230.2 info There appears to be trouble with your network connection. Retrying...
#12 263.6 Unknown QEMU_IFLA_INFO_KIND ipip
7.1.8.1.0.2. 解决
docker run --rm --privileged multiarch/qemu-user-static --reset -p yes

参照 emu-user-static issue

7.1.8.1.1. 参考
  • qemu-user-static
  • multi-platform-docker-builds
  • binfmt
  • how-to-upgrade-kernel-centos

7.2. 附录2: k9s动态slave pod启动优化

7.2.1. 概述

jenkins 使用 k8s 作为动态slave,使用jnlp进行slave和master之间的通信,从容器启动到拉取代码需要25s,体验较差

Jenkins 的调度模式是一种尽量能够节省资源的方式进行调度的。在一般的调度过程中 Jenkins 需要经历以下几个阶段: 检查有没有可用的 agent -> 如果没有的可用的agent,计算是否有 agent 预计将要运行完任务 -> 等待一段时间-> 启动动态的 agent -> agent 与 master建立连接。 这种方式使 CI/CD 任务被执行的太慢,我们往往都需要等待几十秒甚至更长时间来准备 CI/CD的执行环境。

7.2.2. 环境说明

组件参数版本备注
CPU芯片架构amd64
服务器配置3 * 8C16G500G
操作系统版本CentOS Linux release 7.8.2003 (Core)
内核5.4.224-1.el7.elrepo.x86_64
K8sv1.22.10
K8s网络模式calico
jenkins版本2.319.1
jnlp agent版本4.10-3

7.2.3. 时间分析

  • 点击界面到jenkins打印日志2s
  • 容器到running状态3s
  • jnlp连接通信17s
  • 拉取代码3s

7.2.4. 性能优化

7.2.4.1. etcd

k8s的多个服务不定时在running和terminal之间切换频繁,查看etcd日志,有使用偏移较大的报错,校准始终之后现象没有改变,直到重启etcd解决,这是导致偶尔容器启动慢甚至启动失败的原因。

日志

Nov 30 14:22:50 node-76 etcd[1037]: rejected connection from "10.10.10.158:53604" (error "EOF", ServerName "")
Nov 30 14:22:50 node-76 etcd[1037]: rejected connection from "10.10.10.76:40270" (error "EOF", ServerName "")
Nov 30 14:22:50 node-76 etcd[1037]: rejected connection from "10.10.10.158:51594" (error "EOF", ServerName "")
Nov 30 14:22:50 node-76 etcd[1037]: rejected connection from "10.10.10.85:35032" (error "EOF", ServerName "")
Nov 30 14:22:50 node-76 etcd[1037]: rejected connection from "10.10.10.85:47684" (error "EOF", ServerName "")
Nov 30 14:22:50 node-76 etcd[1037]: rejected connection from "10.10.10.76:40182" (error "EOF", ServerName "")
Nov 30 14:22:50 node-76 etcd[1037]: rejected connection from "10.10.10.85:47604" (error "EOF", ServerName "")
Nov 30 14:22:50 node-76 etcd[1037]: rejected connection from "10.10.10.85:35028" (error "EOF", ServerName "")
Nov 30 14:22:51 node-76 etcd[1037]: rejected connection from "10.10.10.76:40130" (error "EOF", ServerName "")
Nov 30 14:22:51 node-76 etcd[1037]: rejected connection from "10.10.10.76:40226" (error "EOF", ServerName "")
Nov 30 14:22:51 node-76 etcd[1037]: rejected connection from "10.10.10.76:40246" (error "EOF", ServerName "")
Nov 30 14:22:51 node-76 etcd[1037]: rejected connection from "10.10.10.76:40230" (error "EOF", ServerName "")
Nov 30 14:22:51 node-76 etcd[1037]: rejected connection from "10.10.10.158:53668" (error "EOF", ServerName "")
Nov 30 14:22:51 node-76 etcd[1037]: rejected connection from "10.10.10.85:34996" (error "EOF", ServerName "")
Nov 30 14:22:51 node-76 etcd[1037]: rejected connection from "10.10.10.76:40258" (error "EOF", ServerName "")
Nov 30 14:22:51 node-76 etcd[1037]: raft2022/11/30 14:22:51 INFO: 496af9c69679c08a is starting a new election at term 1114
Nov 30 14:22:51 node-76 etcd[1037]: raft2022/11/30 14:22:51 INFO: 496af9c69679c08a became candidate at term 1115
Nov 30 14:22:51 node-76 etcd[1037]: raft2022/11/30 14:22:51 INFO: 496af9c69679c08a received MsgVoteResp from 496af9c69679c08a at term 1115
Nov 30 14:22:51 node-76 etcd[1037]: raft2022/11/30 14:22:51 INFO: 496af9c69679c08a [logterm: 9, index: 18853153] sent MsgVote request to 8e9e6d1fe6ff301d at term 1115
Nov 30 14:22:51 node-76 etcd[1037]: raft2022/11/30 14:22:51 INFO: 496af9c69679c08a [logterm: 9, index: 18853153] sent MsgVote request to dc6ed8a281b6886b at term 1115
Nov 30 14:22:52 node-76 etcd[1037]: read-only range request "key:\"/registry/health\" " with result "error:context deadline exceeded" took too long (2.000157124s) to execute
Nov 30 14:22:55 node-76 etcd[1037]: health check for peer dc6ed8a281b6886b could not connect: dial tcp 10.10.10.85:2380: i/o timeout
Nov 30 14:22:55 node-76 etcd[1037]: health check for peer dc6ed8a281b6886b could not connect: dial tcp 10.10.10.85:2380: i/o timeout
Nov 30 14:22:55 node-76 etcd[1037]: the clock difference against peer dc6ed8a281b6886b is too high [1m40.949394882s > 1s]
Nov 30 14:22:55 node-76 etcd[1037]: health check for peer 8e9e6d1fe6ff301d could not connect: dial tcp 10.10.10.158:2380: i/o timeout
Nov 30 14:22:55 node-76 etcd[1037]: health check for peer 8e9e6d1fe6ff301d could not connect: dial tcp 10.10.10.158:2380: i/o timeout
Nov 30 14:22:55 node-76 etcd[1037]: the clock difference against peer 8e9e6d1fe6ff301d is too high [1m35.440412268s > 1s]
Nov 30 14:22:55 node-76 etcd[1037]: raft2022/11/30 14:22:55 INFO: 496af9c69679c08a [term: 1115] ignored a MsgVoteResp message with lower term from dc6ed8a281b6886b [term: 1100]
Nov 30 14:22:58 node-76 etcd[1037]: raft2022/11/30 14:22:58 INFO: 496af9c69679c08a is starting a new election at term 1115
Nov 30 14:22:58 node-76 etcd[1037]: raft2022/11/30 14:22:58 INFO: 496af9c69679c08a became candidate at term 1116
Nov 30 14:22:58 node-76 etcd[1037]: raft2022/11/30 14:22:58 INFO: 496af9c69679c08a received MsgVoteResp from 496af9c69679c08a at term 1116
Nov 30 14:22:58 node-76 etcd[1037]: raft2022/11/30 14:22:58 INFO: 496af9c69679c08a [logterm: 9, index: 18853153] sent MsgVote request to 8e9e:

7.2.4.2. 启动链分析

  • jenkins接口到jenkins slave处理模块的2s – 优化空间小,难度高
  • k8s启动容器3s – 优化空间小,难度高
  • jenkins slave模块到agent通信的17s – 优化空间大
    • jenkins启动的pod都有两个容器,其中一个是用于与master通信的jnlp容器,jnlp容器主要做的事
      • 启动shell脚本
      • 启动java程序
      • 与服务端交互 – 模拟jnlp java程序与服务端建立交互,需要5s
      • 交互成功

其中,shell脚本比较简单,耗时比较少,剩下的就是java程序

/opt/java/openjdk/bin/java -cp /usr/share/jenkins/agent.jar hudson.remoting.jnlp.Main -headless -tunnel jenkins-jenkins-agent.kubesphere-devops-system:50000 -url http://jenkins-jenkins.kubesphere-devops-system:80/ -workDir /home/jenkins/agent 158e8612dcb51a9266d8579642ff4eb73a419967235729bb2e8294e0d5f80c5c base-hbrw8

从CPU、内存、磁盘IO分析,所用占系统总体都不高,Jenkins主程序经常由于处理很多并发任务,占用CPU资源比较多,一般应用启动瞬间,需要做很多检查、建立很多连接,所以耗计算资源较多,检查jnlp容器的资源确实limit很小,调整limit到4c8g之后有明显改观,整个过程从原来的25s降低到8s,整体体验上快了三倍。

7.2.4.3. Jenkins remoting协议

  • JNLP4-connect – 当前Jenkins slave通信使用的主流协议
    • 连接方式 – 服务端启动一个端口,等agent端回掉
  • WebSocket – 推荐使用的协议,简单、高效
    • 连接方式 – websocket
  • 插件式协议 remoting kafka plugin – 使用kafka通信
7.2.4.3.1. jenkinsfile
pipeline {
agent {
    node {
        label 'base'
    }
}

    stages {
        stage('Hello') {
            steps {
                echo 'Hello World'
            }
        }
    }
}
7.2.4.3.2. 服务端日志
7.2.4.3.2.1. websocket模式

jenkins container log

pod从创建到执行任务再到销毁需要 9s, slave+echo hello world通信需要2s

  • 从jenkins接收到请求到检索label
  • 发送创建pod请求
  • 创建pod到podrunning
  • slave建立连接进行通信
2022-12-05 07:57:28.263+0000 [id=92860]	INFO	i.j.p.g.event.HttpEventSender#send: Skipped event sending due to receiver URL not set
2022-12-05 07:57:28.265+0000 [id=92860]	INFO	i.j.p.g.event.HttpEventSender#send: Skipped event sending due to receiver URL not set
2022-12-05 07:57:33.249+0000 [id=48]	INFO	hudson.slaves.NodeProvisioner#update: base-hdn7j provisioning successfully completed. We have now 8 computer(s)
2022-12-05 07:57:33.285+0000 [id=92845]	INFO	o.c.j.p.k.KubernetesLauncher#launch: Created Pod: kubernetes kubesphere-devops-worker/base-hdn7j
2022-12-05 07:57:35.646+0000 [id=92845]	INFO	o.c.j.p.k.KubernetesLauncher#launch: Pod is running: kubernetes kubesphere-devops-worker/base-hdn7j
2022-12-05 07:57:37.227+0000 [id=92833]	INFO	o.c.j.p.k.KubernetesSlave#_terminate: Terminating Kubernetes instance for agent base-hdn7j
2022-12-05 07:57:37.261+0000 [id=92834]	INFO	o.j.p.workflow.job.WorkflowRun#finish: wanggangfeng-test01/helloworld #19 completed: SUCCESS
7.2.4.3.2.2. jnlp模式

jenkins container log

pod从创建到执行任务再到销毁需要 14s slave通信+echo hello world需要6s

  • 从jenkins接收到请求到检索label 6s 更换docker磁盘 加大内存 cpu 4s
  • 发送创建pod请求 18ms
  • 创建pod到podrunning 2s 2s
  • slave建立连接进行通信 6s 减少为1s通过更换协议 前面增大limit前14s
2022-12-05 05:53:07.316+0000 [id=90051]	INFO	i.j.p.g.event.HttpEv entSender#send: Skipped event sending due to receiver URL not set
2022-12-05 05:53:13.255+0000 [id=55]	INFO	hudson.slaves.NodeProvisioner#update: base-1pfs8 provisioning successfully completed. We have now 8 computer(s)
2022-12-05 05:53:13.273+0000 [id=89997]	INFO	o.c.j.p.k.KubernetesLauncher#launch: Created Pod: kubernetes kubesphere-devops-worker/base-1pfs8
2022-12-05 05:53:15.777+0000 [id=90058]	INFO	h.TcpSlaveAgentListener$ConnectionHandler#run: Accepted JNLP4-connect connection #302 from /10.222.122.126:39776
2022-12-05 05:53:15.913+0000 [id=89997]	INFO	o.c.j.p.k.KubernetesLauncher#launch: Pod is running: kubernetes kubesphere-devops-worker/base-1pfs8
2022-12-05 05:53:21.261+0000 [id=89988]	INFO	o.c.j.p.k.KubernetesSlave#_terminate: Terminating Kubernetes instance for agent base-1pfs8
2022-12-05 05:53:21.283+0000 [id=89989]	INFO	o.j.p.workflow.job.WorkflowRun#finish: wanggangfeng-test01/helloworld #2 completed: SUCCESS
7.2.4.3.3. 客户端agent日志
7.2.4.3.3.1. websocke连接
Warning: SECRET is defined twice in command-line arguments and the environment variable
Warning: AGENT_NAME is defined twice in command-line arguments and the environment variable
Dec 02, 2022 9:17:36 AM hudson.remoting.jnlp.Main createEngine
INFO: Setting up agent: base1-qxbh2
Dec 02, 2022 9:17:36 AM hudson.remoting.jnlp.Main$CuiListener <init>
INFO: Jenkins agent is running in headless mode.
Dec 02, 2022 9:17:36 AM hudson.remoting.Engine startEngine
INFO: Using Remoting version: 4.10
Dec 02, 2022 9:17:36 AM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
INFO: Using /home/jenkins/agent/remoting as a remoting work directory
Dec 02, 2022 9:17:36 AM org.jenkinsci.remoting.engine.WorkDirManager setupLogging
INFO: Both error and output logs will be printed to /home/jenkins/agent/remoting
Dec 02, 2022 9:17:37 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: WebSocket connection open
Dec 02, 2022 9:17:38 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connected
Dec 02, 2022 9:17:42 AM org.csanchez.jenkins.plugins.kubernetes.KubernetesSlave$SlaveDisconnector call
INFO: Disabled agent engine reconnects.
7.2.4.3.3.2. jnlp连接
Warning: SECRET is defined twice in command-line arguments and the environment variable
Warning: AGENT_NAME is defined twice in command-line arguments and the environment variable
Dec 02, 2022 9:21:04 AM hudson.remoting.jnlp.Main createEngine
INFO: Setting up agent: base-xjlhx
Dec 02, 2022 9:21:05 AM hudson.remoting.jnlp.Main$CuiListener <init>
INFO: Jenkins agent is running in headless mode.
Dec 02, 2022 9:21:05 AM hudson.remoting.Engine startEngine
INFO: Using Remoting version: 4.10
Dec 02, 2022 9:21:05 AM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
INFO: Using /home/jenkins/agent/remoting as a remoting work directory
Dec 02, 2022 9:21:05 AM org.jenkinsci.remoting.engine.WorkDirManager setupLogging
INFO: Both error and output logs will be printed to /home/jenkins/agent/remoting
Dec 02, 2022 9:21:05 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Locating server among [http://jenkins-jenkins.kubesphere-devops-system:80/]
Dec 02, 2022 9:21:05 AM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve
INFO: Remoting server accepts the following protocols: [JNLP4-connect, Ping]
Dec 02, 2022 9:21:05 AM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve
INFO: Remoting TCP connection tunneling is enabled. Skipping the TCP Agent Listener Port availability check
Dec 02, 2022 9:21:05 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Agent discovery successful
  Agent address: jenkins-jenkins-agent.kubesphere-devops-system
  Agent port:    50000
  Identity:      f0:b3:19:11:68:7e:a7:9f:5c:0d:32:d9:f0:30:d2:45
Dec 02, 2022 9:21:05 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Handshaking
Dec 02, 2022 9:21:05 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connecting to jenkins-jenkins-agent.kubesphere-devops-system:50000
Dec 02, 2022 9:21:05 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Trying protocol: JNLP4-connect
Dec 02, 2022 9:21:06 AM org.jenkinsci.remoting.protocol.impl.BIONetworkLayer$Reader run
INFO: Waiting for ProtocolStack to start.
Dec 02, 2022 9:21:10 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Remote identity confirmed: f0:b3:19:11:68:7e:a7:9f:5c:0d:32:d9:f0:30:d2:45
Dec 02, 2022 9:21:10 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connected
Dec 02, 2022 9:21:15 AM org.csanchez.jenkins.plugins.kubernetes.KubernetesSlave$SlaveDisconnector call
INFO: Disabled agent engine reconnects.

7.2.5. 参考

  • remoting-4.14
  • configure-ports-jnlp-agents

相关文章:

  • 自己创建公司网站/廊坊seo推广公司
  • 做网站必须要文网文吗/seo软件排行榜前十名
  • 微信开发者平台文档/谷歌优化推广
  • 七星互联免费主机/汕头网站建设优化
  • 中介房产cms/seo学校
  • 技术支持-鼎维重庆网站建设专家/电子商务是干什么的
  • Python爬虫教你爬取csdn作者排行榜
  • OpenHarmony如何切换横竖屏?
  • 234. 回文链表
  • 2023中职网络安全技能竞赛新题
  • Bootstrap踩坑笔记(记录Bootstrap当中的核心知识点)
  • 计算机网络学习笔记(五)网络层 - 控制层面
  • 【记那些年我们链不明白的青春】Cmake常用函数一文总结
  • go 网络请求包 resty
  • 算法编程(美团)
  • 一次Navicat Premium 15(绿色) 命令页面的测试
  • 模糊图像检测(c++)
  • 合宙ESP32S3 CameraWebServe 测试demo