14 May 2018
Detailed descriptions of each component can be found in the official documentation; this post only lists the components chosen for a standard production setup.

- filebeat, log collector
- redis, message queue
- logstash, log analysis & filter
- elasticsearch, storage & search
- kibana, web UI

Service layout
| Server IP | Role | Version |
|---|---|---|
| 192.168.86.24 | tomcat & filebeat | 6.2.4 |
| 192.168.86.105 | tomcat & filebeat | 6.2.4 |
| 192.168.86.138 | redis | 3.2.4 |
| 192.168.86.130 | logstash | 6.2.4 |
| 192.168.100.68 | elasticsearch | 6.2.4 |
| 192.168.100.68 | kibana | 6.2.4 |
Run the following commands on the elasticsearch node.
System tuning
```
# Disable swap
sudo swapoff -a
# To disable it permanently, edit /etc/fstab and comment out the line containing the swap keyword

# Virtual memory
sysctl -w vm.max_map_count=262144
# Alternatively, set vm.max_map_count in /etc/sysctl.conf to make it persistent
```
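To confirm both settings took effect, two quick checks with standard Linux tooling:
```
# lists active swap devices; empty output means swap is off
swapon -s
# should print vm.max_map_count = 262144
sysctl vm.max_map_count
```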
If the service is hosted by systemd, the three ulimit options below can instead be configured in the systemd unit file.
```
# Increase the open-file limit
ulimit -n 65536
# Or edit /etc/security/limits.conf and set nofile to 65536

# Number of processes/threads
ulimit -u 4096
# Or edit /etc/security/limits.conf and set nproc to 4096

# Locked-memory limit
ulimit -l unlimited
# Or edit /etc/security/limits.conf and set memlock to unlimited
```
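For reference, a minimal /etc/security/limits.conf sketch covering all three limits; the elasticsearch user name assumes the account created in the User step below:
```
elasticsearch soft nofile  65536
elasticsearch hard nofile  65536
elasticsearch soft nproc   4096
elasticsearch hard nproc   4096
elasticsearch soft memlock unlimited
elasticsearch hard memlock unlimited
```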
User
```
groupadd elasticsearch
useradd -g elasticsearch elasticsearch
```
Installation
```
cd /usr/local/src
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.2.4.tar.gz
tar zxvf elasticsearch-6.2.4.tar.gz
mv elasticsearch-6.2.4 /usr/local/elasticsearch
mkdir /usr/local/elasticsearch/var
chown -R elasticsearch.elasticsearch /usr/local/elasticsearch
```
Configure the systemd unit file
```
echo '[Unit]
Description=Elasticsearch
Documentation=http://www.elastic.co
Wants=network-online.target
After=network-online.target

[Service]
RuntimeDirectory=elasticsearch
Environment=JAVA_HOME=/usr/java/jdk1.8.0_144
Environment=JRE_HOME=/usr/java/jdk1.8.0_144/jre
Environment=ES_HOME=/usr/local/elasticsearch
Environment=ES_PATH_CONF=/usr/local/elasticsearch/config
Environment=PID_DIR=/usr/local/elasticsearch/var
EnvironmentFile=-/usr/local/elasticsearch/config/default/elasticsearch

WorkingDirectory=/usr/local/elasticsearch

User=elasticsearch
Group=elasticsearch

ExecStart=/usr/local/elasticsearch/bin/elasticsearch -p ${PID_DIR}/elasticsearch.pid --quiet

# StandardOutput is configured to redirect to journalctl since
# some error messages may be logged in standard output before
# elasticsearch logging system is initialized. Elasticsearch
# stores its logs in /var/log/elasticsearch and does not use
# journalctl by default. If you also want to enable journalctl
# logging, you can simply remove the "quiet" option from ExecStart.
StandardOutput=journal
StandardError=inherit

# Specifies the maximum file descriptor number that can be opened by this process
LimitNOFILE=65536

# Specifies the maximum number of processes
LimitNPROC=4096

# Specifies the maximum amount of memory that may be locked
LimitMEMLOCK=infinity

# Specifies the maximum size of virtual memory
LimitAS=infinity

# Specifies the maximum file size
LimitFSIZE=infinity

# Disable timeout logic and wait until process is stopped
TimeoutStopSec=0

# SIGTERM signal is used to stop the Java process
KillSignal=SIGTERM

# Send the signal only to the JVM rather than its control group
KillMode=process

# Java process is never killed
SendSIGKILL=no

# When a JVM receives a SIGTERM signal it exits with code 143
SuccessExitStatus=143

[Install]
WantedBy=multi-user.target

# Built for ${project.name}-${project.version} (${project.name})' > /usr/lib/systemd/system/elasticsearch.service
systemctl daemon-reload
```
Note: in the unit file above, replace JAVA_HOME with the JDK path that is actually in use locally; do not copy it verbatim.
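With the unit file installed and the configuration below in place, the service is driven through the usual systemd commands; the kibana and logstash units later in this post follow the same pattern:
```
systemctl enable elasticsearch
systemctl start elasticsearch
systemctl status elasticsearch
```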
Configure elasticsearch.yml
```
cluster.name: dc-es
node.name: node-1
path.data: /home/es-data/data
path.logs: /home/es-data/logs
bootstrap.memory_lock: true
network.host: 192.168.100.68
http.port: 9200
```
After setting bootstrap.memory_lock: true to lock the process memory, you may hit this error: `memory locking requested for elasticsearch process but memory is not locked`. The fix is to add the following two lines to /etc/security/limits.conf: `* soft memlock unlimited` and `* hard memlock unlimited`.
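Whether the lock actually took effect can be verified through the standard nodes API once the service is running, using the host from elasticsearch.yml above:
```
# "mlockall" : true means bootstrap.memory_lock is working
curl -s 'http://192.168.100.68:9200/_nodes?filter_path=**.mlockall&pretty'
```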
Set network.host to a specific IP whenever possible instead of 0.0.0.0. Once the data and log directories are configured, remember to create them and hand ownership to the elasticsearch user:
```
mkdir -p /home/es-data/{data,logs}
chown -R elasticsearch.elasticsearch /home/es-data
```
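Once the directories exist and the service has started, a quick sanity check against the HTTP port (host and port as set in elasticsearch.yml) confirms the node answers:
```
curl -s 'http://192.168.100.68:9200/_cluster/health?pretty'
```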
Configure jvm.options
```
-Xms16g
-Xmx16g
```
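As a rule of thumb from the elasticsearch documentation, give the heap at most half of the machine's RAM and keep it below the ~32 GB compressed-oops cutoff, so -Xmx16g assumes a node with at least 32 GB of RAM. Once the node is up, the effective heap can be read back through the cat API:
```
curl -s 'http://192.168.100.68:9200/_cat/nodes?v&h=name,heap.max'
```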
Run the following commands on the kibana node.
User
```
groupadd kibana
useradd -g kibana kibana
```
Installation
```
cd /usr/local/src
wget https://artifacts.elastic.co/downloads/kibana/kibana-6.2.4-linux-x86_64.tar.gz
tar -xzf kibana-6.2.4-linux-x86_64.tar.gz
mv kibana-6.2.4-linux-x86_64 /usr/local/kibana
chown -R kibana.kibana /usr/local/kibana
```
Configure systemd
```
echo '[Unit]
Description=Kibana 6

[Service]
Type=simple
User=kibana
Environment=KIBANA_HOME=/usr/local/kibana
Environment=CONFIG_PATH=/usr/local/kibana/kibana.yml
Environment=NODE_ENV=production
ExecStart=/usr/local/kibana/bin/kibana

[Install]
WantedBy=multi-user.target' > /usr/lib/systemd/system/kibana.service
systemctl daemon-reload
```
Configure kibana.yml
```
server.port: 5601
server.host: "0.0.0.0"
elasticsearch.url: "http://127.0.0.1:9200"
```
Note: change server.host to a specific IP, and point elasticsearch.url at the IP where elasticsearch is actually serving.
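Once kibana is running, its standard /api/status endpoint gives a quick health check; the IP below assumes the kibana node from the table at the top:
```
curl -s 'http://192.168.100.68:5601/api/status'
```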
Run the following commands on the logstash node.
User
```
groupadd logstash
useradd -g logstash logstash
```
Installation
```
cd /usr/local/src
wget https://artifacts.elastic.co/downloads/logstash/logstash-6.2.4.tar.gz
tar zxvf logstash-6.2.4.tar.gz
mv logstash-6.2.4 /usr/local/logstash
chown -R logstash.logstash /usr/local/logstash
mkdir /home/logstash/logs
mkdir /home/logstash/data
chown -R logstash.logstash /home/logstash
```
Configure systemd
```
echo '[Unit]
Description=Logstash
Documentation=https://www.elastic.co/products/logstash
After=network.target
#ConditionPathExists=/etc/logstash.conf

[Service]
Environment=JAVA_HOME=/usr/java/jdk1.8.0_144
Environment=JRE_HOME=${JAVA_HOME}/jre
Environment=HOME=/usr/local/logstash
ExecStart=/usr/local/logstash/bin/logstash

[Install]
WantedBy=multi-user.target' > /usr/lib/systemd/system/logstash.service
systemctl daemon-reload
```
The HOME variable works around a startup error where a dotfile-related variable does not exist.
Configure startup.options
```
# Override Java location
#JAVACMD=/usr/bin/java

# Set a home directory
LS_HOME=/usr/local/logstash
HOME=${LS_HOME}

# logstash settings directory, the path which contains logstash.yml
LS_SETTINGS_DIR="${LS_HOME}/config"

# Arguments to pass to logstash
LS_OPTS="--path.settings ${LS_SETTINGS_DIR}"

# Arguments to pass to java
LS_JAVA_OPTS=""

# pidfiles are not used the same way for upstart and systemd; this is for sysv users.
LS_PIDFILE=/var/run/logstash.pid

# user and group id to be invoked as
LS_USER=logstash
LS_GROUP=logstash

# Enable GC logging by uncommenting the appropriate lines in the GC logging
# section in jvm.options
LS_GC_LOG_FILE=/var/log/logstash/gc.log

# Open file limit
LS_OPEN_FILES=16384

# Nice level
LS_NICE=19

# Change these to have the init script named and described differently
# This is useful when running multiple instances of Logstash on the same
# physical box or vm
SERVICE_NAME="logstash"
SERVICE_DESCRIPTION="logstash"
```
This file is only consumed by $LS_HOME/bin/system-install to generate a startup script; if you are using systemd, it is not needed at all.
Configure jvm.options
```
-Xms16g
-Xmx16g
```
Configure logstash.yml
```
# ------------ Data path ------------------
path.data: /home/logstash/data

# ------------ Pipeline Settings --------------
# pipeline.id: main
# pipeline.batch.size: 125
# path.config: /usr/local/logstash/config/redis-pipelines.conf

# ------------ Pipeline Configuration Settings --------------
# config.test_and_exit: true
config.reload.automatic: true
config.reload.interval: 3s

# ------------ Debugging Settings --------------
log.level: info
path.logs: /home/logstash/logs
```
This file mainly configures logstash's startup options; any option given on the command line overrides the value set here.
Configure pipelines.yml
```
- pipeline.id: redis-pipe
  queue.type: persisted
  path.config: "/usr/local/logstash/config/redis-pipelines.conf"
```
This is the main file for the default pipeline configuration.
Configure redis-pipelines.conf
```
input {
  redis {
    data_type => "list"
    key => "filebeat-midd"
    host => "192.168.86.138"
    port => 6379
    threads => 5
    password => "my_password"
  }
}

filter {
}

output {
  elasticsearch {
    hosts => ["192.168.100.68:9200"]
    index => "%{[fields][service]}"
  }
}
```
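Before relying on automatic reload in production, the pipeline file can be syntax-checked with logstash's built-in test flag; a quick sketch using the paths from above:
```
/usr/local/logstash/bin/logstash -f /usr/local/logstash/config/redis-pipelines.conf --config.test_and_exit
```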
"%{[fields][service]}"是filebeat中的一个field,如何在logstash中使用filebeat的fieldlogstash中的input部分,redis的key是不可以模糊匹配的,只能写唯一值。 但是比较奇怪的是,好多中文甚至是英文文档里面,他们的配置举例里面都是带wildcard的类似于
test-*这种写法,针对这种情况,我在elasticsearch的官方讨论区里面找到了一个elasticsearch团队的成员的确切回答,这边只能写唯一值(warkolm Mark Walkom Elastic Team Member May 24:I believe you can only input a single key there.),链接在这里:elasticsearch的讨论区,还有这里:谷歌讨论组
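If events do arrive under several literal keys, the usual workaround is one redis block per key inside input; a sketch, where filebeat-web is a hypothetical second key:
```
input {
  redis {
    data_type => "list"
    key => "filebeat-midd"
    host => "192.168.86.138"
    port => 6379
    password => "my_password"
  }
  redis {
    data_type => "list"
    key => "filebeat-web"    # hypothetical second key
    host => "192.168.86.138"
    port => 6379
    password => "my_password"
  }
}
```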
Installation
```
cd /usr/local/src
curl -L -O https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-6.2.4-x86_64.rpm
rpm -vi filebeat-6.2.4-x86_64.rpm
```
Configuration file: /etc/filebeat/filebeat.yml
```
#=========================== Filebeat prospectors =============================
filebeat.prospectors:
- type: log
  enabled: true
  tail_files: true
  paths:
    - /home/middleservice/blog_midd/logs/catalina.out
  fields:
    service: blog_midd
  multiline.pattern: '^[[:space:]]+(at|\.{3})\b|^Caused by:'
  multiline.negate: false
  multiline.match: after

#============================= Filebeat modules ===============================
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false

#==================== Elasticsearch template setting ==========================
setup.template.settings:
  index.number_of_shards: 3

#================================ General =====================================
name: 86.24
tags: ["middleservice", "service"]

#================================ Outputs =====================================
output.redis:
  hosts: ["192.168.86.138"]
  password: "my_password"
  bulk_max_size: 1024
  key: "filebeat-midd"
  db: 0
  timeout: 5
```
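Filebeat ships with a test subcommand that validates the configuration and, for supported outputs, the connection itself; I have not verified that the redis output supports connection testing in 6.2, so treat the second command as a sketch:
```
filebeat test config -c /etc/filebeat/filebeat.yml
filebeat test output -c /etc/filebeat/filebeat.yml
```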
- fields replaces the old document_type setting
- filebeat multiline example
- `filebeat modules list` shows the enabled modules
- general config docs
- The redis output's datatype option defaults to list
- Fix for the "Failed to RPUSH to redis list with write tcp i/o timeout" error
- By default filebeat reads a log file from the beginning. If you want it to start from the end of the file, set tail_files: true on the log input, as in the configuration above. Filebeat also maintains a registry file recording how far into each log it has read; if tail_files: true is added after the fact, you must stop the filebeat service, delete the registry file, and start filebeat again for it to take effect (see the sketch below). For an rpm install, the registry file lives at /var/lib/filebeat/registry. For details see: Update the registry file
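The reset procedure described above, sketched for the rpm install with the registry path given in the note:
```
systemctl stop filebeat
rm /var/lib/filebeat/registry
systemctl start filebeat
```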