
Chapter 8. Elasticsearch

Table of Contents

8.1. Installing Elasticsearch
8.1.1. Installing Elasticsearch with a Helm Chart
8.1.2. Elastic Cloud on Kubernetes
8.1.3. Installing with docker-compose
8.1.4. Kubernetes
8.1.5. Orchestrating Elasticsearch with netkiller-devops
8.1.6. Installing a Specific Version of Elasticsearch
8.1.7. Plugin
8.1.7.1. elasticsearch-analysis-ik
8.1.7.2. elasticsearch-analysis-pinyin
8.1.8. Orchestrating Kubernetes with netkiller-devops
8.2. Document API
8.2.1. Quick Start
8.2.2. Writing with PUT/POST
8.2.3. Retrieving with GET
8.2.3.1. _source
8.2.4. Checking Whether a Record Exists
8.2.5. Delete
8.2.6. Parameters
8.2.6.1. pretty: Formatting JSON
8.3. Search
8.3.1. URL Search
8.3.2. Pagination
8.4. Query DSL
8.4.1. match
8.4.2. multi_match: Matching Multiple Fields
8.4.3. Query bool: Boolean Conditions
8.4.3.1. must
8.4.3.2. should
8.4.3.3. must_not
8.4.4. filter
8.4.5. sort
8.4.6. _source
8.4.7. highlight
8.5. Cluster Management
8.5.1. Node Health Status
8.5.2. Node HTTP Status
8.5.3. Viewing the Master Node
8.5.4. Viewing How Indices Are Distributed Across Nodes
8.5.5. Opening and Closing Indices
8.5.5.1. _open
8.5.5.2. _close
8.6. Chinese Word Segmentation Plugin Management
8.6.1. Installing the Segmentation Plugin with the elasticsearch-plugin Command
8.6.2. Installing a Plugin Manually
8.6.3. Creating an Index
8.6.4. Deleting an Index
8.6.5. Configuring the Segmentation Plugin for an Index
8.6.5.1. Testing Segmentation Results
8.7. Index Management
8.7.1. Viewing Indices
8.7.2. Deleting an Index
8.8. Mapping
8.8.1. Viewing _mapping
8.8.2. Deleting _mapping
8.8.3. Creating _mapping
8.8.4. Updating mapping
8.8.5. Modifying _mapping
8.8.6. Data Types
8.8.6.1. date
8.9. Alias Management
8.9.1. Viewing Index Aliases
8.9.2. Creating an Index Alias
8.9.3. Modifying an Alias
8.9.4. Deleting an Alias
8.10. Example
8.10.1. News Application Example
8.10.2. Article Search Example
8.11. Migrating MySQL Data into Elasticsearch using logstash
8.11.1. Installing logstash
8.11.2. Configuring logstash
8.11.3. Starting Logstash
8.11.4. Verification
8.11.5. Configuration Templates
8.11.5.1. Full Import
8.11.5.2. Multi-Table Import
8.11.5.3. Incremental Replication by ID Primary Key Field
8.11.5.4. Incremental Replication by Date Field
8.11.5.5. Specifying an SQL File
8.11.5.6. Passing Parameters
8.11.5.7. Limiting the Amount of Data Returned by JDBC
8.11.5.8. Output to Different Elasticsearch Instances
8.11.5.9. Date Format Conversion
8.11.5.10. example
8.11.6. Resolving Data Inconsistency
8.11.7. Modifying the Mapping
8.12. ElasticHD
8.13. Installing Earlier Versions of Elasticsearch
8.13.1. Installing 6.x
8.13.1.1. Discovery Configuration
8.13.1.1.1. Soon to Be Deprecated
8.13.2. Standalone Mode (for Development Environments) 5.x
8.13.3. Elasticsearch Cluster 5.x
8.13.3.1. Load Balancing Configuration
8.13.4. RPM Installation
8.13.5. YUM Installation
8.13.6. Testing the Installation
8.13.7. Plugin Management
8.13.7.1. Installing a Plugin Manually
8.13.7.2. The plugin Command
8.13.7.3. Testing the Plugin
8.14. FAQ
8.14.1. Plugin [analysis-ik] is incompatible with Elasticsearch [2.3.5]. Was designed for version [2.3.4]
8.14.2. plugin [analysis-ik] is incompatible with version [5.6.1]; was designed for version [5.5.2]
8.14.3. mapper_parsing_exception: failed to parse [ctime]
8.14.4. Configuring JAVA_HOME
8.14.5. memory locking requested for elasticsearch process but memory is not locked

http://www.elasticsearch.org/

8.1. Installing Elasticsearch

8.1.1. Installing Elasticsearch with a Helm Chart

			
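Add the Elastic Helm Chart Repo: helm repo add elastic https://helm.elastic.co
Install Elasticsearch: helm install elasticsearch elastic/elasticsearch
Install Kibana: helm install kibana elastic/kibana
Note: Helm 3 takes the release name as the first positional argument. With Helm 2, pass it through the --name flag instead (helm install --name elasticsearch elastic/elasticsearch).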
			
		

8.1.2. Elastic Cloud on Kubernetes

https://www.elastic.co/guide/en/cloud-on-k8s/current/index.html

8.1.3. Installing with docker-compose

			
version: '3.8'
services:
  es01:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.9.2
    container_name: es01
    environment:
      - node.name=es01
      - cluster.name=elasticsearch-cluster
      - bootstrap.memory_lock=true
      - discovery.seed_hosts=es01,es02,es03
      - cluster.initial_master_nodes=es01,es02,es03
      - node.master=true
      - node.data=true
      - node.ingest=false
      - ES_JAVA_OPTS=-Xms256m -Xmx256m
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - data01:/usr/share/elasticsearch/data
    ports:
      - 9200:9200
      - 9300:9300
    networks:
      - elastic
  es02:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.9.2
    container_name: es02
    environment:
      - node.name=es02
      - cluster.name=elasticsearch-cluster
      - bootstrap.memory_lock=true
      - discovery.seed_hosts=es01,es02,es03
      - cluster.initial_master_nodes=es01,es02,es03
      - node.master=true
      - node.data=true
      - node.ingest=false
      - ES_JAVA_OPTS=-Xms256m -Xmx256m
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - data02:/usr/share/elasticsearch/data
    networks:
      - elastic
    depends_on:
      - es01  
  es03:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.9.2
    container_name: es03
    environment:
      - node.name=es03
      - cluster.name=elasticsearch-cluster
      - bootstrap.memory_lock=true
      - discovery.seed_hosts=es01,es02,es03
      - cluster.initial_master_nodes=es01,es02,es03
      - node.master=true
      - node.data=true
      - node.ingest=true
      - ES_JAVA_OPTS=-Xms256m -Xmx256m
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - data03:/usr/share/elasticsearch/data
    networks:
      - elastic
    depends_on:
      - es01
  kibana:
    image: docker.elastic.co/kibana/kibana:7.9.2
    container_name: kibana
    environment:
      # SERVER_NAME: kibana.example.org
      ELASTICSEARCH_HOSTS: http://es01:9200
    ports:
      - 5601:5601
    networks:
      - elastic
    depends_on:
      - es01
volumes:
  data01:
    driver: local
  data02:
    driver: local
  data03:
    driver: local

networks:
  elastic:
    driver: bridge   			
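
All three nodes in the compose file above are master-eligible (node.master=true), and a master election needs a majority of the master-eligible nodes. A minimal sketch of that arithmetic follows; the function name is illustrative and not part of any Elasticsearch tooling.

```python
def master_quorum(master_eligible_nodes: int) -> int:
    """Return the smallest majority of the given number of master-eligible nodes."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    # A majority is strictly more than half of the nodes.
    return master_eligible_nodes // 2 + 1


if __name__ == "__main__":
    for n in (1, 3, 5):
        print(n, "master-eligible nodes -> quorum of", master_quorum(n))
```

For the three-node cluster above this gives 2, so the cluster can still elect a master after losing one node.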
			
		

View node information

			
neo@MacBook-Pro-Neo ~/workspace/docker % curl "http://localhost:9200/_cat/nodes?v&pretty"
ip         heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
172.19.0.4           48          86  35    4.86    2.90     1.84 dilmrt    *      es03
172.19.0.3           67          86  35    4.86    2.90     1.84 dlmrt     -      es02
172.19.0.2           45          86  35    4.86    2.90     1.84 dlmrt     -      es01			
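
When this check is scripted, the column layout above can be parsed directly. The sketch below assumes the whitespace-aligned format shown (a header row produced by ?v, one node per line, no empty cells); the function names are illustrative.

```python
def parse_cat_table(text):
    """Parse whitespace-aligned `_cat` output (with a ?v header row) into dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


def elected_master(rows):
    """Return the name of the node marked '*' in the master column, or None."""
    for row in rows:
        if row.get("master") == "*":
            return row["name"]
    return None


# Sample copied from the output shown above.
SAMPLE = """\
ip         heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
172.19.0.4           48          86  35    4.86    2.90     1.84 dilmrt    *      es03
172.19.0.3           67          86  35    4.86    2.90     1.84 dlmrt     -      es02
172.19.0.2           45          86  35    4.86    2.90     1.84 dlmrt     -      es01
"""

if __name__ == "__main__":
    rows = parse_cat_table(SAMPLE)
    print("elected master:", elected_master(rows))
```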
			
		

8.1.4. Kubernetes

			
apiVersion: v1
kind: ConfigMap
metadata:
  name: elasticsearch-logging-config
  namespace: kube-public
  labels:
    app: elasticsearch-logging
data:
  limits.conf: |-
    elasticsearch soft memlock unlimited
    elasticsearch hard memlock unlimited
    elasticsearch hard nofile 65536
    elasticsearch soft nofile 65536
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch-logging
  namespace: kube-public
  labels:
    app: elasticsearch-logging
spec:
  serviceName: elasticsearch-logging
  replicas: 1
  selector:
    matchLabels:
      app: elasticsearch-logging
  template:
    metadata:
      labels:
        app: elasticsearch-logging
    spec:
      initContainers:
      - name: elasticsearch-logging-init
        image: alpine:latest
        imagePullPolicy: IfNotPresent
        command: ["/sbin/sysctl", "-w", "vm.max_map_count=262144"]
        securityContext:
          privileged: true
      containers:
      - name: elasticsearch-logging
        image: docker.elastic.co/elasticsearch/elasticsearch:7.9.2
        imagePullPolicy: IfNotPresent
        resources:
          limits:
            cpu: 200m
            # memory: "1Gi"
          requests:
            cpu: 200m
            # memory: "1Gi"
        env:
        - name: "NAMESPACE"
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: node.name
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: "cluster.name"
          value: "elasticsearch-cluster"
        - name: "bootstrap.memory_lock"
          # value: "true"
          value: "false"
        - name: "discovery.seed_hosts"
          value: "elasticsearch-logging-0,elasticsearch-logging-1,elasticsearch-logging-2"
        - name: "cluster.initial_master_nodes"
          value: "elasticsearch-logging-0"
        - name: "discovery.find_peers_interval"
          value: "5s"

        - name: "gateway.expected_nodes"
          value: "2"
        - name: "gateway.expected_master_nodes"
          value: "1"

        - name: "http.cors.enabled"
          value: "true"
        - name: "http.cors.allow-origin"
          value: "*"

        - name: "ES_JAVA_OPTS"
          value: "-Xms1g -Xmx1g"
        - name: RLIMIT_MEMLOCK
          value: "unlimited"
        ports:
        - containerPort: 9200
          name: restful
          protocol: TCP
        - containerPort: 9300
          name: transport
          protocol: TCP
        # readinessProbe:
        #     httpGet:
        #       scheme: HTTP
        #       path: /_cluster/health?local=true
        #       port: 9200
        #     initialDelaySeconds: 5  
        # livenessProbe:
        #   tcpSocket:
        #     port: transport
        #   initialDelaySeconds: 20
        #   periodSeconds: 10
        volumeMounts:
        - name: elasticsearch-config
          mountPath: /etc/security/limits.conf
          subPath: limits.conf
        - name: elasticsearch-data
          mountPath: /data
      volumes:
      - name: elasticsearch-data
        emptyDir: {}
        # hostPath:
          # path: /data
      - name: elasticsearch-config
        configMap:
          name: elasticsearch-logging-config
---
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-logging
  namespace: kube-public
  labels:
    k8s-app: elasticsearch-logging
    kubernetes.io/cluster-service: "true"
    kubernetes.io/name: "Elasticsearch"
spec:
  selector:
    app: elasticsearch-logging
  type: NodePort
  # type: ClusterIP
  # clusterIP: None
  ports:
  - name: restful
    port: 9200
    protocol: TCP
    targetPort: restful
    nodePort: 30092
  - name: transport
    port: 9300
    targetPort: transport
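
The host names in discovery.seed_hosts above follow the StatefulSet pod naming rule: the StatefulSet name plus a zero-based ordinal. A small sketch of generating that value is shown below; the helper names are hypothetical.

```python
def statefulset_pod_names(name, replicas):
    """Return the pod names a StatefulSet creates: <name>-0 .. <name>-(replicas-1)."""
    if replicas < 0:
        raise ValueError("replicas must not be negative")
    return [f"{name}-{ordinal}" for ordinal in range(replicas)]


def seed_hosts_value(name, replicas):
    """Build a comma-separated value suitable for discovery.seed_hosts."""
    return ",".join(statefulset_pod_names(name, replicas))


if __name__ == "__main__":
    print(seed_hosts_value("elasticsearch-logging", 3))
```

Note that the manifest sets replicas: 1, so only the first of the three listed pod names exists until the StatefulSet is scaled up.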
			
			
		

8.1.5. Orchestrating Elasticsearch with netkiller-devops

Create a docker.py file

			
#!/usr/bin/env python3
from netkiller.docker import *
			
volume = Volumes('elasticsearch')
			
elasticsearch = Services('elasticsearch')
elasticsearch.container_name('elasticsearch').environment([
	'discovery.type=single-node',
	'ES_JAVA_OPTS=-Xms512m -Xmx2048m'
])
elasticsearch.image('docker.elastic.co/elasticsearch/elasticsearch:7.15.0').ports(['9200:9200','9300:9300']).volumes(['elasticsearch:/usr/share/elasticsearch/data'])

experiment = Composes('experiment')
experiment.version('3.9')
experiment.volumes(volume)
experiment.services(elasticsearch)

if __name__ == '__main__':
	try:
		docker = Docker()
		docker.sysctl([{'vm.max_map_count':'262144'}])
		docker.environment(experiment)
		docker.main()
	except KeyboardInterrupt:
		print("Ctrl+C pressed. Shutting down.")
			
		

			
[root@localhost ~]# python3 docker.py -e experiment -l
experiment :
     fluentd
     redis
     mongo
     mysql
     elasticsearch
     
[root@localhost ~]# python3 docker.py -e experiment up elasticsearch
Starting elasticsearch ... done

[root@localhost ~]# python3 docker.py -e experiment ps elasticsearch
    Name                   Command               State                                         Ports                                       
-------------------------------------------------------------------------------------------------------------------------------------------
elasticsearch   /bin/tini -- /usr/local/bi ...   Up      0.0.0.0:9200->9200/tcp,:::9200->9200/tcp, 0.0.0.0:9300->9300/tcp,:::9300->9300/tcp			
			
		

Test

			
[root@localhost ~]# curl -s -X GET "localhost:9200/_cat/nodes?v=true&pretty"
ip         heap.percent ram.percent cpu load_1m load_5m load_15m node.role   master name
172.21.0.2           16          99   5    1.68    0.60     0.26 cdfhilmrstw *      aeedaff1ac15			
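
The node.role column packs one letter per role. The sketch below expands such a string into readable names. The letter meanings follow the _cat/nodes documentation for 7.x as far as I can tell; treat the table as an assumption and check it against your Elasticsearch version.

```python
# Letter-to-role table (assumed from the 7.x _cat/nodes documentation).
ROLE_LETTERS = {
    "c": "cold data",
    "d": "data",
    "f": "frozen data",
    "h": "hot data",
    "i": "ingest",
    "l": "machine learning",
    "m": "master-eligible",
    "r": "remote cluster client",
    "s": "content data",
    "t": "transform",
    "v": "voting-only",
    "w": "warm data",
}


def decode_roles(abbrev):
    """Expand a node.role string such as 'dilmrt' into a list of role names."""
    if abbrev == "-":
        return ["coordinating only"]
    return [ROLE_LETTERS.get(letter, f"unknown ({letter})") for letter in abbrev]


if __name__ == "__main__":
    print(decode_roles("cdfhilmrstw"))
```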
			
		

Stop

			
[root@localhost ~]# python3 docker.py -e experiment stop elasticsearch
Stopping elasticsearch ... done			
			
		

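8.1.6. Installing a Specific Version of Elasticsearch

Installing with yum gives you the latest version by default. A common problem is that elasticsearch-analysis-ik releases lag behind Elasticsearch, so the Elasticsearch version that yum installs may not yet be supported by the plugin. For some elasticsearch-analysis-ik versions you can edit the version number in the plugin's configuration file to match your Elasticsearch version, which tricks Elasticsearch into skipping the version-mismatch exception.

The better solution is to look up the compatible version on the elasticsearch-analysis-ik GitHub page and install the Elasticsearch version that your chosen elasticsearch-analysis-ik release requires.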
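
A rough sketch of that version-number edit follows. It assumes the plugin's configuration file is a plugin-descriptor.properties file containing an elasticsearch.version= line; that file name and key are assumptions, not something the text above states. The function name is illustrative, and it only rewrites text in memory.

```python
import re

# Matches the (assumed) "elasticsearch.version=..." line of the descriptor.
_VERSION_LINE = re.compile(r"^(elasticsearch\.version\s*=\s*).*$", re.MULTILINE)


def set_declared_es_version(descriptor_text, es_version):
    """Return descriptor_text with its elasticsearch.version value replaced."""
    new_text, count = _VERSION_LINE.subn(
        lambda match: match.group(1) + es_version, descriptor_text
    )
    if count == 0:
        raise ValueError("no elasticsearch.version line found")
    return new_text


if __name__ == "__main__":
    sample = "name=analysis-ik\nversion=5.6.0\nelasticsearch.version=5.6.0\n"
    print(set_declared_es_version(sample, "5.6.1"))
```

A plugin loaded this way was still built against the older release, so it may fail later if the internal APIs it relies on have changed.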

Versions

IK version	ES version
master	5.x -> master
5.6.0	5.6.0
5.5.3	5.5.3
5.4.3	5.4.3
5.3.3	5.3.3
5.2.2	5.2.2
5.1.2	5.1.2
1.10.1	2.4.1
1.9.5	2.3.5
1.8.1	2.2.1
1.7.0	2.1.1
1.5.0	2.0.0
1.2.6	1.0.0
1.2.5	0.90.x
1.1.3	0.20.x
1.0.0	0.16.2 -> 0.19.0			
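
The table above can be turned into a simple lookup. The sketch below copies only the rows that name one exact Elasticsearch version; the rows with ranges (master, 0.90.x, 0.20.x, 0.16.2 -> 0.19.0) are left out. The function name is illustrative.

```python
# IK version -> Elasticsearch version, exact-version rows from the table above.
IK_TO_ES = {
    "5.6.0": "5.6.0",
    "5.5.3": "5.5.3",
    "5.4.3": "5.4.3",
    "5.3.3": "5.3.3",
    "5.2.2": "5.2.2",
    "5.1.2": "5.1.2",
    "1.10.1": "2.4.1",
    "1.9.5": "2.3.5",
    "1.8.1": "2.2.1",
    "1.7.0": "2.1.1",
    "1.5.0": "2.0.0",
    "1.2.6": "1.0.0",
}


def ik_version_for(es_version):
    """Return the IK release listed for es_version, or None if the table has none."""
    for ik_version, supported_es in IK_TO_ES.items():
        if supported_es == es_version:
            return ik_version
    return None


if __name__ == "__main__":
    for es in ("2.3.5", "5.6.0", "5.6.1"):
        print(es, "->", ik_version_for(es))
```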
			

The latest Elasticsearch release is 5.6.1, but the elasticsearch-analysis-ik plugin only supports Elasticsearch up to 5.6.0.

root@netkiller /var/log % yum --showduplicates list elasticsearch | expand | tail
Repository epel is listed more than once in the configuration  
elasticsearch.noarch                 5.5.3-1                  elasticsearch-5.x     
elasticsearch.noarch                 5.6.0-1                  elasticsearch-5.x   
elasticsearch.noarch                 5.6.1-1                  elasticsearch-5.x 
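
Choosing the package version can be scripted from this listing. The sketch below assumes plain numeric versions like those shown (no epoch prefix) and picks the newest available package that does not exceed the highest version the plugin supports; the function names are illustrative.

```python
import re


def available_versions(yum_output, package="elasticsearch"):
    """Collect the version column for lines such as 'elasticsearch.noarch 5.6.0-1 repo'."""
    versions = []
    for line in yum_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0].startswith(package + "."):
            versions.append(fields[1])
    return versions


def version_key(version):
    """Turn '5.6.0-1' into (5, 6, 0, 1) so versions compare numerically."""
    return tuple(int(part) for part in re.split(r"[.-]", version))


def newest_supported(versions, max_es_version):
    """Return the newest package version whose x.y.z part is <= max_es_version."""
    limit = version_key(max_es_version)
    candidates = [v for v in versions if version_key(v)[: len(limit)] <= limit]
    return max(candidates, key=version_key) if candidates else None


# Sample copied from the yum output shown above.
SAMPLE = """\
Repository epel is listed more than once in the configuration
elasticsearch.noarch                 5.5.3-1                  elasticsearch-5.x
elasticsearch.noarch                 5.6.0-1                  elasticsearch-5.x
elasticsearch.noarch                 5.6.1-1                  elasticsearch-5.x
"""

if __name__ == "__main__":
    print(newest_supported(available_versions(SAMPLE), "5.6.0"))
```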
			

Install 5.6.0

# yum install elasticsearch-5.6.0-1

Loaded plugins: fastestmirror, langpacks
Repository epel is listed more than once in the configuration
Loading mirror speeds from cached hostfile
Resolving Dependencies
--> Running transaction check
---> Package elasticsearch.noarch 0:5.6.0-1 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

==========================================================================================================================================================================================================
 Package                                            Arch                                        Version                                      Repository                                              Size
==========================================================================================================================================================================================================
Installing:
 elasticsearch                                      noarch                                      5.6.0-1                                      elasticsearch-5.x                                       32 M

Transaction Summary
==========================================================================================================================================================================================================
Install  1 Package

Total download size: 32 M
Installed size: 36 M
Is this ok [y/d/N]: y
			

8.1.7. Plugin

Elasticsearch provides a plugin management command, elasticsearch-plugin.

root@netkiller ~ % /usr/share/elasticsearch/bin/elasticsearch-plugin -h
A tool for managing installed elasticsearch plugins

Commands
--------
list - Lists installed elasticsearch plugins
install - Install a plugin
remove - removes a plugin from Elasticsearch

Non-option arguments:
command              

Option         Description        
------         -----------        
-h, --help     show help          
-s, --silent   show minimal output
-v, --verbose  show verbose output			
			

8.1.7.1. elasticsearch-analysis-ik

Install the plugin

root@netkiller ~ % /usr/share/elasticsearch/bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v5.5.1/elasticsearch-analysis-ik-5.5.1.zip
-> Downloading https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v5.5.1/elasticsearch-analysis-ik-5.5.1.zip
[=================================================] 100%   
-> Installed analysis-ik
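
To install a different plugin version from a script, the command above can be assembled from the version number. The URL pattern below is extrapolated from the single download URL shown above and may not hold for every release; the constant and function names are illustrative.

```python
# URL pattern inferred from the v5.5.1 download link above (an assumption).
IK_RELEASE_URL = (
    "https://github.com/medcl/elasticsearch-analysis-ik/releases/download/"
    "v{version}/elasticsearch-analysis-ik-{version}.zip"
)
# Path of the plugin tool as used in the commands above.
PLUGIN_TOOL = "/usr/share/elasticsearch/bin/elasticsearch-plugin"


def ik_install_command(version):
    """Return the argument list for installing the given IK release."""
    return [PLUGIN_TOOL, "install", IK_RELEASE_URL.format(version=version)]


if __name__ == "__main__":
    print(" ".join(ik_install_command("5.5.1")))
```

The returned list can be passed to subprocess.run(command, check=True) on a host where Elasticsearch is installed.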
				
curl -XPOST http://localhost:9200/index/fulltext/_mapping -H 'Content-Type: application/json' -d'
{
        "properties": {
            "content": {
                "type": "text",
                "analyzer": "ik_max_word",
                "search_analyzer": "ik_max_word"
            }
        }
    
}'			
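
The same mapping request can be prepared from Python with the standard library. This is a sketch that mirrors the 5.x-style URL above, with the mapping type in the path; it only builds the request and sends nothing until urllib.request.urlopen is called on it. The function name and parameters are illustrative.

```python
import json
import urllib.request


def ik_mapping_request(base_url, index, doc_type, field, analyzer="ik_max_word"):
    """Build a POST request that maps `field` as text analyzed by the IK analyzer."""
    body = {
        "properties": {
            field: {
                "type": "text",
                "analyzer": analyzer,
                "search_analyzer": analyzer,
            }
        }
    }
    return urllib.request.Request(
        url=f"{base_url}/{index}/{doc_type}/_mapping",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = ik_mapping_request("http://localhost:9200", "index", "fulltext", "content")
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```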
				

8.1.7.2. elasticsearch-analysis-pinyin

https://github.com/medcl/elasticsearch-analysis-pinyin

8.1.8. Orchestrating Kubernetes with netkiller-devops

See 《Netkiller Container 手札》 (the Netkiller Container handbook).