Table of Contents
- Introduction
- Steps to Ingest Data from PostgreSQL to Elasticsearch using Logstash in Docker
  2.1. Set up Docker
  2.2. Set up Elasticsearch, Logstash, and Kibana using Docker
  2.3. Set up PostgreSQL using Docker
  2.4. Create logstash.conf
  2.5. Start Docker Containers
  2.6. Verify Data Ingestion
Introduction
In this tutorial, we'll show you how to use Logstash in a Docker environment to ingest data from PostgreSQL into Elasticsearch. Logstash is a powerful data processing pipeline that lets you collect, transform, and ship data from numerous sources. Elasticsearch is a distributed search and analytics engine, and PostgreSQL is a well-known open-source relational database.
By combining Logstash and Docker, you can quickly create a containerized environment that streamlines the data ingestion process. This approach provides flexibility, scalability, and isolation for running Logstash, and ensures a smooth data transfer from PostgreSQL to Elasticsearch.
Let's now explore the procedures needed to complete this task.
Steps to Ingest Data from PostgreSQL to Elasticsearch using Logstash in Docker
2.1. Set up Docker
Before proceeding, ensure that Docker is installed and running on your system. Docker provides a convenient way to manage and deploy containers.
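If you want to verify the installation first, both of these commands should print a version number:
docker --version
docker compose version   # or "docker-compose --version" for the standalone binary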
2.2. Set up Elasticsearch, Logstash, and Kibana using Docker
Your folder structure should look something like this:
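project/
├── elk/
│   ├── docker-compose.yml
│   └── logstash/
│       ├── logstash.conf
│       └── postgresql-42.6.0.jar
└── postgres/
    └── docker-compose.yml
(The logstash folder and its contents are created in step 2.4.)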
To set up the ELK stack, create a new file named docker-compose.yml inside the elk folder and paste the following content (check the folder structure above):
version: "3.8"
services:
  Elasticsearch:
    image: elasticsearch:8.8.0
    container_name: elasticsearch
    restart: always
    volumes:
      - elastic_data:/usr/share/elasticsearch/data/
    environment:
      ES_JAVA_OPTS: "-Xmx256m -Xms256m"
      discovery.type: single-node
      # security is disabled to keep this demo simple; don't do this in production
      xpack.security.enabled: "false"
      xpack.security.enrollment.enabled: "false"
    ports:
      - "9200:9200"
      - "9300:9300"
    networks:
      - blog-builder-backend
  Logstash:
    image: logstash:8.8.0
    container_name: logstash
    restart: always
    volumes:
      # mounts the local logstash folder (config + JDBC driver) into the container
      - ./logstash/:/logstash_dir
    command: logstash -f /logstash_dir/logstash.conf
    depends_on:
      - Elasticsearch
    ports:
      - "9601:9600"
    environment:
      LS_JAVA_OPTS: "-Xmx256m -Xms256m"
    networks:
      - blog-builder-backend
  Kibana:
    image: kibana:8.8.0
    container_name: kibana
    restart: always
    ports:
      - "5601:5601"
    environment:
      # ELASTICSEARCH_HOSTS replaces the deprecated ELASTICSEARCH_URL in Kibana 7+
      - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
    depends_on:
      - Elasticsearch
    networks:
      - blog-builder-backend
volumes:
  elastic_data: {}
networks:
  blog-builder-backend:
    driver: bridge
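Since xpack.security.enabled is set to false, Elasticsearch accepts plain HTTP requests without credentials. Once the stack is up (step 2.5), a quick sanity check from the host looks like this:
curl http://localhost:9200
curl "http://localhost:9200/_cluster/health?pretty"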
2.3. Set up PostgreSQL using Docker
To set up PostgreSQL and manage data within the database, we need two Docker images: postgres and dpage/pgadmin4.
Now let's create a postgres folder inside our project directory, create a file named docker-compose.yml in it, and paste the following content (check the folder structure above):
version: "3.8"
services:
  postgres:
    image: postgres
    container_name: postgres
    hostname: postgres
    environment:
      POSTGRES_DB: blogService
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
    ports:
      - "5432:5432"
    volumes:
      - blog-service-db:/var/lib/postgresql/data
    networks:
      - elk_blog-builder-backend
  pgadmin:
    image: dpage/pgadmin4
    container_name: pgadmin
    restart: always
    depends_on:
      - postgres
    ports:
      - "5050:80"
    environment:
      PGADMIN_DEFAULT_EMAIL: admin@admin.com
      PGADMIN_DEFAULT_PASSWORD: admin
    # volumes:
    #   - user-service-pgadmin:/var/lib/pgadmin
    networks:
      - elk_blog-builder-backend
volumes:
  blog-service-db:
networks:
  elk_blog-builder-backend:
    external: true
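One detail worth calling out: Docker Compose prefixes networks with the project (folder) name, so the blog-builder-backend network declared in the elk folder is actually created as elk_blog-builder-backend. Marking it external here lets postgres and pgadmin join the same network as Logstash and Elasticsearch, which also means the ELK stack must be started first. You can confirm the network exists with:
docker network ls | grep blog-builder-backend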
2.4. Create logstash.conf
To configure Logstash for ingesting data from PostgreSQL to Elasticsearch, you need to create a Logstash configuration file. This file defines the input source, output destination, and any necessary transformations.
2.4.1. Create a folder named logstash inside the elk directory.
2.4.2. Create a file named logstash.conf inside the logstash folder we just created (check the folder structure above).
2.4.3. Now paste the following code inside logstash.conf:
input {
  jdbc {
    jdbc_connection_string => "jdbc:postgresql://postgres:5432/blogService"
    jdbc_user => "postgres"
    jdbc_password => "postgres"
    jdbc_driver_library => "/logstash_dir/postgresql-42.6.0.jar"
    jdbc_driver_class => "org.postgresql.Driver"
    schedule => "* * * * *"
    statement => "SELECT * FROM blogdb WHERE updated_at >= :sql_last_value ORDER BY updated_at ASC"
  }
}
output {
  stdout {
    codec => rubydebug
  }
  elasticsearch {
    hosts => ["elasticsearch:9200"]
    index => "blogs"
    document_id => "%{[id]}"
  }
}
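A note on :sql_last_value: since no tracking_column is configured, it holds the timestamp of the previous scheduled run (on the very first run it defaults to 1970-01-01, so all existing rows are ingested). The jdbc input persists this bookmark between runs in a small metadata file whose default location varies by plugin version (the Logstash user's home directory or Logstash's data path). A hedged way to find it inside the container:
# look for the jdbc bookmark file; its default location depends on the plugin version
docker exec logstash sh -c 'ls -la ~/.logstash_jdbc_last_run 2>/dev/null; find /usr/share/logstash/data -name "*jdbc_last_run*" 2>/dev/null'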
2.4.4. Now download the PostgreSQL JDBC driver JAR (the configuration above expects postgresql-42.6.0.jar) and copy it into the logstash directory.
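At the time of writing, the driver can be fetched straight from the official download site (the URL pattern may change, so adjust the version as needed; the filename must match jdbc_driver_library in logstash.conf):
curl -L -o elk/logstash/postgresql-42.6.0.jar https://jdbc.postgresql.org/download/postgresql-42.6.0.jar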
2.5. Start Docker Containers
Go to the elk folder first (the postgres stack joins the network this stack creates) and run:
docker-compose up
Then go to the postgres folder and run:
docker-compose up
Wait until all containers are up and running.
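To confirm that all five containers (elasticsearch, logstash, kibana, postgres, pgadmin) are running:
docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"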
2.6. Verify Data Ingestion
After starting the Logstash container, check its logs to ensure there are no errors. Run the following command to view the container logs:
docker logs logstash
Once the Logstash container is running without any issues, open pgAdmin (http://localhost:5050) and create a table with the name specified in the logstash.conf file (currently blogdb; the blogService database is already created by the POSTGRES_DB setting). Each time you insert or update a row in the PostgreSQL table, you will see new entries in the Logstash logs (note that deletes are not propagated by this polling approach, since removed rows no longer match the query). Then access your Elasticsearch instance and verify that the data has been successfully ingested into the specified index (in the current config it is named blogs).
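If you prefer the command line over pgAdmin, here is a minimal sketch. Only the id and updated_at columns are actually required by logstash.conf, so the rest of the schema below is an assumption:
# minimal table; id and updated_at are required by logstash.conf, the title column is illustrative
docker exec -it postgres psql -U postgres -d blogService -c "CREATE TABLE blogdb (id SERIAL PRIMARY KEY, title TEXT, updated_at TIMESTAMP DEFAULT NOW());"

# insert a test row; the next scheduled Logstash run (every minute) should pick it up
docker exec -it postgres psql -U postgres -d blogService -c "INSERT INTO blogdb (title) VALUES ('hello logstash');"

# after a minute or so, the document should appear in the blogs index
curl "http://localhost:9200/blogs/_search?pretty"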
Congratulations! You have now successfully set up Logstash in Docker to ingest data from PostgreSQL to Elasticsearch. This process allows you to efficiently transfer and store data, enabling powerful search and analytics capabilities using Elasticsearch.