Kafka troubleshooting
kafkaBin="/opt/veridiumid/kafka/bin/"
kafkaConfig="/opt/veridiumid/kafka/config/"
zkConn=`grep "zookeeper.connect=" ${kafkaConfig}/server.properties | awk -F'=' '{print $2}'`
zkIP=`grep "zookeeper.connect=" ${kafkaConfig}/server.properties | awk -F'=' '{print $2}' | awk -F':' '{print $1}'`
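Quick sanity check that the variables resolved (this assumes, as the commands below do, that a Kafka broker listens on the zookeeper host on port 9092):
echo "zookeeper: ${zkConn} | broker: ${zkIP}:9092"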
topic=accounts-data
### get configurations for topic
${kafkaBin}/kafka-topics.sh --describe --topic $topic --zookeeper ${zkConn}
### dump all events in the topic with their timestamps (Ctrl+C to stop); to count them, see the sketch below
${kafkaBin}/kafka-console-consumer.sh --from-beginning --bootstrap-server $zkIP:9092 --property print.timestamp=true --topic $topic
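To count the events instead of dumping them, the partition end offsets can be summed (a sketch; kafka.tools.GetOffsetShell and --broker-list are available in the Kafka versions that still accept the --zookeeper flag used elsewhere in this page):
## sum of the latest offsets per partition ~ total number of events produced to the topic
${kafkaBin}/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list $zkIP:9092 --topic $topic --time -1 | awk -F':' '{sum+=$3} END {print sum}'
## if retention already deleted messages, subtract the earliest offsets (--time -2) to get what is currently in the topic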
### modify retention.ms for accounts-data to 1 second
${kafkaBin}/kafka-configs.sh --zookeeper $zkConn --entity-type topics --entity-name $topic --alter --add-config retention.ms=1000
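A sketch to watch the data actually being deleted (assumes a single directory in log.dirs in server.properties; the delete happens when the retention check fires, by default every 5 minutes - log.retention.check.interval.ms):
logDir=$(grep "^log.dirs=" ${kafkaConfig}/server.properties | awk -F'=' '{print $2}')
du -sh ${logDir}/${topic}-*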
### remove the retention.ms override added above
${kafkaBin}/kafka-configs.sh --zookeeper $zkConn --entity-type topics --entity-name $topic --alter --delete-config retention.ms
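To verify that the override is gone (and to see any other per-topic overrides):
${kafkaBin}/kafka-configs.sh --zookeeper $zkConn --entity-type topics --entity-name $topic --describe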
### NOT RECOMMENDED: delete and recreate a topic
${kafkaBin}/kafka-topics.sh --zookeeper ${zkConn} --delete --topic mail-events
${kafkaBin}/kafka-topics.sh --zookeeper ${zkConn} --create --topic mail-events --partitions 10 --replication-factor 3
${kafkaBin}/kafka-configs.sh --zookeeper ${zkConn} --entity-type topics --entity-name mail-events --alter --add-config retention.ms=1800000
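A quick check that the topic was recreated with the expected settings (deletion is asynchronous, so the topic may show as "marked for deletion" for a short while):
${kafkaBin}/kafka-topics.sh --zookeeper ${zkConn} --list | grep mail-events
${kafkaBin}/kafka-topics.sh --zookeeper ${zkConn} --describe --topic mail-events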
###
consumer='mail-notifications-sending'
### delete a consumer group (only if needed; the group must have no active members)
###${kafkaBin}/kafka-consumer-groups.sh --bootstrap-server $zkIP:9092 --delete --group ${consumer}
### describe a consumer group (current offset, log end offset and lag per partition)
${kafkaBin}/kafka-consumer-groups.sh --bootstrap-server $zkIP:9092 --group ${consumer} --describe
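If the exact group name is not known, all groups can be listed first and then described one by one:
${kafkaBin}/kafka-consumer-groups.sh --bootstrap-server $zkIP:9092 --list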
topic='mail-events'
### dump all events in the topic with their timestamps (Ctrl+C to stop); count them as shown for accounts-data above
${kafkaBin}/kafka-console-consumer.sh --from-beginning --bootstrap-server $zkIP:9092 --property print.timestamp=true --topic $topic
General statistics flow:
The process reads from a topic; the configuration is in reference.conf in the code:
reporting {
  kafka {
    applicationId = "accounts-dashboard-reporting-bioengines"
    commitOffsetReset = "earliest"
    inputTopic = "accounts-data"
    outputTopic = "accounts-dashboard-data-bioengines"
  }
  tests {
    timeout = 60
  }
  streams {
    punctuateTime = 10
    writeToCassandraNumThreads = 10
  }
}
The statistics process takes the message from inputTopic and writes it in the database. If successful, it puts it in the outputTopic.
To get more information about the statistics processes, the files in /opt/veridiumid/statistics/conf should be modified.
The Kafka configuration can be seen in /etc/veridiumid/kafka/server.properties:
log.retention.hours = 168 (7 days); retention is applied per segment, so older messages can still be present until their whole segment expires
log.segment.bytes = 1 GB
messages already read by the consumer are still kept in Kafka for the whole retention period
when the process starts, it resumes from the oldest unread message (its last committed offset)
processing is done in chunks of messages; no special aggregations are performed, so 128 MB should be sufficient
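To check whether the statistics process keeps up with the accounts-data topic, its consumer group can be described (a sketch; it assumes the group name equals the applicationId from reference.conf, which is how Kafka Streams names its consumer group):
${kafkaBin}/kafka-consumer-groups.sh --bootstrap-server $zkIP:9092 --group accounts-dashboard-reporting-bioengines --describe
## the LAG column shows how many messages from accounts-data are still waiting to be processed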
How to modify parameters:
/vid-app/1.0.0/kafka/bin/kafka-configs.sh --zookeeper 10.79.1.204:2181 --entity-type topics --entity-name mail-events --alter --add-config retention.ms=300000
In case the error described in the links below (or another error related to kafka.log.LogCleaner) appears in log_cleaner.log, it most probably means the compaction task has stopped entirely and /vid-app/logs/kafka/ keeps growing.
https://issues.apache.org/jira/browse/KAFKA-8335
https://medium.com/@ylambrus/youve-got-a-friend-logcleaner-1fac1d7ec04f
The workaround would be to get rid of old data.
Option 1:
## stop kafka
service vid_kafka stop
## delete the cleaner offset checkpoint file
rm /vid-app/logs/kafka/cleaner-offset-checkpoint
## start kafka
service vid_kafka start
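A quick sanity check after Option 1 (same paths as above): the broker should be running again, and the checkpoint file is recreated once the LogCleaner resumes cleaning.
service vid_kafka status
ls -l /vid-app/logs/kafka/cleaner-offset-checkpoint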
Option 2 - without stopping the services.
This option can also be used for Kafka disk cleanup.
## setup env
kafkaBin="/opt/veridiumid/kafka/bin/"
kafkaConfig="/opt/veridiumid/kafka/config/"
zkConn=`grep "zookeeper.connect=" ${kafkaConfig}/server.properties | awk -F'=' '{print $2}'`
zkIP=`grep "zookeeper.connect=" ${kafkaConfig}/server.properties | awk -F'=' '{print $2}' | awk -F':' '{print $1}'`
topic=__consumer_offsets
${kafkaBin}/kafka-configs.sh --zookeeper $zkConn --entity-type topics --entity-name $topic --alter --add-config cleanup.policy=delete
topic=__transaction_state
${kafkaBin}/kafka-configs.sh --zookeeper $zkConn --entity-type topics --entity-name $topic --alter --add-config cleanup.policy=delete
###################### restart kafka and wait up to 5 minutes (sudo su & service ver_kafka restart), so the cleanup is executed; follow the logs
check the disk size from /vid-app/dyn/logs and run du -sh .
## file system size should be lower now ##
###################### set back the kafka parameters
topic=__consumer_offsets
${kafkaBin}/kafka-configs.sh --zookeeper $zkConn --entity-type topics --entity-name $topic --alter --add-config cleanup.policy=compact
topic=__transaction_state
${kafkaBin}/kafka-configs.sh --zookeeper $zkConn --entity-type topics --entity-name $topic --alter --add-config cleanup.policy=compact
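Finally, confirm that both internal topics are back on compaction:
${kafkaBin}/kafka-configs.sh --zookeeper $zkConn --entity-type topics --entity-name __consumer_offsets --describe
${kafkaBin}/kafka-configs.sh --zookeeper $zkConn --entity-type topics --entity-name __transaction_state --describe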