Frappe Press site analytics - debugging Filebeat

In our experience, Frappe Press's site analytics is the hardest part to get right. Out of the box it is poorly configured. Here are a few things we've learned.

Press is the app that Frappe uses to run its Frappe Cloud hosting service. The Log Server is one of the many components that the app ties together; it keeps track of site usage.

Note: this applies to Filebeat version 8.16.0.

The Log Server uses Elasticsearch and Filebeat to keep site activity in a quickly searchable form. Unfortunately, out of the box, the server is not efficiently configured and runs an older version. So we upgraded both Elasticsearch and Filebeat to what was then the latest version, 8.16.0. This is useful because it supports data streams, which are a bit easier to deal with than indices. We also removed unnecessary data and configured an index lifecycle management (ILM) policy to discard old data.

We then hit a problem: Filebeat wasn't uploading data to the Elasticsearch data stream. Yet nothing was reported in any error log, and all the services were running fine!

We first checked the input file, monitor.json.log, to make sure the bench was reporting data, and it was.
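A check like this can also be scripted. The following is only a minimal sketch: it assumes monitor.json.log holds one JSON document per line (suggested by the file name, not confirmed here), and the helper name summarize_json_lines is ours.

```python
import json


def summarize_json_lines(lines):
    """Count non-empty lines that parse as JSON and keep the last parsed record.

    Assumption: the log is newline-delimited JSON, one record per line.
    """
    parsed = 0
    failed = 0
    last = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            failed += 1
            continue
        parsed += 1
        last = record
    return {"parsed": parsed, "failed": failed, "last": last}


# Hypothetical usage; adjust the path to wherever your bench writes the file:
# with open("monitor.json.log") as handle:
#     print(summarize_json_lines(handle))
```

If the parsed count is growing between runs and the last record looks recent, the input side is healthy and the problem lies further down the pipeline.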

So to start debugging Filebeat, we enabled its HTTP endpoint to give us some statistics that might tell us what was happening. In /etc/filebeat/filebeat.yml:

# ============================== Monitoring ============================
http.enabled: true
http.host: "localhost"
http.port: 5555

Then we could dump the statistics using:

curl -X GET "http://localhost:5555/stats?pretty"

This showed us that the pipeline was rapidly retrying events, but none of the counters in the output section were incrementing. So Filebeat was stuck somehow, even though filebeat test config said everything was fine.
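To make that comparison less error-prone, two snapshots of the statistics can be diffed in a few lines of Python. This is a rough sketch, not part of Filebeat: the counter paths libbeat.pipeline.events.retry and libbeat.output.events.acked are what we would expect to find in the stats JSON, but treat them as assumptions and check them against your own output. The function names are ours.

```python
import json
import urllib.request


def fetch_stats(url="http://localhost:5555/stats"):
    """Fetch one statistics snapshot from the Filebeat HTTP endpoint."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def counter(stats, dotted_path):
    """Read a nested counter such as 'libbeat.output.events.acked'; 0 if missing."""
    node = stats
    for key in dotted_path.split("."):
        if not isinstance(node, dict) or key not in node:
            return 0
        node = node[key]
    return node


def diagnose(before, after):
    """Classify what happened between two snapshots.

    Assumed counter names; verify them in your own /stats output.
    """
    retry_path = "libbeat.pipeline.events.retry"
    acked_path = "libbeat.output.events.acked"
    retried = counter(after, retry_path) - counter(before, retry_path)
    acked = counter(after, acked_path) - counter(before, acked_path)
    if retried > 0 and acked == 0:
        return "stuck"  # events are retried but nothing is acknowledged
    if acked > 0:
        return "shipping"
    return "idle"
```

Calling fetch_stats() twice a few seconds apart and passing both results to diagnose() would report "stuck" for the situation described above.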

The command that gave us more clues was:

filebeat setup --index-management

This gave us errors about the ILM policy JSON, which we had exported from Elasticsearch but whose format Filebeat would not accept for upload. We had to remove the parent filebeat object that wraps the policy, along with the following fields:

in_use_by
modified_date
version
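The cleanup described above can be sketched in Python. This assumes the exported JSON has the policy wrapped in an object named after the policy (filebeat here) with the three metadata fields next to the policy body. The function name and the sample values in the usage comment are illustrative, not taken from our actual policy.

```python
import json

# Metadata fields that Elasticsearch adds on export and that had to be removed.
DROP_FIELDS = ("in_use_by", "modified_date", "version")


def clean_ilm_policy(exported, policy_name="filebeat"):
    """Unwrap the parent object and drop export-only metadata fields.

    If the wrapper is absent, the input is treated as already unwrapped.
    """
    inner = exported.get(policy_name, exported)
    return {key: value for key, value in inner.items() if key not in DROP_FIELDS}


# Illustrative usage with made-up values:
# exported = {"filebeat": {"version": 3,
#                          "modified_date": "2024-11-20T00:00:00.000Z",
#                          "policy": {"phases": {}},
#                          "in_use_by": {"indices": []}}}
# print(json.dumps(clean_ilm_policy(exported), indent=2))
```

The result keeps only the policy body, which is the shape we ended up with after editing the file by hand.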

Once filebeat setup --index-management ran successfully, a flood of data hit the Elasticsearch index!
