Log PrestoSQL/Trino Queries
PrestoSQL/Trino keeps information about queries for limited time and count and we can view and search on this information with Trino UI.
We can set parameters to how long and count that is required to Trino keeps queries statistics with below parameters in config.properties file:
query.min-expire-age=60m
query.max-history=50
But the problem is that if by any reason we restart Trino/PrestoSQL coordinator, we lost the history of this information. for solving this issue we can use a plugin that name is trino-querylog, this plugin as s listener logs all information about queries and we can send this information to storages like Elasticsearch and visualize and search on this data.
Tools that used in this article :
IntelliJ IDEA, to build plugin
Trino 355
Download release from this path : https://github.com/hassanahmadkhani/trino-querylog/blob/main/target/trino-querylog-0.4.0-SNAPSHOT-dist.zip
Or build from code : https://github.com/hassanahmadkhani/trino-querylog
Steps:
- Create a directory with name trino-querylog in trino_installation_location/plugin
- Extract and put trino-querylog-0.4.0-SNAPSHOT-dist.zip content in trino-querylog directory
- create event-listener.properties file in trino_installation_location/etc
Put below parameters in this configuration file :
we only log the query start event, also we can log query end event for duration calculation or something like this. also we can log split details for a query. as you know queries for execution in PrestoSQL/Trino divides into splits.
4. Create log4j configuration file with name querylog-log4j2.xml in trino_installation_location/etc directory.
Put below configurations in this file :
5. Restart Trino, before running any query this query will be log in log path in JSON format, below is the sample format of this file :
We can send this logs to Elastic or MongoDB or other storages engines to index and search on history of logs. we can use multiple options for this scenario, for example : Using NiFi/Fluentd/Flume to stream this logs and put in Elastic/MongoDB/Solr/Kafka/… per use case
Conclusion :
With Trino query log we can store Trino queries history for long term and use for analysis, by default after Trino restart we lost this information.