I have received many calls because dashboards and HTTP requests suddenly stopped working. I have monitored the server load and it stays normal, even a bit lower than before. I also monitored the request times in the dashboards from my browser’s inspector, as well as from external tools like Postman, and the waiting times are absurdly long (this hasn’t happened before).
I see this echoed in the Buckets view in the Thinger client (which makes sense); everything seems to be much slower and responds seemingly at random.
I have analyzed the behavior of the requests (when they are successful) and it seems that the server responds just after the falling edge of the “device” signal in the first graph attached below.
Honestly, I prefer an older but stable version to one that has many valuable additions but is unstable. IS IT POSSIBLE TO GO BACK TO A PREVIOUS VERSION UNTIL IT IS STABLE?
I’m sure the problems started when you upgraded my instance.
We have created some services with AWS Lambda that query data from the buckets. I attach an image that shows how the number of query failures increased abruptly (it must coincide with the moment my instance was updated).
Can you review the Lambda statistics right now? We have restarted your database. The migration took some OS resources, which may have caused the failures. However, we are noticing that you are getting too close to your SMALL instance limits based on the number and size of your buckets. Please consider optimizing your deployment by:
- Query aggregated data in your dashboards, e.g., a mean by day, hour, etc. It is often not necessary to load thousands of data points in a dashboard. This can reduce dashboard loading time, server load, etc.
- Reduce the data ingestion rate.
- Apply a bucket retention policy from your user account limits to automatically remove old data (e.g., keep only the last 3 or 6 months, or whatever your use case requires).
- Store only time series data in buckets (i.e., numbers, floats, small strings, etc.).
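As a rough illustration of the first suggestion, here is a minimal Python sketch of what "a mean by hour" does to the number of points a dashboard has to draw. The `hourly_mean` function and the sample readings are hypothetical and not part of any Thinger.io API. The sketch runs on the client only to show the reduction; to actually lower server load, the aggregation has to be requested from the server side (for example in the dashboard widget settings), which is not shown here.

```python
from collections import defaultdict
from datetime import datetime, timezone


def hourly_mean(points):
    """Collapse (unix_seconds, value) pairs into one mean value per UTC hour."""
    sums = defaultdict(lambda: [0.0, 0])  # hour start -> [running sum, count]
    for ts, value in points:
        hour_start = ts - (ts % 3600)  # truncate the timestamp to the hour
        sums[hour_start][0] += value
        sums[hour_start][1] += 1
    return [
        (datetime.fromtimestamp(hour, tz=timezone.utc), total / count)
        for hour, (total, count) in sorted(sums.items())
    ]


# Hypothetical readings: one sample per minute for two hours (120 raw points).
base = 1_699_999_200  # arbitrary timestamp aligned to the start of a UTC hour
raw = [(base + i * 60, 20.0 + (i % 60) * 0.1) for i in range(120)]

aggregated = hourly_mean(raw)
print(len(raw), "raw points ->", len(aggregated), "aggregated points")
```

With one sample per minute, an hourly mean cuts the number of points by a factor of 60; a daily mean would cut it by 1440.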
Take into account that every data point in the bucket takes some RAM for indexing and processing. This is critical on SMALL instances, which are intended only for prototyping (1GB RAM).
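To get a feeling for how the ingestion rate and the retention period drive the number of stored points, a back-of-the-envelope estimate like the following can help. All figures (50 devices, 1 or 5 minute intervals, 12 or 3 months) are hypothetical, and `stored_points` is an illustrative helper, not a platform function. It only counts points; how much RAM each point costs is not stated here, so no memory figure is derived.

```python
def stored_points(devices, interval_s, retention_days):
    """Approximate number of points kept in the buckets at steady state."""
    points_per_device_per_day = 86_400 // interval_s  # seconds per day / interval
    return devices * points_per_device_per_day * retention_days


# All figures below are hypothetical, chosen only to show the effect.
baseline = stored_points(devices=50, interval_s=60, retention_days=365)
slower = stored_points(devices=50, interval_s=300, retention_days=365)
trimmed = stored_points(devices=50, interval_s=300, retention_days=90)

for label, count in [
    ("1 min interval, 12 months", baseline),
    ("5 min interval, 12 months", slower),
    ("5 min interval, 3 months", trimmed),
]:
    print(f"{label}: {count:,} points")
```

In this made-up scenario, writing every 5 minutes instead of every minute and keeping 3 months instead of 12 brings the total from about 26 million points down to about 1.3 million.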
What you did seems to work.
There are things in the implementation that I am not proud of, but we had to solve it. Soon we will launch a massive update of our devices (we have built an architecture together with AWS to do it, since at the moment you do not offer that service). The price of the medium plan is too big a jump in costs for us; we can’t move to that plan yet.
It would be very valuable if you allowed adding resources (more RAM, more disk, another processor) to the instance, so that its capacity grows as the company and the number of devices grow… If you allowed that, it would be easier to reach your medium plan with all the advantages associated with it (branding, domains, users, collaborators).