Limiting time data is cached
Answered
We may have a somewhat unique situation. We have a single-server implementation (build and query server are the same system) with 20 active cubes. By necessity we have a few wide tables, so some dashboards can legitimately use a large amount of memory. Things generally work OK. Query frequency across cubes is sporadic, and most of the time only a single cube is being actively used.
At times, though, we run into issues where a query against a cube causes it to use, say, 20 GB of memory, and this happened at 2pm. Unless something else happens that needs memory, like a build or another query against that cube or a different one, that cube will hold its memory for hours. I understand why; having things cached like that makes future usage more responsive. If, however, someone runs a query while a build is doing heavy I/O (this seems to happen when it finishes loading a dataset, for example) and another cube has been holding memory for hours but hasn't been accessed, things can go bad. In that situation, Sisense can't seem to flush the old cube's cache fast enough. Swap, on a different drive, gets invoked, and Windows memory management seems to get into the game as well. The result can be failed queries or, worse, a killed build in progress.
If there were a setting that let us clear the memory cache a cube is holding, say within 30 minutes of it last being accessed, we could greatly reduce how often this happens. Does anyone know of a way to do this? Cubes seem to hold memory forever unless something else requests it, but when the system is under load that reclamation can't always happen fast enough. With a large number of cubes being accessed, the cache is rarely useful for anything other than dashboard development anyway, because the old cache has to be flushed before a new one is loaded. End-user dashboard load times wouldn't really change, and users are quite happy with how that works even from a cold cube in most cases.
Hi Kevin,
Fantastic analysis and description of your situation!
There are 4 ways you could approach this issue, and they aren't mutually exclusive:
- Separate the build process from the query engine by creating a dedicated build node. This reduces the chances of the two processes competing for a limited resource and isolates builds from the more volatile query environment, ensuring that even if a conflict occurs, builds are not impacted.
- Create automation to monitor cube usage, and use the Sisense APIs to stop or restart cubes that aren't in use. A cube clears its cache when it is stopped (the entire process shuts down), and it auto-starts when queried again. You could detect idleness by listening to the JAQL queries incoming to the server, or possibly by other means such as monitoring certain logs.
- Contact Sisense Support to investigate why these conflicts occur. As you mentioned yourself, Sisense generally flushes the lesser-used cache to make room for new queries, and the issues you're experiencing stem from a resource-management conflict under specific conditions.
- Submit a feature request for a configurable cache lifetime per cube.
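To illustrate the second approach, here is a minimal sketch of what such automation might look like. This assumes you maintain a map of each cube's last query time (fed by whatever is watching incoming JAQL traffic or the logs); the REST endpoint path and the 30-minute threshold are assumptions for illustration only, so check your Sisense version's API reference for the actual cube stop route.

```python
import time

# Assumed endpoint -- verify against your Sisense REST API reference.
STOP_ENDPOINT = "/api/v1/elasticubes/servers/{server}/{cube}/stop"
IDLE_THRESHOLD_SECS = 30 * 60  # the 30-minute window suggested above

def find_idle_cubes(last_access, now=None):
    """Return cube names whose last recorded query is older than the threshold.

    last_access: dict mapping cube name -> unix timestamp of its last query.
    """
    now = now if now is not None else time.time()
    return [cube for cube, ts in last_access.items()
            if now - ts > IDLE_THRESHOLD_SECS]

def stop_idle_cubes(last_access, session, base_url, server="localhost"):
    # session: an HTTP session pre-authenticated with a Sisense API token,
    # e.g. a requests.Session with an Authorization header set.
    for cube in find_idle_cubes(last_access):
        session.post(base_url + STOP_ENDPOINT.format(server=server, cube=cube))
```

Run on a schedule (Windows Task Scheduler, for example), this would stop cubes untouched for more than the threshold, releasing their memory before a build needs it, while auto-start on the next query keeps the change invisible to end users apart from a cold first load.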
Thanks,
Hi Moti,
We also see this issue, where a single cube, once used, then hogs memory. You mention "listening to JAQL queries incoming to the server"; can you give pointers on how that could be done?
Ian