Issue with Large Data Limit and Advanced Calculations in ods-adv-analysis and ods-aggregation Widgets
Hello everyone,
I've been working with the ods-adv-analysis widget and noticed that the output variable seems to be limited to a maximum of 20,000 elements. This constraint is making it difficult to handle large datasets, which are crucial for my project.
Additionally, while the ods-aggregation widget works well for basic functions, it only offers simple calculations. For my use case, I need to perform more advanced operations on larger datasets.
Does anyone know if there are options or settings to bypass this limit or to perform more complex calculations on large data sets? Any guidance would be greatly appreciated!
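For context, here is the kind of simple calculation I already have working with ods-aggregation (dataset and field names are placeholders for my actual setup):

```html
<ods-dataset-context context="power" power-dataset="power-readings">
    <!-- ods-aggregation returns a single value (SUM, AVG, MIN, MAX, COUNT...) -->
    <div ods-aggregation="total"
         ods-aggregation-context="power"
         ods-aggregation-function="SUM"
         ods-aggregation-expression="power">
        Total power: {{ total }}
    </div>
</ods-dataset-context>
```

What I cannot express this way is anything that needs a second aggregation pass, for example the maximum of hourly sums over a large time range.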
@Antoine, indeed, I understand now: you make a two-step aggregation with the ods-subaggregation widget, and therefore you run into this limit.
The big problem here is not the limit from the API. It's the fact that the API doesn't provide a two-step aggregation that would suit your needs.
When you use this kind of trick, you must keep in mind that you download the entire output from the API (20,000 rows) and then parse it and compute another aggregation in the browser.
Even if the API limit were 100K, or if there were no limit at all, the page would freeze and the experience would be very slow, or even take too long to load for the user.
So this kind of usage should be avoided, and we often need to find another way to proceed.
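For readers who have not seen it, the two-step pattern being discussed looks roughly like the sketch below. I am writing the attribute names from memory of the widgets documentation, the hour field and dataset name are placeholders, and whether ods-subaggregation reads the output variable directly or a .results property depends on the widget pairing, so treat this as a sketch, not a copy-paste recipe:

```html
<ods-dataset-context context="power" power-dataset="power-readings">
    <!-- Step 1: the API aggregates power per hour (up to 20,000 groups) -->
    <div ods-adv-analysis="hourly"
         ods-adv-analysis-context="power"
         ods-adv-analysis-select="sum(power) as total"
         ods-adv-analysis-group-by="hour">
        <!-- Step 2: the browser downloads all those groups, then
             re-aggregates them locally to find the peak -->
        <div ods-subaggregation="peak"
             ods-subaggregation-data="hourly"
             ods-subaggregation-serie-maxtotal="MAX(total)">
            Biggest hour: {{ peak[0].maxtotal }}
        </div>
    </div>
</ods-dataset-context>
```

Every group produced by step 1 travels over the network and is parsed in the page before step 2 can run, which is exactly why this does not scale.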
In your case it's clearly a limitation of the API, and the only way to resolve this would be to have an intermediary dataset based on the first step of aggregation: one record would be the SUM(power) per hour, for one year of data for example (one year has 24 * 365 = 8,760 hours, well under 20,000).
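Concretely, once such an intermediary dataset exists (say, one row per hour with a total_power field, pre-computed at publication time), the widget side becomes a single cheap aggregation again; the dataset and field names below are hypothetical:

```html
<ods-dataset-context context="hourly" hourly-dataset="power-by-hour">
    <!-- One row per hour in the source data, so the peak is one API call -->
    <div ods-aggregation="peak"
         ods-aggregation-context="hourly"
         ods-aggregation-function="MAX"
         ods-aggregation-expression="total_power">
        Biggest hour: {{ peak }}
    </div>
</ods-dataset-context>
```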
Or you limit your dashboard to finding the biggest hour within a single year, by filtering the context that performs the analysis.
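The filtering itself happens on the context that feeds the analysis; a minimal sketch, assuming the dataset exposes a year facet (a q= filter on the date field would do the same job):

```html
<!-- Restrict the analysis context to a single year, so the hourly
     groups stay well under the 20,000 limit -->
<ods-dataset-context context="power"
                     power-dataset="power-readings"
                     power-parameters="{'refine.year': '2024'}">
    <!-- the analysis widgets from the sketch above go here -->
</ods-dataset-context>
```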
I clearly understand that this is not ideal. I have shared this topic internally, and we have these use cases in mind for future API evolutions.
@Guillaume Perrin-Fabre, ods-adv-analysis has two limits, depending on the type of query:
When you list items, the widget relies on the /records endpoint, so you have the 100-hit limitation. Asking for more will return an error from the API (I encourage you to always keep the browser console open: the 400 errors show up in the console or network tab, and the API error message is explicit).
When you perform a group-by, you go through the aggregation endpoint, and therefore you have the 20K-hit limitation.
In your case, if you want to list records and go beyond the 100 limit, add a group-by with all the fields listed in the select clause: it will output the same thing, but without that limitation.
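In widget terms, the trick looks like this; I am assuming the standard ods-adv-analysis attribute names, and the field names are placeholders:

```html
<!-- Listing via the aggregation endpoint: grouping by exactly the fields
     you select yields one group per distinct row, so the 20K aggregation
     limit applies instead of the 100-hit /records limit -->
<div ods-adv-analysis="rows"
     ods-adv-analysis-context="power"
     ods-adv-analysis-select="station, city, power"
     ods-adv-analysis-group-by="station, city, power">
    <div ng-repeat="row in rows">{{ row.station }} ({{ row.city }}): {{ row.power }}</div>
</div>
```

One thing to keep in mind with this trick: rows that are identical on every selected field collapse into a single group.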
Thanks @frederic.passaniti, your answer is much appreciated! I understand that two-step aggregation is indeed tricky, and I hope to see the API enhanced for this type of usage.
For now, I will explore your suggestion to create an intermediate dataset.