Issue with Large Data Limit and Advanced Calculations in ods-adv-analysis and ods-aggregation Widgets

Hello everyone,

I've been working with the ods-adv-analysis widget and noticed that the output variable seems to be limited to a maximum of 20,000 elements. This constraint is making it difficult to handle large datasets, which are crucial for my project.

Additionally, while the ods-aggregation widget works well for basic functions, it only offers simple calculations. For my use case, I need to perform more advanced operations on larger datasets.

Does anyone know if there are options or settings to bypass this limit or to perform more complex calculations on large data sets? Any guidance would be greatly appreciated!

Thanks in advance for your help!

Hi,

ods-adv-analysis can replace ods-aggregation in many cases, since it lets you count, group, aggregate, and so on.

Could you give me an example of the limitation you are running into? We may be able to find a workaround.

Hi Frederic,

Thank you for your response!

The issue I’m facing specifically occurs with large datasets that exceed 20,000 elements. Otherwise, the ods-adv-analysis widget works perfectly.

Here’s the code I’m currently using:

<div ods-adv-analysis="analysis"
     ods-adv-analysis-context="mycontext"
     ods-adv-analysis-select="SUM(value) as power"
     ods-adv-analysis-group-by="hour">
    <div ods-subaggregation="analysis"
         ods-subaggregation-serie-pmax="MAX(power)">
        {{ results[0].pmax }}
    </div>
</div>

However, when I try to add something like {{analysis.length}}, I notice that the length never goes beyond 20,000.

Thanks again for your help!

Hi everyone,
I am also struggling with the number of records returned by ods-adv-analysis.
If I don’t specify a limit, I get only 1 record:

ods-adv-analysis="maintenances"
ods-adv-analysis-context="kaptamaintenance"
ods-adv-analysis-select="nom,date_end,cause,nbchgtsonde,nbchgtgsm,nbchgtbattery,nbnettoyage,nbchgtsim,chgt_freq"

kaptamaintenance :
<br />

If I specify ods-adv-analysis-limit="100", I get 100 records.

If I specify ods-adv-analysis-limit="1000", I get nothing.

Can you help me with this?

@Antoine, I understand now: you are doing a two-step aggregation with the ods-subaggregation widget, which is why you run into this limit.

The real problem here is not the API limit itself. It is that the API doesn’t provide a two-step aggregation that would suit your need.

When you use this kind of trick, keep in mind that the browser downloads the entire output from the API (up to 20,000 rows), then parses it and computes a second aggregation on top.

If the API limit were 100K, or if there were no limit at all, the page would freeze and the experience would be very slow, or the page would even take too long to load for the user.

So this kind of usage should be avoided, and we often need to find another way to proceed.

In your case it is clearly a limitation of the API, and the only way to resolve it would be an intermediary dataset built from the first aggregation step: each record would be the summed power for one hour. With one year of data, for example, that gives at most 24*31*12 = 8,928 records, which is well under 20,000.

Alternatively, limit your dashboard to finding the peak hour within a single year, by filtering the context that the analysis runs on.

I fully understand that this is not ideal. I have shared this topic internally, and we have these use cases in mind for future API evolutions.
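As a rough sketch of the single-year option, the analysis from the question could be restricted with a where clause. This is an illustration under assumptions, not a confirmed recipe: it assumes that `hour` is a datetime field, that the widget accepts an `ods-adv-analysis-where` attribute written in ODSQL, and the year 2023 is arbitrary. Filtering the context itself would be another way to get the same effect.

```html
<!-- Sketch only. Assumptions: `hour` is a datetime field, 2023 is an
     arbitrary year, and the where clause is written in ODSQL. -->
<div ods-adv-analysis="analysis"
     ods-adv-analysis-context="mycontext"
     ods-adv-analysis-select="SUM(value) as power"
     ods-adv-analysis-where="hour >= date'2023-01-01' AND hour < date'2024-01-01'"
     ods-adv-analysis-group-by="hour">
    <!-- Second step, computed in the browser on at most one year of hourly rows -->
    <div ods-subaggregation="analysis"
         ods-subaggregation-serie-pmax="MAX(power)">
        {{ results[0].pmax }}
    </div>
</div>
```

With one year of hourly groups, the first step stays under the 20,000-row limit, so the MAX is computed over the complete set of hours for that year.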

@Guillaume Perrin-Fabre, ods-adv-analysis has two limits, depending on the type of query.

When you list items, the widget relies on the /records endpoint, which is limited to 100 results. Asking for more returns an error from the API. (I encourage you to always keep the browser console open, so you can see the 400 errors in the console or network tab and read the API error message, which is explicit.)

When you perform a group by, the query goes through the aggregation endpoint/feature, and therefore the 20K-result limit applies instead.

In your case, if you want to list records and go beyond the 100 limit, add a group by containing all the fields listed in the select clause. It will output the same thing, but without the 100-result limitation.
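Applied to the snippet from the question, the suggestion might look like the sketch below. It reuses the same variable, context, and field names, and simply repeats the select list in `ods-adv-analysis-group-by`. The `{{ maintenances.length }}` expression is only there to check how many rows come back, and the `kaptamaintenance` context is assumed to be declared elsewhere on the page.

```html
<!-- Sketch only: the select list from the question, repeated as the group by. -->
<div ods-adv-analysis="maintenances"
     ods-adv-analysis-context="kaptamaintenance"
     ods-adv-analysis-select="nom,date_end,cause,nbchgtsonde,nbchgtgsm,nbchgtbattery,nbnettoyage,nbchgtsim,chgt_freq"
     ods-adv-analysis-group-by="nom,date_end,cause,nbchgtsonde,nbchgtgsm,nbchgtbattery,nbnettoyage,nbchgtsim,chgt_freq">
    <!-- Number of rows returned, to compare against the 100-result limit -->
    kaptamaintenance: {{ maintenances.length }}
</div>
```

One thing to keep in mind: because this is a group by, records that are identical on every listed field collapse into a single row, so the count can be lower than the number of raw records.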

Super clear, thanks @frederic.passaniti!

Thanks @frederic.passaniti, your answer is much appreciated!
I understand that two-step aggregation is indeed tricky, and I hope to see the API enhanced for this type of usage.

For now, I will explore your suggestion to create an intermediate dataset.
