
[SOLVED] Compare large data

  • Databases
Drake
17 Feb, 2023, 00:26

If those are IDs, you could iterate over the new data and issue a get request for each one. Get document API calls are cached so they should be faster than list documents from a server side processing standpoint
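A minimal sketch of this per-ID lookup, with the fetch injected as a callable so a real SDK `getDocument` call could be swapped in (all names here are illustrative, not from the thread):

```python
def find_removed(ids, get_document):
    """Return the subset of `ids` whose documents no longer exist.

    `get_document` is any callable that fetches one document by ID and
    raises KeyError (or an SDK not-found error) when the doc is missing.
    """
    removed = []
    for doc_id in ids:
        try:
            # Per the thread, get-by-ID calls are cached server side,
            # so many small gets can beat a large listDocuments scan.
            get_document(doc_id)
        except KeyError:
            removed.append(doc_id)
    return removed

# Fake in-memory store standing in for the Appwrite database in this sketch.
store = {"a1": {"title": "paper A"}, "b2": {"title": "paper B"}}
print(find_removed(["a1", "b2", "c3"], lambda i: store[i]))  # ['c3']
```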

TL;DR
The user was comparing large amounts of data and wanted to optimize the process. They considered using a string attribute with a JSON lookup table or a string array attribute to handle the data. There were suggestions to use Meilisearch for smarter search capabilities and to combine it with Appwrite. It was mentioned that batching GraphQL requests could handle large volume data better. The solution proposed was to use get requests for each ID as they are faster than listing documents.
Said H
17 Feb, 2023, 01:35

i see. unfortunately those are not document IDs, but custom IDs, which means i can only use listDocuments. I will take note of this. I may consider converting the custom ID into a document ID. Thanks for your input @Steven

VincentGe
17 Feb, 2023, 18:34

What if you enforced "unique" on an index and just failed the createDocument request?

VincentGe
17 Feb, 2023, 18:34

I wonder if the performance would be better with less calls 🤣

Said H
23 Feb, 2023, 09:57

Thanks @VincentGe. The thing is, I'm not trying to insert docs, but merely verifying whether any of my existing papers have been removed.

VincentGe
23 Feb, 2023, 15:02

🤔 interesting. Large volume data handling is really something we haven't done a ton with. I wonder if we can batch this operation with graphql

VincentGe
23 Feb, 2023, 15:02

@Steven From what I remember, one of the cool things about our new graphql api is that we can batch many calls at once 👀

Drake
23 Feb, 2023, 15:16

Ya but in this case, there's a bit too much data to batch in one graphql request

VincentGe
23 Feb, 2023, 15:38

🤔

Olivier Pavie
27 Feb, 2023, 05:12

Eu… just a random thought, @Said H why don’t you combine your Appwrite instance with something like meilisearch? It’s optimal for searching and is very powerful at that…. It could help lessen the burden on Appwrite?

Said H
27 Feb, 2023, 05:49

hey @Olivier Pavie , thank you for the suggestion. I will take a look at meilisearch 🙂

Said H
2 Mar, 2023, 04:21

I just went through the Meilisearch tutorial @Olivier Pavie, and i noticed that we are feeding our JSON data to the system. That gives me an idea that we can do the same in Appwrite, yeah?

  1. We can create a String attribute, set the length to the max, then dump our JSON object string containing >5k records inside. That way we only need to make one query call to Appwrite, and we can do the record comparison afterward.
  2. Or, we can create a String attribute and treat it as an array, inserting the 5k array records. I'm not sure though whether there is any limit to the number of array records we can insert per doc. Would you have any idea @Steven or @VincentGe?
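Option 1 above boils down to pure comparison logic: one fetch of the String attribute, then the diff happens in code. A sketch under those assumptions (the attribute contents and IDs here are made up for illustration):

```python
import json

# What would live in the max-length String attribute: one document, one call.
stored_json = json.dumps({"records": ["id-001", "id-002", "id-003"]})

def removed_records(stored_json, current_ids):
    """Return stored record IDs that are absent from the current data."""
    stored_ids = set(json.loads(stored_json)["records"])
    return sorted(stored_ids - set(current_ids))

print(removed_records(stored_json, ["id-001", "id-003"]))  # ['id-002']
```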
Drake
2 Mar, 2023, 04:24

It depends on what you want to do. I would assume Meilisearch's full text search is a little smarter than mariadb's

Said H
2 Mar, 2023, 04:27

it has smarter search, but i won't need that. i just need to be able to read the records

Said H
2 Mar, 2023, 04:28

i am inclined to create a string array attribute to handle this large data set, but i worry there is a size limit to that

Said H
2 Mar, 2023, 04:29

i can do one call to the doc, then process the comparison in the code

Said H
2 Mar, 2023, 04:29

it's a fast comparison, since each record will have its unique ID

Said H
2 Mar, 2023, 04:29

so, i will end up having 1 doc containing 5k array records

Said H
2 Mar, 2023, 04:30

but i only need to do 1x call to this 1 doc

Said H
2 Mar, 2023, 04:31

i can give it a try, if you are unsure whether there is a limit to the max array record length 🙂

Drake
2 Mar, 2023, 04:32

It might be best to use a very large string attribute and store a JSON string where the JSON string is essentially a lookup table {"id1": 1, "id2": 1}
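The lookup-table idea amounts to storing the IDs as JSON object keys, so after a single fetch each membership check is a constant-time dict lookup rather than a scan. A sketch (values and IDs are illustrative):

```python
import json

# One large String attribute holding a JSON lookup table, as suggested.
stored = json.dumps({"id1": 1, "id2": 1, "id3": 1})

lookup = json.loads(stored)   # one call to the doc, one parse
current = {"id1", "id3"}      # IDs present in the new data

# Any stored key missing from the new data was removed.
removed = [k for k in lookup if k not in current]
print(removed)  # ['id2']
```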

Said H
2 Mar, 2023, 04:32

i see. this is fine too

Said H
2 Mar, 2023, 04:33

i can use this. and the comparison will work in my case as well

Said H
2 Mar, 2023, 04:33

alright. thanks everyone 🙂

Drake
7 Mar, 2023, 19:43

[SOLVED] Compare large data
