Re-uploading the same chunk is counted multiple times

  • 0
  • Self Hosted
  • Storage
Joshi
15 Jul, 2024, 09:46

I'm currently implementing a "retry" feature: when a request fails and certain conditions are met (e.g. rate limit triggered, connection lost, server down), the request is retried. It works perfectly, except with the Appwrite Storage service.

If the connection drops while a file chunk is being uploaded, that chunk is obviously corrupted. That's fine, because we can re-upload the chunk and continue from there. This is where the issue arises.

Appwrite uses Utopia PHP - Storage for its storage implementation: https://github.com/utopia-php/storage/blob/main/src/Storage/Device/Local.php

How it works in short: when we upload a file that is bigger than the chunk size, the file is split into individual chunks, and the number of each received chunk is written to the "<filename>_chunks.log" file.

I tested the retry feature by turning the internet off on my device and then turning it on again. The app detects that the connection has been lost and keeps retrying from there. If the connection drops while a file chunk is being uploaded, it retries the same chunk; otherwise it continues with the next chunk. Re-uploading the corrupted chunk does in fact overwrite the chunk on the server, but the chunk number is written to the log file ("<filename>_chunks.log") a second time.

```
2
[...]
6
7
7 <- Duplicate
8
9
[...]
54
55
56
56 <- Duplicate
57
[...]
109
110
110 <- Duplicate
111
112
```

This is what the log looks like. The connection dropped while I was uploading the 7th file chunk, so my app re-uploaded the same chunk. Appwrite overwrote the chunk and wrote 7 to the log file again. The same happened for 56 and 110.
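The retry behaviour described above can be sketched roughly as follows. This is a minimal PHP sketch, not the actual app code: `uploadChunks` and `$sendChunk` are hypothetical names, and `$sendChunk` stands in for the real upload request, assumed here to throw a `RuntimeException` when the connection drops.

```php
<?php
/**
 * Hypothetical upload loop: retries the same chunk until it succeeds,
 * then moves on. Returns the chunk number of every request that was sent.
 */
function uploadChunks(int $chunksTotal, callable $sendChunk, int $maxAttempts = 5): array
{
    $requests = [];
    for ($chunk = 1; $chunk <= $chunksTotal; $chunk++) {
        for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
            $requests[] = $chunk;
            try {
                $sendChunk($chunk);
                break; // chunk accepted, continue with the next one
            } catch (RuntimeException $e) {
                if ($attempt === $maxAttempts) {
                    throw $e; // give up after the last attempt
                }
            }
        }
    }
    return $requests;
}

// Simulate a connection drop on the first attempt at chunk 2.
$dropped = false;
$requests = uploadChunks(3, function (int $chunk) use (&$dropped): void {
    if ($chunk === 2 && !$dropped) {
        $dropped = true;
        throw new RuntimeException('connection lost');
    }
});
echo implode(',', $requests), "\n"; // 1,2,2,3
```

The point is that chunk 2 reaches the server twice, which is exactly the kind of request sequence that produces a duplicate line in the chunk log.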

On the 112th file chunk upload I get a 500 "general_unknown" error back, because Appwrite only expects 112 chunks for that file: when the log file reaches 112 lines, it thinks the upload is finished and tries to build the file.
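To illustrate the mechanism, here is a simplified model with 5 chunks instead of 112. It is only a sketch under the assumption that one log line is appended per request and that completion is detected purely by line count, as in the snippets quoted in this post; the temp file and variable names are made up for the example.

```php
<?php
// Simplified model: 5 chunks in total, chunk 2 is sent twice because of a retry.
$chunksTotal = 5;
$requests = [1, 2, 2, 3, 4, 5];
$tmp = tempnam(sys_get_temp_dir(), 'chunks_');

foreach ($requests as $chunk) {
    // One log line per request, duplicates included.
    file_put_contents($tmp, "$chunk\n", FILE_APPEND);
    $chunksReceived = count(file($tmp));

    if ($chunksReceived === $chunksTotal) {
        // The line count says "complete" before the last chunk has arrived.
        echo "Log has $chunksReceived lines after chunk $chunk\n"; // Log has 5 lines after chunk 4
        break;
    }
}

unlink($tmp);
```

In this model the upload looks finished while chunk 5 is still missing, so any attempt to join the parts at that point would fail on a missing part file.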

This method writes a log entry for each chunk received:

```php
if (! file_put_contents($tmp, "$chunk\n", FILE_APPEND)) {
    throw new Exception('Can\'t write chunk log '.$tmp);
}
```

Counts the received chunks:

```php
$chunksReceived = count(file($tmp));
```

Finally, it checks if all chunks have been received:

```php
if ($chunks === $chunksReceived) {
    $this->joinChunks($path, $chunks);
    return $chunksReceived;
}
```
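One possible direction for a fix would be to count distinct chunk numbers instead of log lines. The following is only a sketch of that idea, not the library's code: `countUniqueChunks` is a hypothetical helper, and it assumes the log contains one chunk number per line.

```php
<?php
/**
 * Hypothetical helper: count each chunk number in the log only once.
 */
function countUniqueChunks(string $logPath): int
{
    $lines = file($logPath, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($lines === false) {
        throw new Exception('Can\'t read chunk log '.$logPath);
    }
    return count(array_unique($lines));
}

// Log with chunk 2 written twice, as after a retried upload.
$tmp = tempnam(sys_get_temp_dir(), 'chunks_');
foreach ([1, 2, 2, 3] as $chunk) {
    file_put_contents($tmp, "$chunk\n", FILE_APPEND);
}

echo count(file($tmp)), "\n";       // 4: every line is counted
echo countUniqueChunks($tmp), "\n"; // 3: each chunk is counted once

unlink($tmp);
```

With a count like this, a retried chunk would not push the total towards the expected number of chunks any faster than a first-time upload.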

Final response after uploading all file chunks:

```json
{
   "$id": "6694ebdf60a3886824fe",
   "bucketId": "x",
   "$createdAt": "2024-07-15T09:29:06.218+00:00",
   "$updatedAt": "2024-07-15T09:31:43.885+00:00",
   "$permissions": [
      "read(\"any\")"
   ],
   "name": "Magic of Hong Kong. Mind-blowing cyberpunk drone video of the craziest Asia’s city by Timelab.pro.mp4",
   "signature": null,
   "mimeType": null,
   "sizeOriginal": 583648850,
   "chunksTotal": 112,
   "chunksUploaded": 115
}
```
That's why we get 115 chunksUploaded in the final response: Utopia-PHP Storage counts each log line as one chunk. When signature and mimeType are null, it's safe to assume that the final file is corrupted and could not be built.
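On the client side, this could be turned into a sanity check on the final response. The sketch below is an assumption based only on the fields visible in the response above; `looksCorrupted` is a hypothetical name, and the check is only meaningful once the client has sent the last chunk.

```php
<?php
/**
 * Hypothetical check on the final upload response (decoded as an array).
 */
function looksCorrupted(array $file): bool
{
    // Missing metadata suggests the file was never assembled.
    $missingMetadata = ($file['signature'] ?? null) === null
        || ($file['mimeType'] ?? null) === null;

    // More counted chunks than expected suggests duplicated log entries.
    $overCounted = ($file['chunksUploaded'] ?? 0) > ($file['chunksTotal'] ?? 0);

    return $missingMetadata || $overCounted;
}

$response = json_decode(
    '{"signature": null, "mimeType": null, "chunksTotal": 112, "chunksUploaded": 115}',
    true
);
var_dump(looksCorrupted($response)); // bool(true)
```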

Error in appwrite container:
```
Warning: file_get_contents(/storage/uploads/app-test/x/tmp_6694ebdf60a3886824fe.mp4/6694ebdf60a3886824fe.part.112): Failed to open stream: No such file or directory in /usr/src/code/vendor/utopia-php/storage/src/Storage/Device/Local.php on line 181
[Error] Timestamp: 2024-07-15T09:31:41+00:00
[Error] Method: POST
[Error] URL: /v1/storage/buckets/:bucketId/files
[Error] Type: Exception
[Error] Message: Failed to read chunk /storage/uploads/app-test/x/tmp_6694ebdf60a3886824fe.mp4/6694ebdf60a3886824fe.part.112
[Error] File: /usr/src/code/vendor/utopia-php/storage/src/Storage/Device/Local.php
[Error] Line: 183
```

I just want to confirm that my thought process is right. I would open a GitHub issue, but I'm not sure whether I should open it in the Appwrite or the Utopia PHP Storage repo for better visibility.
TL;DR
Developers implement a retry feature for failed requests, but encounter a bug where uploading the same file chunk multiple times leads to issues with chunk counting and file corruption. The Utopia PHP Storage logs the chunk numbers and expects a specific number of chunks, leading to errors. The final file is marked as corrupted due to discrepancies in chunk counting and file building process. The error message indicates a failed read chunk operation. Consider opening an issue in either the Appwrite or Utopia PHP Storage repository for further assistance. Solution: Review and potentially update the chunk counting and handling mechanisms in the Utopia PHP Storage implementation to ensure accurate chunk tracking and
Joshi
15 Jul, 2024, 09:48

Not sure what this button is for, but I think retrying the upload won't be possible until this bug is solved.
