As I just read on the Appwrite Twitter account, you're cooking up a new runtime designed for machine learning tasks. Because Appwrite scales with Docker Swarm, does that mean that if an ML task is given to the swarm, it is automatically distributed across the swarm nodes to speed up the learning process? Technically, would this mean that we can easily scale ML tasks across multiple server machines? Also, would the ML runtime support distributed training of large language models like LLaMA?
Hi - The runtime will basically be a Python runtime with all the system libraries needed for machine learning, so it becomes easier for ML devs to work with. Of course, we are open to gathering community feedback to see how we can improve it. That being said, no: in the first iteration of the release, the ML runtime will not support distributed training of large language models like LLaMA, but it will be added in later iterations if community feedback suggests it.
It will scale exactly the same as any other Appwrite Function, BUT it will have proper access to the host machine's GPU, which is almost a necessity for machine learning.
Thank you Jyoti! 🙂 I would strongly support the ability to distribute LLM training over multiple Appwrite swarm nodes.
Thanks for the feedback! We will definitely consider it ❤️