A human-like learning technique allows AI on edge devices to keep learning over time.


With the PockEngine training method, machine-learning models can efficiently and continuously learn from user data on edge devices like smartphones.

Such personalized deep-learning models could power intelligent chatbots that learn to understand a particular user’s accent, or smart keyboards that continually adapt to better predict the next word based on a person’s typing history. Achieving this customization requires repeatedly fine-tuning a machine-learning model with fresh data.

Because devices such as smartphones lack the memory and computing power that this fine-tuning process demands, user data is typically sent to cloud servers where the model is updated. But transmitting the data consumes a great deal of energy, and sending sensitive user data to an external cloud server poses a security risk.

Researchers from MIT, the MIT-IBM Watson AI Lab, and elsewhere developed a technique that enables deep-learning models to efficiently adapt to new sensor data directly on an edge device.

Their on-device training method, called PockEngine, determines which parts of a large machine-learning model need to be updated to improve accuracy, and stores and computes only those pieces. The bulk of this computation is performed while the model is being prepared, before runtime, which minimizes computational overhead and speeds up fine-tuning.

Compared with other approaches, PockEngine performed on-device training up to 15 times faster on some hardware platforms, without any loss of model accuracy. The researchers also found that fine-tuning with their method enabled a popular AI chatbot to answer complex questions more accurately.

On-device fine-tuning can enable better privacy, lower costs, customization, and even lifelong learning, but it is difficult in practice: everything must happen within the limited resources of an edge device. The goal is to be able to run both training and inference on edge devices. “Now we can,” says Song Han, an associate professor in EECS, a member of the MIT-IBM Watson AI Lab, a distinguished scientist at NVIDIA, and senior author of an open-access paper describing PockEngine.

The paper is co-authored by lead author Ligeng Zhu, a PhD student in EECS, together with colleagues from MIT, the MIT-IBM Watson AI Lab, and the University of California San Diego. The research was recently presented at the IEEE/ACM International Symposium on Microarchitecture.

In the realm of deep learning, models are constructed based on neural networks, comprising interconnected layers of nodes, or “neurons.” These neurons process data to generate predictions. When the model is set in motion, a process known as inference takes place. During inference, a data input, such as an image, traverses through the layers until a prediction, like an image label, is produced at the end. Notably, each layer doesn’t need to be stored after processing the input during inference.

However, during training and fine-tuning, the model undergoes backpropagation: after the output is compared to the correct answer, the model is run in reverse, and each layer is updated to bring the output closer to that answer. Since fine-tuning may require updating every layer, the entire model and all intermediate results must be stored, making training far more memory-intensive than inference.
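The memory asymmetry described above can be illustrated with a minimal sketch (not PockEngine itself; the two-layer network, weights, and loss are purely illustrative): inference can discard each activation once the next layer consumes it, while backpropagation must keep every intermediate activation around to compute per-layer gradients.

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 8))   # first layer weights
W2 = rng.normal(size=(8, 1))   # second layer weights

def inference(x):
    # Each activation can be discarded as soon as the next layer uses it.
    h = np.tanh(x @ W1)
    return h @ W2

def training_step(x, y):
    # Backpropagation must retain every intermediate activation (x and h)
    # so that gradients for each layer can be computed afterwards.
    h = np.tanh(x @ W1)            # must be stored
    y_hat = h @ W2                 # must be stored
    d_out = 2 * (y_hat - y)        # gradient of squared error w.r.t. output
    grad_W2 = h.T @ d_out          # needs the stored h
    d_h = (d_out @ W2.T) * (1 - h ** 2)
    grad_W1 = x.T @ d_h            # needs the stored x
    return grad_W1, grad_W2

x = rng.normal(size=(2, 4))
y = np.ones((2, 1))
g1, g2 = training_step(x, y)
print(g1.shape, g2.shape)  # a gradient exists for every layer's weights
```

Note how `training_step` cannot free `x` or `h` until the backward pass finishes, which is exactly why full fine-tuning needs much more memory than inference.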

Interestingly, not all layers contribute equally to accuracy improvement; even for crucial layers, the entire layer may not require updates. These unessential layers or parts thereof don’t need to be stored. Moreover, there might be no need to trace back to the initial layer for accuracy enhancement; the process could be halted midway.

PockEngine capitalizes on these insights to accelerate the fine-tuning process and reduce the computational and memory demands. The system sequentially fine-tunes each layer for a specific task, measuring accuracy improvement after each layer adjustment. This approach allows PockEngine to discern the contribution of each layer, assess trade-offs between accuracy and fine-tuning costs, and automatically determine the percentage of each layer that requires fine-tuning.
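The selection idea can be sketched as a small budgeted ranking problem. This is a hedged illustration of the concept only, not the actual PockEngine implementation; the layer names, accuracy gains, and memory costs below are made-up numbers standing in for the measurements the system collects.

```python
# Measured per-layer statistics (illustrative values, not real data):
# how much accuracy improved when only that layer was fine-tuned, and
# roughly how much memory updating it would cost on-device.
measured = {
    "layer1": {"accuracy_gain": 0.1, "cost_mb": 12.0},
    "layer2": {"accuracy_gain": 2.4, "cost_mb": 8.0},
    "layer3": {"accuracy_gain": 3.1, "cost_mb": 9.5},
    "layer4": {"accuracy_gain": 0.3, "cost_mb": 15.0},
}

def select_layers(measured, memory_budget_mb):
    # Greedily keep the layers with the best gain-per-cost ratio that
    # still fit within the device's memory budget.
    ranked = sorted(
        measured.items(),
        key=lambda kv: kv[1]["accuracy_gain"] / kv[1]["cost_mb"],
        reverse=True,
    )
    chosen, used = [], 0.0
    for name, stats in ranked:
        if used + stats["cost_mb"] <= memory_budget_mb:
            chosen.append(name)
            used += stats["cost_mb"]
    return chosen

print(select_layers(measured, memory_budget_mb=20.0))
```

With these toy numbers, only the two layers whose updates pay for themselves make the cut; the low-contribution layers are left frozen and never stored.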

Han emphasizes, “This method aligns closely with the accuracy achieved through full backpropagation across various tasks and neural networks.”

Traditionally, generating the backpropagation graph requires a large amount of computation that takes place at runtime. PockEngine instead performs this work at compile time, before the model is deployed.

PockEngine deletes redundant sections of code to remove unnecessary layers or pieces of layers, creating a pared-down graph of the model to be used at runtime. It then performs further optimizations on this graph to improve efficiency.

Since all of this only needs to be done once, there is minimal computational overhead at runtime.

“It is as if you were preparing for a hike in the woods: before setting out, you sit down at home and plan which routes you will take and which you will skip,” Han explains.
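The compile-time pruning described above can be sketched as follows. This is an assumption-laden illustration, not the real PockEngine compiler: gradient nodes are represented as plain strings, and the `prune` helper is a hypothetical name. It captures two ideas from the text: only the selected layers' gradient nodes are kept, and backpropagation can halt midway instead of tracing back to the first layer.

```python
# Backward graph in execution order: backprop runs from the last layer
# toward the first (node names are illustrative).
backward_graph = ["grad_layer4", "grad_layer3", "grad_layer2", "grad_layer1"]

def prune(backward_graph, selected):
    # Find the deepest point backprop must reach; everything past it is
    # cut entirely (the process halts midway).
    needed = [i for i, node in enumerate(backward_graph)
              if node.removeprefix("grad_") in selected]
    last = max(needed)
    # Within the kept prefix, only the selected layers' gradient nodes stay.
    return [node for node in backward_graph[: last + 1]
            if node.removeprefix("grad_") in selected]

# Suppose the earlier analysis selected only layers 2 and 3 for fine-tuning:
print(prune(backward_graph, {"layer3", "layer2"}))
```

The pruned graph is fixed before deployment, so the device executes a small, static sequence of updates instead of building and traversing the full backward graph on every iteration.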

When the researchers ported PockEngine to deep-learning models running on different edge devices, including Apple M1 chips, the digital signal processors commonly found in smartphones, and Raspberry Pi computers, it trained models up to 15 times faster than other solutions, with no loss of accuracy. Fine-tuning with PockEngine also required far less memory.

Furthermore, the team applied the technique to the large language model Llama-V2. With large language models, fine-tuning involves providing many examples, and it is crucial for these models to learn how to interact with users, Han says. The process is also important for models tasked with solving complex problems or reasoning about solutions.

For example, Llama-V2 models fine-tuned with PockEngine correctly answered the question, “What was Michael Jackson’s last album?” while models that were not fine-tuned failed to do so. With PockEngine, each iteration of the fine-tuning process took less than one second, down from about seven seconds, on an NVIDIA Jetson Orin, an edge GPU platform.

Looking ahead, the researchers aim to leverage PockEngine for fine-tuning even larger models designed to handle both text and images simultaneously.

“This research tackles the increasing efficiency challenges posed by the widespread adoption of large AI models like LLMs across various applications in diverse industries. It not only shows promise for edge applications incorporating larger models but also has the potential to reduce the cost associated with maintaining and updating large AI models in the cloud,” says Ehry MacRostie, a senior manager in Amazon’s Artificial General Intelligence division. MacRostie, who was not involved in this study, collaborates with MIT on related AI research through the MIT-Amazon Science Hub.

Support for this work was provided, in part, by the MIT-IBM Watson AI Lab, the MIT AI Hardware Program, the MIT-Amazon Science Hub, the National Science Foundation (NSF), and the Qualcomm Innovation Fellowship.
