Bringing Language Models and Chat to Web Browsers with ML Compilation on TVM Unity


The mlc-ai/web-llm GitHub repository hosts a project that brings language models and chat to web browsers with no server support. The project uses machine learning compilation (MLC) on TVM Unity, together with WebGPU, to run models natively on the GPU from inside the browser. Running on the client also supports personal AI models, with lower cost, better privacy protection, and more personalization. To make the models fit into memory, the project applies compression techniques, memory planning optimizations, and int4 quantization. It also uses the TVM web runtime, Emscripten and TypeScript, and a WebAssembly port of the SentencePiece tokenizer.
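To make the int4 quantization mentioned above more concrete, here is a minimal sketch of how 4-bit quantization can shrink weights. It assumes a simple symmetric scheme with one scale for the whole tensor; the summary does not say which scheme web-llm actually uses, and the function names `quantize_int4` and `dequantize_int4` are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple


def quantize_int4(weights: List[float]) -> Tuple[bytes, float]:
    """Quantize floats to signed 4-bit integers and pack two per byte.

    Assumption: symmetric quantization with a single scale for the
    whole list. This is an illustrative scheme, not web-llm's own.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Map the largest magnitude to 7, the largest positive int4 value.
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    quantized = [max(-8, min(7, round(w / scale))) for w in weights]
    if len(quantized) % 2:
        quantized.append(0)  # pad so values pair up evenly
    packed = bytearray()
    for low, high in zip(quantized[0::2], quantized[1::2]):
        # Keep the two's-complement low nibble of each value.
        packed.append((low & 0x0F) | ((high & 0x0F) << 4))
    return bytes(packed), scale


def dequantize_int4(packed: bytes, scale: float, count: int) -> List[float]:
    """Unpack 4-bit values and rescale them to approximate floats."""
    restored: List[float] = []
    for byte in packed:
        for nibble in (byte & 0x0F, byte >> 4):
            # Nibbles 8..15 encode the negative values -8..-1.
            signed = nibble - 16 if nibble >= 8 else nibble
            restored.append(signed * scale)
    return restored[:count]


if __name__ == "__main__":
    weights = [0.7, -0.35, 0.1, 0.0, -0.7]
    packed, scale = quantize_int4(weights)
    print(len(packed), "bytes for", len(weights), "weights")
    print(dequantize_int4(packed, scale, len(weights)))
```

Each weight takes half a byte here instead of the two bytes a 16-bit float would need, at the cost of a rounding error of at most half the scale per weight. Practical schemes usually quantize in small groups with a scale per group to keep that error down.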

The project is made possible by the open-source ecosystem, including the Apache TVM, Hugging Face, LLaMA, Alpaca, Vicuna, Dolly, WebAssembly, Emscripten, and WebGPU communities. The work is done mostly in Python, with a JavaScript app of about 600 lines of code connecting the pieces. The repository also discusses a comparison to the native GPU runtime, along with limitations and opportunities.


Read More in AiShorts

DeepFloyd IF: A Novel State-of-the-Art Open-Source Text-to-Image Model by StabilityAI

DeepFloyd IF is a modular, state-of-the-art open-source text-to-image model composed of a frozen text encoder and three cascaded pixel diffusion modules....


Bark: Transformer-based Text-to-Audio Generation Model by Suno-ai

Bark is a transformer-based text-to-audio model created by Suno. It can generate highly realistic, multilingual speech as well as other types of audio, i...


h2oGPT - The world's best open source GPT

h2oGPT is an open-source repository that provides code, data, and models for large language models or GPT. It includes code for preparing instruction dat...


StableLM: Ongoing Development of Stability AI Language Models

This GitHub repository is dedicated to the ongoing development of Stability AI's StableLM series of language models, including the recently released Stab...


BabyAGI-ASI: A Python Script for Autonomous and Self-Improving Agents

BabyAGI-ASI is a Python script example of a task-driven autonomous agent powered by OpenAI API designed to provide an assistant with tools to complete an...



Microsoft releases DeepSpeed-Chat for easy, fast, and affordable training of ChatGPT-like models

DeepSpeed-Chat is an end-to-end RLHF pipeline for training powerful ChatGPT-like models that can summarize, code, and translate with top results. Existin...


JARVIS: A Collaborative System for Solving Complicated AI Tasks

JARVIS is a collaborative system that uses an LLM as the controller and numerous expert models as collaborative executors (from HuggingFace Hub) to solve...


VideoCrafter: A Toolkit for Text-to-Video Generation and Editing

VideoCrafter is an open-source video generation and editing toolbox for creating video content. It has three types of models: Base T2V for generic text-t...


Auto-GPT: An Autonomous GPT-4 Experiment

Auto-GPT is an experimental open-source application showcasing the capabilities of the GPT-4 language model. It is an attempt to make GPT-4 fully autonom...


Twitter's Recommendation Algorithm: Source Code for Building Feeds of Content

The Twitter Recommendation Algorithm is an open-source set of services and jobs responsible for constructing and serving the Home Timeline. It includes d...
