Bringing Language Models and Chat to Web Browsers with ML Compilation on TVM Unity

Shreyash Rote / April 16, 2023 6:04am Github

The mlc-ai/web-llm GitHub repository hosts a project that brings language models and chat to web browsers with no server support, so everything runs inside the browser. The project uses machine learning compilation (MLC) on TVM Unity, together with WebGPU, to enable GPU-accelerated execution in the browser. Running personal AI models on the client is intended to reduce cost, protect privacy, and allow more personalization. To make the models fit into memory, the project uses compression techniques, memory planning optimizations, and int4 quantization. It also relies on the TVM web runtime, Emscripten and TypeScript, and a wasm port of the SentencePiece tokenizer.
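To make the int4 quantization step more concrete, here is a minimal sketch of symmetric group-wise 4-bit quantization in plain Python. It is an illustration under stated assumptions, not the scheme web-llm actually uses: the function names, the group size of 32, and the packing layout (two values per byte, low nibble first) are all hypothetical choices made for this example.

```python
def quantize_int4(weights, group_size=32):
    """Quantize floats to 4-bit codes, one scale per group (illustrative only)."""
    if group_size % 2 != 0:
        raise ValueError("group_size must be even so two codes fit in each byte")
    packed = bytearray()
    scales = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        # Map the largest magnitude in the group to the code 7.
        scale = max_abs / 7.0 if max_abs > 0 else 1.0
        scales.append(scale)
        # Clamp to the signed 4-bit range [-8, 7], then shift to unsigned 0..15.
        codes = [max(-8, min(7, round(w / scale))) + 8 for w in group]
        if len(codes) % 2:
            codes.append(8)  # pad the final byte with the code for 0.0
        for low, high in zip(codes[0::2], codes[1::2]):
            packed.append(low | (high << 4))
    return bytes(packed), scales


def dequantize_int4(packed, scales, count, group_size=32):
    """Recover approximate floats from the packed codes and per-group scales."""
    values = []
    for byte in packed:
        for nibble in (byte & 0x0F, byte >> 4):
            if len(values) == count:
                break  # skip the padding nibble, if any
            scale = scales[len(values) // group_size]
            values.append((nibble - 8) * scale)
    return values
```

Each weight takes half a byte plus a small share of its group's scale. As a rough back-of-the-envelope figure, a hypothetical model with 7 billion parameters would need about 3.5 GB for its weights at 4 bits each, compared with about 14 GB at 16 bits each, before counting the scales.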

The project is made possible by the open-source ecosystem, including the Apache TVM, Hugging Face, LLaMA, Alpaca, Vicuna, Dolly, WebAssembly, Emscripten, and WebGPU communities. It is written mainly in Python, with a JavaScript app of about 600 lines of code that connects the pieces. The repository also discusses a comparison to the native GPU runtime, along with limitations and opportunities.
