Candle: Torch Replacement in Rust

candle


Candle is a minimalist ML framework for Rust with a focus on performance (including GPU support)
and ease of use. Try our online demos:
whisper,
llama2.

use candle_core::{Device, Tensor};

let a = Tensor::randn(0f32, 1., (2, 3), &Device::Cpu)?;
let b = Tensor::randn(0f32, 1., (3, 4), &Device::Cpu)?;

let c = a.matmul(&b)?;
println!("{c}");

Check out our examples:

  • Whisper: speech recognition model.
  • Llama and Llama-v2: general LLM.
  • Falcon: general LLM.
  • Bert: useful for sentence embeddings.
  • StarCoder: LLM specialized to code
    generation.
  • Stable Diffusion: text to
    image generative model.
  • DINOv2: computer vision model trained
    using self-supervision (can be used for imagenet classification, depth
    evaluation, segmentation).

Run them using the following commands:

cargo run --example whisper --release
cargo run --example llama --release
cargo run --example falcon --release
cargo run --example bert --release
cargo run --example bigcode --release
cargo run --example stable-diffusion --release -- --prompt "a rusty robot holding a fire torch"
cargo run --example dinov2 --release -- --image path/to/myinput.jpg

To use CUDA, add --features cuda to the example command line. If you have
cuDNN installed, use --features cudnn for even more speedups.
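As a concrete sketch (using the llama example as an arbitrary stand-in), the feature flags are ordinary cargo flags:

```shell
# Run the llama example on GPU; assumes a CUDA toolkit is installed.
cargo run --example llama --release --features cuda

# If cuDNN is installed as well:
cargo run --example llama --release --features cudnn
```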

There are also some wasm examples for whisper and
llama2.c. You can either build them with
trunk or try them online:
whisper,
llama2.

For llama2, run the following command to retrieve the weight files and start a
test server:

cd candle-wasm-examples/llama2-c
wget https://huggingface.co/spaces/lmz/candle-llama2/resolve/main/model.bin
wget https://huggingface.co/spaces/lmz/candle-llama2/resolve/main/tokenizer.json
trunk serve --release --public-url /candle-llama2/ --port 8081

And then head over to
http://localhost:8081/candle-llama2.

Features

  • Simple syntax, looks and feels like PyTorch.
  • Backends.
    • Optimized CPU backend with optional MKL support for x86 and Accelerate for Macs.
    • CUDA backend for efficiently running on GPUs, multiple GPU distribution via NCCL.
    • WASM support, run your models in a browser.
  • Included models.
    • LLMs: Llama v1 and v2, Falcon, StarCoder.
    • Whisper (multi-lingual support).
    • Stable Diffusion.
    • Computer Vision: DINOv2.
  • Serverless (on CPU), small and fast deployments.
  • Quantization support using the llama.cpp quantized types.

How to use

Cheatsheet:

|            | Using PyTorch                      | Using Candle                                                              |
|------------|------------------------------------|---------------------------------------------------------------------------|
| Creation   | torch.Tensor([[1, 2], [3, 4]])     | Tensor::new(&[[1f32, 2.], [3., 4.]], &Device::Cpu)?                       |
| Creation   | torch.zeros((2, 2))                | Tensor::zeros((2, 2), DType::F32, &Device::Cpu)?                          |
| Indexing   | tensor[:, :4]                      | tensor.i((.., ..4))?                                                      |
| Operations | tensor.view((2, 2))                | tensor.reshape((2, 2))?                                                   |
| Operations | a.matmul(b)                        | a.matmul(&b)?                                                             |
| Arithmetic | a + b                              | &a + &b                                                                   |
| Device     | tensor.to(device="cuda")           | tensor.to_device(&Device::Cuda(0))?                                       |
| Dtype      | tensor.to(dtype=torch.float16)     | tensor.to_dtype(&DType::F16)?                                             |
| Saving     | torch.save({"A": A}, "model.bin")  | candle::safetensors::save(&HashMap::from([("A", A)]), "model.safetensors")? |
| Loading    | weights = torch.load("model.bin")  | candle::safetensors::load("model.safetensors", &device)                   |

Structure

FAQ

Why should I use Candle?

Candle’s core goal is to make serverless inference possible. Full machine learning frameworks like PyTorch
are very large, which makes creating instances on a cluster slow. Candle allows deployment of lightweight
binaries.

Secondly, Candle lets you remove Python from production workloads. Python overhead can seriously hurt performance,
and the GIL is a notorious source of headaches.

Finally, Rust is cool! A lot of the HF ecosystem already has Rust crates, like safetensors and tokenizers.

Other ML frameworks

  • dfdx is a formidable crate, with shapes included in the types.
    This prevents a lot of headaches by getting the compiler to complain about shape mismatches right off the bat.
    However, we found that some features still require nightly, and writing code can be a bit daunting for non-Rust experts.

    We’re leveraging and contributing to other core crates for the runtime so hopefully both crates can benefit from each
    other.

  • burn is a general crate that can leverage multiple backends so you can choose the best
    engine for your workload.

  • tch-rs provides bindings to the torch library in Rust. Extremely versatile, but it
    brings the entire torch library into the runtime. The main contributor of tch-rs is also involved in the development
    of candle.

Common Errors

Missing symbols when compiling with the mkl feature.

If you get some missing symbols when compiling binaries/tests using the mkl
or accelerate features, e.g. for mkl you get:

 = note: /usr/bin/ld: (....o): in function `blas::sgemm':
          .../blas-0.22.0/src/lib.rs:1944: undefined reference to `sgemm_'
          collect2: error: ld returned 1 exit status

 = note: some `extern` functions couldn't be found; some native libraries may need to be installed or have their path specified
 = note: use the `-l` flag to specify native libraries to link
 = note: use the `cargo:rustc-link-lib` directive to specify the native libraries to link with Cargo

or for accelerate:

Undefined symbols for architecture arm64:
            "_dgemm_", referenced from:
                candle_core::accelerate::dgemm::h1b71a038552bcabe in libcandle_core...
            "_sgemm_", referenced from:
                candle_core::accelerate::sgemm::h2cf21c592cba3c47 in libcandle_core...
          ld: symbol(s) not found for architecture arm64

This is likely due to a missing linker flag needed to enable the mkl library. You
can try adding the following for mkl at the top of your binary:

extern crate intel_mkl_src;

or for accelerate:

extern crate accelerate_src;

Cannot run the llama example: access to source requires login credentials

Error: request error: https://huggingface.co/meta-llama/Llama-2-7b-hf/resolve/main/tokenizer.json: status code 401

This is likely because you don't have permission to access the llama-v2 model. To fix
this, register on the Hugging Face hub, accept the llama-v2 model conditions, and set
up your authentication token. See issue
#350 for more details.

Tracking down errors

You can set RUST_BACKTRACE=1 to get backtraces when a candle error is
generated.
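For a single run, the variable can be set inline (the whisper example is an arbitrary choice here):

```shell
# Enable backtraces for this one invocation only.
RUST_BACKTRACE=1 cargo run --example whisper --release
```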
