Compute normalized text embeddings and print cosine similarity.
Download GGUF files from the Hugging Face Hub.
D bindings and wrappers for llama.cpp.
Multimodal inference CLI — feed an image (and optional text) to a vision model.
Demonstrate context-state save and load for reproducible generation.
Minimal text-completion example. Usage: simple -m model.gguf [-n n_predict] [-ngl n_gpu_layers] [prompt]
Print each token id and its string piece for the given text.
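
The embedding comparison in the first entry can be sketched roughly as follows. Once two embedding vectors are L2-normalized, their cosine similarity reduces to a plain dot product. This is a minimal illustration, not the bindings' API: the helper names `l2_normalize` and `cosine_similarity` are hypothetical, and the vectors are hard-coded stand-ins for what a model would produce.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity computed as the dot product of the normalized vectors."""
    if len(a) != len(b):
        raise ValueError("embedding dimensions differ")
    return sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))


if __name__ == "__main__":
    # Hypothetical 2-dimensional embeddings; real ones have hundreds of dimensions.
    emb_a = [3.0, 4.0]
    emb_b = [4.0, 3.0]
    print(f"cosine similarity: {cosine_similarity(emb_a, emb_b):.4f}")
```

Normalizing up front means each embedding can be stored once in unit form and compared against many others with a single dot product each.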