| Title: | Ellmer-Native llama.cpp Chats for R |
|---|---|
| Description: | Provides an ellmer-style chat interface backed by native llama.cpp inference. The package vendors llama.cpp, exposes a chat_llamacpp() constructor for local GGUF models, supports token streaming, basic tool-calling loops, and helpers for downloading a curated default model. |
| Authors: | Alex Kraieski [aut, cre] |
| Maintainer: | Alex Kraieski <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.2 |
| Built: | 2026-05-11 07:15:16 UTC |
| Source: | https://github.com/arkraieski/llamacppR |
Creates a local chat object backed by native llama.cpp inference while
following the ellmer chat API style.

chat_llamacpp(system_prompt = NULL, model, seed = NULL, params = ellmer::params(), echo = c("none", "output", "all"), n_ctx = 2048L, n_batch = n_ctx, n_threads = 0L, n_gpu_layers = 0L)

| Argument | Description |
|---|---|
| system_prompt | Optional system prompt. |
| model | Path to a local GGUF model file. |
| seed | Optional seed forwarded to llama.cpp sampling. |
| params | An |
| echo | Whether to echo generated output. |
| n_ctx | Context size. |
| n_batch | Batch size used for prompt evaluation. |
| n_threads | CPU threads used by llama.cpp. |
| n_gpu_layers | Number of layers to offload to GPU when supported. |

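
A minimal sketch of how the constructor might be called is shown below. It assumes the package is installed and that `model_path` (a hypothetical location) points to a GGUF file you already have. It also assumes the returned object exposes a `$chat()` method, as ellmer chat objects do; the text above only says the API follows the ellmer style, so treat that call as an assumption rather than documented behaviour.

```r
# Hypothetical path; replace with a GGUF file that exists on your machine.
model_path <- file.path(tempdir(), "example-model.gguf")

# Collect the documented constructor arguments in one place.
chat_args <- list(
  system_prompt = "You are a terse assistant.",
  model         = model_path,
  seed          = 42L,
  echo          = "none",
  n_ctx         = 2048L,
  n_threads     = 0L,
  n_gpu_layers  = 0L
)

# Only attempt inference when the package and the model file are both present.
if (requireNamespace("llamacppR", quietly = TRUE) && file.exists(model_path)) {
  chat <- do.call(llamacppR::chat_llamacpp, chat_args)
  # Assumption: `$chat()` behaves as it does on ellmer chat objects.
  reply <- chat$chat("Name one advantage of local inference.")
  print(reply)
}
```

Keeping the arguments in a list makes it easy to reuse the same settings elsewhere; the guard means the snippet does nothing when the package or the model file is missing.
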
Returns the local cache path used by llamacppR for one of the curated default GGUF model presets.

llamacpp_default_model_path(model = c("3b", "0.5b", "starcoder", "deepseek"))

| Argument | Description |
|---|---|
| model | Which curated default model path to return. |

Downloads a curated GGUF model from Hugging Face and returns the local path.

llamacpp_download_default_model(model = c("3b", "0.5b", "starcoder", "deepseek"), path = NULL, force = FALSE)

| Argument | Description |
|---|---|
| model | Which curated default model to download. |
| path | Destination path for the downloaded model. |
| force | Whether to overwrite an existing file. |

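
The two default-model helpers can be combined so that a download only happens when the file is not already cached. The sketch below is illustrative only: the `allow_download` switch is a hypothetical safeguard, and it assumes `llamacpp_default_model_path()` returns the expected cache location even when the file has not been downloaded yet.

```r
# Opt-in switch (hypothetical) so the sketch never downloads by accident.
allow_download <- FALSE

# One of the default model choices shown in the usage above.
preset <- "0.5b"

if (allow_download && requireNamespace("llamacppR", quietly = TRUE)) {
  # Where the package would cache this default model.
  expected <- llamacppR::llamacpp_default_model_path(model = preset)

  # Reuse the cached file if present; otherwise download it.
  # force = FALSE leaves an existing file untouched.
  local_path <- if (file.exists(expected)) {
    expected
  } else {
    llamacppR::llamacpp_download_default_model(model = preset, force = FALSE)
  }
  print(local_path)
}
```

Set `allow_download <- TRUE` to actually fetch the model; the files can be large, which is why the sketch is opt-in.
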
Downloads one of the curated model presets shipped with llamacppR.

llamacpp_download_model(model = c("qwen_3b", "qwen_0_5b", "starcoder", "deepseek"), path = NULL, force = FALSE)

| Argument | Description |
|---|---|
| model | Preset id or alias. |
| path | Destination path for the downloaded model. |
| force | Whether to overwrite an existing file. |

Validates the magic bytes at the start of a file to determine whether it looks like a GGUF model.

llamacpp_is_gguf(path)

| Argument | Description |
|---|---|
| path | Path to inspect. |

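
The kind of check described above can be approximated in base R. This sketch is not the package's implementation; it relies only on the GGUF convention that a file begins with the four ASCII bytes `GGUF`. The helper name `looks_like_gguf()` is hypothetical.

```r
# Hypothetical helper; not part of llamacppR.
looks_like_gguf <- function(path) {
  # Missing paths and directories cannot be GGUF files.
  if (!file.exists(path) || dir.exists(path)) {
    return(FALSE)
  }
  con <- file(path, open = "rb")
  on.exit(close(con), add = TRUE)
  # Read at most four bytes; shorter files simply yield fewer bytes.
  magic <- readBin(con, what = "raw", n = 4L)
  # A GGUF file starts with the ASCII bytes "G", "G", "U", "F".
  identical(magic, charToRaw("GGUF"))
}

# Demonstration with two throwaway files.
good <- tempfile(fileext = ".gguf")
bad  <- tempfile(fileext = ".gguf")
writeBin(c(charToRaw("GGUF"), as.raw(c(3, 0, 0, 0))), good)
writeBin(charToRaw("not a model"), bad)

looks_like_gguf(good)
looks_like_gguf(bad)
```

A check like this only inspects the header, so a truncated or otherwise corrupt file can still pass it; loading the model is the real test.
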
Lists GGUF files found in the local llamacppR cache directory and marks whether they match one of the curated default model presets.

llamacpp_list_models(path = llamacpp_cache_dir(), recursive = TRUE)

| Argument | Description |
|---|---|
| path | Directory to scan for GGUF files. |
| recursive | Whether to scan subdirectories recursively. |

Loads a GGUF model through native llama.cpp bindings and returns basic metadata.

llamacpp_model_info(model, n_ctx = 2048L, n_batch = n_ctx, n_threads = 0L, n_gpu_layers = 0L)

| Argument | Description |
|---|---|
| model | Path to a GGUF file. |
| n_ctx | Context size used when opening the model. |
| n_batch | Batch size used when opening the model. |
| n_threads | Number of CPU threads. |
| n_gpu_layers | Number of GPU layers to offload when supported. |

Returns the local cache path used by llamacppR for a curated model preset.

llamacpp_model_path(model = c("qwen_3b", "qwen_0_5b", "starcoder", "deepseek"))

| Argument | Description |
|---|---|
| model | Preset id or alias. |

Returns the curated model catalog shipped with llamacppR, including stable preset ids, aliases, filenames, approximate sizes, and short descriptions.

llamacpp_model_presets()

Explicitly releases the native llama.cpp model and context associated with a chat or session object.

llamacpp_unload(x)

| Argument | Description |
|---|---|
| x | A chat object created by |

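
One way to make sure native resources are always released is to pair the constructor with `llamacpp_unload()` inside `on.exit()`. The wrapper below is a sketch under stated assumptions: `with_local_chat()` is a hypothetical helper, not part of the package, and the `$chat()` call assumes the object behaves like an ellmer chat.

```r
# Hypothetical wrapper: run `fn(chat)` and always release the model afterwards.
with_local_chat <- function(model, fn, ...) {
  chat <- llamacppR::chat_llamacpp(model = model, ...)
  # Runs even if `fn` throws an error.
  on.exit(llamacppR::llamacpp_unload(chat), add = TRUE)
  fn(chat)
}

# Hypothetical path; replace with a real GGUF file.
model_path <- file.path(tempdir(), "example-model.gguf")

if (requireNamespace("llamacppR", quietly = TRUE) && file.exists(model_path)) {
  answer <- with_local_chat(
    model_path,
    # Assumption: `$chat()` works as on ellmer chat objects.
    function(chat) chat$chat("Summarise GGUF in one sentence."),
    n_ctx = 2048L
  )
  print(answer)
}
```

Scoping the chat object to a function call keeps the load and unload steps together, which avoids holding a large model in memory longer than needed.
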