LlamaContext

Functions

decode int decode(llama_batch batch): Decodes a token batch; returns 0 on success.
encode int encode(llama_batch batch): Encodes a batch (encoder-decoder models); returns 0 on success.
getEmbeddings float[] getEmbeddings(): All output embeddings packed contiguously, available when pooling is LLAMA_POOLING_TYPE_NONE or for generative models. Valid after decode; shape is [n_outputs * nEmbd]. Returns null otherwise.
getEmbeddingsIth float[] getEmbeddingsIth(int i): Embeddings for the i-th output token (-1 = last). Returns null for an invalid index.
getEmbeddingsSeq float[] getEmbeddingsSeq(llama_seq_id seqId): Pooled embeddings for a sequence. Returns null when pooling is LLAMA_POOLING_TYPE_NONE.
getLogits const(float)[] getLogits(int idx): Logits at output position idx (-1 = last). Valid until the next decode call.
memoryClear void memoryClear(bool data): Clear the KV cache. Pass data = true to also zero-fill memory.
printPerf void printPerf(): Undocumented in source. Be warned that the author may not have intended to support it.
stateGetData size_t stateGetData(ubyte[] dst): Copy the current state into dst. Returns the number of bytes written.
stateGetSize size_t stateGetSize(): Byte count of the current state. Use this to size a buffer before stateGetData.
stateLoadFile bool stateLoadFile(string path, llama_token[] tokensOut, size_t* tokenCount): Load state from a session file. Returns true on success, in which case tokensOut is filled and *tokenCount holds the number of tokens read.
stateSaveFile bool stateSaveFile(string path, const(llama_token)[] tokens): Save the state to a session file, recording tokens as the session prompt. Returns true on success.
stateSeqGetData size_t stateSeqGetData(ubyte[] dst, llama_seq_id seqId): Copy sequence seqId's KV cache into dst. Returns bytes written.
stateSeqGetSize size_t stateSeqGetSize(llama_seq_id seqId): Byte count required to snapshot sequence seqId.
stateSeqSetData size_t stateSeqSetData(const(ubyte)[] src, llama_seq_id destSeqId): Restore a KV snapshot from src into sequence destSeqId. Returns bytes consumed; 0 means failure.
stateSetData size_t stateSetData(const(ubyte)[] src): Restore the state from src. Returns the number of bytes consumed.
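
The sizing pattern described for stateGetSize and stateGetData can be sketched as a snapshot/restore round trip. This is a minimal sketch, not the binding's own code: snapshotState, restoreState, and FakeContext are hypothetical names introduced here, and FakeContext only stands in for LlamaContext so the example runs without the real library. Treating a return value of 0 from stateSetData as failure is an assumption borrowed from the documented behavior of stateSeqSetData.

```d
import std.exception : enforce;

/// Copies the full state of `ctx` into a freshly allocated buffer.
/// Works with any type that exposes stateGetSize() and stateGetData(ubyte[]).
ubyte[] snapshotState(Ctx)(ref Ctx ctx)
{
    auto buf = new ubyte[](ctx.stateGetSize());   // size the buffer first
    immutable written = ctx.stateGetData(buf);    // bytes actually written
    enforce(written <= buf.length, "state is larger than the reported size");
    return buf[0 .. written];
}

/// Restores a snapshot taken with snapshotState.
/// Assumption: 0 bytes consumed is treated as failure.
void restoreState(Ctx)(ref Ctx ctx, const(ubyte)[] snapshot)
{
    immutable consumed = ctx.stateSetData(snapshot);
    enforce(consumed > 0, "state restore consumed no bytes");
}

// Hypothetical stand-in for LlamaContext, used only to make the sketch runnable.
struct FakeContext
{
    ubyte[] state;

    size_t stateGetSize() { return state.length; }

    size_t stateGetData(ubyte[] dst)
    {
        dst[0 .. state.length] = state[];
        return state.length;
    }

    size_t stateSetData(const(ubyte)[] src)
    {
        state = src.dup;
        return src.length;
    }
}

void main()
{
    FakeContext ctx;
    ctx.state = [1, 2, 3, 4];

    auto saved = snapshotState(ctx);   // take a snapshot
    ctx.state = [9, 9];                // simulate further decoding
    restoreState(ctx, saved);          // roll back

    ubyte[] expected = [1, 2, 3, 4];
    assert(ctx.state == expected);
}
```

The helpers are templated on the context type and take it by ref, so they never copy the handle. The same shape should carry over to stateSeqGetSize, stateSeqGetData, and stateSeqSetData by threading a sequence id through each call.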

Mixins

__anonymous mixin Owned!(llama_context, llama_free): Undocumented in source.

Properties

memory llama_memory_t memory [@property getter]: Raw memory handle. Use for sequence management (copy, remove, shift, etc.).
nCtx uint nCtx [@property getter]: Undocumented in source. Be warned that the author may not have intended to support it.
poolingType int poolingType [@property getter]: Active pooling type as an int (compare to LLAMA_POOLING_TYPE_* constants).
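
Because getEmbeddingsSeq returns null when pooling is LLAMA_POOLING_TYPE_NONE, callers may want to branch on poolingType before asking for embeddings. The following is a hedged sketch of that branch: embeddingsFor, poolingNone, and FakeContext are hypothetical names, the sequence id is assumed to be an int, and poolingNone is a local copy of the value LLAMA_POOLING_TYPE_NONE has in upstream llama.h (0). With the real binding, compare against the binding's own constant instead.

```d
// Assumption: local copy of LLAMA_POOLING_TYPE_NONE from upstream llama.h.
enum int poolingNone = 0;

/// Returns pooled embeddings for `seqId` when pooling is active,
/// otherwise the embeddings of the last output token.
float[] embeddingsFor(Ctx)(ref Ctx ctx, int seqId)
{
    if (ctx.poolingType == poolingNone)
        return ctx.getEmbeddingsIth(-1);   // -1 selects the last output
    return ctx.getEmbeddingsSeq(seqId);
}

// Hypothetical stand-in for LlamaContext, used only to make the sketch runnable.
struct FakeContext
{
    int pooling;

    @property int poolingType() { return pooling; }

    float[] getEmbeddingsIth(int i) { return [0.1f, 0.2f]; }

    float[] getEmbeddingsSeq(int seqId)
    {
        // Mirrors the documented behavior: null when pooling is NONE.
        return pooling == poolingNone ? null : [0.5f, 0.5f];
    }
}

void main()
{
    auto unpooled = FakeContext(0);
    auto pooled = FakeContext(1);

    assert(embeddingsFor(unpooled, 0) == [0.1f, 0.2f]);
    assert(embeddingsFor(pooled, 0) == [0.5f, 0.5f]);
}
```

Callers should still check the result for null, since getEmbeddingsIth is documented to return null for an invalid index.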

Static functions

fromModel LlamaContext fromModel(LlamaModel model, llama_context_params params): Create a context from explicit parameters.
fromModel LlamaContext fromModel(LlamaModel model, uint nCtx, uint nBatch): Create a context from a context window size (nCtx) and a batch size (nBatch).

Mixed In Members

From `mixin Owned!(llama_context, llama_free)`

this this(): Undocumented in source.
this(this) this(this): Undocumented in source.
~this ~this(): Undocumented in source.
opCast bool opCast(): True when the handle holds a non-null pointer.
ptr T* ptr [@property getter]: Raw C pointer (mutable).
ptr const(T)* ptr [@property getter]: Raw C pointer (const view).
