LlamaContext

A llama_context that frees itself on destruction.

Members

Functions

decode
int decode(llama_batch batch)

Decodes a token batch; returns 0 on success.

encode
int encode(llama_batch batch)

Encodes a batch (encoder-decoder models); returns 0 on success.

getEmbeddings
float[] getEmbeddings()

All output embeddings packed contiguously. Valid after decode; length is n_outputs * nEmbd. Per the upstream llama.h contract, the buffer is populated when pooling is LLAMA_POOLING_TYPE_NONE or the model is generative; otherwise null is returned.
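A minimal sketch of how the packed buffer might be split into one slice per output. The helper name splitEmbeddings is hypothetical (not part of this module), and nEmbd is assumed to be known to the caller, since LlamaContext does not expose it on this page.

```d
/// Hypothetical helper: split a packed embedding buffer into one slice per output.
/// `nEmbd` is the embedding width, assumed to be supplied by the caller.
float[][] splitEmbeddings(float[] packed, size_t nEmbd)
{
    // A null or empty buffer (or a zero width) yields no rows.
    if (packed.length == 0 || nEmbd == 0)
        return null;

    assert(packed.length % nEmbd == 0, "buffer length must be a multiple of nEmbd");

    float[][] rows;
    rows.reserve(packed.length / nEmbd);
    foreach (i; 0 .. packed.length / nEmbd)
        rows ~= packed[i * nEmbd .. (i + 1) * nEmbd]; // slices alias the buffer, no copy
    return rows;
}
```

Because the rows alias the original buffer, copy them with .dup if they need to be kept independently of it.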

getEmbeddingsIth
float[] getEmbeddingsIth(int i)

Embeddings for the ith output token (-1 = last). Returns null for an invalid index.

getEmbeddingsSeq
float[] getEmbeddingsSeq(llama_seq_id seqId)

Pooled embeddings for a sequence. Returns null when pooling is LLAMA_POOLING_TYPE_NONE.
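Pooled embeddings are commonly compared with cosine similarity. The sketch below is one way to do that; cosineSimilarity is a hypothetical helper, and it returns NaN for a null or empty input so that the null result described above does not need a separate check at every call site.

```d
import std.math : sqrt;

/// Hypothetical helper: cosine similarity of two equal-length embeddings.
/// Returns NaN when an input is null/empty, the lengths differ, or a norm is zero.
float cosineSimilarity(const(float)[] a, const(float)[] b)
{
    if (a.length == 0 || a.length != b.length)
        return float.nan;

    // Accumulate in double to limit rounding error on long vectors.
    double dot = 0, na = 0, nb = 0;
    foreach (i; 0 .. a.length)
    {
        dot += a[i] * b[i];
        na  += a[i] * a[i];
        nb  += b[i] * b[i];
    }
    if (na == 0 || nb == 0)
        return float.nan;
    return cast(float)(dot / (sqrt(na) * sqrt(nb)));
}
```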

getLogits
const(float)[] getLogits(int idx)

Logits at output position idx (-1 = last). Valid until the next decode call.
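As an illustration of consuming the returned slice, the following sketch picks the highest-scoring token index (greedy selection). argmaxLogit is a hypothetical helper, not part of this module; it only assumes that the slice holds one score per vocabulary entry.

```d
/// Hypothetical helper: index of the largest logit (the greedy next-token choice).
/// Returns -1 for a null or empty slice.
ptrdiff_t argmaxLogit(const(float)[] logits)
{
    if (logits.length == 0)
        return -1;

    size_t best = 0;
    foreach (i, v; logits)
        if (v > logits[best])
            best = i; // keep the first index on ties
    return cast(ptrdiff_t) best;
}
```

Since the slice is only valid until the next decode call, take a copy with .dup if the scores are needed afterwards.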

memoryClear
void memoryClear(bool data)

Clear the KV cache. Pass data = true to also zero-fill the data buffers; with data = false only the cache metadata is reset.

printPerf
void printPerf()

Undocumented in source. Be warned that the author may not have intended to support it.

stateGetData
size_t stateGetData(ubyte[] dst)

Copy the current state into dst. Returns the number of bytes written.

stateGetSize
size_t stateGetSize()

Byte count of the current state. Use this to size a buffer before stateGetData.

stateLoadFile
bool stateLoadFile(string path, llama_token[] tokensOut, size_t* tokenCount)

Load state from a session file. On success, fills tokensOut with the saved prompt tokens, stores the number of tokens read in *tokenCount, and returns true.

stateSaveFile
bool stateSaveFile(string path, const(llama_token)[] tokens)

Save the state to a session file, recording tokens as the session prompt. Returns true on success.

stateSeqGetData
size_t stateSeqGetData(ubyte[] dst, llama_seq_id seqId)

Copy sequence seqId's KV cache into dst. Returns bytes written.

stateSeqGetSize
size_t stateSeqGetSize(llama_seq_id seqId)

Byte count required to snapshot sequence seqId.

stateSeqSetData
size_t stateSeqSetData(const(ubyte)[] src, llama_seq_id destSeqId)

Restore a KV snapshot from src into sequence destSeqId. Returns bytes consumed; 0 means failure.

stateSetData
size_t stateSetData(const(ubyte)[] src)

Restore the state from src. Returns the number of bytes consumed.
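The size, copy, and restore calls above are typically used together. The sketch below shows one possible round trip, written as templates so it runs against a stand-in type. snapshotState, restoreState, and FakeContext are all hypothetical names introduced for illustration, and treating "bytes consumed equals snapshot length" as success is an assumption rather than something this page guarantees.

```d
/// Hypothetical helper: snapshot any object exposing stateGetSize/stateGetData.
ubyte[] snapshotState(Ctx)(ref Ctx ctx)
{
    auto buf = new ubyte[](ctx.stateGetSize()); // size the buffer first
    const written = ctx.stateGetData(buf);      // bytes actually written
    return buf[0 .. written];
}

/// Hypothetical helper: restore a snapshot.
/// Assumption: success means the whole buffer was consumed.
bool restoreState(Ctx)(ref Ctx ctx, const(ubyte)[] snap)
{
    return snap.length > 0 && ctx.stateSetData(snap) == snap.length;
}

/// Stand-in for LlamaContext so the sketch runs without a model.
struct FakeContext
{
    ubyte[] state = [1, 2, 3, 4];

    size_t stateGetSize() { return state.length; }

    size_t stateGetData(ubyte[] dst)
    {
        dst[0 .. state.length] = state[];
        return state.length;
    }

    size_t stateSetData(const(ubyte)[] src)
    {
        state = src.dup;
        return src.length;
    }
}
```

With a real context, the same two helpers would be called as snapshotState(ctx) and restoreState(ctx, snap), assuming the method signatures listed on this page.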

Mixins

__anonymous
mixin Owned!(llama_context, llama_free)

Undocumented in source.

Properties

memory
llama_memory_t memory [@property getter]

Raw memory handle. Use for sequence management (copy, remove, shift, etc.).

nCtx
uint nCtx [@property getter]

Undocumented in source. Be warned that the author may not have intended to support it.

poolingType
int poolingType [@property getter]

Active pooling type as an int (compare to LLAMA_POOLING_TYPE_* constants).

Static functions

fromModel
LlamaContext fromModel(LlamaModel model, llama_context_params params)

Create from explicit params.

fromModel
LlamaContext fromModel(LlamaModel model, uint nCtx, uint nBatch)

Create from a context window size (nCtx) and a batch size (nBatch).

Mixed In Members

From mixin Owned!(llama_context, llama_free)

this
this()

Undocumented in source.

this(this)
this(this)

Undocumented in source.

~this
~this()

Undocumented in source.

opCast
bool opCast()

True when the handle holds a non-null pointer.

ptr
T* ptr [@property getter]

Raw C pointer (mutable).

ptr
const(T)* ptr [@property getter]

Raw C pointer (const view).

Meta