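All output token embeddings, packed contiguously in the order the outputs appeared in the batch. Valid only after decode; the buffer is flat and holds n_outputs * nEmbd floats. Populated when pooling is LLAMA_POOLING_TYPE_NONE or for generative models; returns null otherwise.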
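Because the buffer is flat, the embedding for output i occupies the half-open range [i * nEmbd, (i + 1) * nEmbd). The sketch below shows one way to split such a buffer into per-output rows. It is illustrative only: `split_embeddings` is a hypothetical helper, not part of the binding, and it assumes the buffer has already been copied into an ordinary sequence of floats.

```python
from typing import List, Optional, Sequence


def split_embeddings(flat: Optional[Sequence[float]], n_embd: int) -> List[List[float]]:
    """Split a flat [n_outputs * n_embd] buffer into one row per output.

    Hypothetical helper for illustration; `flat` stands in for the values
    copied out of the embeddings buffer, or None when the call returned null.
    """
    if flat is None:
        # Null result: no embeddings are available from this call.
        return []
    if n_embd <= 0 or len(flat) % n_embd != 0:
        raise ValueError("buffer length must be a positive multiple of n_embd")
    # Row i covers flat[i * n_embd : (i + 1) * n_embd].
    return [list(flat[i:i + n_embd]) for i in range(0, len(flat), n_embd)]


if __name__ == "__main__":
    rows = split_embeddings([0.1, 0.2, 0.3, 0.4, 0.5, 0.6], n_embd=3)
    print(len(rows), rows[1])
```

Checking for null before indexing matters here, since a null result is a documented outcome rather than an error.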