A randomly initialized checkpoint of a 252M-parameter custom transformer architecture. Two linear transformations connect it to llama2-70b: the first projects the 8192-dimensional llama2-70b embeddings down to 1024 dimensions, and the second projects the 1024-dimensional output back up to 8192 dimensions for the llama2-70b language modelling head.
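As a rough illustration of the sizes involved, the sketch below counts the parameters of the two linear transformations and of the llama2-70b embedding and language modelling head matrices. Several things here are assumptions rather than statements from this card: the vocabulary size of 32,000 (the standard Llama 2 tokenizer), bias-free linear layers, reading 252M as the size of the custom transformer alone, and the idea that the checkpoint stores its own copies of the embedding and head matrices. Under those assumptions the total lands near the 0.8B parameters reported for the checkpoint, but this is one possible accounting, not a confirmed breakdown.

```python
# Dimensions stated in the model card.
LLAMA_HIDDEN = 8192  # width of the llama2-70b embeddings
INNER_DIM = 1024     # width the custom transformer operates in

# Assumption: standard Llama 2 tokenizer vocabulary; not stated in the card.
VOCAB_SIZE = 32_000

# Assumption: 252M refers to the custom transformer on its own.
CORE_PARAMS = 252_000_000


def linear_params(d_in: int, d_out: int, bias: bool = False) -> int:
    """Parameter count of a dense layer mapping d_in -> d_out."""
    return d_in * d_out + (d_out if bias else 0)


# The two linear transformations described in the card (assumed bias-free).
down_proj = linear_params(LLAMA_HIDDEN, INNER_DIM)  # 8192 -> 1024
up_proj = linear_params(INNER_DIM, LLAMA_HIDDEN)    # 1024 -> 8192

# Assumption: the checkpoint holds the embedding table and the LM head.
embedding = VOCAB_SIZE * LLAMA_HIDDEN
lm_head = VOCAB_SIZE * LLAMA_HIDDEN

total = CORE_PARAMS + down_proj + up_proj + embedding + lm_head

print(f"down projection: {down_proj:,}")
print(f"up projection:   {up_proj:,}")
print(f"embedding:       {embedding:,}")
print(f"lm head:         {lm_head:,}")
print(f"total:           {total / 1e9:.2f}B")
```

Each projection contributes about 8.4M parameters, so the two together are small next to the embedding and head matrices at roughly 262M each.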

This checkpoint has not been trained yet.

Format: Safetensors
Model size: 0.8B params
Tensor type: BF16