papers
Clinical Text Summarization: Adapting Large Language Models Can Outperform Human Experts
Paper
• 2309.07430
• Published
• 28
MindAgent: Emergent Gaming Interaction
Paper
• 2309.09971
• Published
• 12
Cure the headache of Transformers via Collinear Constrained Attention
Paper
• 2309.08646
• Published
• 14
Contrastive Decoding Improves Reasoning in Large Language Models
Paper
• 2309.09117
• Published
• 40
Uncovering mesa-optimization algorithms in Transformers
Paper
• 2309.05858
• Published
• 14
Pangu-Agent: A Fine-Tunable Generalist Agent with Structured Reasoning
Paper
• 2312.14878
• Published
• 15
Time is Encoded in the Weights of Finetuned Language Models
Paper
• 2312.13401
• Published
• 20
PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU
Paper
• 2312.12456
• Published
• 45
HAAR: Text-Conditioned Generative Model of 3D Strand-based Human Hairstyles
Paper
• 2312.11666
• Published
• 13
Cascade Speculative Drafting for Even Faster LLM Inference
Paper
• 2312.11462
• Published
• 10
ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent
Paper
• 2312.10003
• Published
• 44
SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling
Paper
• 2312.15166
• Published
• 61
Astraios: Parameter-Efficient Instruction Tuning Code Large Language Models
Paper
• 2401.00788
• Published
• 23
Improving Text Embeddings with Large Language Models
Paper
• 2401.00368
• Published
• 82
LLaMA Beyond English: An Empirical Study on Language Capability Transfer
Paper
• 2401.01055
• Published
• 55
Hyena Hierarchy: Towards Larger Convolutional Language Models
Paper
• 2302.10866
• Published
• 7
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
Paper
• 2401.02954
• Published
• 53
Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache
Paper
• 2401.02669
• Published
• 17
LLaMA Pro: Progressive LLaMA with Block Expansion
Paper
• 2401.02415
• Published
• 54
LLM Augmented LLMs: Expanding Capabilities through Composition
Paper
• 2401.02412
• Published
• 38
DocLLM: A layout-aware generative language model for multimodal document understanding
Paper
• 2401.00908
• Published
• 189
LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
Paper
• 2401.01325
• Published
• 27
A Comprehensive Study of Knowledge Editing for Large Language Models
Paper
• 2401.01286
• Published
• 21
Unicron: Economizing Self-Healing LLM Training at Scale
Paper
• 2401.00134
• Published
• 13
The Internal State of an LLM Knows When It's Lying
Paper
• 2304.13734
• Published
• 2
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Paper
• 2312.00752
• Published
• 150
PromptBench: A Unified Library for Evaluation of Large Language Models
Paper
• 2312.07910
• Published
• 16
SparQ Attention: Bandwidth-Efficient LLM Inference
Paper
• 2312.04985
• Published
• 40
Boundary Attention: Learning to Find Faint Boundaries at Any Resolution
Paper
• 2401.00935
• Published
• 18
QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models
Paper
• 2309.14717
• Published
• 46
Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding
Paper
• 2401.04398
• Published
• 25
LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
Paper
• 2208.07339
• Published
• 5
Sigmoid Loss for Language Image Pre-Training
Paper
• 2303.15343
• Published
• 11
Accelerating LLM Inference with Staged Speculative Decoding
Paper
• 2308.04623
• Published
• 26