[ICLR 2025] Distilled Decoding 1: One-step Sampling of Image Auto-regressive
[Bilibili AI Forum] AI developers who write code and build products are welcome to post and join
[ICLR2025] ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformer
[ICLR26] Cache-to-Cache: Direct Semantic Communication Between Large Language Mo
[HPEC2024] GLITCHES: GPU-FPGA LLM Inference Through a Collaborative Heterogeneou
[ICLR 26] NI Sampling: Accelerating Discrete Diffusion Sampling by Token Order
[MLSys2024] FlashDecoding++: Faster Large Language Model Inference with Asynchro
[IEEE RA-L] OmniDrones: An Efficient and Flexible Platform for Reinforcement
[AAAI2025] Enhancing Contrastive Learning Inspired by the Philosophy of ‘the Bli
[NeurIPS24] DiTFastAttn: Attention Compression for Diffusion Transformer Models
[ICRA26] JuggleRL: Mastering Ball Juggling with a Quadrotor via Deep Reinforceme
"Sparse Compute-in-Memory Computers and Architectures" - Yue Jinshan
[EuroSys 26] STAlloc: Enhancing Memory Efficiency in Large-Scale Model Training
An Introduction to Large Language Model Quantization
JuggleRL: Mastering Ball Juggling with a Quadrotor via Deep Reinforcement Learni
[NeurIPS 2024] Can LLMs Learn by Teaching for Better Reasoning? A Preliminary S
[EuroSys 26] Efficient and Adaptable Overlapping for Computation and Communicati
[ICLR26] PM-KVQ: Progressive Mixed-precision KV Cache Quantization for Long-CoT
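An Introduction to Algorithm-Level and Model-Level Acceleration Work for Generative Models
An Introduction to the Neural Network Accelerator Website (in Chinese)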
[CoRL 2025] Mastering Multi-Drone Volleyball through Hierarchical Co-Self-Play R