  1. Qwen-VL: A Versatile Vision-Language Model for Understanding ...

    Sep 19, 2023 · In this work, we introduce the Qwen-VL series, a set of large-scale vision-language models (LVLMs) designed to perceive and understand both texts and images. Starting from the …

  2. Gated Attention for Large Language Models: Non-linearity, Sparsity,...

    Sep 18, 2025 · Gating mechanisms have been widely utilized, from early models like LSTMs and Highway Networks to recent state space models, linear attention, and also softmax attention. Yet, …

  3. TwinFlow: Realizing One-step Generation on Large Models with...

    Jan 26, 2026 · Recent advances in large multi-modal generative models have demonstrated impressive capabilities in multi-modal generation, including image and video generation. These models are …

  4. In this paper, we explore a way out and present the newest members of the open-sourced Qwen families: Qwen-VL series. Qwen-VLs are a series of highly performant and versatile vision-language …

  5. LLaVA-MoD: Making LLaVA Tiny via MoE-Knowledge Distillation

    Jan 22, 2025 · Remarkably, LLaVA-MoD-2B surpasses Qwen-VL-Chat-7B with an average gain of 8.8%, using merely 0.3% of the training data and 23% trainable parameters. The results …

  6. ML-Agent: Reinforcing LLM Agents for Autonomous Machine Learning...

    Sep 11, 2025 · Leveraging this framework, we train ML-Agent, driven by a 7B-sized Qwen-2.5 LLM for autonomous ML. Despite training on only 9 ML tasks, our 7B-sized ML-Agent achieves comparable …

  7. Function-to-Style Guidance of LLMs for Code Translation

    May 1, 2025 · Notably, our approach enables Qwen-1.5B to outperform prompt-enhanced Qwen-32B and GPT-4 on average across 20 diverse code translation scenarios. Lay Summary: Modern AI tools …

  8. Junyang Lin - OpenReview

    Junyang Lin Principal Researcher, Qwen Team, Alibaba Group Joined July 2019

  9. Allusive Adversarial Examples via Latent Space in Multimodal Large ...

    Sep 18, 2025 · To construct them, we introduce a practical learning framework that leverages cross-modal alignment and exploits the shared latent space of MLLMs. Empirical evaluation on LLaVA, …

  10. Mamba-3: Improved Sequence Modeling using State Space Principles

    Jan 26, 2026 · TL;DR: Mamba-3, an inference-first SSM that pushes on core SSM principles: improved discretization for better quality, complex dynamics for new capabilities, and MIMO updates for …