Willow Ventures

Fine-tuning LLMs with user-level differential privacy

Optimizing Algorithms for Large Language Models (LLMs)

In the fast-evolving world of artificial intelligence, optimizing algorithms for Large Language Models (LLMs) is essential to ensure both performance and privacy. In this blog post, we’ll explore how to fine-tune algorithm implementations for better results.

The Challenge of “Out-of-the-Box” Algorithms

Running standard algorithms “out-of-the-box” for LLMs can lead […]
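The excerpt only hints at the mechanics, so here is a minimal sketch of the user-level DP aggregation step this line of work builds on: each user’s gradient contribution is clipped to a fixed norm, then Gaussian noise scaled to that norm is added to the sum. The function name and parameters below are illustrative assumptions, not the post’s implementation.

```python
import numpy as np

def user_level_dp_aggregate(user_grads, clip_norm=1.0, noise_multiplier=1.0, rng=None):
    """Illustrative user-level DP aggregation (hypothetical helper, not the post's code).

    user_grads: one aggregated gradient vector per sampled user. Clipping bounds
    any single user's influence on the update; Gaussian noise with standard
    deviation clip_norm * noise_multiplier masks whatever influence remains.
    """
    rng = rng or np.random.default_rng(0)
    # Scale each user's gradient down so its L2 norm is at most clip_norm.
    clipped = [g * min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12)) for g in user_grads]
    total = np.sum(clipped, axis=0)
    noise = rng.normal(0.0, clip_norm * noise_multiplier, size=total.shape)
    return (total + noise) / len(user_grads)
```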

Weak-for-Strong (W4S): A Novel Reinforcement Learning Algorithm that Trains a Weak Meta-Agent to Design Agentic Workflows with Stronger LLMs

Introduction to Weak-for-Strong Harnessing (W4S) in Reinforcement Learning

In recent advancements in artificial intelligence, researchers from Stanford, EPFL, and UNC have introduced the Weak-for-Strong Harnessing (W4S) framework. This innovative approach in Reinforcement Learning (RL) enables a lightweight meta-agent to design and optimize code workflows that leverage more powerful executor models.

What is Weak-for-Strong Harnessing (W4S)? […]
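To make the propose–execute–reward loop concrete, here is a high-level sketch of a W4S-style iteration. The interfaces (propose_workflow, run_workflow, update_policy) are invented stand-ins for illustration, not the paper’s API.

```python
from typing import Any, Protocol

class WeakMetaAgent(Protocol):
    def propose_workflow(self, task: Any, history: list) -> str: ...
    def update_policy(self, workflow: str, reward: float) -> None: ...

class StrongExecutor(Protocol):
    def run_workflow(self, workflow: str, task: Any) -> tuple[float, str]: ...

def w4s_loop(agent: WeakMetaAgent, executor: StrongExecutor, task: Any, rounds: int = 10) -> str:
    """One task's optimization loop: propose -> execute -> reward -> update."""
    history: list = []
    best, best_score = "", float("-inf")
    for _ in range(rounds):
        # 1. The weak meta-agent writes a candidate workflow (e.g. orchestration
        #    code that calls the strong model), conditioned on past feedback.
        workflow = agent.propose_workflow(task, history)
        # 2. The stronger executor model runs that workflow on held-out samples.
        score, feedback = executor.run_workflow(workflow, task)
        # 3. The validation score becomes the RL reward for the weak agent.
        agent.update_policy(workflow, reward=score)
        history.append((workflow, score, feedback))
        if score > best_score:
            best, best_score = workflow, score
    return best
```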

Privacy-preserving domain adaptation with LLMs for mobile applications

Enhancing Language Models with Privacy-Preserving Synthetic Data

In the world of AI and machine learning, the success of language models hinges on the quality and quantity of data. A recent focus has been on using synthetic data to enhance these models while safeguarding user privacy.

The Role of High-Quality Data in Machine Learning

The effectiveness […]
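As a rough illustration of the pipeline this describes, the sketch below samples synthetic in-domain text from a generator LLM that was itself fine-tuned with differential privacy, filters it, and collects a corpus for adapting a smaller model. The interface and helper names are assumptions for the sketch, not the post’s code.

```python
from typing import Callable, Protocol

class DPGenerator(Protocol):
    def generate(self, prompt: str) -> str: ...

def build_synthetic_corpus(
    generator: DPGenerator,
    prompts: list[str],
    keep: Callable[[str], bool],
    samples_per_prompt: int = 10,
) -> list[str]:
    """Sample synthetic in-domain text from a DP-fine-tuned generator.

    The privacy guarantee comes from how the generator was trained, so its
    samples can be used freely downstream; `keep` applies simple quality
    filters (length, dedup, heuristics) before the text is used to adapt a
    smaller on-device model with ordinary fine-tuning.
    """
    corpus = []
    for prompt in prompts:
        for _ in range(samples_per_prompt):
            sample = generator.generate(prompt)
            if keep(sample):
                corpus.append(sample)
    return corpus
```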

RA3: Mid-Training with Temporal Action Abstractions for Faster Reinforcement Learning (RL) Post-Training in Code LLMs

Accelerating Reinforcement Learning: Unveiling RA3 and Mid-Training Insights

Recent research from Apple introduces groundbreaking concepts in reinforcement learning (RL) with the launch of RA3 (Reasoning as Action Abstractions). This innovative approach shows how mid-training can optimize RL post-training, a significant stride for code generation tasks.

What Does the Research Present?

This study presents a […]
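The two-stage shape of the recipe can be sketched as follows: a mid-training stage that learns temporal action abstractions from expert traces, followed by standard RL post-training that starts from that better prior. Every name here is a hypothetical stand-in; RA3’s exact objective is in the paper.

```python
import random
from typing import Any, Protocol

class CodeLLM(Protocol):
    def infer_abstractions(self, trace: str) -> list[str]: ...
    def fit_step(self, trace: str, abstractions: list[str]) -> None: ...
    def generate(self, prompt: str) -> str: ...
    def policy_gradient_step(self, rollout: str, reward: float) -> None: ...

def mid_train(model: CodeLLM, expert_traces: list[str]) -> CodeLLM:
    """Stage 1: infer a latent sequence of higher-level actions explaining each
    expert trace, then fit the model through those abstractions (EM-like in
    spirit; illustrative only)."""
    for trace in expert_traces:
        abstractions = model.infer_abstractions(trace)
        model.fit_step(trace, abstractions)
    return model

def rl_post_train(model: CodeLLM, tasks: list[Any], steps: int = 1000) -> CodeLLM:
    """Stage 2: ordinary RL post-training, accelerated by the mid-trained prior."""
    for _ in range(steps):
        task = random.choice(tasks)
        rollout = model.generate(task.prompt)
        reward = task.verify(rollout)   # e.g. pass/fail on unit tests for code
        model.policy_gradient_step(rollout, reward)
    return model
```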

Making LLMs more accurate by using all of their layers

Evaluating SLED Across Multiple LLMs: A Detailed Experiment

In this post, we delve into the experiments conducted using the SLED method across various Large Language Models (LLMs). Our goal is to evaluate the flexibility and effectiveness of SLED as a decoding approach for different LLM families.

Understanding the SLED Method

SLED, short for Self Logits […]
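To ground the idea of “using all of the layers,” here is a much-simplified sketch of SLED-flavored decoding, assuming a Hugging Face-style causal LM that exposes hidden states and an lm_head: every early layer’s hidden state is projected through the shared output head, the resulting token distributions are pooled into a “latent” estimate, and the final-layer logits are nudged toward it. This illustrates the intuition, not the paper’s exact evolution rule.

```python
import torch

@torch.no_grad()
def sled_style_logits(model, input_ids, alpha=0.1):
    # Simplified SLED-flavored sketch (not the paper's exact update rule).
    # Assumes a Hugging Face-style causal LM with `lm_head`; for brevity this
    # skips the model's final normalization before applying the head.
    out = model(input_ids, output_hidden_states=True)
    final_logits = out.logits[:, -1, :]                       # final-layer logits
    early_layers = out.hidden_states[1:-1]                    # skip embeddings + last
    early_probs = [torch.softmax(model.lm_head(h[:, -1, :]), dim=-1)
                   for h in early_layers]
    latent = torch.stack(early_probs).mean(dim=0)             # pooled "latent" estimate
    # Evolve the final logits toward the latent distribution (alpha = step size).
    return final_logits + alpha * torch.log(latent + 1e-9)
```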