Introducing Lux: The Next Evolution in Automated Computer Use
The OpenAGI Foundation has launched Lux, a foundation model that automates tasks across desktop applications and web browsers. Lux scores 83.6 on the Online Mind2Web benchmark, ahead of Google's Gemini CUA (69.0) and OpenAI Operator (61.3), which suggests it could change how routine computer work gets automated.
What is Lux?
Lux is not a chat model but a comprehensive computer use agent. It interprets natural language inputs, observes the screen, and performs low-level actions such as clicks, keystrokes, and scroll events. By leveraging rendered user interfaces instead of application-specific APIs, Lux can effectively manage browsers, editors, spreadsheets, and other desktop applications.
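The observe-then-act cycle described above can be sketched as a simple loop. This is only an illustration under stated assumptions, not the Lux implementation or the OpenAGI SDK: `Action`, `FakeScreen`, `ScriptedPolicy`, and `run_agent` are hypothetical names invented here, and the "screen" is a plain dictionary standing in for a rendered user interface.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Action:
    """A low-level UI action (hypothetical schema, not the Lux action format)."""
    kind: str            # "click", "type", or "scroll"
    target: str = ""     # element a real agent would locate on the screen
    text: str = ""       # text payload for "type" actions


class FakeScreen:
    """Stand-in for a rendered UI: a form with one text field and a button."""

    def __init__(self) -> None:
        self.state = {"name_field": "", "submitted": False}

    def observe(self) -> dict:
        # A real agent would receive a screenshot here, not a dict.
        return dict(self.state)

    def apply(self, action: Action) -> None:
        if action.kind == "type" and action.target == "name_field":
            self.state["name_field"] = action.text
        elif action.kind == "click" and action.target == "submit":
            self.state["submitted"] = True


class ScriptedPolicy:
    """Toy policy: chooses the next action from what it sees on screen."""

    def next_action(self, task: str, observation: dict) -> Optional[Action]:
        if observation["submitted"]:
            return None  # task finished, nothing left to do
        if not observation["name_field"]:
            return Action("type", target="name_field", text=task)
        return Action("click", target="submit")


def run_agent(task: str, policy: ScriptedPolicy, screen: FakeScreen,
              max_steps: int = 10) -> list:
    """Observe the screen, choose an action, apply it, and repeat."""
    history = []
    for _ in range(max_steps):
        action = policy.next_action(task, screen.observe())
        if action is None:
            break
        screen.apply(action)
        history.append(action)
    return history


screen = FakeScreen()
actions = run_agent("Ada Lovelace", ScriptedPolicy(), screen)
```

Because the policy only reads the observed screen state, the same loop would work against any application that can be rendered and driven with clicks and keystrokes, which is the point of operating on the UI rather than on application-specific APIs.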
Developer Accessibility
Lux is accessible through the OpenAGI SDK and API console. The model is designed to handle a variety of tasks, including software quality assurance, deep research operations, social media management, online store management, and bulk data entry. In each case it chains many UI actions together while staying aligned with the natural language task description.
Three Execution Modes for Enhanced Control
Lux introduces three execution modes, each balancing speed, autonomy, and control:
Actor Mode
Actor mode focuses on speed, executing actions in about one second per step. It’s ideal for straightforward tasks such as filling forms or generating reports.
Thinker Mode
Thinker mode addresses vague or multi-step tasks. It breaks down large objectives into smaller, manageable subtasks. This is useful for scenarios like conducting multi-page research or triaging lengthy email queues.
Tasker Mode
Tasker mode offers maximum determinism, allowing users to provide explicit Python scripts that Lux executes sequentially. This mode is beneficial for teams needing to enforce task graphs and failure policies in their workflows.
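To make the idea of sequential execution with a failure policy concrete, here is a minimal sketch. It does not reflect the input format Tasker mode actually accepts: `Step`, `RunReport`, and `run_steps` are hypothetical names, and the failure policy shown (retry up to a fixed number of attempts, then stop the run) is just one possible choice.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Step:
    """One entry in a linear task list (hypothetical structure)."""
    name: str
    run: Callable[[], bool]   # returns True on success
    max_attempts: int = 1     # failure policy: tries allowed before giving up


@dataclass
class RunReport:
    completed: List[str] = field(default_factory=list)
    failed_at: Optional[str] = None


def run_steps(steps: List[Step]) -> RunReport:
    """Execute steps in order; stop at the first step that exhausts its attempts."""
    report = RunReport()
    for step in steps:
        # any() short-circuits, so retries stop as soon as one attempt succeeds.
        succeeded = any(step.run() for _ in range(step.max_attempts))
        if not succeeded:
            report.failed_at = step.name
            break
        report.completed.append(step.name)
    return report


# Usage: a login step that fails once before succeeding, then a step that always fails.
attempts = {"count": 0}


def flaky_login() -> bool:
    attempts["count"] += 1
    return attempts["count"] >= 2


report = run_steps([
    Step("open_site", lambda: True),
    Step("log_in", flaky_login, max_attempts=3),
    Step("export_csv", lambda: False),
    Step("send_email", lambda: True),
])
```

In this run the first two steps complete, `export_csv` fails its single attempt, and `send_email` is never started. That predictable stop-on-failure behavior is the kind of control that explicit sequencing provides.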
Performance Benchmarks
Lux achieves an 83.6% success rate on the Online Mind2Web benchmark, well ahead of Gemini CUA at 69.0%, OpenAI Operator at 61.3%, and Claude Sonnet 4 at 61.0%. Each action completes in approximately one second, and Lux is described as around 10 times cheaper per token than OpenAI Operator.
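The size of the gap is easier to judge in percentage points. The short calculation below uses only the four scores quoted above; the function name `margins_over` is an illustrative choice.

```python
# Online Mind2Web success rates (percent) as quoted in this article.
scores = {
    "Lux": 83.6,
    "Gemini CUA": 69.0,
    "OpenAI Operator": 61.3,
    "Claude Sonnet 4": 61.0,
}


def margins_over(baseline: str, results: dict) -> dict:
    """Return the baseline's lead over each other system, in percentage points."""
    base = results[baseline]
    return {
        name: round(base - score, 1)
        for name, score in results.items()
        if name != baseline
    }


lux_margins = margins_over("Lux", scores)
```

On these figures Lux leads Gemini CUA by 14.6 points, OpenAI Operator by 22.3 points, and Claude Sonnet 4 by 22.6 points.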
Innovative Training with Agentic Active Pre-training
Lux is trained with a methodology called Agentic Active Pre-training, in which the model learns through interaction in digital environments. This approach contrasts with traditional language model training, which relies on static data, and is intended to give Lux a robust understanding of screen-to-action behavior.
OpenAGI has also developed OSGym, an open-source data engine that can run numerous operating system replicas simultaneously, enhancing the training and evaluation of computer use agents.
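The benefit of running many replicas at once can be illustrated with a small parallel evaluation harness. This sketch does not use OSGym or its API: `run_episode` and `evaluate_in_parallel` are hypothetical, and the episode is a deterministic placeholder rather than a real operating system session.

```python
from concurrent.futures import ThreadPoolExecutor


def run_episode(replica_id: int) -> bool:
    """Placeholder for one agent episode inside one OS replica.

    A real harness would boot a VM or container and drive it with the agent.
    Here every fourth replica "fails" so the result is deterministic.
    """
    return replica_id % 4 != 0


def evaluate_in_parallel(num_replicas: int, max_workers: int = 4) -> float:
    """Run one episode per replica concurrently and return the success rate."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        outcomes = list(pool.map(run_episode, range(num_replicas)))
    return sum(outcomes) / len(outcomes)


success_rate = evaluate_in_parallel(num_replicas=8)
```

With eight replicas, replicas 0 and 4 fail, so the measured success rate is 0.75. Since the episodes are independent, the same pattern extends to many more replicas by raising `num_replicas` and `max_workers`.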
Key Takeaways
- Lux is a foundation computer use model with an 83.6% success rate on the Online Mind2Web benchmark, the highest among the systems compared here.
- It offers three execution modes (Actor, Thinker, and Tasker) to suit different task requirements.
- Lux completes each action in about one second, which keeps long multi-step tasks fast, and it is described as about 10 times cheaper per token than OpenAI Operator.
- Agentic Active Pre-training trains Lux through interaction with digital environments rather than through static training methods.
- OSGym supports research and development by allowing extensive testing in parallel operating system environments.
Conclusion
Lux is a notable step forward in automated computer use. It combines strong benchmark performance, roughly one-second actions, and three execution modes, giving developers a flexible and cost-effective tool for automating desktop and browser tasks.

