Unveiling seekdb: The AI Native Hybrid Search Database
AI applications need to store and retrieve vectors, text, and structured data together. OceanBase has released seekdb, an open-source database designed specifically for these workloads.
What is seekdb?
seekdb is a lightweight, embedded version of the OceanBase engine, tailored for AI workloads rather than general-purpose distributed deployments. It is a single-node database that runs in either embedded or client/server mode and stays fully compatible with MySQL drivers and SQL syntax.
Key Features of seekdb
- Embedded Database Support: Suited to local and edge deployments.
- Standalone Operation: Runs as a single node, with no cluster required.
- No Distributed Support: Unlike the full OceanBase engine, seekdb does not support distributed deployments.
Core Feature: Hybrid Search
The core feature of seekdb is hybrid search: a single query can combine vector-based semantic retrieval, full-text keyword search, and scalar filters.
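To make the idea concrete, here is a minimal Python sketch that applies a scalar filter and then merges a vector ranking with a keyword ranking. It uses reciprocal rank fusion (RRF), a common fusion technique picked purely for illustration; the article does not say how seekdb combines scores internally. The document IDs, the `region` attribute, and the constant `k=60` are all assumptions.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked ID lists; k=60 is a conventional default, assumed here."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical per-method results, best match first.
vector_hits = ["d3", "d1", "d5", "d2", "d4"]
keyword_hits = ["d1", "d4", "d5", "d3"]
# Hypothetical scalar attribute used as a filter.
region = {"d1": "eu", "d2": "us", "d3": "eu", "d4": "us", "d5": "eu"}

def hybrid_search(wanted_region):
    def keep(ids):
        # Scalar filter: drop documents outside the wanted region.
        return [d for d in ids if region[d] == wanted_region]
    return reciprocal_rank_fusion([keep(vector_hits), keep(keyword_hits)])

fused = hybrid_search("eu")
```

A document that ranks well in both lists rises to the top even if neither method alone puts it first, which is the practical appeal of combining retrieval methods.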
Hybrid Search Implementation
seekdb exposes hybrid search through a system package called DBMS_HYBRID_SEARCH, which has two main entry points:
- DBMS_HYBRID_SEARCH.SEARCH: Returns results in JSON format, sorted by relevance.
- DBMS_HYBRID_SEARCH.GET_SQL: Provides the SQL string executed for retrieval.
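As a rough sketch of how a client might call these entry points from Python, the helper below builds a parameterized SQL statement for a MySQL-compatible driver. Only the package and entry-point names come from this article. The two-argument form (a table name plus a JSON document of search parameters), the keys inside that JSON document, and the `docs` table are assumptions, so check the seekdb documentation before relying on them.

```python
import json

def build_hybrid_search_call(entry_point, table, params):
    """Return (sql, args) for a DB-API cursor that uses %s placeholders.

    ASSUMPTION: both entry points take a table name and a JSON document
    of search parameters; verify against the seekdb documentation.
    """
    if entry_point not in ("SEARCH", "GET_SQL"):
        raise ValueError(f"unknown entry point: {entry_point}")
    sql = f"SELECT DBMS_HYBRID_SEARCH.{entry_point}(%s, %s)"
    return sql, (table, json.dumps(params))

# Hypothetical parameter document; the key names are illustrative only.
params = {
    "query": {"match": {"content": "vector database"}},
    "knn": {"field": "embedding", "k": 5, "query_vector": [0.1, 0.2, 0.3]},
}

sql, args = build_hybrid_search_call("SEARCH", "docs", params)
```

With a MySQL-compatible driver, `cursor.execute(sql, args)` would then run the call. Swapping `"SEARCH"` for `"GET_SQL"` would return the generated SQL for inspection instead of the results.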
Comprehensive Data Model
seekdb supports several data models within a single storage and indexing layer:
- Relational Data: Queried with standard SQL.
- Vector Search: Embedding-based similarity retrieval.
- Full-Text Search: Keyword and phrase queries.
- JSON and Spatial (GIS) Data: JSON metadata filters and spatial constraints.
In-Depth Vector and Text Handling
seekdb includes both a vector search stack and a full-text search stack:
Vector Capabilities
- Support for dense and sparse vectors.
- Distance metrics including Manhattan, Euclidean, inner product, and cosine.
- In-memory and disk-based index types such as HNSW and IVF.
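For readers unfamiliar with the four metrics, the plain-Python definitions below show what each one computes. This is only a reference sketch of the math, not seekdb code; the sample vectors are made up.

```python
import math

def manhattan(a, b):
    # Sum of absolute coordinate differences (L1 distance).
    return sum(abs(x - y) for x, y in zip(a, b))

def euclidean(a, b):
    # Straight-line distance (L2 distance).
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def inner_product(a, b):
    # Dot product; larger means more similar for comparable vector norms.
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    # Dot product of the vectors divided by the product of their lengths.
    norm = math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return inner_product(a, b) / norm

a = [1.0, 2.0, 2.0]
b = [2.0, 0.0, 1.0]
results = {
    "manhattan": manhattan(a, b),
    "euclidean": euclidean(a, b),
    "inner_product": inner_product(a, b),
    "cosine_similarity": cosine_similarity(a, b),
}
```

Note that Manhattan and Euclidean are distances (smaller is closer), while inner product and cosine are similarities (larger is closer). Databases often expose cosine as a distance, 1 minus the similarity; the article does not say which convention seekdb uses.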
Full Text Search Features
- Supports keyword, phrase, and Boolean queries.
- Ranks results by relevance using BM25.
- Offers several tokenizer modes.
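To show what BM25 ranking does, here is a small self-contained implementation of the standard Okapi BM25 formula. It is a textbook sketch, not seekdb's implementation: the whitespace tokenizer stands in for a real tokenizer mode, and the parameters `k1=1.2` and `b=0.75` are common defaults assumed here.

```python
import math
from collections import Counter

def tokenize(text):
    # Whitespace tokenizer; a stand-in for a real tokenizer mode.
    return text.lower().split()

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document against the query with Okapi BM25."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n
    # Document frequency: number of documents containing each term.
    df = Counter(term for t in tokenized for term in set(t))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Rare terms get a higher weight than common ones.
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates and is normalized by document length.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

docs = [
    "seekdb supports hybrid search",
    "hybrid cars use two engines",
    "full text search ranks documents by keyword relevance",
]
scores = bm25_scores("hybrid search", docs)
```

The first document matches both query terms and scores highest; the other two each match one term and receive smaller, non-zero scores.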
AI Functions Within the Database
seekdb integrates built-in AI functions, allowing direct model calls through SQL:
- AI_EMBED: Converts text into vectors.
- AI_COMPLETE: Generates text using chat models.
- AI_RERANK: Reranks candidates from a search.
- AI_PROMPT: Assembles prompt templates dynamically.
Managing Multimodal Data and Workloads
Because these data types live in one engine, a single query can combine:
- Semantic similarity over documents.
- Filters on JSON metadata such as tenant or region.
- Spatial constraints on GIS data.
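The toy example below mimics such a combined query in plain Python: it filters rows by a metadata field and a bounding box, then ranks the survivors by cosine similarity. It only illustrates the logic; in seekdb this would be expressed in SQL, and the rows, field names, and bounding box here are hypothetical.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Hypothetical rows: embedding, JSON-style metadata, and a (lon, lat) point.
rows = [
    {"id": 1, "vec": [0.9, 0.1], "meta": {"tenant": "acme"}, "pt": (13.4, 52.5)},
    {"id": 2, "vec": [0.8, 0.2], "meta": {"tenant": "acme"}, "pt": (-74.0, 40.7)},
    {"id": 3, "vec": [0.1, 0.9], "meta": {"tenant": "acme"}, "pt": (2.35, 48.85)},
    {"id": 4, "vec": [1.0, 0.0], "meta": {"tenant": "other"}, "pt": (13.4, 52.5)},
]

def in_bbox(pt, min_lon, min_lat, max_lon, max_lat):
    lon, lat = pt
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat

def search(query_vec, tenant, bbox, top_k=2):
    # Metadata filter and spatial constraint first, then semantic ranking.
    candidates = [
        r for r in rows
        if r["meta"]["tenant"] == tenant and in_bbox(r["pt"], *bbox)
    ]
    candidates.sort(key=lambda r: cosine(query_vec, r["vec"]), reverse=True)
    return [r["id"] for r in candidates[:top_k]]

europe = (-10.0, 35.0, 30.0, 60.0)  # rough bounding box, illustrative only
result = search([1.0, 0.0], "acme", europe)
```

Row 4 has the closest embedding but belongs to another tenant, and row 2 falls outside the bounding box, so only rows 1 and 3 are ranked.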
Key Takeaways
- AI Native Hybrid Search: Combines multiple retrieval methods into one simple interface.
- Unified Data Management: Keeps all data types consistent within a single engine.
- Direct AI Functionality: Calls models from SQL, which simplifies workflows and reduces orchestration needs.
- Single-node Design: Optimized for local or embedded AI workloads.
- Open Source Ecosystem: Supports integration with various AI tools and frameworks.
Conclusion
With seekdb, OceanBase brings hybrid search and built-in AI functions into one unified engine, which simplifies complex retrieval workflows and eases the deployment of AI applications.