rbi

package module
v0.7.1
Published: Mar 1, 2026 License: Apache-2.0 Imports: 28 Imported by: 0

README

rbi

This package should be considered experimental.

Roaring Bolt Indexer

A secondary index layer for bbolt.

It turns a key-value store into a document-oriented database with rich query capabilities, while preserving bbolt’s ACID guarantees for data storage. Indexes are kept fully in memory and built on top of roaring64 for fast set operations.

Properties
  • ACID – data durability is delegated to bbolt.
  • Index-only filtering – disk is never touched.
  • Document-oriented – queries return whole records, not individual fields.
  • Strong typing – generic API with user-defined key and value types.
Features
  • Automatic indexing of exported struct fields
  • Fine-grained control via struct tags (db, dbi, rbi)
  • Efficient query building via qx package:
    • comparisons: EQ, GT, GTE, LT, LTE
    • slices: IN, HAS, HASANY
    • strings: PREFIX, SUFFIX, CONTAINS
    • logical: AND, OR, NOT
  • Index-based ordering with offset / limit
  • Partial updates (Patch*) with minimal index churn
  • Batch writes (BatchSet, BatchPatch, BatchDelete)
  • Uniqueness constraints
LLM notice

Starting from v0.7, parts of the code, documentation, and tests were created or improved with the assistance of an LLM. LLM-assisted code has been tested and verified, but may still contain inefficiencies.

Usage

package main

import (
    "fmt"

    "github.com/vapstack/qx"
    "github.com/vapstack/rbi"
    "go.etcd.io/bbolt"
)

type User struct {
    ID      uint64   `db:"id"`
    Name    string   `db:"name"`
    Age     int      `db:"age"`
    Active  bool     `db:"active"`
    Tags    []string `db:"tags"`
    Meta    string   `rbi:"-"` // not indexed
    Exclude string   `db:"-"`  // not indexed
}

func main() {

    bolt, err := bbolt.Open("test.db", 0600, nil)
    if err != nil {
        panic(err)
    }
    db, err := rbi.New[uint64, User](bolt, nil)
    if err != nil {
        _ = bolt.Close()
        panic(err)
    }
    defer db.Close()
    defer bolt.Close()

    err = db.Set(1, &User{
        Name:   "Alice",
        Age:    30,
        Active: true,
        Tags:   []string{"admin", "dev"},
    })
    if err != nil {
        panic(err)
    }

    err = db.Set(2, &User{
        ID:     2,
        Name:   "Bob",
        Age:    40,
        Active: false,
        Tags:   []string{"dev"},
    })
    if err != nil {
        panic(err)
    }

    q := qx.Query(
        qx.OR(
            qx.AND(
                qx.EQ("active", true),
                qx.HAS("tags", []string{"dev"}),
            ),
            qx.GT("age", 35),
        ),
    ).
        By("age", qx.ASC).
        Max(10)

    users, err := db.QueryItems(q)
    if err != nil {
        panic(err)
    }

    for _, u := range users {
        fmt.Printf("%v (%v)\n", u.Name, u.Age)
    }
}

API

For the full API reference see GoDoc.

Writing data
  • Set(id, value) – insert or replace a record and update affected indexes.
  • BatchSet(ids, values) – batch variant of Set, significantly faster for bulk inserts.
  • Patch(id, fields) – apply partial updates and update only changed indexes.
  • PatchStrict(id, fields) – like Patch, but fails on unknown fields.
  • PatchIfExists(id, fields) / PatchStrictIfExists(id, fields) – patch only existing records.
  • BatchPatch(ids, fields) / BatchPatchStrict(ids, fields) – batch patch variants.
  • Delete(id) – remove a record and its index entries.
  • BatchDelete(ids) – batch variant of Delete.
Querying

Queries are constructed using the qx package.

Field names refer to the names specified in db tags.
If a field does not have a db tag, the Go struct field name is used.

q := qx.Query(
    qx.EQ("field", val),
    qx.IN("department", []string{"it", "management"}),
    qx.HASANY("tags", []string{"go", "java"}),
    qx.OR(
        qx.PREFIX("name", "A"),
        qx.GT("age", 50),
        qx.LTE("score", 99.5),
    ),
).
    By("field", qx.DESC).
    Skip(10).
    Max(50)

Query methods:

  • QueryItems(q) – return matching records
  • QueryKeys(q) – return matching IDs
  • Count(q) – return result cardinality (ignoring offset/limit)

Query execution model

Queries run entirely in-memory; stored records are never scanned.

The runtime uses a single planner/executor pipeline:

  1. Normalize query tree into a deterministic internal form.
  2. Compile leaf predicates into bitmap/iterator-backed checks.
  3. Select an execution strategy by shape and cost.
  4. Execute using shared iterator/bitmap contracts and tracing hooks.

Leaf predicates are resolved via field indexes, producing either bitmaps of matching record IDs or index-backed iterators. Logical operators (AND, OR, NOT) are applied using bitmap operations; large result sets may be represented as negative sets to avoid materializing large bitmaps.

For ordered queries, the ordered field index is traversed directly and intersected with compiled predicates. Offset and limit are applied during traversal when possible.

For limit-driven candidate plans, a selective index yields candidate IDs, remaining predicates are checked via index lookups, and execution stops once enough results are collected.

Only the final set of matching record IDs is materialized. For QueryItems, record values are fetched from bbolt only for IDs that have passed all filters and limits.

QueryItems runs against an index snapshot aligned with a bbolt read transaction. When an exact snapshot for the transaction is not immediately available, it uses bounded waiting with a few fallbacks. If none of the available paths can provide a valid snapshot within the retry budget, QueryItems returns an error. The retry budget is 30 * SnapshotPinWaitTimeout (30s by default, since SnapshotPinWaitTimeout defaults to 1s).

Configuration

All runtime controls are configured through Options. The recommended pattern is to start with defaults and override only the required values.

opts := rbi.DefaultOptions()

// Planner/trace settings
opts.AnalyzeInterval = 30 * time.Minute // < 0 disables periodic analyze loop
opts.TraceSink = func(ev rbi.TraceEvent) { /* log/collect trace */ }
opts.TraceSampleEvery = 1000 // 0 means "every query" when TraceSink is set

// Online calibration settings
opts.CalibrationEnabled = true           // calibration is disabled by default
opts.CalibrationSampleEvery = 32         // 0 uses default (16)
opts.CalibrationPersistPath = "planner_calibration.json" // optional auto load/save

// Single-op write batcher settings
opts.BatchWindow = 200 * time.Microsecond
opts.BatchMax = 16
opts.BatchMaxQueue = 512 // <= 0 means unbounded queue
opts.BatchAllowCallbacks = true // true allows combining ops with PreCommit callbacks

db, err := rbi.New[uint64, User](bolt, opts)
if err != nil {
    panic(err)
}

Ordering Limitations

The package currently supports ordering by a single indexed field only.

Queries that specify more than one ordering expression are rejected with an error. This restriction is intentional: it allows ordered queries to be executed directly via index traversal without materializing or re-sorting intermediate result sets.

If multi-column ordering is required, it must be implemented at the application level.
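A minimal sketch of the application-level fallback: fetch the result set and re-sort it in memory with a tie-breaking comparator. The User type mirrors the usage example above; sortByAgeThenName is a hypothetical helper, not part of the package.

```go
package main

import (
	"fmt"
	"sort"
)

type User struct {
	Name string
	Age  int
}

// sortByAgeThenName orders users by Age ascending, breaking ties by Name.
// This stands in for a multi-column ORDER BY (age, name).
func sortByAgeThenName(users []User) {
	sort.Slice(users, func(i, j int) bool {
		if users[i].Age != users[j].Age {
			return users[i].Age < users[j].Age
		}
		return users[i].Name < users[j].Name
	})
}

func main() {
	users := []User{{"Bob", 40}, {"Alice", 30}, {"Carol", 30}}
	sortByAgeThenName(users)
	fmt.Println(users) // [{Alice 30} {Carol 30} {Bob 40}]
}
```

Keep the single-field index order for the first column when possible so that the query still benefits from index traversal and limits.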

Struct Tags and Indexing

By default, all exported struct fields are indexed using the Go field name.

To exclude a field from indexing, use one of:

  • db:"-"
  • dbi:"-"
  • rbi:"-"

Excluding large fields (blobs, binary data) is strongly recommended unless you actually query on them.

Slice Fields

Slice-typed fields are indexed element-wise and support HAS, HASNOT, HASANY, HASNONE operations.

Equality for slice fields is implemented as set equality, not array equality.
This means that ["a", "b", "a"] == ["a", "b"].
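The set semantics can be illustrated with a small stand-alone sketch; asSet and setsEqual are hypothetical helpers that mimic how the index collapses a slice into its distinct elements:

```go
package main

import "fmt"

// asSet collapses a slice into its set of distinct elements,
// mirroring how slice equality is evaluated against the index.
func asSet(xs []string) map[string]bool {
	s := make(map[string]bool, len(xs))
	for _, x := range xs {
		s[x] = true
	}
	return s
}

// setsEqual reports whether two sets contain the same elements.
func setsEqual(a, b map[string]bool) bool {
	if len(a) != len(b) {
		return false
	}
	for k := range a {
		if !b[k] {
			return false
		}
	}
	return true
}

func main() {
	a := asSet([]string{"a", "b", "a"}) // duplicates collapse
	b := asSet([]string{"a", "b"})
	fmt.Println(setsEqual(a, b)) // true
}
```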

Unique Constraints

Tagging a field with:

`rbi:"unique"`

enforces a uniqueness constraint for that field.

  • Only scalar (non-slice) fields can be unique.
  • Uniqueness is enforced across single and batch writes (Set, Patch*, BatchSet, BatchPatch*).
  • Violations return ErrUniqueViolation before committing the transaction.
  • Uniqueness guarantees rely on indexes and are unavailable when indexing is disabled.

Custom Indexing with ValueIndexer

Scalar values and slices of scalars are indexed by default. Custom types may implement:

type ValueIndexer interface {
    IndexingValue() string
}

The returned value is used as the indexed representation.

Contract:
  • Must return a stable, deterministic value.
  • Equal values must produce equal indexing values.
  • nil handling is the responsibility of the implementation.
  • IndexingValue may be called on a nil receiver.

Incorrect implementations may cause panics or undefined query behavior.
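A sketch of a conforming implementation under these contract rules. The IPAddr type is hypothetical; note the explicit nil-receiver guard and the zero-padded output, chosen so that lexicographic index order matches numeric order:

```go
package main

import "fmt"

// IPAddr is a hypothetical custom type stored in an indexed struct.
type IPAddr struct {
	Octets [4]byte
}

// IndexingValue returns a stable, deterministic representation.
// Zero-padding keeps lexicographic order consistent with numeric order.
func (ip *IPAddr) IndexingValue() string {
	if ip == nil { // the method may be called on a nil receiver
		return ""
	}
	return fmt.Sprintf("%03d.%03d.%03d.%03d",
		ip.Octets[0], ip.Octets[1], ip.Octets[2], ip.Octets[3])
}

func main() {
	ip := &IPAddr{Octets: [4]byte{10, 0, 0, 1}}
	fmt.Println(ip.IndexingValue()) // 010.000.000.001

	var nilIP *IPAddr
	fmt.Println(nilIP.IndexingValue() == "") // true: nil-safe by design
}
```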

Patch Resolution Rules

Patch accepts string field identifiers and resolves them in the following order:

  1. Struct field name
  2. db tag
  3. json tag

This allows JSON payloads to be applied directly without additional mapping.

type User struct {
    // Indexed as "UserName". Patchable via "UserName"
    UserName string
    
    // Indexed as "email". Patchable via "Email", "email", or "mail".
    Email string `db:"email" json:"mail"`
    
    // Not indexed. Patchable via "Password" or "pass".
    Password string `db:"-" json:"pass"`
    
    // Not indexed. Patchable via "Meta".
    Meta string `rbi:"-"`
}
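Because json tags participate in resolution, a JSON payload can be converted into a patch with no per-field mapping. A minimal sketch, where the local Field type is a stand-in mirroring the shape assumed for rbi.Field (a field identifier plus the new value):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Field is a local stand-in for the patch entry type:
// a field identifier plus its new value.
type Field struct {
	Name  string
	Value any
}

// fieldsFromJSON turns a JSON object into a patch slice. Keys are left
// as-is; Patch resolves them by struct field name, db tag, or json tag.
func fieldsFromJSON(payload []byte) ([]Field, error) {
	var m map[string]any
	if err := json.Unmarshal(payload, &m); err != nil {
		return nil, err
	}
	fields := make([]Field, 0, len(m))
	for k, v := range m {
		fields = append(fields, Field{Name: k, Value: v})
	}
	return fields, nil
}

func main() {
	fields, err := fieldsFromJSON([]byte(`{"mail": "a@b.c", "pass": "s3cret"}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(len(fields)) // 2
}
```

Here "mail" would resolve to Email via its json tag and "pass" to Password, per the example above.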

Index Persistence and Recovery

Indexes are persisted only on Close().

An .rbo marker file is created on startup. If the marker is present during the next open, it indicates an unclean shutdown and automatically triggers a full index rebuild from the stored data.

Memory Usage

All secondary indexes are kept in memory.
Memory usage is roughly proportional to:

  • number of indexed fields,
  • number of records,
  • cardinality and distribution of indexed values.

Memory usage can also grow in these cases:

  • High write rate with slow compaction (larger snapshot-delta overlays)
  • Larger snapshot registry and deeper delta chains
  • Large delta compact thresholds combined with write bursts
  • Long-lived readers that keep old snapshots pinned (delays snapshot cleanup)

Memory stabilizes when write rate, compaction throughput, and snapshot retention are balanced. It grows when write churn consistently outruns compaction/cleanup.

Careful index and snapshot configuration is recommended for large datasets.

Multiple Instances

Multiple DB instances may safely operate on the same bbolt database.
Each instance maintains its own in-memory index.

Bucket name

DB stores all records in a single top-level bbolt bucket.

By default, the bucket name is derived from the value type. A custom bucket name can be provided via Options if explicit control is required (e.g. when the value type is renamed).

Encoding and schema evolution

Values are encoded using msgpack.

Msgpack provides good performance, compact binary representation, and a flat encoding model similar to JSON. This makes it tolerant to many schema changes, including field reordering and movement between embedded and top-level structs. Unlike gob, field decoding does not depend on the exact structural layout of the type.

Most schema changes are handled gracefully:

  • Adding fields – a new index is created for the field; existing records simply have no value for it.
  • Removing fields – the corresponding index is removed, but encoded data for the field remains on disk until the record is updated.
  • Renaming fields – the old index is removed and a new one is created; stored data remains until records are updated.
  • Changing field types – affected indexes are rebuilt; decoding behavior and compatibility are the responsibility of the user.

Indexes for affected fields are automatically rebuilt when schema changes are detected.

Non-goals

This package does not aim to be a relational database or a SQL engine.

  • no projections (SELECT field1, field2),
  • no joins,
  • no aggregation functions (for now),
  • no query-time computed fields.

The focus is on fast selection of complete documents.

Performance notes

  • Package is read-optimized.
  • Prefer batch writes over single inserts, if possible.
  • Always use limits if you do not need the whole set.
  • Not all logical branches are currently optimized.

There is still room for optimization, but the current performance is already suitable for many workloads.

Query performance

Query performance is shape-dependent. Some classes are fast by design, some are highly data-dependent, and some are inherently expensive.

1. Fast-by-design classes

These classes are consistently fast when predicates are selective and LIMIT is small. Typical latency is sub-microsecond to low-microsecond in the current benchmark profile.

  • Unique/equality point query with LIMIT 1
    • O(1) index lookup + O(1) result extraction
  • Top-N on ordered field (ORDER BY field LIMIT N, small N)
    • O(N) in best case (early stop on first buckets)
  • Selective IN/HAS/HASANY with LIMIT
    • approximately O(k + N), where k is the number of touched postings
  • Selective prefix with small limit
    • O(log M + span(prefix) + N)
  • Selective range + order + limit
    • O(scanned_buckets + checked_rows) with early stop
2. Data-dependent classes

These can differ by one to two orders of magnitude for the same query shape. The main reason is that planner/executor cost is dominated by data distribution.

  • Range queries (GT/GTE/LT/LTE)
  • Prefix/text-like filters (PREFIX, SUFFIX, CONTAINS)
  • Moderate OR expressions with mixed predicates

What mostly determines runtime:

  • Predicate selectivity and field cardinality
  • Overlap between OR branches (high overlap increases redundant checks + dedupe work)
  • Order-field cardinality/skew (high-cardinality order can increase scan/probe cost)
  • Prefix span size (short/broad prefix can degenerate into near full-range scan)
  • Offset depth (deep skip forces extra scanning even with small limit)
  • Negative predicates (NOT*) that reduce early-stop opportunities
3. Inherently heavy classes

These are expensive in almost any workload because they force broad candidate enumeration, global deduplication, expensive ordering, and/or large materialization.

  • Wide OR trees with ordering and deep pagination:
    • Needs branch-level scanning, dedupe, global rank merge, then skip large prefixes
    • Practical complexity often approaches O(total_examined_rows), with large constants
  • Broad text scans (CONTAINS / SUFFIX) without a selective anchor:
    • Often requires scanning many index keys/buckets before filtering
    • Little opportunity for early pruning without additional anchors

For heavy and data-dependent classes, benchmark with your real data distribution.
Synthetic uniform datasets often hide worst-case behavior.

Write performance

Write speed depends on how many index entries are touched per operation (changed fields, slice fan-out, uniqueness checks), and on bbolt fsync/IO. Insertions are typically more expensive than updates. Patch* is usually faster than full Set* when only a subset of indexed fields changes.

Batch APIs (BatchSet, BatchPatch, BatchDelete) significantly reduce per-record overhead.
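When bulk-loading, a hypothetical chunking helper keeps each batch call to a bounded size; chunkIDs is not part of the package, just a sketch of how to slice a large workload into BatchSet/BatchDelete-sized pieces:

```go
package main

import "fmt"

// chunkIDs splits ids into consecutive chunks of at most size elements,
// so each chunk can be passed to a single batch call.
func chunkIDs(ids []uint64, size int) [][]uint64 {
	var chunks [][]uint64
	for start := 0; start < len(ids); start += size {
		end := start + size
		if end > len(ids) {
			end = len(ids)
		}
		chunks = append(chunks, ids[start:end])
	}
	return chunks
}

func main() {
	ids := make([]uint64, 10)
	for i := range ids {
		ids[i] = uint64(i)
	}
	// 10 ids in chunks of 3 -> 4 chunks (3, 3, 3, 1)
	fmt.Println(len(chunkIDs(ids, 3))) // 4
}
```

Each chunk shares one write transaction, amortizing per-record and fsync overhead across the chunk.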

Contributing

Pull requests are welcome.
For major changes, please open an issue first.

Documentation

Index

Constants

View Source
const (
	PlanPrefixORMerge      = "plan_or_merge_"
	PlanPrefixORMergeOrder = "plan_or_merge_order_"
)

Variables

View Source
var (
	ErrNotStructType     = errors.New("value is not a struct")
	ErrClosed            = errors.New("database closed")
	ErrRebuildInProgress = errors.New("index rebuild in progress")
	ErrInvalidQuery      = errors.New("invalid query")
	ErrIndexDisabled     = errors.New("index is disabled")
	ErrUniqueViolation   = errors.New("unique constraint violation")
	ErrRecordNotFound    = errors.New("record not found")
	ErrNoValidKeyIndex   = errors.New("no valid key for index")
	ErrNilValue          = errors.New("value is nil")
)

Functions

This section is empty.

Types

type BatchStats added in v0.7.0

type BatchStats struct {
	// Enabled reports whether write combining is enabled.
	Enabled bool
	// Window is current coalescing window duration.
	Window time.Duration
	// MaxBatch is configured maximum combined batch size.
	MaxBatch int
	// MaxQueue is configured maximum queue size (0 means unbounded).
	MaxQueue int
	// AllowCallbacks reports whether callback-bearing writes can be combined.
	AllowCallbacks bool

	// QueueLen is current pending requests in queue.
	QueueLen int
	// QueueCap is current allocated queue capacity.
	QueueCap int
	// WorkerRunning reports whether combiner worker goroutine is active.
	WorkerRunning bool
	// HotWindowActive reports whether adaptive hot coalescing window is active.
	HotWindowActive bool

	// Submitted is number of submit attempts from eligible write calls.
	Submitted uint64
	// Enqueued is number of requests accepted into combiner queue.
	Enqueued uint64
	// Dequeued is number of requests popped from queue for execution.
	Dequeued uint64
	// QueueHighWater is maximum observed queue length.
	QueueHighWater uint64

	// Batches is total executed combined-transaction batches.
	Batches uint64
	// CombinedBatches is number of batches containing more than one request.
	CombinedBatches uint64
	// CombinedOps is total requests executed inside multi-request batches.
	CombinedOps uint64
	// AvgBatchSize is average requests per executed batch.
	AvgBatchSize float64
	// MaxBatchSeen is maximum observed executed batch size.
	MaxBatchSeen uint64

	// CallbackOps is number of requests with PreCommit callbacks executed by combiner.
	CallbackOps uint64
	// CoalescedSetDelete is number of Set/Delete requests collapsed into later Set/Delete of same ID.
	CoalescedSetDelete uint64

	// CoalesceWaits is number of coalescing sleeps performed by worker.
	CoalesceWaits uint64
	// CoalesceWaitTime is total time spent sleeping for coalescing.
	CoalesceWaitTime time.Duration

	// FallbackDisabled is number of write calls not queued because combiner is disabled.
	FallbackDisabled uint64
	// FallbackQueueFull is number of write calls not queued because queue is full.
	FallbackQueueFull uint64
	// FallbackCallbacks is number of write calls not queued because callbacks
	// are present and callback batching is disabled.
	FallbackCallbacks uint64
	// FallbackPatchUnique is a legacy counter of patch calls not queued due to
	// potential unique-field touch (kept for compatibility; expected to stay 0).
	FallbackPatchUnique uint64
	// FallbackClosed is number of write calls rejected by combiner because DB is closed.
	FallbackClosed uint64

	// UniqueRejected is number of queued requests rejected by unique checks before commit.
	UniqueRejected uint64
	// TxBeginErrors is number of write tx begin failures inside combiner.
	TxBeginErrors uint64
	// TxOpErrors is number of write tx operation failures before commit.
	TxOpErrors uint64
	// TxCommitErrors is number of write tx commit failures.
	TxCommitErrors uint64
	// CallbackErrors is number of callback failures returned by PreCommit funcs.
	CallbackErrors uint64
}

BatchStats contains write-combiner queue/batch/fallback diagnostics.

type CalibrationSnapshot added in v0.7.0

type CalibrationSnapshot struct {
	UpdatedAt   time.Time          `json:"updated_at"`
	Multipliers map[string]float64 `json:"multipliers"`
	Samples     map[string]uint64  `json:"samples"`
}

CalibrationSnapshot is a serializable view of planner calibration coefficients and sample counts.

type CalibrationStats added in v0.7.0

type CalibrationStats struct {
	// Enabled reports whether online calibration is enabled.
	Enabled bool
	// SampleEvery controls calibration sampling frequency (every Nth query).
	SampleEvery uint64

	// UpdatedAt is the timestamp of the last calibration state update.
	UpdatedAt time.Time
	// SamplesTotal is the total number of accumulated calibration samples.
	SamplesTotal uint64

	// Multipliers stores per-plan calibration multipliers.
	Multipliers map[string]float64
	// Samples stores per-plan calibration sample counters.
	Samples map[string]uint64
}

CalibrationStats contains online planner calibration diagnostics.

type DB

type DB[K ~string | ~uint64, V any] struct {
	// contains filtered or unexported fields
}

DB wraps a bbolt database and maintains secondary indexes over values of type *V stored in a single bucket. It supports efficient equality and range queries, as well as array membership and array-length queries for slice fields.

DB is safe for concurrent use.

func New added in v0.7.0

func New[K ~uint64 | ~string, V any](bolt *bbolt.DB, options *Options) (db *DB[K, V], err error)

New creates a new indexed DB that uses the provided bbolt database.

The generic type V must be a struct; otherwise ErrNotStructType is returned.

If options is nil, default options are used. For custom configuration, start from DefaultOptions and override required fields. If options.BucketName is empty, the name of the value type V is used as the bucket name. New ensures the bucket exists, optionally loads a persisted index from disk, rebuilds missing parts of the index if allowed, and sets up field metadata and accessors.

New does not manage the underlying *bbolt.DB lifecycle.

func (*DB[K, V]) BatchDelete added in v0.7.0

func (db *DB[K, V]) BatchDelete(ids []K, fns ...PreCommitFunc[K, V]) error

BatchDelete removes all values stored under the provided ids in a single write transaction.

For each key, any existing value is decoded and passed as oldValue to all PreCommitFunc fns. If an error is encountered during processing, the entire operation is rolled back. Missing IDs are skipped and do not trigger callbacks.

func (*DB[K, V]) BatchGet added in v0.7.0

func (db *DB[K, V]) BatchGet(ids ...K) ([]*V, error)

BatchGet retrieves multiple values by their IDs in a single read transaction. The returned slice has the same length as ids; any missing keys have a nil entry at the corresponding index.

func (*DB[K, V]) BatchPatch added in v0.7.0

func (db *DB[K, V]) BatchPatch(ids []K, patch []Field, fns ...PreCommitFunc[K, V]) error

BatchPatch applies the same patch to all values stored under the given IDs in a single write transaction.

Unknown fields are ignored (as in Patch). Any error during processing aborts the entire batch and rolls back the transaction.

Non-existent IDs are skipped and do not trigger callbacks.

BatchPatch allocates a buffer for each encoded value. Patching a large number of values will consume a proportional amount of memory.

func (*DB[K, V]) BatchPatchStrict added in v0.7.0

func (db *DB[K, V]) BatchPatchStrict(ids []K, patch []Field, fns ...PreCommitFunc[K, V]) error

BatchPatchStrict is like BatchPatch, but returns an error if the patch contains field names that cannot be resolved to a known struct field (by name, db or json tag).

func (*DB[K, V]) BatchSet added in v0.7.0

func (db *DB[K, V]) BatchSet(ids []K, newVals []*V, fns ...PreCommitFunc[K, V]) error

BatchSet stores multiple values under the provided IDs in a single write transaction. The lengths of ids and newVals must be equal.

For each key, any existing value is decoded and passed as oldValue to all PreCommitFunc fns. If an error is encountered during any of the processing steps, the transaction is rolled back and the error is returned.

After a successful commit, the in-memory index is updated for all modified keys unless indexing is disabled.

BatchSet allocates a buffer for each encoded value. Storing a large number of values will consume a proportional amount of memory.

func (*DB[K, V]) BatchStats added in v0.7.0

func (db *DB[K, V]) BatchStats() BatchStats

BatchStats returns write-combiner queue/batch/fallback diagnostics.

func (*DB[K, V]) Bolt

func (db *DB[K, V]) Bolt() *bbolt.DB

Bolt returns the underlying *bbolt.DB instance used by this DB. Should be used with caution.

func (*DB[K, V]) BucketName added in v0.2.1

func (db *DB[K, V]) BucketName() []byte

BucketName returns the name of the bucket in which the data is stored.

func (*DB[K, V]) CalibrationStats added in v0.7.0

func (db *DB[K, V]) CalibrationStats() CalibrationStats

CalibrationStats returns current planner calibration stats.

func (*DB[K, V]) Close

func (db *DB[K, V]) Close() error

Close closes the indexed DB.

On the first call, Close:

  • Persists the current index state to the .rbi file unless index persistence is disabled.
  • Removes the .rbo flag file used to detect unsafe shutdowns.
  • Does not close the underlying *bbolt.DB.

Subsequent calls to Close are no-op. After Close, all other methods return ErrClosed.

func (*DB[K, V]) Count

func (db *DB[K, V]) Count(q *qx.QX) (uint64, error)

Count evaluates the expression from the given query and returns the number of matching records. It ignores Order, Offset and Limit fields. If q is nil, Count returns the total number of keys currently present in the database.

func (*DB[K, V]) Delete

func (db *DB[K, V]) Delete(id K, fns ...PreCommitFunc[K, V]) error

Delete removes the value stored under the given id, if any.

The existing value (if present) is decoded and passed as oldValue to all PreCommitFunc fns. If any fn returns an error, the operation is aborted. If the record does not exist, Delete is a no-op and no callbacks are invoked.

func (*DB[K, V]) DisableIndexing

func (db *DB[K, V]) DisableIndexing()

DisableIndexing disables index updates for subsequent write operations.

When indexing is disabled:

  • Index structures are no longer kept up to date.
  • Query, QueryKeys and Count will return an error, because the index is considered invalid.
  • The caller is responsible for rebuilding the index via RebuildIndex before attempting to run queries again.

This is intended for high-throughput batch writes where the index will be rebuilt later.

func (*DB[K, V]) DisableSync

func (db *DB[K, V]) DisableSync()

DisableSync disables fsync for bolt writes. Can help with batch inserts. It should not be used during normal operation.

func (*DB[K, V]) EnableIndexing

func (db *DB[K, V]) EnableIndexing()

EnableIndexing re-enables index updates for subsequent write operations. It does not automatically rebuild or validate any existing index state.

If indexing was previously disabled and writes were performed, the index may be stale or inconsistent until RebuildIndex is called.

func (*DB[K, V]) EnableSync

func (db *DB[K, V]) EnableSync()

EnableSync enables fsync for bolt writes. See DisableSync. By default, fsync is enabled.

func (*DB[K, V]) Get

func (db *DB[K, V]) Get(id K) (*V, error)

Get returns the value stored under id, or nil if the key was not found.

func (*DB[K, V]) GetCalibrationSnapshot added in v0.7.0

func (db *DB[K, V]) GetCalibrationSnapshot() (CalibrationSnapshot, bool)

GetCalibrationSnapshot returns a copy of the current planner calibration state. The bool result is false if the state has not been initialized yet.

func (*DB[K, V]) IndexStats added in v0.7.0

func (db *DB[K, V]) IndexStats() IndexStats[K]

IndexStats returns current index stats. On large databases this can be expensive.

func (*DB[K, V]) LoadCalibration added in v0.7.0

func (db *DB[K, V]) LoadCalibration(path string) error

LoadCalibration reads planner calibration snapshot from a JSON file.

func (*DB[K, V]) MakePatch added in v0.3.0

func (db *DB[K, V]) MakePatch(oldVal, newVal *V) []Field

MakePatch builds and returns a patch describing fields that changed between oldVal and newVal.

The patch includes both indexed and non-indexed fields. For every modified field it adds a Field entry whose Name is the Go struct field name, and whose Value is a deep copy of the value taken from newVal.

If newVal is nil, it returns an empty slice.

func (*DB[K, V]) MakePatchInto added in v0.3.0

func (db *DB[K, V]) MakePatchInto(oldVal, newVal *V, dst []Field) []Field

MakePatchInto is like MakePatch, but writes the result into the provided buffer to reduce allocations.

dst is treated as scratch space: it will be reset to length 0 and then filled with the resulting patch. The returned slice may refer to the same underlying array or a grown one if capacity is insufficient.

If newVal is nil, it returns an empty slice.

func (*DB[K, V]) Patch

func (db *DB[K, V]) Patch(id K, patch []Field, fns ...PreCommitFunc[K, V]) error

Patch applies a partial update to the value stored under the given id, updating only the fields listed in patch.

Unknown field names in patch are silently ignored. All fields, indexed or not, are eligible to be patched. Patch attempts to convert the provided values to the appropriate field type, if possible. If conversion fails for any field, Patch returns an error and no changes are committed.

If no item is found for the specified id, ErrRecordNotFound is returned and callbacks are not invoked.

All PreCommitFunc fns are invoked with the original (old) and patched (new) values before commit. After a successful commit, the in-memory index is updated.

func (*DB[K, V]) PatchIfExists added in v0.5.2

func (db *DB[K, V]) PatchIfExists(id K, patch []Field, fns ...PreCommitFunc[K, V]) error

PatchIfExists is like Patch, but missing records are skipped.

func (*DB[K, V]) PatchStrict

func (db *DB[K, V]) PatchStrict(id K, patch []Field, fns ...PreCommitFunc[K, V]) error

PatchStrict is like Patch, but returns an error if the patch contains field names that cannot be resolved to a known struct field (by name, db or json tag).

func (*DB[K, V]) PatchStrictIfExists added in v0.5.2

func (db *DB[K, V]) PatchStrictIfExists(id K, patch []Field, fns ...PreCommitFunc[K, V]) error

PatchStrictIfExists is like PatchStrict, but missing records are skipped.

func (*DB[K, V]) PlannerStats added in v0.7.0

func (db *DB[K, V]) PlannerStats() PlannerStats

PlannerStats returns current planner stats.

func (*DB[K, V]) Query added in v0.7.0

func (db *DB[K, V]) Query(q *qx.QX) ([]*V, error)

Query evaluates the given query against the index and returns all matching values.

func (*DB[K, V]) QueryKeys

func (db *DB[K, V]) QueryKeys(q *qx.QX) ([]K, error)

QueryKeys evaluates the given query against the index and returns all matching ids.

func (*DB[K, V]) RebuildIndex

func (db *DB[K, V]) RebuildIndex() error

RebuildIndex discards and rebuilds all in-memory index data. It acquires an exclusive lock for its duration. While rebuild is active, new DB operations fail with ErrRebuildInProgress. While it is safe to call at any time, it might be expensive for large datasets.

func (*DB[K, V]) RefreshPlannerStats added in v0.7.0

func (db *DB[K, V]) RefreshPlannerStats() error

RefreshPlannerStats rebuilds planner statistics from the current in-memory index. This is a synchronous, full refresh intended for explicit/manual calls.

func (*DB[K, V]) ReleaseRecords added in v0.7.0

func (db *DB[K, V]) ReleaseRecords(v ...*V)

ReleaseRecords returns records to the record pool.

Make sure that the passed records are no longer used or held.

func (*DB[K, V]) RuntimeStats added in v0.7.0

func (db *DB[K, V]) RuntimeStats() RuntimeStats

RuntimeStats returns process runtime memory stats.

func (*DB[K, V]) SaveCalibration added in v0.7.0

func (db *DB[K, V]) SaveCalibration(path string) error

SaveCalibration writes planner calibration snapshot to a JSON file.

func (*DB[K, V]) ScanKeys added in v0.6.0

func (db *DB[K, V]) ScanKeys(seek K, fn func(K) (bool, error)) error

ScanKeys iterates over keys in the in-memory index snapshot and calls fn for each key greater than or equal to seek.

The scan stops when fn returns false or a non-nil error. The scan does not open a Bolt transaction and may not reflect concurrent writes.

For string keys, iteration order follows internal key index order, not lexicographic order; seek is applied only as a prefix filter.

func (*DB[K, V]) SeqScan

func (db *DB[K, V]) SeqScan(seek K, fn func(K, *V) (bool, error)) error

SeqScan performs a sequential scan over all records starting at the given key (inclusive), decoding each value and passing it to the provided fn. SeqScan stops reading when the fn returns false or a non-nil error. The scan runs inside a read-only transaction which remains open for the duration of the scan.

func (*DB[K, V]) SeqScanRaw

func (db *DB[K, V]) SeqScanRaw(seek K, fn func(K, []byte) (bool, error)) error

SeqScanRaw performs a sequential scan over all records starting at the given key (inclusive), passing raw bytes to the provided fn. These bytes are the msgpack-encoded representation of the value.

SeqScanRaw stops reading when the provided fn returns false or a non-nil error. The database transaction remains open during the scan.

Bytes passed to fn must not be modified.

func (*DB[K, V]) Set

func (db *DB[K, V]) Set(id K, newVal *V, fns ...PreCommitFunc[K, V]) error

Set stores the given value under the specified key ID.

The value is msgpack-encoded and written inside a single write transaction. Any existing value for the key is decoded and passed as oldValue to all PreCommitFunc fns. If any fn returns an error, the transaction is rolled back and the error is returned.
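A sketch of a Set call with a pre-commit guard, using the PreCommitFunc signature documented below and the README's User type; the business rule itself is illustrative:

```go
package main

import (
	"fmt"

	"github.com/vapstack/rbi"
	"go.etcd.io/bbolt"
)

// replaceUser is a sketch: the callback sees the decoded old value (nil if
// the key is new) and can veto the write, rolling back the transaction.
func replaceUser(db *rbi.DB[uint64, User], id uint64, u *User) error {
	return db.Set(id, u,
		func(tx *bbolt.Tx, key uint64, oldValue, newValue *User) error {
			if oldValue != nil && oldValue.Active && !newValue.Active {
				return fmt.Errorf("user %d: refusing to deactivate via Set", key)
			}
			return nil // commit proceeds
		})
}
```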

func (*DB[K, V]) SetCalibrationSnapshot added in v0.7.0

func (db *DB[K, V]) SetCalibrationSnapshot(s CalibrationSnapshot) error

SetCalibrationSnapshot replaces planner calibration state with the provided snapshot.

func (*DB[K, V]) SnapshotStats added in v0.7.0

func (db *DB[K, V]) SnapshotStats() SnapshotStats

SnapshotStats returns current snapshot stats.

func (*DB[K, V]) Stats

func (db *DB[K, V]) Stats() Stats[K]

Stats returns an aggregate diagnostic snapshot by combining results of IndexStats, RuntimeStats, SnapshotStats, PlannerStats, CalibrationStats and BatchStats.

On large databases this can be expensive.

For specific use-cases, prefer calling the corresponding component method directly to avoid collecting unrelated diagnostic data.

func (*DB[K, V]) Truncate added in v0.5.0

func (db *DB[K, V]) Truncate() error

Truncate deletes all values stored in the database. This cannot be undone.

Truncate does not reclaim disk space.

type Field

type Field struct {
	// Name is the logical name of the field to patch.
	// It can be a struct field name, a "db" tag value, or a "json" tag value.
	Name string
	// Value is the new value to assign to the field.
	// Patch* methods will attempt to convert Value to the field's concrete type,
	// including numeric widening and some int/float conversions.
	Value any
}

Field represents a single field assignment used by Patch and BatchPatch. Name is matched against struct field name, "db" tag, or "json" tag, and Value is assigned to the matched field using reflection and conversion rules.
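A sketch of Field values as a Patch call might use them; the exact Patch signature is an assumption based on the "Partial updates (Patch*)" feature description, and only the Field struct itself is documented here:

```go
package main

import "github.com/vapstack/rbi"

// patchUser is hypothetical: it assumes a Patch method accepting Field
// assignments. Name resolution order (struct name, "db" tag, "json" tag)
// follows the Field documentation.
func patchUser(db *rbi.DB[uint64, User], id uint64) error {
	return db.Patch(id,
		rbi.Field{Name: "age", Value: 30},       // matched via `db:"age"` tag
		rbi.Field{Name: "Active", Value: false}, // matched via struct field name
	)
}
```

Numeric widening (e.g. assigning an int to an int64 field) is handled by the conversion rules described above.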

type IndexStats added in v0.7.0

type IndexStats[K ~uint64 | ~string] struct {
	// BuildTime is the duration of the last index rebuild.
	BuildTime time.Duration
	// BuildRPS is an approximate throughput (records per second) of the last index rebuild.
	BuildRPS int
	// LoadTime is the time spent loading a persisted index from disk on the last successful load.
	LoadTime time.Duration

	// LastKey is the largest key present in the database according to the
	// current universe bitmap. For string keys this is derived from the
	// internal string mapping.
	LastKey K
	// KeyCount is the total number of keys currently present in the database.
	KeyCount uint64

	// UniqueFieldKeys contains the number of unique index keys per indexed field name.
	UniqueFieldKeys map[string]uint64
	// Size is the total size of the index, in bytes.
	Size uint64
	// FieldSize contains the size of the index for each indexed field.
	FieldSize map[string]uint64
	// FieldKeyBytes contains total bytes occupied by index keys per field.
	FieldKeyBytes map[string]uint64
	// FieldTotalCardinality contains the sum of posting-list cardinalities per field.
	FieldTotalCardinality map[string]uint64

	// FieldCount is the number of indexed fields in the current snapshot.
	FieldCount int

	// EntryCount is the total number of non-empty index entries across fields.
	EntryCount uint64
	// KeyBytes is the total byte length of index keys across entries.
	KeyBytes uint64
	// BitmapCardinality is the sum of posting bitmap cardinalities across entries.
	BitmapCardinality uint64

	// ApproxStructBytes is approximate memory used by index entry structs.
	ApproxStructBytes uint64
	// ApproxHeapBytes is rough index heap estimate from bitmaps, keys and structs.
	ApproxHeapBytes uint64
}

IndexStats contains index build/load timings and current index shape metrics.

type Options

type Options struct {

	// DisableIndexLoad prevents the indexer from loading previously persisted
	// index data from the .rbi file on startup. If set, the indexer will either
	// rebuild the index from the underlying bucket (unless DisableIndexRebuild
	// is also set) or operate with an empty index until rebuilt.
	DisableIndexLoad bool

	// DisableIndexStore prevents the indexer from saving the in-memory index
	// state to the .rbi file on Close.
	DisableIndexStore bool

	// DisableIndexRebuild skips automatic index rebuilding when the index
	// cannot be loaded or is incomplete. When this is true and the index
	// cannot be reused, queries will fail until the caller explicitly
	// rebuilds the index using RebuildIndex.
	DisableIndexRebuild bool

	// BucketName overrides the default bucket name.
	// By default, bucket name is derived from the name of the value type V.
	BucketName string

	// AnalyzeInterval configures how often planner statistics should be
	// refreshed in the background.
	//
	// Default: 1 hour
	//
	// Negative value disables periodic refresh.
	AnalyzeInterval time.Duration

	// TraceSink receives optional per-query planner tracing events.
	// If nil, planner tracing is disabled.
	//
	// Synchronous or heavy sinks can significantly increase query latency.
	TraceSink func(TraceEvent)

	// TraceSampleEvery controls trace sampling:
	//   - 0: when sink is set, sample every query (equivalent to 1)
	//   - 1: sample every query
	//   - N: sample every Nth query
	TraceSampleEvery uint64

	// SnapshotPinWaitTimeout controls how long Query waits for a snapshot
	// with matching Bolt txID to appear after opening a read transaction.
	// Non-positive values use the default timeout.
	//
	// Default: 1s
	//
	// Query retry budget is bounded by 10x this value.
	// Too low can increase "snapshot is not available" errors under write bursts.
	// Too high can increase tail latency when snapshot publication is delayed.
	SnapshotPinWaitTimeout time.Duration

	// CalibrationEnabled enables online self-calibration of planner
	// cost coefficients using sampled query traces.
	//
	// Enable for workloads with evolving predicate selectivity/cost profile.
	// Keep disabled when strict latency determinism is preferred.
	CalibrationEnabled bool

	// CalibrationSampleEvery controls calibration sampling:
	//   - 0: use default sampling interval
	//   - 1: calibrate every query
	//   - N: calibrate every Nth query
	// The value is ignored when CalibrationEnabled is false.
	//
	// Default: 16
	//
	// Lower values adapt faster but add overhead and sensitivity to noise.
	// Higher values reduce overhead but adapt slower to workload shifts.
	CalibrationSampleEvery uint64

	// CalibrationPersistPath enables optional auto load/save of planner
	// calibration state from/to this JSON file.
	//
	// Default: empty (disabled).
	CalibrationPersistPath string

	// BucketFillPercent controls bbolt bucket fill factor for write operations.
	//
	// Default: 0.8
	//
	// Valid range: (0, 1]
	//
	// Lower values leave more free space on pages (usually larger file and
	// lower split cost on random updates). Higher values pack pages denser
	// (usually smaller file, potentially higher split/relocation cost).
	//
	// Values too low can cause excessive file growth and write amplification.
	// Values too high on churn-heavy workloads can sharply increase write latency.
	BucketFillPercent float64

	// SnapshotMaterializedPredCacheMaxEntries controls max number of cached
	// materialized predicate bitmaps for stable snapshots (without deltas).
	//
	// Default: 16
	//
	// Typical range: 16..256
	//
	// Negative value disables cache for stable snapshots.
	//
	// High values on diverse workloads can cause sharp memory growth.
	SnapshotMaterializedPredCacheMaxEntries int

	// SnapshotMaterializedPredCacheMaxEntriesWithDelta controls max cached
	// materialized predicate bitmaps for snapshots with active deltas.
	//
	// Default: 2
	//
	// Typical range: 2..64
	//
	// Negative value disables cache for delta snapshots.
	//
	// High values can increase GC pressure under write-heavy workloads.
	SnapshotMaterializedPredCacheMaxEntriesWithDelta int

	// SnapshotMaterializedPredCacheMaxBitmapCardinality skips caching very large
	// bitmaps to reduce retained heap and GC pressure.
	//
	// Default: 32K
	//
	// Negative value disables the guard.
	//
	// Negative (disabled) or very large values can significantly increase memory
	// usage for broad predicates.
	SnapshotMaterializedPredCacheMaxBitmapCardinality int

	// SnapshotRegistryMax limits the number of snapshots tracked for txID pinning/floor fallback.
	//
	// Default: 16
	//
	// Typical range: 32..512
	//
	// Higher values retain more snapshots (higher memory).
	// Too low values can increase snapshot misses for long readers.
	SnapshotRegistryMax uint

	// SnapshotDeltaCompactFieldKeys is a per-field threshold for accumulated
	// delta keys; above it field delta is compacted into base index.
	//
	// Default: 256
	//
	// Typical range: 128..2048
	//
	// Negative value disables key-count trigger.
	//
	// Too high can increase delta memory/read CPU.
	// Too low can increase compaction frequency and hurt write latency.
	SnapshotDeltaCompactFieldKeys int

	// SnapshotDeltaCompactFieldOps is a per-field threshold for accumulated
	// add/del cardinality across delta entries.
	//
	// Default: 4096
	//
	// Typical range: 2K..64K
	//
	// Negative value disables ops-count trigger.
	//
	// Too high delays compaction (delta growth, read overhead).
	// Too low can force frequent compaction and hurt write throughput.
	SnapshotDeltaCompactFieldOps int

	// SnapshotDeltaCompactMaxFieldsPerPublish limits how many fields can be
	// compacted from delta into base in one publish pass.
	//
	// Default: 3
	//
	// Negative value disables field compaction in publish path.
	//
	// High values can create severe write-latency spikes.
	SnapshotDeltaCompactMaxFieldsPerPublish int

	// SnapshotDeltaCompactUniverseOps is a threshold for universe add/drop
	// cardinality sum; above it universe delta is compacted into base.
	//
	// Default: 4096
	//
	// Typical values: 2K..64K
	//
	// Negative value disables universe compaction trigger.
	//
	// Too high values can increase overlay growth/read cost.
	// Too low values can increase write-path compaction work.
	SnapshotDeltaCompactUniverseOps int

	// SnapshotDeltaLayerMaxDepth limits per-field delta layer depth.
	// Once exceeded, layered delta is flattened into one layer.
	//
	// Default: 6
	//
	// Typical range: 4..64
	//
	// Negative value disables depth-based flattening.
	//
	// Very high/disabled values can increase read-path CPU and memory.
	// Very low values can increase flattening overhead on writes.
	SnapshotDeltaLayerMaxDepth int

	// SnapshotCompactorMaxIterationsPerRun limits background compaction work per wake-up.
	//
	// Default: 3
	//
	// Typical range: 1..8
	//
	// Zero disables compaction passes.
	//
	// High values increase contention with writers and can degrade throughput.
	SnapshotCompactorMaxIterationsPerRun uint

	// SnapshotCompactorRequestEveryNWrites controls best-effort compactor
	// wakeups under steady write load.
	//
	// Default: 4
	//
	// Typical range: 4..64
	//
	// Lower values improve delta control but increase write contention.
	// Higher values reduce contention but can increase delta memory/read cost.
	//
	// Zero disables periodic write-triggered compactor requests.
	//
	// Value 1 can cause sustained compactor/writer contention and write
	// throughput degradation on heavy write workloads.
	SnapshotCompactorRequestEveryNWrites uint

	// SnapshotCompactorIdleInterval configures one-shot idle debounce for
	// force-drain compaction when snapshot activity stops.
	// After this pause without new snapshot publication, compactor performs a
	// bounded force pass to collapse remaining deltas and aggressively prune
	// snapshot registry for best read-path locality.
	//
	// Default: 2s
	//
	// Typical range: 500ms..10s
	//
	// Non-positive value disables idle force-drain mode.
	//
	// Lower values converge faster after write bursts but can increase
	// compactor/writer contention on bursty workloads.
	// Higher values reduce background churn but keep layered state longer.
	SnapshotCompactorIdleInterval time.Duration

	// BatchWindow enables lightweight write micro-batching window for
	// single-record Set/Patch/Delete operations.
	//
	// Default: 200us
	//
	// Typical range: 10us..500us
	//
	// Non-positive value disables write combining.
	//
	// Higher values can reduce write-path overhead under contention but may
	// increase single-write latency at low load.
	BatchWindow time.Duration

	// BatchMax limits max operations merged into one combined write tx.
	//
	// Default: 16
	//
	// Typical range: 4..64
	//
	// Values <= 1 effectively disable batching.
	//
	// Very high values can create commit-size spikes and tail-latency variance.
	BatchMax int

	// BatchMaxQueue limits pending combined write requests.
	//
	// Default: 512
	//
	// Typical range: 128..8192
	//
	// Non-positive value disables queue cap.
	//
	// Larger values can increase memory usage under sustained overload.
	BatchMaxQueue int

	// BatchAllowCallbacks allows combiner batching for requests with one or more
	// PreCommit callbacks.
	//
	// Default: false
	//
	// When false, any Set/Patch/Delete call with callbacks bypasses combiner queue
	// and is executed via direct single-write path.
	//
	// When true, callback-bearing requests may be combined with other writes and
	// callbacks run inside the same shared write transaction.
	//
	// Limitations:
	// - A callback error aborts the current combined-transaction attempt. The
	//   failed request is isolated and remaining requests are retried without it.
	// - On such abort, the whole current write transaction is rolled back.
	//   Stored records and in-memory index state remain consistent; no partial
	//   data/index changes from the failed attempt are published.
	// - Non-callback transaction errors (put/delete/commit) still fail all
	//   requests from the current combined batch.
	// - Callback execution order follows operation order inside combined batch.
	// - Because surviving requests can be retried after isolating a failed one,
	//   their callbacks may run more than once. Callback logic with side effects
	//   outside this DB write transaction should be idempotent.
	BatchAllowCallbacks bool

	// NumericRangeBucketSize controls the number of sorted numeric keys grouped
	// into one pre-aggregated bucket for range predicate acceleration.
	//
	// Default: 512
	//
	// Non-positive value disables numeric bucket acceleration.
	NumericRangeBucketSize int

	// NumericRangeBucketMinFieldKeys is the minimum number of unique keys in a
	// numeric field required to build range buckets.
	//
	// Default: 8192
	//
	// Non-positive value disables numeric bucket acceleration.
	NumericRangeBucketMinFieldKeys int

	// NumericRangeBucketMinSpanKeys is the minimum range span (in keys) required to
	// route GT/GTE/LT/LTE through bucket acceleration path.
	//
	// Default: 2048
	//
	// Non-positive value disables numeric bucket acceleration.
	NumericRangeBucketMinSpanKeys int
}

Options configures how the indexer works with a bbolt database.

DefaultOptions returns a new options object with all defaults pre-filled. Passing nil options to New is equivalent to DefaultOptions.

func DefaultOptions added in v0.7.0

func DefaultOptions() *Options

DefaultOptions returns a new options object with default settings.
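The usual pattern is to start from the defaults and adjust individual fields; the specific values below are illustrative, not recommendations:

```go
package main

import (
	"time"

	"github.com/vapstack/rbi"
)

// tunedOptions is a sketch built only from fields documented in Options.
func tunedOptions() *rbi.Options {
	opts := rbi.DefaultOptions()
	opts.AnalyzeInterval = 15 * time.Minute // refresh planner stats more often than the 1h default
	opts.BatchWindow = -1                   // non-positive disables write micro-batching
	opts.BucketFillPercent = 0.9            // pack bbolt pages denser than the 0.8 default
	return opts
}
```

Per the Options documentation, passing nil to New is equivalent to passing DefaultOptions().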

type PlanName added in v0.7.0

type PlanName string

PlanName is a stable plan identifier used by tracing and calibration.

const (
	PlanBitmap PlanName = "plan_bitmap"

	PlanCandidateNoOrder PlanName = "plan_candidate_no_order"
	PlanCandidateOrder   PlanName = "plan_candidate_order"

	PlanORMergeNoOrder     PlanName = "plan_or_merge_no_order"
	PlanORMergeOrderMerge  PlanName = "plan_or_merge_order_merge"
	PlanORMergeOrderStream PlanName = "plan_or_merge_order_stream"

	PlanOrdered        PlanName = "plan_ordered"
	PlanOrderedNoOrder PlanName = "plan_ordered_no_order"
	PlanOrderedAnchor  PlanName = "plan_ordered_anchor"
	PlanOrderedLead    PlanName = "plan_ordered_lead"

	PlanLimit              PlanName = "plan_limit"
	PlanLimitOrderBasic    PlanName = "plan_limit_order_basic"
	PlanLimitOrderPrefix   PlanName = "plan_limit_order_prefix"
	PlanLimitPrefixNoOrder PlanName = "plan_limit_prefix_no_order"
	PlanLimitRangeNoOrder  PlanName = "plan_limit_range_no_order"
	PlanUniqueEq           PlanName = "plan_unique_eq"
)

type PlannerFieldStats added in v0.7.0

type PlannerFieldStats struct {
	// DistinctKeys is number of distinct keys in field index.
	DistinctKeys uint64
	// NonEmptyKeys is number of keys with non-empty posting bitmap.
	NonEmptyKeys uint64
	// TotalBucketCard is total cardinality summed across all field buckets.
	TotalBucketCard uint64
	// AvgBucketCard is average bucket cardinality.
	AvgBucketCard float64
	// MaxBucketCard is maximum bucket cardinality.
	MaxBucketCard uint64
	// P50BucketCard is median bucket cardinality.
	P50BucketCard uint64
	// P95BucketCard is 95th percentile bucket cardinality.
	P95BucketCard uint64
}

PlannerFieldStats contains per-field cardinality distribution metrics.

type PlannerStats added in v0.7.0

type PlannerStats struct {
	// Version is the current planner statistics version.
	Version uint64
	// GeneratedAt is the timestamp when planner stats were generated.
	GeneratedAt time.Time
	// UniverseCardinality is the universe cardinality used by planner stats.
	UniverseCardinality uint64
	// FieldCount is the number of fields represented in planner stats.
	FieldCount int
	// Fields contains per-field planner cardinality distribution metrics.
	// The map is deep-copied and safe for caller mutation.
	Fields map[string]PlannerFieldStats

	// AnalyzeInterval is the configured periodic planner analyze interval.
	AnalyzeInterval time.Duration
	// TraceSampleEvery controls trace sampling frequency (every Nth query).
	TraceSampleEvery uint64
}

PlannerStats contains planner snapshot metadata, per-field stats and sampling settings.

type PreCommitFunc

type PreCommitFunc[K ~string | ~uint64, V any] = func(tx *bbolt.Tx, key K, oldValue, newValue *V) error

PreCommitFunc is a callback invoked inside the write transaction just before it is committed.

The callback:

  • Must not modify oldValue or newValue.
  • Must not commit or roll back the transaction.
  • Must not modify records in the bucket managed by this DB instance (or by any other DB instance with enabled indexing), because such writes bypass index synchronization.
  • May perform additional reads or writes within the same transaction.
  • May return an error to abort the operation; in this case the transaction will be rolled back and index state will not be updated.

PreCommitFunc is invoked only for records that exist or are being written. Patch/Delete operations skip missing records and do not invoke callbacks for them.

type RuntimeStats added in v0.7.0

type RuntimeStats struct {
	// Goroutines is the current number of goroutines.
	Goroutines int

	// HeapAlloc is bytes of allocated heap objects.
	HeapAlloc uint64
	// HeapInuse is bytes in in-use heap spans.
	HeapInuse uint64
	// HeapIdle is bytes in idle heap spans.
	HeapIdle uint64
	// HeapReleased is bytes returned from heap to OS.
	HeapReleased uint64
	// HeapObjects is the number of allocated heap objects.
	HeapObjects uint64

	// StackInuse is bytes in stack spans.
	StackInuse uint64
	// MSpanInuse is bytes in mspan allocator metadata.
	MSpanInuse uint64
	// MCacheInuse is bytes in mcache allocator metadata.
	MCacheInuse uint64

	// NextGC is the target heap size for the next GC cycle.
	NextGC uint64
	// LastGC is the timestamp of the last completed GC cycle.
	LastGC time.Time
	// NumGC is the total number of completed GC cycles.
	NumGC uint32
	// GCCPUFraction is the fraction of available CPU used by GC since start.
	GCCPUFraction float64
}

RuntimeStats contains process runtime memory counters.

type SnapshotStats added in v0.7.0

type SnapshotStats struct {
	// TxID is the transaction ID of the published snapshot.
	TxID uint64

	// HasDelta reports whether snapshot contains any delta state.
	HasDelta bool
	// UniverseBaseCard is cardinality of the base universe bitmap.
	UniverseBaseCard uint64
	// IndexLayerDepth is depth of index delta layer chain.
	IndexLayerDepth int
	// LenLayerDepth is depth of length-index delta layer chain.
	LenLayerDepth int

	// IndexDeltaFields is number of fields with effective index delta.
	IndexDeltaFields int
	// LenDeltaFields is number of fields with effective length delta.
	LenDeltaFields int
	// IndexDeltaKeys is total effective keys in index delta layers.
	IndexDeltaKeys int
	// LenDeltaKeys is total effective keys in length delta layers.
	LenDeltaKeys int
	// IndexDeltaOps is total effective operations in index delta layers.
	IndexDeltaOps uint64
	// LenDeltaOps is total effective operations in length delta layers.
	LenDeltaOps uint64

	// UniverseAddCard is cardinality of pending universe additions.
	UniverseAddCard uint64
	// UniverseRemCard is cardinality of pending universe removals.
	UniverseRemCard uint64

	// RegistrySize is number of snapshot entries tracked in registry map.
	RegistrySize int
	// RegistryOrderLen is length of registry order buffer.
	RegistryOrderLen int
	// RegistryHead is current head offset inside registry order buffer.
	RegistryHead int
	// PinnedRefs is number of registry snapshots with active pins.
	PinnedRefs int
	// PendingRefs is number of registry snapshots marked pending.
	PendingRefs int

	// CompactorQueueLen is current compactor request queue length.
	CompactorQueueLen int
	// CompactorRequested is total number of compaction requests.
	CompactorRequested uint64
	// CompactorRuns is total number of compactor loop runs.
	CompactorRuns uint64
	// CompactorAttempts is total latest-snapshot compaction attempts.
	CompactorAttempts uint64
	// CompactorSucceeded is total successful compactions applied.
	CompactorSucceeded uint64
	// CompactorLockMiss is total attempts skipped due to DB lock contention.
	CompactorLockMiss uint64
	// CompactorNoChange is total attempts that produced no effective changes.
	CompactorNoChange uint64
}

SnapshotStats contains copy-on-write snapshot and compactor diagnostics.

type Stats

type Stats[K ~uint64 | ~string] struct {
	// Index contains additional index shape diagnostics useful for memory analysis.
	Index IndexStats[K]
	// Runtime contains process-level memory counters sampled during Stats().
	Runtime RuntimeStats
	// Snapshot contains copy-on-write snapshot/compactor diagnostics.
	Snapshot SnapshotStats
	// Planner contains current planner statistics snapshot and settings.
	Planner PlannerStats
	// Calibration contains current online planner calibration state.
	Calibration CalibrationStats
	// Batch contains write-combiner queue/batch/fallback diagnostics.
	Batch BatchStats
}

Stats is an aggregate diagnostic snapshot of DB state.

It combines outputs of IndexStats, RuntimeStats, SnapshotStats, PlannerStats, CalibrationStats and BatchStats.

For scenario-specific telemetry, prefer calling the corresponding component method directly to avoid unnecessary work.

type TraceEvent added in v0.7.0

type TraceEvent struct {
	Timestamp time.Time
	Duration  time.Duration

	Plan string

	HasOrder   bool
	OrderField string
	OrderDesc  bool
	Offset     uint64
	Limit      uint64

	LeafCount int
	HasNeg    bool
	HasPrefix bool

	EstimatedRows uint64
	EstimatedCost float64
	FallbackCost  float64

	RowsExamined uint64
	RowsReturned uint64

	// ORBranches contains per-branch runtime metrics for OR plans.
	ORBranches []TraceORBranch
	// ORRoute contains route/cost diagnostics for ordered OR merge path.
	ORRoute TraceORRoute

	// OrderIndexScanWidth is the number of non-empty order-index buckets
	// traversed while producing query output.
	OrderIndexScanWidth uint64

	// DedupeCount is the number of duplicate candidates dropped globally.
	DedupeCount uint64

	// EarlyStopReason explains why execution stopped early.
	// Examples: "limit_reached", "input_exhausted", "candidates_exhausted".
	EarlyStopReason string

	Error string
}

TraceEvent is an optional per-query planner execution trace. It is emitted only when TraceSink is configured.
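Since synchronous or heavy sinks increase query latency, a common shape is a non-blocking hand-off to a consumer goroutine. A sketch, using only the documented Options fields:

```go
package main

import "github.com/vapstack/rbi"

// tracingOptions is a sketch: the sink never blocks the query path; events
// are dropped when the consumer falls behind.
func tracingOptions(events chan<- rbi.TraceEvent) *rbi.Options {
	opts := rbi.DefaultOptions()
	opts.TraceSink = func(ev rbi.TraceEvent) {
		select {
		case events <- ev: // hand off to a consumer goroutine
		default: // channel full: drop rather than stall the query
		}
	}
	opts.TraceSampleEvery = 10 // sample every 10th query
	return opts
}
```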

type TraceORBranch added in v0.7.0

type TraceORBranch struct {
	Index int

	RowsExamined uint64
	RowsEmitted  uint64

	Skipped    bool
	SkipReason string
}

type TraceORRoute added in v0.7.0

type TraceORRoute struct {
	Route  string
	Reason string

	KWayCost     float64
	FallbackCost float64
	Overlap      float64
	AvgChecks    float64

	HasPrefixNonOrder   bool
	HasSelectiveLead    bool
	FallbackCollectFast bool

	RuntimeGuardEnabled bool
	RuntimeGuardReason  string

	RuntimeFallbackTriggered    bool
	RuntimeFallbackReason       string
	RuntimeExaminedPerUnique    float64
	RuntimeProjectedExamined    float64
	RuntimeProjectedExaminedMax float64
}

TraceORRoute carries route diagnostics for ordered OR merge decisions.

type ValueIndexer

type ValueIndexer interface {
	IndexingValue() string
}

ValueIndexer defines how a field value is converted into a canonical string representation used as an index key in rbi.

A type that implements ValueIndexer is responsible for ensuring that IndexingValue returns a valid and stable string for every value that may appear in indexed data. This includes handling nil receivers if the type is a pointer or otherwise nillable. The caller does not perform nil checks before invoking IndexingValue.

IndexingValue must return a deterministic string: the same value must always produce the same indexing key.

The returned string is compared lexicographically when evaluating range queries (>, >=, <, <=). Implementations must ensure that the produced ordering matches the intent.
