Data Model

Magneteco uses a hybrid storage approach combining PostgreSQL with pgvector for structured data and embeddings, and Neo4j for relationship graphs.

PostgreSQL Schema

memory_resources

Raw resources (immutable conversation/event logs):

CREATE TABLE memory_resources (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    app_id TEXT NOT NULL,
    user_id TEXT NOT NULL,
    s3_key TEXT NOT NULL,
    content_type TEXT NOT NULL CHECK (content_type IN ('conversation', 'event', 'document', 'webhook')),
    metadata JSONB DEFAULT '{}',
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

memory_items

Atomic memory items (extracted facts):

CREATE TABLE memory_items (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    app_id TEXT NOT NULL,
    user_id TEXT NOT NULL,
    resource_id UUID REFERENCES memory_resources(id) ON DELETE SET NULL,
    category TEXT NOT NULL,
    content TEXT NOT NULL,
    embedding VECTOR(1536),  -- OpenAI ada-002 dimension
    confidence FLOAT NOT NULL DEFAULT 1.0,
    importance TEXT NOT NULL DEFAULT 'medium',
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    last_accessed_at TIMESTAMPTZ,
    access_count INTEGER NOT NULL DEFAULT 0,
    archived BOOLEAN NOT NULL DEFAULT FALSE
);

-- Vector search index
CREATE INDEX idx_items_embedding ON memory_items
    USING ivfflat (embedding vector_cosine_ops)
    WITH (lists = 100);

memory_categories

Category summaries (evolving high-level context):

CREATE TABLE memory_categories (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    app_id TEXT NOT NULL,
    user_id TEXT NOT NULL,
    category TEXT NOT NULL,
    summary TEXT,
    item_count INTEGER NOT NULL DEFAULT 0,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),

    CONSTRAINT memory_categories_unique UNIQUE (app_id, user_id, category)
);

Neo4j Graph Model

Node Types

User Node - Represents a user in the system:

(:User {
    app_id: String,
    id: String,
    created_at: DateTime,
    last_active_at: DateTime
})

Entity Node - Represents any domain object:

(:Entity {
    app_id: String,
    name: String,
    type: String,           // From domain config entityTypes
    properties: Map,        // Dynamic based on entity type
    created_at: DateTime,
    last_accessed: DateTime,
    archived: Boolean
})

Relationship Types

MENTIONED - User mentions an entity:

(User)-[:MENTIONED {
    timestamp: DateTime,
    context: String,
    source_resource: String
}]->(Entity)

RELATES_TO - Entity relationships:

(Entity)-[:RELATES_TO {
    type: String,           // From domain config relationTypes
    created_at: DateTime,
    archived: Boolean,
    properties: Map
}]->(Entity)

SUPERSEDES - Conflict resolution:

(Entity)-[:SUPERSEDES {
    timestamp: DateTime,
    reason: String
}]->(Entity)

Example Graph

    ┌──────────┐                    ┌──────────────┐
    │  User    │───MENTIONED───────►│   Client     │
    │ user-123 │                    │  Acme Corp   │
    └──────────┘                    └───────┬──────┘
         │                                  │
         │                                  │ OWNS
         │ MENTIONED                        ▼
         │                          ┌──────────────┐
         │                          │   Project    │
         └─────────────────────────►│ NetSuite Imp │
                                    └───────┬──────┘
                                            │
                                            │ HAS_REQUIREMENT
                                            ▼
                                    ┌──────────────┐
                                    │ Requirement  │
                                    │  User Auth   │
                                    └──────────────┘

Vector Search Function

PostgreSQL function for vector similarity search with time decay:

CREATE OR REPLACE FUNCTION search_memories_with_decay(
    p_app_id TEXT,
    p_user_id TEXT,
    p_embedding VECTOR(1536),
    p_limit INTEGER DEFAULT 20,
    p_decay_days INTEGER DEFAULT 30,
    p_min_confidence FLOAT DEFAULT 0.6
)
RETURNS TABLE (
    id UUID,
    content TEXT,
    category TEXT,
    final_score FLOAT
) AS $$
BEGIN
    RETURN QUERY
    SELECT
        mi.id,
        mi.content,
        mi.category,
        (
            (1.0 - (mi.embedding <=> p_embedding)) *
            mi.confidence *
            (1.0 / (1.0 + EXTRACT(EPOCH FROM (NOW() - mi.created_at)) / (p_decay_days * 86400)))
        )::FLOAT as final_score
    FROM memory_items mi
    WHERE mi.app_id = p_app_id
    AND mi.user_id = p_user_id
    AND NOT mi.archived
    AND mi.confidence >= p_min_confidence
    ORDER BY mi.embedding <=> p_embedding
    LIMIT p_limit;
END;
$$ LANGUAGE plpgsql;

Data Model

On this page