Priority: medium Complexity: low Status: done
model is reliably available in normalized_usage_events (52/52 rows populated) and derived_message_facts, but does not reach the higher-level derived tables:
| Table | model column |
Populated? |
|---|---|---|
normalized_usage_events |
yes | yes — always filled |
derived_message_facts |
yes | yes — always filled |
derived_session_usage |
no | — |
derived_attempts |
no (has model_provider) |
— |
derived_goals |
yes | always NULL |
derived_projects |
no | — |
This means any consumer that needs model + tokens (cost calculation, model-aware analysis, export) must join back to normalized_usage_events or derived_message_facts — defeating the purpose of pre-aggregated derived tables.
Propagate model during the derive step:
derived_session_usage — add model TEXT column. Populate from normalized_usage_events grouped by session_path. If a session uses multiple models, pick the dominant one (highest usage_event_count).
derived_attempts — add model TEXT column alongside existing model_provider. Same logic: dominant model from usage events for that session.
derived_goals — already has the column, just never populated. Set to dominant model across all sessions for the thread. If mixed, store the most-used one.
derived_projects — consider adding models TEXT (JSON array of distinct models seen). Lower priority.
ALTER TABLE derived_session_usage ADD COLUMN model TEXT (same pattern already used for cache_creation_input_tokens)ALTER TABLE derived_attempts ADD COLUMN model TEXTderived_goals.model — no schema change needed, just populate during deriveSELECT count(*) FROM derived_session_usage WHERE model IS NULL should be 0 for sessions with usage eventsderived_goals.model should match the model seen in derived_message_facts for the same thread