Personal Systems: Data Epistemics

In Jasper, his experiment on recreating the Canon Cat interface, Obenauer notes:

Something unexpected that I really like about this system is that it’s always “correct”. If I record something somewhere else (such as on a piece of paper) and move it into my system a day later, I can still put it in the correct day, if that’s how I have things grouped. In a similar system using my phone’s Notes app, the timestamps are automatic and unchangeable, so there’s a slightly different relationship I have with the timestamps and ordering of notes in this app. Apps with this automatic behavior are ever so slightly not my timestamps, and I relate to them accordingly.

In digital information systems, there is always a tension between canonical or “official” representations and user representations. We may receive an email on 5 Jan but understand that it was sent late and was meant for 3 Jan, or we may backdate a document to reflect that an agreement finalized earlier is only being ratified now.

User-first information systems

Before AI and agents, personal information systems largely focused on enabling user representations on a single-user system. This means the user has full control over the data schema, all data is mutable (may be modified at any point), and concern for legibility to other users is secondary at best.

Truth-first information systems

Enterprise information systems, having auditability as a primary concern, take a truth-first approach: information is recorded as it happens, with a canonical timestamp, and this information is stored read-only; no changes are possible without an audit trail.

Hybrid information systems

An information system that must take into account the possibility of agent interaction needs both. User-first information systems are maximally user-friendly at the expense of legibility, which increases the risk that AI agents misunderstand the user’s schema, particularly its most unconventional parts. Yet fully truth-first information systems leave little room for schema customization, which makes them a poor fit for personal information systems.

What if we separate the two into distinct layers? The truth-first layer records incontrovertible events: “this piece of information entered this system at this time from this source”. It makes investigation and conflict resolution possible, but it is not what a user would reference on a daily basis. It reflects ground truth faithfully and provides it when needed.
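As a minimal sketch of what such a truth layer might look like, the Python below models it as an append-only log of immutable events. The names Event, TruthLog, seq, recorded_at, source and payload are hypothetical, chosen only for illustration; the essay does not prescribe a schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)  # frozen: an event cannot be changed once recorded
class Event:
    seq: int               # position in the log (hypothetical field)
    recorded_at: datetime  # canonical timestamp, set by the system, not the user
    source: str            # where the information came from
    payload: str           # the information itself


class TruthLog:
    """Append-only log: events can be added and read, never edited or removed."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, source: str, payload: str) -> Event:
        event = Event(
            seq=len(self._events),
            recorded_at=datetime.now(timezone.utc),
            source=source,
            payload=payload,
        )
        self._events.append(event)
        return event

    def get(self, seq: int) -> Event:
        return self._events[seq]

    def __len__(self) -> int:
        return len(self._events)


log = TruthLog()
first = log.append(source="paper notebook", payload="Call dentist")
```

Because Event is frozen, an attempt such as first.payload = "edited" raises an error instead of silently rewriting history. Any correction has to arrive as a new event.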

The user-first layer is what a human user works with on a daily basis. We call it the belief layer: this is where a user forms, records, and updates their beliefs. If desired, they can link beliefs to the truth layer to reassure themselves that their recorded information has provenance. This layer is mutable; the user can edit it as they please. Edits may also be recorded in the truth layer, leaving the user a history of edits. This history can be used, programmatically or through AI agents, to perform undo operations or recreate an edit timeline.
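The following Python sketch illustrates one way a belief layer could behave, under stated assumptions. Belief, BeliefLayer, edit_log and source_event are hypothetical names; the edit_log list stands in for the part of the truth layer that records edits; and the year 2024 is arbitrary. The dates echo the earlier example of a note received on 5 Jan but meant for 3 Jan.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Belief:
    text: str
    day: date                           # the user's own date, freely chosen
    source_event: Optional[int] = None  # optional link to a truth-layer event


class BeliefLayer:
    def __init__(self) -> None:
        self.current: Dict[str, Belief] = {}           # what the user sees
        self.edit_log: List[Tuple[str, Belief]] = []   # append-only edit history

    def put(self, key: str, belief: Belief) -> None:
        # Every edit is recorded before it becomes the current belief.
        self.edit_log.append((key, belief))
        self.current[key] = belief

    def timeline(self, key: str) -> List[Belief]:
        # Recreate the edit timeline for one belief from the log.
        return [b for k, b in self.edit_log if k == key]

    def undo(self, key: str) -> bool:
        # Restore the state before the latest edit. The restoration is itself
        # logged as a new edit, so nothing is ever erased from the history.
        states = self.timeline(key)
        if len(states) < 2:
            return False
        self.put(key, states[-2])
        return True


notes = BeliefLayer()
original = Belief("Call dentist", date(2024, 1, 5), source_event=0)
notes.put("dentist", original)
notes.put("dentist", replace(original, day=date(2024, 1, 3)))  # user re-dates it
notes.undo("dentist")                                          # back to 5 Jan
```

In this sketch undo appends rather than deletes, so calling it twice in a row reapplies the undone edit. A real system might instead track an undo cursor; that choice is left open here.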

Open research questions

  • What goes in the truth layer? What goes in the belief layer?
  • How does the required schema for each layer differ?