288 Years of Research Behind MAGS

Pieter van Schalkwyk · June 15, 2026 · 9 min read

The oldest idea in the MAGS codebase was published in 1738. Daniel Bernoulli’s paper on the measurement of risk introduced logarithmic utility: the observation that the same gain matters less the better off you already are. That equation still runs every time a MAGS agent weighs an outcome.

It is not alone. In a single decision cycle, a MAGS agent executes a forgetting curve measured in 1885, a recency effect documented in 1962, information theory from 1948, the economics that won the 2002 Nobel Prize, database transaction theory from 1987, and sequential decision mathematics that Princeton’s Warren Powell spent four decades unifying.

Every one of those references appears as a citation in the source code, at the line where the idea is implemented, and we maintain the index the way a journal paper maintains its bibliography.

That is 288 years of research executing in one decision cycle.

This post is the tour. It walks the lineage layer by layer, from the memory system through the decision system, and shows where each mechanism comes from and why it is there.

Why a Codebase Keeps a Bibliography

When a MAGS agent’s recommendation influences a maintenance decision on a gas compressor, someone eventually asks why. A regulator asks. An insurer asks. The engineer who signed the work order asks. “The model decided” does not survive any of those conversations.

A citation at the implementing line changes the conversation. It says this scoring formula was not improvised by a developer on a Tuesday; it is the Weber-Fechner law, published psychophysics, and you can check our implementation against the literature. A citation turns “trust us” into “verify us,” and that shift matters more in industrial operations than in any other place agents are being deployed.

It also disciplines us. If we cannot ground a mechanism in something stronger than intuition, we flag it as our own engineering judgment: owned, documented, and open to revision. A bibliography loses credibility through over-citing as fast as through under-citing. The standard we hold ourselves to is the same one a peer reviewer would apply to a paper.

The Memory Layer Comes From a Century of Psychology

A MAGS agent watching a plant sees thousands of signals a day. Storing them is trivial. Deciding what matters is the hard problem, and we did not invent our answer to it.

The significance engine, the mechanism that decides which observations deserve the agent’s attention, is assembled from published memory and perception research:

Ebbinghaus (1885). Memories decay. Older observations lose weight on a forgetting curve, so agents do not treat last month’s blip like this morning’s alarm.
Murdock (1962). The serial-position effect. Recent items are recalled preferentially, which becomes recency weighting in memory retrieval.
Weber-Fechner and Stevens. Perceived intensity grows more slowly than the stimulus: logarithmically in Fechner’s formulation, as a power law in Stevens’. Frequency counts are log-compressed so a handful of chatty sensors cannot drown out a quiet, important one.
Shannon (1948). Rare events carry more information, which is the mathematical heart of surprise scoring.
Salton and McGill (1983). The vector space model. Semantic novelty is measured as distance between vectors, an idea in print decades before anyone said “embeddings.”
Box and Jenkins (1970). Statistical baselines that answer “is this activity level unusual?” from rolling history rather than gut feel.
Kahneman and Tversky (1979). Prospect Theory. Context changes significance, so the same anomaly scores higher during a quiet shift than during a chaotic one.

The result is an agent that pays attention the way experienced operators do: it habituates to the routine, startles at the genuinely novel, and weighs the recent over the stale, with each of those behaviors traceable to the paper that described it in humans first.
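The ingredients in that list are each a few lines of arithmetic. The block below is a minimal sketch of them, not the MAGS implementation: the function names, the 24-hour half-life, the exponential form of the forgetting curve, and the z-score standing in for a fitted time-series baseline are all assumptions made for illustration. How the real engine weights and combines these signals is not shown here.

```python
import math
import statistics
from typing import Sequence


def decay_weight(age_hours: float, half_life_hours: float = 24.0) -> float:
    """Forgetting curve, approximated as exponential decay (half-life is assumed)."""
    return 0.5 ** (age_hours / half_life_hours)


def compressed_frequency(count: int) -> float:
    """Log-compress a raw event count so chatty sources grow slowly."""
    return math.log1p(count)


def surprisal_bits(probability: float) -> float:
    """Shannon self-information: rarer events carry more bits."""
    return -math.log2(probability)


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Semantic novelty as 1 - cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norms


def baseline_z_score(value: float, history: Sequence[float]) -> float:
    """How unusual a value is against rolling history (simple stand-in for a fitted model)."""
    spread = statistics.pstdev(history)
    if spread == 0:
        return 0.0  # flat history: nothing to compare against
    return (value - statistics.fmean(history)) / spread
```

With these defaults, an observation one half-life old carries half the weight of a fresh one, and an event with probability 0.25 scores exactly two bits of surprise.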

The Judgment Layer Comes From Economics

MAGS agents do not weigh outcomes with raw numbers. Bernoulli’s utility curve makes marginal gains matter less as outcomes grow. Prospect Theory’s loss-aversion kink makes losses loom larger than equivalent gains, and that asymmetry is there by design, because in industrial operations the downside of a process upset dwarfs the upside of a marginal optimization. Every decision also carries a multi-factor confidence score computed from evidence, consistency, and novelty, then recorded with the decision itself.
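Both curves fit in a few lines. This is a sketch under stated assumptions rather than the MAGS code: the function names are hypothetical, and the exponent 0.88 and loss-aversion coefficient 2.25 are the estimates Tversky and Kahneman published in 1992, used here only as placeholders for whatever values a deployment would choose.

```python
import math


def log_utility(wealth: float) -> float:
    """Bernoulli's logarithmic utility: each extra unit matters less as wealth grows."""
    return math.log(wealth)


def prospect_value(change: float, alpha: float = 0.88, loss_aversion: float = 2.25) -> float:
    """Prospect Theory value of a gain or loss relative to a reference point.

    The kink at zero makes a loss weigh loss_aversion times an equal gain.
    """
    if change >= 0:
        return change ** alpha
    return -loss_aversion * ((-change) ** alpha)


# The same +100 gain is worth less from a higher starting point.
gain_when_small = log_utility(1_100) - log_utility(1_000)
gain_when_large = log_utility(10_100) - log_utility(10_000)

# A loss of 100 outweighs a gain of 100.
asymmetry = abs(prospect_value(-100.0)) / prospect_value(100.0)
```

Here `gain_when_small` exceeds `gain_when_large`, and `asymmetry` comes out at the loss-aversion coefficient, which is the design intent the paragraph describes: a process upset should weigh more than an equal-sized optimization.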

The Reliability Layer Answers a Question Pilots Never Ask

A MAGS agent’s memory spans graph, vector, and time-series databases. Ask what happens when one of those writes fails halfway through a decision, because that question separates a pilot from a production deployment, and answering it is this layer’s entire job. An agent that silently loses memories is an agent whose audit trail lies, and no operator should accept that risk knowingly.

So the persistence layer is built from the distributed-systems canon. Garcia-Molina and Salem’s sagas (1987) govern multi-store writes with compensating actions. Brewer’s CAP theorem and Vogels’ eventual consistency keep us honest about which guarantees hold when parts of the system fail. The MAPE-K loop from IBM’s autonomic computing work drives agent self-healing, while crash-only design, circuit breakers, and jittered backoff stop a hundred agents, all recovering from the same fault, from stampeding the infrastructure. Every memory is checksummed. Every repair is itself audited.
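The saga pattern is easiest to see in miniature. The sketch below assumes three in-memory dictionaries standing in for the graph, vector, and time-series stores; `run_saga`, the step tuples, and the failing write are hypothetical names invented for the example, not MAGS APIs. Each step pairs a write with the compensating action that undoes it, and a failure triggers the compensations in reverse order.

```python
from typing import Callable, Dict, List, Tuple

# (step name, action, compensating action that undoes it)
Step = Tuple[str, Callable[[], object], Callable[[], object]]


def run_saga(steps: List[Step]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: List[Callable[[], object]] = []
    for _name, action, compensate in steps:
        try:
            action()
        except Exception:
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True


# Stand-in stores (assumed for illustration).
graph: Dict[str, object] = {}
vectors: Dict[str, object] = {}
series: Dict[str, object] = {}


def failing_series_write() -> None:
    raise ConnectionError("time-series store unavailable")


steps: List[Step] = [
    ("graph", lambda: graph.update(m1="node"), lambda: graph.pop("m1", None)),
    ("vector", lambda: vectors.update(m1=[0.1, 0.2]), lambda: vectors.pop("m1", None)),
    ("series", failing_series_write, lambda: series.pop("m1", None)),
]

ok = run_saga(steps)
```

After the third write fails, `ok` is false and the first two stores are empty again, so no store holds half a memory. A production saga would also persist its own progress so that a crash mid-compensation can be resumed; that part is omitted here.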

The Knowledge Layer Speaks in Standards

The knowledge layer is built on the W3C semantic stack: RDF and OWL for the models, SPARQL for querying, SHACL for validation, and PROV-O for decision provenance. Equipment and failure taxonomies align to ISO 14224, ISO 15926, ISA-95, and IEC 62443. Decision records that conform to the standards your reliability engineers already use are records they can actually benchmark.

The Decision Layer Runs on Decision Science

MAGS has been decision-first from the start. Objective functions state what each agent optimizes. Utility functions handle the trade-offs when objectives conflict, and deontic rules, the hard limits on what an agent may do, bound its choices before it ever proposes an action. The latest iteration of that decision layer draws on sequential decision analytics, the framework Warren Powell built at Princeton to unify dynamic programming, stochastic optimization, optimal control, and reinforcement learning into one mathematical language.
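The ordering in that paragraph, rules first and trade-offs second, can be sketched as a filter followed by a ranking. Everything named below (`Action`, `choose_action`, the pressure rule, the KPI weights) is a hypothetical illustration under simple assumptions, including a linear weighted-sum utility that a real system need not use.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Action:
    name: str
    kpi_impact: Dict[str, float]  # predicted change per KPI
    setpoints: Dict[str, float] = field(default_factory=dict)


Rule = Callable[[Action], bool]  # returns True when the action is permitted


def choose_action(
    candidates: List[Action], rules: List[Rule], weights: Dict[str, float]
) -> Optional[Action]:
    """Drop every forbidden action, then rank what remains by weighted utility."""
    permitted = [a for a in candidates if all(rule(a) for rule in rules)]
    if not permitted:
        return None  # nothing allowed: propose nothing rather than bend a rule

    def utility(action: Action) -> float:
        return sum(weights.get(kpi, 0.0) * delta for kpi, delta in action.kpi_impact.items())

    return max(permitted, key=utility)


def pressure_within_limit(action: Action) -> bool:
    """Hard limit (assumed value): never command more than 10 bar."""
    return action.setpoints.get("pressure_bar", 0.0) <= 10.0


candidates = [
    Action("raise_pressure", {"yield": 5.0, "energy": -1.0}, {"pressure_bar": 12.0}),
    Action("purge_cycle", {"yield": 2.0, "energy": -0.5}),
    Action("do_nothing", {}),
]
weights = {"yield": 1.0, "energy": 0.5}
chosen = choose_action(candidates, [pressure_within_limit], weights)
```

`raise_pressure` has the highest utility but never reaches the ranking step, so the choice falls to `purge_cycle`. The utility function cannot buy its way past a rule, because the rule is applied before any utility is computed.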

Mapping our architecture onto Powell’s universal model confirmed we had independently converged on the combination of policy classes his work identifies as the one that succeeds in industry. The same mapping showed us the next two capabilities worth building, so we built them.

Agents that learn what their actions actually do. When a MAGS agent plans, it estimates each action’s impact on the KPIs. Those used to be static estimates. The agent now maintains a belief state. For every action and KPI pair, it holds a probability distribution, seeded by the initial estimate and updated after every execution by comparing predicted impact against measured outcome. The arithmetic is conjugate Bayesian, standard since DeGroot’s 1970 textbook. After enough operations, the agent stops believing a purge cycle improves yield by the figure a manual quoted. It has measured the effect, with quantified uncertainty, and its plans use the measured number.
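A conjugate update of this kind takes a handful of lines. The sketch assumes a normal prior with known measurement noise (the normal-normal pair); the article does not say which conjugate family MAGS uses, and the class name, the prior of 4.0, and the measured gains are invented numbers for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Belief:
    """Normal belief about the impact of one action on one KPI."""
    mean: float
    variance: float


def update(belief: Belief, measured: float, noise_variance: float) -> Belief:
    """Conjugate normal-normal update with known measurement noise."""
    precision = 1.0 / belief.variance + 1.0 / noise_variance
    mean = (belief.mean / belief.variance + measured / noise_variance) / precision
    return Belief(mean=mean, variance=1.0 / precision)


def replay(prior: Belief, measurements: Iterable[float], noise_variance: float) -> Belief:
    """Fold a recorded sequence of outcomes into the prior, one execution at a time."""
    belief = prior
    for measured in measurements:
        belief = update(belief, measured, noise_variance)
    return belief


# Illustrative numbers: the manual claims a +4.0 yield gain; measurements say about +1.0.
prior = Belief(mean=4.0, variance=4.0)
measured_gains = [1.2, 0.8, 1.1, 0.9, 1.0]
posterior = replay(prior, measured_gains, noise_variance=1.0)
```

After five executions the posterior mean sits close to the measured gains rather than the quoted figure, and the variance has shrunk well below the prior’s, which is the “measured the effect, with quantified uncertainty” behavior described above.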

Agents that know when to look. Deciding which sensing tool to run, and when, is a textbook exploration-versus-exploitation problem, so we used the textbook answer: upper confidence bounds (Lai and Robbins, 1985; Auer et al., 2002), the multi-armed bandit mathematics that stops an agent from permanently neglecting an information source because it was quiet last week.
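The UCB1 rule from Auer et al. adds an uncertainty bonus to each option’s average payoff, and the bonus grows for options that have been tried rarely. The sketch below treats each sensing tool as an arm; the function name and the example counts are assumptions, and the article does not say which UCB variant MAGS runs.

```python
import math
from typing import Sequence


def ucb1_select(counts: Sequence[int], mean_rewards: Sequence[float]) -> int:
    """Return the index of the arm with the highest upper confidence bound.

    counts[i] is how often arm i has been used; mean_rewards[i] is its average payoff.
    """
    # Any arm never tried is selected first, so every bound is defined.
    for index, count in enumerate(counts):
        if count == 0:
            return index

    total = sum(counts)
    scores = [
        mean + math.sqrt(2.0 * math.log(total) / count)
        for mean, count in zip(mean_rewards, counts)
    ]
    return max(range(len(scores)), key=scores.__getitem__)


# A tool used 100 times with a good record versus one used only twice.
choice = ucb1_select(counts=[100, 2], mean_rewards=[0.6, 0.5])
```

Here `choice` is the rarely used tool: its average is slightly lower, but its confidence bound is much wider, so the rule sends the agent back to look. With equal counts the bonus cancels out and the better average wins.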

Powell’s “Reinforcement Learning and Stochastic Optimization” and “Optimal Learning” now sit in the index alongside Shannon, Kahneman, and Bernoulli.

Learning Without Drift

The standard objection to learning agents in industrial operations is drift: systems that change themselves cannot be trusted near safety-critical processes.

Look at what kind of learning this is. A Bayesian update is arithmetic, with the same recorded data producing bit-identical beliefs every time, and the parameters it adjusts live inside the decision policy with no write-path to the safety constraints that bound it. The agent gets smarter about what works while remaining structurally incapable of relearning what is allowed. Most people assume learning and determinism are in tension. They turn out to be compatible, but only on the right mathematical foundations, which is the entire reason the foundations matter.
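Both properties can be shown in a toy form. The sketch assumes a running-mean belief as the learned parameter and a frozen dataclass as the safety constraint; the names and numbers are hypothetical. A frozen dataclass is only a language-level guard in Python, so a real system would enforce the separation at a process or storage boundary. The point is the shape: learning is a pure function of the recorded outcomes, and it is never handed the limits.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import Sequence


@dataclass(frozen=True)
class SafetyLimits:
    """Hard constraints: frozen, and never passed to the learning code."""
    max_pressure_bar: float


def learned_impact(prior_mean: float, prior_weight: float, outcomes: Sequence[float]) -> float:
    """Posterior mean as a pure function of the prior and the recorded outcomes."""
    mean, weight = prior_mean, prior_weight
    for outcome in outcomes:
        weight += 1.0
        mean += (outcome - mean) / weight
    return mean


limits = SafetyLimits(max_pressure_bar=10.0)
recorded = [1.2, 0.8, 1.1, 0.9, 1.0]

# Replaying the same log reproduces the same belief.
first = learned_impact(4.0, 0.25, recorded)
second = learned_impact(4.0, 0.25, recorded)

# Attempting to rewrite a limit fails instead of silently succeeding.
try:
    limits.max_pressure_bar = 12.0  # type: ignore[misc]
    limits_writable = True
except FrozenInstanceError:
    limits_writable = False
```

`first` and `second` compare equal, and `limits_writable` ends up false: the belief moved toward the data while the limit stayed where it was set.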

How to Evaluate Any Agent Platform, Ours Included

The lineage above is not a marketing claim. It is a standard, and it is one you can apply to any vendor in this market, including us. Three questions do most of the work.

Where does the attention model come from? If the answer is “the LLM decides what is important,” ask what happens to that judgment when the vendor updates the model version. A scoring model grounded in published psychophysics does not move when an API does.

What does the agent’s confidence mean? A number the model asserts about itself cannot be calibrated. A score computed from defined factors can be calibrated, audited, and improved over time.

Can you see the line of code where the decision mathematics lives, and the paper it implements? A vendor that builds on published mathematics will show you. The one that cannot will hesitate, and that hesitation tells you which kind of system you are buying.

We hold MAGS to the same three questions. In a recent public webinar, we put the MAGS source code in front of an LLM and let registered attendees interrogate it live: how confidence scoring works, where the objective functions live, what deontic logic is doing in the safety layer. Nothing was pre-seeded. The only context the model had was the code itself, and we published every question and answer to a public repository afterward. We will run the same session for any qualified, interested party: bring your own questions, put them to the code, and read the answers as they come.

What 288 Years Buys You

Generative AI is the newest component in MAGS, and it does what it is genuinely good at: reading context, drafting plans, and explaining reasoning in human terms. It operates inside an architecture whose attention model comes from psychophysics, whose judgment comes from decision economics, whose reliability comes from distributed-systems theory, whose vocabulary comes from industrial standards, and whose learning comes from decision science.

When someone asks why the agent did that, the answer traces through published, inspectable, hundred-year-tested ideas all the way down. That property costs years of engineering discipline to build and a single line of code to verify. Operators should demand it before any agent gets near a plant.


Tags: mags agentic-ai decision-intelligence industrial-ai agentic-operations
