
Statistic Plugins

A Statistic Plugin describes a kind of computation. It defines the query interface, the access-rule vocabulary, and the execution logic for one statistical operation.

Each Statistic Plugin ships with seven components.

Components

Query Definition

The schema of the query input: which graph paths are required, which parameters are accepted. This is what the R client serializes into a request to the node.

Statistic Access Rule Definition

The schema of a Statistic Access Rule (SAR) that authorizes this statistic. Different statistics need different constraint vocabularies: a mean needs a minCount, while a survival analysis may eventually need differential-privacy parameters. Because each plugin supplies its own SAR Definition, the access-control language can be extended without modifying the core.

Statistic Access Rule UI Component

The editor that data administrators use to write Statistic Access Rules for this statistic in the OXFORDIA admin dashboard. The UI is generated from the SAR Definition.

Statistic Access Rule Evaluation Function

The function the Query Access Evaluator calls to decide whether a given query is permitted under a given SAR. This runs before database execution — if authorization fails, no query is issued.
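
As a minimal sketch of such a pre-execution check, the R snippet below decides whether a query is permitted under a rule. The shapes of `query` and `sar`, the `allowed_paths` field, and the function name `evaluate_sar` are hypothetical and chosen only for illustration; the actual schemas are defined by each plugin, and the node-side implementation language is not stated here.

```r
# Hypothetical sketch: field and function names are assumptions, not the OXFORDIA API.
evaluate_sar <- function(query, sar) {
  # The rule must authorize the statistic the query asks for.
  if (!identical(query$statistic, sar$statistic)) {
    return(list(permitted = FALSE, reason = "statistic not covered by rule"))
  }
  # The rule must cover the requested graph path (assumed field: allowed_paths).
  if (!(query$graph_path %in% sar$allowed_paths)) {
    return(list(permitted = FALSE, reason = "graph path not covered by rule"))
  }
  list(permitted = TRUE, reason = NA_character_)
}

sar   <- list(statistic = "mean", allowed_paths = c("BaselineAge"), minCount = 10)
query <- list(statistic = "mean", graph_path = "BaselineAge")

decision <- evaluate_sar(query, sar)
if (decision$permitted) {
  message("authorized: the query may proceed to execution")
} else {
  message("rejected before execution: ", decision$reason)
}
```

Returning a reason alongside the decision is a design choice of this sketch; it lets the caller report why a query was refused without touching the database.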

Query Execution Function

Translates the high-level aggregate query into a concrete SPARQL query and executes it against the local triplestore.
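
To illustrate the translation step, the sketch below builds a SPARQL aggregate query for a mean. It assumes that a graph shortcut resolves to a single predicate IRI; the lookup table, the example.org IRI, and the function name `build_mean_sparql` are hypothetical. The sketch only constructs the query string and does not show execution against a triplestore.

```r
# Hypothetical mapping from graph shortcut to predicate IRI (placeholder IRI).
shortcuts <- c(BaselineAge = "http://example.org/schema/baselineAge")

build_mean_sparql <- function(graph_path, shortcuts) {
  # Refuse shortcuts that are not known locally.
  if (!(graph_path %in% names(shortcuts))) {
    stop("unknown graph shortcut: ", graph_path)
  }
  iri <- shortcuts[[graph_path]]
  # AVG and COUNT are standard SPARQL 1.1 aggregates; the count is what a
  # later minCount check would need.
  sprintf(
    "SELECT (AVG(?v) AS ?mean) (COUNT(?v) AS ?n)\nWHERE { ?record <%s> ?v . }",
    iri
  )
}

cat(build_mean_sparql("BaselineAge", shortcuts), "\n")
```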

Post-Query Evaluation Function

Enforces result-dependent constraints after database execution but before the result is returned. The canonical example is the minCount guard: if fewer records contributed to the result than the threshold set in the SAR, the query is rejected and nothing is returned to the researcher.
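
The minCount guard can be sketched in a few lines of R. The result shape (`value`, `n`) mirrors the fields shown in the Mean R usage, while the function name `post_query_check` and the `released` flag are hypothetical.

```r
# Hypothetical sketch of a minCount guard; names are assumptions.
post_query_check <- function(result, sar) {
  if (result$n < sar$minCount) {
    # Too few contributing records: release no part of the result.
    return(list(released = FALSE))
  }
  list(released = TRUE, value = result$value, n = result$n)
}

sar <- list(minCount = 10)

small <- post_query_check(list(value = 51.2, n = 4), sar)
large <- post_query_check(list(value = 48.7, n = 25), sar)

small$released  # FALSE: 4 records is below the threshold of 10
large$released  # TRUE
```

Note that the rejected branch carries neither the value nor the count, since the count itself can be sensitive when it is small.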

R Client Library

The R-side wrapper that researchers call directly, for example oxfordia_mean() for the Mean plugin.


Reference implementations

Mean

Computes the arithmetic mean of a numeric graph path.

Package: oxfordia-plugin-stat-mean

Query parameters:

| Parameter | Required | Description |
|---|---|---|
| graph_path | yes | The graph shortcut to compute the mean over |

SAR constraints:

| Constraint | Description |
|---|---|
| minCount | Minimum number of records that must contribute before a result is returned |

R usage:

result <- oxfordia_mean(
  targets    = targets,
  auth       = auth,
  graph_path = "BaselineAge"
)
result$value    # global weighted mean
result$n        # total record count
result$per_site # tibble: site, mean, count
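
As a worked illustration of how the global value relates to the per-site table, the sketch below recombines per-site means into a global mean. It assumes the weights are the per-site record counts; the site names and numbers are made up for the example, and a plain data.frame stands in for the tibble.

```r
# Made-up per-site results with the same columns as result$per_site.
per_site <- data.frame(
  site  = c("site_a", "site_b", "site_c"),
  mean  = c(60, 50, 40),
  count = c(10, 30, 60)
)

# Total record count across sites.
n <- sum(per_site$count)

# Count-weighted mean: each site contributes in proportion to its records.
value <- sum(per_site$mean * per_site$count) / n

value  # 45, whereas the unweighted mean of the site means would be 50
n      # 100
```

Base R's `weighted.mean(per_site$mean, per_site$count)` gives the same value.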

Median

Computes the median of a numeric graph path.

Package: oxfordia-plugin-stat-median

Query parameters:

| Parameter | Required | Description |
|---|---|---|
| graph_path | yes | The graph shortcut to compute the median over |

SAR constraints:

| Constraint | Description |
|---|---|
| minCount | Minimum number of records that must contribute |

Kaplan–Meier

Computes a Kaplan–Meier survival curve, optionally stratified by a categorical variable.

Package: oxfordia-plugin-stat-kaplan-meier

Query parameters:

| Parameter | Required | Description |
|---|---|---|
| time_path | yes | Graph shortcut for time-to-event |
| event_path | yes | Graph shortcut for event indicator (boolean) |
| stratify_path | no | Graph shortcut for a stratification variable |

SAR constraints:

| Constraint | Description |
|---|---|
| minCount | Minimum number of records that must contribute |

R usage:

result <- oxfordia_kaplan_meier(
  targets       = targets,
  auth          = auth,
  time_path     = "EventTime",
  event_path    = "EventOccurred",
  stratify_path = "GeneticGroup"
)
plot(result)

Privacy considerations for survival analysis

Kaplan–Meier curves necessarily expose information about individual late events in the tail of the curve: as follow-up time increases, fewer subjects remain at risk, so a single step in the tail can correspond to one person's event. For sensitive cohorts, consider setting a conservative minCount and refer to the roadmap item on differential-privacy bounds (§7.2 of the whitepaper).
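
The tail exposure can be seen by computing the product-limit estimate by hand. The sketch below is a plain base-R illustration on made-up data, not the plugin's implementation; the function name `km_table` is hypothetical.

```r
# Product-limit (Kaplan–Meier) table computed by hand on toy data.
km_table <- function(time, event) {
  # Distinct times at which at least one event occurred.
  event_times <- sort(unique(time[event]))
  # Subjects still under observation just before each event time.
  at_risk <- vapply(event_times, function(t) sum(time >= t), integer(1))
  # Events observed at each event time.
  events <- vapply(event_times, function(t) sum(time == t & event), integer(1))
  data.frame(
    time     = event_times,
    at_risk  = at_risk,
    events   = events,
    survival = cumprod(1 - events / at_risk)
  )
}

# Toy cohort: FALSE marks a censored observation.
time  <- c(2, 3, 3, 5, 8, 12, 20)
event <- c(TRUE, TRUE, FALSE, TRUE, TRUE, FALSE, TRUE)

tab <- km_table(time, event)
print(tab)
# In the last row at_risk is 1 and events is 1: the final step of the curve
# reveals one individual's event time.
```

A minCount on the total cohort does not by itself bound the at-risk count in the tail, which is why a conservative threshold is suggested for sensitive cohorts.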


Writing a new Statistic Plugin

A Statistic Plugin is a package that exports the seven components listed above. Adding a new statistic — say, Standard Deviation — does not require modifying the Mean plugin, any Data Plugin, or the OXFORDIA core. It only requires implementing the seven components and registering them with the node.
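
As a rough sketch of what such a package might bundle, the R snippet below lays out a Standard Deviation plugin as a named list with one slot per component and checks that all seven are present. Every name here (the slot names, `sd_plugin`, `not_implemented`) is hypothetical, the choice of minCount as the SAR constraint is an assumption, and the actual registration mechanism is not shown because it is not described in this section.

```r
# Hypothetical slot names, one per component listed under Components.
required_components <- c(
  "query_definition", "sar_definition", "sar_ui_component",
  "sar_evaluation", "query_execution", "post_query_evaluation", "r_client"
)

# Placeholder for parts this sketch does not implement.
not_implemented <- function(...) stop("not implemented in this sketch")

sd_plugin <- list(
  query_definition      = list(graph_path = list(required = TRUE)),
  sar_definition        = list(minCount = list(type = "integer")),
  sar_ui_component      = "generated from sar_definition",
  sar_evaluation        = not_implemented,
  query_execution       = not_implemented,
  # Assumed guard: release only when enough records contributed.
  post_query_evaluation = function(result, sar) result$n >= sar$minCount,
  r_client              = not_implemented
)

# A completeness check of the kind a node could run before accepting a plugin.
missing_components <- setdiff(required_components, names(sd_plugin))
if (length(missing_components) > 0) {
  stop("plugin is missing: ", paste(missing_components, collapse = ", "))
}
```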

Refer to the Mean plugin source at github.com/OXFORDIA-project/OXFORDIA-node as the reference implementation.