Data Plugins¶
A Data Plugin describes a domain-specific shape of data. It tells OXFORDIA what the data means, how it is structured, and how researchers should refer to it in queries.
Each Data Plugin ships with four components.
Components¶
Data Schema¶
A formal contract that data must satisfy to be accepted as valid input to the plugin's tooling. The schema is written in ShEx (Shape Expressions). Data is validated against the schema automatically at ingest, so downstream tools can rely on the structure the schema declares.
Graph Shortcuts¶
Research questions rarely map directly onto the underlying RDF graph structure. "Baseline age", for example, is not a single value but a path through related concepts: a Person has an Age Aspect, which has a Magnitude, which has a numeric value and a unit of measure.
Spelling that out on every query is tedious and error-prone. The graph-shortcut mechanism gives the whole path a stable, human-readable name (here, BaselineAge) that researchers and administrators can use instead of writing out the full path.
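As a minimal sketch of the idea, the path described above can be held in a map from shortcut name to SPARQL property path and expanded into a query pattern on demand. The `ex:` prefix, the predicate names (`ex:hasAgeAspect`, `ex:hasMagnitude`, `ex:numericValue`), and the helpers `expand_shortcut` and `build_query` are hypothetical, chosen for illustration only; they are not OXFORDIA's actual vocabulary or API.

```python
# Hypothetical shortcut map: shortcut name -> SPARQL 1.1 property path.
# The ex: predicates are illustrative placeholders, not OXFORDIA's vocabulary.
SHORTCUTS = {
    "BaselineAge": "ex:hasAgeAspect/ex:hasMagnitude/ex:numericValue",
}


def expand_shortcut(name: str, subject: str = "?person", obj: str = "?value") -> str:
    """Return a SPARQL triple pattern that spells out the named shortcut."""
    try:
        path = SHORTCUTS[name]
    except KeyError:
        raise KeyError(f"unknown graph shortcut: {name!r}") from None
    return f"{subject} {path} {obj} ."


def build_query(name: str) -> str:
    """Wrap the expanded pattern in a minimal SELECT query."""
    pattern = expand_shortcut(name)
    return (
        "PREFIX ex: <http://example.org/>\n"
        "SELECT ?person ?value WHERE {\n"
        f"  {pattern}\n"
        "}"
    )


print(build_query("BaselineAge"))
```

The caller only ever writes the name `BaselineAge`; if the underlying path changes, only the map entry needs updating.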
UI Component¶
The plugin contributes its own resource view and any custom importers it requires (a CSV upload form, for example) to the OXFORDIA admin dashboard. When a dataset of this plugin's type is loaded, the dashboard uses the plugin's UI component to display and manage it.
R Client Library¶
The plugin exposes a typed R interface so that researchers can refer to plugin-provided graph shortcuts (e.g., BaselineAge) by name in their queries, without needing to know the underlying RDF structure.
Reference implementation: Nemaline Data Plugin¶
The reference Data Plugin models nemaline myopathy clinical trial records.
Package: oxfordia-plugin-data-nemaline
Graph shortcuts¶
| Shortcut | Description |
|---|---|
| BaselineAge | Age of participant at study enrollment |
| EventTime | Time from baseline to the event or censoring |
| EventOccurred | Whether the event occurred (boolean) |
| GeneticGroup | Genetic subgroup classification |
| Ambulation | Ambulatory status |
| TotalMFM | Total Motor Function Measure score |
CSV import format¶
The Nemaline plugin accepts a standardized CSV. Required columns:
| Column | Type | Description |
|---|---|---|
| ID | integer | Participant identifier |
| CLUSTER | string | Cluster assignment |
| GENETIC_GROUP | string | Genetic variant group |
| BASELINE_AGE | float | Age at enrollment (years) |
| AMBULATION | string | Ambulatory / Non-ambulatory |
| TOTAL_MFM | integer | Total MFM score |
| KM_EVENT | 0 or 1 | Event indicator |
| KM_TIME_YR | float | Time to event or censoring (years) |
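The column contract above can be checked before upload with a few lines of Python. This is a sketch under stated assumptions, not the plugin's importer: the function names (`parse_nemaline_csv`, `non_empty`, `parse_event`) are hypothetical, the sample row is made up, and the real ingest path validates against the plugin's ShEx schema rather than with code like this.

```python
import csv
import io


def non_empty(raw):
    """String columns must be present and non-blank."""
    if raw is None or not raw.strip():
        raise ValueError("empty value")
    return raw.strip()


def parse_event(raw):
    """KM_EVENT must be exactly 0 or 1."""
    if raw is None or raw.strip() not in ("0", "1"):
        raise ValueError(f"KM_EVENT must be 0 or 1, got {raw!r}")
    return int(raw)


# Required column -> parser, taken from the required-columns table.
REQUIRED_COLUMNS = {
    "ID": int,
    "CLUSTER": non_empty,
    "GENETIC_GROUP": non_empty,
    "BASELINE_AGE": float,
    "AMBULATION": non_empty,
    "TOTAL_MFM": int,
    "KM_EVENT": parse_event,
    "KM_TIME_YR": float,
}


def parse_nemaline_csv(text: str) -> list[dict]:
    """Parse CSV text; raise ValueError on missing columns or bad values."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    rows = []
    for raw in reader:
        try:
            rows.append({col: parse(raw[col]) for col, parse in REQUIRED_COLUMNS.items()})
        except (TypeError, ValueError) as exc:
            # reader.line_num is the source line the current row ended on.
            raise ValueError(f"line {reader.line_num}: {exc}") from exc
    return rows


# Made-up sample row for illustration only; not real trial data.
SAMPLE = (
    "ID,CLUSTER,GENETIC_GROUP,BASELINE_AGE,AMBULATION,TOTAL_MFM,KM_EVENT,KM_TIME_YR\n"
    "1,C1,G1,12.5,Ambulatory,45,0,3.2\n"
)

print(parse_nemaline_csv(SAMPLE))
```

Keeping the parsers in one dictionary means the check stays in step with the table: adding or retyping a column is a one-line change.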
Writing a new Data Plugin¶
A Data Plugin is a package that exports:
- A ShEx schema for validation
- A graph shortcut map (shortcut name → SPARQL property path)
- A UI component (React) for the admin dashboard
- An R package exposing typed query helpers
Refer to the Nemaline plugin source as the reference implementation.
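To make the four exports concrete, the sketch below models them as a single descriptor. Everything here is an assumption for illustration: `DataPluginManifest`, its field names, the example package name, and the file paths are hypothetical and do not describe how OXFORDIA actually discovers or loads plugins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataPluginManifest:
    """Hypothetical descriptor for the four exports of a Data Plugin."""

    package: str
    shex_schema: str                  # path to the ShEx schema file
    graph_shortcuts: dict[str, str]   # shortcut name -> SPARQL property path
    ui_component: str                 # entry point of the React component
    r_package: str                    # name of the companion R package

    def missing_parts(self) -> list[str]:
        """Return the names of exports that were left empty."""
        parts = {
            "shex_schema": self.shex_schema,
            "graph_shortcuts": self.graph_shortcuts,
            "ui_component": self.ui_component,
            "r_package": self.r_package,
        }
        return [name for name, value in parts.items() if not value]


# Illustrative values only; none of these names come from a real plugin.
manifest = DataPluginManifest(
    package="oxfordia-plugin-data-example",
    shex_schema="schema/example.shex",
    graph_shortcuts={"BaselineAge": "ex:hasAgeAspect/ex:hasMagnitude/ex:numericValue"},
    ui_component="ui/ExampleView.tsx",
    r_package="oxfordiaExample",
)

print(manifest.missing_parts())
```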