Supported Inputs
This page documents the current file and directory structures that PreFlight UI should accept and validate before handoff to core analysis.
Supported root structure
PreFlight UI should validate the same runner contract used by the deployed pipeline:
<input-root>/<project_id>/01_sources/
The required file at that location is:
<input-root>/<project_id>/01_sources/sources.json
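As a minimal sketch of the path contract above, the manifest location can be derived from the input root and project ID. The function name `manifest_path` is hypothetical and is not part of PreFlight UI.

```python
from pathlib import Path


def manifest_path(input_root: str, project_id: str) -> Path:
    """Build <input-root>/<project_id>/01_sources/sources.json."""
    return Path(input_root) / project_id / "01_sources" / "sources.json"
```

For example, `manifest_path("/data", "proj1")` points at `/data/proj1/01_sources/sources.json` on a POSIX system.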
Required inputs
The minimum valid package contains:
- sources.json
- portfolio_selected.csv
The selection file must be resolvable through sources.json.
Optional supported inputs
PreFlight UI should recognize, validate, and classify these as optional rather than required:
- assays.csv
- compounds.csv
- targets.csv
- structures/*.pdb
Optional inputs can unlock additional downstream analysis depth, but their absence should not be treated as an automatic submission failure unless a selected workflow explicitly depends on them.
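One way to express this rule is to report a missing optional input as a warning unless the selected workflow depends on it. The sketch below assumes hypothetical severity labels (`"warning"`, `"error"`) and a hypothetical `workflow_requires` set; neither is defined by the current contract.

```python
OPTIONAL_INPUTS = ("assays.csv", "compounds.csv", "targets.csv", "structures/*.pdb")


def classify_missing(present, workflow_requires=frozenset()):
    """Map each missing optional input to a severity label (labels are assumed)."""
    findings = {}
    for name in OPTIONAL_INPUTS:
        if name in present:
            continue
        # Missing optional input: only an error if the workflow depends on it.
        findings[name] = "error" if name in workflow_requires else "warning"
    return findings
```

With no workflow dependencies declared, every missing optional input is reported as a warning and the submission is not blocked.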
Supported file expectations
sources.json
The manifest should point only to files that exist under 01_sources/. Common keys include:
- portfolio_selected_csv
- primary_candidate_id
- assays_csv
- compounds_csv
- targets_csv
- pdbs
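The requirement that the manifest stay within 01_sources/ can be checked by resolving each path entry and rejecting any that escape the directory. This sketch assumes the four `*_csv` keys hold path strings relative to 01_sources/. The value shape of `pdbs` is not specified here, so it is left out, and file existence is a separate check that this sketch does not perform. The function name `escaping_entries` is hypothetical.

```python
from pathlib import Path

# Assumption: these keys hold path strings relative to 01_sources/.
PATH_KEYS = ("portfolio_selected_csv", "assays_csv", "compounds_csv", "targets_csv")


def escaping_entries(manifest: dict, sources_dir: Path) -> list:
    """Return manifest keys whose value resolves outside sources_dir."""
    root = sources_dir.resolve()
    bad = []
    for key in PATH_KEYS:
        value = manifest.get(key)
        if not isinstance(value, str):
            continue  # key absent or not a path string; not handled here
        target = (root / value).resolve()
        if not target.is_relative_to(root):  # requires Python 3.9+
            bad.append(key)
    return bad
```

An entry such as `"../assays.csv"` or an absolute path elsewhere on disk would be flagged, while `"portfolio_selected.csv"` would pass.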
portfolio_selected.csv
The selected portfolio file must include at least one of these structure columns:
- smiles
- canonical_smiles
Recommended identifiers and metadata:
- candidate_id
- compound_id
- name
- Customer metadata columns that should flow into analysis context
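The structure-column requirement can be checked from the CSV header alone. The sketch below assumes exact, lowercase column names with surrounding whitespace ignored; whether PreFlight matches names case-insensitively is not stated here. The function name `structure_columns` is hypothetical.

```python
import csv
import io

STRUCTURE_COLUMNS = ("smiles", "canonical_smiles")


def structure_columns(csv_text: str) -> list:
    """Return the supported structure columns found in the CSV header."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    names = {column.strip() for column in header}
    return [column for column in STRUCTURE_COLUMNS if column in names]
```

An empty result means the portfolio file has no supported structure column and should fail validation.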
Structure files
When structure files are included, they should be staged as .pdb files under a structure directory referenced by the manifest.
What PreFlight should check
At a minimum, the validator should confirm:
- The <project_id> folder exists
- The 01_sources directory exists
- sources.json exists and is parseable
- portfolio_selected.csv resolves from the manifest
- The portfolio file contains at least one supported structure column for molecules
- Referenced optional files exist when listed in the manifest
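The checklist above could be run in order as in the following sketch. It is an illustration under stated assumptions, not the PreFlight implementation: the function name `preflight`, the result shape, and the check labels are hypothetical, manifest values are assumed to be path strings relative to 01_sources/, and the `pdbs` entry is skipped because its value shape is not specified here.

```python
import csv
import json
from pathlib import Path

STRUCTURE_COLUMNS = {"smiles", "canonical_smiles"}
OPTIONAL_KEYS = ("assays_csv", "compounds_csv", "targets_csv")  # pdbs omitted


def preflight(input_root, project_id):
    """Run the minimum checks in order; return a list of (check, passed)."""
    project = Path(input_root) / project_id
    sources = project / "01_sources"
    results = [
        ("project folder exists", project.is_dir()),
        ("01_sources directory exists", sources.is_dir()),
    ]

    try:
        manifest = json.loads((sources / "sources.json").read_text(encoding="utf-8"))
    except (OSError, ValueError):  # missing, unreadable, or invalid JSON
        manifest = None
    parseable = isinstance(manifest, dict)
    results.append(("sources.json exists and is parseable", parseable))
    if not parseable:
        return results  # every later check depends on the manifest

    rel = manifest.get("portfolio_selected_csv")
    portfolio = sources / rel if isinstance(rel, str) else None
    resolved = portfolio is not None and portfolio.is_file()
    results.append(("portfolio_selected.csv resolves from the manifest", resolved))

    has_structure = False
    if resolved:
        with portfolio.open(newline="", encoding="utf-8") as handle:
            header = next(csv.reader(handle), [])
        has_structure = bool(STRUCTURE_COLUMNS & {h.strip() for h in header})
    results.append(("portfolio has a structure column", has_structure))

    # Optional files are only checked when the manifest lists them.
    for key in OPTIONAL_KEYS:
        if key in manifest:
            value = manifest[key]
            exists = isinstance(value, str) and (sources / value).is_file()
            results.append((f"optional file exists: {key}", exists))
    return results
```

Returning every result rather than stopping at the first failure lets the UI show all problems in one pass. The early return covers only the case where the manifest cannot be read, since the remaining checks have nothing to inspect.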
Known limitations
- The validator can confirm structural readiness, but it does not guarantee scientific sufficiency for every module family.
- Missing optional files may be acceptable for ingestion but can still reduce module coverage later.
- Support for formats other than the current CSV- and PDB-centric contract should be documented explicitly before being presented as supported.
For the analysis-side interpretation of these constraints, see Prepare Inputs.