CLI reference

Every gapsmith subcommand, with an exhaustive flag list. Generated from --help and annotated per section. Run any command with -h to see the same information locally.


Global options

These apply to every subcommand:

| Flag | Description |
| --- | --- |
| --data-dir <PATH> | Override gapseq reference-data directory. Resolution order when absent: GAPSMITH_DATA_DIR env → $XDG_DATA_HOME/gapsmith → <exe-dir>/../dat → ./dat. |
| --seq-dir <PATH> | Override the sequence database directory. Defaults to <data-dir>/seq. |
| -K, --threads <N> | Worker threads. Default: all available cores. |
| -v, -vv, -vvv | Logging verbosity (info / debug / trace). |
| -h, --help | Print help. |

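The resolution order for the data directory can be sketched as a small function. This is only an illustration: it assumes the four locations are tried in the order listed and that an explicit --data-dir bypasses the search. The function name and its arguments are hypothetical, not part of gapsmith.

```python
from pathlib import Path
from typing import List, Mapping, Optional


def data_dir_candidates(
    cli_value: Optional[str],
    env: Mapping[str, str],
    exe_dir: Path,
    cwd: Path,
) -> List[Path]:
    """Return candidate data directories, highest priority first (assumed order)."""
    if cli_value:
        # An explicit --data-dir short-circuits the search.
        return [Path(cli_value)]
    candidates: List[Path] = []
    if env.get("GAPSMITH_DATA_DIR"):
        candidates.append(Path(env["GAPSMITH_DATA_DIR"]))
    if env.get("XDG_DATA_HOME"):
        candidates.append(Path(env["XDG_DATA_HOME"]) / "gapsmith")
    candidates.append(exe_dir.parent / "dat")  # <exe-dir>/../dat
    candidates.append(cwd / "dat")             # ./dat
    return candidates
```

A caller would typically pick the first candidate that exists on disk; whether gapsmith checks for existence at each step is not stated here.
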
gapsmith doall

End-to-end pipeline: find → find-transport → draft → medium → fill.

| Flag | Default | Description |
| --- | --- | --- |
| <GENOME> | | Required. Protein FASTA (.faa / .faa.gz). |
| -A, --aligner | blastp | blastp, diamond, mmseqs2, or precomputed. |
| -b, --bitcutoff | 200.0 | Bitscore cutoff for the find stages. |
| -c, --coverage | 75 | Coverage cutoff (0–100 %). |
| -l, --min-bs-core | 50.0 | Core-reaction bitscore cutoff. |
| -t, --taxonomy | auto | Bacteria, Archaea, or auto (currently maps to Bacteria). |
| -m, --medium | auto | Medium CSV, or auto to infer via gapsmith medium. |
| -f, --out-dir | . | Output directory. |
| -K, --threads | all cores | Thread count. |
| --full-suite | off | Include fill Steps 3 + 4 (slow). |
| -k, --min-growth | 0.01 | Minimum growth for gap-filling. |
| --gspa-run <DIR> | | Read precomputed cluster-rep hits from a gspa manifest instead of running the aligner. |
| --gspa-genome-id <ID> | FASTA stem | Row id in the manifest's genomes.tsv matching this run. |

Gzipped inputs are auto-decompressed into a tempfile::tempdir.

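The decompression step can be pictured with the following Python sketch, which uses tempfile.TemporaryDirectory as a rough analogue of tempfile::tempdir. The helper name and the choice to keep the original file name minus .gz are assumptions for illustration, not gapsmith's actual behaviour.

```python
import gzip
import shutil
import tempfile
from pathlib import Path


def materialise_fasta(genome: Path, workdir: Path) -> Path:
    """Return an uncompressed FASTA path, decompressing into workdir if needed."""
    if genome.suffix != ".gz":
        return genome  # plain .faa input is used as-is
    target = workdir / genome.with_suffix("").name  # drop the trailing .gz
    with gzip.open(genome, "rb") as src, open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)  # stream, so large genomes stay off the heap
    return target


# Usage: the temporary directory (and the decompressed copy) vanish on exit.
with tempfile.TemporaryDirectory() as tmp:
    fasta = materialise_fasta(Path("genome.faa"), Path(tmp))  # returned unchanged
```
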

gapsmith find

Pathway and reaction detection.

| Flag | Default | Description |
| --- | --- | --- |
| <GENOME> | | Protein FASTA. |
| -p, --pathways | all | Pathway keyword or id (or all). |
| -l, --pathway-db | metacyc,custom | Comma-separated: metacyc, kegg, seed, custom, all. |
| -t, --taxonomy | Bacteria | Reference subfolder under seq-dir. |
| -A, --aligner | diamond | Aligner backend. |
| -P, --precomputed | | Precomputed alignment TSV (skip aligner). |
| --gspa-run <DIR> | | gspa manifest; expands rep-level hits onto this genome's cluster members. |
| --gspa-genome-id <ID> | FASTA stem | Genome id in the manifest matching this invocation. |
| --gspa-coverage-fraction | off | Treat the alignment TSV's qcov as 0–1 (mmseqs2 native). |
| -b, --bitcutoff | 200.0 | Bitscore cutoff. |
| -i, --identcutoff | 0.0 | Identity cutoff (%). |
| -c, --coverage | 75 | Coverage cutoff. |
| -a, --completeness-main | 80.0 | Pathway completeness cutoff. |
| -k, --completeness-hints | 66.0 | Relaxed cutoff when every key reaction is present. |
| -s, --strict | off | Disable key-reaction heuristic. |
| -n, --include-superpathways | off | Default is to filter them out (matches upstream). |
| -o, --out-dir | . | Output directory. |
| -u, --suffix | | Filename suffix <stem>-<suffix>-{Reactions,Pathways}.tbl. |

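The way -a, -k, and -s interact can be read as a two-tier decision. The sketch below is one plausible reading, assuming inclusive (>=) comparisons and completeness expressed in percent. The function name is hypothetical and the real implementation may differ in edge cases, such as pathways with no key reactions defined.

```python
def pathway_is_predicted(
    completeness: float,
    all_key_reactions_found: bool,
    main_cutoff: float = 80.0,   # -a, --completeness-main
    hints_cutoff: float = 66.0,  # -k, --completeness-hints
    strict: bool = False,        # -s, --strict
) -> bool:
    """Decide whether a pathway counts as present (assumed logic)."""
    if completeness >= main_cutoff:
        return True
    if strict:
        # --strict disables the relaxed key-reaction route entirely.
        return False
    return all_key_reactions_found and completeness >= hints_cutoff
```

Under these assumptions a pathway at 70 % completeness is accepted only when every key reaction has a hit and --strict is off.
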
gapsmith find-transport

Transporter detection.

| Flag | Default | Description |
| --- | --- | --- |
| <GENOME> | | Protein FASTA. |
| -A, --aligner | blastp | Aligner backend. |
| -P, --precomputed | | Precomputed TSV. |
| -b, --bitcutoff | 50.0 | Transport hits use a lower threshold than biosynthetic. |
| -i, --identcutoff | 0.0 | Identity cutoff. |
| -c, --coverage | 75 | Coverage cutoff. |
| -a, --nouse-alternatives | off | Disable the alt-transporter fallback. |
| -m, --only-met | | Restrict to one substrate keyword. |
| -o, --out-dir | . | Output directory. |

gapsmith draft

Build a draft metabolic model from find + find-transport output.

| Flag | Default | Description |
| --- | --- | --- |
| -r, --reactions | required | *-Reactions.tbl. |
| -t, --transporter | required | *-Transporter.tbl. |
| -b, --biomass | auto | auto / pos / neg / archaea / <path.json>. |
| -n, --name | inferred | Model id. |
| -u, --high-evi-rxn-bs | 200.0 | Bitscore threshold for core reactions. |
| -l, --min-bs-for-core | 50.0 | Lower bound for candidate pool. |
| -o, --out-dir | . | Output directory. |
| --no-sbml | off | Skip SBML emission. |

gapsmith medium

Rule-based medium inference.

| Flag | Default | Description |
| --- | --- | --- |
| -m, --model | required | Draft CBOR / JSON. |
| -p, --pathways | required | *-Pathways.tbl from find. |
| -c, --manual-flux | | cpdXXXXX:val;cpdYYYYY:val overrides. |
| -o, --output | inferred | Output CSV path. |
| -f, --out-dir | . | Output directory (when -o is absent). |

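The --manual-flux value is a semicolon-separated list of compound:value pairs. A minimal parser for that shape might look like the sketch below. The function name, the tolerance for a trailing semicolon, and the compound ids in the usage line are illustrative assumptions.

```python
from typing import Dict


def parse_manual_flux(spec: str) -> Dict[str, float]:
    """Parse 'cpdXXXXX:val;cpdYYYYY:val' into {compound id: flux}."""
    overrides: Dict[str, float] = {}
    for item in spec.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing ';' (assumption)
        compound, sep, value = item.partition(":")
        if not sep or not compound:
            raise ValueError(f"expected cpdXXXXX:val, got {item!r}")
        overrides[compound] = float(value)  # raises ValueError on a bad number
    return overrides


overrides = parse_manual_flux("cpd00027:5;cpd00007:0.5")
```
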
gapsmith fill

Iterative gap-filling.

| Flag | Default | Description |
| --- | --- | --- |
| <MODEL> | required | Draft CBOR / JSON. |
| -n, --media | required | Medium CSV. |
| -r, --reactions | | *-Reactions.tbl for bitscore-weighted pFBA. |
| -t, --target | cpd11416 | Biomass pseudo-metabolite id. |
| -k, --min-growth | 0.01 | Minimum growth rate floor. |
| -b, --bcore | 50.0 | Minimum bitscore for "core" classification. |
| --high-evi | 200.0 | Bitscore-to-weight upper calibration point. |
| --dummy-weight | 100.0 | Weight for reactions with no hit. |
| -o, --out-dir | . | Output directory. |
| --no-sbml | off | Skip SBML. |
| --step1-only | off | Run only Step 1 (debugging). |
| --full-suite | off | Include fill Steps 3 + 4. |
| --prune-futile | off | Thermodynamic futile-cycle prune (opt-in; slow). |

gapsmith fba

FBA / pFBA on an existing model.

| Flag | Default | Description |
| --- | --- | --- |
| <MODEL> | required | CBOR / JSON model. |
| -r, --objective | model's own | Objective reaction id. |
| --pfba | off | Parsimonious FBA. |
| --pfba-coef | 0.001 | pFBA biomass trade-off coefficient. |
| --min-growth | 0.0 | Biomass floor (pFBA only). |
| --top | 20 | Print the top-N highest-absolute-flux reactions. |
| --minimise | off | Minimise instead of maximise. |

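The --top report ranks reactions by the magnitude of their flux, so strongly negative fluxes rank alongside strongly positive ones. A small sketch of that ranking follows, assuming the solution is available as a mapping from reaction id to flux. The function name and the toy reaction ids are hypothetical.

```python
from typing import Dict, List, Tuple


def top_fluxes(fluxes: Dict[str, float], n: int = 20) -> List[Tuple[str, float]]:
    """Return the n reactions with the largest absolute flux, largest first."""
    ranked = sorted(fluxes.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:n]


solution = {"rxnA": -10.0, "rxnB": 2.5, "rxnC": 7.0}
for rxn, flux in top_fluxes(solution, n=2):
    print(f"{rxn}\t{flux}")
```
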
gapsmith adapt

Edit reactions or force growth on a compound.

| Flag | Default | Description |
| --- | --- | --- |
| -m, --model | required | Model file. |
| -a, --add | | Comma-separated rxnNNNNN or MetaCyc pathway ids. |
| -r, --remove | | Same format; reactions are removed. |
| -w, --growth | | cpdNNNNN:TRUE / cpdNNNNN:FALSE. |
| -b, --reactions | | *-Reactions.tbl (needed for -w ...:TRUE gap-filling). |
| -k, --min-growth | 0.01 | Min growth floor for the -w ...:TRUE path. |
| -f, --out-dir | . | Output directory. |
| --no-sbml | off | Skip SBML. |

gapsmith pan

Build a pan-draft from N drafts.

| Flag | Default | Description |
| --- | --- | --- |
| -m, --models | required | Comma-separated / directory / single path. |
| -t, --min-freq | 0.06 | Minimum across-draft reaction frequency. |
| -b, --only-binary | off | Skip the pan-draft, emit only the rxn × model presence TSV. |
| -f, --out-dir | . | Output directory. |
| --no-sbml | off | Skip SBML. |

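The frequency filter and the presence table can be illustrated as follows. This sketch assumes frequency means the fraction of drafts containing a reaction and that the cutoff is inclusive. The function names and the TSV layout are illustrative guesses, not the actual output format.

```python
from collections import Counter
from typing import Dict, List, Set


def pan_reactions(drafts: Dict[str, Set[str]], min_freq: float = 0.06) -> List[str]:
    """Reactions present in at least min_freq of the drafts (inclusive, assumed)."""
    if not drafts:
        return []
    counts = Counter(rxn for rxns in drafts.values() for rxn in rxns)
    n = len(drafts)
    return sorted(rxn for rxn, c in counts.items() if c / n >= min_freq)


def presence_table(drafts: Dict[str, Set[str]]) -> List[str]:
    """Rows of a rxn x model 0/1 table, tab-separated (hypothetical layout)."""
    models = sorted(drafts)
    all_rxns = sorted(set().union(*drafts.values()))
    rows = ["\t".join(["rxn"] + models)]
    for rxn in all_rxns:
        flags = ["1" if rxn in drafts[m] else "0" for m in models]
        rows.append("\t".join([rxn] + flags))
    return rows
```

With the default of 0.06, a reaction found in a single draft survives as long as there are 16 or fewer drafts, since 1/16 = 0.0625.
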
gapsmith update-sequences

Sync the sequence database (seqdb) from Zenodo.

| Flag | Default | Description |
| --- | --- | --- |
| -t, --taxonomy | Bacteria | Which seqdb to sync. |
| -D, --seq-dir | <data-dir>/seq | Seqdb location. |
| -Z, --record | pinned | Zenodo record id, or latest. |
| -c, --check-only | off | Report version, no download. |
| -q, --quiet | off | Suppress progress messages. |

gapsmith convert

CBOR ↔ JSON round-trip.

| Flag | Default | Description |
| --- | --- | --- |
| <INPUT> | required | Path to source file. |
| <OUTPUT> | required | Destination path. |
| --to | from extension | cbor / json (overrides extension). |
| --pretty | off | Pretty-print JSON. |

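How the target format is chosen can be sketched in a few lines: an explicit --to wins, otherwise the extension of the output path decides. The function name and the case-insensitive matching are assumptions made for this illustration.

```python
from pathlib import Path
from typing import Optional


def output_format(output: str, to: Optional[str] = None) -> str:
    """Pick 'cbor' or 'json': --to if given, else the output file extension."""
    if to is not None:
        fmt = to.lower()
    else:
        fmt = Path(output).suffix.lstrip(".").lower()
    if fmt not in ("cbor", "json"):
        raise ValueError(f"cannot infer format for {output!r}; pass --to cbor|json")
    return fmt
```
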
gapsmith export-sbml

Serialise a CBOR / JSON model as SBML L3V1 + FBC2 + groups.

| Flag | Default | Description |
| --- | --- | --- |
| <INPUT> | required | CBOR / JSON model. |
| <OUTPUT> | required | Destination .xml path. |
| --compact | off | Omit pretty-print / nested indentation. |

gapsmith align

Run a single aligner standalone. Useful for debugging reference-FASTA issues or building a precomputed TSV to feed into find --aligner precomputed.

| Flag | Default | Description |
| --- | --- | --- |
| -A, --aligner | required | blastp / tblastn / diamond / mmseqs2 / precomputed. |
| -q, --query | required | Query FASTA. |
| -t, --target | required | Target FASTA. |
| -P, --precomputed | | When -A precomputed, path to the TSV. |
| -b, --bitcutoff | 0.0 | Filter hits below this bitscore. |
| -c, --coverage | 0 | Filter below this coverage %. |
| -e, --evalue | 1e-5 | E-value cutoff. |
| --extra | | Passthrough flags to the aligner binary. |
| -o, --out | stdout | Output TSV path. |

gapsmith batch-align

Clusters N genomes, runs a single alignment against the cluster representatives, then expands the hits into per-genome TSVs.

| Flag | Default | Description |
| --- | --- | --- |
| -q, --query | required | Reference query FASTA. |
| -g, --genomes | required | Directory containing .faa(.gz) files. |
| -o, --out-dir | required | Output <genome>.tsv directory. |
| -A, --aligner | diamond | Aligner backend (blastp / diamond / mmseqs2). |
| --cluster-identity | 0.5 | mmseqs2 cluster identity. |
| --cluster-coverage | 0.8 | mmseqs2 cluster coverage. |
| -b, --bitcutoff | 0.0 | Per-hit bitscore filter. |
| -c, --coverage | 0 | Per-hit coverage filter. |

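The expansion step can be pictured as copying each representative's hits onto every member of its cluster, grouped by genome. The sketch below assumes members simply inherit the representative's bitscore and that the cluster map lists the representative among its own members. All names and the data layout are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Hit = Tuple[str, str, float]   # (query id, target protein id, bitscore)
Member = Tuple[str, str]       # (genome id, protein id)


def expand_hits(
    rep_hits: List[Hit],
    clusters: Dict[str, List[Member]],
) -> Dict[str, List[Hit]]:
    """Fan representative-level hits out to per-genome hit lists."""
    per_genome: Dict[str, List[Hit]] = defaultdict(list)
    for query, rep, bitscore in rep_hits:
        for genome, protein in clusters.get(rep, []):
            # Each member inherits the representative's score (assumption).
            per_genome[genome].append((query, protein, bitscore))
    return dict(per_genome)
```

Because the aligner only sees one sequence per cluster, the alignment cost scales with the number of clusters rather than the total number of proteins across genomes.
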
gapsmith doall-batch

Run doall across many genomes in parallel. See multi-genome.md for the full recipe.

| Flag | Default | Description |
| --- | --- | --- |
| -g, --genomes-dir <DIR> | | Directory of protein FASTAs. |
| --genomes-list <TSV> | | Explicit TSV list (id<TAB>path[<TAB>abundance]). |
| --gspa-run <DIR> | | Pulls the genome list from a gspa manifest. |
| -f, --out-dir <DIR> | required | One <genome_id>/ subdir is written per input. |
| -j, --jobs <N> | all cores | Rayon pool size. |
| --shard <i/N> | | Select genomes where index mod N == i. |
| --continue-on-error | off | Log failed genomes to doall-batch-errors.tsv instead of aborting. |

Plus every passthrough flag from doall.

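The --shard rule (index mod N == i) splits a genome list into N disjoint slices, which is handy for spreading a batch over several machines. The sketch below assumes zero-based indices taken in input order. Both are assumptions, and the function names are hypothetical.

```python
from typing import List, Tuple


def parse_shard(spec: str) -> Tuple[int, int]:
    """Parse 'i/N' into (i, N), requiring 0 <= i < N."""
    index, sep, total = spec.partition("/")
    if not sep:
        raise ValueError(f"expected i/N, got {spec!r}")
    i, n = int(index), int(total)
    if n <= 0 or not 0 <= i < n:
        raise ValueError(f"shard index out of range: {spec!r}")
    return i, n


def select_shard(genomes: List[str], spec: str) -> List[str]:
    """Keep genomes whose zero-based position satisfies index mod N == i."""
    i, n = parse_shard(spec)
    return [g for idx, g in enumerate(genomes) if idx % n == i]
```

Running shards 0/3, 1/3, and 2/3 on the same list covers every genome exactly once under these assumptions.
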
gapsmith community

Community-level optimisation. Two subcommands.

gapsmith community per-mag

Per-MAG FBA under a shared (union) medium. Linear in N.

| Flag | Default | Description |
| --- | --- | --- |
| --drafts-dir <DIR> | | Directory of <id>-{draft,filled}.gmod.cbor. |
| --drafts-list <TSV> | | <id><TAB><cbor>[<TAB><medium>]. |
| --gspa-run <DIR> | | Use manifest's genome list. |
| --drafts-root <DIR> | . | Where per-genome doall outputs live (paired with --gspa-run). |
| -m, --medium <CSV> | auto | Override the inferred shared medium. |
| -o, --out-dir <DIR> | . | Writes community-medium.csv, per-mag-growth.tsv. |

gapsmith community cfba

Compose N drafts into one community model; solve weighted-sum biomass.

| Flag | Default | Description |
| --- | --- | --- |
| --drafts-dir <DIR> / --drafts-list <TSV> / --gspa-run <DIR> | | Exactly one (same semantics as per-mag). |
| --drafts-root <DIR> | . | Where doall outputs live. |
| --abundance <TSV> | | <id><TAB><abundance> overrides. |
| --biomass-rxn <ID> | bio1 | Per-organism biomass reaction id. |
| --balanced-growth | off | Add v(bio_k) == v(bio_community) constraints. |
| -m, --medium <CSV> | | Optional shared medium CSV. |
| -o, --out-dir <DIR> | . | Writes community.gmod.cbor, community-fba.tsv. |

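To make the weighted-sum objective and the --balanced-growth constraint concrete, the sketch below evaluates both on already-solved biomass fluxes. It assumes abundances are normalised to weights that sum to 1, which may not match how gapsmith scales them. The function names are hypothetical and no LP is solved here.

```python
from typing import Dict


def community_objective(
    biomass_flux: Dict[str, float],
    abundance: Dict[str, float],
) -> float:
    """Weighted sum of per-organism biomass fluxes, weights = normalised abundance."""
    total = sum(abundance[k] for k in biomass_flux)
    if total <= 0:
        raise ValueError("abundances must sum to a positive value")
    return sum(abundance[k] / total * v for k, v in biomass_flux.items())


def is_balanced(
    biomass_flux: Dict[str, float],
    community_flux: float,
    tol: float = 1e-9,
) -> bool:
    """Check v(bio_k) == v(bio_community) for every organism k, within tol."""
    return all(abs(v - community_flux) <= tol for v in biomass_flux.values())
```

Without the balanced-growth constraint, a weighted-sum optimum can let one organism grow fast while others barely grow; the equality constraint forces a single shared growth rate.
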
gapsmith db inspect

Smoke-test the --data-dir. Loads every reference table and prints row counts. No flags beyond --data-dir.


gapsmith test

Print the resolved paths and report which external tool binaries are on PATH.


gapsmith example-model

Emit a small hand-built toy model (3 metabolites / 2 reactions) as CBOR. Useful for smoke-testing format round-trips and downstream tools.

| Flag | Default | Description |
| --- | --- | --- |
| <OUTPUT> | required | Destination CBOR path. |
| --complex | off | Emit a slightly bigger "complex" variant. |