Formats

quality-control¶

`DecontamScoreFormat`¶

type: text file

`DecontamScoreDirFmt`¶

type: directory

files:

name	path	format	required
file	`stats.tsv`	`DecontamScoreFormat`	True

fragment-insertion¶

`PlacementsFormat`¶

type: text file

`PlacementsDirFmt`¶

plugin: fragment-insertion

type: directory

files:

name	path	format	required
file	`placements.json`	`PlacementsFormat`	True

`RAxMLinfoFormat`¶

type: text file

`SeppReferenceDirFmt`¶

plugin: fragment-insertion

type: directory

files:

name	path	format	required
alignment	`aligned-dna-sequences.fasta`	`AlignedDNAFASTAFormat`	True
phylogeny	`tree.nwk`	`NewickFormat`	True
raxml_info	`raxml-info.txt`	`RAxMLinfoFormat`	True

metadata¶

`ArtificialGroupingFormat`¶

type: text file

`ArtificialGroupingDirectoryFormat`¶

plugin: metadata

type: directory

files:

name	path	format	required
file	`artificial-groupings.tsv`	`ArtificialGroupingFormat`	True

fondue¶

`SRAMetadataFormat`¶

type: text file

`SRAMetadataDirFmt`¶

plugin: fondue

type: directory

files:

name	path	format	required
file	`sra-metadata.tsv`	`SRAMetadataFormat`	True

`SRAFailedIDsFormat`¶

This is a "fake" format used only to store a list of failed SRA IDs, which can be converted to QIIME 2 metadata and passed as input to any fondue action.

type: text file

`SRAFailedIDsDirFmt`¶

plugin: fondue

type: directory

files:

name	path	format	required
file	`sra-failed-ids.tsv`	`SRAFailedIDsFormat`	True

`NCBIAccessionIDsFormat`¶

This format stores a list of SRA accession IDs (run, study, BioProject, sample, and experiment IDs), which can be converted to QIIME 2 metadata. Artifacts containing run, study, and BioProject IDs can be input into any fondue action.

type: text file

`NCBIAccessionIDsDirFmt`¶

plugin: fondue

type: directory

files:

name	path	format	required
file	`ncbi-accession-ids.tsv`	`NCBIAccessionIDsFormat`	True

sample-classifier¶

`SampleEstimatorDirFmt`¶

plugin: sample-classifier

type: directory

files:

name	path	format	required
version_info	`sklearn_version.json`	`JSONFormat`	True
sklearn_pipeline	`sklearn_pipeline.tar`	`PickleFormat`	True

`BooleanSeriesFormat`¶

type: text file

`BooleanSeriesDirectoryFormat`¶

plugin: sample-classifier

type: directory

files:

name	path	format	required
file	`outliers.tsv`	`BooleanSeriesFormat`	True

`ImportanceFormat`¶

type: text file

`ImportanceDirectoryFormat`¶

plugin: sample-classifier

type: directory

files:

name	path	format	required
file	`importance.tsv`	`ImportanceFormat`	True

`PredictionsFormat`¶

type: text file

`PredictionsDirectoryFormat`¶

plugin: sample-classifier

type: directory

files:

name	path	format	required
file	`predictions.tsv`	`PredictionsFormat`	True

`ProbabilitiesFormat`¶

type: text file

`ProbabilitiesDirectoryFormat`¶

plugin: sample-classifier

type: directory

files:

name	path	format	required
file	`class_probabilities.tsv`	`ProbabilitiesFormat`	True

`TrueTargetsDirectoryFormat`¶

plugin: sample-classifier

type: directory

files:

name	path	format	required
file	`true_targets.tsv`	`PredictionsFormat`	True

rescript¶

`SILVATaxonomyFormat`¶

type: text file

`SILVATaxonomyDirectoryFormat`¶

plugin: rescript

type: directory

files:

name	path	format	required
file	`silva_taxonomy.tsv`	`SILVATaxonomyFormat`	True

`SILVATaxidMapFormat`¶

type: text file

`SILVATaxidMapDirectoryFormat`¶

plugin: rescript

type: directory

files:

name	path	format	required
file	`silva_taxmap.tsv`	`SILVATaxidMapFormat`	True

composition¶

`FrictionlessCSVFileFormat`¶

Format for frictionless CSV.

`DataPackageSchemaFileFormat`¶

Format for the associated metadata for each file in the DataLoaf.

`DataLoafPackageDirFmt`¶

plugin: composition

type: directory

files:

name	path	format	required
data_slices	`.+\.csv`	`FrictionlessCSVFileFormat`	True
nutrition_facts	`datapackage.json`	`DataPackageSchemaFileFormat`	True

`ANCOMBC2OutputDirFmt`¶

Stores the model statistics and optionally the structural zeros table output by the ANCOMBC2 method.

The slices are:

- lfc: log-fold change

- se: standard error

- W: lfc / se (the test statistic)

- p: p-value

- q: adjusted p-value

- diff: differentially abundant boolean (i.e. q < alpha)

- diff_robust: robust differentially abundant boolean (q < alpha AND passed_ss)

- passed_ss: whether sensitivity analysis was passed

plugin: composition

type: directory

files:

name	path	format	required
lfc	`lfc.jsonl`	`TableJSONLFileFormat`	True
se	`se.jsonl`	`TableJSONLFileFormat`	True
W	`W.jsonl`	`TableJSONLFileFormat`	True
p	`p.jsonl`	`TableJSONLFileFormat`	True
q	`q.jsonl`	`TableJSONLFileFormat`	True
diff	`diff.jsonl`	`TableJSONLFileFormat`	True
passed_ss	`passed_ss.jsonl`	`TableJSONLFileFormat`	True
diff_robust	`diff_robust.jsonl`	`TableJSONLFileFormat`	False
structural_zeros	`structural-zeros.jsonl`	`TableJSONLFileFormat`	False
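
The relationships among the slices can be sketched with a few lines of arithmetic. This is only an illustration of the definitions above, not the plugin's implementation; the feature names, the numbers, and the `alpha` value are made up.

```python
# Hypothetical per-feature values; not real ANCOM-BC2 output.
alpha = 0.05  # assumed significance threshold

features = {
    "feature-a": {"lfc": 1.2, "se": 0.3, "q": 0.01, "passed_ss": True},
    "feature-b": {"lfc": -0.4, "se": 0.5, "q": 0.60, "passed_ss": True},
    "feature-c": {"lfc": 2.0, "se": 0.4, "q": 0.02, "passed_ss": False},
}


def derive_slices(row, alpha):
    """Compute W, diff, and diff_robust from the other slices."""
    w = row["lfc"] / row["se"]               # W: the test statistic
    diff = row["q"] < alpha                  # diff: q < alpha
    diff_robust = diff and row["passed_ss"]  # must also pass sensitivity analysis
    return {"W": w, "diff": diff, "diff_robust": diff_robust}


derived = {name: derive_slices(row, alpha) for name, row in features.items()}
for name, values in derived.items():
    print(name, values)
```

In this made-up data, `feature-c` is differentially abundant by q-value alone but is not robustly so, because it did not pass the sensitivity analysis.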

deblur¶

`DeblurStatsFmt`¶

type: text file

`DeblurStatsDirFmt`¶

plugin: deblur

type: directory

files:

name	path	format	required
file	`stats.csv`	`DeblurStatsFmt`	True

dada2¶

`DADA2StatsFormat`¶

type: text file

`DADA2StatsDirFmt`¶

plugin: dada2

type: directory

files:

name	path	format	required
file	`stats.tsv`	`DADA2StatsFormat`	True

`DADA2BaseTransitionStatsFormat`¶

type: text file

`DADA2BaseTransitionStatsDirFmt`¶

plugin: dada2

type: directory

files:

name	path	format	required
file	`Errorstats.tsv`	`DADA2BaseTransitionStatsFormat`	True

vsearch¶

`UchimeStatsFmt`¶

type: text file

`UchimeStatsDirFmt`¶

plugin: vsearch

type: directory

files:

name	path	format	required
file	`stats.tsv`	`UchimeStatsFmt`	True

quality-filter¶

`QualityFilterStatsFmt`¶

type: text file

`QualityFilterStatsDirFmt`¶

plugin: quality-filter

type: directory

files:

name	path	format	required
file	`stats.csv`	`QualityFilterStatsFmt`	True

feature-classifier¶

`BLASTDBDirFmtV5`¶

plugin: feature-classifier

type: directory

files:

name	path	format	required
idx1	`.+\.ndb`	`BLASTDBFileFmtV5`	True
idx2	`.+\.nhr`	`BLASTDBFileFmtV5`	True
idx3	`.+\.nin`	`BLASTDBFileFmtV5`	True
idx4	`.+\.not`	`BLASTDBFileFmtV5`	True
idx5	`.+\.nsq`	`BLASTDBFileFmtV5`	True
idx6	`.+\.ntf`	`BLASTDBFileFmtV5`	True
idx7	`.+\.nto`	`BLASTDBFileFmtV5`	True
idx8	`.+\.njs`	`BLASTDBFileFmtV5`	True

`TaxonomicClassifierDirFmt`¶

plugin: feature-classifier

type: directory

files:

name	path	format	required
preprocess_params	`preprocess_params.json`	`JSONFormat`	True
sklearn_pipeline	`sklearn_pipeline.tar`	`PickleFormat`	True

`TaxonomicClassiferTemporaryPickleDirFmt`¶

plugin: feature-classifier

type: directory

files:

name	path	format	required
version_info	`sklearn_version.json`	`JSONFormat`	True
sklearn_pipeline	`sklearn_pipeline.tar`	`PickleFormat`	True

longitudinal¶

`FirstDifferencesFormat`¶

type: text file

`FirstDifferencesDirectoryFormat`¶

plugin: longitudinal

type: directory

files:

name	path	format	required
file	`FirstDifferences.tsv`	`FirstDifferencesFormat`	True

types¶

`Bowtie2IndexDirFmt`¶

plugin: types

type: directory

files:

name	path	format	required
idx1	`.+(?<!\.rev)\.1\.bt2l?`	`Bowtie2IndexFileFormat`	True
idx2	`.+(?<!\.rev)\.2\.bt2l?`	`Bowtie2IndexFileFormat`	True
ref3	`.+\.3\.bt2l?`	`Bowtie2IndexFileFormat`	True
ref4	`.+\.4\.bt2l?`	`Bowtie2IndexFileFormat`	True
rev1	`.+\.rev\.1\.bt2l?`	`Bowtie2IndexFileFormat`	True
rev2	`.+\.rev\.2\.bt2l?`	`Bowtie2IndexFileFormat`	True
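
The negative lookbehind `(?<!\.rev)` in the `idx1` and `idx2` patterns keeps the reverse index files from also being counted as forward index files. The sketch below checks a few hypothetical file names against the patterns from the table. It assumes each pattern must match the whole file name, which may differ from how the framework applies them.

```python
import re

# Path patterns copied from the table above.
PATTERNS = {
    "idx1": r".+(?<!\.rev)\.1\.bt2l?",
    "idx2": r".+(?<!\.rev)\.2\.bt2l?",
    "ref3": r".+\.3\.bt2l?",
    "ref4": r".+\.4\.bt2l?",
    "rev1": r".+\.rev\.1\.bt2l?",
    "rev2": r".+\.rev\.2\.bt2l?",
}


def classify(filename):
    """Return the names of all table rows whose pattern matches the whole file name."""
    return [name for name, pattern in PATTERNS.items()
            if re.fullmatch(pattern, filename)]


# Hypothetical index file names.
for filename in ["genome.1.bt2", "genome.rev.1.bt2", "genome.3.bt2l"]:
    print(filename, classify(filename))
```

The optional trailing `l` in each pattern also admits the `.bt2l` extension.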

`LSMatFormat`¶

type: text file

`DistanceMatrixDirectoryFormat`¶

plugin: types

type: directory

files:

name	path	format	required
file	`distance-matrix.tsv`	`LSMatFormat`	True

`TSVTaxonomyFormat`¶

Format for a 2+ column TSV file with an expected minimal header.

The only header recognized by this format is:

Feature ID<tab>Taxon

Optionally followed by other arbitrary columns.

This format supports blank lines. The expected header must be the first non-blank line. In addition to the header, there must be at least one line of data.

type: text file
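
The header and data rules above can be sketched as a small check. This is a minimal illustration, not the validator that ships with the plugin; the function name and the example rows are hypothetical, and lines containing only whitespace are assumed to count as blank.

```python
def validate_tsv_taxonomy(text):
    """Check the header and data rules; raise ValueError on failure."""
    # Blank lines are allowed, so drop them before checking anything else.
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("file is empty")
    header = lines[0].split("\t")
    if header[:2] != ["Feature ID", "Taxon"]:
        raise ValueError("first non-blank line must start with 'Feature ID<tab>Taxon'")
    if len(lines) < 2:
        raise ValueError("at least one line of data is required")
    return len(lines) - 1  # number of data lines


example = "\nFeature ID\tTaxon\tConfidence\n\nseq1\tk__Bacteria; p__Firmicutes\t0.98\n"
print(validate_tsv_taxonomy(example))
```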

`TSVTaxonomyDirectoryFormat`¶

plugin: types

type: directory

files:

name	path	format	required
file	`taxonomy.tsv`	`TSVTaxonomyFormat`	True

`HeaderlessTSVTaxonomyFormat`¶

Format for a 2+ column TSV file without a header.

This format supports comment lines starting with #, and blank lines.

type: text file
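
A reader for this layout might look like the following sketch. The function name and sample rows are hypothetical, and the real format may treat edge cases (for example, indented `#` characters) differently.

```python
def read_headerless_taxonomy(text):
    """Return (feature_id, taxon) pairs, skipping blank lines and '#' comments."""
    records = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            raise ValueError(f"expected at least two columns: {line!r}")
        records.append((fields[0], fields[1]))
    return records


sample = "# exported taxonomy\nseq1\tk__Bacteria\n\nseq2\tk__Archaea\textra\n"
records = read_headerless_taxonomy(sample)
print(records)
```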

`HeaderlessTSVTaxonomyDirectoryFormat`¶

plugin: types

type: directory

files:

name	path	format	required
file	`taxonomy.tsv`	`HeaderlessTSVTaxonomyFormat`	True

`TaxonomyFormat`¶

Legacy format for any 2+ column TSV file, with or without a header.

This format has been superseded by taxonomy file formats explicitly with and without headers, TSVTaxonomyFormat and HeaderlessTSVTaxonomyFormat, respectively.

This format remains in place for backwards-compatibility. Transformers are intentionally not hooked up to transform this format into the canonical .qza format (TSVTaxonomyFormat) to prevent users from importing data in this format. Transformers will remain in place to transform this format into in-memory Python objects (e.g. pd.Series) so that existing .qza files can still be loaded and processed.

The only header recognized by this format is:

Feature ID<tab>Taxon

Optionally followed by other arbitrary columns.

If this header isn't present, the format is assumed to be headerless.

This format supports comment lines starting with #, and blank lines.

type: text file
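
The header-or-headerless decision described above could be sketched as follows. This is an assumption-laden illustration rather than the format's actual logic: in particular, it assumes that comment and blank lines may precede the header, which the description does not state.

```python
def has_taxonomy_header(text):
    """Return True if the first non-blank, non-comment line is the recognized header."""
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        return line.split("\t")[:2] == ["Feature ID", "Taxon"]
    return False  # no content lines at all


with_header = "Feature ID\tTaxon\nseq1\tk__Bacteria\n"
without_header = "# legacy export\nseq1\tk__Bacteria\n"
print(has_taxonomy_header(with_header), has_taxonomy_header(without_header))
```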

`TaxonomyDirectoryFormat`¶

plugin: types

type: directory

files:

name	path	format	required
file	`taxonomy.tsv`	`TaxonomyFormat`	True

`FASTAFormat`¶

type: text file

`DNAFASTAFormat`¶

type: text file

`DNASequencesDirectoryFormat`¶

plugin: types

type: directory

files:

name	path	format	required
file	`dna-sequences.fasta`	`DNAFASTAFormat`	True

`PairedDNASequencesDirectoryFormat`¶

plugin: types

type: directory

files:

name	path	format	required
left_dna_sequences	`left-dna-sequences.fasta`	`DNAFASTAFormat`	True
right_dna_sequences	`right-dna-sequences.fasta`	`DNAFASTAFormat`	True

`AlignedDNAFASTAFormat`¶

type: text file

`AlignedDNASequencesDirectoryFormat`¶

plugin: types

type: directory

files:

name	path	format	required
file	`aligned-dna-sequences.fasta`	`AlignedDNAFASTAFormat`	True

`DifferentialFormat`¶

type: text file

`DifferentialDirectoryFormat`¶

plugin: types

type: directory

files:

name	path	format	required
file	`differentials.tsv`	`DifferentialFormat`	True

`ProteinFASTAFormat`¶

type: text file

`AlignedProteinFASTAFormat`¶

type: text file

`MixedCaseProteinFASTAFormat`¶

type: text file

`MixedCaseAlignedProteinFASTAFormat`¶

type: text file

`ProteinSequencesDirectoryFormat`¶

plugin: types

type: directory

files:

name	path	format	required
file	`protein-sequences.fasta`	`ProteinFASTAFormat`	True

`AlignedProteinSequencesDirectoryFormat`¶

plugin: types

type: directory

files:

name	path	format	required
file	`aligned-protein-sequences.fasta`	`AlignedProteinFASTAFormat`	True

`MixedCaseProteinSequencesDirectoryFormat`¶

plugin: types

type: directory

files:

name	path	format	required
file	`protein-sequences.fasta`	`MixedCaseProteinFASTAFormat`	True

`MixedCaseAlignedProteinSequencesDirectoryFormat`¶

plugin: types

type: directory

files:

name	path	format	required
file	`aligned-protein-sequences.fasta`	`MixedCaseAlignedProteinFASTAFormat`	True

`RNAFASTAFormat`¶

type: text file

`RNASequencesDirectoryFormat`¶

plugin: types

type: directory

files:

name	path	format	required
file	`rna-sequences.fasta`	`RNAFASTAFormat`	True

`AlignedRNAFASTAFormat`¶

type: text file

`AlignedRNASequencesDirectoryFormat`¶

plugin: types

type: directory

files:

name	path	format	required
file	`aligned-rna-sequences.fasta`	`AlignedRNAFASTAFormat`	True

`PairedRNASequencesDirectoryFormat`¶

plugin: types

type: directory

files:

name	path	format	required
left_rna_sequences	`left-rna-sequences.fasta`	`RNAFASTAFormat`	True
right_rna_sequences	`right-rna-sequences.fasta`	`RNAFASTAFormat`	True

`BLAST6Format`¶

type: text file

`BLAST6DirectoryFormat`¶

plugin: types

type: directory

files:

name	path	format	required
file	`blast6.tsv`	`BLAST6Format`	True

`MixedCaseDNAFASTAFormat`¶

type: text file

`MixedCaseDNASequencesDirectoryFormat`¶

plugin: types

type: directory

files:

name	path	format	required
file	`dna-sequences.fasta`	`MixedCaseDNAFASTAFormat`	True

`MixedCaseRNAFASTAFormat`¶

type: text file

`MixedCaseRNASequencesDirectoryFormat`¶

plugin: types

type: directory

files:

name	path	format	required
file	`rna-sequences.fasta`	`MixedCaseRNAFASTAFormat`	True

`MixedCaseAlignedDNAFASTAFormat`¶

type: text file

`MixedCaseAlignedDNASequencesDirectoryFormat`¶

plugin: types

type: directory

files:

name	path	format	required
file	`aligned-dna-sequences.fasta`	`MixedCaseAlignedDNAFASTAFormat`	True

`MixedCaseAlignedRNAFASTAFormat`¶

type: text file

`MixedCaseAlignedRNASequencesDirectoryFormat`¶

plugin: types

type: directory

files:

name	path	format	required
file	`aligned-rna-sequences.fasta`	`MixedCaseAlignedRNAFASTAFormat`	True

`SequenceCharacteristicsFormat`¶

Format for a TSV file with information about sequences, such as the length of a feature. The first column contains feature identifiers and is followed by other optional columns.

The file cannot be empty and must have at least two columns.

Validation for additional columns can be added with a semantic validator tied to a property. For example, the "validate_seq_char_len" validator for "FeatureData[SequenceCharacteristics % Properties("length")]" adds validation for a numerical column called "length".

type: text file
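
The two levels of validation can be sketched together. The function below is hypothetical and folds the property-specific check into one place for brevity; in the framework the "length" check lives in a separate semantic validator. The `id` column name in the example is also an assumption.

```python
import csv
import io


def check_sequence_characteristics(text):
    """Apply the basic rules, plus a numeric check if a 'length' column exists."""
    rows = [row for row in csv.reader(io.StringIO(text), delimiter="\t") if row]
    if not rows:
        raise ValueError("file cannot be empty")
    header = rows[0]
    if len(header) < 2:
        raise ValueError("at least two columns are required")
    if "length" in header:
        column = header.index("length")
        for row in rows[1:]:
            float(row[column])  # raises ValueError for non-numeric values
    return header


example = "id\tlength\nseq1\t150\nseq2\t142\n"
print(check_sequence_characteristics(example))
```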

`SequenceCharacteristicsDirectoryFormat`¶

plugin: types

type: directory

files:

name	path	format	required
file	`sequence_characteristics.tsv`	`SequenceCharacteristicsFormat`	True

`MAGSequencesDirFmt`¶

plugin: types

type: directory

files:

name	path	format	required
sequences	`^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-4[0-9a-fA-F]{3}-[89abAB][0-9a-fA-F]{3}-[0-9a-fA-F]{12}\.(fa\|fasta)$`	`DNAFASTAFormat`	True
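
The path pattern above accepts file names made of a version-4 UUID followed by `.fa` or `.fasta`. The sketch below tests a freshly generated UUID and an arbitrary name against it. The pattern is copied from the table with the table-escaped `\|` written as a plain `|`; the helper name is hypothetical.

```python
import re
import uuid

# Pattern from the table above, with "\|" unescaped to "|".
MAG_FILENAME = re.compile(
    r"^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-4[0-9a-fA-F]{3}-"
    r"[89abAB][0-9a-fA-F]{3}-[0-9a-fA-F]{12}\.(fa|fasta)$"
)


def is_mag_filename(filename):
    """Return True if the file name is a UUID4 plus a .fa or .fasta extension."""
    return MAG_FILENAME.match(filename) is not None


generated = f"{uuid.uuid4()}.fasta"
print(generated, is_mag_filename(generated))
print("bin-001.fasta", is_mag_filename("bin-001.fasta"))
```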

`MAGtoContigsFormat`¶

type: text file

`MAGtoContigsDirFmt`¶

plugin: types

type: directory

files:

name	path	format	required
file	`mag-to-contigs.json`	`MAGtoContigsFormat`	True

`FeatureMapFormat`¶

type: text file

`FeatureMapDirFmt`¶

plugin: types

type: directory

files:

name	path	format	required
file	`feature-map.json`	`FeatureMapFormat`	True

`BIOMV100Format`¶

type: text file

`BIOMV210Format`¶

type: binary file

`BIOMV100DirFmt`¶

plugin: types

type: directory

files:

name	path	format	required
file	`feature-table.biom`	`BIOMV100Format`	True

`BIOMV210DirFmt`¶

plugin: types

type: directory

files:

name	path	format	required
file	`feature-table.biom`	`BIOMV210Format`	True

`GenesDirectoryFormat`¶

plugin: types

type: directory

files:

name	path	format	required
genes	`.+\.(fa\|fna\|fasta)$`	`DNAFASTAFormat`	True

`ProteinsDirectoryFormat`¶

plugin: types

type: directory

files:

name	path	format	required
proteins	`.+\.(fa\|faa\|fasta)$`	`ProteinFASTAFormat`	True

`LociDirectoryFormat`¶

plugin: types

type: directory

files:

name	path	format	required
loci	`.+\.gff$`	`GFF3Format`	True

`GenomeSequencesDirectoryFormat`¶

plugin: types

type: directory

files:

name	path	format	required
genomes	`.+\.(fasta\|fa)$`	`DNAFASTAFormat`	True

`OrthologFileFmt`¶

type: text file

`SeedOrthologDirFmt`¶

plugin: types

type: directory

files:

name	path	format	required
seed_orthologs	`.\..\.seed_orthologs`	`OrthologFileFmt`	True

`OrthologAnnotationDirFmt`¶

plugin: types

type: directory

files:

name	path	format	required
annotations	`.+\.annotations`	`OrthologFileFmt`	True

`GFF3Format`¶

Generic Feature Format Version 3 (GFF3) spec: gff3.md

NCBI modifications to the above: https://www.ncbi.nlm.nih.gov/datasets/docs/reference-docs/file-formats/about-ncbi-gff3/

type: text file

`KaijuDBDirectoryFormat`¶

plugin: types

type: directory

files:

name	path	format	required
nodes	`nodes.dmp`	`NCBITaxonomyNodesFormat`	True
names	`names.dmp`	`NCBITaxonomyNamesFormat`	True
index	`kaiju_db.+\.fmi`	`KaijuIndexFormat`	True

`KaijuIndexFormat`¶

type: binary file

`Kraken2ReportFormat`¶

type: text file

`Kraken2OutputFormat`¶

type: text file

`Kraken2DBFormat`¶

type: text file

`Kraken2DBReportFormat`¶

type: text file

`Kraken2ReportDirectoryFormat`¶

plugin: types

type: directory

files:

name	path	format	required
reports	`.+\.report\.(txt\|tsv)$`	`Kraken2ReportFormat`	True

`Kraken2OutputDirectoryFormat`¶

plugin: types

type: directory

files:

name	path	format	required
outputs	`.+\.output\.(txt\|tsv)$`	`Kraken2OutputFormat`	True

`Kraken2DBDirectoryFormat`¶

plugin: types

type: directory

files:

name	path	format	required
hash	`hash.k2d`	`Kraken2DBFormat`	True
opts	`opts.k2d`	`Kraken2DBFormat`	True
taxo	`taxo.k2d`	`Kraken2DBFormat`	True

`Kraken2DBReportDirectoryFormat`¶

plugin: types

type: directory

files:

name	path	format	required
file	`report.txt`	`Kraken2DBReportFormat`	True

`BrackenDBFormat`¶

type: text file

`BrackenDBDirectoryFormat`¶

plugin: types

type: directory

files:

name	path	format	required
kmers	`database(\d{2,})mers\.kmer_distrib$`	`BrackenDBFormat`	True
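
The pattern above requires at least two digits between `database` and `mers` and captures them as a group. A small sketch, using hypothetical file names and a hypothetical helper, shows which names match and what is captured.

```python
import re

# Pattern copied from the table above.
KMER_DISTRIB = re.compile(r"database(\d{2,})mers\.kmer_distrib$")


def captured_number(filename):
    """Return the captured digits as an int, or None if the name does not match."""
    match = KMER_DISTRIB.match(filename)
    return int(match.group(1)) if match else None


for filename in ["database100mers.kmer_distrib", "database5mers.kmer_distrib"]:
    print(filename, captured_number(filename))
```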

`ImmutableMetadataFormat`¶

type: text file

`ImmutableMetadataDirectoryFormat`¶

plugin: types

type: directory

files:

name	path	format	required
file	`metadata.tsv`	`ImmutableMetadataFormat`	True

`MultiplexedSingleEndBarcodeInSequenceDirFmt`¶

plugin: types

type: directory

files:

name	path	format	required
file	`forward.fastq.gz`	`FastqGzFormat`	True

`MultiplexedPairedEndBarcodeInSequenceDirFmt`¶

plugin: types

type: directory

files:

name	path	format	required
forward_sequences	`forward.fastq.gz`	`FastqGzFormat`	True
reverse_sequences	`reverse.fastq.gz`	`FastqGzFormat`	True

`MultiplexedFastaQualDirFmt`¶

plugin: types

type: directory

files:

name	path	format	required
sequences	`reads.fasta`	`DNAFASTAFormat`	True
quality	`reads.qual`	`QualFormat`	True

`EMPMultiplexedDirFmt`¶

plugin: types

type: directory

files:

name	path	format	required
sequences	`sequences.fastq.gz`	`FastqGzFormat`	True
barcodes	`barcodes.fastq.gz`	`FastqGzFormat`	True

`ErrorCorrectionDetailsDirFmt`¶

plugin: types

type: directory

files:

name	path	format	required
file	`details.tsv`	`ErrorCorrectionDetailsFmt`	True

`ErrorCorrectionDetailsFmt`¶

type: text file

`EMPSingleEndDirFmt`¶

plugin: types

type: directory

files:

name	path	format	required
sequences	`sequences.fastq.gz`	`FastqGzFormat`	True
barcodes	`barcodes.fastq.gz`	`FastqGzFormat`	True

`EMPSingleEndCasavaDirFmt`¶

plugin: types

type: directory

files:

name	path	format	required
sequences	`Undetermined_S0_L001_R1_001.fastq.gz`	`FastqGzFormat`	True
barcodes	`Undetermined_S0_L001_I1_001.fastq.gz`	`FastqGzFormat`	True

`EMPPairedEndDirFmt`¶

plugin: types

type: directory

files:

name	path	format	required
forward	`forward.fastq.gz`	`FastqGzFormat`	True
reverse	`reverse.fastq.gz`	`FastqGzFormat`	True
barcodes	`barcodes.fastq.gz`	`FastqGzFormat`	True

`EMPPairedEndCasavaDirFmt`¶

plugin: types

type: directory

files:

name	path	format	required
forward	`Undetermined_S0_L001_R1_001.fastq.gz`	`FastqGzFormat`	True
reverse	`Undetermined_S0_L001_R2_001.fastq.gz`	`FastqGzFormat`	True
barcodes	`Undetermined_S0_L001_I1_001.fastq.gz`	`FastqGzFormat`	True

`OrdinationFormat`¶

type: text file

`OrdinationDirectoryFormat`¶

plugin: types

type: directory

files:

name	path	format	required
file	`ordination.txt`	`OrdinationFormat`	True

`ProcrustesStatisticsFmt`¶

type: text file

`ProcrustesStatisticsDirFmt`¶

plugin: types

type: directory

files:

name	path	format	required
file	`ProcrustesStatistics.tsv`	`ProcrustesStatisticsFmt`	True

`FastqManifestFormat`¶

Mapping of sample identifiers to relative filepaths and read direction.

type: text file

`FastqAbsolutePathManifestFormat`¶

Mapping of sample identifiers to absolute filepaths and read direction.

type: text file

`YamlFormat`¶

Arbitrary yaml-formatted file.

type: text file

`FastqGzFormat`¶

A gzipped fastq file.

type: binary file

`CasavaOneEightSingleLanePerSampleDirFmt`¶

plugin: types

type: directory

files:

name	path	format	required
sequences	`.+_.+_L[0-9][0-9][0-9]_R[12]_001\.fastq\.gz`	`FastqGzFormat`	True

`CasavaOneEightLanelessPerSampleDirFmt`¶

plugin: types

type: directory

files:

name	path	format	required
sequences	`.+_.+_R[12]_001\.fastq\.gz`	`FastqGzFormat`	True
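
The two Casava patterns above differ only in the lane segment (`_L` followed by three digits). The sketch below checks hypothetical file names against both, assuming each pattern must match the whole file name.

```python
import re

# Patterns copied from the two tables above.
SINGLE_LANE = r".+_.+_L[0-9][0-9][0-9]_R[12]_001\.fastq\.gz"
LANELESS = r".+_.+_R[12]_001\.fastq\.gz"


def matching_layouts(filename):
    """Return which of the two Casava path patterns match the whole file name."""
    layouts = []
    if re.fullmatch(SINGLE_LANE, filename):
        layouts.append("single-lane")
    if re.fullmatch(LANELESS, filename):
        layouts.append("laneless")
    return layouts


for filename in ["sample-1_S1_L001_R1_001.fastq.gz", "sample-1_S1_R2_001.fastq.gz"]:
    print(filename, matching_layouts(filename))
```

Note that a name with a lane segment also satisfies the laneless pattern as written, because `.+_.+` can absorb the `_L001` part; a name without a lane segment matches only the laneless pattern.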

`SingleLanePerSampleSingleEndFastqDirFmt`¶

plugin: types

type: directory

files:

name	path	format	required
sequences	`.+_.+_L[0-9][0-9][0-9]_R[12]_001\.fastq\.gz`	`FastqGzFormat`	True
manifest	`MANIFEST`	`FastqManifestFormat`	True
metadata	`metadata.yml`	`YamlFormat`	True

`SingleLanePerSamplePairedEndFastqDirFmt`¶

plugin: types

type: directory

files:

name	path	format	required
sequences	`.+_.+_L[0-9][0-9][0-9]_R[12]_001\.fastq\.gz`	`FastqGzFormat`	True
manifest	`MANIFEST`	`FastqManifestFormat`	True
metadata	`metadata.yml`	`YamlFormat`	True

`SingleEndFastqManifestPhred33`¶

type: text file

`SingleEndFastqManifestPhred64`¶

type: text file

`PairedEndFastqManifestPhred33`¶

type: text file

`PairedEndFastqManifestPhred64`¶

type: text file

`SingleEndFastqManifestPhred33V2`¶

type: text file

`SingleEndFastqManifestPhred64V2`¶

type: text file

`PairedEndFastqManifestPhred33V2`¶

type: text file

`PairedEndFastqManifestPhred64V2`¶

type: text file

`QIIME1DemuxFormat`¶

QIIME 1 demultiplexed FASTA format.

The QIIME 1 demultiplexed FASTA format is the default output format of split_libraries.py and split_libraries_fastq.py. The file output by QIIME 1 is named seqs.fna; this filename is sometimes associated with the file format itself due to its widespread usage in QIIME 1.

The format is documented here: http://qiime.org/documentation/file_formats.html#demultiplexed-sequences

Format details:

- FASTA file with exactly two lines per record: header and sequence. Each sequence must span exactly one line and cannot be split across multiple lines.

- The ID in each header must follow the format <sample-id>_<seq-id>. <sample-id> is the identifier of the sample the sequence belongs to, and <seq-id> is an identifier for the sequence within its sample. In QIIME 1, <seq-id> is typically an incrementing integer starting from zero, but any non-empty value can be used here, as long as the header IDs remain unique throughout the file. Note: <sample-id> may itself contain underscores; the rightmost underscore will be used to delimit sample and sequence IDs.

- Descriptions in headers are permitted and ignored.

- Header IDs must be unique within the file.

- Each sequence must be DNA and cannot be empty.

type: text file
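
The rightmost-underscore rule for header IDs can be sketched as below. The function name and the example header (including its description text) are hypothetical, and this is not the parser used by the plugin.

```python
def split_header_id(header_line):
    """Split a FASTA header into (sample_id, seq_id) at the rightmost underscore."""
    if not header_line.startswith(">"):
        raise ValueError("not a FASTA header line")
    # The ID is the first whitespace-delimited token; the description is ignored.
    fields = header_line[1:].split(None, 1)
    if not fields:
        raise ValueError("header has no ID")
    sample_id, separator, seq_id = fields[0].rpartition("_")
    if not separator or not sample_id or not seq_id:
        raise ValueError("ID must look like <sample-id>_<seq-id>")
    return sample_id, seq_id


print(split_header_id(">my_sample_42 some description"))
```

Because the split happens at the rightmost underscore, the sample ID in this example is `my_sample` and the sequence ID is `42`.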

`QIIME1DemuxDirFmt`¶

plugin: types

type: directory

files:

name	path	format	required
file	`seqs.fna`	`QIIME1DemuxFormat`	True

`SampleIdIndexedSingleEndPerSampleDirFmt`¶

Single-end reads in fastq.gz files where the base filename is the sample id.

The full file name, minus the extension (.fastq.gz), is the sample id. For example, the sample id for the file:

- sample-1.fastq.gz is sample-1

- xyz.fastq.gz is xyz

- sample-42_S1_L001_R1_001.fastq.gz is sample-42_S1_L001_R1_001

plugin: types

type: directory

files:

name	path	format	required
sequences	`.+\.fastq\.gz`	`FastqGzFormat`	True
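
The naming rule for this directory format amounts to stripping the `.fastq.gz` extension. A minimal sketch, with a hypothetical helper name:

```python
def sample_id_from_filename(filename):
    """Return the sample id: the file name with the .fastq.gz extension removed."""
    suffix = ".fastq.gz"
    if not filename.endswith(suffix) or len(filename) == len(suffix):
        raise ValueError(f"not a valid <sample-id>.fastq.gz name: {filename!r}")
    return filename[: -len(suffix)]


for filename in ["sample-1.fastq.gz", "xyz.fastq.gz", "sample-42_S1_L001_R1_001.fastq.gz"]:
    print(filename, "->", sample_id_from_filename(filename))
```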

`MultiFASTADirectoryFormat`¶

plugin: types

type: directory

files:

name	path	format	required
sequences	`.+\.(fa\|fasta)$`	`DNAFASTAFormat`	True

`MultiMAGSequencesDirFmt`¶

plugin: types

type: directory

files:

name	path	format	required
sequences	`.+\.(fa\|fasta)$`	`DNAFASTAFormat`	True
manifest	`MANIFEST`	`MultiMAGManifestFormat`	True

`ContigSequencesDirFmt`¶

plugin: types

type: directory

files:

name	path	format	required
sequences	`[^\.].+_contigs.(fasta\|fa)$`	`DNAFASTAFormat`	True

`MultiBowtie2IndexDirFmt`¶

plugin: types

type: directory

files:

name	path	format	required
idx1	`.+(?<!\.rev)\.1\.bt2l?`	`Bowtie2IndexFileFormat`	True
idx2	`.+(?<!\.rev)\.2\.bt2l?`	`Bowtie2IndexFileFormat`	True
ref3	`.+\.3\.bt2l?`	`Bowtie2IndexFileFormat`	True
ref4	`.+\.4\.bt2l?`	`Bowtie2IndexFileFormat`	True
rev1	`.+\.rev\.1\.bt2l?`	`Bowtie2IndexFileFormat`	True
rev2	`.+\.rev\.2\.bt2l?`	`Bowtie2IndexFileFormat`	True

`BAMFormat`¶

type: binary file

`BAMDirFmt`¶

plugin: types

type: directory

files:

name	path	format	required
bams	`.+\.bam`	`BAMFormat`	True

`MultiBAMDirFmt`¶

plugin: types

type: directory

files:

name	path	format	required
bams	`.+\/.+\.bam`	`BAMFormat`	True

`MultiMAGManifestFormat`¶

type: text file

`ProteinMultipleProfileHmmFileFmt`¶

type: text file

`ProteinSingleProfileHmmFileFmt`¶

type: text file

`RNAMultipleProfileHmmFileFmt`¶

type: text file

`RNASingleProfileHmmFileFmt`¶

type: text file

`DNAMultipleProfileHmmFileFmt`¶

type: text file

`DNASingleProfileHmmFileFmt`¶

type: text file

`PressedProfileHmmsDirectoryFmt`¶

The <hmmfile>.h3m file contains the profile HMMs and their annotation in a binary format. The <hmmfile>.h3i file is an SSI index for the <hmmfile>.h3m file. The <hmmfile>.h3f file contains precomputed data structures for the fast heuristic filter (the MSV filter). The <hmmfile>.h3p file contains precomputed data structures for the rest of each profile.

plugin: types

type: directory

files:

name	path	format	required
h3m	`.*\.hmm\.h3m`	`ProfileHmmBinaryFileFmt`	True
h3i	`.*\.hmm\.h3i`	`ProfileHmmBinaryFileFmt`	True
h3f	`.*\.hmm\.h3f`	`ProfileHmmBinaryFileFmt`	True
h3p	`.*\.hmm\.h3p`	`ProfileHmmBinaryFileFmt`	True

`ProteinSingleProfileHmmDirectoryFmt`¶

plugin: types

type: directory

files:

name	path	format	required
profile	`.*\.hmm`	`ProteinSingleProfileHmmFileFmt`	True

`ProteinMultipleProfileHmmDirectoryFmt`¶

plugin: types

type: directory

files:

name	path	format	required
profiles	`.*\.hmm`	`ProteinMultipleProfileHmmFileFmt`	True

`DNASingleProfileHmmDirectoryFmt`¶

plugin: types

type: directory

files:

name	path	format	required
profile	`.*\.hmm`	`DNASingleProfileHmmFileFmt`	True

`DNAMultipleProfileHmmDirectoryFmt`¶

plugin: types

type: directory

files:

name	path	format	required
profiles	`.*\.hmm`	`DNAMultipleProfileHmmFileFmt`	True

`RNASingleProfileHmmDirectoryFmt`¶

plugin: types

type: directory

files:

name	path	format	required
profile	`.*\.hmm`	`RNASingleProfileHmmFileFmt`	True

`RNAMultipleProfileHmmDirectoryFmt`¶

plugin: types

type: directory

files:

name	path	format	required
profiles	`.*\.hmm`	`RNAMultipleProfileHmmFileFmt`	True

`EggnogRefTextFileFmt`¶

type: text file

`EggnogRefBinFileFmt`¶

type: binary file

`EggnogRefDirFmt`¶

plugin: types

type: directory

files:

name	path	format	required
eggnog	`eggnog.db.`	`EggnogRefBinFileFmt`	True

`DiamondDatabaseFileFmt`¶

type: binary file

`DiamondDatabaseDirFmt`¶

plugin: types

type: directory

files:

name	path	format	required
file	`ref_db.dmnd`	`DiamondDatabaseFileFmt`	True

`NCBITaxonomyNodesFormat`¶

type: text file

`NCBITaxonomyNamesFormat`¶

type: text file

`NCBITaxonomyBinaryFileFmt`¶

type: binary file

`NCBITaxonomyDirFmt`¶

plugin: types

type: directory

files:

name	path	format	required
node	`nodes.dmp`	`NCBITaxonomyNodesFormat`	True
names	`names.dmp`	`NCBITaxonomyNamesFormat`	True
tax_map	`prot.accession2taxid.gz`	`NCBITaxonomyBinaryFileFmt`	True

`EggnogProteinSequencesDirFmt`¶

plugin: types

type: directory

files:

name	path	format	required
taxid_info	`e5.taxid_info.tsv`	`EggnogRefTextFileFmt`	True
proteins	`e5.proteomes.faa`	`MixedCaseProteinFASTAFormat`	True

`AlphaDiversityFormat`¶

type: text file

`AlphaDiversityDirectoryFormat`¶

plugin: types

type: directory

files:

name	path	format	required
file	`alpha-diversity.tsv`	`AlphaDiversityFormat`	True

`NewickFormat`¶

type: text file

`NewickDirectoryFormat`¶

plugin: types

type: directory

files:

name	path	format	required
file	`tree.nwk`	`NewickFormat`	True

`NDJSONFileFormat`¶

Format for newline-delimited (ND) JSON file.

type: text file
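
In a newline-delimited JSON file, each line is a complete JSON document. A generic reader can be sketched as follows; the record contents are made up, and skipping blank lines is an assumption rather than something this format is stated to allow.

```python
import json


def read_ndjson(text):
    """Parse one JSON value per non-blank line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


sample = '{"id": "seq1", "count": 3}\n{"id": "seq2", "count": 5}\n'
rows = read_ndjson(sample)
print(rows)
```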

`DataResourceSchemaFileFormat`¶

Format for data resource schema.

type: text file

`TabularDataResourceDirFmt`¶

plugin: types

type: directory

files:

name	path	format	required
data	`data.ndjson`	`NDJSONFileFormat`	True
metadata	`dataresource.json`	`DataResourceSchemaFileFormat`	True

`TableJSONLFileFormat`¶

type: text file

`TableJSONLDirFmt`¶

plugin: types

type: directory

files:

name	path	format	required
file	`data.table.jsonl`	`TableJSONLFileFormat`	True
