Documentation
Overview
Generates a Go source file declaring structs compatible with the stages declared in the given mro sources.
MRO files are parsed given the current mropath. If a specific set of stages is not specified, then code is generated for all stages.
The given source file is created, with the given package name. For each stage which will be used, a structure is created which is appropriate for serializing the stage args and outs files. In addition, for stages which split, structures are generated for chunk defs, chunk args, chunk outs, and join args. These structs are named as <stageName><File>, where File is one of Args, Outs, ChunkDef, ChunkArgs, ChunkOuts, or JoinArgs. Stages which do not split will have only the first two of those.
ChunkDef objects will have a ToChunkDef method, which converts from the stage-specific chunk def object to a *core.ChunkDef, which is required by the Go adapter for the return value of the split.
ChunkArgs is a combination of ChunkDef and the stage Args, and is used by the chunk main to deserialize its arguments.
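As a minimal sketch of why such a combined struct can be decoded in one step: Go's encoding/json promotes the fields of embedded structs, so a single flat JSON object can fill both halves. The types and the decodeChunkArgs helper below are hypothetical stand-ins, not the generated code, and the stand-in chunk def leaves out the embedded job resources.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ExampleArgs is a hypothetical stand-in for a generated <stageName>Args struct.
type ExampleArgs struct {
	VariantMode string `json:"variant_mode"`
}

// ExampleChunkDef is a hypothetical stand-in for a generated ChunkDef struct.
// The embedded job resources are omitted to keep the sketch self-contained.
type ExampleChunkDef struct {
	ChunkInput string `json:"chunk_input"`
}

// ExampleChunkArgs embeds both structs, mirroring the ChunkArgs layout.
type ExampleChunkArgs struct {
	ExampleChunkDef
	ExampleArgs
}

// decodeChunkArgs parses one flat JSON object into the combined struct.
func decodeChunkArgs(data []byte) (ExampleChunkArgs, error) {
	var args ExampleChunkArgs
	err := json.Unmarshal(data, &args)
	return args, err
}

func main() {
	raw := []byte(`{"variant_mode":"snp","chunk_input":"chunk0.vcf"}`)
	args, err := decodeChunkArgs(raw)
	if err != nil {
		panic(err)
	}
	// Fields from both embedded structs are reachable directly.
	fmt.Println(args.VariantMode, args.ChunkInput)
}
```

In a real chunk main the bytes would come from the args file rather than a literal.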
ChunkOuts is a combination of the split outs and the stage outs. It defines a custom JSON marshaller to ensure the outputs are correctly flattened in the JSON representation. It is used by the chunk main for its output and by the join to deserialize the chunk outputs.
JoinArgs is a combination of the Args and the job resources structure. It is used by the join instead of Args if the join wants to see the thread/memory request assigned to it by the split.
A stage without a split will look like the following simple example:
    func main(metadata *core.Metadata) (interface{}, error) {
        var args StageNameArgs
        if err := metadata.ReadInto(core.ArgsFile, &args); err != nil {
            return nil, err
        }
        return &StageNameOuts{
            Arg1: value1,
            Arg2: value2,
        }, nil
    }
Stages with splits will be more complex and should use the corresponding data structures.
Leading underscores are stripped from the stage name. The stage name is converted to camelCase unless '-public' is specified on the command line, in which case it is converted to PascalCase.
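The naming rule can be sketched roughly as follows. The convertStageName helper is hypothetical and written only to illustrate the rule as described; it is not mro2go's actual implementation, and it assumes ASCII stage names made of underscore-separated words.

```go
package main

import (
	"fmt"
	"strings"
)

// convertStageName is a hypothetical helper illustrating the naming rule:
// strip leading underscores, then join the underscore-separated words in
// camelCase, or in PascalCase when public is true. Assumes ASCII input.
func convertStageName(stage string, public bool) string {
	var b strings.Builder
	for _, word := range strings.Split(strings.TrimLeft(stage, "_"), "_") {
		if word == "" {
			continue // skip empty pieces from repeated underscores
		}
		lower := strings.ToLower(word)
		if b.Len() == 0 && !public {
			// First word stays lowercase for camelCase.
			b.WriteString(lower)
		} else {
			// Capitalize the first letter of every other word.
			b.WriteString(strings.ToUpper(lower[:1]) + lower[1:])
		}
	}
	return b.String()
}

func main() {
	fmt.Println(convertStageName("_POPULATE_INFO_FIELDS", false))
	fmt.Println(convertStageName("_POPULATE_INFO_FIELDS", true))
}
```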
Given the input pipeline.mro:
    filetype bam;
    filetype vcf;
    filetype vcf.gz;
    filetype vcf.gz.tbi;
    filetype filter_params;
    filetype json;
    filetype bed;
    filetype tsv;
    filetype tsv.gz;
    filetype h5;
    filetype csv;

    stage POPULATE_INFO_FIELDS(
        in  vcf      vc_precalled,
        in  string   variant_mode,
        in  vcf.gz   haploid_merge       "optional vcf to merge with normal calls",
        in  string[] chunk_locus         "list of chunk loci, if supplying haploid_merge",
        in  int      min_mapq_attach_bc,
        out vcf.gz,
        src py       "stages/snpindels/populate_info",
    ) split using (
        in  vcf      chunk_input,
        out int      chunk_output,
    )
A user would run
$ mro2go -package populate -o stagestructs.go pipeline.mro
to generate stagestructs.go:
    package populate

    import (
        "github.com/martian-lang/martian/martian/core"
    )

    // A structure to encode and decode args to the POPULATE_INFO_FIELDS stage.
    type PopulateInfoFieldsArgs struct {
        // vcf file
        VcPrecalled string `json:"vc_precalled"`
        VariantMode string `json:"variant_mode"`
        // vcf.gz file: optional vcf to merge with normal calls
        HaploidMerge string `json:"haploid_merge"`
        // list of chunk loci, if supplying haploid_merge
        ChunkLocus      []string `json:"chunk_locus"`
        MinMapqAttachBc int      `json:"min_mapq_attach_bc"`
    }

    // A structure to encode and decode outs from the POPULATE_INFO_FIELDS stage.
    type PopulateInfoFieldsOuts struct {
        // vcf.gz file
        Default string `json:"default"`
    }

    // A structure to encode and decode args to the POPULATE_INFO_FIELDS chunks.
    // Defines the resources and arguments of a chunk.
    type PopulateInfoFieldsChunkDef struct {
        *core.JobResources `json:",omitempty"`
        // vcf file
        ChunkInput string `json:"chunk_input"`
    }

    func (self *PopulateInfoFieldsChunkDef) ToChunkDef() *core.ChunkDef {
        return &core.ChunkDef{
            Args: core.ArgumentMap{
                "chunk_input": self.ChunkInput,
            },
            Resources: self.JobResources,
        }
    }

    // A structure to decode args to the chunks
    type PopulateInfoFieldsChunkArgs struct {
        PopulateInfoFieldsChunkDef
        PopulateInfoFieldsArgs
    }

    // A structure to decode args to the join method.
    type PopulateInfoFieldsJoinArgs struct {
        core.JobResources
        PopulateInfoFieldsArgs
    }

    // A structure to encode outs from the chunks.
    type PopulateInfoFieldsChunkOuts struct {
        PopulateInfoFieldsOuts
        ChunkOutput int `json:"chunk_output"`
    }

    func (self *PopulateInfoFieldsChunkOuts) MarshalJSON() ([]byte, error) {
        ...
    }