API Documentation
Welcome to the Mycelia API documentation. This guide organizes implemented functions and planned features by biological workflow stage. Mycelia provides working functions for most stages of a bioinformatics analysis, many of them integrations with external tools, and continues to add experimental algorithms and new features.
Quick Start
New to Mycelia? Start with our workflow-based guides:
- Basic Workflows - Common analysis patterns
- Function Index - Alphabetical function list
- Parameter Guide - Common parameters explained
By Workflow Stage
Follow the typical bioinformatics analysis workflow:
1. Data Acquisition & Simulation
Download genomic data from public databases and simulate synthetic datasets for testing.
Working Functions: download_genome_by_accession, simulate_pacbio_reads, simulate_nanopore_reads
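To make the idea of read simulation concrete, here is a minimal sketch in plain Julia. `sample_reads` is a hypothetical helper written for this page, not part of Mycelia, and it does not reflect how `simulate_pacbio_reads` or `simulate_nanopore_reads` work: it draws error-free, fixed-length reads from uniformly random start positions and ignores the error profiles and length distributions that real long-read simulators model.

```julia
using Random

# Hypothetical helper (not a Mycelia function): sample error-free reads of a
# fixed length from random start positions on a reference sequence.
# Assumes an ASCII reference so that string indexing by position is valid.
function sample_reads(reference::AbstractString, read_length::Integer, n_reads::Integer;
                      rng::AbstractRNG=Random.default_rng())
    read_length <= length(reference) ||
        throw(ArgumentError("read_length exceeds reference length"))
    last_start = length(reference) - read_length + 1
    # Draw n_reads start positions, then slice the reference at each one.
    starts = rand(rng, 1:last_start, n_reads)
    return [reference[s:s+read_length-1] for s in starts]
end

# A seeded generator makes the sampled reads reproducible between runs.
reads = sample_reads("ACGTTGCAAGGCTTAACCGG", 8, 5; rng=MersenneTwister(42))
```

Passing the random number generator as a keyword argument keeps the helper deterministic in tests while still defaulting to the global generator.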
2. Quality Control & Preprocessing
Assess and improve sequencing data quality before analysis.
Working Functions: analyze_fastq_quality, calculate_gc_content, assess_duplication_rates, qc_filter_short_reads_fastp, qc_filter_long_reads_filtlong, trim_galore_paired
Planned: filter_by_quality, per-base quality visualization
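As a rough illustration of the kind of metric a quality-control step reports, the sketch below computes GC content and mean base quality in plain Julia. Both helpers are hypothetical and written for this page; they are not Mycelia's `calculate_gc_content` or `analyze_fastq_quality`, whose signatures and return values may differ. The quality helper assumes the common Phred+33 FASTQ encoding.

```julia
# Hypothetical helper (not a Mycelia function): fraction of G/C bases.
# Characters other than A, C, G, T (for example N) are excluded from the total.
function gc_fraction(seq::AbstractString)
    gc = 0
    total = 0
    for base in uppercase(seq)
        if base in ('G', 'C')
            gc += 1
            total += 1
        elseif base in ('A', 'T')
            total += 1
        end
    end
    return total == 0 ? 0.0 : gc / total
end

# Hypothetical helper: mean Phred score of a FASTQ quality string,
# assuming Phred+33 encoding (score = ASCII code - 33).
function mean_phred(qual::AbstractString)
    isempty(qual) && throw(ArgumentError("quality string must not be empty"))
    return sum(Int(c) - 33 for c in qual) / length(qual)
end

gc = gc_fraction("ATGCGCNN")
q = mean_phred("IIIIFFFF")
```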
3. Sequence Analysis & K-mers
Analyze sequence composition, count k-mers, and extract genomic features.
Working Functions: count_canonical_kmers, jaccard_distance, kmer_counts_to_js_divergence
Planned: kmer_frequency_spectrum, estimate_genome_size
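The three concepts named in this stage (canonical k-mer counting, Jaccard distance, and Jensen-Shannon divergence) can be sketched in dependency-free Julia. Every function below is a hypothetical stand-in written for illustration. None of them is Mycelia's implementation, and the real functions may use different types and signatures. The sketch assumes ASCII DNA strings and non-empty count tables.

```julia
# Reverse complement of a DNA string; characters other than A, C, G, T are kept as-is.
function revcomp(seq::AbstractString)
    comp = Dict('A' => 'T', 'T' => 'A', 'C' => 'G', 'G' => 'C')
    return join(get(comp, b, b) for b in reverse(uppercase(seq)))
end

# A canonical k-mer is the lexicographically smaller of a k-mer and its reverse complement.
canonical(kmer::AbstractString) = min(kmer, revcomp(kmer))

# Count canonical k-mers with a sliding window of width k.
function count_canonical(seq::AbstractString, k::Integer)
    counts = Dict{String,Int}()
    s = uppercase(seq)
    for i in 1:(length(s) - k + 1)
        kmer = canonical(s[i:i+k-1])
        counts[kmer] = get(counts, kmer, 0) + 1
    end
    return counts
end

# Jaccard distance on the sets of observed k-mers: 1 - |A ∩ B| / |A ∪ B|.
function kmer_jaccard(a::Dict, b::Dict)
    ka, kb = Set(keys(a)), Set(keys(b))
    u = length(union(ka, kb))
    return u == 0 ? 0.0 : 1 - length(intersect(ka, kb)) / u
end

# Jensen-Shannon divergence (log base 2, so the result lies in [0, 1])
# between two k-mer count tables treated as frequency distributions.
function js_divergence(a::Dict{String,Int}, b::Dict{String,Int})
    na, nb = sum(values(a)), sum(values(b))
    d = 0.0
    for key in union(Set(keys(a)), Set(keys(b)))
        p = get(a, key, 0) / na
        q = get(b, key, 0) / nb
        m = (p + q) / 2
        p > 0 && (d += 0.5 * p * log2(p / m))
        q > 0 && (d += 0.5 * q * log2(q / m))
    end
    return d
end

a = count_canonical("ACGT", 2)
b = count_canonical("ACGA", 2)
dist = kmer_jaccard(a, b)
div = js_divergence(a, b)
```

Jaccard distance looks only at which k-mers are present, while Jensen-Shannon divergence also weighs how often each one occurs.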
4. Genome Assembly (planned)
Assemble genomes from sequencing reads using various approaches.
Working Functions: assemble_metagenome_megahit, assemble_metagenome_metaspades (external tools)
Experimental: Graph-based assembly framework
Planned: assemble_genome, polish_assembly
5. Assembly Validation
Validate and assess the quality of genome assemblies.
Working Functions: assess_assembly_quality, validate_assembly, run_quast, run_busco, run_mummer, CheckM/CheckM2 integration
Planned: Mauve integration
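Assembly quality reports commonly include contiguity statistics such as N50: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. The sketch below computes it from a list of contig lengths. `n50` is a hypothetical helper for illustration only, and nothing here describes what `assess_assembly_quality` or `run_quast` actually return.

```julia
# Hypothetical helper (not a Mycelia function): N50 of a set of contig lengths.
function n50(lengths::AbstractVector{<:Integer})
    isempty(lengths) && return 0
    sorted = sort(lengths; rev=true)   # longest contigs first
    half = sum(sorted) / 2             # half of the total assembly length
    running = 0
    for len in sorted
        running += len
        # The first contig that pushes the running total to half is the N50 contig.
        running >= half && return len
    end
    return 0
end

contig_lengths = [10, 20, 30, 40]
assembly_n50 = n50(contig_lengths)
```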
6. Gene Annotation
Predict genes and assign functional annotations.
Working Functions: Pyrodigal, BLAST+, MMSeqs2, TransTerm, tRNAscan-SE, MLST integrations
Planned: GO term analysis, Reactome pathway analysis, PDB integration via UniRef annotations
7. Comparative Genomics (planned)
Compare genomes, build pangenomes, and construct phylogenetic trees.
Working Functions: analyze_pangenome_kmers, build_genome_distance_matrix
Planned: construct_phylogeny, calculate_synteny
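One common way to compare genomes without alignment is a pairwise distance matrix over their k-mer content. The sketch below builds a symmetric Jaccard distance matrix in plain Julia. `kmer_set` and `distance_matrix` are hypothetical names chosen for this example, and the approach is an assumption for illustration, not a description of how `build_genome_distance_matrix` is implemented.

```julia
# Hypothetical helper: the set of distinct k-mers in an ASCII sequence.
kmer_set(seq::AbstractString, k::Integer) =
    Set{String}(seq[i:i+k-1] for i in 1:(length(seq) - k + 1))

# Hypothetical helper: symmetric matrix of pairwise Jaccard distances
# between the k-mer sets of the input sequences.
function distance_matrix(seqs::Vector{String}, k::Integer)
    sets = [kmer_set(s, k) for s in seqs]
    n = length(sets)
    D = zeros(n, n)                      # the diagonal stays at 0.0
    for i in 1:n, j in (i+1):n
        shared = length(intersect(sets[i], sets[j]))
        total = length(union(sets[i], sets[j]))
        D[i, j] = D[j, i] = total == 0 ? 0.0 : 1 - shared / total
    end
    return D
end

genomes = ["ACGTACGT", "ACGTACGA", "TTTTTTTT"]
D = distance_matrix(genomes, 3)
```

A matrix like this is the usual input to clustering or distance-based tree building, which is where the planned phylogeny functions would pick up.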
8. Visualization & Reporting
Create plots, figures, and reports for analysis results.
Working Functions: plot_kmer_frequency_spectra, visualize_genome_coverage, plot_embeddings, plot_taxa_abundances, coverage plots, taxonomic visualizations
Planned: Per-base quality plots, assembly statistics visualization, phylogenetic tree plotting
By Data Type
Working with specific file formats and data structures:
FASTA/FASTQ Files (planned)
Reading, writing, and manipulating sequence files.
Assembly Files (planned)
Working with contigs, scaffolds, and assembly statistics.
Annotation Files (planned)
Handling GFF3, GenBank, and other annotation formats.
Alignment Files (planned)
Processing BAM/SAM files and alignment results.
Phylogenetic Trees (planned)
Tree construction, manipulation, and visualization.
By Analysis Goal
Cross-cutting concerns and specific use cases:
Basic Workflows
Complete examples for common analysis tasks.
Advanced Usage
Complex workflows and optimization techniques.
Function Index
Alphabetical listing of all functions with brief descriptions.
Parameter Guide
Common parameters and their usage across functions.
Finding What You Need
By Task
- "I want to assemble a genome" → Genome Assembly (planned)
- "I need to validate my assembly" → Assembly Validation (planned)
- "I want to compare genomes" → Comparative Genomics (planned)
- "I need to check data quality" → Quality Control
By Data Type
- "I have FASTQ files" → FASTA/FASTQ Files (planned)
- "I have assembly contigs" → Assembly Files (planned)
- "I have gene annotations" → Annotation Files (planned)
By Experience Level
- Beginner → Basic Workflows
- Intermediate → Workflow-specific guides
- Advanced → Advanced Usage
Usage Patterns
Function Documentation Format
Each function is documented with:
"""
    function_name(required_param; optional_param="default")
Brief description of what the function does.
## Purpose
When and why to use this function in your workflow.
## Arguments
- `required_param`: Description and expected data type
- `optional_param`: Description, default value, and alternatives
## Returns
Description of return value and structure.
## Examples
```julia
# Basic usage
result = function_name("input.fasta")

# Advanced usage with options
result = function_name("input.fasta", optional_param="custom_value", threads=4)
```
## Related Functions
- [`related_function`](@ref) - What it does
- [`workflow_next_step`](@ref) - Next step in workflow
## Performance Notes
- Memory usage: ~X GB for typical datasets
- Runtime: ~X minutes for Y-sized genomes
- Scaling: Linear/quadratic with input size
## See Also
- [Workflow Guide](../workflows/relevant-workflow.md)
- [Data Type Guide](../data-types/relevant-type.md)
"""
Cross-References
Functions are linked to:
- Workflow context - Where they fit in analysis pipelines
- Related functions - What to use before/after
- Data types - What formats they accept/produce
- Examples - Real usage scenarios
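To make the documentation format above concrete, here is a small sketch of a docstring that follows it. `mean_read_length` is a hypothetical function invented for this example and is not part of Mycelia. The example inside the docstring is written as an indented code block so that it nests cleanly in this listing, and sections that do not apply to such a small function are omitted.

```julia
"""
    mean_read_length(lengths; digits=1)

Return the mean of a collection of read lengths.

## Purpose
Quick sanity check on a read set before running quality control.

## Arguments
- `lengths`: Vector of read lengths (integers); must not be empty
- `digits`: Number of decimal places to keep (default `1`)

## Returns
The mean read length as a `Float64`, rounded to `digits` decimal places.

## Examples

    mean_read_length([100, 150, 151])
    mean_read_length([100, 150, 151]; digits=3)
"""
function mean_read_length(lengths::AbstractVector{<:Integer}; digits::Integer=1)
    isempty(lengths) && throw(ArgumentError("lengths must not be empty"))
    return round(sum(lengths) / length(lengths); digits=digits)
end

avg = mean_read_length([100, 150, 151])
```

Because the docstring sits directly above the function definition, Julia attaches it to the function, and it can be viewed in the REPL help mode with `?mean_read_length`.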
Integration with Tutorials
This API documentation integrates with the tutorial system:
- Tutorials show complete workflows with explanation
- API docs provide detailed function reference
- Examples bridge the gap with focused use cases
For hands-on learning, see the Tutorials which use these functions in complete bioinformatics workflows.
This documentation is automatically generated from function docstrings and organized for biological workflows. Functions are tested through the tutorial system to ensure accuracy.