Probabilistic Assembly with Mycelia
Mycelia's probabilistic assembly approach combines probabilistic modelling with biological insight to deliver assemblies with confidence intervals.
🧬 What is Probabilistic Assembly?
Traditional Assembly
Traditional assemblers make deterministic decisions at each step:
- Fixed k-mer size throughout assembly
- Binary decisions: keep or discard sequences
- Quality scores often ignored after initial filtering
- Heuristic-based error correction
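To make the contrast concrete, here is a minimal sketch of the deterministic style described above: one fixed k for every read and a hard keep-or-discard count cutoff. It does not reproduce any particular assembler; `count_kmers`, `filter_kmers`, and the `min_count` threshold are hypothetical names chosen for illustration.

```julia
# Count every k-mer of a single fixed length k across all reads.
function count_kmers(reads::Vector{String}, k::Int)
    counts = Dict{String,Int}()
    for read in reads
        # Empty range when the read is shorter than k, so short reads are skipped.
        for i in 1:(length(read) - k + 1)
            kmer = read[i:i+k-1]
            counts[kmer] = get(counts, kmer, 0) + 1
        end
    end
    return counts
end

# Binary decision: a k-mer is either kept or discarded by a hard cutoff.
# Quality scores play no role here.
function filter_kmers(counts::Dict{String,Int}, min_count::Int)
    return filter(pair -> pair.second >= min_count, counts)
end

reads = ["ACGTAC", "ACGTAC", "ACGAAC"]
counts = count_kmers(reads, 3)
solid = filter_kmers(counts, 2)   # k-mers seen only once are dropped
```

Note that the k-mers unique to the third read are discarded outright, whether they come from a sequencing error or a real variant. The count alone cannot tell the two apart.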
Probabilistic Assembly
Mycelia's probabilistic approach treats assembly as a statistical inference problem:
- Dynamic k-mer progression based on data characteristics
- Probabilistic path selection using quality scores
- Maximum likelihood error correction
- Self-optimizing parameters through machine learning
Traditional: Reads → Graph → Heuristic Cleaning → Assembly
Probabilistic: Reads → Quality-Aware Graph → Statistical Inference → Optimal Assembly
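One ingredient of a quality-aware graph is turning per-base quality scores into probabilities. The sketch below uses the standard Phred definition (error probability p = 10^(-Q/10)) and assumes Phred+33 encoding and independent base-call errors. The function names are hypothetical and are not part of Mycelia's API.

```julia
# Phred quality Q encodes an error probability p = 10^(-Q/10).
phred_to_error(q::Real) = 10.0^(-q / 10)

# Decode a FASTQ quality string (Phred+33 ASCII encoding is assumed).
decode_quality(qual::AbstractString; offset::Int=33) = [Int(c) - offset for c in qual]

# Probability that every base of a k-mer was called correctly,
# assuming base-call errors are independent.
kmer_correct_probability(quals) = prod(1 - phred_to_error(q) for q in quals)

quals = decode_quality("II5")        # 'I' decodes to Q40, '5' decodes to Q20
p = kmer_correct_probability(quals)  # dominated by the single Q20 base
```

A score like `p` can weight a k-mer observation instead of counting it as a full occurrence. This is one simple way quality information can carry through to path selection.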
🎯 Why Use Probabilistic Assembly?
1. High Accuracy
- Preserves quality information throughout assembly
- Statistically principled error correction
2. Self-Optimizing
- No manual parameter tuning required
- Automatically selects optimal k-mer sizes
- Adapts to your data's characteristics
3. Quality-Aware
- First assembler to preserve per-base quality scores
- Handles varying coverage gracefully
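As an illustration of what statistically principled, quality-aware correction can look like, the following sketch makes a maximum-likelihood base call at a single position from several overlapping observations. It assumes a simple error model in which a miscalled base is equally likely to be any of the other three. This is a teaching example, not Mycelia's actual algorithm, and all names are hypothetical.

```julia
# Log-likelihood of the observed calls if the true base were `candidate`.
# Assumed model: a miscall is equally likely to be any of the other three bases.
function log_likelihood(candidate::Char, observed::Vector{Char}, quals::Vector{Int})
    ll = 0.0
    for (base, q) in zip(observed, quals)
        p_err = 10.0^(-q / 10)   # Phred score to error probability
        ll += base == candidate ? log(1 - p_err) : log(p_err / 3)
    end
    return ll
end

# Maximum-likelihood call: the candidate base with the highest log-likelihood.
function ml_base(observed::Vector{Char}, quals::Vector{Int}; bases=('A', 'C', 'G', 'T'))
    scores = [log_likelihood(b, observed, quals) for b in bases]
    return bases[argmax(scores)]
end

# Two low-quality 'T' calls are outweighed by one high-quality 'A' call.
call = ml_base(['T', 'T', 'A'], [5, 5, 40])
```

A plain majority vote would pick 'T' here. Weighting by quality lets one confident observation override two unreliable ones, and with equal qualities the call falls back to the majority.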
Quick Start (5 Minutes)
Get your first assembly running in under 5 minutes:
# 1. Install Mycelia (one-time setup)
import Pkg
Pkg.add(url="https://github.com/cjprybol/Mycelia.git")
# 2. Load the package
import Mycelia
# 3. Assemble your genome
assembly = Mycelia.assemble_genome("my_reads.fastq")
# 4. Check your results
println("Assembly complete! $(assembly.num_contigs) contigs, N50: $(assembly.n50)")
# 5. Save the assembly
Mycelia.write_fasta(assembly.contigs, "my_assembly.fasta")
That's it! Mycelia automatically:
- ✅ Detects your data type
- ✅ Selects optimal parameters
- ✅ Performs quality-aware assembly
- ✅ Validates results
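The N50 printed in the quick start is a standard contiguity metric: the contig length L such that contigs of length L or longer together cover at least half of the total assembly length. If you want to sanity-check a reported value, a minimal standalone sketch is shown below. The function `n50` is a hypothetical helper written for this example, not a Mycelia function.

```julia
# N50: walk contigs from longest to shortest and return the length at which
# the running total first reaches half of the total assembly length.
function n50(lengths::Vector{Int})
    isempty(lengths) && return 0
    sorted = sort(lengths; rev=true)
    half = sum(sorted) / 2
    running = 0
    for len in sorted
        running += len
        running >= half && return len
    end
    return 0  # not reached for non-empty input; keeps the return type stable
end

# Total length is 300; 100 + 80 = 180 passes the halfway mark of 150.
n50([100, 80, 60, 40, 20])   # returns 80
```

A higher N50 with a similar total length generally indicates a more contiguous assembly, but it says nothing about correctness, so read it together with the other validation results.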
🗺️ Choose Your Path
By Data Type
graph TD
A[What type of data do you have?] --> B[FASTA only<br/>No quality scores]
A --> C[FASTQ<br/>With quality scores]
A --> D[Mixed<br/>Short + long reads]
B --> E[K-mer Graphs<br/>→ Tutorial 1]
C --> F[Qualmer Graphs<br/>→ Tutorial 2]
D --> G[Hybrid Assembly<br/>→ Tutorial 3]
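If you are unsure which branch of the chart applies, the file itself usually tells you: FASTA records start with `>` and FASTQ records start with `@`. The sketch below classifies an input stream on that basis and maps it to the matching tutorial. It is a hypothetical helper, not Mycelia's own detection logic. It assumes uncompressed text, and whether a dataset mixes short and long reads has to be supplied by the caller.

```julia
# Classify a sequence stream by the first character of its first non-empty line.
function detect_format(io::IO)
    for line in eachline(io)
        stripped = strip(line)
        isempty(stripped) && continue
        first(stripped) == '>' && return :fasta   # FASTA header line
        first(stripped) == '@' && return :fastq   # FASTQ header line
        return :unknown
    end
    return :unknown   # empty input
end

# Map the detected format onto the branches of the chart above.
function suggest_path(format::Symbol; mixed_read_lengths::Bool=false)
    mixed_read_lengths && return "Hybrid Assembly (Tutorial 3)"
    format == :fasta && return "K-mer Graphs (Tutorial 1)"
    format == :fastq && return "Qualmer Graphs (Tutorial 2)"
    return "Unrecognized input"
end

fmt = detect_format(IOBuffer("@read1\nACGT\n+\nIIII\n"))
suggest_path(fmt)
```

To check a file on disk, the same function can be called as `open(detect_format, "my_reads.fastq")`.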
By Experience Level
🌱 Beginner - "I just want it to work"
→ Start with Assembly in 5 Minutes
- Automatic parameter selection
- Simple one-function interface
- Clear output interpretation
🌿 Intermediate - "I want to understand and optimize"
→ Continue to Understanding Assembly Methods
- Compare different approaches
- Tune for your specific needs
- Interpret quality metrics
🌳 Advanced - "I want full control"
→ Explore Advanced Assembly Theory
- Custom graph algorithms
- Machine learning integration
- Novel method development
Each Method
Intelligent Assembly
- ✅ Automatic parameter optimization
- ✅ Good for unknown data characteristics
- ✅ Balances speed and accuracy
Iterative Assembly
- ✅ Best for low-quality data
- ✅ Refines assembly through iterations
- ⏱️ Takes more time
Reinforcement Learning Assembly
- ✅ Learns from experience
- ✅ Adapts to new data types
- 🧪 Experimental feature
🛠️ Next Steps
Ready to Start?
- Install Mycelia - Multiple installation options
- Run the 5-minute tutorial - See it in action
- Choose your workflow - Find the best approach
Need Help?
- FAQ - Common questions and troubleshooting
- Community Forum - Ask questions
- Report Issues - Help us improve
Want to Learn More?
- Mathematical Foundations - The theory behind the methods
- Benchmarks - Performance comparisons
- Case Studies - Real-world applications
Why Mycelia?
Scientific Innovation
- Novel 6-graph hierarchy - Unifies multiple assembly approaches
- Quality preservation - First to maintain quality throughout assembly
- Principled algorithms - Based on proven statistical methods
Practical Benefits
- No parameter tuning - It should just work
- Transparent results - Data-driven accuracy and completeness
- Active development - Regular updates and improvements
Open Source
- Free to use - MIT licensed
- Community driven - Contributions welcome
- Transparent - All algorithms documented
Ready to experience the future of genome assembly? Get started now →