latchbio-integration
LatchBio Integration
Overview
Latch is a Python framework for building and deploying bioinformatics workflows as serverless pipelines. It is built on Flyte: you define workflows with the @workflow and @task decorators, manage cloud data with LatchFile and LatchDir, configure per-task resources, and integrate existing Nextflow or Snakemake pipelines.
Core Capabilities
The Latch platform provides four main areas of functionality:
1. Workflow Creation and Deployment
- Define serverless workflows using Python decorators
- Support for native Python, Nextflow, and Snakemake pipelines
- Automatic containerization with Docker
- Auto-generated no-code user interfaces
- Version control and reproducibility
2. Data Management
- Cloud storage abstractions (LatchFile, LatchDir)
- Structured data organization with Registry (Projects → Tables → Records)
- Type-safe data operations with links and enums
- Automatic file transfer between local and cloud
- Glob pattern matching for file selection
3. Resource Configuration
- Pre-configured task decorators (@small_task, @large_task, @small_gpu_task, @large_gpu_task)
- Custom resource specifications (CPU, memory, GPU, storage)
- GPU support (K80, V100, A100)
- Timeout and storage configuration
- Cost optimization strategies
4. Verified Workflows
- Production-ready pre-built pipelines
- Bulk RNA-seq, DESeq2, pathway analysis
- AlphaFold and ColabFold for protein structure prediction
- Single-cell tools (ArchR, scVelo, emptyDropsR)
- CRISPR analysis, phylogenetics, and more
Quick Start
Installation and Setup
```bash
# Install Latch SDK
python3 -m uv pip install latch
# Login to Latch
latch login
# Initialize a new workflow
latch init my-workflow
# Register workflow to platform
latch register my-workflow
```
**Prerequisites:**
- Docker installed and running
- Latch account credentials
- Python 3.8+
Basic Workflow Example
```python
from pathlib import Path

from latch import small_task, workflow
from latch.types import LatchFile


@small_task
def process_file(input_file: LatchFile) -> LatchFile:
    """Process a single file"""
    # Accessing local_path downloads the file into the task container.
    local_input = Path(input_file.local_path)
    local_output = local_input.with_name(f"processed_{local_input.name}")
    # Placeholder processing logic: copy the input unchanged.
    local_output.write_bytes(local_input.read_bytes())
    # The remote destination is illustrative; pick a path in your workspace.
    return LatchFile(str(local_output), f"latch:///processed/{local_output.name}")


@workflow
def my_workflow(input_file: LatchFile) -> LatchFile:
    """
    My bioinformatics workflow

    Args:
        input_file: Input data file
    """
    return process_file(input_file=input_file)
```
When to Use This Skill
This skill should be used when encountering any of the following scenarios:
Workflow Development:
- "Create a Latch workflow for RNA-seq analysis"
- "Deploy my pipeline to Latch"
- "Convert my Nextflow pipeline to Latch"
- "Add GPU support to my workflow"
- Working with `@workflow`, `@task` decorators
Data Management:
- "Organize my sequencing data in Latch Registry"
- "How do I use LatchFile and LatchDir?"
- "Set up sample tracking in Latch"
- Working with `latch:///` paths
Resource Configuration:
- "Configure GPU for AlphaFold on Latch"
- "My task is running out of memory"
- "How do I optimize workflow costs?"
- Working with task decorators
Verified Workflows:
- "Run AlphaFold on Latch"
- "Use DESeq2 for differential expression"
- "Available pre-built workflows"
- Using the `latch.verified` module
Detailed Documentation
This skill includes comprehensive reference documentation organized by capability:
references/workflow-creation.md
Read this for:
- Creating and registering workflows
- Task definition and decorators
- Supporting Python, Nextflow, Snakemake
- Launch plans and conditional sections
- Workflow execution (CLI and programmatic)
- Multi-step and parallel pipelines
- Troubleshooting registration issues
Key topics:
- `latch init` and `latch register` commands
- `@workflow` and `@task` decorators
- LatchFile and LatchDir basics
- Type annotations and docstrings
- Launch plans with preset parameters
- Conditional UI sections
references/data-management.md
Read this for:
- Cloud storage with LatchFile and LatchDir
- Registry system (Projects, Tables, Records)
- Linked records and relationships
- Enum and typed columns
- Bulk operations and transactions
- Integration with workflows
- Account and workspace management
Key topics:
- `latch:///` path format
- File transfer and glob patterns
- Creating and querying Registry tables
- Column types (string, number, file, link, enum)
- Record CRUD operations
- Workflow-Registry integration
references/resource-configuration.md
Read this for:
- Task resource decorators
- Custom CPU, memory, GPU configuration
- GPU types (K80, V100, A100)
- Timeout and storage settings
- Resource optimization strategies
- Cost-effective workflow design
- Monitoring and debugging
Key topics:
- `@small_task`, `@large_task`, `@small_gpu_task`, `@large_gpu_task`
- `@custom_task` with precise specifications
- Multi-GPU configuration
- Resource selection by workload type
- Platform limits and quotas
references/verified-workflows.md
Read this for:
- Pre-built production workflows
- Bulk RNA-seq and DESeq2
- AlphaFold and ColabFold
- Single-cell analysis (ArchR, scVelo)
- CRISPR editing analysis
- Pathway enrichment
- Integration with custom workflows
Key topics:
- `latch.verified` module imports
- Available verified workflows
- Workflow parameters and options
- Combining verified and custom steps
- Version management
Common Workflow Patterns
Complete RNA-seq Pipeline
```python
from latch import large_task, small_task, workflow
from latch.types import LatchDir, LatchFile


@small_task
def quality_control(fastq: LatchFile) -> LatchFile:
    """Run FastQC, then return the reads to pass on to alignment"""
    # Placeholder: invoke FastQC here.
    raise NotImplementedError("quality_control is a stub")


@large_task
def alignment(fastq: LatchFile, genome: str) -> LatchFile:
    """STAR alignment; returns the aligned BAM"""
    # Placeholder: invoke STAR here.
    raise NotImplementedError("alignment is a stub")


@small_task
def quantification(bam: LatchFile) -> LatchFile:
    """featureCounts; returns the counts table"""
    # Placeholder: invoke featureCounts here.
    raise NotImplementedError("quantification is a stub")


@workflow
def rnaseq_pipeline(
    input_fastq: LatchFile,
    genome: str,
    output_dir: LatchDir,
) -> LatchFile:
    """RNA-seq analysis pipeline"""
    # output_dir is not used yet; pass it to the tasks that write results.
    qc = quality_control(fastq=input_fastq)
    aligned = alignment(fastq=qc, genome=genome)
    return quantification(bam=aligned)
```
GPU-Accelerated Workflow
```python
from latch import large_gpu_task, small_task, workflow
from latch.types import LatchFile


@small_task
def preprocess(input_file: LatchFile) -> LatchFile:
    """Prepare data"""
    # Placeholder: write the prepared data and return it as a LatchFile.
    raise NotImplementedError("preprocess is a stub")


@large_gpu_task
def gpu_computation(data: LatchFile) -> LatchFile:
    """GPU-accelerated analysis"""
    # Placeholder: run the GPU workload and return its results file.
    raise NotImplementedError("gpu_computation is a stub")


@workflow
def gpu_pipeline(input_file: LatchFile) -> LatchFile:
    """Pipeline with GPU tasks"""
    preprocessed = preprocess(input_file=input_file)
    return gpu_computation(data=preprocessed)
```
Registry-Integrated Workflow
```python
from latch import small_task, workflow
from latch.registry.table import Table
from latch.types import LatchFile


def process(input_file: LatchFile) -> str:
    """Placeholder for the analysis step; returns a result summary."""
    raise NotImplementedError("process is a stub")


@small_task
def process_and_track(sample_id: str, table_id: str) -> str:
    """Process sample and update Registry"""
    # Assumed latch.registry interface: Table(id=...), list_records(), update().
    # Verify these calls against your installed SDK version.
    table = Table(id=table_id)
    # Get sample from registry; list_records() yields pages of {record_id: Record}.
    sample = next(
        (
            record
            for page in table.list_records()
            for record in page.values()
            if record.get_values().get("sample_id") == sample_id
        ),
        None,
    )
    if sample is None:
        raise ValueError(f"No record with sample_id {sample_id!r}")
    # Process
    input_file = sample.get_values()["fastq_file"]
    output = process(input_file)
    # Update registry
    with table.update() as updater:
        updater.upsert_record(sample.get_name(), status="completed", result=output)
    return "Success"


@workflow
def registry_workflow(sample_id: str, table_id: str) -> str:
    """Workflow integrated with Registry"""
    return process_and_track(sample_id=sample_id, table_id=table_id)
```
Best Practices
Workflow Design
- Use type annotations for all parameters
- Write clear docstrings (appear in UI)
- Start with standard task decorators, scale up if needed
- Break complex workflows into modular tasks
- Implement proper error handling
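One way to make task failures easier to diagnose is to wrap external tool calls so that a non-zero exit raises an error carrying the tool's stderr. The sketch below uses only the standard library; `run_tool` is a hypothetical helper name, not part of the Latch SDK.

```python
import subprocess
import sys
from typing import Sequence


def run_tool(cmd: Sequence[str]) -> str:
    """Run an external command and return its stdout.

    Raises RuntimeError with the tail of stderr if the command exits
    with a non-zero status, so the cause appears directly in the logs.
    """
    result = subprocess.run(list(cmd), capture_output=True, text=True)
    if result.returncode != 0:
        tail = result.stderr.strip()[-500:]
        raise RuntimeError(
            f"{cmd[0]} exited with code {result.returncode}: {tail}"
        )
    return result.stdout


if __name__ == "__main__":
    # The current Python interpreter stands in for a real bioinformatics tool.
    print(run_tool([sys.executable, "-c", "print('ok')"]))
```

Inside a task, a call such as `run_tool(["fastqc", path])` would then fail loudly instead of returning a missing or partial output.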
Data Management
- Use consistent folder structures
- Define Registry schemas before bulk entry
- Use linked records for relationships
- Store metadata in Registry for traceability
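Defining a schema before bulk entry can be as simple as checking each row locally before it is uploaded. The following is a minimal sketch that does not call the Registry API; the column names, the `Status` enum, and `validate_row` are all hypothetical and only illustrate typed and enum columns.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, List


class Status(Enum):
    PENDING = "pending"
    COMPLETED = "completed"


@dataclass(frozen=True)
class Column:
    name: str
    type: type
    required: bool = True


# Hypothetical schema for a sample-tracking table.
SAMPLE_SCHEMA = [
    Column("sample_id", str),
    Column("fastq_file", str),
    Column("status", Status),
    Column("read_count", int, required=False),
]


def validate_row(row: Dict[str, Any], schema: List[Column]) -> List[str]:
    """Return the problems found in one row (empty list if the row is valid)."""
    problems = []
    known = {col.name for col in schema}
    for col in schema:
        if col.name not in row:
            if col.required:
                problems.append(f"missing required column: {col.name}")
            continue
        if not isinstance(row[col.name], col.type):
            problems.append(
                f"{col.name}: expected {col.type.__name__}, "
                f"got {type(row[col.name]).__name__}"
            )
    # Report columns that the schema does not define.
    for extra in sorted(set(row) - known):
        problems.append(f"unknown column: {extra}")
    return problems
```

Running `validate_row` over every row first means a bulk upload is only attempted once the whole batch is clean.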
Resource Configuration
- Right-size resources (don't over-allocate)
- Use GPU only when algorithms support it
- Monitor execution metrics and optimize
- Design for parallel execution when possible
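Right-sizing amounts to choosing the smallest resource tier that still covers a task's measured needs. The sketch below shows that selection in plain Python; the `Tier` type, the helper name, and especially the CPU and memory numbers are placeholders for illustration, not the actual limits of any Latch decorator.

```python
from typing import NamedTuple, Optional, Sequence


class Tier(NamedTuple):
    name: str
    cpu: int
    memory_gib: int


def smallest_fitting_tier(
    tiers: Sequence[Tier], cpu: int, memory_gib: int
) -> Optional[Tier]:
    """Return the smallest tier that satisfies both requirements, or None."""
    candidates = [t for t in tiers if t.cpu >= cpu and t.memory_gib >= memory_gib]
    # Prefer fewer CPUs first, then less memory, to avoid over-allocating.
    return min(candidates, key=lambda t: (t.cpu, t.memory_gib), default=None)


# Placeholder numbers only; look up the real limits of each task decorator
# in the Latch documentation before relying on them.
EXAMPLE_TIERS = [
    Tier("small", cpu=2, memory_gib=4),
    Tier("medium", cpu=8, memory_gib=32),
    Tier("large", cpu=32, memory_gib=128),
]
```

A `None` result signals that no preset fits and a custom resource specification is needed.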
Development Workflow
- Test locally with Docker before registration
- Use version control for workflow code
- Document resource requirements
- Profile workflows to determine actual needs
Troubleshooting
Common Issues
Registration Failures:
- Ensure Docker is running
- Check authentication with `latch login`
- Verify all dependencies in Dockerfile
- Use `--verbose` flag for detailed logs
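The first registration checks can be automated with a small pre-flight script. This is a sketch using only the standard library; the function names are hypothetical, and it assumes that `docker info` exiting with status 0 means the daemon is reachable.

```python
import shutil
import subprocess
from typing import Iterable, List


def missing_tools(tools: Iterable[str]) -> List[str]:
    """Return the tools from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def docker_daemon_running() -> bool:
    """Return True if `docker info` succeeds, i.e. the daemon is reachable."""
    if shutil.which("docker") is None:
        return False
    result = subprocess.run(
        ["docker", "info"], stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode == 0


if __name__ == "__main__":
    absent = missing_tools(["docker", "latch"])
    if absent:
        print("Not found on PATH:", ", ".join(absent))
    elif not docker_daemon_running():
        print("Docker is installed but the daemon is not running")
    else:
        print("Docker and the latch CLI look ready")
```

The script does not check authentication; run `latch login` separately if registration still fails.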
Resource Problems:
- Out of memory: Increase memory in task decorator
- Timeouts: Increase timeout parameter
- Storage issues: Increase ephemeral storage (`storage_gib`)
Data Access:
- Use correct `latch:///` path format
- Verify file exists in workspace
- Check permissions for shared workspaces
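A quick local check can catch malformed paths before a workflow is launched. The sketch below accepts only the `latch:///` form named above and rejects everything else; `workspace_path` is a hypothetical helper, not the SDK's own validation, and any other URL forms the platform may accept are deliberately out of scope.

```python
PREFIX = "latch:///"


def workspace_path(url: str) -> str:
    """Return the absolute workspace path from a latch:/// URL.

    Raises ValueError if `url` does not start with the latch:/// prefix.
    """
    if not url.startswith(PREFIX):
        raise ValueError(f"expected a path starting with {PREFIX!r}, got {url!r}")
    # Keep the leading slash so the result is an absolute path.
    return url[len(PREFIX) - 1:]
```

A common mistake this catches is writing two slashes instead of three, which turns the first path segment into a host name.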
Type Errors:
- Add type annotations to all parameters
- Use LatchFile/LatchDir for file/directory parameters
- Ensure workflow return type matches actual return
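Missing annotations can be found before registration by inspecting a function's signature. The sketch below uses the standard `inspect` module; `missing_annotations` is a hypothetical helper, and it is assumed to be applied to the plain function, since a decorated task may not expose the same signature.

```python
import inspect
from typing import Callable, List


def missing_annotations(func: Callable) -> List[str]:
    """List parameters of `func` that lack a type annotation.

    The string "return" is appended if the return annotation is missing.
    """
    sig = inspect.signature(func)
    missing = [
        name
        for name, param in sig.parameters.items()
        if param.annotation is inspect.Parameter.empty
    ]
    if sig.return_annotation is inspect.Signature.empty:
        missing.append("return")
    return missing
```

An empty list means every parameter and the return value are annotated; it does not verify that the annotations match what the function actually returns.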
Additional Resources
- Official Documentation: https://docs.latch.bio
- GitHub Repository: https://github.com/latchbio/latch
- Slack Community: Join Latch SDK workspace
- API Reference: https://docs.latch.bio/api/latch.html
- Blog: https://blog.latch.bio
Support
For issues or questions:
- Check documentation links above
- Search GitHub issues
- Ask in Slack community
- Contact support@latch.bio