Loading...
Loading...
Found 11 Skills
Flink Job Creator - Auto-activating skill for Data Pipelines. Triggers on: flink job creator, flink job creator Part of the Data Pipelines skill category.
Use this skill when building real-time or near-real-time data pipelines. Covers Kafka, Flink, Spark Streaming, Snowpipe, BigQuery streaming, materialized views, and batch-vs-streaming decisions. Common phrases: "real-time pipeline", "Kafka consumer", "streaming vs batch", "low latency ingestion". Do NOT use for batch integration patterns (use integration-patterns-skill) or pipeline orchestration (use data-orchestration-skill).
Set up end-to-end Change Data Capture (CDC) pipelines on Confluent Cloud using Debezium source connectors, Flink for transformation, and Tableflow for data lake integration. Supports JSON_SR, Avro, and Protobuf formats. Handles schemaless topics (plain JSON without SR) and multi-event topics. This skill handles the complete workflow from database to Iceberg/Delta tables. Use this skill when users want to capture database changes and materialize them into Iceberg or Delta Lake tables via Confluent Cloud Tableflow. Trigger phrases include "CDC to Tableflow", "database to Iceberg", "database to Delta Lake", "stream database changes to data lake", "set up Tableflow pipeline", "schemaless topic to Tableflow", or "multi-event topic to Iceberg". Do NOT trigger for general CDC, Debezium, or database replication requests that do not involve Tableflow or Iceberg/Delta Lake as the destination.
Use when user explicitly asks Flink/Ververica/Realtime Compute Console workspace operations: 草稿(draft), SQL校验/执行, 部署(deployment), 作业(job), Session Cluster, namespace, 表(table), 成员(member), 变量(variable), 或 checkpoint timeout 诊断, especially with workspace/deployment/job IDs (w-*, d-*, j-*, sc-*, draft-*). Also use when prompt asks to test/verify Flink Console lifecycle flow, safety guardrails, or parameter validation for these operations. This includes prompts such as create draft, deploy draft, list deployments, start/stop job, create/list session cluster, get tables, list variables. Also use when prompt explicitly asks to run `python scripts/flink_ververica_ops.py` for Flink Console workspace operations. Do not trigger for unrelated "workspace" contexts or generic cloud/platform tasks (ECS, OSS, RDS, Kafka, Spark, Kubernetes, billing, weather). Do not trigger for Flink instance lifecycle operations (create/scale/delete/renew); those belong to alibabacloud-flink-instance-manage.
Use this skill when building real-time data pipelines, stream processing jobs, or change data capture systems. Triggers on tasks involving Apache Kafka (producers, consumers, topics, partitions, consumer groups, Connect, Streams), Apache Flink (DataStream API, windowing, checkpointing, stateful processing), event sourcing implementations, CDC with Debezium, stream processing patterns (windowing, watermarks, exactly-once semantics), and any pipeline that processes unbounded data in motion rather than data at rest.
DataWorks data development Skill. Create, configure, validate, deploy, update, move, and rename nodes and workflows. Manage components, file resources, and UDF functions. Covers 150+ node types: Shell, SQL, Python, DI, Flink, EMR, etc. Supports scheduled and manual workflow orchestration via aliyun CLI or Python SDK. WARNING: Supports mutating operations (Move, Rename) requiring explicit user confirmation. Delete operations are NOT supported by this skill. Triggers: DataWorks, data development nodes, workflows, FlowSpec, scheduling tasks, data integration, ETL pipelines, .spec.json. Also triggers for Alibaba Cloud data development, scheduling node configuration, FlowSpec format, or DI task orchestration.
Build event streaming and real-time data pipelines with Kafka, Pulsar, Redpanda, Flink, and Spark. Covers producer/consumer patterns, stream processing, event sourcing, and CDC across TypeScript, Python, Go, and Java. When building real-time systems, microservices communication, or data integration pipelines.
Guides understanding and working with Apache Beam runners (Direct, Dataflow, Flink, Spark, etc.). Use when configuring pipelines for different execution environments or debugging runner-specific issues.
Manage Alibaba Cloud Real-Time Compute for Apache Flink instances across the full lifecycle, including create/query/scale/renew/convert, namespace operations, tagging, cleanup, and batch execution. Use this skill only when the user explicitly asks to operate Alibaba Cloud Flink instances or their direct child resources in a region; do not trigger for unrelated prompts.
Guides use of AWS messaging and streaming services. Covers Amazon SQS, Amazon SNS, Amazon EventBridge, Amazon MQ, Amazon Kinesis Data Streams, Amazon Data Firehose, Amazon Managed Service for Apache Flink, and Amazon Managed Streaming for Apache Kafka (MSK). Use when implementing messaging and streaming patterns.
Architect, build, and debug Kafka Streams apps (JVM-embedded stream processing). Use when user mentions KStream, KTable, topology, TopologyTestDriver, StreamsBuilder, interactive queries, GlobalKTable, joins/windows/aggregations, or debugging issues (rebalancing, state stores, lag, deserialization errors). Also use when user wants to optimize Kafka Streams for WarpStream or tune Kafka Streams client configuration for WarpStream. Do NOT trigger for Flink, connectors, CDC, or plain producer/consumer.