
Best practices

Message Compression

We strongly recommend using compression for your Kafka topics. Compression can result in significant savings in data transfer costs with virtually no performance penalty. To learn more about message compression in Kafka, we recommend starting with this guide.
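Compression is configured on the producer side (or as a topic-level default), not in ClickPipes itself. As an illustration, these are the standard settings on the Java Kafka client and librdkafka-based clients; the choice of `lz4` here is just an example:

```properties
# Producer configuration: compress message batches before sending.
# Supported codecs include gzip, snappy, lz4, and zstd.
compression.type=lz4
```

The same `compression.type` property can also be set as a topic configuration, where the special value `producer` keeps whatever codec the producer used.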

Limitations

Delivery semantics

ClickPipes for Kafka provides at-least-once delivery semantics (one of the most commonly used approaches). We'd love to hear your feedback on delivery semantics via our contact form. If you need exactly-once semantics, we recommend using our official clickhouse-kafka-connect sink.

Authentication

For Apache Kafka protocol data sources, ClickPipes supports SASL/PLAIN authentication with TLS encryption, as well as SASL/SCRAM-SHA-256 and SASL/SCRAM-SHA-512. Depending on the streaming source (Redpanda, MSK, etc.), all or a subset of these authentication mechanisms will be enabled based on compatibility. If your authentication needs differ, please give us feedback.
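For reference, these are the equivalent settings a standard Java Kafka client would use to connect with SASL/SCRAM-SHA-512 over TLS; the username and password are placeholders:

```properties
# Encrypt the connection and authenticate with SCRAM-SHA-512.
security.protocol=SASL_SSL
sasl.mechanism=SCRAM-SHA-512
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \
  username="placeholder-user" \
  password="placeholder-password";
```

In ClickPipes you supply the same credentials through the setup UI rather than a properties file.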

IAM

info

IAM Authentication for the MSK ClickPipe is a beta feature.

ClickPipes supports the following AWS MSK authentication mechanisms:

  • SASL/SCRAM authentication
  • IAM credentials or role-based access

When using IAM authentication to connect to an MSK broker, the IAM role must have the necessary permissions. Below is an example of the required IAM policy for Apache Kafka APIs for MSK:
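The exact policy depends on your cluster, topics, and consumer groups; as a sketch, a read-only consumer policy using the MSK `kafka-cluster` actions might look like the following (all ARNs are placeholders):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["kafka-cluster:Connect", "kafka-cluster:DescribeCluster"],
      "Resource": "arn:aws:kafka:us-east-1:111122223333:cluster/my-cluster/*"
    },
    {
      "Effect": "Allow",
      "Action": ["kafka-cluster:DescribeTopic", "kafka-cluster:ReadData"],
      "Resource": "arn:aws:kafka:us-east-1:111122223333:topic/my-cluster/*"
    },
    {
      "Effect": "Allow",
      "Action": ["kafka-cluster:DescribeGroup", "kafka-cluster:AlterGroup"],
      "Resource": "arn:aws:kafka:us-east-1:111122223333:group/my-cluster/*"
    }
  ]
}
```

Scope the `Resource` ARNs down to the specific topics and consumer groups the ClickPipe will use.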

Configuring a trusted relationship

If you are authenticating to MSK with an IAM role ARN, you will need to add a trusted relationship between your ClickHouse Cloud instance and the role so that the role can be assumed.

note

Role-based access only works for ClickHouse Cloud instances deployed to AWS.
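A trust policy on the MSK-facing role generally takes the following shape; the principal ARN below is a placeholder, to be replaced with the ARN associated with your ClickHouse Cloud instance:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::111122223333:role/placeholder-clickhouse-cloud-role"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
```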

Custom Certificates

ClickPipes for Kafka supports uploading custom certificates for Kafka brokers that use SASL with a public SSL/TLS certificate. You can upload your certificate in the SSL Certificate section of the ClickPipe setup.

note

While we support uploading a single SSL certificate along with SASL for Kafka, SSL with mutual TLS (mTLS) is not supported at this time.

Performance

Batching

ClickPipes inserts data into ClickHouse in batches. This avoids creating too many parts in the database, which can lead to performance issues in the cluster.

Batches are inserted when one of the following criteria has been met:

  • The batch size has reached the maximum size (100,000 rows or 20MB)
  • The batch has been open for a maximum amount of time (5 seconds)

Latency

Latency (defined as the time between the Kafka message being produced and the message being available in ClickHouse) depends on a number of factors (e.g., broker latency, network latency, message size/format). The batching described in the section above will also impact latency. We always recommend testing your specific use case with typical loads to determine the expected latency.

ClickPipes does not provide any guarantees concerning latency. If you have specific low-latency requirements, please contact us.

Scaling

ClickPipes for Kafka is designed to scale horizontally. By default, we create a consumer group with one consumer. This can be changed with the scaling controls in the ClickPipe details view.

ClickPipes provides high availability with an availability-zone-distributed architecture. This requires scaling to at least two consumers.

Regardless of the number of running consumers, fault tolerance is available by design. If a consumer or its underlying infrastructure fails, the ClickPipe will automatically restart the consumer and continue processing messages.