How to Connect to Kafka from a Remote Machine

Updated: January 31, 2024 By: Guest Contributor

Introduction

Apache Kafka is a powerful distributed event streaming platform, written in Scala and Java, that is widely used to build scalable real-time data pipelines and streaming applications that reliably move data between systems. In this tutorial, we will go over how to connect to Kafka from a remote machine using various clients and tools, with a step-by-step approach and helpful code examples.

Prerequisites

  • A running Kafka cluster accessible over a network
  • A machine with network access to the Kafka cluster
  • Basic understanding of Kafka concepts like broker, topic, producer, and consumer
  • Java Development Kit (JDK) installed
  • Kafka clients or libraries for your programming language of choice

Connecting Kafka with Basic Configuration

Let’s start by establishing a simple producer and consumer connection to a remote Kafka cluster using Java:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

Properties props = new Properties();
// Address of a remote broker; the client discovers the rest of the cluster from it
props.put("bootstrap.servers", "remote-kafka-broker:9092");
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

KafkaProducer<String, String> producer = new KafkaProducer<>(props);
producer.send(new ProducerRecord<>("test", "hello", "world"));
producer.close();

This will send a message with the key “hello” and the value “world” to the “test” topic on the remote Kafka broker. Ensure that the Kafka cluster’s firewall and security group settings allow connections on the configured port (9092 is the default port for Kafka). Also check each broker’s ‘advertised.listeners’ setting: after the initial bootstrap connection, the broker hands clients these addresses, so they must resolve and be reachable from the remote machine.

import java.time.Duration;
import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

Properties props = new Properties();
props.put("bootstrap.servers", "remote-kafka-broker:9092");
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("group.id", "my-group");

KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Arrays.asList("test"));

while (true) {
    // Wait up to 100 ms for new records from the broker
    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
    for (ConsumerRecord<String, String> record : records) {
        System.out.printf("offset = %d, key = %s, value = %s\n", record.offset(), record.key(), record.value());
    }
}

The above code snippet creates a simple Kafka consumer that subscribes to the “test” topic and prints incoming messages to the console. Again, matching the broker’s network settings with your consumer’s configuration is paramount for successful communication.

Advanced Connection Settings

For more advanced use cases or when dealing with security protocols like SSL/TLS, you need to set additional properties:

props.put("security.protocol", "SSL");
props.put("ssl.truststore.location", "path/to/truststore.jks");
props.put("ssl.truststore.password", "truststore-password");
props.put("ssl.keystore.location", "path/to/keystore.jks");
props.put("ssl.keystore.password", "keystore-password");
props.put("ssl.key.password", "key-password");

These properties configure the client to use SSL for encryption and authentication. The truststore holds the certificates used to verify the brokers, while the keystore holds the client’s own certificate and key for mutual TLS. Note that SSL listeners typically run on a separate port (commonly 9093), so update ‘bootstrap.servers’ accordingly.
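To keep the security settings in one place, the SSL options can be layered onto a base client configuration with a small helper. The sketch below is one way to organize this, not a Kafka API; the paths and passwords are placeholders:

```java
import java.util.Properties;

public class SslClientConfig {
    // Copies the base client settings and adds the SSL options on top.
    // All paths and passwords here are placeholders.
    public static Properties withSsl(Properties base,
                                     String truststore, String truststorePassword,
                                     String keystore, String keystorePassword,
                                     String keyPassword) {
        Properties props = new Properties();
        props.putAll(base);
        props.put("security.protocol", "SSL");
        props.put("ssl.truststore.location", truststore);
        props.put("ssl.truststore.password", truststorePassword);
        props.put("ssl.keystore.location", keystore);
        props.put("ssl.keystore.password", keystorePassword);
        props.put("ssl.key.password", keyPassword);
        return props;
    }
}
```

The resulting Properties object can be passed straight to a KafkaProducer or KafkaConsumer constructor in place of the plain configuration shown earlier.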

Using Kafka Tools for Remote Connection

Besides using a client library, you can also use Kafka’s command-line tools to interact with the cluster. Here’s an example of how to produce messages using Kafka’s console producer (recent Kafka releases use --bootstrap-server; older ones use --broker-list):

bin/kafka-console-producer.sh --bootstrap-server remote-kafka-broker:9092 --topic test
>This is a test message
>Another test message

To consume messages, use Kafka’s console consumer:

bin/kafka-console-consumer.sh --bootstrap-server remote-kafka-broker:9092 --topic test --from-beginning

This will consume messages from the beginning of the topic ‘test’ and print them to the console.

Connection Monitoring and Troubleshooting

Monitoring your Kafka connections is essential for maintaining a stable and efficient streaming application. You can use tools like Kafka’s built-in consumer group command:

bin/kafka-consumer-groups.sh --bootstrap-server remote-kafka-broker:9092 --describe --group my-group

For troubleshooting, ensure you have the proper permissions and network configurations. Access control lists (ACLs) or firewall settings may be blocking your connection. Use command-line tools like ‘telnet’ or ‘netcat’ to verify connectivity:

telnet remote-kafka-broker 9092
nc -vz remote-kafka-broker 9092

If the connection is established, it means the network allows traffic to the Kafka broker.
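If neither telnet nor netcat is available, the same reachability check can be done from Java with a plain TCP socket. This is a minimal sketch (the hostname is a placeholder); note that a successful connection only proves the network path is open, not that the broker is healthy:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class BrokerProbe {
    // Attempts a plain TCP connection to host:port within timeoutMs.
    // Returns true if the connection is established, false on timeout or refusal.
    public static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canConnect("remote-kafka-broker", 9092, 3000)
                ? "reachable" : "unreachable");
    }
}
```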

Connecting with Kafka Connect

Kafka Connect is a tool for scalably and reliably streaming data between Apache Kafka and other systems. It can be used to import data from external systems into Kafka topics and to export data from topics out to other systems. Here’s a simple configuration example for a file source connector:

{
  "name": "local-file-source",
  "config": {
    "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
    "tasks.max": "1",
    "file": "/path/to/input/file.txt",
    "topic": "test"
  }
}

To apply this configuration, you’d use Kafka Connect’s REST API:

curl -X POST -H "Content-Type: application/json" --data '@source-connector-config.json' http://remote-kafka-connect:8083/connectors

This sends the connector configuration to the Kafka Connect cluster to start streaming data from the specified file to the ‘test’ topic.
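The same request can also be issued programmatically with the HTTP client built into Java 11+. The sketch below builds the POST from the curl example; the host and port are the same placeholders used above:

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class ConnectClient {
    // Builds the same POST the curl command issues:
    // connector JSON sent to the /connectors endpoint of Kafka Connect's REST API.
    public static HttpRequest connectorRequest(String baseUrl, String configJson) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/connectors"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(configJson))
                .build();
    }
}
```

Sending the request with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()) should return 201 Created when the connector is registered successfully.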

Conclusion

In summary, connecting to Apache Kafka from a remote machine involves configuring the Kafka client with the proper network details and security parameters. With the above explanations and examples, you should now be able to set up your Kafka producer and consumer connections, use Kafka’s tools for message production and consumption, employ Kafka Connect for connecting various systems with Kafka, and monitor and troubleshoot your Kafka infrastructure effectively.