Best in Flow Competition Tutorials Part 1
Authors: Michael Kohs, George Vetticaden, Timothy Spann
Date: 04/18/2023
Last Updated: 5/1/2023
Useful Data Assets
Setting Your Workload Password
Creating a Kafka Topic
1. Reading and filtering a stream of syslog data
2. Writing critical syslog events to Apache Iceberg for analysis
3. Resize image flow deployed as serverless function
Use Case Walkthrough for Competition
Notice
This document assumes that you have registered for an account, activated it, and logged in to the CDP Sandbox. It is intended only for authorized users who have attended the webinar and read the training materials.
A short guide and references are listed here.
Competition Resources
Login to the Cluster
https://login.cdpworkshops.cloudera.com/auth/realms/se-workshop-5/protocol/saml/clients/cdp-sso
Kafka Broker connection string
oss-kafka-demo-corebroker2.oss-demo.qsm5-opic.cloudera.site:9093,
oss-kafka-demo-corebroker1.oss-demo.qsm5-opic.cloudera.site:9093,
oss-kafka-demo-corebroker0.oss-demo.qsm5-opic.cloudera.site:9093
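Most Kafka clients expect the brokers as a single comma-separated value (the standard client property is `bootstrap.servers`), with no line breaks or trailing commas. The following is a minimal Python sketch of normalizing the list as it is printed above. The helper name `to_bootstrap_servers` is hypothetical and not part of any Cloudera or Kafka API.

```python
# Broker list copied from this guide, including its line breaks
# and trailing commas.
RAW_BROKERS = """
oss-kafka-demo-corebroker2.oss-demo.qsm5-opic.cloudera.site:9093,
oss-kafka-demo-corebroker1.oss-demo.qsm5-opic.cloudera.site:9093,
oss-kafka-demo-corebroker0.oss-demo.qsm5-opic.cloudera.site:9093
"""


def to_bootstrap_servers(raw: str) -> str:
    """Collapse a multi-line broker list into one comma-separated string.

    Hypothetical helper for illustration only.
    """
    hosts = [part.strip() for part in raw.replace("\n", ",").split(",")]
    # Drop the empty entries left behind by blank lines and trailing commas.
    return ",".join(host for host in hosts if host)


BOOTSTRAP_SERVERS = to_bootstrap_servers(RAW_BROKERS)
print(BOOTSTRAP_SERVERS)
```

The resulting string can be pasted wherever a flow or client asks for the Kafka brokers.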
Kafka Topics
syslog_json
syslog_avro
syslog_critical
Schema Registry Hostname
oss-kafka-demo-master0.oss-demo.qsm5-opic.cloudera.site
Schema Name
syslog
syslog_avro
syslog_transformed
Syslog Filter Rule
SELECT * FROM FLOWFILE WHERE severity <= 2
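To make the rule concrete, here is a minimal Python sketch of what it selects. Syslog severities run from 0 (Emergency) to 7 (Debug), so `severity <= 2` keeps only Emergency, Alert, and Critical events. The sample records and the `is_critical` helper are hypothetical and only illustrate the predicate; in the flow, the rule itself is applied as SQL.

```python
# Hypothetical sample events; only the "severity" field matters for the rule.
events = [
    {"severity": 0, "body": "kernel panic"},
    {"severity": 2, "body": "disk controller failure"},
    {"severity": 4, "body": "high memory usage"},
    {"severity": 6, "body": "user login"},
]


def is_critical(event: dict) -> bool:
    """Python equivalent of the WHERE clause: severity <= 2."""
    return event["severity"] <= 2


# Equivalent of SELECT * ... WHERE severity <= 2 over the sample events.
critical_events = [event for event in events if is_critical(event)]
print(critical_events)
```

Lower numbers mean higher urgency, which is why the rule uses `<=` rather than `>=`.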
Access Key and Private Key for Machine User in DataFlow Function
Access Key: eda9f909-d1c2-4934-bad7-95ec6e326de8
Private Key: eon6eFzLlxZI/gpU0dWtht21DI60MkSQZjIzeWSGBSI=
The keys above are needed if you want to deploy a DataFlow Function that you build during the Best in Flow Competition.
Your Workload User Name and Password
Click your name in the bottom-left corner of the screen to open a menu.
Click Profile to go to your user profile page, which contains important information.
If your Workload Password is not shown as currently set, or if you have forgotten it, follow the steps below to reset it. Your user ID is shown above under Workload User Name.
Setting Workload Password
You need to set a workload password, which is used to access non-SSO interfaces. You may read more about it here. Keep the password somewhere safe. If you forget it, you can repeat this process and set a new one.
From the Home Page, click your User Name (for example, tim) in the lower-left corner.
Click on the Profile option.
Click the Set Workload Password option.
Enter a suitable Password and Confirm Password.
Click the Set Workload Password button.
Confirm that the message Workload password is currently set appears; it may also be shown in parentheses next to Workload Password. Save both the password you set and your Workload User Name for later use.
Create a Kafka Topic
The tutorials require you to create an Apache Kafka topic to send your data to. The steps below show how to create that topic. You will also need these steps to create topics for any custom applications you build for the competition.
Navigate to Data Hub Clusters from the Home Page
Info: You can always navigate back to the home page by clicking the app switcher icon at the top left of your screen.
Navigate to the oss-kafka-demo cluster
Navigate to Streams Messaging Manager
Info: Streams Messaging Manager (SMM) is a tool for working with Apache Kafka.
Now that you are in SMM, click the Topic button, the round icon third from the top.
You are now in the Topic browser.
Click Add New to build a new topic.
Enter the name of your topic, prefixed with your Workload User Name, for example <<replace_with_userid>>_syslog_critical.
Create it with these settings: 3 partitions, cleanup.policy set to delete, and availability set to maximum, as shown above.
After successfully creating a topic, close the tab that opened when navigating to Streams Messaging Manager
Congratulations! You have built a new topic.
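The naming convention and settings from the steps above can be summarized in a short Python sketch. It does not connect to Kafka or SMM; it only builds the topic name and records the settings named in this guide. The `topic_name` helper and the `topic_settings` dictionary are hypothetical, and the availability option chosen in the SMM form is not modeled here.

```python
def topic_name(workload_user: str, base: str = "syslog_critical") -> str:
    """Prefix the base topic name with the Workload User Name.

    Hypothetical helper for illustration only.
    """
    user = workload_user.strip()
    if not user:
        raise ValueError("workload_user must not be empty")
    return f"{user}_{base}"


# Settings named in this guide; "cleanup.policy" is the Kafka
# topic-level configuration key.
topic_settings = {
    "partitions": 3,
    "cleanup.policy": "delete",
}

name = topic_name("tim")  # "tim" is the example user name used in this guide
print(name, topic_settings)
```

Prefixing every topic with your own user name keeps your topics separate from those of other participants on the shared cluster.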