FLaNK Stack Weekly for 14 August 2023
14-August-2023
FLiPN-FLaNK Stack Weekly
Tim Spann @PaaSDev
https://www.threads.net/@tspannhw
https://medium.com/@tspann/subscribe
A lot is going on, and the fast rush towards Fall is starting, with Flink, Kafka, Apache and other conferences throughout North America.
Get your new Apache NiFi for Dummies!
https://www.cloudera.com/campaign/apache-nifi-for-dummies.html
https://ossinsight.io/analyze/tspannhw
CODE + COMMUNITY
Please join my meetup group NJ/NYC/Philly/Virtual.
http://www.meetup.com/futureofdata-princeton/
https://www.meetup.com/futureofdata-newyork/
https://www.meetup.com/futureofdata-philadelphia/
*This is Issue #98*
https://github.com/tspannhw/FLiPStackWeekly
https://www.linkedin.com/pulse/schedule-2023-tim-spann-/
https://www.cloudera.com/solutions/dim-developer.html
Releases
EFM 1.6.0
https://docs.cloudera.com/cem/1.6.0/getting-started/topics/cem-component-support.html
CEM MiNiFi C++ Agent — 1.23.06
https://docs.cloudera.com/cem/1.6.0/release-notes-minifi-cpp/topics/cem-minifi-cpp-agent-updates.html
CEM MiNiFi Java Agent — 1.23.04
https://docs.cloudera.com/cem/1.6.0/release-notes-minifi-java/topics/cem-minifi-java-agent-updates.html
Docs
https://docs.cloudera.com/cem/1.6.0/rest-api-reference/index.html
https://docs.cloudera.com/cem/1.6.0/using-cem/topics/cem-agent-deployer-securing-agents.html
https://docs.cloudera.com/cem/latest/installation/topics/cem-set-encryption-password.html
Videos
https://www.youtube.com/watch?v=zEGffUz1jKo
https://www.youtube.com/watch?v=rQo3Pk5smz8
https://www.youtube.com/watch?v=0G98z_fs_SQ&t=605s&ab_channel=DataScienceFestival
https://www.youtube.com/watch?v=JdsY5p1GZ38&t=29s&ab_channel=DatainMotion
https://www.youtube.com/watch?v=nuS3X5DxFWM&ab_channel=DatainMotion
Articles
https://medium.com/@tspann/hbase-to-hbase-via-apache-nifi-d3d1d674eab2
https://medium.com/geekculture/decision-making-with-linked-data-event-streams-and-powerbi-5cd8379d32
https://medium.com/@samuel.vanackere/linked-data-event-streams-explained-in-8-minutes-e1c76d077bb9
https://hilla.dev/blog/ai-chatbot-in-java/
https://www.linkedin.com/posts/nicholasrenotte_watsonx-llms-mlops-activity-7093359957890240512-f8RZ/
https://cloudinfrastructure.substack.com/p/introducing-the-redpoint-open-source
https://www.loicmathieu.fr/wordpress/en/informatique/java-21-quoi-de-neuf/
https://kevinbtalbert.github.io/iceberg/nifi/nifi-iceberg/
Free Stuff
For anyone who needs to upgrade Java or escape from potential liabilities, this is the guide. It also provides some helpful insights for any Java developer or anyone developing on top of current or future JVMs.
https://www.azul.com/openjdk-migration-for-dummies/
https://www.cloudera.com/campaign/apache-nifi-for-dummies.html
Throw Back Articles
https://github.com/apache/kudu/blob/master/examples/quickstart/impala/README.adoc
https://medium.com/@nifi.notes/building-an-effective-nifi-flow-replacetext-60a6016d378c
https://community.cloudera.com/t5/Community-Articles/Running-DNS-and-Domain-Scanning-Tools-From-Apache-NiFi/ta-p/248484
https://community.cloudera.com/t5/Community-Articles/Using-Cloudera-Data-Science-Workbench-with-Apache-NiFi-and/ta-p/249469
https://community.cloudera.com/t5/Community-Articles/Scanning-Documents-into-Data-Lakes-via-Tesseract-MQTT-Python/ta-p/248492
https://community.cloudera.com/t5/Community-Articles/Adding-Stanford-CoreNLP-To-Big-Data-Pipelines-Apache-NiFi-1/ta-p/249378
https://community.cloudera.com/t5/Community-Articles/Using-Apache-NiFi-for-Speech-Processing-Speech-to-Text-with/ta-p/249242
https://community.cloudera.com/t5/Community-Articles/Ingesting-Flight-Data-ADS-B-USB-Receiver-with-Apache-NiFi-1/ta-p/247940
https://community.cloudera.com/t5/Community-Articles/Integrating-lucene-geo-gazetteer-For-Geo-Parsing-with-Apache/ta-p/247993
https://community.cloudera.com/t5/Community-Articles/Creating-WordClouds-From-DataFlows-with-Apache-NiFi-and/ta-p/246605
https://community.cloudera.com/t5/Community-Articles/NIFI-1-x-For-Automatic-Music-Playing-Pipelines/ta-p/247994
https://community.cloudera.com/t5/Community-Articles/Using-Apache-NiFi-with-Apache-MXNet-GluonCV-for-YOLO-3-Deep/ta-p/248979
https://community.cloudera.com/t5/Community-Articles/Tracking-Air-Quality-with-HDP-and-HDF-Part-1-Apache-NiFi/ta-p/248265
https://community.cloudera.com/t5/Community-Articles/Monitoring-Energy-Usage-Utilizing-Apache-NiFi-Python-Apache/ta-p/247525
https://community.cloudera.com/t5/Community-Articles/Using-Command-Line-Security-Tools-from-Apache-NiFi/ta-p/248158
https://community.cloudera.com/t5/Community-Articles/Apache-NiFi-Processor-for-Apache-MXNet-SSD-Single-Shot/ta-p/249240
https://community.cloudera.com/t5/Community-Articles/Ingesting-Apache-MXNet-Gluon-Deep-Learning-Results-Via-MQTT/ta-p/248544
https://community.cloudera.com/t5/Community-Articles/Updating-The-Apache-OpenNLP-Community-Apache-NiFi-Processor/ta-p/248398
https://community.cloudera.com/t5/Community-Articles/Integration-Apache-OpenNLP-1-8-4-into-Apache-NiFi-1-5-For/ta-p/248010
https://community.cloudera.com/t5/Community-Articles/Tracking-Phone-Location-for-Android-and-IoT-with-OwnTracks/ta-p/244875
https://community.cloudera.com/t5/Community-Articles/Ingesting-Drone-Data-From-Ryze-Tello-Part-1-Setup-and/ta-p/249422
https://community.cloudera.com/t5/Community-Articles/Ingesting-RDBMS-Data-As-New-Tables-Arrive-Automagically-into/ta-p/246214
https://community.cloudera.com/t5/Community-Articles/Incrementally-Streaming-RDBMS-Data-to-Your-Hadoop-DataLake/ta-p/247927
https://community.cloudera.com/t5/Community-Articles/Ingesting-and-Analyzing-Street-Camera-Data-from-Major-US/ta-p/249194
https://community.cloudera.com/t5/Community-Articles/Basic-Image-Processing-and-Linux-Utilities-As-Part-of-a-Big/ta-p/249121
https://community.cloudera.com/t5/Community-Articles/Hosting-and-Ingesting-Data-From-Web-Pages-Desktop-and-Mobile/ta-p/244575
https://community.cloudera.com/t5/Community-Articles/QADCDC-Our-how-to-ingest-some-database-tables-to-Hadoop-Very/ta-p/245229
https://community.cloudera.com/t5/Community-Articles/Tracking-Air-Quality-with-HDP-and-HDF-Part-2-Indoor-Air/ta-p/249471
https://community.cloudera.com/t5/Community-Articles/Streaming-Ingest-of-Google-Sheets-with-HDF-2-0/ta-p/247764
https://community.cloudera.com/t5/Community-Articles/Ingesting-Golden-Gate-Records-From-Apache-Kafka-and/ta-p/247557
https://community.cloudera.com/t5/Community-Articles/Data-Processing-Pipeline-Parsing-PDFs-and-Identifying-Names/ta-p/249105
https://community.cloudera.com/t5/Community-Articles/Using-A-TensorFlow-quot-Person-Blocker-quot-With-Apache-NiFi/ta-p/248141
https://community.cloudera.com/t5/Community-Articles/Su-Su-Sussudio-Sudoers-Log-Parsing-with-Apache-NiFi/ta-p/249461
https://community.cloudera.com/t5/Community-Articles/Integrating-IBM-Watson-Machine-Learning-APIs-with-Apache/ta-p/247545
https://community.cloudera.com/t5/Community-Articles/Simple-Change-Data-Capture-CDC-with-SQL-Selects-via-Apache/ta-p/308376
https://community.cloudera.com/t5/Community-Articles/Deep-Learning-IoT-Workflows-with-Raspberry-Pi-MQTT-MXNet/ta-p/249456
https://community.cloudera.com/t5/Community-Articles/Parsing-Web-Pages-for-Images-with-Apache-NiFi/ta-p/248415
https://community.cloudera.com/t5/Community-Articles/Trigger-SonicPi-Music-Via-Apache-NiFi/ta-p/248587
https://community.cloudera.com/t5/Community-Articles/Using-Parsey-McParseFace-Google-TensorFlow-Syntaxnet-From/ta-p/246337
https://community.cloudera.com/t5/Community-Articles/Ingesting-osquery-Into-Apache-Phoenix-using-Apache-NiFi/ta-p/249308
https://community.cloudera.com/t5/Community-Articles/Converting-PowerPoint-Presentations-into-French-from-English/ta-p/248974
https://community.cloudera.com/t5/Community-Articles/Posting-Images-with-Apache-NiFi-1-7-and-a-Custom-Processor/ta-p/249017
https://community.cloudera.com/t5/Community-Articles/Parsing-Any-Document-with-Apache-NiFi-1-5-with-Apache-Tika/ta-p/247672
Events
https://attend.cloudera.com/ameropendatalakehousewithcdpon?lid=7vxyhds3tlv7
August 23, 2023: NYC. AI.
https://www.aicamp.ai/event/eventdetails/W2023082314
September 26–27, 2023: Current Event. San Jose, California.
https://www.confluent.io/events/current/
October 7–10, 2023: Halifax, Canada. Community over Code.
https://communityovercode.org/
October 8, 2023: Streaming Track, Room 102
https://communityovercode.org/schedule/#Oct8
https://communityovercode.org/schedule-list/#SG007
https://communityovercode.org/schedule-list/#SG011
October 10, 2023: Internet of Things Track, Room 109
https://communityovercode.org/schedule/#Oct10
https://communityovercode.org/schedule-list/#IOT001
October 18, 2023: 2-Hours to Data Innovation: Data Flow
https://www.cloudera.com/about/events/hands-on-lab-series-2-hours-to-data-innovation.html
November 2, 2023: Evolve. NYC
https://www.cloudera.com/about/events/evolve/new-york.html#register
November 8, 2023: Flink Forward, Seattle.
https://www.flink-forward.org/seattle-2023
November 22, 2023: Big Data Conference. Hybrid
Cloudera Events
https://www.cloudera.com/about/events.html
More Events:
https://www.linkedin.com/pulse/schedule-2023-tim-spann-/
Code
- https://github.com/tspannhw/FLaNK-Edge/tree/main
- https://github.com/tspannhw/FLaNK-Edge-Models
- https://github.com/tspannhw/FLaNK-HTAP
Tools
- https://github.com/fluent/fluent-bit
- https://github.com/jdb78/pytorch-forecasting
- https://www.symmetricds.org/
- https://github.com/cloudera/CML_AMP_MLflow_Tracking
- https://github.com/ogakulov/CML_AMP_Churn_Prediction_mlflow
- https://github.com/Wisser/Jailer
- https://flatdraw.com/
- https://google.github.io/typograms/#installation
- https://github.com/mukel/llama2.java
- https://meoweler.com/
- https://github.com/karpathy/llama2.c
- https://github.com/simonmesmith/agentflow
- https://stability.ai/blog/stablecode-llm-generative-ai-coding
- https://dukope.itch.io/lcd-please
- https://github.com/feldera/dbsp
- https://github.com/libsql/libsql
- https://github.com/xusenlinzy/api-for-open-llm
- https://platform.openai.com/docs/guides/embeddings/what-are-embeddings
- https://github.com/openai/chatgpt-retrieval-plugin
- https://www.pythongasm.com/build-gpt-powered-chatbots-around-enterprise-data-with-python
- https://github.com/langgenius/dify
- https://github.com/languagetool-org/languagetool
- https://github.com/alipay/fury
- https://github.com/morph-labs/rift
- https://github.com/teaxyz/cli
- https://github.com/Aiven-Open/sql-cli-for-apache-flink-docker/releases/tag/1.17.1
- https://github.com/Aiven-Open/jdbc-connector-for-apache-kafka
- https://towhee.io/tasks/detail/pipeline/retrieval-augmented-generation
- https://github.com/towhee-io/towhee
- https://github.com/jason-jz-zhu/databathing
- https://github.com/microsoft/Llama-2-Onnx
- https://www.mlexpert.io/prompt-engineering/langchain-quickstart
- https://foojay.io/today/pi4j-operating-system-for-java-development-on-raspberry-pi/
- https://www.cambioml.com/pykoi/
- https://github.com/BerriAI/litellm/blob/main/cookbook/proxy-server/readme.md
- https://litellm.ai/
- https://redpoint.metabaseapp.com/public/dashboard/5e802588-cc2c-489c-a2f3-283d6c3cd298
- https://github.com/HumanSignal/label-studio
- https://github.com/daefresh/awesome-data-temporality
© 2020–2023 Tim Spann