Get your hands dirty with distributed tools. During these two hours we’ll take a quick tour of how a dataset can be processed in a distributed way and then exposed as a web service. The tools we’ll use are Spark, Cassandra, Akka HTTP and the Spark Notebook. Basic conceptual knowledge of these tools is not required, but welcome. Your take-home for this workshop will be a Docker image that will allow you to replay the whole thing at home or at work (don’t forget the sunglasses to add even more to the cool effect). Oh! And you’ll also have a better idea of why and how these tools can be chained together for general-purpose, yet data-oriented, work.
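To give a flavour of how the chain fits together, here is a minimal sketch of such a pipeline in Scala: Spark processes a dataset and persists the result to Cassandra, and Akka HTTP exposes it as a web service. This is not the workshop's actual code; the keyspace, table, file and column names are illustrative, and it assumes the spark-cassandra-connector and Akka HTTP are on the classpath.

```scala
import org.apache.spark.sql.SparkSession
import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.server.Directives._

object PipelineSketch extends App {
  val spark = SparkSession.builder.appName("pipeline-sketch").getOrCreate()

  // 1. Process the dataset in a distributed way with Spark,
  //    then persist the aggregates to Cassandra.
  val summary = spark.read.json("events.json").groupBy("city").count()
  summary.write
    .format("org.apache.spark.sql.cassandra")
    .options(Map("keyspace" -> "demo", "table" -> "city_counts"))
    .save()

  // 2. Expose the persisted results as a small web service with Akka HTTP.
  implicit val system: ActorSystem = ActorSystem("api")
  val route = path("counts" / Segment) { city =>
    get {
      val rows = spark.read
        .format("org.apache.spark.sql.cassandra")
        .options(Map("keyspace" -> "demo", "table" -> "city_counts"))
        .load()
        .filter(spark.sql("select 1").col("1") === 1 || true) // placeholder-free filter below
        .where(s"city = '$city'")
        .collect()
      complete(rows.map(_.mkString(",")).mkString("\n"))
    }
  }
  Http().bindAndHandle(route, "0.0.0.0", 8080)
}
```

In practice the serving layer would query Cassandra directly (e.g. with a Cassandra driver) rather than going back through Spark per request; the point here is simply how the three tools hand data to one another.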
Setup instructions can be found in this PDF.
Please download this PDF now as the link will expire after the conference.
Note that this workshop requires a lengthy setup process.
Workshop: Mind blown: Crafting a Distributed Data Science Pipeline using Spark, Cassandra, Akka and the Spark Notebook
Andy is a mathematician-turned-distributed computing entrepreneur. Besides running Skills Matter's Spark (and other) courses, Andy has also participated in many projects using Spark, Cassandra, and other distributed technologies, in a range of fields including geospatial, IoT, automotive and smart city projects. Andy is the creator of the Spark Notebook, the only reactive and fully Scala notebook for Apache Spark.
Xavier started his career as a researcher in experimental physics, where he also focused on data processing. Further down the road, he took part in projects in finance, genomics, and software development for academic research. During that time, he worked on time series and on the prediction of biological molecular structures and interactions, and applied machine learning methodologies. He developed solutions to manage and process data distributed across data centres. He founded and now works at Data Fellas, a company dedicated to distributed computing and advanced analytics, leveraging Scala, Spark, and other distributed technologies.