Welcome!
Instructions on how to use the platform and connect to its services.
Documents
Learn basic programming skills, including the DevOps culture and its set of tools.
Documents
Available datasets used for learning, with their associated descriptions and schemas.
Documents
Learn how to use the Big Data components of the Hadoop ecosystem, ranging from HDFS and Hive to Spark.
Documents
Learn how to use Kubernetes, Helm and persistent storage.
Documents
Latest news
- Security — Feb 2, 2022
A new security page is online covering password manager usage and SSH key generation.
- Learn more — Jun 7, 2021
The new learn more page is a collection of links that we find useful for acquiring additional knowledge on various topics.
- Spark Structured Streaming — Apr 30, 2021
Learn and practice the Spark Structured Streaming API. The documentation presents the advantages of the Spark Structured Streaming API and compares it with the Spark Streaming API. You can then practice the API with a tutorial that creates a simple application which consumes data from Kafka, ...
- Kafka Streams — Apr 19, 2021
Consume and produce Kafka streams in Java. The Kafka Streams lesson illustrates how to build and execute a simple yet working Java application with Maven.
- Kubernetes onboarding — Apr 9, 2021
The Onboarding section is enriched with a new page to get you started with Kubernetes. It explains how to configure a working Kubernetes environment and how to authenticate yourself.
- Kafka pages — Jan 8, 2021
The Kafka pages are up. The Kafka Basics page shows the basic commands, and the Kafka Tutorial page is a guide to creating and using a topic.
- Datasets pages — Dec 18, 2020
The datasets pages are up. Check out the available datasets used for learning, with their associated descriptions and schemas.
- New components page — Sep 3, 2020
The components page is up. It lists all the components available on the platform as well as useful information such as the connection interfaces.
- New Elasticsearch and ELK section — Jul 7, 2020
A new tutorial describes how you can install the ELK Stack (Elasticsearch, Logstash, and Kibana) with the official Helm Chart on our Kubernetes cluster.
- New Kubernetes section — Jul 2, 2020
Kubernetes is the go-to platform to run containerized apps, microservices and workflows in cluster mode. Designed by Google and maintained by the Cloud Native Computing Foundation, it helps users deploy, operate and scale their containers. In this section, you will learn the basics of Kubernetes and how to use it with tools like kubectl or Helm.
- Available datasets — Jun 20, 2020
"Data is the new oil" was a popular saying a few years ago, when the concept of Big Data was emerging. To help you work with different types of data, we provide several datasets with various characteristics.
- Learn YARN — Jun 18, 2020
YARN is at the heart of Hadoop’s architecture. It allows various data processing workloads, such as SQL, real-time streaming and batch processing, to run concurrently in a scheduled way. Before learning the applications of the Hadoop ecosystem, you need to understand how computing resources are distributed across the Big Data cluster.
- Learn to master regular expressions — Jun 6, 2020
Regular expressions are part of the Swiss army knife any software engineer should have at their disposal. The sooner you learn them, the sooner you will use them in your day-to-day work. Typical uses include extracting pertinent information from a file, refactoring documents and source code, and filtering and transforming records in a dataset.
- Kerberos onboarding instructions — Jun 5, 2020
A new page describes what Kerberos is and how to use it to access the various services offered by the platform. It provides detailed instructions on how to set it up on Windows and other operating systems, how to create a ticket with a password or a keytab, and how to access Kerberos-protected services over HTTP.
- New theme — May 28, 2020
The website comes with a new design. It is not perfect yet, but it is certainly much nicer than the previous theme. There is still plenty of room for improvement. The next steps include getting the drawer to work on mobile and improving the layout once the user signs in.
- New onboarding section — May 26, 2020
The onboarding section is online. It provides an overview of the available components. It also details how to connect to the platform by opening the VPN tunnel and creating an SSH connection to the edge node.
- Public website launch — May 25, 2020
We are releasing the website today. For now, the focus is on the content. The short-term goal is to help our students onboard with the cluster and its services and leverage its resources.