On 31 January 2017, the BDE Team announced the second public release of the open source BigDataEurope Integrator Platform (BDI). The platform aims to facilitate and simplify the use of big data technologies by providing easy-to-use interfaces and deployment options.
Within this framework, the BDE Project offers practical solutions for big data challenges, as identified and elicited in the context of the seven H2020 societal challenges, which constitute BDE’s Big Data focus areas.
Over recent months, the BDE Platform has been upgraded and currently uses Docker 1.12 (based on open standards), thus benefiting from the new features added to Docker. For instance, Docker Engine now offers multi-host and multi-container orchestration that is simple to use and accessible to everyone. Docker 1.12 networking plays a key role in enabling these orchestration features. In this new release, BDI uses the following:
- Swarm-mode networking
- Routing Mesh
- Ingress and Internal Load-Balancing
- Service Discovery
- Multi-host networking with integrated KV-Store
- Fault tolerance
As a result, multiple containers can now be created on multiple nodes using Docker Compose. Docker Compose V2 and Docker Swarm aim for full integration, which means that a Compose app can be pointed at a Swarm cluster and used in the same way as on a single Docker host.
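To make the idea of a multi-node Compose app more concrete, the following is a minimal sketch in Python that assembles a Compose file (format version 2) with two services attached to an overlay network and serialises it as JSON, which Compose accepts because JSON is valid YAML. The service names, image names and the network name `bde-net` are hypothetical placeholders, not the images shipped with BDI, and the helper `build_compose` is an illustration only.

```python
import json


def build_compose(services, network="bde-net"):
    """Return a Compose file (format version 2) as a plain dict.

    `services` maps a service name to a tuple (image, depends_on),
    where depends_on is a list of other service names.
    """
    compose = {
        "version": "2",
        "services": {},
        # An overlay network lets containers on different nodes reach each other.
        "networks": {network: {"driver": "overlay"}},
    }
    for name, (image, depends_on) in services.items():
        service = {"image": image, "networks": [network]}
        if depends_on:
            service["depends_on"] = list(depends_on)
        compose["services"][name] = service
    return compose


# Hypothetical service and image names, used only for illustration.
example = build_compose({
    "master": ("example/spark-master", []),
    "worker": ("example/spark-worker", ["master"]),
})
compose_json = json.dumps(example, indent=2, sort_keys=True)

if __name__ == "__main__":
    print(compose_json)
```

Assuming the Docker client environment already points at a Swarm manager, the generated file could be saved as, for example, `docker-compose.json` and started with `docker-compose -f docker-compose.json up -d`, just as on a single Docker host.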
The resulting BDI remains easy to deploy, easy to use and adaptable (cluster-based and standalone) for the execution of big data frameworks and tools. The BDE Team has provided baseline Docker images for Apache Hadoop, Apache Spark, Apache Flink and many others. These components were selected based on the requirements gathered from the participating Societal Challenges. The Platform thus makes it feasible to perform a great variety of big data tasks, including message passing (Kafka, Flume), storage (Hive, Cassandra) and publishing (GeoTriples).
The Platform is tested through the use cases related to each of the Societal Challenges, where the Docker components provided by BDI are used to implement the desired workflow.
The BDE Team has also developed its own components, such as the Integrator UI and an Init daemon, which allows the creation of workflows and the monitoring of the start-up status of inter-dependent Docker components. The Pipeline service and Pipeline Builder are developed to support the creation of workflows. Furthermore, there is a Pipeline Monitor frontend, which displays the current status of Docker components. The Swarm UI helps to visualise the status of your swarm cluster...
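The notion of inter-dependent components with a start-up status can be illustrated with a small Python sketch. This is not the Init daemon's implementation or API; the component names, the `pipeline` mapping and both helper functions are hypothetical and only show one way a start-up order and the set of components ready to start could be derived from declared dependencies.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def startup_order(dependencies):
    """Return an order in which each component comes after its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


def ready_to_start(dependencies, finished):
    """Return components not yet finished whose dependencies have all finished."""
    done = set(finished)
    return sorted(
        name
        for name, deps in dependencies.items()
        if name not in done and set(deps) <= done
    )


# Hypothetical pipeline: each component maps to the components it waits for.
pipeline = {
    "namenode": [],
    "datanode": ["namenode"],
    "spark-master": ["namenode"],
    "spark-worker": ["spark-master"],
}
order = startup_order(pipeline)

if __name__ == "__main__":
    print(order)
    print(ready_to_start(pipeline, {"namenode"}))
```

A monitoring frontend could call something like `ready_to_start` each time a component reports that it has finished starting, and show the remaining components as waiting.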