I have the Spark cluster already set up, so this is only about setting up a fresh Raspberry Pi 3 as a worker in the cluster.
This is based on setting-up-a-standalone-apache-spark-cluster-of-raspberry-pi-2.
Installing Raspbian
Put the SD card (I use a 16 GB Toshiba ADP-HS02) in your card reader.
I then use lsblk to find out the device path of the SD card:
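For example (a sketch; device names and sizes will differ on your machine):
lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda      8:0    0 465.8G  0 disk
└─sda1   8:1    0 465.8G  0 part /
sdb      8:16   1  14.9G  0 disk
└─sdb1   8:17   1  14.9G  0 part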
In my case the card shows up as sdb1, so /dev/sdb (the whole device, not the partition) is where I want to write the Raspbian image:
sudo dd bs=1M if=~/Downloads/2016-02-09-raspbian-jessie.img of=/dev/sdb
This can take some time. When it's done, run sync and unmount the SD card.
Physically attaching the Pi to the cluster
Put the SD card in your Pi.
Connect the LAN port to the switch.
Power up the Pi.
Configuring the Pi
ssh into the headnode/master of your cluster (assuming it is a standalone cluster). I gave it the hostname "pi-headnode", so I can access it via:
ssh pi@pi-headnode
You can use nmap to scan the cluster for nodes.
As a reminder: this is a standalone cluster, so it has its own network with its own IP addresses. The headnode is the bridge between the cluster and the main network; from the headnode, both networks can be accessed. Use ifconfig to find out the IP addresses of the networks. eth0 will probably be connected to the cluster and eth1 to the main network, but this depends on your setup. In my case, the IP range of the main network is 192.168.0.*, that of the cluster 192.168.1.*. So to scan the cluster I use:
nmap -sn 192.168.1.0/24
and I see the new node.
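nmap prints a report for each live host. The new Pi shows up with a fresh address; in my case (output will differ) it was:
Nmap scan report for 192.168.1.12
Host is up (0.0023s latency).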
Run the scripts
sh map-network
sh copy-keys
sh upgrade-workers
to integrate the Pi into the cluster and upgrade all Pis. This can take some time.
Instead of upgrade-workers, you could ssh into the new Pi and upgrade it manually with sudo apt-get update && sudo apt-get upgrade.
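The three scripts come from the post linked above. For reference, upgrade-workers boils down to something like this minimal sketch (assuming workers.txt lists one worker IP per line and the keys from copy-keys are already in place):
#!/bin/sh
# upgrade every worker listed in workers.txt, one after the other
while read worker; do
  echo "upgrading $worker"
  ssh "pi@$worker" "sudo apt-get update && sudo apt-get -y upgrade"
done < ~/workers.txt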
The Pi is now part of the cluster, but I like to configure it a bit more:
ssh into the new Pi
start the configuration tool:
sudo raspi-config
and expand the filesystem, set boot to console with automatic login to save RAM, and reduce the RAM reserved for the GPU to 0.
Copy Spark to the cluster
update the slaves file:
cp ~/workers.txt ~/spark-1.6.1-bin-hadoop2.6/conf/slaves
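For reference, the slaves file simply lists one worker per line, as hostname or IP (these addresses are just examples):
192.168.1.10
192.168.1.11
192.168.1.12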
copy Spark to the new node (192.168.1.12 is the address of the new Pi in my case):
scp -r /home/pi/spark-1.6.1-bin-hadoop2.6/ pi@192.168.1.12:/home/pi/spark-1.6.1-bin-hadoop2.6/
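If you ever need to push Spark to all workers at once, a small loop over workers.txt does the job (a sketch, same assumptions as above):
while read worker; do
  scp -r /home/pi/spark-1.6.1-bin-hadoop2.6/ "pi@$worker:/home/pi/"
done < ~/workers.txt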
Start the slaves
spark-1.6.1-bin-hadoop2.6/sbin/start-slaves.sh
This starts all slaves that are not yet running.
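To check that the new node really joined, you can look for the Worker process on it (jps ships with the JDK; 192.168.1.12 is the new Pi from above):
ssh pi@192.168.1.12 jps
This should list a Worker process next to its PID.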
Done!
You will hopefully see the new worker listed in the Spark web UI (served by the master on port 8080 by default).