Install Apache Spark in Ubuntu 18.04

In this post I'll share the steps to install Apache Spark on Ubuntu 18.04. To install Spark, you need Java and Scala installed on your machine.

Ensure Java is installed

java -version

If Java is not installed, follow the steps here to install it

Ensure Scala is installed

scala -version

If Scala is not installed, follow the steps here to install it
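
Both checks can be combined into one small script. This is only a sketch: the helper name missing_tools is my own, and it only tests whether the commands are on PATH, not which versions they are.

```shell
# Print the name of every command from the argument list that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Report what still needs to be installed before continuing.
missing=$(missing_tools java scala)
if [ -n "$missing" ]; then
  echo "Not installed yet:" $missing
else
  echo "java and scala are both installed"
fi
```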

Download Apache Spark from the official page. I'm using version 2.4.3, which is the latest release at the time of writing.

wget http://apachemirror.wuchna.com/spark/spark-2.4.3/spark-2.4.3-bin-hadoop2.7.tgz
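
If you want a different release, the URL can be built from a few variables instead of being edited by hand. A minimal sketch, assuming the mirror keeps the same directory layout for other versions; the helper name spark_tarball_name and the variable names are my own.

```shell
# Build the archive name for a given Spark version and Hadoop build.
spark_tarball_name() {
  echo "spark-$1-bin-hadoop$2.tgz"
}

SPARK_VERSION=2.4.3                           # version used in this post
HADOOP_VERSION=2.7                            # Hadoop build used in this post
MIRROR=http://apachemirror.wuchna.com/spark   # mirror used in this post

TARBALL=$(spark_tarball_name "$SPARK_VERSION" "$HADOOP_VERSION")
URL="$MIRROR/spark-$SPARK_VERSION/$TARBALL"
echo "$URL"
# To download, run: wget "$URL"
```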

Extract the file
tar xvf spark-2.4.3-bin-hadoop2.7.tgz

Move the extracted directory to the location you want. I'm moving it to /usr/local and renaming the directory to spark. Writing to /usr/local needs root privileges, so the command runs with sudo.
sudo mv spark-2.4.3-bin-hadoop2.7 /usr/local/spark
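
The extract and move steps can also be done in one go by unpacking straight into the target directory. A sketch only: the function name extract_spark and the destination in the example are my own choices, and --strip-components drops the top-level spark-2.4.3-bin-hadoop2.7 directory from the archive paths.

```shell
# Unpack a Spark archive directly into a destination directory,
# dropping the top-level directory stored inside the archive.
extract_spark() {
  tarball=$1
  dest=$2
  mkdir -p "$dest" &&
    tar -xzf "$tarball" -C "$dest" --strip-components=1
}

# Example (hypothetical destination under the home directory, so no sudo is needed):
# extract_spark spark-2.4.3-bin-hadoop2.7.tgz "$HOME/spark"
```

If you unpack somewhere other than /usr/local/spark, use that location's bin directory when you set PATH.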

Now the shell needs to know where the Spark binaries are. To do that, add the Spark bin directory to PATH in your .bashrc file. If .bashrc is not present in your home directory, check for .bash_profile. If neither exists, create .bashrc in your home directory with the command "touch ~/.bashrc".

Open .bashrc in any editor and add the following line to it. I use vim for editing.

vim ~/.bashrc

export PATH=$PATH:/usr/local/spark/bin

Save and exit.

Run the source command to load the new PATH into the current terminal.

source ~/.bashrc
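
If you prefer not to open an editor, the line can be appended from the command line. The sketch below uses a helper of my own, add_line_once, which adds the line only when it is not already in the file, so running it twice does not duplicate the entry.

```shell
# Append a line to a file only if that exact line is not already there.
add_line_once() {
  line=$1
  file=$2
  touch "$file"
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Single quotes keep $PATH literal, so it is expanded when .bashrc is read.
# add_line_once 'export PATH=$PATH:/usr/local/spark/bin' ~/.bashrc
```
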
You are all set now. Run the following command to open the Spark shell
spark-shell

If the installation succeeded, you will see output similar to the following

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Spark context Web UI available at http://ip-60-10-10-9.us-west-2.compute.internal:4040
Spark context available as 'sc' (master = local[*], app id = local-1562066148909).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.4.3
      /_/
Using Scala version 2.11.12 (OpenJDK 64-Bit Server VM, Java 1.8.0_212)
Type in expressions to have them evaluated.
Type :help for more information.
scala>
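
To check the installed version without opening the interactive shell, the version number can be pulled out of the banner text. This is a sketch under assumptions: first_version is a helper name of my own, and it simply takes the first number that follows the word "version" in whatever text it is given.

```shell
# Read text on standard input and print the first version number
# that follows the word "version".
first_version() {
  sed -n 's/.*version \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Example (assumes spark-submit is on PATH; 2>&1 because the banner
# may be written to standard error):
# spark-submit --version 2>&1 | first_version
```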

While the Spark shell is running, you can open the Spark web UI on port 4040

IP:4040

Alvin Jaison

I'm Alvin Jaison, a DevOps engineer by profession and a mountaineer by passion. I started this website to share DevOps-related posts. Post your suggestions as a comment. You can reach me at alvinjaison@outlook.com
