A friend of mine who is just starting his data engineering career reached out to me because he had trouble setting up Cassandra on his system. So I decided to document the whole process so that other data engineering newbies can use it too.
Basic Requirements for getting started
After you install Python, open your command line, or download the Terminal app from the Microsoft Store.
You can use the guide provided by Apache here
Open the terminal.
After setting up Docker, pull the Cassandra image with this command:
docker pull cassandra:latest
The Cassandra image is now downloaded, and you can view it in your Docker UI.
You can create and start a Cassandra container from that image with the following command:
docker run --name cassandra -d cassandra
The container name must come after --name; I decided to name mine cassandra. The -d flag runs the container in the background, and the final cassandra is the name of the image, so both remain unchanged.
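The pieces of that command can be sketched as a small shell function. This is only an illustration: `start_cassandra` is a hypothetical helper name, and the `-p` flag is an addition that the command above does not use. It publishes Cassandra's CQL port (9042) on the host, which is what a cqlsh installed on the host with pip would need in order to reach the container.

```shell
#!/bin/sh
# start_cassandra is a hypothetical helper that wraps the same `docker run`.
start_cassandra() {
    name="${1:-cassandra}"   # container name, the value that follows --name
    port="${2:-9042}"        # host port to publish; 9042 is Cassandra's CQL port
    # -d runs the container in the background (detached).
    # -p HOST:CONTAINER publishes the CQL port (an assumption, not in the post).
    # The last argument is the image that was pulled earlier.
    docker run --name "$name" -d -p "$port:9042" cassandra:latest
}
```

Calling `start_cassandra` with no arguments uses the name cassandra and port 9042; `start_cassandra mydb 19042` would start a container called mydb reachable on host port 19042.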
Verify that your container is running on Docker by using this command:
docker ps
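If you would rather check this from a script than read the table by eye, something like the sketch below could work. `container_is_running` is a hypothetical helper name; it relies on the `--filter` and `--format` options of `docker ps` and assumes the container is called cassandra unless you pass another name.

```shell
#!/bin/sh
# Succeeds (exit status 0) when a running container has exactly the given name.
# container_is_running is a hypothetical helper; "cassandra" is the name used in this post.
container_is_running() {
    name="${1:-cassandra}"
    # --filter name= also matches partial names, so compare each printed name
    # against the whole line (-x) as a fixed string (-F).
    docker ps --filter "name=$name" --format '{{.Names}}' | grep -qxF -- "$name"
}
```

For example, `container_is_running cassandra && echo "up"` prints up only when the container from the previous step is listed as running.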
I ran into several errors at this stage, so I will explain the steps I used to get past them.
Open a new command line and input the following command
pip install cqlsh
Then use the following command to open a shell inside the container:
docker exec -it cassandra bash
You will see a root prompt ending in /#.
Input the following command after /#:
apt-get update
Next you input this command
apt-get install -y python3-pip
After the installation finishes, input this command to leave the container shell:
exit
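The same in-container steps can also be run without opening an interactive shell. The sketch below is one possible way to do it, not a required step: `install_pip_in_container` is a hypothetical helper name, and it assumes the container is named cassandra as above.

```shell
#!/bin/sh
# install_pip_in_container is a hypothetical helper name.
install_pip_in_container() {
    name="${1:-cassandra}"   # container name given to --name
    # sh -c runs both apt-get steps inside the container in one go;
    # && skips the install if updating the package index fails.
    docker exec "$name" sh -c 'apt-get update && apt-get install -y python3-pip'
}
```

Running `install_pip_in_container` does the update and install in one call and returns you to your own prompt, so there is no separate exit step.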
Input this command to start cqlsh inside the container:
docker exec -it cassandra cqlsh
You should see the cqlsh> prompt, indicating that Cassandra is set up and accepting connections.
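Cassandra can take a little while to start after the container is created, and cqlsh reports a connection error until it is ready. A retry loop along these lines is one way to wait for it. `wait_for_cassandra` is a hypothetical helper, and the attempt count and delay are assumptions you can adjust; it uses the `-e` option of cqlsh to run a single statement and exit.

```shell
#!/bin/sh
# Retry a trivial cqlsh command until Cassandra answers or the attempts run out.
# wait_for_cassandra is a hypothetical helper; the default values are assumptions.
wait_for_cassandra() {
    name="${1:-cassandra}"   # container name given to --name
    attempts="${2:-30}"      # how many times to try
    delay="${3:-5}"          # seconds to sleep between tries
    i=1
    while [ "$i" -le "$attempts" ]; do
        # cqlsh -e runs one statement and exits; a non-zero status means not ready yet.
        if docker exec "$name" cqlsh -e 'DESCRIBE KEYSPACES' >/dev/null 2>&1; then
            echo "Cassandra is ready (attempt $i)"
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    echo "Cassandra did not answer after $attempts attempts" >&2
    return 1
}
```

For example, `wait_for_cassandra cassandra && docker exec -it cassandra cqlsh` opens the prompt only once the server responds.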
To get the result above, I followed Steps 3, 4, 6 and 7 of the guide on the Apache Cassandra page.