Connect to Cassandra with Python 3.x and get Pandas Dataframe

Carvia Tech | February 13, 2020

In this article we will learn:

  • How to connect to Cassandra in Python

  • How to get output of query as Pandas Dataframe

Connect to Cassandra in Python

We will use the cassandra-driver package (imported as cassandra) to make the connection.

Install library to connect to Cassandra

Create virtual environment in project directory

virtualenv -p python3 venv

Activate virtual environment

source venv/bin/activate


Install package

pip install cassandra-driver
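
Before moving on, it can help to confirm that the packages are visible from the active virtual environment. The following is a small sketch that uses only the standard library; the helper name is_installed is hypothetical. pandas is checked as well because the class in the next section imports it, and it is a separate package (pip install pandas).

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found in this environment."""
    return importlib.util.find_spec(module_name) is not None


# "cassandra" is the import name of the cassandra-driver package.
for name in ("cassandra", "pandas"):
    status = "installed" if is_installed(name) else "missing"
    print("{}: {}".format(name, status))
```

If either line reports "missing", check that the virtual environment is activated and run the pip install command again.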

Define a class for connecting to Cassandra

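from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider
import pandas as pd

# cassandra_config is not defined in the original listing. It is assumed
# here to be a plain dict; every value below is a placeholder.
cassandra_config = {
    'username': 'cassandra',
    'password': 'cassandra',
    'host': '127.0.0.1',
    'keyspace': 'my_keyspace',
}


class CassDao(object):

    def __init__(self):
        self.cluster = None
        self.username = cassandra_config['username']  # (1)
        self.password = cassandra_config['password']  # (2)
        self.host = cassandra_config['host']  # (3)
        self.keyspace = cassandra_config['keyspace']  # (4)
        self.session = self.createSession()

    def __del__(self):
        # Release connections when the object is destroyed. The original
        # body is missing; calling Cluster.shutdown() is an assumption.
        if self.cluster is not None:
            self.cluster.shutdown()

    def createSession(self):
        # Cluster takes a list of contact points (host names or IP addresses).
        self.cluster = Cluster([self.host],
                               auth_provider=PlainTextAuthProvider(username=self.username, password=self.password))
        self.session = self.cluster.connect(self.keyspace)
        return self.session

    def getSession(self):
        return self.session

    def get_data(self):
        # "table" is a placeholder: replace it with your table name.
        # The keyspace is not wrapped in single quotes, which CQL
        # would treat as a string literal.
        query = """SELECT id FROM {}.table""".format(self.keyspace)
        # timeout=None disables the client-side request timeout.
        rows = self.getSession().execute(query, timeout=None)
        # Rows are named tuples, so the column names carry over.
        data = pd.DataFrame(list(rows))
        return data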

  1. Database username

  2. Database password

  3. Database host

  4. Keyspace name
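
The class reads these four values from a dict named cassandra_config. One possible way to build that dict without hard-coding credentials is to read it from environment variables. This is only a sketch: the function name load_cassandra_config, the CASSANDRA_* variable names, and the default values are all assumptions made for illustration.

```python
import os

# Hypothetical defaults, used when an environment variable is not set.
DEFAULTS = {
    'username': 'cassandra',
    'password': 'cassandra',
    'host': '127.0.0.1',
    'keyspace': 'my_keyspace',
}


def load_cassandra_config(env=None):
    """Build the cassandra_config dict from CASSANDRA_* variables."""
    if env is None:
        env = os.environ
    return {
        key: env.get('CASSANDRA_' + key.upper(), default)
        for key, default in DEFAULTS.items()
    }


# The class can then read this dict exactly as before.
cassandra_config = load_cassandra_config()
```

Passing a plain dict as env, for example load_cassandra_config({'CASSANDRA_HOST': '10.0.0.5'}), makes it easy to try the function without touching the real environment.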

Note: To fetch data, define the query method inside the class itself (as get_data is above), then call it on an instance:
dataclass = CassDao()
data = dataclass.get_data()

The returned data is a pandas DataFrame. Happy working with Cassandra!
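
The get_data method builds its query by formatting the keyspace name into the query text. Keyspace and table names cannot be passed as bound parameters in CQL, so if those names ever come from outside the program it may be worth validating them first. The sketch below shows one way to do that; build_select and check_identifier are hypothetical helpers, not part of the driver, and the pattern only covers unquoted identifiers.

```python
import re

# Unquoted CQL identifiers: a letter followed by letters, digits or underscores.
_IDENTIFIER = re.compile(r'[A-Za-z][A-Za-z0-9_]*')


def check_identifier(name):
    """Return name unchanged if it looks like an unquoted CQL identifier."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError('invalid CQL identifier: {!r}'.format(name))
    return name


def build_select(keyspace, table, columns):
    """Build a SELECT statement from validated identifiers."""
    cols = ', '.join(check_identifier(c) for c in columns)
    return 'SELECT {} FROM {}.{}'.format(
        cols, check_identifier(keyspace), check_identifier(table))


print(build_select('my_keyspace', 'users', ['id', 'name']))
# prints: SELECT id, name FROM my_keyspace.users
```

The check does not reject reserved words, so a name such as table would still pass here and then fail when Cassandra parses the query.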

If you need to connect to a different database, see the companion articles:

  • MySQL: Connect to MySQL with Python 3.x and get Pandas Dataframe

  • PostgreSQL: Connect to Postgresql with Python 3.x and get Pandas Dataframe
