Connect to Cassandra with Python 3.x and get Pandas Dataframe

Upasana | February 13, 2020 | 2 min read | 523 views

We will be learning

  • How to connect to Cassandra in Python

  • How to get output of query as Pandas Dataframe

Connect to Cassandra in Python

We will use the cassandra-driver library (imported in Python as cassandra) to make the connection.

Install library to connect to Cassandra

Create virtual environment in project directory

virtualenv -p python3 venv

Activate virtual environment

source venv/bin/activate

Install package

pip install cassandra-driver
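
To confirm that the package landed in the active virtual environment before writing any connection code, a quick check like the one below may help. It is a minimal sketch that uses only the standard library; the helper name is_installed is made up for this illustration. Note that the pip package is called cassandra-driver, while the module you import is cassandra.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found in the current environment."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # cassandra-driver is imported under the module name "cassandra".
    print("cassandra-driver available:", is_installed("cassandra"))
```

If this prints False, the virtual environment is probably not activated or the install step did not complete.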

Define a class for connecting to Cassandra

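from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider
import pandas as pd

# cassandra_config is not defined in this article; it is assumed to be a dict
# with the keys 'username', 'password', 'host' and 'keyspace'.

class CassDao(object):

    def __init__(self):
        self.cluster = None
        self.username = cassandra_config['username']  # (1)
        self.password = cassandra_config['password']  # (2)
        self.host = cassandra_config['host']  # (3)
        self.keyspace = cassandra_config['keyspace']  # (4)
        self.session = self.createSession()

    def __del__(self):
        # Release the connection when the object is discarded.
        if self.cluster is not None:
            self.cluster.shutdown()

    def createSession(self):
        # Cluster expects a list of contact points (host names or IPs).
        self.cluster = Cluster([self.host],
                               auth_provider=PlainTextAuthProvider(username=self.username, password=self.password))
        self.session = self.cluster.connect(self.keyspace)
        return self.session

    def getSession(self):
        return self.session

    def get_data(self):
        # "table" is a placeholder; replace it with the name of your table.
        query = "SELECT id FROM {}.table".format(self.keyspace)
        # timeout=None disables the client-side request timeout.
        rows = self.getSession().execute(query, timeout=None)
        data = pd.DataFrame(list(rows))
        return data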

The numbered markers in the code refer to:

  1. Database username

  2. Database password

  3. Database host (contact point)

  4. Keyspace to connect to
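
The class reads its settings from a cassandra_config mapping that the article does not define. One possible way to supply it, sketched here under the assumption that the settings live in environment variables, is shown below. The variable names (CASSANDRA_USERNAME and so on), the helper load_cassandra_config, and the 127.0.0.1 fallback are all hypothetical choices for this illustration.

```python
import os


def load_cassandra_config(env=os.environ):
    """Build the settings dict that CassDao reads, from environment variables."""
    return {
        "username": env["CASSANDRA_USERNAME"],
        "password": env["CASSANDRA_PASSWORD"],
        # Fall back to a local node when no host is given (an assumption).
        "host": env.get("CASSANDRA_HOST", "127.0.0.1"),
        "keyspace": env["CASSANDRA_KEYSPACE"],
    }


# Example with a plain dict standing in for the real environment.
cassandra_config = load_cassandra_config({
    "CASSANDRA_USERNAME": "demo_user",
    "CASSANDRA_PASSWORD": "demo_pass",
    "CASSANDRA_KEYSPACE": "demo_keyspace",
})
```

Keeping credentials out of the source file this way avoids committing them to version control; in real use you would call load_cassandra_config() with no argument so it reads os.environ.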

Note: To fetch data, define the query method inside the class itself, as get_data is defined above. Then create an instance and call it:

dataclass = CassDao()
data = dataclass.get_data()

The value returned in data is a pandas DataFrame. Happy working with Cassandra!
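
By default the driver returns each row as a named tuple, which is why the result can be handed to pandas. The sketch below shows that conversion in isolation so it can be tried without a running cluster: the Row type and sample_rows are stand-ins invented for this example, and rows_to_dataframe is a hypothetical helper, not part of the driver.

```python
from collections import namedtuple

import pandas as pd


def rows_to_dataframe(rows):
    """Convert an iterable of named-tuple rows into a pandas DataFrame."""
    rows = list(rows)
    if not rows:
        # No rows means no column names can be inferred from the data.
        return pd.DataFrame()
    return pd.DataFrame(rows, columns=list(rows[0]._fields))


# Stand-in rows shaped like the driver's default named-tuple results.
Row = namedtuple("Row", ["id", "name"])
sample_rows = [Row(1, "alpha"), Row(2, "beta")]

df = rows_to_dataframe(sample_rows)
print(df)
```

With a live session, the same helper could be called as rows_to_dataframe(session.execute(query)). Materializing the rows with list() first fetches the whole result, so for very large tables it may be worth limiting the query.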

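If you need to connect to a different database instead, see the companion articles:

  • MySQL: "Connect to MySQL with Python 3.x and get Pandas Dataframe"

  • PostgreSQL: "Connect to Postgresql with Python 3.x and get Pandas Dataframe"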
