Pandas in Python

Yash Soni
2 min read · Mar 16, 2022

Pandas is an open-source library built mainly for working with relational or labeled data easily and intuitively. It combines the strengths of the Python language with data analysis tooling, and it provides data structures and operations for manipulating numerical tables and time series. The library is built on top of NumPy and is designed to be fast and productive to work with.

Advantages

Pandas provides high-performance, easy-to-use data structures and data analysis tools for the Python programming language. These data structures are fast, flexible, and expressive, and they are designed to make working with structured (tabular, multidimensional, potentially heterogeneous) and time-series data both easy and intuitive.

Getting Started

After installing pandas on your system, import the library. By convention, it is imported under the alias pd:

import pandas as pd

Series and DataFrame are the two main data structures pandas uses to represent structured data.

Series

A Pandas Series is a one-dimensional labeled array capable of holding data of any type (integers, strings, floats, Python objects, etc.). The axis labels are collectively called the index. A Series is comparable to a single column in an Excel sheet.

Creating a Series

A Pandas Series can be created from a list, a dictionary, a scalar value, and so on. In practice, a Series is usually created by loading a dataset from existing storage, such as a SQL database, a CSV file, or an Excel file.

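import pandas as pd
import numpy as np

# Create an empty Series; an explicit dtype avoids a warning in pandas 1.x
ser = pd.Series(dtype="object")
print(ser)

# Create a Series from a NumPy array; the index defaults to 0, 1, 2, 3
data = np.array(['y', 'a', 's', 'h'])
ser = pd.Series(data)
print(ser)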
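
The other creation routes mentioned above (list, dictionary, scalar) can be sketched as follows. This is a minimal illustration; the variable names, labels, and values are made up for the example.

```python
import pandas as pd

# From a list, with explicit index labels (labels are illustrative)
from_list = pd.Series([10, 20, 30], index=["a", "b", "c"])

# From a dictionary: the keys become the index labels
from_dict = pd.Series({"x": 1.5, "y": 2.5})

# From a scalar: the value is repeated for every index label
from_scalar = pd.Series(7, index=["p", "q", "r"])

print(from_list["b"])            # label-based access, prints 20
print(from_dict.index.tolist())  # prints ['x', 'y']
print(from_scalar.tolist())      # prints [7, 7, 7]
```

When no index is passed, pandas falls back to the default integer labels 0, 1, 2, and so on.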

DataFrame

A pandas DataFrame is a two-dimensional, size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns).
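
To make those terms concrete, here is a small sketch. The column names and values are invented for illustration and are not from any real dataset.

```python
import pandas as pd

# Column names and values are illustrative only
df = pd.DataFrame({
    "name": ["Ada", "Linus"],  # text column
    "age": [36, 54],           # integer column
})

# Heterogeneous: each column can hold a different type
print(df.dtypes)

# Size-mutable: a new column can be added to an existing DataFrame
df["score"] = [9.5, 8.0]

print(df.shape)          # (rows, columns), prints (2, 3)
print(list(df.columns))  # column labels
print(list(df.index))    # row labels, default 0..n-1
```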

Creating a DataFrame

In practice, a DataFrame is usually created by loading a dataset from existing storage, such as a SQL database, a CSV file, or an Excel file. Its data is aligned in a tabular fashion, in rows and columns.

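import pandas as pd

# Create an empty DataFrame
df = pd.DataFrame()
print(df)

# Create a DataFrame from a list; each item becomes a row in a single column labeled 0
lst = ['Python', 'is', 'better', 'than', 'Java']
df = pd.DataFrame(lst)
print(df)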
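
Since most DataFrames come from named columns or from stored data, the following sketch shows both routes. The data is illustrative, and io.StringIO is used here only as a stand-in for a CSV file on disk so that the example is self-contained.

```python
import io
import pandas as pd

# From a dictionary of lists: the keys become column names
df = pd.DataFrame({"language": ["Python", "Java"], "year": [1991, 1995]})
print(df)

# From CSV text. io.StringIO stands in for a real file here;
# with a file on disk you would pass its path to pd.read_csv instead.
csv_text = "language,year\nPython,1991\nJava,1995\n"
df_csv = pd.read_csv(io.StringIO(csv_text))
print(df_csv)
```

In both cases the first row of the CSV (or the dictionary keys) supplies the column labels, and the rows get the default integer index.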

Thank you!
