
How to Sort CSV by multiple columns in Python ?

Last Updated : 21 Apr, 2021

This article shows how to sort a CSV file by multiple columns. The CSV file is first read into a pandas data frame, and the data frame is then sorted with the sort_values() method.

Syntax: DataFrame.sort_values(by, axis=0, ascending=True, inplace=False, kind='quicksort', na_position='last')

Return Type: Returns a sorted data frame with the same dimensions as the caller data frame. When inplace=True, the caller is sorted in place and None is returned.
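The difference between the returned copy and an in-place sort can be seen in a minimal sketch. The values below are hypothetical toy data, not rows taken from diamonds.csv.

```python
import pandas as pd

# Hypothetical toy data, not taken from diamonds.csv
df = pd.DataFrame({"carat": [0.31, 0.23, 0.29], "price": [335, 326, 334]})

# Default (inplace=False): a new, sorted data frame is returned
sorted_df = df.sort_values("carat")

# The original frame keeps its row order; only the copy is sorted
original_order = df["carat"].tolist()
new_order = sorted_df["carat"].tolist()

# inplace=True: df itself is reordered and the call returns None
result = df.sort_values("carat", inplace=True)
print(result)
print(df["carat"].tolist())
```

Because the in-place call returns None, assigning its result back to the same variable would discard the data frame.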

After reading the CSV file into a data frame, pass two or more column names as a list to the by parameter of sort_values(), with axis set to 0, like below:

sort_values(['column1', 'column2', ..., 'columnN'], axis=0)
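As a minimal, self-contained sketch of this call, the snippet below reads a small CSV from an in-memory string instead of a file on disk. The column names and rows (name, dept, salary) are made up for illustration.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for a file on disk
csv_text = """name,dept,salary
Asha,Sales,50000
Ben,IT,65000
Cara,Sales,45000
Dev,IT,60000
"""

df = pd.read_csv(io.StringIO(csv_text))

# Column names go in a list: sort by dept first,
# then by salary within each dept
sorted_df = df.sort_values(["dept", "salary"], axis=0)

print(sorted_df)
```

The first name in the list is the primary sort key; later names only decide the order of rows that tie on the earlier ones.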

CSV file in use: diamonds.csv

Below are various examples that depict how to sort a CSV file by multiple columns:

Example 1:

In the program below, the CSV file is first read into a data frame, which is then sorted by a single column, carat, in ascending order. Rows with a missing carat value are placed first because na_position is set to 'first'.

Python3




# importing pandas package
import pandas as pd
  
# making data frame from csv file
data = pd.read_csv("diamonds.csv")
  
# sorting data frame by a column
data.sort_values("carat", axis=0, ascending=True,
                 inplace=True, na_position='first')
  
# display the first 10 rows
print(data.head(10))
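The effect of na_position is easiest to see on a frame that actually contains a missing value. The following is a small sketch on hypothetical toy data (the carat and cut values are made up).

```python
import pandas as pd

# Hypothetical toy data with one missing carat value
df = pd.DataFrame({
    "carat": [0.31, None, 0.23],
    "cut": ["Good", "Ideal", "Premium"],
})

# Missing values placed before the sorted values
first = df.sort_values("carat", na_position="first")

# Missing values placed after the sorted values (the default)
last = df.sort_values("carat", na_position="last")

print(first["cut"].tolist())  # ['Ideal', 'Premium', 'Good']
print(last["cut"].tolist())   # ['Premium', 'Good', 'Ideal']
```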



Example 2:

Here, after the CSV file is read into a data frame, it is sorted by multiple columns. Rows are ordered by the depth column in ascending order first, and rows that share the same depth are then ordered by the table column in ascending order.

Python3




# importing pandas package
import pandas as pd
  
# making data frame from csv file
data = pd.read_csv("diamonds.csv")
  
# sorting data frame by multiple columns
data.sort_values(["depth", "table"], axis=0,
                 ascending=True, inplace=True)
  
# display the first 10 rows
print(data.head(10))



Example 3: 

In the example below, the data frame is sorted by depth in descending order and, within each depth value, by table in ascending order.

Python3




# importing pandas package
import pandas as pd
  
# making data frame from csv file
data = pd.read_csv("diamonds.csv")
  
# sorting data frame by multiple columns
data.sort_values(["depth", "table"], axis=0,
                 ascending=[False, True], inplace=True)
  
# display the first 10 rows
print(data.head(10))
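To make the mixed ordering concrete, here is a small sketch on hypothetical values shaped like the depth and table columns. The ascending list is matched to the column list position by position, so the two lists must have the same length.

```python
import pandas as pd

# Hypothetical values, loosely shaped like the depth/table columns
df = pd.DataFrame({
    "depth": [61.5, 62.4, 61.5, 62.4],
    "table": [58.0, 55.0, 55.0, 57.0],
})

# depth descending, then table ascending within each depth
sorted_df = df.sort_values(["depth", "table"], ascending=[False, True])

print(sorted_df.to_string(index=False))
```

In the result, both rows with depth 62.4 come first, and within that group the row with table 55.0 precedes the one with 57.0.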



Example 4:

Here is another example, in which the data frame is sorted by three columns: depth in descending order, then table in ascending order, then carat in descending order.

Python3




# importing pandas package
import pandas as pd
  
# making data frame from csv file
data = pd.read_csv("diamonds.csv")
  
# sorting data frame by multiple columns
data.sort_values(["depth", "table", "carat"], axis=0,
                 ascending=[False, True, False], inplace=True)
  
# display the first 10 rows
print(data.head(10))
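Sorting the data frame does not change the CSV file itself. One possible way to get a sorted CSV is to write the result back out with to_csv(), sketched below on a few hypothetical rows. An in-memory buffer is used here so the snippet runs without touching the disk; the output file name in the comment is an assumption.

```python
import io
import pandas as pd

# Hypothetical rows standing in for diamonds.csv
csv_text = """carat,depth,table
0.23,61.5,55.0
0.21,59.8,61.0
0.29,62.4,58.0
0.31,62.4,58.0
"""

df = pd.read_csv(io.StringIO(csv_text))

# Same three-key sort as in the example above
sorted_df = df.sort_values(
    ["depth", "table", "carat"], ascending=[False, True, False]
)

# index=False keeps the old row labels out of the output.
# Pass a path such as "diamonds_sorted.csv" (hypothetical name)
# instead of the buffer to write a real file.
buffer = io.StringIO()
sorted_df.to_csv(buffer, index=False)
print(buffer.getvalue())
```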




