Python and PostgreSQL - Importing Data
In order to import data into a PostgreSQL database, it must be in a suitable format. CSV files are one such format. A CSV file, or Comma Separated Values file, is a delimited file that uses commas to separate values. They store tabular data in plain text. The first row in a CSV file often contains headers for each column of data. Each row in the file thereafter is a record of related data. In the example contents of a CSV file below, each row contains information relating to a person, including their first name, last name, title and date of birth in the format YYYY-MM-DD (four digit year, two digit month and two digit day).
"firstname","lastname","title","dob" "Zoe","Davis","Miss","1979-07-11" "Tom","Ingram","Mr","1971-10-04" "Karen","Thomas","Mrs","1969-03-08" "Samantha","Yates","Miss","1995-08-27"
Below is an example of how Python can be used to import data from a CSV file called ‘personimport.csv’, into a table called ‘person’, which was used in the examples for selecting, inserting, updating and deleting data.
Firstly, the file path is set and a check is made to see if the CSV file actually exists. If it does, a connection to the database is established and the CSV file is assigned to a reader object. The rows of data under the headers are then processed one by one. An SQL ‘Insert‘ statement is constructed, incorporating the values from the row in the reader object. This is used to insert the data into the database. Finally, feedback is given as to the number of records added to the database.
import csv import os import psycopg2 # File path. filePath = 'c:\\demo\\personimport.csv' # Database connection variable. connect = None # Check if the CSV file exists. if os.path.isfile(filePath): try: # Connect to database. connect = psycopg2.connect(host='localhost', database='Demo', user='testUN', password='testPW') except psycopg2.DatabaseError as e: # Confirm unsuccessful connection and stop program execution. print("Database connection unsuccessful.") quit() # Cursor to execute query. cursor = connect.cursor() # Assign CSV file to reader object. reader = csv.DictReader(open(filePath)) # Record count. recordCount = 0 # Insert person information into the database. for row in reader: # SQL to insert person information. sqlInsert = \ "INSERT INTO person (firstname, lastname, title, dob) \ VALUES (%s, %s, %s, %s)" try: # Execute query and commit changes. cursor.execute(sqlInsert, (row['firstname'], row['lastname'], row['title'], row['dob'])) connect.commit() # Increment the record count. recordCount += 1 except psycopg2.DatabaseError as e: # Confirm error adding person information and stop program execution. print("Error adding person information.") quit() # Close database connection. connect.close() # Provide feedback on the number of records added. if recordCount == 0: print("No new person records added.") elif recordCount == 1: print(str(recordCount) + " person record added.") else: print(str(recordCount) + " person records added.") else: # Message stating CSV file could not be located. print("Could not locate the CSV file.")
The contents of the ‘person’ table now looks as follows.