Fixing Numpy install errors and Panda install
There are sometimes issues reported on Raspberry Pi setups when installing using pip3 install (or pip install). These will typically mention:
- libf77blas.so.3: cannot open shared object file: No such file or directory, The solution will be to either:
- sudo apt-get install libatlas-base-dev to install the missing libraries expected by the self-compiled NumPy
- Alternatively use the NumPy provided by Raspbian. In which case run:
- pip3 uninstall numpy # remove previously installed version
- apt install python3-numpy
Pandas Module for Merging Files
- Import the Pandas module.
- Read both the files using the
read_excel()
function. - Combine them using the
merge()
function. - Use the
to_excel()
function, to create the resultant file
# importing the module import pandas # reading the files f1 = pandas.read_excel("registration details.xlsx")f2 = pandas.read_excel("exam results.xlsx") # merging the files f3 = f1[["REGISTRATION NO", "STUDENT EMAIL ID "]].merge(f2[["REGISTRATION NO", "Name", "Marks Obtained", "Percentage"]], on = "REGISTRATION NO", how = "left") # creating a new file f3.to_excel("Results.xlsx", index = False) |
Pandas Joining Columns to Create a New Column
- df[‘new column’] = df[‘column1’] + df[‘column2’]
- df[‘new column’] = df[‘column1’] + ‘ ‘ + df[‘column2’] #puts a space between the values
Regular Expressions (regex) with Pandas
- df.replace(‘(expressiontoreplace)’, ‘(replacewiththis)’, regex=True, inplace = True)
- df.replace(to_replace = ‘(expressiontoreplace)’, value = ‘(replacewiththis)’, inplace = True)
- regex testing site ==> pythex.org
Pandas copy a DataFrame to Clipboard
- DataFrame.to_clipboard(excel=True, sep=none)
- df.to_clipboard(sep=’,’) or (sep=’,’, index = False)
- Linux needs modules: xclip or xsel
- sudo apt install xclip
- sudo apt install xsex
- may not work with Python 3???
Conditional Search Using Pandas
From video at: https://www.youtube.com/watch?v=EWIUWVsejWY
import numpy as np
import pandas as pd
excel_file = ‘Pandas_Workbook.xlsx’
df = pd.read_excel(excel_file)
print(df)
print(df[‘Name’].where(df[‘Occupation’] == ‘Programmer’))
programmers = df[‘Name’].where(df[‘Occupation’] == ‘Programmer’)
print(programmers.dropna())
excel_files = [‘Pandas_Workbook.xlsx’,’Pandas_Workbook_copy.xlsx’,’Pandas_Workbook_copy_2.xlsx’]
for individual_excel_file in excel_files:
df = pd.read_excel(individual_excel_file)
programmers = df[‘Name’].where(df[‘Occupation’] == ‘Programmer’).dropna()
print(“File Name” + individual_excel_file)
print(programmers)
Interesting Python/Pandas information with examples: Sparks by (Examples)
Automate Excel
Getting the Year from a datetime column
- This will add a new column called year to the current dataframe
- df[‘year’] = df[‘datecolumn’].dt.year