|

Fixing Numpy install errors and Panda install

There are sometimes issues reported on Raspberry Pi setups when installing using pip3 install (or pip install). These will typically mention:

  • libf77blas.so.3: cannot open shared object file: No such file or directory, The solution will be to either:
  • sudo apt-get install libatlas-base-dev to install the missing libraries expected by the self-compiled NumPy
  • Alternatively use the NumPy provided by Raspbian. In which case run:
    • pip3 uninstall numpy # remove previously installed version
    • apt install python3-numpy

Pandas Module for Merging Files

  1. Import the Pandas module.
  2. Read both the files using the read_excel() function.
  3. Combine them using the merge() function.
  4. Use the to_excel() function, to create the resultant file
# importing the module
import pandas  
# reading the files
f1 = pandas.read_excel("registration details.xlsx")f2 = pandas.read_excel("exam results.xlsx")  
# merging the files
f3 = f1[["REGISTRATION NO", "STUDENT EMAIL ID "]].merge(f2[["REGISTRATION NO", "Name", "Marks Obtained", "Percentage"]], on = "REGISTRATION NO", 
how = "left")  
# creating a new file
f3.to_excel("Results.xlsx", index = False)
Merge Example

Pandas Joining Columns to Create a New Column

  • df[‘new column’] = df[‘column1’] + df[‘column2’]
  • df[‘new column’] = df[‘column1’] + ‘ ‘ + df[‘column2’] #puts a space between the values

Regular Expressions (regex) with Pandas

  • df.replace(‘(expressiontoreplace)’, ‘(replacewiththis)’, regex=True, inplace = True)
  • df.replace(to_replace = ‘(expressiontoreplace)’, value = ‘(replacewiththis)’, inplace = True)
  • regex testing site ==> pythex.org

Pandas copy a DataFrame to Clipboard

  • DataFrame.to_clipboard(excel=True, sep=none)
  • df.to_clipboard(sep=’,’) or (sep=’,’, index = False)
  • Linux needs modules: xclip or xsel
    • sudo apt install xclip
    • sudo apt install xsex
  • may not work with Python 3???

Conditional Search Using Pandas

From video at: https://www.youtube.com/watch?v=EWIUWVsejWY

import numpy as np
import pandas as pd

excel_file = ‘Pandas_Workbook.xlsx’
df = pd.read_excel(excel_file)
print(df)

print(df[‘Name’].where(df[‘Occupation’] == ‘Programmer’))
programmers = df[‘Name’].where(df[‘Occupation’] == ‘Programmer’)
print(programmers.dropna())

excel_files = [‘Pandas_Workbook.xlsx’,’Pandas_Workbook_copy.xlsx’,’Pandas_Workbook_copy_2.xlsx’]

for individual_excel_file in excel_files:
df = pd.read_excel(individual_excel_file)
programmers = df[‘Name’].where(df[‘Occupation’] == ‘Programmer’).dropna()
print(“File Name” + individual_excel_file)
print(programmers)

Interesting Python/Pandas information with examples: Sparks by (Examples)

Automate Excel

Getting the Year from a datetime column

  • This will add a new column called year to the current dataframe
    • df[‘year’] = df[‘datecolumn’].dt.year