Hello @liz; That might help if the someone, like me, had sense enough to look at the buttons down at the bottom (as I did not!). Here I'll try again.
I was hoping @Aleah would respond with an example of what she was working on. I believe context and example are often more helpful for understanding than an attempt at abstract explanation. So I will supply some context and example.
I was doing some home heating analysis for our home last winter. I have data loggers around my house and yard for temperature and some loggers to indicate the times the furnaces are on. But I do not have anything for other outside weather factors. So I went to
www.ncdc.noaa.gov the national climatic data center and downloaded data from the Iowa City airport since I live in Iowa City. Workbook 1 below contains 20 rows of data, columns A through K, selected from the download.
It should be clear that many of Stevie's best practices are violated and the data needs cleaning. Columns L and beyond are what I did and workbook 2 below will help explain in detail:
First, column A, labelled "Date" is text. One can tell because it is ALIGNED on the left. Workbook 2 columns C through G show how I extract year, month, day, hour, and minute from cell A2 using the "MID" Excel function. Use the EXCEL Help to get the details of the MID function. Looking at cell C2 shows how to get the year. Note the multiplication by "1" (*1). The MID function returns text format and multiplying by "1" changes it to standard or numerical format. Cell H2 shows how to use the "DATE" Excel function to put these together to get date and time into a standard format. (Ignore the "+1462" which I had to add to compensate for the difference between MS dating and Macintosh dating). These could all be put into one function if one is so inclined.
Workbook 2 cell D5 shows cell D8 from workbook 1. This is also aligned left and is text. The glossary for the download told me that text in that position was a very small amount of precipitation so I wanted to ignore text. Workbook 2 cell F5 shows if the cell is empty, I code it as "-99", if it is text, I code it as "-199" and otherwise it just stays as the number it is. Workbook 2 cell E8 is from workbook 1 cell E8. I was only interested in whether there was rain or snow so I used the MID function and a series of "if" statements to extract what I wanted as shown in cell H8. I use "-99" and "-199" often because these numbers are unlikely to be actual weather data values (and they will be obvious in a graph). Of course I have to write down this coding so I remember. One just uses "Edit fill down" to do these transforms for the entire data set.
Everything I wanted is then copied into columns S through AE as in workbook 1. I begin a NEW workbook and copy these cells and use "PASTE SPECIAL VARIABLES" to paste these in the new workbook. Paste special variables eliminates the functional history and everything is plain numerical values. This is then saved as ".csv" and imports easily into R. It should be obvious that I am not a "programmer". Efficiency and elegance are not my top skills. I prefer something I understand and can effect quickly.
Date and time are the only potential problems I saw in downloaded purple air data so I hope this helps Aleah.
As for my project of home heating analysis it dawned on me in the spring that whether my roof was snow covered and white (and reflective) or clear and charcoal could make a difference on heat in the attic and second floor. Unfortunately NCDC data won't tell me about my roof color. Maybe next winter!