Working with a (moderately) large #dataset

Author:

++ Large dataset++
Be very careful when working with a dataset. Check the rows and columns for inconsistencies. I hope this note is readable, since I wrote it on the train.
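
Here is a minimal sketch of what such a check could look like, assuming the field data sits in a CSV file. The column names and values in `SAMPLE` are made up for illustration, and `find_inconsistencies` is a hypothetical helper, not a standard function. It only flags rows with the wrong number of columns and cells that are empty.

```python
import csv
import io

# Hypothetical sample data: column names and values are invented for illustration.
SAMPLE = """location_id,x,y,strike,dip
A01,788100,9237500,120,35
A02,788250,9237610,,40
A03,788400,9237720,135
"""

def find_inconsistencies(text):
    """Return a list of (line_number, message) for rows that look wrong."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    problems = []
    # Data rows start on line 2, right after the header line.
    for line_number, row in enumerate(reader, start=2):
        if len(row) != len(header):
            problems.append(
                (line_number, f"expected {len(header)} columns, got {len(row)}")
            )
            continue
        for name, value in zip(header, row):
            if value.strip() == "":
                problems.append((line_number, f"empty value in column '{name}'"))
    return problems

problems = find_inconsistencies(SAMPLE)
for line_number, message in problems:
    print(f"line {line_number}: {message}")
```

For a real file, the same function could be fed the file's contents instead of `SAMPLE`.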

This skill is not taught much in class (Geology Program, ITB). I don’t know why, but it would be nice if undergraduate students built an integrated database of their field data. On average, they would have more than 50 observation points and more than 10 variables. Let’s list them:

  1. location id
  2. x coordinate
  3. y coordinate
  4. strike
  5. dip
  6. lithology
  7. grain size
  8. fabric
  9. sorting
  10. color
  11. porosity
  12. fresh/weathered
  13. sedimentary structure
  14. upper boundary
  15. lower boundary
  16. etc., and the list continues
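
One way to picture the list above is as one record per observation point. The sketch below is only an assumption about how the variables might be typed: the field names, the text-versus-number choices, and porosity recorded in percent are my guesses. The range checks use the usual conventions of strike as an azimuth of 0-360 degrees and dip of 0-90 degrees.

```python
from dataclasses import dataclass

@dataclass
class Observation:
    # Field names and types are illustrative assumptions, not a fixed standard.
    location_id: str
    x: float
    y: float
    strike: float            # degrees, 0-360
    dip: float               # degrees, 0-90
    lithology: str
    grain_size: str
    fabric: str
    sorting: str
    color: str
    porosity: float          # assumed to be recorded in percent
    weathered: bool          # True = weathered, False = fresh
    sedimentary_structure: str
    upper_boundary: str
    lower_boundary: str

    def problems(self):
        """Return a list of messages for values outside their expected range."""
        issues = []
        if not 0 <= self.strike <= 360:
            issues.append(f"strike {self.strike} is outside 0-360")
        if not 0 <= self.dip <= 90:
            issues.append(f"dip {self.dip} is outside 0-90")
        if not 0 <= self.porosity <= 100:
            issues.append(f"porosity {self.porosity} is outside 0-100")
        return issues

# Made-up example record; the dip value is deliberately wrong.
obs = Observation(
    location_id="A01", x=788100.0, y=9237500.0, strike=120.0, dip=95.0,
    lithology="sandstone", grain_size="medium", fabric="grain-supported",
    sorting="well sorted", color="grey", porosity=15.0, weathered=False,
    sedimentary_structure="cross-bedding", upper_boundary="sharp",
    lower_boundary="gradational",
)
issues = obs.problems()
print(issues)
```

With 50 or more of these records, a plain list of `Observation` objects is already a small database that can be filtered and summarised.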

With such a database, students can analyse the data more quantitatively, e.g. a histogram of porosity or a scatter plot between parameters. I don’t know about other lecturers, but I think this is doable by geologically hard-headed students :-).
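
As a rough sketch of that kind of analysis, here is a text histogram of porosity using only the Python standard library. In place of an actual scatter plot, it computes a Pearson correlation coefficient, which puts a number on what the scatter plot would show. All values are made up for illustration, and the 5 percent bin width and the pairing of porosity with grain size are arbitrary choices.

```python
import math
from collections import Counter

# Made-up values for illustration only, not real field data.
porosity = [12.0, 15.5, 18.0, 22.5, 9.0, 14.0, 27.0, 16.5]      # percent
grain_size = [0.20, 0.25, 0.30, 0.40, 0.15, 0.22, 0.50, 0.28]   # mm

def histogram(values, bin_width):
    """Count values per bin; each key is the lower edge of its bin."""
    counts = Counter(int(v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

bins = histogram(porosity, 5)
for start, count in bins.items():
    # One '#' per observation in the bin.
    print(f"{start:>2}-{start + 5:<2} {'#' * count}")

r = pearson(porosity, grain_size)
print(f"correlation between porosity and grain size: {r:.2f}")
```

A plotting library would give nicer figures, but even this much is enough to start looking at field data quantitatively.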

Originally posted on my Path